Microarray: Difference between revisions

From 太極
Jump to navigation Jump to search
Line 145: Line 145:
* [https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbaa019/5753843?rss=1 A survey of gene expression meta-analysis: methods and applications] Daniel Toro-Domínguez 2020
* [https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbaa019/5753843?rss=1 A survey of gene expression meta-analysis: methods and applications] Daniel Toro-Domínguez 2020
* [https://cran.r-project.org/web/packages/rmeta/index.html rmeta package] which was used in the [https://bioconductor.org/packages/release/bioc/html/genefu.html genefu] package. See also [[R#R_Graphs_Gallery|R Graphs Gallery]].
* [https://cran.r-project.org/web/packages/rmeta/index.html rmeta package] which was used in the [https://bioconductor.org/packages/release/bioc/html/genefu.html genefu] package. See also [[R#R_Graphs_Gallery|R Graphs Gallery]].
* [https://youtu.be/IkduL5iRdqo Meta-Analysis in R with {metafor}] (video)

Revision as of 13:16, 31 May 2022

Overview

Analysis of microarray experiments of gene expression profiling

Pathway

Time Course

Limma package with few time points

> data(drosEmbryoRMA)
Warning message:
'drosEmbryoRMA' looks like a pre-2.4.0 S4 object: please recreate it 
  • Limma Guide Section 9.6 (no .Rd file nor R code!).
# Step 1 - Read the design and create eset.
targets <- read.table(file='stdin', header=T)
FileName	Target
File1	wt.0hr
File2	wt.0hr
File3	wt.6hr
File4	wt.24hr
File5	mu.0hr
File6	mu.0hr
File7	mu.6hr
File8	mu.24hr
# Hit Ctrl+D twice.
# eset = rma(ReadAffy())
exprs <- matrix(rnorm(1000*8), nr=1000)
eset <- ExpressionSet(assayData = exprs)
colnames(exprs) <- targets[,1]

lev <- c("wt.0hr","wt.6hr","wt.24hr","mu.0hr","mu.6hr","mu.24hr")
f <- factor(targets$Target, levels=lev)
design <- model.matrix(~0+f)
colnames(design) <- lev
fit <- lmFit(eset, design)

# Step 2 - Which genes respond at either the 6 hour or 24 hour times in the wild-type?
cont.wt <- makeContrasts(
"wt.6hr-wt.0hr",
"wt.24hr-wt.6hr",
levels=design)
fit2 <- contrasts.fit(fit, cont.wt)
fit2 <- eBayes(fit2)
topTableF(fit2, adjust="BH")
# Output:
    wt.6hr.wt.0hr wt.24hr.wt.6hr     AveExpr        F     P.Value adj.P.Val
112     3.2749023     -4.5401508  0.23644620 7.192590 0.006274150 0.9079656
142     0.2028617      3.5727319  0.31292924 6.425034 0.009419387 0.9079656
151    -2.4795481      4.5473025  0.28953869 6.138177 0.011026934 0.9079656
830     0.9768946      2.7571045  0.55529723 5.982599 0.012027262 0.9079656
371    -2.5848778      4.4752189 -0.68087827 5.813166 0.013235184 0.9079656
861    -3.6229467      1.2348225 -0.19658086 5.810336 0.013256488 0.9079656
645     2.6729536     -4.3124373  0.09475226 5.789848 0.013411914 0.9079656
926     3.5387477     -0.9448989  0.15060787 5.484057 0.015994787 0.9079656
113     2.5057968     -4.0472781  0.12254199 5.372589 0.017072749 0.9079656
802     0.5210958      2.9178136 -0.54632441 5.140624 0.019589928 0.9079656
> cont.wt
         Contrasts
Levels    wt.6hr-wt.0hr wt.24hr-wt.6hr
  wt.0hr             -1              0
  wt.6hr              1             -1
  wt.24hr             0              1
  mu.0hr              0              0
  mu.6hr              0              0
  mu.24hr             0              0

# Step 3- Which genes respond (i.e., change over time) in the mutant?
cont.mu <- makeContrasts(
"mu.6hr-mu.0hr",
"mu.24hr-mu.6hr",
levels=design)
fit2 <- contrasts.fit(fit, cont.mu)
fit2 <- eBayes(fit2)
topTableF(fit2, adjust="BH")
# Output:
    mu.6hr.mu.0hr mu.24hr.mu.6hr     AveExpr        F     P.Value adj.P.Val
203   -4.70500261      4.3795233 -0.52285214 8.753341 0.002919147 0.8967851
797   -3.39270257     -0.5862948 -0.02746123 8.328550 0.003568277 0.8967851
828   -4.25071499      1.8429580  0.34257837 7.590078 0.005125338 0.8967851
764    0.02937019      3.7623514  0.32860517 6.991332 0.006965102 0.8967851
570   -2.14961101     -2.1415450  0.52207117 6.558165 0.008764670 0.8967851
457    2.28175860     -4.5011533 -0.25060779 6.522692 0.008933944 0.8967851
593    2.57911585      1.3339224  0.18262853 6.460369 0.009240381 0.8967851
693   -0.43909594      3.8805006 -0.30368945 6.117644 0.011153404 0.8967851
414    1.76240994     -4.3652464  0.12864332 5.887795 0.012686997 0.8967851
309   -3.57816645      3.4458003 -0.09221823 5.847096 0.012982697 0.8967851

# Step 4- Which genes respond differently over time in the mutant relative to the wild-type?
cont.dif <- makeContrasts(
Dif6hr =(mu.6hr-mu.0hr)-(wt.6hr-wt.0hr),
Dif24hr=(mu.24hr-mu.6hr)-(wt.24hr-wt.6hr),
levels=design)
fit2 <- contrasts.fit(fit, cont.dif)
fit2 <- eBayes(fit2)
topTableF(fit2, adjust="BH")
# Output:
        Dif6hr   Dif24hr     AveExpr        F     P.Value adj.P.Val
797 -6.0630607  1.468086 -0.02746123 8.922239 0.002699040 0.8685775
18  -1.7460726  7.130525  0.12631684 8.707761 0.002981974 0.8685775
764 -0.3370846  5.759971  0.32860517 7.512472 0.005329540 0.8685775
564  5.9827916 -4.071109  0.11052980 7.237032 0.006132233 0.8685775
24  -5.3124510  2.462141 -0.07153017 6.291352 0.010133010 0.8685775
113 -4.1366307  5.990927  0.12254199 6.138815 0.011023028 0.8685775
313 -2.0419712  6.157636 -0.12618538 5.979547 0.012047886 0.8685775
913 -4.4262959  6.195062  0.46998972 5.562351 0.015283627 0.8685775
926 -4.9695447  1.222076  0.15060787 5.496894 0.015875656 0.8685775
674 -0.9636603 -3.808384  0.13392844 5.020330 0.021058895 0.8685775

A case study using Limma package

http://www.biomedcentral.com/1756-0500/3/81

Limma package with many time points

The Limma user guide mentioned that when there are many time points for each group, it is reasonable to assume the expression changes smoothly over time rather than making discrete jumps from one time point to another. This type of time course can be analyzed by fitting a temporal trend using a regression spline or a polynomial.

Consider the case with 2 groups. Each group has 16 time points. We use a cubic spline curve with a modest number of knots. Choosing effective degrees of freedom to be in range 3-5 is reasonable.

library(splines)
X <- ns(targets$Time, df =5)
Group <- factor(targets$Group)
design <- model.matrix(~Group*X)
fit <- lmFit(y, design)
fit <- eBayes(fit) 
# This creates a model with 12 parameters, with the last 5 corresponding 
# to interaction, i.e., to differences in the curves between groups. 
# To detect genes with different time trends for treatment vs control:
topTable(fit, coef=8:12)

This conducts a moderated F-test for each gene on 5 df, which can detect very general differences between the treatment and control curves.

Note that for this analysis, it is not necessary to have replicates, nor is it necessary for the two treatment groups to be observed at identical time points.

maSigPro package

timecourse package

meta analysis