Microarray: Difference between revisions
| Line 2: | Line 2: | ||
* http://www.biostat.jhsph.edu/~ririzarr/688/ | * http://www.biostat.jhsph.edu/~ririzarr/688/ | ||
== Limma package == | == Limma package with few time points == | ||
* http://master.bioconductor.org/help/course-materials/2005/BioC2005/labs/lab01/drosEmbryo/ or http://bioinf.wehi.edu.au/marray/jsm2005/lab5/lab5.html. Note the package drosEmbryo is not available on Bioc although it can be downloaded from [http://bioinf.wehi.edu.au/marray/jsm2005/Required/Software/drosEmbryo.html bioinf.wehi.edu.au]. It still cannot be used. | * http://master.bioconductor.org/help/course-materials/2005/BioC2005/labs/lab01/drosEmbryo/ or http://bioinf.wehi.edu.au/marray/jsm2005/lab5/lab5.html. Note the package drosEmbryo is not available on Bioc although it can be downloaded from [http://bioinf.wehi.edu.au/marray/jsm2005/Required/Software/drosEmbryo.html bioinf.wehi.edu.au]. It still cannot be used. | ||
| Line 111: | Line 111: | ||
== A case study using Limma package == | == A case study using Limma package == | ||
http://www.biomedcentral.com/1756-0500/3/81 | http://www.biomedcentral.com/1756-0500/3/81 | ||
== Limma package with many time points == | |||
The Limma user guide mentioned that when there are many time points for each group, it is reasonable to assume the expression changes smoothly over time rather than making discrete jumps from one time point to another. This type of time course can be analyzed by fitting a temporal trend using a regression spline or a polynomial. | |||
Consider the case with 2 groups. Each group has 16 time points. We use a cubic spline curve with a modest number of knots. Choosing effective degrees of freedom to be in range 3-5 is reasonable. | |||
<pre> | |||
library(splines) | |||
X <- ns(targets$Time, df =5) | |||
Group <- factor(targets$Group) | |||
design <- model.matrix(~Group*X) | |||
fit <- lmFit(y, design) | |||
fit <- eBayes(fit) | |||
# This creates a model with 12 parameters, with the last 5 corresponding | |||
# to interaction, i.e., to differences in the curves between groups. | |||
# To detect genes with different time trends for treatment vs control: | |||
topTable(fit, coef=8:12) | |||
</pre> | |||
This conducts a moderated F-test for each gene on 5 df, which can detect very general differences between the treatment and control curves. | |||
Note that for this analysis, it is not necessary to have replicates, nor is it necessary for the two treatment groups to be observed at identical time points. | |||
== timecourse package == | == timecourse package == | ||
Revision as of 13:41, 3 November 2014
Time Course
Limma package with few time points
- http://master.bioconductor.org/help/course-materials/2005/BioC2005/labs/lab01/drosEmbryo/ or http://bioinf.wehi.edu.au/marray/jsm2005/lab5/lab5.html. Note the package drosEmbryo is not available on Bioc although it can be downloaded from bioinf.wehi.edu.au. It still cannot be used.
> data(drosEmbryoRMA) Warning message: 'drosEmbryoRMA' looks like a pre-2.4.0 S4 object: please recreate it
- Limma Guide Section 9.6 (no .Rd file nor R code!).
# Step 1 - Read the design and create eset.
targets <- read.table(file='stdin', header=T)
FileName Target
File1 wt.0hr
File2 wt.0hr
File3 wt.6hr
File4 wt.24hr
File5 mu.0hr
File6 mu.0hr
File7 mu.6hr
File8 mu.24hr
# Hit Ctrl+D twice.
# eset = rma(ReadAffy())
exprs <- matrix(rnorm(1000*8), nr=1000)
eset <- ExpressionSet(assayData = exprs)
colnames(exprs) <- targets[,1]
lev <- c("wt.0hr","wt.6hr","wt.24hr","mu.0hr","mu.6hr","mu.24hr")
f <- factor(targets$Target, levels=lev)
design <- model.matrix(~0+f)
colnames(design) <- lev
fit <- lmFit(eset, design)
# Step 2 - Which genes respond at either the 6 hour or 24 hour times in the wild-type?
cont.wt <- makeContrasts(
"wt.6hr-wt.0hr",
"wt.24hr-wt.6hr",
levels=design)
fit2 <- contrasts.fit(fit, cont.wt)
fit2 <- eBayes(fit2)
topTableF(fit2, adjust="BH")
# Output:
wt.6hr.wt.0hr wt.24hr.wt.6hr AveExpr F P.Value adj.P.Val
112 3.2749023 -4.5401508 0.23644620 7.192590 0.006274150 0.9079656
142 0.2028617 3.5727319 0.31292924 6.425034 0.009419387 0.9079656
151 -2.4795481 4.5473025 0.28953869 6.138177 0.011026934 0.9079656
830 0.9768946 2.7571045 0.55529723 5.982599 0.012027262 0.9079656
371 -2.5848778 4.4752189 -0.68087827 5.813166 0.013235184 0.9079656
861 -3.6229467 1.2348225 -0.19658086 5.810336 0.013256488 0.9079656
645 2.6729536 -4.3124373 0.09475226 5.789848 0.013411914 0.9079656
926 3.5387477 -0.9448989 0.15060787 5.484057 0.015994787 0.9079656
113 2.5057968 -4.0472781 0.12254199 5.372589 0.017072749 0.9079656
802 0.5210958 2.9178136 -0.54632441 5.140624 0.019589928 0.9079656
> cont.wt
Contrasts
Levels wt.6hr-wt.0hr wt.24hr-wt.6hr
wt.0hr -1 0
wt.6hr 1 -1
wt.24hr 0 1
mu.0hr 0 0
mu.6hr 0 0
mu.24hr 0 0
# Step 3- Which genes respond (i.e., change over time) in the mutant?
cont.mu <- makeContrasts(
"mu.6hr-mu.0hr",
"mu.24hr-mu.6hr",
levels=design)
fit2 <- contrasts.fit(fit, cont.mu)
fit2 <- eBayes(fit2)
topTableF(fit2, adjust="BH")
# Output:
mu.6hr.mu.0hr mu.24hr.mu.6hr AveExpr F P.Value adj.P.Val
203 -4.70500261 4.3795233 -0.52285214 8.753341 0.002919147 0.8967851
797 -3.39270257 -0.5862948 -0.02746123 8.328550 0.003568277 0.8967851
828 -4.25071499 1.8429580 0.34257837 7.590078 0.005125338 0.8967851
764 0.02937019 3.7623514 0.32860517 6.991332 0.006965102 0.8967851
570 -2.14961101 -2.1415450 0.52207117 6.558165 0.008764670 0.8967851
457 2.28175860 -4.5011533 -0.25060779 6.522692 0.008933944 0.8967851
593 2.57911585 1.3339224 0.18262853 6.460369 0.009240381 0.8967851
693 -0.43909594 3.8805006 -0.30368945 6.117644 0.011153404 0.8967851
414 1.76240994 -4.3652464 0.12864332 5.887795 0.012686997 0.8967851
309 -3.57816645 3.4458003 -0.09221823 5.847096 0.012982697 0.8967851
# Step 4- Which genes respond differently over time in the mutant relative to the wild-type?
cont.dif <- makeContrasts(
Dif6hr =(mu.6hr-mu.0hr)-(wt.6hr-wt.0hr),
Dif24hr=(mu.24hr-mu.6hr)-(wt.24hr-wt.6hr),
levels=design)
fit2 <- contrasts.fit(fit, cont.dif)
fit2 <- eBayes(fit2)
topTableF(fit2, adjust="BH")
# Output:
Dif6hr Dif24hr AveExpr F P.Value adj.P.Val
797 -6.0630607 1.468086 -0.02746123 8.922239 0.002699040 0.8685775
18 -1.7460726 7.130525 0.12631684 8.707761 0.002981974 0.8685775
764 -0.3370846 5.759971 0.32860517 7.512472 0.005329540 0.8685775
564 5.9827916 -4.071109 0.11052980 7.237032 0.006132233 0.8685775
24 -5.3124510 2.462141 -0.07153017 6.291352 0.010133010 0.8685775
113 -4.1366307 5.990927 0.12254199 6.138815 0.011023028 0.8685775
313 -2.0419712 6.157636 -0.12618538 5.979547 0.012047886 0.8685775
913 -4.4262959 6.195062 0.46998972 5.562351 0.015283627 0.8685775
926 -4.9695447 1.222076 0.15060787 5.496894 0.015875656 0.8685775
674 -0.9636603 -3.808384 0.13392844 5.020330 0.021058895 0.8685775
A case study using Limma package
http://www.biomedcentral.com/1756-0500/3/81
Limma package with many time points
The Limma user guide mentioned that when there are many time points for each group, it is reasonable to assume the expression changes smoothly over time rather than making discrete jumps from one time point to another. This type of time course can be analyzed by fitting a temporal trend using a regression spline or a polynomial.
Consider the case with 2 groups. Each group has 16 time points. We use a cubic spline curve with a modest number of knots. Choosing effective degrees of freedom to be in range 3-5 is reasonable.
library(splines) X <- ns(targets$Time, df =5) Group <- factor(targets$Group) design <- model.matrix(~Group*X) fit <- lmFit(y, design) fit <- eBayes(fit) # This creates a model with 12 parameters, with the last 5 corresponding # to interaction, i.e., to differences in the curves between groups. # To detect genes with different time trends for treatment vs control: topTable(fit, coef=8:12)
This conducts a moderated F-test for each gene on 5 df, which can detect very general differences between the treatment and control curves.
Note that for this analysis, it is not necessary to have replicates, nor is it necessary for the two treatment groups to be observed at identical time points.