Power: Difference between revisions

From 太極
Jump to navigation Jump to search
Line 118: Line 118:
* [https://cran.r-project.org/web/packages/ssizeRNA/index.html ssizeRNA] w/ vignette
* [https://cran.r-project.org/web/packages/ssizeRNA/index.html ssizeRNA] w/ vignette
* power.t.test(), power.anova.test(), power.prop.test() from [https://stat.ethz.ch/R-manual/R-devel/library/stats/html/00Index.html stats] package
* power.t.test(), power.anova.test(), power.prop.test() from [https://stat.ethz.ch/R-manual/R-devel/library/stats/html/00Index.html stats] package
== RNA-seq ==
* [https://academic.oup.com/bib/article/19/4/713/2920205#118987067 Feasibility of sample size calculation for RNA-seq studies] Poplawski 2018.
** The ‘Scotty’ (MATLAB) tool performed best
** ‘Scotty’, [https://cran.r-project.org/web/packages/ssizeRNA/index.html ssizeRNA] 2016 and [https://www.bioconductor.org/packages/release/bioc/html/PROPER.html PROPER] 2015 generated comparable results.
** Bi et al. showed that ssizeRNA provided a more accurate estimate of power/sample size than RnaSeqSampleSize; ssizeRNA and RnaSeqSampleSize provided results much faster than PROPER. See [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6291796/ Power and sample size calculations for high-throughput sequencing-based experiments] 2018
* [https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2445-2 Empirical assessment of the impact of sample number and read depth on RNA-Seq analysis workflow performance]
* [https://youtu.be/WW94W-DBf2U?t=1260 Number of replicates needed] is dependent on the following that varies from gene to gene
** Within group variance
** Read coverage
** Desired detectable effect size


== ScRNA-seq ==
== ScRNA-seq ==

Revision as of 08:29, 9 April 2022

Power analysis/Sample Size determination

Binomial distribution

Power analysis for default Bayesian t-tests

http://daniellakens.blogspot.com/2016/01/power-analysis-for-default-bayesian-t.html

Using simulation for power analysis

Power analysis and sample size calculation for Agriculture

http://r-video-tutorial.blogspot.com/2017/07/power-analysis-and-sample-size.html

Power calculation for proportions (shiny app)

https://juliasilge.shinyapps.io/power-app/

Derive the formula/manual calculation

[math]\displaystyle{ \begin{align} Power & = P_{\mu_1-\mu_2 = \Delta}(\frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma^2/n + \sigma^2/n}} \gt Z_{\alpha /2}) + P_{\mu_1-\mu_2 = \Delta}(\frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma^2/n + \sigma^2/n}} \lt -Z_{\alpha /2}) \\ & \approx P_{\mu_1-\mu_2 = \Delta}(\frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma^2/n + \sigma^2/n}} \gt Z_{\alpha /2}) \\ & = P_{\mu_1-\mu_2 = \Delta}(\frac{\bar{X}_1 - \bar{X}_2 - \Delta}{\sqrt{2 * \sigma^2/n}} \gt Z_{\alpha /2} - \frac{\Delta}{\sqrt{2 * \sigma^2/n}}) \\ & = \Phi(-(Z_{\alpha /2} - \frac{\Delta}{\sqrt{2 * \sigma^2/n}})) \\ & = 1 - \beta =\Phi(Z_\beta) \end{align} }[/math]

Therefore

[math]\displaystyle{ \begin{align} Z_{\beta} &= - Z_{\alpha/2} + \frac{\Delta}{\sqrt{2 * \sigma^2/n}} \\ Z_{\beta} + Z_{\alpha/2} & = \frac{\Delta}{\sqrt{2 * \sigma^2/n}} \\ 2 * (Z_{\beta} + Z_{\alpha/2})^2 * \sigma^2/\Delta^2 & = n \\ n & = 2 * (Z_{\beta} + Z_{\alpha/2})^2 * \sigma^2/\Delta^2 \end{align} }[/math]
# alpha = .05, delta = 200, n = 79.5, sigma=450
1 - pnorm(1.96 - 200*sqrt(79.5)/(sqrt(2)*450)) + pnorm(-1.96 - 200*sqrt(79.5)/(sqrt(2)*450))
# [1] 0.8
pnorm(-1.96 - 200*sqrt(79.5)/(sqrt(2)*450))
# [1] 9.58e-07
1 - pnorm(1.96 - 200*sqrt(79.5)/(sqrt(2)*450)) 
# [1] 0.8

Calculating required sample size in R and SAS

pwr package is used. For two-sided test, the formula for sample size is

[math]\displaystyle{ n_{\mbox{each group}} = \frac{2 * (Z_{\alpha/2} + Z_\beta)^2 * \sigma^2}{\Delta^2} = \frac{2 * (Z_{\alpha/2} + Z_\beta)^2}{d^2} }[/math]

where [math]\displaystyle{ Z_\alpha }[/math] is value of the Normal distribution which cuts off an upper tail probability of [math]\displaystyle{ \alpha }[/math], [math]\displaystyle{ \Delta }[/math] is the difference sought, [math]\displaystyle{ \sigma }[/math] is the presumed standard deviation of the outcome, [math]\displaystyle{ \alpha }[/math] is the type 1 error, [math]\displaystyle{ \beta }[/math] is the type II error and (Cohen's) d is the effect size - difference between the means divided by the pooled standard deviation.

# An example from http://www.stat.columbia.edu/~gelman/stuff_for_blog/c13.pdf#page=3
# Method 1.
require(pwr)
pwr.t.test(d=200/450, power=.8, sig.level=.05,
           type="two.sample", alternative="two.sided")
#
#     Two-sample t test power calculation 
#
#              n = 80.4
#              d = 0.444
#      sig.level = 0.05
#          power = 0.8
#    alternative = two.sided
#
# NOTE: n is number in *each* group

# Method 2.
2*(qnorm(.975) + qnorm(.8))^2*450^2/(200^2)
# [1] 79.5
2*(1.96 + .84)^2*450^2 / (200^2)
# [1] 79.4

And stats::power.t.test() function.

power.t.test(n = 79.5, delta = 200, sd = 450, sig.level = .05,
             type ="two.sample", alternative = "two.sided")
#
#     Two-sample t test power calculation 
#
#              n = 79.5
#          delta = 200
#             sd = 450
#      sig.level = 0.05
#          power = 0.795
#    alternative = two.sided
#
# NOTE: n is number in *each* group

R package related to power analysis

CRAN Task View: Design of Experiments

RNA-seq

ScRNA-seq

scRNA-seq

Russ Lenth Java applets

https://homepage.divms.uiowa.edu/~rlenth/Power/index.html

Bootstrap method

The upstrap Crainiceanu & Crainiceanu, Biostatistics 2018

Multiple Testing Case

Optimal Sample Size for Multiple Testing The Case of Gene Expression Microarrays

Unbalanced randomization

Can unbalanced randomization improve power?

Yes, unbalanced randomization can improve power, in some situations