Bootstrap
General
- Bootstrap from Wikipedia.
- This contains an overview of different methods for computing bootstrap confidence intervals.
- boot.ci() from the 'boot' package provides a short explanation for different methods for computing bootstrap confidence intervals.
- Bootstrapping made easy and tidy with slipper
- bootstrap package. Functions for the book "An Introduction to the Bootstrap" by B. Efron and R. Tibshirani, 1993.
- boot package. Functions and datasets for bootstrapping from the book Bootstrap Methods and Their Application by A. C. Davison and D. V. Hinkley (1997, CUP). Short course material can be found here. The main functions are boot() and boot.ci().
- https://www.rdocumentation.org/packages/boot/versions/1.3-20
- R in Action Nonparametric bootstrapping
  # Compute the bootstrapped 95% confidence interval for R-squared in the linear regression
  library(boot)

  rsq <- function(data, indices, formula) {
    d <- data[indices, ]  # allows boot to select sample
    fit <- lm(formula, data = d)
    return(summary(fit)$r.squared)
  }
  # 'formula' is optional; whether it is needed depends on the problem

  # bootstrapping with 1000 replications
  set.seed(1234)
  bootobject <- boot(data = mtcars, statistic = rsq, R = 1000, formula = mpg ~ wt + disp)
  plot(bootobject)  # or plot(bootobject, index = 1) if we have multiple statistics
  ci <- boot.ci(bootobject, conf = .95, type = c("perc", "bca"))
  # default type is "all" which contains c("norm", "basic", "stud", "perc", "bca").
  # 'bca' (Bias Corrected and Accelerated) by Efron 1987 uses
  # percentiles but adjusted to account for bias and skewness.
  ci
  # Level     Percentile            BCa
  # 95%   ( 0.6838,  0.8833 )   ( 0.6344,  0.8549 )
  # Calculations and Intervals on Original Scale
  # Some BCa intervals may be unstable
  ci$bca[4:5]
  # [1] 0.6343589 0.8549305

  # the midpoints of the two intervals are not the same
  mean(c(0.6838, 0.8833))
  # [1] 0.78355
  mean(c(0.6344, 0.8549))
  # [1] 0.74465
  summary(lm(mpg ~ wt + disp, data = mtcars))$r.squared
  # [1] 0.7809306
- Resampling Methods in R: The boot Package by Canty
- An introduction to bootstrap with applications with R by Davison and Kuonen.
- http://people.tamu.edu/~alawing/materials/ESSM689/Btutorial.pdf
- http://statweb.stanford.edu/~tibs/sta305files/FoxOnBootingRegInR.pdf
- http://www.stat.wisc.edu/~larget/stat302/chap3.pdf
- https://www.stat.cmu.edu/~cshalizi/402/lectures/08-bootstrap/lecture-08.pdf. Variance, se, bias, confidence interval (basic, percentile), hypothesis testing, parametric & non-parametric bootstrap, bootstrapping regression models.
- Understanding Bootstrap Confidence Interval Output from the R boot Package which covers the nonparametric and parametric bootstrap.
- http://www.math.ntu.edu.tw/~hchen/teaching/LargeSample/references/R-bootstrap.pdf No package is used
- http://web.as.uky.edu/statistics/users/pbreheny/621/F10/notes/9-21.pdf Bootstrap confidence interval
- http://www-stat.wharton.upenn.edu/~stine/research/spida_2005.pdf
- Optimism corrected bootstrapping (Harrell et al 1996)
- Adjusting for optimism/overfitting in measures of predictive ability using bootstrapping
- Part 1: Optimism corrected bootstrapping: a problematic method
- Part 2: Optimism corrected bootstrapping is definitely bias, further evidence
- Part 3: Two more implementations of optimism corrected bootstrapping show shocking bias
- Part 4: Why does bias occur in optimism corrected bootstrapping?
- Part 5: Code corrections to optimism corrected bootstrapping series
- Bootstrapping Part 2: Calculating p-values!!! from StatQuest
- Using bootstrapped sampling to assess variability in score predictions. The rsample (General Resampling Infrastructure) package was used.
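The optimism correction referenced in the list above can be sketched in base R. This is a minimal illustration of the general recipe (apparent performance minus the average bootstrap optimism), not the code from any of the linked posts; the helper `rsq_on`, the choice of `B = 200`, and the use of `mtcars` with `mpg ~ wt + disp` are assumptions made for the example. The linked series debates how biased this estimator can be, so treat the result with that in mind.

```r
# Optimism-corrected R-squared for a linear model (sketch, base R only)
set.seed(1234)
form <- mpg ~ wt + disp

# Hypothetical helper: R-squared of a fitted model evaluated on arbitrary data
rsq_on <- function(fit, data) {
  y <- data$mpg
  pred <- predict(fit, newdata = data)
  1 - sum((y - pred)^2) / sum((y - mean(y))^2)
}

# Apparent performance: fit and evaluate on the same (original) data
fit_full <- lm(form, data = mtcars)
apparent <- rsq_on(fit_full, mtcars)

B <- 200  # number of bootstrap replications (assumed value)
optimism <- replicate(B, {
  idx <- sample(nrow(mtcars), replace = TRUE)
  boot_data <- mtcars[idx, ]
  fit_b <- lm(form, data = boot_data)
  # performance on the bootstrap sample minus performance on the original data
  rsq_on(fit_b, boot_data) - rsq_on(fit_b, mtcars)
})

# Subtract the average optimism from the apparent performance
corrected <- apparent - mean(optimism)
corrected
```

The same pattern applies to other performance measures (for example AUC) by swapping out the helper that scores a fitted model on a data set.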
Nonparametric bootstrap
This is the most common bootstrap method.
The upstrap, Crainiceanu & Crainiceanu, Biostatistics 2018
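A minimal sketch of the nonparametric bootstrap without any package: resample the observed data with replacement, recompute the statistic each time, and summarize the resulting distribution. The simulated exponential data, the choice of the median as the statistic, and `B = 2000` are assumptions for illustration only.

```r
# Nonparametric bootstrap of the sample median (sketch, base R only)
set.seed(1234)
x <- rexp(100, rate = 1)  # illustrative data (assumption)

B <- 2000  # number of bootstrap replications (assumed value)
# Each replicate resamples x with replacement and recomputes the median
boot_medians <- replicate(B, median(sample(x, replace = TRUE)))

# Bootstrap standard error of the median
se_median <- sd(boot_medians)

# 95% percentile confidence interval
ci_perc <- quantile(boot_medians, c(0.025, 0.975))

se_median
ci_perc
```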
Parametric bootstrap
- A parametric bootstrap draws new samples from an assumed distribution family whose parameters are estimated from your sample.
- http://www.math.ntu.edu.tw/~hchen/teaching/LargeSample/notes/notebootstrap.pdf#page=3 No package is used
- A parametric or non-parametric bootstrap?
- https://www.stat.cmu.edu/~cshalizi/402/lectures/08-bootstrap/lecture-08.pdf#page=11
- simulatorZ Bioc package
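As a rough illustration of the parametric bootstrap described above, the sketch below assumes the data are Poisson, estimates the rate from the sample, and then simulates new samples from the fitted distribution instead of resampling the observations. The Poisson model, the sample size, and `B = 5000` are assumptions chosen for the example.

```r
# Parametric bootstrap of the sample mean under an assumed Poisson model
set.seed(1234)
x <- rpois(300, 2)  # illustrative data (assumption)

# Step 1: estimate the parameter of the assumed distribution
lambda_hat <- mean(x)  # maximum likelihood estimate of the Poisson rate

# Step 2: simulate new samples from the fitted distribution
B <- 5000  # number of bootstrap replications (assumed value)
par_means <- replicate(B, mean(rpois(length(x), lambda_hat)))

# Step 3: summarize the bootstrap distribution
se_par <- sd(par_means)

# For comparison: the standard error implied by the fitted Poisson model,
# since Var(mean) = lambda / n
se_theory <- sqrt(lambda_hat / length(x))

c(parametric_bootstrap = se_par, model_based = se_theory)
```

If the assumed family is wrong, the parametric bootstrap inherits that misspecification, which is the trade-off against the nonparametric version.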
Examples
Standard error
foo <- function() mean(sample(x, replace = TRUE))

set.seed(1234)
x <- rnorm(300)
set.seed(1)
sd(replicate(10000, foo()))
# [1] 0.05717679
sd(x)/sqrt(length(x)) # The se of mean is s/sqrt(n)
# [1] 0.05798401

set.seed(1234)
x <- rpois(300, 2)
set.seed(1)
sd(replicate(10000, foo()))
# [1] 0.08038607
sd(x)/sqrt(length(x)) # The se of mean is s/sqrt(n)
# [1] 0.08183151
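The same standard error can also be obtained with the boot package. The sketch below assumes the normal data from the example above; the statistic name `mean_stat` and `R = 2000` are choices made for illustration. The replicates are stored in the `t` component of the returned object and the statistic on the original data in `t0`.

```r
# Bootstrap standard error and bias of the mean using the boot package
library(boot)

set.seed(1234)
x <- rnorm(300)

# boot() calls the statistic with the data and a vector of resampled indices
mean_stat <- function(data, indices) mean(data[indices])

set.seed(1)
b <- boot(data = x, statistic = mean_stat, R = 2000)

se_boot <- sd(b$t[, 1])            # bootstrap standard error
bias_boot <- mean(b$t[, 1]) - b$t0 # bootstrap estimate of bias
se_formula <- sd(x) / sqrt(length(x))

c(bootstrap = se_boot, formula = se_formula, bias = bias_boot)
```

Printing `b` reports the original estimate, the bias, and the standard error in one table.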
Bootstrapping Extreme Value Estimators
Bootstrapping Extreme Value Estimators de Haan, 2022