ICC: Difference between revisions
Jump to navigation
Jump to search
(→Basic) |
(→Basic) |
||
Line 12: | Line 12: | ||
: <math> | : <math> | ||
ICC(1,1) = \frac{\sigma_\alpha^2}{\sigma_\alpha^2+\sigma_\varepsilon^2}. | ICC(1,1) = \frac{\sigma_\alpha^2}{\sigma_\alpha^2+\sigma_\varepsilon^2}. | ||
</math> See also the formula and estimation [https://www.uvm.edu/~statdhtx/StatPages/icc/icc.html here] | </math> | ||
: See also the formula and estimation [https://www.uvm.edu/~statdhtx/StatPages/icc/icc.html here]. The intuition can be found in [https://stats.stackexchange.com/a/287391 examples] where the 1st case has large <math>\sigma_\alpha^2</math> and the other does not. | |||
* The reason ICC(1,1) defined above is '''intraclass correlation <math>Corr(Y_{ij}, Y_{ik})</math>''' can be derived here: [https://www.sas.com/content/dam/SAS/en_ca/User%20Group%20Presentations/Health-User-Groups/Maki-InterraterReliability-Apr2014.pdf Calculating the Intraclass Correlation Coefficient (ICC) in SAS]. ''Excellent!!'' | * The reason ICC(1,1) defined above is '''intraclass correlation <math>Corr(Y_{ij}, Y_{ik})</math>''' can be derived here: [https://www.sas.com/content/dam/SAS/en_ca/User%20Group%20Presentations/Health-User-Groups/Maki-InterraterReliability-Apr2014.pdf Calculating the Intraclass Correlation Coefficient (ICC) in SAS]. ''Excellent!!'' | ||
* [https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0219854 Intraclass correlation – A discussion and demonstration of basic features] PLOS 2019. It gives a good comprehensive review with formulas and some simulation studies. | * [https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0219854 Intraclass correlation – A discussion and demonstration of basic features] PLOS 2019. It gives a good comprehensive review with formulas and some simulation studies. |
Revision as of 11:55, 7 January 2021
Basic
ICC: intra-class correlation
- https://en.wikipedia.org/wiki/Intraclass_correlation (the random effect [math]\displaystyle{ \alpha_j }[/math] in the one-way random model should be subjects, not raters)
- Intraclass Correlation from Statistics How To
- Shrout, P.E., Fleiss, J.L. (1979), Intraclass correlation: uses in assessing rater reliability, Psychological Bulletin, 86, 420-428.
- ICC(1,1): each subject is measured by a different set of k randomly selected raters?;
- ICC(2,1): k raters are randomly selected, then, each subject is measured by the same set of k raters;
- ICC(3,1): similar to ICC(2,1) but k raters are fixed.
- [math]\displaystyle{ Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, }[/math] where [math]\displaystyle{ \alpha_i }[/math] is the random effect from subject i,
- [math]\displaystyle{ ICC(1,1) = \frac{\sigma_\alpha^2}{\sigma_\alpha^2+\sigma_\varepsilon^2}. }[/math]
- See also the formula and estimation here. The intuition can be found in examples where the 1st case has large [math]\displaystyle{ \sigma_\alpha^2 }[/math] and the other does not.
- The reason ICC(1,1) defined above is intraclass correlation [math]\displaystyle{ Corr(Y_{ij}, Y_{ik}) }[/math] can be derived here: Calculating the Intraclass Correlation Coefficient (ICC) in SAS. Excellent!!
- Intraclass correlation – A discussion and demonstration of basic features PLOS 2019. It gives a good comprehensive review with formulas and some simulation studies.
- Intraclass correlation coefficient vs. F-test (one-way ANOVA)?
- Good ICC, bad CV or vice-versa, how to interpret? Two examples are given show the difference of between-sample (think of using one value such as the average to represent a sample) variability and within-sample variability.
- Intraclass Correlation Coefficient in R. icc() [irr package] and the function ICC() [psych package] are considered with a simple example.
- Simulation. https://stats.stackexchange.com/q/135345 chapter 4 of this book: Headrick, T. C. (2010). Statistical simulation: Power method polynomials and other transformations. Boca Raton, FL: Chapman & Hall
- https://bookdown.org/roback/bookdown-bysh/ch-corrdata.html
- https://www.sas.com/content/dam/SAS/en_ca/User%20Group%20Presentations/Health-User-Groups/Maki-InterraterReliability-Apr2014.pdf
R packages
The main input is a matrix of n subjects x p raters. Each rater is a class/group.
- psych: ICC()
- irr: icc() for one-way or two-way model. This works on my data 30k by 58. The default option gives ICC(1). It can also compute ICC(A,1)/agreement and ICC(C,1)/consistency.
- psy: icc(). No options are provided. I got an error: vector memory exhausted (limit reached?) when the data is 30k by 58.
- rptR:
Examples
psych package data
It shows ICC1 = ICC(1,1)
R> library(psych) R> (o <- ICC(anxiety, lmer=FALSE) ) Call: ICC(x = anxiety, lmer = FALSE) Intraclass correlation coefficients type ICC F df1 df2 p lower bound upper bound Single_raters_absolute ICC1 0.18 1.6 19 40 0.094 -0.0405 0.44 Single_random_raters ICC2 0.20 1.8 19 38 0.056 -0.0045 0.45 Single_fixed_raters ICC3 0.22 1.8 19 38 0.056 -0.0073 0.48 Average_raters_absolute ICC1k 0.39 1.6 19 40 0.094 -0.1323 0.70 Average_random_raters ICC2k 0.43 1.8 19 38 0.056 -0.0136 0.71 Average_fixed_raters ICC3k 0.45 1.8 19 38 0.056 -0.0222 0.73 Number of subjects = 20 Number of Judges = 3 R> library(irr) R> (o2 <- icc(anxiety, model="oneway")) # subjects be considered as random effects Single Score Intraclass Correlation Model: oneway Type : consistency Subjects = 20 Raters = 3 ICC(1) = 0.175 F-Test, H0: r0 = 0 ; H1: r0 > 0 F(19,40) = 1.64 , p = 0.0939 95%-Confidence Interval for ICC Population Values: -0.077 < ICC < 0.484 R> o$results["Single_raters_absolute", "ICC"] [1] 0.1750224 R> o2$value [1] 0.1750224 R> icc(anxiety, model="twoway", type = "consistency") Single Score Intraclass Correlation Model: twoway Type : consistency Subjects = 20 Raters = 3 ICC(C,1) = 0.216 F-Test, H0: r0 = 0 ; H1: r0 > 0 F(19,38) = 1.83 , p = 0.0562 95%-Confidence Interval for ICC Population Values: -0.046 < ICC < 0.522 R> icc(anxiety, model="twoway", type = "agreement") Single Score Intraclass Correlation Model: twoway Type : agreement Subjects = 20 Raters = 3 ICC(A,1) = 0.198 F-Test, H0: r0 = 0 ; H1: r0 > 0 F(19,39.7) = 1.83 , p = 0.0543 95%-Confidence Interval for ICC Population Values: -0.039 < ICC < 0.494
library(magrittr) library(ggplot2) set.seed(1) r1 <- round(rnorm(20, 10, 4)) r2 <- round(r1 + 10 + rnorm(20, 0, 2)) r3 <- round(r1 + 20 + rnorm(20, 0, 2)) df <- data.frame(r1, r2, r3) %>% pivot_longer(cols=1:3) df %>% ggplot(aes(x=name, y=value)) + geom_point() df0 <- cbind(r1, r2, r3) icc(df0, model="oneway") # ICC(1) = -0.262 --> Negative. # Shift can mess up the ICC. See wikipedia. icc(df0, model="twoway", type = "consistency") # ICC(C,1) = 0.846 --> Make sense icc(df0, model="twoway", type = "agreement") # ICC(A,1) = 0.106 --> Why? ICC(df0) Call: ICC(x = df0, lmer = T) Intraclass correlation coefficients type ICC F df1 df2 p lower bound upper bound Single_raters_absolute ICC1 -0.26 0.38 19 40 9.9e-01 -0.3613 -0.085 Single_random_raters ICC2 0.11 17.43 19 38 2.9e-13 0.0020 0.293 Single_fixed_raters ICC3 0.85 17.43 19 38 2.9e-13 0.7353 0.920 Average_raters_absolute ICC1k -1.65 0.38 19 40 9.9e-01 -3.9076 -0.307 Average_random_raters ICC2k 0.26 17.43 19 38 2.9e-13 0.0061 0.555 Average_fixed_raters ICC3k 0.94 17.43 19 38 2.9e-13 0.8929 0.972 Number of subjects = 20 Number of Judges = 3
Wine rating
Intraclass Correlation: Multiple Approaches from David C. Howell. The data appeared on the paper by Shrout and Fleiss 1979.
> library(psych); library(lme4) > rating <- matrix(c(9, 2, 5, 8, 6, 1, 3, 2, 8, 4, 6, 8, 7, 1, 2, 6, 10, 5, 6, 9, 6, 2, 4, 7), ncol=4, byrow=TRUE) > (o <- ICC(rating)) > o$results[, 1:2] type ICC Single_raters_absolute ICC1 0.1657423 # match with icc(, "oneway") Single_random_raters ICC2 0.2897642 # match with icc(, "twoway", "agreement") Single_fixed_raters ICC3 0.7148415 # match with icc(, "twoway", "consistency") Average_raters_absolute ICC1k 0.4427981 Average_random_raters ICC2k 0.6200510 Average_fixed_raters ICC3k 0.9093159 # Plot > rating2 <- data.frame(rating) %>% dplyr::bind_cols(data.frame(subj = paste0("s", 1:nrow(rating)))) %>% tidyr::pivot_longer(1:4, names_to="group", values_to="y") rating2%>% ggplot(aes(x=group, y=y)) + geom_point()
> library(irr) > icc(rating, "oneway") Single Score Intraclass Correlation Model: oneway Type : consistency Subjects = 6 Raters = 4 ICC(1) = 0.166 F-Test, H0: r0 = 0 ; H1: r0 > 0 F(5,18) = 1.79 , p = 0.165 95%-Confidence Interval for ICC Population Values: -0.133 < ICC < 0.723 > icc(rating, "twoway", "agreement") Single Score Intraclass Correlation Model: twoway Type : agreement Subjects = 6 Raters = 4 ICC(A,1) = 0.29 F-Test, H0: r0 = 0 ; H1: r0 > 0 F(5,4.79) = 11 , p = 0.0113 95%-Confidence Interval for ICC Population Values: 0.019 < ICC < 0.761 > icc(rating, "twoway", "consistency") Single Score Intraclass Correlation Model: twoway Type : consistency Subjects = 6 Raters = 4 ICC(C,1) = 0.715 F-Test, H0: r0 = 0 ; H1: r0 > 0 F(5,15) = 11 , p = 0.000135 95%-Confidence Interval for ICC Population Values: 0.342 < ICC < 0.946
> anova(aov(y ~ subj + group, rating2)) Analysis of Variance Table Response: y Df Sum Sq Mean Sq F value Pr(>F) subj 5 56.208 11.242 11.027 0.0001346 *** group 3 97.458 32.486 31.866 9.454e-07 *** Residuals 15 15.292 1.019 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 > (11.242 - (97.458+15.292)/18) / (11.242 + 3*(97.458+15.292)/18) [1] 0.165751 # ICC(1) = (BMS - WMS) / (BMS + (k-1)WMS) # k = number of raters > (11.242 - 1.019) / (11.242 + 3*1.019 + 4*(32.486-1.019)/6) [1] 0.2897922 # ICC(2,1) = (BMS - EMS) / (BMS + (k-1)EMS + k(JMS-EMS)/n) # n = number of subjects/targets > (11.242 - 1.019) / (11.242 + 3*1.019) [1] 0.7149451 # ICC(3,1)
Wine rating2
Introclass correlation (from Real Statistics Using Excel) with a simple example.
R> wine <- cbind(c(1,1,3,6,6,7,8,9), c(2,3,8,4,5,5,7,9), c(0,3,1,3,5,6,7,9), c(1,2,4,3,6,2,9,8)) R> icc(wine, model="oneway") Single Score Intraclass Correlation Model: oneway Type : consistency Subjects = 8 Raters = 4 ICC(1) = 0.728 F-Test, H0: r0 = 0 ; H1: r0 > 0 F(7,24) = 11.7 , p = 2.18e-06 95%-Confidence Interval for ICC Population Values: 0.434 < ICC < 0.927 # For one-way random model, the order of raters is not important R> wine2 <- wine R> for(j in 1:8) wine2[j, ] <- sample(wine[j,]) R> icc(wine2, model="oneway") Single Score Intraclass Correlation Model: oneway Type : consistency Subjects = 8 Raters = 4 ICC(1) = 0.728 F-Test, H0: r0 = 0 ; H1: r0 > 0 F(7,24) = 11.7 , p = 2.18e-06 95%-Confidence Interval for ICC Population Values: 0.434 < ICC < 0.927 R> icc(wine, model="twoway", type="agreement") Single Score Intraclass Correlation Model: twoway Type : agreement Subjects = 8 Raters = 4 ICC(A,1) = 0.728 F-Test, H0: r0 = 0 ; H1: r0 > 0 F(7,24) = 11.8 , p = 2.02e-06 95%-Confidence Interval for ICC Population Values: 0.434 < ICC < 0.927 R> icc(wine, model="twoway", type="consistency") Single Score Intraclass Correlation Model: twoway Type : consistency Subjects = 8 Raters = 4 ICC(C,1) = 0.729 F-Test, H0: r0 = 0 ; H1: r0 > 0 F(7,21) = 11.8 , p = 5.03e-06 95%-Confidence Interval for ICC Population Values: 0.426 < ICC < 0.928
Two-way fixed effects model
R> wine3 <- data.frame(wine) %>% dplyr::bind_cols(data.frame(subj = paste0("s", 1:8))) %>% tidyr::pivot_longer(1:4, names_to="group", values_to="y") R> wine3 %>% ggplot(aes(x=group, y=y)) + geom_point() R> anova(aov(y ~ subj + group, data = wine3)) Analysis of Variance Table Response: y Df Sum Sq Mean Sq F value Pr(>F) subj 7 188.219 26.8884 11.7867 5.026e-06 *** group 3 7.344 2.4479 1.0731 0.3818 Residuals 21 47.906 2.2813 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 R> anova(aov(y ~ group + subj, data = wine3)) Analysis of Variance Table Response: y Df Sum Sq Mean Sq F value Pr(>F) group 3 7.344 2.4479 1.0731 0.3818 subj 7 188.219 26.8884 11.7867 5.026e-06 *** Residuals 21 47.906 2.2812 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 R> library(car) R> Anova(aov(y ~ subj + group, data = wine3)) Anova Table (Type II tests) Response: y Sum Sq Df F value Pr(>F) subj 188.219 7 11.7867 5.026e-06 *** group 7.344 3 1.0731 0.3818 Residuals 47.906 21 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1