Revision as of 10:55, 7 January 2021

Basic

ICC: intra-class correlation

https://en.wikipedia.org/wiki/Intraclass_correlation (the random effect [math]\displaystyle{ \alpha_j }[/math] in the one-way random model should be subjects, not raters)
- Intraclass Correlation from Statistics How To
- Shrout, P.E., Fleiss, J.L. (1979), Intraclass correlation: uses in assessing rater reliability, Psychological Bulletin, 86, 420-428.
- ICC(1,1): each subject is measured by a different set of k randomly selected raters;
- ICC(2,1): k raters are randomly selected, then, each subject is measured by the same set of k raters;
- ICC(3,1): similar to ICC(2,1) but k raters are fixed.
[math]\displaystyle{ Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, }[/math] where [math]\displaystyle{ \alpha_i }[/math] is the random effect from subject i,

[math]\displaystyle{ ICC(1,1) = \frac{\sigma_\alpha^2}{\sigma_\alpha^2+\sigma_\varepsilon^2}. }[/math]

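See also the formula and estimation here (https://www.uvm.edu/~statdhtx/StatPages/icc/icc.html). The intuition can be found in these examples (https://stats.stackexchange.com/a/287391), where the first case has a large [math]\displaystyle{ \sigma_\alpha^2 }[/math] and the second does not.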
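
As a short sketch of why this ratio is a correlation, assume (as is usual for the one-way random model, although it is not stated explicitly above) that [math]\displaystyle{ \alpha_i }[/math] and [math]\displaystyle{ \varepsilon_{ij} }[/math] are mutually independent with variances [math]\displaystyle{ \sigma_\alpha^2 }[/math] and [math]\displaystyle{ \sigma_\varepsilon^2 }[/math]. Then, for two different ratings [math]\displaystyle{ j \neq k }[/math] of the same subject i,

[math]\displaystyle{ Cov(Y_{ij}, Y_{ik}) = Cov(\alpha_i + \varepsilon_{ij}, \alpha_i + \varepsilon_{ik}) = Var(\alpha_i) = \sigma_\alpha^2, \qquad Var(Y_{ij}) = Var(Y_{ik}) = \sigma_\alpha^2 + \sigma_\varepsilon^2, }[/math]

[math]\displaystyle{ Corr(Y_{ij}, Y_{ik}) = \frac{Cov(Y_{ij}, Y_{ik})}{\sqrt{Var(Y_{ij}) Var(Y_{ik})}} = \frac{\sigma_\alpha^2}{\sigma_\alpha^2+\sigma_\varepsilon^2} = ICC(1,1). }[/math]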

ICC(1,1) as defined above is the intraclass correlation [math]\displaystyle{ Corr(Y_{ij}, Y_{ik}) }[/math]; the derivation is given in Calculating the Intraclass Correlation Coefficient (ICC) in SAS. Excellent!!
Intraclass correlation – A discussion and demonstration of basic features PLOS 2019. It gives a good comprehensive review with formulas and some simulation studies.
Intraclass correlation coefficient vs. F-test (one-way ANOVA)?
Good ICC, bad CV or vice-versa, how to interpret? Two examples are given to show the difference between between-sample variability (think of using one value, such as the average, to represent a sample) and within-sample variability.
Intraclass Correlation Coefficient in R. icc() [irr package] and the function ICC() [psych package] are considered with a simple example.
Simulation: https://stats.stackexchange.com/q/135345 and chapter 4 of this book: Headrick, T. C. (2010). Statistical simulation: Power method polynomials and other transformations. Boca Raton, FL: Chapman & Hall
https://bookdown.org/roback/bookdown-bysh/ch-corrdata.html
https://www.sas.com/content/dam/SAS/en_ca/User%20Group%20Presentations/Health-User-Groups/Maki-InterraterReliability-Apr2014.pdf
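
A minimal base-R simulation sketch of the one-way random model above may help tie the pieces together. It is not taken from any of the linked sources; the sample size, variances, and seed are arbitrary assumptions chosen for illustration. It estimates ICC(1,1) from the one-way ANOVA mean squares and compares it with the true variance ratio and with the plain correlation between two ratings of the same subject.

```r
# Simulate Y_ij = mu + alpha_i + eps_ij and estimate ICC(1,1).
# All numbers below (n, k, SDs, seed) are illustrative assumptions.
set.seed(123)
n <- 2000                     # number of subjects
k <- 4                        # ratings per subject
sigma_alpha <- 2              # SD of the subject random effect
sigma_eps   <- 1              # SD of the residual
true_icc <- sigma_alpha^2 / (sigma_alpha^2 + sigma_eps^2)

alpha <- rnorm(n, 0, sigma_alpha)
# alpha is recycled down the columns, so every rating in row i shares alpha_i
y <- 10 + alpha + matrix(rnorm(n * k, 0, sigma_eps), nrow = n)

bms <- k * var(rowMeans(y))                       # between-subject mean square
wms <- sum((y - rowMeans(y))^2) / (n * (k - 1))   # within-subject mean square
est_icc <- (bms - wms) / (bms + (k - 1) * wms)    # ICC(1) estimator

pair_cor <- cor(y[, 1], y[, 2])                   # correlation of two ratings

c(true = true_icc, estimated = est_icc, pairwise_cor = pair_cor)
```

With a large number of subjects, all three numbers should be close to each other, which is the sense in which ICC(1,1) is an intraclass correlation.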

R packages

The main input is a matrix of n subjects x p raters. Each rater is a class/group.

psych: ICC()
irr: icc() for one-way or two-way model. This works on my data 30k by 58. The default option gives ICC(1). It can also compute ICC(A,1)/agreement and ICC(C,1)/consistency.
psy: icc(). No options are provided. I got an error: vector memory exhausted (limit reached?) when the data is 30k by 58.
rptR:

Examples

psych package data

It shows ICC1 = ICC(1,1)

R> library(psych)
R> (o <- ICC(anxiety, lmer=FALSE) )
Call: ICC(x = anxiety, lmer = FALSE)

Intraclass correlation coefficients 
                         type  ICC   F df1 df2     p lower bound upper bound
Single_raters_absolute   ICC1 0.18 1.6  19  40 0.094     -0.0405        0.44
Single_random_raters     ICC2 0.20 1.8  19  38 0.056     -0.0045        0.45
Single_fixed_raters      ICC3 0.22 1.8  19  38 0.056     -0.0073        0.48
Average_raters_absolute ICC1k 0.39 1.6  19  40 0.094     -0.1323        0.70
Average_random_raters   ICC2k 0.43 1.8  19  38 0.056     -0.0136        0.71
Average_fixed_raters    ICC3k 0.45 1.8  19  38 0.056     -0.0222        0.73

 Number of subjects = 20     Number of Judges =  3

R> library(irr)
R> (o2 <- icc(anxiety, model="oneway")) # subjects are treated as random effects
 Single Score Intraclass Correlation

   Model: oneway 
   Type : consistency 

   Subjects = 20 
     Raters = 3 
     ICC(1) = 0.175

 F-Test, H0: r0 = 0 ; H1: r0 > 0 
   F(19,40) = 1.64 , p = 0.0939 

 95%-Confidence Interval for ICC Population Values:
  -0.077 < ICC < 0.484

R> o$results["Single_raters_absolute", "ICC"]
[1] 0.1750224
R> o2$value
[1] 0.1750224

R> icc(anxiety, model="twoway", type = "consistency")
 Single Score Intraclass Correlation

   Model: twoway 
   Type : consistency 

   Subjects = 20 
     Raters = 3 
   ICC(C,1) = 0.216

 F-Test, H0: r0 = 0 ; H1: r0 > 0 
   F(19,38) = 1.83 , p = 0.0562 

 95%-Confidence Interval for ICC Population Values:
  -0.046 < ICC < 0.522
R> icc(anxiety, model="twoway", type = "agreement")
 Single Score Intraclass Correlation

   Model: twoway 
   Type : agreement 

   Subjects = 20 
     Raters = 3 
   ICC(A,1) = 0.198

 F-Test, H0: r0 = 0 ; H1: r0 > 0 
 F(19,39.7) = 1.83 , p = 0.0543 

 95%-Confidence Interval for ICC Population Values:
  -0.039 < ICC < 0.494

library(magrittr)
library(tidyr)    # for pivot_longer()
library(ggplot2)

set.seed(1)
r1 <- round(rnorm(20, 10, 4))
r2 <- round(r1 + 10 + rnorm(20, 0, 2))
r3 <- round(r1 + 20 + rnorm(20, 0, 2))
df <- data.frame(r1, r2, r3) %>% pivot_longer(cols=1:3)
df %>% ggplot(aes(x=name, y=value)) + geom_point()

df0 <- cbind(r1, r2, r3)
icc(df0, model="oneway")  #  ICC(1) = -0.262  --> Negative.
                          #  A shift between raters can distort the ICC. See Wikipedia.
icc(df0, model="twoway", type = "consistency")  # ICC(C,1) = 0.846 --> Makes sense
icc(df0, model="twoway", type = "agreement")    # ICC(A,1) = 0.106 --> Low, because agreement
                                                # also penalizes the shift between raters

ICC(df0)
Call: ICC(x = df0, lmer = T)

Intraclass correlation coefficients 
                         type   ICC     F df1 df2       p lower bound upper bound
Single_raters_absolute   ICC1 -0.26  0.38  19  40 9.9e-01     -0.3613      -0.085
Single_random_raters     ICC2  0.11 17.43  19  38 2.9e-13      0.0020       0.293
Single_fixed_raters      ICC3  0.85 17.43  19  38 2.9e-13      0.7353       0.920
Average_raters_absolute ICC1k -1.65  0.38  19  40 9.9e-01     -3.9076      -0.307
Average_random_raters   ICC2k  0.26 17.43  19  38 2.9e-13      0.0061       0.555
Average_fixed_raters    ICC3k  0.94 17.43  19  38 2.9e-13      0.8929       0.972

 Number of subjects = 20     Number of Judges =  3
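
The effect of the rater shift on the one-way ICC can be sketched in base R. The helper name icc_oneway is hypothetical (it is not a function from irr or psych); it implements the (BMS - WMS) / (BMS + (k-1)WMS) estimator. Centering each rater's column removes the shift, so the one-way model no longer counts it as within-subject noise.

```r
# Hypothetical helper: one-way ICC(1) from a subjects x raters matrix
icc_oneway <- function(x) {
  k <- ncol(x)
  bms <- k * var(rowMeans(x))                             # between-subject mean square
  wms <- sum((x - rowMeans(x))^2) / (nrow(x) * (k - 1))   # within-subject mean square
  (bms - wms) / (bms + (k - 1) * wms)
}

# Same simulated data as above: raters 2 and 3 are shifted by +10 and +20
set.seed(1)
r1 <- round(rnorm(20, 10, 4))
r2 <- round(r1 + 10 + rnorm(20, 0, 2))
r3 <- round(r1 + 20 + rnorm(20, 0, 2))
x <- cbind(r1, r2, r3)

# Remove each rater's mean so that only subject differences and noise remain
x_centered <- sweep(x, 2, colMeans(x))

icc_raw      <- icc_oneway(x)
icc_centered <- icc_oneway(x_centered)
c(raw = icc_raw, centered = icc_centered)
```

The raw value is negative because the large differences between rater means inflate the within-subject mean square; after centering, the estimate is high, in line with the consistency ICC.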

Wine rating

Intraclass Correlation: Multiple Approaches from David C. Howell. The data appeared in the paper by Shrout and Fleiss (1979).

> library(psych); library(lme4)
> rating <- matrix(c(9,    2,   5,    8,
                   6,    1,   3,    2,
                   8,    4,   6,    8,
                   7,    1,   2,    6,
                   10,   5,   6,    9,
                   6,   2,   4,    7), ncol=4, byrow=TRUE)
> (o <- ICC(rating))
> o$results[, 1:2]
                         type       ICC
Single_raters_absolute   ICC1 0.1657423  # match with icc(, "oneway")
Single_random_raters     ICC2 0.2897642  # match with icc(, "twoway", "agreement")
Single_fixed_raters      ICC3 0.7148415  # match with icc(, "twoway", "consistency")
Average_raters_absolute ICC1k 0.4427981
Average_random_raters   ICC2k 0.6200510
Average_fixed_raters    ICC3k 0.9093159

# Plot
> rating2 <- data.frame(rating) %>% 
  dplyr::bind_cols(data.frame(subj = paste0("s", 1:nrow(rating)))) %>% 
  tidyr::pivot_longer(1:4, names_to="group", values_to="y")
> rating2 %>% ggplot(aes(x=group, y=y)) + geom_point()

> library(irr)
> icc(rating, "oneway")
 Single Score Intraclass Correlation

   Model: oneway 
   Type : consistency 

   Subjects = 6 
     Raters = 4 
     ICC(1) = 0.166

 F-Test, H0: r0 = 0 ; H1: r0 > 0 
    F(5,18) = 1.79 , p = 0.165 

 95%-Confidence Interval for ICC Population Values:
  -0.133 < ICC < 0.723

> icc(rating, "twoway", "agreement")
 Single Score Intraclass Correlation

   Model: twoway 
   Type : agreement 

   Subjects = 6 
     Raters = 4 
   ICC(A,1) = 0.29

 F-Test, H0: r0 = 0 ; H1: r0 > 0 
  F(5,4.79) = 11 , p = 0.0113 

 95%-Confidence Interval for ICC Population Values:
  0.019 < ICC < 0.761

> icc(rating, "twoway", "consistency")
 Single Score Intraclass Correlation

   Model: twoway 
   Type : consistency 

   Subjects = 6 
     Raters = 4 
   ICC(C,1) = 0.715

 F-Test, H0: r0 = 0 ; H1: r0 > 0 
    F(5,15) = 11 , p = 0.000135 

 95%-Confidence Interval for ICC Population Values:
  0.342 < ICC < 0.946

> anova(aov(y ~ subj + group, rating2))
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value    Pr(>F)    
subj       5 56.208  11.242  11.027 0.0001346 ***
group      3 97.458  32.486  31.866 9.454e-07 ***
Residuals 15 15.292   1.019                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> (11.242 - (97.458+15.292)/18) / (11.242 + 3*(97.458+15.292)/18)
[1] 0.165751   # ICC(1) = (BMS - WMS) / (BMS + (k-1)WMS)
               # k = number of raters

> (11.242 - 1.019) / (11.242 + 3*1.019 + 4*(32.486-1.019)/6)
[1] 0.2897922  # ICC(2,1) = (BMS - EMS) / (BMS + (k-1)EMS + k(JMS-EMS)/n)
               # n = number of subjects/targets

> (11.242 - 1.019) / (11.242 + 3*1.019)
[1] 0.7149451  # ICC(3,1) = (BMS - EMS) / (BMS + (k-1)EMS)
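
The three hand calculations above can be wrapped into one small base-R function that works directly on the subjects x raters matrix, which avoids the rounding of the mean squares. This is only a sketch: the function name icc_shrout_fleiss is hypothetical, and it covers the single-rater forms only.

```r
# Hypothetical helper: ICC(1,1), ICC(2,1), ICC(3,1) from a subjects x raters matrix
icc_shrout_fleiss <- function(x) {
  n <- nrow(x)                       # number of subjects/targets
  k <- ncol(x)                       # number of raters
  grand <- mean(x)
  ss_subj  <- k * sum((rowMeans(x) - grand)^2)
  ss_rater <- n * sum((colMeans(x) - grand)^2)
  ss_err   <- sum((x - grand)^2) - ss_subj - ss_rater
  bms <- ss_subj / (n - 1)                    # between-subject mean square
  jms <- ss_rater / (k - 1)                   # between-rater (judge) mean square
  ems <- ss_err / ((n - 1) * (k - 1))         # residual mean square
  wms <- (ss_rater + ss_err) / (n * (k - 1))  # within-subject mean square
  c(ICC1 = (bms - wms) / (bms + (k - 1) * wms),
    ICC2 = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n),
    ICC3 = (bms - ems) / (bms + (k - 1) * ems))
}

rating <- matrix(c(9,  2, 5, 8,
                   6,  1, 3, 2,
                   8,  4, 6, 8,
                   7,  1, 2, 6,
                   10, 5, 6, 9,
                   6,  2, 4, 7), ncol = 4, byrow = TRUE)
est <- icc_shrout_fleiss(rating)
round(est, 4)
```

The unrounded results should agree with the ICC1, ICC2, and ICC3 values reported by psych::ICC() for the same data.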

Wine rating2

Intraclass correlation (from Real Statistics Using Excel) with a simple example.

R> wine <- cbind(c(1,1,3,6,6,7,8,9), c(2,3,8,4,5,5,7,9), 
                 c(0,3,1,3,5,6,7,9), c(1,2,4,3,6,2,9,8))
R> icc(wine, model="oneway")
 Single Score Intraclass Correlation

   Model: oneway 
   Type : consistency 

   Subjects = 8 
     Raters = 4 
     ICC(1) = 0.728

 F-Test, H0: r0 = 0 ; H1: r0 > 0 
    F(7,24) = 11.7 , p = 2.18e-06 

 95%-Confidence Interval for ICC Population Values:
  0.434 < ICC < 0.927

# For one-way random model, the order of raters is not important
R> wine2 <- wine
R> for(j in 1:8) wine2[j, ] <- sample(wine[j,])
R> icc(wine2, model="oneway")
 Single Score Intraclass Correlation

   Model: oneway 
   Type : consistency 

   Subjects = 8 
     Raters = 4 
     ICC(1) = 0.728

 F-Test, H0: r0 = 0 ; H1: r0 > 0 
    F(7,24) = 11.7 , p = 2.18e-06 

 95%-Confidence Interval for ICC Population Values:
  0.434 < ICC < 0.927

R> icc(wine, model="twoway", type="agreement")
 Single Score Intraclass Correlation

   Model: twoway 
   Type : agreement 

   Subjects = 8 
     Raters = 4 
   ICC(A,1) = 0.728

 F-Test, H0: r0 = 0 ; H1: r0 > 0 
    F(7,24) = 11.8 , p = 2.02e-06 

 95%-Confidence Interval for ICC Population Values:
  0.434 < ICC < 0.927

R> icc(wine, model="twoway", type="consistency")
 Single Score Intraclass Correlation

   Model: twoway 
   Type : consistency 

   Subjects = 8 
     Raters = 4 
   ICC(C,1) = 0.729

 F-Test, H0: r0 = 0 ; H1: r0 > 0 
    F(7,21) = 11.8 , p = 5.03e-06 

 95%-Confidence Interval for ICC Population Values:
  0.426 < ICC < 0.928

Two-way fixed effects model

R> wine3 <- data.frame(wine) %>% 
            dplyr::bind_cols(data.frame(subj = paste0("s", 1:8))) %>% 
            tidyr::pivot_longer(1:4, names_to="group", values_to="y")
R> wine3 %>% ggplot(aes(x=group, y=y)) + geom_point()

R> anova(aov(y ~ subj + group, data = wine3))
Analysis of Variance Table

Response: y
          Df  Sum Sq Mean Sq F value    Pr(>F)    
subj       7 188.219 26.8884 11.7867 5.026e-06 ***
group      3   7.344  2.4479  1.0731    0.3818    
Residuals 21  47.906  2.2813                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R> anova(aov(y ~ group + subj, data = wine3))
Analysis of Variance Table

Response: y
          Df  Sum Sq Mean Sq F value    Pr(>F)    
group      3   7.344  2.4479  1.0731    0.3818    
subj       7 188.219 26.8884 11.7867 5.026e-06 ***
Residuals 21  47.906  2.2812                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R> library(car)
R> Anova(aov(y ~ subj + group, data = wine3))
Anova Table (Type II tests)

Response: y
           Sum Sq Df F value    Pr(>F)    
subj      188.219  7 11.7867 5.026e-06 ***
group       7.344  3  1.0731    0.3818    
Residuals  47.906 21                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
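
To connect these ANOVA tables back to the ICC, the mean squares can be pulled out of the anova() result and plugged into the ICC(3,1) formula (BMS - EMS) / (BMS + (k-1)EMS). The sketch below uses base R only; the long-format data frame wine_long is a hypothetical stand-in for wine3 built without tidyr. It also checks that swapping the order of the two terms leaves the mean squares unchanged, as expected for a balanced design.

```r
wine <- cbind(c(1,1,3,6,6,7,8,9), c(2,3,8,4,5,5,7,9),
              c(0,3,1,3,5,6,7,9), c(1,2,4,3,6,2,9,8))

# Long format: one row per (subject, rater) rating; as.vector() stacks columns
wine_long <- data.frame(
  y     = as.vector(wine),
  subj  = factor(rep(seq_len(nrow(wine)), times = ncol(wine))),
  group = factor(rep(seq_len(ncol(wine)), each  = nrow(wine)))
)

tab     <- anova(aov(y ~ subj + group, data = wine_long))
tab_rev <- anova(aov(y ~ group + subj, data = wine_long))

k   <- ncol(wine)                      # number of raters
bms <- tab["subj", "Mean Sq"]          # between-subject mean square
ems <- tab["Residuals", "Mean Sq"]     # residual mean square
icc_c <- (bms - ems) / (bms + (k - 1) * ems)

# Balanced design: the term order does not change the subject mean square
same_ms <- isTRUE(all.equal(tab["subj", "Mean Sq"], tab_rev["subj", "Mean Sq"]))

c(ICC_consistency = icc_c, same_mean_square = same_ms)
```

The resulting value should match the ICC(C,1) reported by icc(wine, model="twoway", type="consistency") above.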
