Abstract
Calculation of a confidence interval for the intraclass correlation to assess inter-rater reliability is problematic when the number of raters is small and the rater effect is not negligible. Intervals produced by existing methods are uninformative: the lower bound is often close to zero, even in cases where the reliability is good and the sample size is large. In this paper, we show that this problem is unavoidable without extra assumptions, and we propose two new approaches. The first approach assumes that the raters are sufficiently trained and is related to a sensitivity analysis. The second approach is based on a model with a fixed rater effect. Using either approach, we obtain conservative and informative confidence intervals even from samples with only two raters. We illustrate our point with data on the development of neuromotor functions in children and adolescents.