# The Equivalence of Weighted Kappa and the Intraclass Correlation Coefficient as Measures of Reliability

@article{Fleiss1973TheEO, title={The Equivalence of Weighted Kappa and the Intraclass Correlation Coefficient as Measures of Reliability}, author={Joseph L. Fleiss and Jacob Cohen}, journal={Educational and Psychological Measurement}, year={1973}, volume={33}, pages={613--619} }

… or weighted kappa (Spitzer, Cohen, Fleiss and Endicott, 1967; Cohen, 1968a). Kappa is the proportion of agreement corrected for chance, scaled to vary from -1 to +1: a negative value indicates poorer than chance agreement, zero indicates exactly chance agreement, and a positive value indicates better than chance agreement. A value of unity indicates perfect agreement. The use of kappa implicitly assumes that all disagreements are equally serious. When the investigator can specify the…
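The chance correction described above can be illustrated with a minimal sketch. It assumes the standard disagreement-weight formulation, in which kappa is one minus the ratio of observed to chance-expected weighted disagreement. The function name `weighted_kappa` and the `weights` argument are hypothetical names chosen for this example, not taken from the paper. With `"unweighted"` weights every disagreement counts equally (plain kappa); `"quadratic"` weights are the choice under which weighted kappa is related to the intraclass correlation coefficient.

```python
import numpy as np


def weighted_kappa(table, weights="quadratic"):
    """Cohen's (weighted) kappa from a k x k table of joint rating counts.

    Hypothetical helper for illustration: rows are rater A's categories,
    columns are rater B's categories, in the same ordinal order.
    """
    p = np.asarray(table, dtype=float)
    p = p / p.sum()                       # observed joint proportions
    k = p.shape[0]
    row = p.sum(axis=1)                   # rater A marginal proportions
    col = p.sum(axis=0)                   # rater B marginal proportions
    expected = np.outer(row, col)         # proportions expected by chance

    i, j = np.indices((k, k))
    if weights == "unweighted":
        d = (i != j).astype(float)        # all disagreements equally serious
    elif weights == "linear":
        d = np.abs(i - j).astype(float)   # penalty grows with category distance
    elif weights == "quadratic":
        d = ((i - j) ** 2).astype(float)  # penalty grows with squared distance
    else:
        raise ValueError("weights must be 'unweighted', 'linear' or 'quadratic'")

    # One minus the ratio of observed to chance-expected weighted disagreement.
    return 1.0 - (d * p).sum() / (d * expected).sum()


if __name__ == "__main__":
    # Made-up 2 x 2 counts: observed agreement 0.7, chance agreement 0.5.
    print(weighted_kappa([[20, 5], [10, 15]], weights="unweighted"))
```

For the made-up 2 × 2 table in the usage lines, observed agreement is 0.7 and chance agreement is 0.5, so kappa is (0.7 - 0.5) / (1 - 0.5) = 0.4. With only two categories the three weighting schemes coincide, since every disagreement has distance one.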

#### 2,701 Citations

A Note on the Interpretation of Weighted Kappa and its Relations to Other Rater Agreement Statistics for Metric Scales

- Psychology
- 2004

This article presents a formula for weighted kappa in terms of rater means, rater variances, and the rater covariance that is particularly helpful in emphasizing that weighted kappa is an absolute…

Conditional inequalities between Cohen's kappa and weighted kappas

- Mathematics
- 2013

Cohen’s kappa and weighted kappa are two standard tools for describing the degree of agreement between two observers on a categorical scale. For agreement tables with three or more…

Some Paradoxical Results for the Quadratically Weighted Kappa

- Mathematics
- 2012

The quadratically weighted kappa is the most commonly used weighted kappa statistic for summarizing interrater agreement on an ordinal scale. The paper presents several properties of the…

Beyond kappa: A review of interrater agreement measures

- Mathematics
- 1999

In 1960, Cohen introduced the kappa coefficient to measure chance-corrected nominal scale agreement between two raters. Since then, numerous extensions and generalizations of this interrater…

Agreement among 2 × 2 Agreement Indices

- Mathematics
- 1984

A variety of measures of reliability for two-category nominal scales are reviewed and compared. It is shown that upon correcting these indices for chance agreement, there are only five distinct…

Equivalences of weighted kappas for multiple raters

- Mathematics
- 2012

Cohen’s unweighted kappa and weighted kappa are popular descriptive statistics for measuring agreement between two raters on a categorical scale. With m ≥ 3 raters, there are several views…

A note on Cohen’s weighted kappa coefficient of agreement with linear weights

- Mathematics
- 2009

Vanbelle and Albert [S. Vanbelle, A. Albert, A note on the linearly weighted kappa coefficient for ordinal scales, Statistical Methodology 6 (2008) 157–163] showed that the observed and…

Weighted Specific-Category Kappa Measure of Interobserver Agreement

- Psychology, Medicine
- Psychological reports
- 2003

A kappa-based weighted measure (Kws) of agreement on a specific category s is proposed, with Kw being a weighted average of all the Kws values. Both measures are suitable for ordinal categories because of the weights used.

Chance-corrected measures for 2 × 2 tables that coincide with weighted kappa.

- Mathematics, Medicine
- The British journal of mathematical and statistical psychology
- 2011

This paper presents the general function, linear in both numerator and denominator, that becomes weighted kappa after correction for chance.

Comparison of the Null Distributions of Weighted Kappa and the C Ordinal Statistic

- Mathematics
- 1977

It frequently occurs in psychological research that an investigator is interested in assessing the extent of interrater agreement when the data are measured on an ordinal scale. This Monte Carlo…

#### References

Showing 1-10 of 16 references

Large sample standard errors of kappa and weighted kappa.

- Mathematics
- 1969

The statistics kappa (Cohen, 1960) and weighted kappa (Cohen, 1968) were introduced to provide coefficients of agreement between two raters for nominal scales. Kappa is appropriate when all…

Bivariate Agreement Coefficients for Reliability of Data

- Mathematics
- 1970

The quality of data in content analysis, in surveys with open-ended questions, in the observation of unstructured social events, and so on, critically depends on the reliability with which primary…

A Coefficient of Agreement for Nominal Scales

- Psychology
- 1960

CONSIDER Table 1. It represents in its formal characteristics a situation which arises in the clinical-social-personality areas of psychology, where it frequently occurs that the only useful level of…

Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit.

- Mathematics, Medicine
- Psychological bulletin
- 1968

The Kw provides for the incorporation of ratio-scaled degrees of disagreement (or agreement) to each of the cells of the k × k table of joi…

Moments of the Statistics Kappa and Weighted Kappa

- Mathematics
- 1968

This paper considers the mean and variance of the two statistics, kappa and weighted kappa, which are useful in measuring agreement between two raters, in the situation where they independently…

Quantification of agreement in psychiatric diagnosis. A new approach.

- Psychology, Medicine
- Archives of general psychiatry
- 1967

As generally used, all of the methods used for quantifying the salient features of the data suffer from one or more deficiencies which are illustrated using the hypothetical data of Table 1.

Multiple regression as a general data-analytic system.

- Mathematics
- 1968

Techniques for using multiple regression (MR) as a general variance-accounting procedure of great flexibility, power, and fidelity to research aims in both manipulative and observational…

Mental status schedule. Properties of factor-analytically derived scales.

- Psychology, Medicine
- Archives of general psychiatry
- 1967

The latest version of the MSS, designated Form A, is currently being used in a number of projects involving such varied problems as the research evaluation of treatment, case finding, routine admission assessment, and the phenomenology of mental disorders.

DIAGNO. A computer program for psychiatric diagnosis utilizing the differential diagnostic procedure.

- Medicine
- Archives of general psychiatry
- 1968

The availability of a computer program for psychiatric diagnosis with demonstrated validity would make possible meaningful comparisons of the diagnostic composition of various populations.

The analysis of proximities: Multidimensional scaling with an unknown distance function. I.

- 1962

A computer program is described that is designed to reconstruct the metric configuration of a set of points in Euclidean space on the basis of essentially nonmetric information about that…