# Applying the right statistics: analyses of measurement studies

@article{Bland2003ApplyingTR, title={Applying the right statistics: analyses of measurement studies}, author={J. Martin Bland and D. G. Altman}, journal={Ultrasound in Obstetrics and Gynecology}, year={2003}, volume={22} }

The study of measurement error, observer variation and agreement between different methods of measurement are frequent topics in the imaging literature. We describe the problems of some applications of correlation and regression methods to these studies, using recent examples from this literature. We use a simulated example to show how these problems and misinterpretations arise. We describe the 95% limits of agreement approach and a similar, appropriate, regression technique. We discuss the…

## 1,264 Citations

### Agreed statistics: measurement method comparison.

- Environmental Science
- Anesthesiology
- 2012

An alternative approach, based on graphical techniques and simple calculations, is described, together with the relation between this analysis and the assessment of repeatability.

### Conducting correlation analysis: important limitations and pitfalls

- Physics
- Clinical Kidney Journal
- 2021

Why the correlation coefficient is invalid for assessing agreement between two methods that aim to measure the same value is discussed, along with better alternatives such as the intraclass correlation coefficient and Bland–Altman's limits of agreement.
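The point that correlation does not measure agreement can be illustrated with a minimal sketch. The readings below are invented for illustration and do not come from any of the listed papers; `pearson_r` is a hypothetical helper written out by hand so the example needs only the standard library. A second method that always reads 5 units higher than the first is perfectly correlated with it, yet the two never agree.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Made-up readings: the second method has a constant offset of +5 units.
reference = [50.0, 55.0, 60.0, 65.0, 70.0]
shifted = [value + 5.0 for value in reference]

r = pearson_r(reference, shifted)
mean_diff = mean([s - t for s, t in zip(shifted, reference)])

print(f"correlation = {r:.3f}, mean difference = {mean_diff:.1f}")
```

The correlation is at its maximum even though every pair of readings differs by the full offset, which is why the difference between methods, rather than their correlation, is the quantity to examine.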

### Measurement Consistency from Magnetic Resonance Images

- Computer Science
- 2008

In almost all cases, using only one method is insufficient and it is recommended that several methods be used simultaneously, and in general, ANOVA performs the best.

### Reliability, repeatability and reproducibility: analysis of measurement errors in continuous variables

- Psychology
- Ultrasound in Obstetrics & Gynecology: the official journal of the International Society of Ultrasound in Obstetrics and Gynecology
- 2008

The general concepts of agreement and reliability are distinguished to aid researchers in considering which are relevant for their particular application, and the fact that reliability depends on the population in which measurements are made, and not just on the measurement errors of the measurement method is highlighted.

### Statistical Methods: Reliability Assessment and Method Comparison

- Medicine
- 2017

Common statistical methods for assessing reliability and agreement between methods, including the intraclass correlation coefficient, coefficient of variation, Bland-Altman plot, limits of agreement, percent agreement, and the kappa statistic are described.

### How Replicates Can Inform Potential Users of a Measurement Procedure about Measurement Error: Basic Concepts and Methods

- Physics
- Diagnostics
- 2021

This paper encourages clearly conceptually distinguishing between investigations of the measurement error of a single measurement procedure and the comparison between different measurement procedures or observers and describes the link to the existing general statistical methodology.

### Improvements in the application and reporting of advanced Bland–Altman methods of comparison

- Biology
- Journal of Clinical Monitoring and Computing
- 2014

A freely available implementation, accompanied by a formal description of the more advanced Bland–Altman comparison methods, is provided, and a standard reporting format is proposed that would improve the analysis and interpretation of comparison studies.

### Longitudinal MRI data analysis in presence of measurement error but absence of replicates

- Psychology
- 2018

This article proposes a novel method for the analysis of unreplicated longitudinal data under the presence of measurement errors using mixed-effect regression and develops a new EM-Variogram technique to estimate regression coefficients as well as variance components.

### The Case for Using the Repeatability Coefficient When Calculating Test–Retest Reliability

- Medicine
- PLoS ONE
- 2013

A case is made for clinicians to prefer measurement error (ME) indices, namely the Coefficient of Repeatability (CR) or the Smallest Real Difference (SRD), over relative reliability coefficients such as Pearson's r and the Intraclass Correlation Coefficient (ICC) when selecting tools to measure change and when inferring that a change is true.

## References


### Statistics Notes: Measurement error and correlation coefficients

- Physics
- BMJ
- 1996

This work considers the use of correlation coefficients to quantify measurement error, a variation between measurements of the same quantity on the same individual which has a simple clinical interpretation.

### Measuring agreement in method comparison studies

- Mathematics
- Statistical Methods in Medical Research
- 1999

The 95% limits of agreement, estimated by the mean difference ± 1.96 standard deviations of the differences, provide an interval within which 95% of differences between measurements by the two methods are expected to lie.
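The calculation summarized in this entry (mean difference plus or minus 1.96 standard deviations of the differences) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `limits_of_agreement`, the use of the sample standard deviation, and the paired readings are all assumptions made for the example.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b, z=1.96):
    """Return (bias, lower, upper) for paired measurements by two methods."""
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    # Work on the within-pair differences, not on the raw values.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)   # mean difference between the methods
    sd = stdev(diffs)    # sample SD of the differences (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


# Hypothetical paired readings, invented for illustration only.
a = [10.2, 11.5, 9.8, 12.1, 10.9, 11.3]
b = [10.0, 11.9, 9.5, 12.4, 10.4, 11.0]

bias, lower, upper = limits_of_agreement(a, b)
print(f"bias = {bias:.3f}, limits of agreement = ({lower:.3f}, {upper:.3f})")
```

Whether the resulting interval is narrow enough for the two methods to be used interchangeably is a clinical judgment, not something the calculation itself decides.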

### Measurement in Medicine: The Analysis of Method Comparison Studies

- Medicine
- 1983

This paper shall describe what is usually done, show why this is inappropriate, suggest a better approach, and ask why such studies are done so badly.

### Comparing methods of measurement: why plotting difference against standard method is misleading

- Economics
- The Lancet
- 1995

### Statistical methods for assessing agreement between two methods of clinical measurement

- Physics
- The Lancet
- 1986

### Renal volume measurements: accuracy and repeatability of US compared with that of MR imaging.

- Medicine, Physics
- Radiology
- 1999

Renal volume was underestimated with US and a comparable underestimation was found when the ellipsoid formula was applied to MR images, indicating that the inaccuracy of US renal volume measurements occurred because the kidney does not resemble an ellipsoid and was not primarily related to the imaging modality.

### Statistics Notes: Some examples of regression towards the mean

- Medicine
- BMJ
- 1994

It is shown that regression towards the mean occurs whenever the authors select an extreme group based on one variable and then measure another variable for that group, and that the mean of the extreme group is now closer to the mean of the whole population.

### Single X-ray absorptiometry: Performance characteristics and comparison with single photon absorptiometry

- Medicine
- Osteoporosis International
- 2005

The new single X-ray absorptiometry forearm bone densitometer described in this paper has performance characteristics which allow it to be used both for diagnostic purposes and for the follow-up of treatment.

### Statistics Notes: Regression towards the mean

- History
- BMJ
- 1994

The statistical term “regression,” from a Latin root meaning “going back,” was first used by Francis Galton in his paper “Regression towards Mediocrity in Hereditary Stature.”1 Galton related the…

### Pulmonary emphysema: subjective visual grading versus objective quantification with macroscopic morphometry and thin-section CT densitometry.

- Medicine, Physics
- Radiology
- 1999

Systematic overestimation and moderate interobserver agreement may compromise subjective visual grading of emphysema, which suggests that subjective visual grading should be supplemented with objective methods to achieve precise, reader-independent quantification of emphysema.