When does measurement invariance matter?

@article{Borsboom2006WhenDM,
  title={When does measurement invariance matter?},
  author={Denny Borsboom},
  journal={Medical care},
  year={2006},
  volume={44 11 Suppl 3},
  pages={S176-81}
}
  • D. Borsboom
  • Published 1 November 2006
  • Psychology
  • Medical care
The question of whether observed differences in psychometric test scores can be attributed to differences in the properties that such tests measure is relevant in many research domains; examples include the proper interpretation of differences in intelligence test scores across different generations of people,1 gender differences in affectivity,2 and cross-cultural differences in personality. This question has also generated some of the most conspicuous controversies in the social and life sciences…
Assessing relationship quality across cultures: An examination of measurement equivalence
Researchers are increasingly studying close relationships across cultural contexts. One issue that arises when applying scales originally developed in Western countries to a different cultural
The importance of measurement invariance in neurocognitive ability testing
  • J. Wicherts
  • Psychology
    The Clinical neuropsychologist
  • 2016
TLDR
It is argued that measurement invariance is a core issue in determining whether population-based norms are valid for different subgroups and is crucial for valid use of neurocognitive tests in clinical, educational, and professional practice.
The formalization of fairness: issues in testing for measurement invariance using subtest scores
Measurement invariance is an important prerequisite for the adequate comparison of group differences in test scores. In psychology, measurement invariance is typically investigated by means of linear
Stereotype threat and group differences in test performance: a question of measurement invariance.
TLDR
ST theory is related to the psychometric concept of measurement invariance and it is shown that ST effects may be viewed as a source of measurement bias and are detectable by means of multi-group confirmatory factor analysis.
Measurement invariance in confirmatory factor analysis: an illustration using IQ test performance of minorities
Measurement invariance with respect to groups is an essential aspect of the fair use of scores of intelligence tests and other psychological measurements. It is widely believed that equal factor
New Ways of Dealing with Lacking Measurement Invariance
TLDR
This chapter investigates and discusses how results of MI analyses should be interpreted and whether they should be reported on with regard to contents, and introduces an approach to examining the conditions under which comparison among cultural groups is possible even if MI is lacking.
Examining the measurement equivalence of the Conditional Reasoning Test for Aggression across U.S. and Croatian samples
The Conditional Reasoning Test for Aggression (CRT-A; James et al., 2005) is based on the ideas that aggressive individuals use motive-based cognitive biases to see their behavior as reasonable and
Scale length does matter: Recommendations for measurement invariance testing with categorical factor analysis and item response theory approaches.
TLDR
This study compared the performance of scale- and item-level approaches based on multiple group categorical confirmatory factor analysis and multiple group item response theory in testing MI with ordinal data and found that MG-CCFA-based approaches outperformed MG-IRT-based approaches when testing MI at the scale level.

References

Do Self-Report Instruments Allow Meaningful Comparisons Across Diverse Population Groups?: Testing Measurement Invariance Using the Confirmatory Factor Analysis Framework
TLDR
A nested hierarchy of hypotheses is tested that addresses the cross-group invariance of the instrument's psychometric properties, including configural, metric (or pattern), strong (or scalar), and strict factorial invariance.
Different Kinds of DIF: A Distinction Between Absolute and Relative Forms of Measurement Invariance and Bias
In this article, a distinction is made between absolute and relative measurement. Absolute measurement refers to the measurement of traits on a group-invariant scale, and relative measurement refers
An Essay on Measurement and Factorial Invariance
TLDR
The purpose of this essay is to review the definitions and assumptions associated with factorial invariance, placing this formulation in the context of bias, fairness, and equity.
Identifying Cultural Differences in Items and Traits
The authors investigated the cross-cultural measurement equivalence of items in the English-language version of the NEO Personality Inventory, a measure of the five-factor personality model, in a
Differential Item Functioning on the Mini-Mental State Examination: An Application of the Mantel-Haenszel and Standardization Procedures
TLDR
This work defines DIF, describes 2 standard procedures for measuring DIF, applies these DIF procedures to the Mini-Mental State Examination, and contrasts DIF with score equity analysis (SEA).
Identification of Measurement Differences Between English and Spanish Language Versions of the Mini-Mental State Examination: Detecting Differential Item Functioning Using MIMIC Modeling
TLDR
Failing to account for measurement differences may lead to spurious inferences regarding language group differences in the underlying level of cognitive functioning, and the MIMIC model can be used to detect and adjust for such measurement differences in substantive research.
Gender differences on negative affectivity: an IRT study of differential item functioning on the Multidimensional Personality Questionnaire Stress Reaction Scale.
TLDR
Results do not support arguments that measures of negative affective dispositions "artificially" produce gender mean differences by focusing on specific selected content areas and illustrate how even in an essentially unidimensional scale, comparison of group mean differences can be affected by multidimensionality caused by item clusters that share similar content.
Identification of Differential Item Functioning Using Item Response Theory and the Likelihood-Based Model Comparison Approach: Application to the Mini-Mental State Examination
TLDR
IRT and the likelihood-based model comparison approach comprise a powerful tool for DIF detection that can aid in the development, refinement, and evaluation of measures for use in ethnically diverse populations.
Evaluating the impact of partial factorial invariance on selection in two populations.
TLDR
This work evaluates the impact of partial invariance on accuracy of selection on the basis of a composite of the measures whose factor structure is being studied, assuming a single-factor model holds.