Corpus ID: 17446218

A reexamination of Lord's Wald test for differential item functioning using item response theory and modern error estimation

Michelle M. Langer
The detection of differential item functioning (DIF) is an essential step in increasing the validity of a test for all groups. The item response theory (IRT) model comparison approach has been shown to be the most flexible and powerful method for DIF detection; however, it is computationally intensive, requiring many model refittings. The Wald test, originally employed by Lord for DIF detection, is asymptotically equivalent to this approach and requires only one model fitting. In this research…
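As a rough illustration of why the Wald test needs only one model fitting, the sketch below computes a Lord-style Wald statistic for a single item from estimates that are assumed to be already available: the item's slope and intercept in a reference and a focal group, placed on a common metric, together with their error covariance matrices. The function name `wald_dif_2pl` and its inputs are hypothetical, and how the covariance matrices are obtained is outside the sketch. It is not the author's implementation.

```python
import math


def wald_dif_2pl(est_ref, est_focal, cov_ref, cov_focal):
    """Wald chi-square for DIF on one two-parameter item (slope, intercept).

    Assumption: the estimates are on a common metric and the two groups are
    independent samples, so the covariance of the difference is the sum of
    the two 2x2 covariance matrices.
    """
    # Difference between the groups' parameter estimates.
    d = [est_ref[0] - est_focal[0], est_ref[1] - est_focal[1]]
    # Covariance matrix of that difference.
    s = [[cov_ref[i][j] + cov_focal[i][j] for j in range(2)] for i in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    if det <= 0:
        raise ValueError("summed covariance matrix is not positive definite")
    # Quadratic form d' S^{-1} d, using the closed-form inverse of a 2x2 matrix.
    w = (
        s[1][1] * d[0] ** 2
        - (s[0][1] + s[1][0]) * d[0] * d[1]
        + s[0][0] * d[1] ** 2
    ) / det
    # For 2 degrees of freedom the chi-square upper-tail probability is exp(-w / 2).
    p = math.exp(-w / 2.0)
    return w, p


# Illustrative numbers only (not taken from any study).
w, p = wald_dif_2pl(
    est_ref=(1.2, 0.5),
    est_focal=(1.0, -0.1),
    cov_ref=[[0.02, 0.0], [0.0, 0.03]],
    cov_focal=[[0.02, 0.0], [0.0, 0.03]],
)
print(f"Wald chi-square = {w:.3f}, p = {p:.4f}")
```

The statistic is referred to a chi-square distribution with as many degrees of freedom as parameters compared (two here), so every item can be tested from the output of one fit. The model comparison approach instead refits a constrained model for each item.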
The Langer-Improved Wald Test for DIF Testing With Multiple Groups
Differential item functioning (DIF) occurs when the probability of responding in a particular category to an item differs for members of different groups who are matched on the construct being
Longitudinal Differential Item Functioning Detection Using Bifactor Models and the Wald Test
The use of longitudinal data for studying cross-time changes is built on the key assumption that properties (e.g., slopes and intercepts) of the repeatedly-used items remain unchanged over time. True
A Monte Carlo Study of an Iterative Wald Test Procedure for DIF Analysis
This study examined the performance of a proposed iterative Wald approach for detecting differential item functioning (DIF) between two groups when preknowledge of anchor items is absent. The results indicated that the iterative approach performed well for polytomous data in all conditions, with well-controlled Type I error rates and high power.
Anchor Selection Using the Wald Test Anchor-All-Test-All Procedure
Alternative anchor-selection strategies based on a modified version of the Wald χ2 test that is implemented in flexMIRT and IRTPRO are explored, and comparisons with methods based on the popular likelihood ratio test are made.
Generalized full-information item bifactor analysis.
An efficient full-information maximum marginal likelihood estimator is derived by extending Gibbons and Hedeker's bifactor dimension reduction method so that the optimization of the marginal log-likelihood requires only 2-dimensional integration regardless of the dimensionality of the latent variables.
Item Response Theory With Covariates (IRT-C)
It was found that, with a substantial sample size of 20,000, the IRT-C procedure could accurately recover the latent means and the three-parameter logistic model parameters, and good power to detect DIF across all covariates was observed.
Improving the assessment of measurement invariance: Using regularization to select anchor items and identify differential item functioning.
Simulation and empirical results show that when large amounts of DIF are present and sample sizes are large, lasso regularization has far better control of Type I error than the likelihood ratio test method with little decrement in power, providing strong evidence that lasso regularization is a promising alternative for testing DIF and selecting anchors.
Detecting Differential Item Functioning Using Cognitive Diagnosis Models: Applications of the Wald Test and Likelihood Ratio Test in a University Entrance Examination
The present study aims to examine gender differential item functioning (DIF) in the reading comprehension section of a high stakes test using cognitive diagnosis models. Based on the
Full-information item bifactor analysis: Model, parameter estimation and application
The item response model of full-information item bifactor analysis is described, and the dimension reduction method used in parameter estimation is introduced, an approach believed to be valuable and useful in various situations.


IRT-Based Internal Measures of Differential Functioning of Items and Tests
Internal measures of differential functioning of items and tests (DFIT) based on item response theory (IRT) are proposed. Within the DFIT context, the new differential test functioning (DTF) index
Assessment of differential item functioning.
This study discusses how to assess practical significance of DIF at both item and test levels and reviews three methods of establishing a common metric: the equal-mean-difficulty method, the all-other-item method, and the constant-item (CI) method.
Detecting potentially biased test items : Comparison of IRT area and Mantel-Haenszel methods
The purpose of this study was to compare the IRT-based area method and the Mantel-Haenszel method for investigating differential item functioning (DIF), to determine the degree of agreement between
Assessment of Differential Item Functioning for Performance Tasks
Although the belief has been expressed that performance assessments are intrinsically more fair than multiple-choice measures, some forms of performance assessment may in fact be more likely than
Estimation of latent ability using a response pattern of graded scores
Estimation of latent ability using the entire response pattern of free-response items is discussed, first in the general case and then in the case where the items are scored in a graded way,
Assessing Differential Item Functioning Among Multiple Groups: A Comparison of Three Mantel-Haenszel Procedures
It is often the case in performing a differential item functioning (DIF) analysis that comparisons are made between a single reference group and multiple focal groups. Conducting a separate test of
Differential item functioning (DIF) assessment attempts to identify items or item types for which subpopulations of examinees exhibit performance differentials that are not consistent with the
Modern psychometric methods for detection of differential item functioning: application to cognitive assessment measures.
Using the methods of modern psychometric theory (in addition to those of classical test theory), the performance of the Attention subscale of the Mattis Dementia Rating Scale was examined, along with bias in screening measures across education and ethnic and racial subgroups.
A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF
A model-based modification (SIBTEST) of the standardization index based upon a multidimensional IRT bias modeling approach is presented that detects and estimates DIF or item bias simultaneously for
Analysis of Differential Item Functioning in the NAEP History Assessment
The Mantel-Haenszel approach for investigating differential item functioning was applied to U.S. history items that were administered as part of the National Assessment of Educational Progress. On