A Paradox in the Study of the Benefits of Test‐Item Review
According to a popular belief, test takers should trust their initial instinct and retain their initial responses when they have the opportunity to review test items. More than 80 years of empirical…
  • 19
  • 6
Assessing learning in the classroom
Be planned, discrete events (e.g., no-stakes quiz, muddiest point, minute paper, directed paraphrasing, blind poll). Arise spontaneously (e.g., posing questions, listening to student questions and…
  • 40
  • 6
A Framework for Policies and Practices to Improve Test Security Programs: Prevention, Detection, Investigation, and Resolution (PDIR)
Test security is not an end in itself; it is important because we want to be able to make valid interpretations from test scores. In this article, I propose a framework for comprehensive test…
  • 5
  • 2
Study of a Dual-Language Test Booklet in Eighth-Grade Mathematics
The purpose of this study was to address the effectiveness of a Spanish-English dual-language test booklet in 8th-grade mathematics. This study used analyses of test data (n = 402) as well as…
  • 29
  • 2
Evaluating Comparative Judgment as an Approach to Essay Scoring
As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain…
  • 22
  • 2
A Comparison of IRT, Delta Plot, and Mantel-Haenszel Techniques for Detecting Differential Item Functioning Across Subpopulations in the Maryland Test of Citizenship Skills.
Use of item response theory (IRT), the delta plot method, and Mantel-Haenszel techniques to assess differential item functioning (DIF) across racial and gender groups associated with the Maryland…
  • 6
  • 1
The Consistency Between Raters Scoring in Different Test Years
The consistency between raters over 3 years of a high-stakes performance assessment was examined in 2 studies that involved students in Grades 3, 5, and 8. The students' performance was evaluated in…
  • 41
  • 1
Matching the Judgmental Task with Standard Setting Panelist Expertise: The Item-Descriptor (ID) Matching Method
We describe the Item-Descriptor Matching method, a method based on IRT item mapping, which matches items to performance level descriptors that are used for reporting test results.
  • 19
  • 1
Using NAEP for Interstate Comparisons: The Beginnings of a “National Achievement Test” and “National Curriculum”
Current plans to use National Assessment of Educational Progress (NAEP) results to compare and rank states may lead to a perception of NAEP as a “national achievement test” representing a “national…
  • 10
  • 1