How Many Imputations are Really Needed? Some Practical Clarifications of Multiple Imputation Theory

Authors: John W. Graham, Allison E. Olchowski, Tamika D. Gilreath
Journal: Prevention Science
Multiple imputation (MI) and full information maximum likelihood (FIML) are the two most common approaches to missing data analysis. In theory, MI and FIML are equivalent when identical models are tested using the same variables, and when m, the number of imputations performed with MI, approaches infinity. However, it is important to know how many imputations are necessary before MI and FIML are sufficiently equivalent in ways that are important to prevention scientists. MI theory suggests that… 
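One textbook way to see how a finite m compares with the m → ∞ limit mentioned above is Rubin's relative efficiency, 1 / (1 + γ/m), where γ is the fraction of missing information. The sketch below only illustrates that standard formula; the function name is hypothetical, and this is not the authors' own simulation or necessarily the criterion they settle on.

```python
def relative_efficiency(fmi: float, m: int) -> float:
    """Rubin's relative efficiency of an MI estimate based on m imputations,
    compared with the m -> infinity limit.

    fmi is the fraction of missing information (gamma), between 0 and 1.
    The function name is illustrative, not taken from the paper.
    """
    if not 0.0 <= fmi <= 1.0:
        raise ValueError("fmi must lie in [0, 1]")
    if m < 1:
        raise ValueError("m must be a positive integer")
    return 1.0 / (1.0 + fmi / m)


if __name__ == "__main__":
    # Tabulate efficiency for a few values of m at 50% missing information.
    for m in (3, 5, 10, 20, 40, 100):
        print(f"m={m:>3}  RE={relative_efficiency(0.5, m):.4f}")
```

Efficiency approaches 1 quickly as m grows, so this measure alone tends to favor small m; whether that is sufficient for other quantities of interest is a separate question.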
Improved methods for estimating fraction of missing information in multiple imputation
It is quantitatively demonstrated that E(γm) decreases as m increases, with E(γm) > γ0 for any finite m, so γm inevitably overestimates γ0; three improved FMI estimation methods are proposed.
Multiple Imputation of Missing Data: A Simulation Study on a Binary Response
Currently, a growing number of programs for multiple imputation of missing values are becoming available in statistical software. Among others, two algorithms are mainly implemented: Expectation
How Many Imputations Do You Need? A Two-stage Calculation Using a Quadratic Rule
A two-stage procedure is recommended in which you conduct a pilot analysis using a small-to-moderate number of imputations, then use the results to calculate the number of imputations that are needed for a final analysis whose SE estimates will have the desired level of replicability.
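The second stage of such a procedure can be sketched as follows. The summary above does not state the formula, so the form used here, m = 1 + ½(γ/CV)², is an assumption based on the commonly cited quadratic rule, where γ is a (preferably conservative) fraction of missing information estimated from the pilot and CV is the target coefficient of variation of the SE estimate. The function name and the example inputs are hypothetical.

```python
import math


def imputations_needed(fmi: float, cv: float) -> int:
    """Number of imputations suggested by a quadratic rule (assumed form):

        m = 1 + 0.5 * (fmi / cv) ** 2

    fmi: fraction of missing information taken from a pilot analysis,
         ideally a conservative (upper-bound) estimate.
    cv:  desired coefficient of variation of the SE estimate,
         e.g. 0.05 for SEs that vary by about 5% across reruns.
    """
    if not 0.0 <= fmi <= 1.0:
        raise ValueError("fmi must lie in [0, 1]")
    if cv <= 0.0:
        raise ValueError("cv must be positive")
    m = 1.0 + 0.5 * (fmi / cv) ** 2
    # Round up: a fractional imputation is not possible.
    return math.ceil(m)


if __name__ == "__main__":
    # Hypothetical pilot result: FMI of 0.30, target CV of 4%.
    print(imputations_needed(0.30, 0.04))
```

Because the rule is quadratic in γ/CV, halving the target CV roughly quadruples the number of imputations required.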
Multiple imputation of missing covariate values in multilevel models with random slopes: a cautionary note
It is suggested that MI is able to recover most parameters, but is currently not well suited to capture slope variation entirely when covariate values are missing, and listwise deletion can be an alternative worth considering when preserving the slope variance is particularly important.
Auxiliary Variables in Multiple Imputation When Data Are Missing Not at Random
Most current implementations of multiple imputation (MI) assume that data are missing at random (MAR), but this assumption is generally untestable. We performed analyses to test the effects of
Multiple Imputation for Incomplete Data in Epidemiologic Studies
The theoretical underpinnings of multiple imputation are described, and application of this method is illustrated as part of a collaborative challenge to assess the performance of various techniques for dealing with missing data.
Statistical Inference in Missing Data by MCMC and Non-MCMC Multiple Imputation Algorithms: Assessing the Effects of Between-Imputation Iterations
Based on simulation experiments, the current study contends that EMB is a confidence proper (confidence-supporting) multiple imputation algorithm without between-imputation iterations; thus, EMB is more user-friendly than DA and FCS.
Multiple Imputation in Multilevel Models. A Revision of the Current Software and Usage Examples for Researchers
A thorough revision of the most recently developed software and functions about multiple imputation in multilevel models is presented and a set of suggestions, recommendations, and guides for helping researchers to handle missing data are derived.
Multiple Imputation of Missing Data for Multilevel Models
This paper provides guidance using MI in the context of several classes of multilevel models, including models with random intercepts, random slopes, cross-level interactions (CLIs), and missing data in categorical and group-level variables.


Multiple Imputation for Multivariate Missing-Data Problems: A Data Analyst's Perspective.
The key ideas of multiple imputation are reviewed, the software programs currently available are discussed, and their use on data from the Adolescent Alcohol Prevention Trial is demonstrated.
Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation
This work adapts an algorithm and uses it to implement a general-purpose, multiple imputation model for missing data that is considerably faster and easier to use than the leading method recommended in the statistics literature.
Adding Missing-Data-Relevant Variables to FIML-Based Structural Equation Models
Conventional wisdom in missing data research dictates adding variables to the missing data model when those variables are predictive of (a) missingness and (b) the variables containing missingness.
A comparison of inclusive and restrictive strategies in modern missing data procedures.
A simulation was presented to assess the potential costs and benefits of a restrictive strategy, which makes minimal use of auxiliary variables, versus an inclusive strategy; the results show that the inclusive strategy is to be greatly preferred.
Missing data: our view of the state of the art.
Two general approaches that come highly recommended are presented: maximum likelihood (ML) and Bayesian multiple imputation (MI). Newer developments are also discussed that may eventually extend the ML and MI methods that currently represent the state of the art.
Methods for handling missing data
This chapter begins by describing helpful typologies of missing data based on pattern and non-response mechanisms and summarizes a collection of commonly used but imperfect methods for dealing with missing data at the analysis stage.
Analysis of Incomplete Multivariate Data
Research Methods in Psychology.
Now in its ninth successful edition, "Research Methods in Psychology" unites students' passion for psychology with their interest in answering questions about behavior and mental processes. The text