Corpus ID: 10201308

A multivariate technique for multiply imputing missing values using a sequence of regression models

Trivellore E. Raghunathan, James M. Lepkowski, John Van Hoewyk, Peter W. Solenberger
Survey Methodology
This article describes and evaluates a procedure for imputing missing values for a relatively complex data structure when the data are missing at random. The imputations are obtained by fitting a sequence of regression models and drawing values from the corresponding predictive distributions. The types of regression models used are linear, logistic, Poisson, generalized logit or a mixture of these depending on the type of variable being imputed. Two additional common features in the imputation…
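The core loop described in the abstract above, fitting a regression for each incomplete variable on the others and drawing from its predictive distribution, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes every variable is continuous and uses only a normal linear model with a noninformative prior (the article also covers logistic, Poisson and generalized logit models). The function names, the mean-fill starting values, the fewest-missing-first visiting order and the fixed number of rounds are all assumptions made for illustration.

```python
import numpy as np


def draw_linear_imputations(X_obs, y_obs, X_mis, rng):
    """Draw missing y values from the posterior predictive distribution
    of a normal linear model with a noninformative prior (assumed here)."""
    n, p = X_obs.shape
    xtx_inv = np.linalg.inv(X_obs.T @ X_obs)
    beta_hat = xtx_inv @ X_obs.T @ y_obs
    resid = y_obs - X_obs @ beta_hat
    # Draw the residual variance: RSS divided by a chi-square(n - p) draw.
    sigma2 = resid @ resid / rng.chisquare(n - p)
    # Draw coefficients from N(beta_hat, sigma2 * (X'X)^-1) via Cholesky.
    chol = np.linalg.cholesky(xtx_inv)
    beta = beta_hat + np.sqrt(sigma2) * (chol @ rng.standard_normal(p))
    # Draw the missing values around the fitted line.
    noise = rng.normal(0.0, np.sqrt(sigma2), size=X_mis.shape[0])
    return X_mis @ beta + noise


def sequential_regression_impute(data, n_rounds=10, seed=0):
    """Return one completed copy of `data` (NaN marks a missing value)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    filled = data.copy()
    # Hypothetical starting point: fill each gap with its column mean.
    col_means = np.nanmean(data, axis=0)
    filled[missing] = np.take(col_means, np.nonzero(missing)[1])
    # Visit columns from fewest to most missing values (a choice made here).
    order = np.argsort(missing.sum(axis=0))
    for _ in range(n_rounds):
        for j in order:
            mis_j = missing[:, j]
            if not mis_j.any():
                continue
            # Regress column j on an intercept plus all other columns,
            # using the current imputations of those other columns.
            others = np.delete(filled, j, axis=1)
            X = np.column_stack([np.ones(len(filled)), others])
            filled[mis_j, j] = draw_linear_imputations(
                X[~mis_j], filled[~mis_j, j], X[mis_j], rng
            )
    return filled
```

Calling `sequential_regression_impute` several times with different `seed` values would give several completed data sets, which is the sense in which the values are multiply imputed. Observed entries are never overwritten; only the positions that were NaN in the input are redrawn on each round.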
Assessment and Improvement of a Sequential Regression Multivariate Imputation Algorithm.
  • Jian Zhu
  • Mathematics
  • 2016
A sequential regression or chained equations imputation approach uses a Gibbs sampling type iterative algorithm which imputes the missing values using a sequence of conditional regression models. It…
Convergence Properties of a Sequential Regression Multiple Imputation Algorithm
A sequential regression or chained equations imputation approach uses a Gibbs sampling-type iterative algorithm that imputes the missing values using a sequence of conditional regression models. It…
Multiple Imputations Using Sequential Semi and Nonparametric Regressions
Multiple imputation is a general purpose method for analyzing data with missing values. Under this approach the missing set of values is replaced by several plausible sets of missing values to yield…
Analysis of Variance from Multiply Imputed Data Sets
The analysis of variance is a popular method used in many scientific applications. There is standard software for handling unbalanced data due to missing values in the outcome/dependent variable. …
Sequential nonparametric regression multiple imputations
Multiple imputation, a general purpose method for analyzing data with missing values, involves replacing a missing set of values by several plausible sets of missing values to yield completed data…
Multiply imputing missing values in data sets with mixed measurement scales using a sequence of generalised linear models
A method is proposed to multiply impute missing values in data sets containing variables with different measurement scales by modelling the joint distribution of the variables in the data through a sequence of generalised linear models, and data augmentation methods are used to draw imputations from a proper posterior distribution using Markov Chain Monte Carlo (MCMC).
Multiple imputation for missing data via sequential regression trees.
The authors present a nonparametric approach for implementing multiple imputation via chained equations by using sequential regression trees as the conditional models and demonstrate that the method can result in more plausible imputations, and hence more reliable inferences, in complex settings than the naive application of standard sequential regression imputation techniques.
Rounding strategies for multiply imputed binary data.
  • H. Demirtas
  • Mathematics, Medicine
  • Biometrical journal. Biometrische Zeitschrift
  • 2009
This article compares several rounding rules for binary variables based on simulated longitudinal data sets that have been used to illustrate other missing-data techniques and concludes that a good rule should be driven by borrowing information from other variables in the system rather than relying on the marginal characteristics.
An Empirical Comparison of Multiple Imputation Methods for Categorical Data
The results suggest that default chained equations approaches based on generalized linear models are dominated by the default regression tree and Bayesian mixture model approaches, making both reasonable default engines for multiple imputation of categorical data.
Imputation and variable selection in linear regression models with missing covariates.
Two alternative strategies to address the problem of choosing among linear regression models when there are missing covariates are proposed, one of which involves initially performing multiple imputation and then applying Bayesian variable selection to the multiply imputed data sets.


A multiple-imputation analysis of a case-control study of the risk of primary cardiac arrest among pharmacologically treated hypertensives
SUMMARY A multiple-imputation method is developed for analysing data from an observational study where some covariate values are not observed. A hybrid approach is presented where the imputations are…
Maximum likelihood estimation for mixed continuous and categorical data with missing values
SUMMARY Maximum likelihood procedures for analysing mixed continuous and categorical data with missing values are presented. The general location model of Olkin & Tate (1961) and extensions…
Performing likelihood ratio tests with multiply-imputed data sets
SUMMARY Existing procedures for obtaining significance levels from multiply-imputed data either (i) require access to the completed-data point estimates and variance-covariance matrices, which may…
Large-sample significance levels from multiply imputed data using moment-based statistics and an F reference distribution
Abstract We present a procedure for computing significance levels from data sets whose missing values have been multiply imputed. This procedure uses moment-based statistics, m ≥ 3 repeated…
Missing data imputation using the multivariate t distribution
When a rectangular multivariate data set contains missing values, missing data imputation using the multivariate t distribution appears potentially useful, especially for robust inferences. An…
Multiple Imputation After 18+ Years
Abstract Multiple imputation was designed to handle the problem of missing data in public-use data bases where the data-base constructor and the ultimate user are distinct entities. The objective is…
A Split Questionnaire Survey Design
A multiple imputation method for analyzing data from this design is developed, in which the imputations are created by random draws from the posterior predictive distribution of the missing parts, given the observed parts by using Gibbs sampling under a general location scale model.
Jackknife variance estimation with survey data under hot deck imputation
SUMMARY Hot deck imputation is commonly employed for item nonresponse in sample surveys. It is also a common practice to treat the imputed values as if they are true values, and then compute the…
Inference from Iterative Simulation Using Multiple Sequences
The Gibbs sampler, the algorithm of Metropolis and similar iterative simulation methods are potentially very helpful for summarizing multivariate distributions. Used naively, however, iterative…
The calculation of posterior distributions by data augmentation
Abstract The idea of data augmentation arises naturally in missing value problems, as exemplified by the standard ways of filling in missing cells in balanced two-way tables. Thus data augmentation…