# A multivariate technique for multiply imputing missing values using a sequence of regression models

@article{Raghunathan2001AMT, title={A multivariate technique for multiply imputing missing values using a sequence of regression models}, author={Trivellore E. Raghunathan and James M. Lepkowski and John Van Hoewyk and Peter W. Solenberger}, journal={Survey Methodology}, year={2001}, volume={27}, pages={85-95} }

This article describes and evaluates a procedure for imputing missing values for a relatively complex data structure when the data are missing at random. The imputations are obtained by fitting a sequence of regression models and drawing values from the corresponding predictive distributions. The types of regression models used are linear, logistic, Poisson, generalized logit or a mixture of these depending on the type of variable being imputed. Two additional common features in the imputation…
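The sequence-of-regressions idea in the abstract can be illustrated with a deliberately small sketch. It is not the authors' implementation: it assumes only two continuous variables, uses simple linear regression for both, and adds normal noise with the residual standard deviation instead of drawing regression parameters from their posterior, so it understates imputation uncertainty. The names `fit_line`, `impute_once`, and `multiply_impute` are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / max(len(xs) - 2, 1)) ** 0.5
    return a, b, sd


def impute_once(rows, n_iter=10, rng=None):
    """rows: list of [x, y] pairs with None marking a missing value.

    Returns a completed copy; observed values are left untouched.
    """
    rng = rng or random.Random(0)
    data = [list(r) for r in rows]
    missing = [[v is None for v in r] for r in rows]

    # Starting values: fill each gap with the observed column mean.
    for j in range(2):
        col_mean = statistics.fmean(r[j] for r in rows if r[j] is not None)
        for r in data:
            if r[j] is None:
                r[j] = col_mean

    # Cycle through the variables, regressing each on the other
    # (assumption: linear models only, a simplification of the article).
    for _ in range(n_iter):
        for j in range(2):
            k = 1 - j
            train = [(r[k], r[j]) for r, m in zip(data, missing) if not m[j]]
            a, b, sd = fit_line([t[0] for t in train], [t[1] for t in train])
            for r, m in zip(data, missing):
                if m[j]:
                    # Prediction plus residual noise (parameter uncertainty ignored).
                    r[j] = a + b * r[k] + rng.gauss(0.0, sd)
    return data


def multiply_impute(rows, m=5, seed=0):
    """Create m completed data sets, each from an independent random stream."""
    return [impute_once(rows, rng=random.Random(seed + i)) for i in range(m)]


rows = [[1.0, 2.1], [2.0, None], [3.0, 6.2], [None, 8.1], [5.0, 9.9], [6.0, 12.2]]
completed_sets = multiply_impute(rows, m=3)
```

Each element of `completed_sets` is one completed data set; an analysis would be run on each and the results combined. Handling binary, count, or categorical variables would require swapping in logistic, Poisson, or generalized logit models, as the abstract describes.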

#### 1,924 Citations

Assessment and Improvement of a Sequential Regression Multivariate Imputation Algorithm.

- Mathematics
- 2016

A sequential regression or chained equations imputation approach uses a Gibbs sampling type iterative algorithm which imputes the missing values using a sequence of conditional regression models. It…

Convergence Properties of a Sequential Regression Multiple Imputation Algorithm

- Mathematics
- 2015

A sequential regression or chained equations imputation approach uses a Gibbs sampling-type iterative algorithm that imputes the missing values using a sequence of conditional regression models. It…

Multiple Imputations Using Sequential Semi and Nonparametric Regressions

- 2007

Multiple imputation is a general purpose method for analyzing data with missing values. Under this approach the missing set of values is replaced by several plausible sets of missing values to yield…

Analysis of Variance from Multiply Imputed Data Sets

- 2011

The analysis of variance is a popular method used in many scientific applications. There is standard software for handling unbalanced data due to missing values in the outcome/dependent variable.…

Sequential nonparametric regression multiple imputations

Multiple imputation, a general purpose method for analyzing data with missing values, involves replacing a missing set of values by several plausible sets of missing values to yield completed data…

Multiply imputing missing values in data sets with mixed measurement scales using a sequence of generalised linear models

- Mathematics, Computer Science
- Comput. Stat. Data Anal.
- 2016

A method is proposed to multiply impute missing values in data sets containing variables with different measurement scales by modelling the joint distribution of the variables in the data through a sequence of generalised linear models, and data augmentation methods are used to draw imputations from a proper posterior distribution using Markov Chain Monte Carlo (MCMC).

Multiple imputation for missing data via sequential regression trees.

- Medicine
- American journal of epidemiology
- 2010

The authors present a nonparametric approach for implementing multiple imputation via chained equations by using sequential regression trees as the conditional models and demonstrate that the method can result in more plausible imputations, and hence more reliable inferences, in complex settings than the naive application of standard sequential regression imputation techniques.

Rounding strategies for multiply imputed binary data.

- Mathematics, Medicine
- Biometrical journal. Biometrische Zeitschrift
- 2009

This article compares several rounding rules for binary variables based on simulated longitudinal data sets that have been used to illustrate other missing-data techniques and concludes that a good rule should be driven by borrowing information from other variables in the system rather than relying on the marginal characteristics.

An Empirical Comparison of Multiple Imputation Methods for Categorical Data

- Mathematics, Computer Science
- 2015

The results suggest that default chained equations approaches based on generalized linear models are dominated by the default regression tree and Bayesian mixture model approaches, making both reasonable default engines for multiple imputation of categorical data.

Imputation and variable selection in linear regression models with missing covariates.

- Mathematics, Medicine
- Biometrics
- 2005

Two alternative strategies to address the problem of choosing among linear regression models when there are missing covariates are proposed, one of which involves initially performing multiple imputation and then applying Bayesian variable selection to the multiply imputed data sets.

#### References

SHOWING 1-10 OF 30 REFERENCES

A multiple-imputation analysis of a case-control study of the risk of primary cardiac arrest among pharmacologically treated hypertensives

- Mathematics
- 1996

SUMMARY A multiple-imputation method is developed for analysing data from an observational study where some covariate values are not observed. A hybrid approach is presented where the imputations are…

Maximum likelihood estimation for mixed continuous and categorical data with missing values

- Mathematics
- 1985

SUMMARY Maximum likelihood procedures for analysing mixed continuous and categorical data with missing values are presented. The general location model of Olkin & Tate (1961) and extensions…

Performing likelihood ratio tests with multiply-imputed data sets

- Mathematics
- 1992

SUMMARY Existing procedures for obtaining significance levels from multiply-imputed data either (i) require access to the completed-data point estimates and variance-covariance matrices, which may…

Large-sample significance levels from multiply imputed data using moment-based statistics and an F reference distribution

- Mathematics
- 1991

Abstract We present a procedure for computing significance levels from data sets whose missing values have been multiply imputed. This procedure uses moment-based statistics, m ≤ 3 repeated…

Missing data imputation using the multivariate t distribution

- Mathematics
- 1995

When a rectangular multivariate data set contains missing values, missing data imputation using the multivariate t distribution appears potentially useful, especially for robust inferences. An…

Multiple Imputation After 18+ Years

- Mathematics
- 1996

Abstract Multiple imputation was designed to handle the problem of missing data in public-use data bases where the data-base constructor and the ultimate user are distinct entities. The objective is…

A Split Questionnaire Survey Design

- Computer Science
- 1995

A multiple imputation method for analyzing data from this design is developed, in which the imputations are created by random draws from the posterior predictive distribution of the missing parts, given the observed parts by using Gibbs sampling under a general location scale model.

Jackknife variance estimation with survey data under hot deck imputation

- Mathematics
- 1992

SUMMARY Hot deck imputation is commonly employed for item nonresponse in sample surveys. It is also a common practice to treat the imputed values as if they are true values, and then compute the…

Inference from Iterative Simulation Using Multiple Sequences

- Mathematics
- 1992

The Gibbs sampler, the algorithm of Metropolis and similar iterative simulation methods are potentially very helpful for summarizing multivariate distributions. Used naively, however, iterative…

The calculation of posterior distributions by data augmentation

- Mathematics
- 1987

Abstract The idea of data augmentation arises naturally in missing value problems, as exemplified by the standard ways of filling in missing cells in balanced two-way tables. Thus data augmentation…