Can one assess whether missing data are missing at random in medical studies?

@article{Potthoff2006CanOA,
  title={Can one assess whether missing data are missing at random in medical studies?},
  author={Richard F. Potthoff and Gail E. Tudor and Karen S. Pieper and Vic Hasselblad},
  journal={Statistical Methods in Medical Research},
  year={2006},
  volume={15},
  pages={213--234}
}
For handling missing data, newer methods such as those based on multiple imputation are generally more accurate than older ones and entail weaker assumptions. Yet most do assume that data are missing at random (MAR). The issue of assessing whether the MAR assumption holds to begin with has been largely ignored. In fact, no way to directly test MAR is available. We propose an alternate assumption, MAR+, that can be tested. MAR+ always implies MAR, so inability to reject MAR+ bodes well for MAR… 
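
For orientation, the missingness mechanisms the abstract refers to can be written out in the usual textbook form. This is a sketch only: the notation (Y_obs, Y_mis, R, psi) is an assumption introduced here, not taken from the paper, and MAR+ is left undefined because the abstract above is truncated before defining it. The final line restates the abstract's logic, that MAR+ implies MAR, as its contrapositive.

```latex
% Standard definitions; the notation is an assumption, not the paper's own.
% Y = (Y_obs, Y_mis): complete data split into observed and missing parts.
% R: missingness indicator; \psi: parameters of the missingness mechanism.
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \psi) = P(R \mid \psi) \\
\text{MAR:}\quad  & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \psi) = P(R \mid Y_{\mathrm{obs}}, \psi) \\
\text{MNAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \psi) \text{ depends on } Y_{\mathrm{mis}}
\end{align*}
% The abstract's argument: MAR+ is the stronger assumption.
\[
\bigl(\text{MAR+} \Rightarrow \text{MAR}\bigr)
\quad\Longleftrightarrow\quad
\bigl(\neg\,\text{MAR} \Rightarrow \neg\,\text{MAR+}\bigr)
\]
```

Because MAR+ is the stronger condition, a rejection of MAR+ would not by itself show that MAR fails; only the direction stated in the abstract (failure to reject MAR+ is encouraging for MAR) follows.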

What impact do assumptions about missing data have on conclusions? A practical sensitivity analysis for a cancer survival registry
TLDR
The method of pattern-mixture sensitivity analysis after multiple imputation is illustrated using colorectal cancer data as an example; the analysis suggested a smaller association between Dukes’ stage and death, though the association remained positive and with similarly low p values.
Erratum to: What impact do assumptions about missing data have on conclusions? a practical sensitivity analysis for a cancer survival registry
TLDR
The method of pattern mixture sensitivity analysis after multiple imputation using colorectal cancer data as an example highlights the importance of making people aware of the need to test the MAR assumption and suggests a smaller association between Dukes’ stage and death.
Semi-Parametric Methods for Missing Data and Causal Inference
TLDR
This dissertation provides necessary and sufficient conditions for nonparametric identification of the full data distribution under MNAR with the aid of an IV and proposes inverse probability weighted estimation, outcome regression based estimation and doubly robust estimation of the mean of an outcome subject to MNAR.
Canonical Causal Diagrams to Guide the Treatment of Missing Data in Epidemiologic Studies
Abstract With incomplete data, the “missing at random” (MAR) assumption is widely understood to enable unbiased estimation with appropriate methods. While the need to assess the plausibility of MAR
Diagnosing missing always at random in multivariate data
TLDR
Three different diagnostic tests are proposed that not only indicate when this assumption is incorrect but also suggest which variables are the most likely culprits, and evidence for its violation should encourage the careful statistician to conduct targeted sensitivity analyses.
MCAR is not necessary for the complete cases to constitute a simple random subsample of the target sample
TLDR
It is shown that, unlike MCAR, AAR response mechanisms can be missing not at random (MNAR), and it is concluded that before pooling partially complete and complete cases into an analysis, the investigator should consider how selection might impact on the representativeness of the cases included in the pooled analysis (compared to those comprising the complete cases only).
Semiparametric Estimation with Data Missing Not at Random Using an Instrumental Variable.
TLDR
This paper provides necessary and sufficient conditions for nonparametric identification of the full data distribution under MNAR with the aid of an IV and proposes inverse probability weighted estimation, outcome regression-based estimation and doubly robust estimation of the mean of an outcome subject to MNAR.
Practical considerations for sensitivity analysis after multiple imputation applied to epidemiological studies with incomplete data
TLDR
The practical utility of a pragmatic, widely applicable approach to exploring plausible departures from the MAR assumption after multiple imputation is demonstrated and advocated, and guidelines for applying this approach to epidemiological studies are developed.
A review and evaluation of standard methods to handle missing data on time-varying confounders in marginal structural models
TLDR
Existing methods for handling missing data in MSMs are reviewed and a simulation study is performed to compare the performance of complete case analysis, the last observation carried forward (LOCF), the missingness pattern approach (MPA), multiple imputation (MI) and inverse-probability-of-missingness weighting (IPMW).

References

SHOWING 1-10 OF 35 REFERENCES
A Test of Missing Completely at Random for Multivariate Data with Missing Values
Abstract A common concern when faced with multivariate data with missing values is whether the missing data are missing completely at random (MCAR); that is, whether missingness depends on the
Multiple Imputation After 18+ Years
TLDR
A description of the assumed context and objectives of multiple imputation is provided, and the multiple imputation framework and its standard results are reviewed.
Testing ignorable missingness in estimating equation approaches for longitudinal data
We address the matter of determining whether or not missing data in longitudinal studies are ignorable with regard to quasilikelihood or estimating equations approaches. This involves testing for
A test of missing completely at random for generalised estimating equations with missing data
We consider inference from generalised estimating equations when data are incomplete. A test for missing completely at random is proposed to help decide whether or not we should adjust estimating
A test of the missing data mechanism for repeated categorical data.
TLDR
A simple and practical test of the missing-data mechanism in incomplete repeated categorical data is developed, based on a test criterion given in general form by Wald and using data from a longitudinal investigation of obesity in school-age children.
A comparison of inclusive and restrictive strategies in modern missing data procedures.
TLDR
A simulation was presented to assess the potential costs and benefits of a restrictive strategy, which makes minimal use of auxiliary variables, versus an inclusive strategy; the results show that the inclusive strategy is to be greatly preferred.
Logistic regression with incompletely observed categorical covariates--investigating the sensitivity against violation of the missing at random assumption.
TLDR
This work presents a framework to specify alternative missing value mechanisms such that maximum likelihood estimation of the regression parameters under a specified alternative is possible and allows investigation of the sensitivity of a single estimate against violations of the missing at random assumption.
Missing data
  • John L.P. Thompson, G. Levy
  • Mathematics
    Amyotrophic lateral sclerosis and other motor neuron disorders : official publication of the World Federation of Neurology, Research Group on Motor Neuron Diseases
  • 2004
TLDR
The importance of missing data in RCTs is emphasized, how the problem can be handled in an unbiased way by imputation procedures is discussed, and some recommendations for trial design and conduct tailored to RCTs for ALS are made.
INFERENCE AND MISSING DATA
Two results are presented concerning inference when data may be missing. First, ignoring the process that causes missing data when making sampling distribution inferences about the parameter of the
Modeling the Drop-Out Mechanism in Repeated-Measures Studies
TLDR
Methods that simultaneously model the data and the drop-out process within a unified model-based framework are discussed, and possible extensions outlined.