How Much Should We Trust Differences-in-Differences Estimates?


Most papers that employ Differences-in-Differences estimation (DD) use many years of data and focus on serially correlated outcomes but ignore that the resulting standard errors are inconsistent. To illustrate the severity of this issue, we randomly generate placebo laws in state-level data on female wages from the Current Population Survey. For each law, we use OLS to compute the DD estimate of its “effect” as well as the standard error of this estimate. These conventional DD standard errors severely understate the standard deviation of the estimators: we find an “effect” significant at the 5 percent level for up to 45 percent of the placebo interventions. We use Monte Carlo simulations to investigate how well existing methods help solve this problem. Econometric corrections that place a specific parametric form on the time-series process do not perform well. Bootstrap (taking into account the autocorrelation of the data) works well when the number of states is large enough. Two corrections based on asymptotic approximation of the variance-covariance matrix work well for moderate numbers of states, and one correction that collapses the time series information into a “pre”- and “post”-period and explicitly takes into account the effective sample size works well even for small numbers of states.
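The placebo-law experiment described in the abstract can be sketched in a few lines. The block below is a minimal illustration under stated assumptions, not the authors' code: it replaces the CPS wage data with simulated state-year outcomes whose errors follow an AR(1) process, and every name and parameter in it (`rejection_rates`, `rho=0.8`, 20 states, 20 years, 200 draws) is hypothetical. For each placebo law it compares the conventional OLS t-test from a two-way fixed-effects DD regression with a t-test on data collapsed to one “pre” and one “post” average per state.

```python
import random
from math import sqrt

# Two-sided 5% critical value of Student's t with 18 degrees of freedom
# (n_states - 2 for the default 20 states). Update it if n_states changes.
T_CRIT_DF18 = 2.101


def simulate_panel(rng, n_states, n_years, rho):
    """State-year outcomes: state level + AR(1) error, no true law effect."""
    panel = []
    for _ in range(n_states):
        level = rng.gauss(0, 1)
        e = rng.gauss(0, 1) / sqrt(1 - rho ** 2)  # draw from stationary dist.
        row = []
        for _ in range(n_years):
            e = rho * e + rng.gauss(0, 1)
            row.append(level + e)
        panel.append(row)
    return panel


def demean_two_way(x):
    """Remove state and year means (valid for a balanced panel)."""
    n, t = len(x), len(x[0])
    row_means = [sum(r) / t for r in x]
    col_means = [sum(x[i][j] for i in range(n)) / n for j in range(t)]
    grand = sum(row_means) / n
    return [[x[i][j] - row_means[i] - col_means[j] + grand
             for j in range(t)] for i in range(n)]


def dd_ols_t(y, d):
    """t-statistic of the DD coefficient with conventional (i.i.d.) OLS SEs."""
    n, t = len(y), len(y[0])
    yt, dt = demean_two_way(y), demean_two_way(d)
    pairs = [(a, b) for ry, rd in zip(yt, dt) for a, b in zip(ry, rd)]
    sdd = sum(b * b for _, b in pairs)
    beta = sum(a * b for a, b in pairs) / sdd
    rss = sum((a - beta * b) ** 2 for a, b in pairs)
    dof = n * t - n - t  # n state dummies, t - 1 year dummies, 1 law dummy
    return beta / sqrt(rss / dof / sdd)


def collapsed_t(y, treated, start):
    """Two-sample t-test on each state's (post mean - pre mean) change."""
    n_years = len(y[0])
    diffs = [sum(r[start:]) / (n_years - start) - sum(r[:start]) / start
             for r in y]
    a = [v for i, v in enumerate(diffs) if i in treated]
    b = [v for i, v in enumerate(diffs) if i not in treated]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    ss = sum((v - ma) ** 2 for v in a) + sum((v - mb) ** 2 for v in b)
    pooled_var = ss / (len(a) + len(b) - 2)
    return (ma - mb) / sqrt(pooled_var * (1 / len(a) + 1 / len(b)))


def rejection_rates(n_sims=200, n_states=20, n_years=20, rho=0.8, seed=0):
    """Share of placebo laws found 'significant' at 5% by each method."""
    rng = random.Random(seed)
    ols_rej = col_rej = 0
    for _ in range(n_sims):
        y = simulate_panel(rng, n_states, n_years, rho)
        treated = set(rng.sample(range(n_states), n_states // 2))
        start = rng.randrange(n_years // 4, 3 * n_years // 4)
        d = [[1.0 if (i in treated and j >= start) else 0.0
              for j in range(n_years)] for i in range(n_states)]
        ols_rej += abs(dd_ols_t(y, d)) > 1.96
        col_rej += abs(collapsed_t(y, treated, start)) > T_CRIT_DF18
    return ols_rej / n_sims, col_rej / n_sims


if __name__ == "__main__":
    ols_rate, collapsed_rate = rejection_rates()
    print(f"conventional OLS rejection rate: {ols_rate:.2f}")
    print(f"collapsed pre/post rejection rate: {collapsed_rate:.2f}")
```

Because every placebo law has a true effect of zero, a well-behaved test should reject about 5 percent of the time. With positively autocorrelated errors, the conventional OLS test in this sketch should reject far more often, while the collapsed test should stay near its nominal level. The sketch gives all treated states the same start year, which is the simple case where collapsing reduces to a two-period comparison.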

Cite this paper

@inproceedings{Bertrand2004HowMS, title={How Much Should We Trust Differences-in-differences Estimates?*}, author={Marianne Bertrand and Esther Duflo and Sendhil Mullainathan and Abhijit Banerjee and Victor Chernozhukov and Michael D. Grossman and Jerry Hausman and Kei Hirano and Bo E. Honor{\'e} and Guido W. Imbens and Jeffrey R. Kling and K{\'e}vin Lang and Steven D. Levitt and Kevin Murphy}, year={2004} }