Exploratory Data Analysis for Complex Models

@article{Gelman2004ExploratoryDA,
  title={Exploratory Data Analysis for Complex Models},
  author={Andrew Gelman},
  journal={Journal of Computational and Graphical Statistics},
  year={2004},
  volume={13},
  pages={755--779}
}
  • A. Gelman
  • Published 2004
  • Computer Science
  • Journal of Computational and Graphical Statistics
“Exploratory” and “confirmatory” data analysis can both be viewed as methods for comparing observed data to what would be obtained under an implicit or explicit statistical model. For example, many of Tukey's methods can be interpreted as checks against hypothetical linear models and Poisson distributions. In more complex situations, Bayesian methods can be useful for constructing reference distributions for various plots that are useful in exploratory data analysis. This article proposes an…
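The idea in the abstract, comparing observed data to what a model would produce, can be illustrated with a small posterior predictive check. This is only a sketch under stated assumptions, not the article's own procedure: the count data are made up, the Poisson model with a conjugate Gamma prior is an illustrative choice, and the function names (`poisson_draw`, `dispersion`, `posterior_predictive_check`) are hypothetical.

```python
import math
import random


def poisson_draw(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def dispersion(values):
    """Test statistic T(y): sample variance divided by sample mean (about 1 for Poisson data)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return var / mean if mean > 0 else 0.0


def posterior_predictive_check(y, n_rep=2000, prior_shape=1.0, prior_rate=1.0, seed=0):
    """Compare T(y) with T(y_rep) for datasets replicated under a Gamma-Poisson model.

    Assumed model (illustrative only): y_i ~ Poisson(lam), lam ~ Gamma(prior_shape, prior_rate),
    so the posterior is Gamma(prior_shape + sum(y), prior_rate + len(y)).
    """
    rng = random.Random(seed)
    post_shape = prior_shape + sum(y)
    post_rate = prior_rate + len(y)
    t_obs = dispersion(y)
    t_rep = []
    for _ in range(n_rep):
        # random.gammavariate takes (shape, scale), so pass scale = 1 / rate.
        lam = rng.gammavariate(post_shape, 1.0 / post_rate)
        y_rep = [poisson_draw(lam, rng) for _ in y]
        t_rep.append(dispersion(y_rep))
    # Tail-area probability: share of replications at least as dispersed as the data.
    p_value = sum(t >= t_obs for t in t_rep) / n_rep
    return t_obs, t_rep, p_value


# Hypothetical counts with many zeros and a few large values.
y_observed = [0, 0, 1, 0, 7, 0, 2, 9, 0, 1, 0, 12, 0, 3, 0, 8]
t_obs, t_rep, p_value = posterior_predictive_check(y_observed)
print(f"T(y) = {t_obs:.2f}, posterior predictive p-value = {p_value:.3f}")
```

A histogram of `t_rep` with `t_obs` marked on it would be the graphical form of the same comparison: the replicated statistics serve as the reference distribution against which the observed value is read.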
Exploratory Data Analysis
TLDR
The philosophical justification for EDA is presented in terms of C. S. Peirce's concept of abduction and the recognition of a broad range of analytic needs that arise throughout the research process.
Designing for Interactive Exploratory Data Analysis Requires Theories of Graphical Inference
TLDR
The paper describes how, without a grounding in theories of human statistical inference, research in exploratory visual analysis can lead to contradictory interface objectives and to representations of uncertainty that discourage users from drawing valid inferences.
Visualization in Bayesian Data Analysis
Modern Bayesian statistical science commonly proceeds without reference to statistical graphics; both involve computation, but they are rarely considered to be connected. Traditional views about the…
Statistical inference for exploratory data analysis and model diagnostics
  • A. Buja, D. Cook, +4 authors H. Wickham
  • Medicine, Biology
  • Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
  • 2009
TLDR
The proposed protocols will be useful for exploratory data analysis, with reference datasets simulated by using a null assumption that structure is absent, and teachers might find that incorporating these protocols into the curriculum improves their students’ statistical thinking.
Exploratory Data Analysis
TLDR
Key components of EDA are the combination of statistical models and graphics and the incorporation of domain knowledge, and interactive graphical tools are particularly valuable for exploratory work.
Validation of Visual Statistical Inference, with Application to Linear Models
Statistical graphics play a crucial role in exploratory data analysis, model checking and diagnosis. Until recently there were no formal visual methods in place for determining statistical…
Visual Statistical Inference for Regression Parameters
Statistical graphics play a crucial role in exploratory data analysis, model checking and diagnosis. Until recently there were no formal visual methods in place for determining statistical…
Getting the most from your curves: Exploring and reporting data using informative graphical techniques
TLDR
The role of exploratory data analysis in detecting Type I and Type II errors is considered, and it is proposed that essential summary statistics and information about the shape and variability of data should be reported via graphical techniques.
Visualizing Count Data Regressions Using Rootograms
The rootogram is a graphical tool associated with the work of J. W. Tukey that was originally used for assessing goodness of fit of univariate distributions. Here, we extend the rootogram to…
Exploratory Data Analysis using Random Forests
Although the rise of "big data" has made machine learning algorithms more visible and relevant for social scientists, they are still widely considered to be "black box" models that are not well…

References

Graphical Methods for Assessing Logistic Regression Models
In ordinary linear regression, graphical diagnostic displays can be very useful for detecting and examining anomalous features in the fit of a model to data. For logistic regression models,…
Two graphical displays for outlying and influential observations in regression
The paper describes two procedures for detecting observations with outlying values either in the response variable or in the explanatory variables in multiple regression. These procedures are…
Probability plotting methods for the analysis of data.
This paper describes and discusses graphical techniques, based on the primitive empirical cumulative distribution function and on quantile (Q-Q) plots, percent (P-P) plots and hybrids of…
Multiple imputation for model checking: completed-data plots with missing and latent data.
TLDR
The methods of missing-data model checking can be interpreted as "predictive inference" in a non-Bayesian context and the graphical diagnostics within this framework are considered.
Diagnostic checks for discrete data regression models using posterior predictive simulations
Model checking with discrete data regressions can be difficult because the usual methods such as residual plots have complicated reference distributions that depend on the parameters in the model.
Graphical Methods for Data Analysis
TLDR
This paper presents a meta-modelling framework for developing and assessing regression models for multivariate and multi-dimensional data distributions and describes the distribution of a set of data.
Models, assumptions and model checking in ecological regressions
Ecological regression is based on assumptions that are untestable from aggregate data. However, these assumptions seem more questionable in some applications than in others. There has been some…
Let's Practice What We Preach: Turning Tables into Graphs
TLDR
It is shown that presentations can be improved using graphs that actually take up less space than the original tables, with multiple repeated line plots being a particularly effective tool.
Posterior Predictive $p$-Values
Extending work of Rubin, this paper explores a Bayesian counterpart of the classical $p$-value, namely, a tail-area probability of a "test statistic" under a null hypothesis. The Bayesian…
Bayesian Data Analysis
TLDR
Detailed notes on Bayesian computation, the basics of Markov chain simulation, regression models, and asymptotic theorems are provided.