Statistical inference for exploratory data analysis and model diagnostics

@article{Buja2009StatisticalIF,
  title={Statistical inference for exploratory data analysis and model diagnostics},
  author={A. Buja and D. Cook and H. Hofmann and M. Lawrence and Eun-Kyung Lee and Deborah F. Swayne and H. Wickham},
  journal={Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences},
  year={2009},
  volume={367},
  pages={4361--4383}
}
  • A. Buja, D. Cook, H. Hofmann, M. Lawrence, Eun-Kyung Lee, Deborah F. Swayne, H. Wickham
  • Published 2009
  • Medicine, Biology
  • Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
We propose to furnish visual statistical methods with an inferential framework and protocol, modelled on confirmatory statistical testing. In this framework, plots take on the role of test statistics, and human cognition the role of statistical tests. Statistical significance of ‘discoveries’ is measured by having the human viewer compare the plot of the real dataset with collections of plots of simulated datasets. A simple but rigorous protocol that provides inferential validity is modelled…
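The comparison described in the abstract, one plot of the real data hidden among plots of simulated data, can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the function names `make_lineup`, `significance_level`, and `viewer_rejects_null` are hypothetical, and permuting the response is an assumed way of simulating null datasets (it fits a null hypothesis of no association between `x` and `y`; other nulls need other simulation schemes). With one viewer and `m` panels, a viewer who cannot tell the panels apart picks the real one with probability `1/m`, which is the type I error rate of the visual test.

```python
import random


def make_lineup(x, y, m=20, seed=0):
    """Build m datasets: the real (x, y) at a random position, plus m-1 nulls.

    Null datasets are made by permuting y (an assumption for this sketch),
    which breaks any x-y association while keeping both marginals intact.
    """
    rng = random.Random(seed)
    true_pos = rng.randrange(m)          # where the real data is hidden
    panels = []
    for i in range(m):
        if i == true_pos:
            panels.append((list(x), list(y)))
        else:
            y_null = list(y)
            rng.shuffle(y_null)          # permutation null
            panels.append((list(x), y_null))
    return panels, true_pos


def significance_level(m):
    """Chance that a single viewer picks the real panel if the null is true."""
    return 1.0 / m


def viewer_rejects_null(picked, true_pos):
    """The visual test rejects the null when the viewer singles out the real data."""
    return picked == true_pos


# Example data: a noisy linear trend that should stand out among the nulls.
data_rng = random.Random(1)
x = list(range(30))
y = [2.0 * xi + data_rng.gauss(0.0, 5.0) for xi in x]

panels, true_pos = make_lineup(x, y, m=20, seed=42)
alpha = significance_level(len(panels))
```

Each entry of `panels` would then be drawn with the same plotting code and shown to a viewer who does not know `true_pos`; with `m = 20` a correct pick corresponds to rejecting the null at the 0.05 level.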
Investigations into Visual Statistical Inference
This work provides instructions on how to design human subject experiments to use Amazon’s Mechanical Turk to implement the lineup protocol, a new method that enables the data plot to be compared with null plots, in order to obtain estimates of statistical significance of structure.
Validation of Visual Statistical Inference, Applied to Linear Models
Inferential methods for statistical graphics are developed further by refining the terminology of visual inference and framing the lineup protocol in a context that allows direct comparison with conventional tests in scenarios when a conventional test exists.
Validation of Visual Statistical Inference, with Application to Linear Models
Statistical graphics play a crucial role in exploratory data analysis, model checking and diagnosis. Until recently there were no formal visual methods in place for determining statistical…
Visual Statistical Inference for Regression Parameters
Statistical graphics play a crucial role in exploratory data analysis, model checking and diagnosis. Until recently there were no formal visual methods in place for determining statistical…
Variations of Q–Q Plots: The Power of Our Eyes!
In statistical modeling, we strive to specify models that resemble data collected in studies or observed from processes. Consequently, distributional specification and parameter estimation…
Explorations of the lineup protocol for visual inference: application to high dimension, low sample size problems and metrics to assess the quality
The research conducted and described in this thesis explores the use of visual inference for understanding low dimensional pictures of HDLSS data, using data collected from Amazon Turk studies conducted with lineups for studying an array of exploratory data analysis tasks.
Diagnostics for mixed/hierarchical linear models
An overview of the diagnostic tools available for hierarchical linear models that are familiar from linear models is presented and the utility of the lineup protocol for residual analysis with complex models is discussed.
Designing for interactive exploratory data analysis requires theories of graphical inference
Research and development in computer science and statistics have produced increasingly sophisticated software interfaces for interactive and exploratory analysis,…
Human Factors Influencing Visual Statistical Inference
Results from multiple visual inference studies using Amazon's Mechanical Turk are examined to provide an assessment of the power of factors, which suggest that individual skills vary substantially, but demographics do not have a huge effect on performance.
Diagnostic tools for hierarchical linear models
An overview of the diagnostic tools available for hierarchical linear models that are familiar from linear models is presented and the utility of the lineup protocol for residual analysis with complex models is discussed.

References

SHOWING 1-10 OF 69 REFERENCES
Two graphical displays for outlying and influential observations in regression
The paper describes two procedures for detecting observations with outlying values either in the response variable or in the explanatory variables in multiple regression. These procedures are…
Exploratory Data Analysis for Complex Models
This article proposes an approach to unify exploratory data analysis with more formal statistical methods based on probability models, developed in the context of examples from fields including psychology, medicine, and social science.
A Bayesian Sampling Approach to Regression Model Checking
A necessary step in any regression analysis is checking the fit of the model to the data. Graphical methods are often employed to allow visualization of features that the data should exhibit if the…
Rotation tests
This paper describes a generalised framework for doing Monte Carlo tests in multivariate linear regression and offers an exact Monte Carlo solution to a classical problem of multiple testing.
Bootstrap Methods and their Application
This book gives a broad and up-to-date coverage of bootstrap methods, with numerous applied examples, developed in a coherent way with the necessary theoretical basis, including improved Monte Carlo simulation.
The Grammar of Graphics
  • Mark Bailey
  • Mathematics, Computer Science
  • Technometrics
  • 2007
The book describes clearly and intuitively the differences between exploratory and confirmatory factor analysis, and discusses how to construct, validate, and assess the goodness of fit of a measurement model in SEM by confirmatory factor analysis.
Problem Solving: A Statistician's Guide
The striving for simplicity is indeed one of the most significant developments in a statistics course over the past 25 years, even more important than the generalised use of computers in the manipulation of data.
Local Regression Models
Local regression models are regression models where the parameters are ‘localized’, that is, they are allowed to vary with some or all of the covariates in a general way. Suppose that (Y, X) are…
Calibration for Simultaneity: (Re)Sampling Methods for Simultaneous Inference with Applications to Function Estimation and Functional Data
CfS applies whenever inference is based on a single distribution, for example: 1) fixed distributions such as Gaussians when diagnosing distributional assumptions, 2) conditional null distributions…
The 2005 Neyman Lecture: Dynamic Indeterminism in Science
Jerzy Neyman's life history and some of his contributions to applied statistics are reviewed, and a number of data sets and corresponding substantive questions are addressed.