Optimizing Prediction Using Bayesian Model Averaging: Examples Using Large-Scale Educational Assessments

@article{Kaplan2018OptimizingPU,
  title={Optimizing Prediction Using Bayesian Model Averaging: Examples Using Large-Scale Educational Assessments},
  author={David Kaplan and Chansoon Lee},
  journal={Evaluation Review},
  year={2018},
  volume={42},
  pages={423--457}
}
This article provides a review of Bayesian model averaging as a means of optimizing the predictive performance of common statistical models applied to large-scale educational assessments. The Bayesian framework recognizes that in addition to parameter uncertainty, there is uncertainty in the choice of models themselves. A Bayesian approach to addressing the problem of model uncertainty is the method of Bayesian model averaging. Bayesian model averaging searches the space of possible models for… 
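As a rough illustration of the idea in the abstract, the sketch below averages predictions from every subset of predictors in a linear regression, weighting each submodel by an approximate posterior model probability. It is not the authors' procedure: it assumes equal prior model probabilities, uses the BIC approximation to the marginal likelihood, and enumerates all submodels exhaustively instead of searching the model space. The function names `fit_ols` and `bma_predict` and the simulated data are hypothetical.

```python
import itertools
import numpy as np


def fit_ols(X, y):
    """Fit ordinary least squares and return (coefficients, BIC)."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    # Gaussian-likelihood BIC up to an additive constant shared by all models.
    bic = n * np.log(rss / n) + k * np.log(n)
    return beta, bic


def bma_predict(X, y, X_new):
    """Average predictions over all predictor subsets (assumed BIC weights)."""
    n, p = X.shape
    ones = np.ones((n, 1))
    ones_new = np.ones((X_new.shape[0], 1))
    subsets, bics, preds = [], [], []
    for r in range(p + 1):
        for cols in itertools.combinations(range(p), r):
            idx = np.array(cols, dtype=int)
            Xm = np.hstack([ones, X[:, idx]])
            Xm_new = np.hstack([ones_new, X_new[:, idx]])
            beta, bic = fit_ols(Xm, y)
            subsets.append(cols)
            bics.append(bic)
            preds.append(Xm_new @ beta)
    bics = np.array(bics)
    # exp(-BIC/2) approximates the marginal likelihood; subtracting the
    # minimum first keeps the exponentials numerically stable.
    weights = np.exp(-0.5 * (bics - bics.min()))
    weights /= weights.sum()
    avg_pred = weights @ np.vstack(preds)
    return avg_pred, weights, subsets


# Simulated data (hypothetical): only the first of three predictors matters.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=200)
X_new = rng.normal(size=(5, 3))

avg_pred, weights, subsets = bma_predict(X, y, X_new)
best = subsets[int(np.argmax(weights))]
print("highest-weight subset:", best)
print("model-averaged predictions:", np.round(avg_pred, 2))
```

Exhaustive enumeration is only feasible for a handful of predictors, since the number of submodels doubles with each one added; with many predictors some form of model search or sampling would be needed instead.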

A Conceptual Introduction to Bayesian Model Averaging

In this conceptual introduction, the principles of BMA are explained, its advantages over all-or-none model selection are described, and its utility is showcased in three examples: analysis of covariance, meta-analysis, and network analysis.

An Approach to Addressing Multiple Imputation Model Uncertainty Using Bayesian Model Averaging

An extensive simulation study examines the extent of model uncertainty in multiple imputation and finds a consistent advantage for the Bayesian model averaging approach over normal theory-based Bayesian imputation that does not account for model uncertainty.

Bayesian probabilistic forecasting with large-scale educational trend data: a case study using NAEP

A Bayesian probabilistic forecasting workflow is provided that can be used with large-scale assessment trend data generally, and the workflow is demonstrated with an application to the state NAEP assessments.

On the Choice of the Item Response Model for Scaling PISA Data: Model Selection Based on Information Criteria and Quantifying Model Uncertainty

In educational large-scale assessment studies such as PISA, item response theory (IRT) models are used to summarize students’ performance on cognitive test items across countries. In this article,

Why Model Averaging

Model averaging is a means of allowing for model uncertainty in estimation which can provide better estimates and more reliable confidence intervals than model selection. We illustrate its use via

A Bootstrapping Assessment on A U.S. Education Indicator Construction Through Multiple Imputation

The results show that the 20-imputation setting reduced the standard error and improved normality in comparison to the five-imputation setting, and the findings from this study indicate that an increase of the resampling number is unlikely to reduce the standard errors.

Policies and Practices of Assessment: A Showcase for the Use (and Misuse) of International Large Scale Assessments in Educational Effectiveness Research

International Large Scale Assessments (ILSAs) such as TIMSS and PISA provide comparative indicators and trend information on educational systems. Scholars repeatedly claimed that ILSAs should be

References

Bayesian Model Averaging for Propensity Score Analysis

An approximate Bayesian model averaging approach is investigated that is based on the model-averaged propensity score estimates produced by the R package BMA but that ignores uncertainty in the propensity score.

Bayesian Model Averaging Over Directed Acyclic Graphs With Implications for the Predictive Performance of Structural Equation Models

This article examines Bayesian model averaging as a means of addressing predictive performance in Bayesian structural equation models by considering a structural equation model as a special case of a directed acyclic graph and provides an algorithm that searches the model space for submodels and obtains a weighted average of the submodels using posterior model probabilities as weights.

Comparison of Bayesian model averaging and stepwise methods for model selection in logistic regression

The results show that in most cases Bayesian model averaging selects the correct model and out-performs stepwise approaches at predicting an event of interest.

Bayesian Model Averaging for Linear Regression Models

We consider the problem of accounting for model uncertainty in linear regression models. Conditioning on a single selected model ignores model uncertainty, and thus leads to the

Bayesian Model Averaging and Model Search Strategies

In regression models, such as generalized linear models, there is often substantial prior uncertainty about the choice of covariates to include. Conceptually, the Bayesian paradigm can easily

Model Selection and Accounting for Model Uncertainty in Graphical Models Using Occam's Window

We consider the problem of model selection and accounting for model uncertainty in high-dimensional contingency tables, motivated by expert system applications. The approach most used

Posterior Predictive Assessment of Model Fitness via Realized Discrepancies

This paper considers Bayesian counterparts of the classical tests for goodness of fit and their use in judging the fit of a single Bayesian model to the observed data. We focus on posterior

Bayesian model averaging employing fixed and flexible priors: The BMS package for R

The BMS (Bayesian model sampling) package for R that implements Bayesian model averaging for linear regression models excels in allowing for a variety of prior structures, among them the "binomial-beta" prior on the model space and the so-called "hyper-g" specifications for Zellner's g prior.

Bayesian measures of model complexity and fit

The posterior mean deviance is suggested as a Bayesian measure of fit or adequacy, and the contributions of individual observations to the fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages.

Bayesian Model Averaging: A Tutorial

Bayesian model averaging (BMA) provides a coherent mechanism for accounting for this model uncertainty and provides improved out-of-sample predictive performance.
...