Data is dead... without what-if models

Peter J. Haas, Paul P. Maglio, Patricia G. Selinger, Wang Chiew Tan
Proceedings of the VLDB Endowment, pages 1486-1489
Current database technology has raised the art of scalable descriptive analytics to a very high level. Unfortunately, what enterprises really need is prescriptive analytics to identify optimal business, policy, investment, and engineering decisions in the face of uncertainty. Such analytics, in turn, rest on deep predictive analytics that go beyond mere statistical forecasting and are imbued with an understanding of the fundamental mechanisms that govern a system's behavior, allowing what-if… 

In-Database Decision Support: Opportunities and Challenges

This work shows how deep integration between the DBMS, predictive models, and optimization software creates opportunities for rich prescriptive-query functionality with good scalability and performance.

Mega-modeling for Big Data Analytics

This work proposes mega-modelling as a new holistic data and model management system for the acquisition, composition, integration, management, querying and mining of data and models, capable of mastering the co-evolution of data and models and of supporting the creation of what-if analyses, predictive analytics and scenario explorations.

Exploring Data Partitions for What-if Analysis

Techniques to recommend data ranges for what-if analysis, which capture not only data regularities, but also the trade-off between conflicting goals, are proposed.

$Υ$-DB: A system for data-driven hypothesis management and analytics

This demo presents a first prototype of the $\Upsilon$-DB system and showcases its core innovative features by means of use case scenarios in computational science in which the hypotheses are extracted from a model repository on the web and evaluated (rated/ranked) as probabilistic data.

What-If Analysis with Conflicting Goals: Recommending Data Ranges for Exploration

Techniques to recommend data ranges for what-if analysis, which capture not only data regularities, but also the trade-off between conflicting goals, are proposed.

Managing large-scale scientific hypotheses as uncertain data with support for predictive analytics

This paper shows the applicability of Υ-DB in a real-world scenario and presents use cases in Computational Hemodynamics derived from the Physiome project.

Augmenting Decision Making via Interactive What-If Analysis

This work argues for four functionalities that business users need in order to interactively learn and reason about the relationships (functions) between sets of data attributes, thereby facilitating data-driven decision making. The functionalities are implemented in SystemD, an interactive visual data analysis system that enables business users to experiment with the data by asking what-if questions.

Situation aware computing for big data

This work introduces a Knowledge Intensive Data-processing System (KIDS) that empowers Big Data applications to support situation awareness and bridges the gap between the world of low-value data and the world of high-value information and knowledge.

Information visualisation and data analysis using web mash-up systems

To address complex data problems, a comprehensive and robust visualisation model and system is introduced that analyses complex data sets through data analysis and information visualisation, making it possible for decision makers to identify the expected and discover the unexpected.



E = MC3: managing uncertain enterprise data in a cluster-computing environment

A new system, called MC3 (Monte Carlo Computation on a Cluster), is presented that extends the MCDB approach to the map-reduce processing framework, exploits the robustness and scalability of map-reduce, and can handle data stored in non-relational formats.

MCDB: a Monte Carlo approach to managing uncertain data

MCDB is introduced, a system for managing uncertain data that is based on a Monte Carlo approach, which can easily handle arbitrary joint probability distributions over discrete or continuous attributes, arbitrarily complex SQL queries, and arbitrary functionals of the query-result distribution such as means, variances, and quantiles.

Clio grows up: from research prototype to industrial tool

The architecture and algorithms behind Clio are revisited, and some implementation issues, optimizations needed for scalability, and general lessons learned on the road towards creating an industrial-strength tool are discussed.

Behavioral simulations in MapReduce

BRACE (Big Red Agent-based Computation Engine) is presented; it extends the MapReduce framework to process behavioral simulations efficiently across a cluster. It includes a high-level language called BRASIL (the Big Red Agent SImulation Language), which has object-oriented features for programming simulations but can be compiled to a dataflow representation for automatic parallelization and optimization.

Business Dynamics—Systems Thinking and Modeling for a Complex World

This book is most obviously relevant to practitioners who already have some experience of multiagency facilitation, but might also serve as an introduction to working in this arena, if carefully supplemented with further reading and exploration of the topics it covers.


The Smarter Planet Platform for Analysis and Simulation of Health (Splash) is described, a platform for composing multiple heterogeneous models, simulations, and datasets that comprises mechanisms for cataloging, describing, connecting, and executing a set of models together.

ManyEyes: a Site for Visualization at Internet Scale

The design and deployment of Many Eyes is described, a public Web site where users may upload data, create interactive visualizations, and carry on discussions to support collaboration around visualizations at a large scale by fostering a social style of data analysis.

Simulation models of obesity: a review of the literature and implications for research and policy

  • D. Levy, P. Mabry, B. Swinburn
  • Medicine
    Obesity Reviews: an official journal of the International Association for the Study of Obesity
  • 2011
An overview of existing simulation models (SMs) for obesity is presented, categorizing these models according to their focus: health and economic outcomes, trends in obesity as a function of past trends, physiologically based behavioural models, environmental contributors to obesity, and policy interventions.

Integrated simulation and gaming architecture for incident management training

This paper presents a novel approach that integrates gaming and simulation systems to train decision makers and responders on the same scenarios, preparing them to work together as a team.

A Systems-Oriented Multilevel Framework for Addressing Obesity in the 21st Century

The goal was to create a climate of training, funding, and academic and institutional support for obesity research that will offer sustainable solutions to the obesity problem.