Corpus ID: 14214939

The BigDawg Architecture and Reference Implementation

  • Jennie Duggan, Aaron J. Elmore, Tim Kraska, Samuel Madden, Timothy G. Mattson, Michael Stonebraker
This paper presents the reference implementation of a new architecture for future “Big Data” applications. Such applications require “big analytics” as one might expect, but they also require real-time streaming support, real-time analytics, data visualization, and cross-storage queries. We are guided by the principle “one size does not fit all” [7], and we build on top of three storage engines, each designed for specialized use cases. In addition, we demonstrate novel support for querying…
The BigDawg monitoring framework
A monitoring framework for the BigDawg federated database system which maintains performance information on benchmark queries and can determine the optimal query execution plan for similar incoming queries.
D4M: Bringing associative arrays to database engines
The process of building the D4M-SciDB connector is described, along with the present performance of this connection, in order to showcase how new databases may be supported by D4M.
Classifying, evaluating and advancing big data benchmarks
The thesis attempts to redefine system benchmarking to account for the new requirements posed by Big Data applications. With the explosion of Artificial Intelligence (AI) and new hardware computing power, this is a first step towards a more holistic approach to benchmarking.
Performance evaluation of an integrated RFI database for the MeerKAT/SKA radio telescope
The findings provide a guide for the proposed integrated RFI system at the MeerKAT/SKA radio telescope and show that SciDB and Accumulo scale better than PSQL in multi-user environments.
ModelWizard: Toward Interactive Model Construction
This work proposes an interactive model construction framework grounded in composable operations and presents ModelWizard, a domain-specific language embedded in F# to construct Tabular models, as a new model construction paradigm, speeding discovery of the universe's structure.
Big Data Technology Accelerate Genomics Precision Medicine
  • Hao Li
  • Computer Science
  • ArXiv
  • 2017
This paper demonstrates how Intel Big Data Technology and Architecture help to facilitate and accelerate genomics life science research in data storage and utilization.


Demonstration of the Myria big data management service
This interactive demonstration will guide visitors through an exploration of several key Myria features by interfacing with the live system to analyze big datasets over the web.
Dynamic distributed dimensional data model (D4M) database and computation system
  • J. Kepner, W. Arcand, +13 authors Charles Yee
  • Computer Science
  • 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2012
D4M (Dynamic Distributed Dimensional Data Model) has been developed to provide a mathematically rich interface to tuple stores (and structured query language “SQL” databases), making it possible to create composable analytics with significantly less effort than traditional approaches.
SEEDB: Automatically Generating Query Visualizations
This work demonstrates SeeDB, a system that partially automates this task: given a query, SeeDB explores the space of all possible visualizations, and automatically identifies and recommends to the analyst those visualizations it finds to be most "interesting" or "useful".
A Demonstration of SciDB: A Science-Oriented DBMS
An overview of SciDB's key features is presented, along with a demonstration of the first version of SciDB on data and operations from one of the authors' lighthouse users, the Large Synoptic Survey Telescope (LSST).
"One Size Fits All": An Idea Whose Time Has Come and Gone?
In a later publication [St07], Michael Stonebraker even put forward the thesis that there are no applications for which traditional database systems are the best alternative.
Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II): A Public-Access Intensive Care Unit Database
We report the establishment of the Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) research database that is notable for four factors: it is publicly and freely available to…
Multiparameter Intelligent Monitoring in Intensive Care II: A public-access intensive care unit database
MIMIC-II documents a diverse and very large population of intensive care unit patient stays and contains comprehensive and detailed clinical data, including physiological waveforms and minute-by-minute trends for a subset of records.