Aurora: a new model and architecture for data stream management

  • Daniel J. Abadi, Donald Carney, Ugur Çetintemel, Mitch Cherniack, Christian Convey, Sangdon Lee, Michael Stonebraker, Nesime Tatbul, Stanley B. Zdonik
  • The VLDB Journal
Abstract. This paper describes the basic processing model and architecture of Aurora, a new system to manage data streams for monitoring applications. Monitoring applications differ substantially from conventional business data processing. The fact that a software system must process and react to continual inputs from many sources (e.g., sensors) rather than from human operators requires one to rethink the fundamental architecture of a DBMS for this application area. In this paper, we present…
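The abstract's "boxes-and-arrows" processing model can be illustrated with a minimal sketch: operators ("boxes") consume tuples from input queues and feed downstream boxes. This is not Aurora's actual implementation; the class names (`Box`, `Filter`, `TumblingAvg`) and the tumbling-window operator are hypothetical illustrations of the idea.

```python
from collections import deque

class Box:
    """A generic stream operator ("box") with an input queue."""
    def __init__(self):
        self.inbox = deque()
    def push(self, tup):
        self.inbox.append(tup)
    def run(self):
        # Drain the queue, emitting zero or more output tuples per input.
        out = []
        while self.inbox:
            out.extend(self.process(self.inbox.popleft()))
        return out

class Filter(Box):
    """Drop tuples that fail a predicate."""
    def __init__(self, pred):
        super().__init__()
        self.pred = pred
    def process(self, tup):
        return [tup] if self.pred(tup) else []

class TumblingAvg(Box):
    """Average over fixed-size, non-overlapping (tumbling) windows."""
    def __init__(self, size):
        super().__init__()
        self.size, self.buf = size, []
    def process(self, tup):
        self.buf.append(tup)
        if len(self.buf) == self.size:
            avg = sum(self.buf) / self.size
            self.buf = []
            return [avg]
        return []

# Wire boxes into a pipeline: sensor readings -> drop negatives -> window average.
readings = [3.0, -1.0, 5.0, 7.0, -2.0, 9.0, 11.0]
f, agg = Filter(lambda x: x >= 0), TumblingAvg(2)
for r in readings:
    f.push(r)
for t in f.run():
    agg.push(t)
result = agg.run()
print(result)  # averages of each complete window of the filtered stream
```

The key architectural point the sketch preserves is that operators communicate only through queues, which is what lets a scheduler decide independently when and how long each box runs.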

The Anatomy of a Stream Processing System

The basic processing model and architecture of MavStream, a new Data Stream Management System (DSMS) being developed at UT Arlington, are described, and the effect of different scheduling strategies and buffer sizes on performance and output is analyzed.

A System for Processing Continuous Queries over Infinite Data Streams

This paper shows that it is possible to process continuous data streams from several sources, and that in some respects better performance is also achievable.

Integrating a Stream Processing Engine and Databases for Persistent Streaming Data Management

This paper describes the data stream management system, which employs an architecture combining a stream processing engine and DBMS, and a proposed query language that supports not only filtering, join, and projection over data streams, but also continuous persistence requirements for stream data.

DBMS meets DSMS - Towards a Federated Solution

The requirements and benefits of integrating data stream processing with database management systems are described, and the design of a federated system that provides the benefits of both approaches is discussed.

SPC: a distributed, scalable platform for data mining

The SPC programming model is described, which is, to the best of the authors' knowledge, the first to support stream-mining applications using a subscription-like model for specifying stream connections, as well as to provide support for non-relational operators.

Data Stream Management Systems

  • Sandra Geisler
  • Computer Science
    Data Exchange, Information, and Streams
  • 2013
This chapter gives an overview of the basics of data streams, the architectural principles of DSMS, and the query languages used, and details data quality aspects in DSMS, as these play an important role for various applications based on data streams.

An Efficient and Highly Available Distributed Data Management System

An efficient and highly available distributed data management system (DDMS) consisting of three components, a PUCC node, a stream manager, and a data manager, is proposed; evaluation with YCSB reveals that query-processing performance can be improved by combining the data manager with the stream manager.

Management and Federation of Stream Processing Applications

This thesis explores stream processing through traditional stream processing applications as well as applications that treat personal information as data streams, each with different requirements on how stream processing solutions can be deployed, integrated, extended, and federated.

Data Ingestion for the Connected World

It is argued that in many “Big Data” applications, getting data into the system correctly and at scale via traditional ETL processes is a fundamental roadblock to being able to perform timely analytics or make real-time decisions.

Alert: An Architecture for Transforming a Passive DBMS into an Active DBMS

The design of Alert and its implementation in the Starburst extensible DBMS is presented and a layered architecture that allows the semantics of a variety of production rule languages to be supported on top is provided.

The Design of the Borealis Stream Processing Engine

This paper outlines the basic design and functionality of Borealis, and presents a highly flexible and scalable QoS-based optimization model that operates across server and sensor networks and a new fault-tolerance model with flexible consistency-availability trade-offs.

Tribeca: A System for Managing Large Databases of Network Traffic

Tribeca is an extensible, stream-oriented DBMS designed to support network traffic analysis that combines ideas from temporal and sequence databases with an implementation optimized for databases stored on high speed ID-1 tapes or arriving in real time from the network.

Continuous queries over data streams

A general and flexible architecture for query processing in the presence of data streams is specified, which captures most previous work on continuous queries and data streams, as well as related concepts such as triggers and materialized views.

Online aggregation

A new online aggregation interface is proposed that permits users to both observe the progress of their aggregation queries and control execution on the fly, and a suite of techniques that extend a database system to meet these requirements are presented.
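The core idea summarized above, reporting a running estimate so the user can watch an aggregate converge and halt the query early, can be sketched in a few lines. This is only an illustration under assumed names (`online_avg`, `report_every`), not the paper's actual interface; processing rows in random order is what makes the intermediate estimate statistically meaningful.

```python
import random

def online_avg(rows, report_every=1000):
    """Yield (rows_seen, running_avg) so a caller can observe progress
    and cut execution short once the estimate looks good enough."""
    random.shuffle(rows)      # random order -> unbiased running estimate
    total = 0.0
    for n, v in enumerate(rows, 1):
        total += v
        if n % report_every == 0 or n == len(rows):
            yield n, total / n

random.seed(42)               # fixed seed so the run is reproducible
rows = list(range(1, 10001))  # true average is 5000.5
for seen, est in online_avg(rows, report_every=2500):
    print(f"after {seen} rows: estimate {est:.1f}")
    if seen >= 5000:          # "user" stops the query mid-flight
        break
```

Stopping after half the rows already yields an estimate close to the true average, which is the trade-off (early feedback for approximate answers) the interface is built around.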

Exotica: a project on advanced transaction management and workflow systems

This paper is an overview of the Exotica project, currently in progress at the IBM Almaden Research Center. The project aims at exploring several research areas from advanced transaction management…

Fjording the stream: an architecture for queries over streaming sensor data

  • S. Madden, M. Franklin
  • Computer Science
    Proceedings 18th International Conference on Data Engineering
  • 2002
This work presents the Fjords architecture for managing multiple queries over many sensors, and shows how it can be used to limit sensor resource demands while maintaining high query throughput.

Continuously adaptive continuous queries over streams

We present a continuously adaptive, continuous query (CACQ) implementation based on the eddy query processing framework. We show that our design provides significant performance benefits over…

An adaptive query execution system for data integration

It is demonstrated that the Tukwila architecture extends previous innovations in adaptive execution (such as query scrambling, mid-execution re-optimization, and choose nodes), and experimental evidence that the techniques result in behavior desirable for a data integration system is presented.

Eddies: continuously adaptive query processing

This paper introduces a query processing mechanism called an eddy, which continuously reorders operators in a query plan as it runs, and describes the moments of symmetry during which pipelined joins can be easily reordered, and the synchronization barriers that require inputs from different sources to be coordinated.
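The routing idea behind an eddy can be sketched with selection operators (the paper's reordering machinery for pipelined joins, with its moments of symmetry and synchronization barriers, is considerably more involved). In this simplified, hypothetical sketch, each tuple is routed to the remaining operator with the lowest observed selectivity, so the effective operator order adapts as statistics accumulate at run time.

```python
class Eddy:
    """Route each tuple through a set of filters, trying the most
    discarding (lowest observed selectivity) operator first."""
    def __init__(self, filters):
        self.filters = filters
        # per-operator statistics: [tuples seen, tuples passed]
        self.stats = {name: [0, 0] for name, _ in filters}

    def selectivity(self, name):
        seen, passed = self.stats[name]
        return passed / seen if seen else 1.0  # optimistic until observed

    def process(self, tup):
        done = set()
        while len(done) < len(self.filters):
            remaining = [(n, f) for n, f in self.filters if n not in done]
            # adaptive choice: cheapest way to reject the tuple early
            name, f = min(remaining, key=lambda nf: self.selectivity(nf[0]))
            self.stats[name][0] += 1
            if not f(tup):
                return False       # rejected; stop routing this tuple
            self.stats[name][1] += 1
            done.add(name)
        return True                # passed every operator

eddy = Eddy([("is_even", lambda x: x % 2 == 0),
             ("gt_10",   lambda x: x > 10)])
survivors = [x for x in range(20) if eddy.process(x)]
print(survivors)
```

Because the filters form a conjunction, the output is the same regardless of visit order; only the work done per tuple changes, which is exactly the dimension an eddy optimizes.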