Yuanzhen Ji

Distributed data stream processing systems, like Twitter Storm or Yahoo! S4, have primarily focused on adapting to varying event rates. However, as these systems are becoming increasingly multi-tenant, adaptation to the varying query load is becoming an equally important problem. In this paper we present FUGU – an elastic allocator for Complex Event …
Elastic scaling allows data stream processing systems to dynamically scale in and out to react to workload changes. As a consequence, unexpected load peaks can be handled and the extent of overprovisioning can be reduced. However, the strategies used for elastic scaling of such systems need to be tuned manually by the user. This is an error-prone and …
Executing continuous queries over out-of-order data streams, where tuples are not ordered according to timestamps, is challenging because high result accuracy and low result latency are two conflicting performance metrics. Although many applications allow trading exact query results for lower latency, they still expect the produced results to meet a …
One fundamental challenge in data stream processing is to cope with the ubiquity of disorder of tuples within a stream caused by network latency, operator parallelization, merging of asynchronous streams, etc. High result accuracy and low result latency are two conflicting goals in out-of-order stream processing. Different applications may prefer different …
One of the major problems faced by the high-tech manufacturing industry is the need for automated and timely detection of anomalies which can lead to failures of the manufacturing equipment. Failures of the high-tech manufacturing equipment have a direct negative impact on the operating margin and consequently profit of the high-tech manufacturing industry. …
Handling timestamp-disorder among stream tuples is a basic requirement for data stream processing, and involves an inevitable tradeoff between the latency and the quality of stream query results. To meet the tradeoff requirements of diverse streaming applications, the approach of buffer-based, quality-driven disorder handling (QDDH) was proposed …
The constantly increasing number of connected devices and sensors results in increasing volume and velocity of sensor-based streaming data. Traditional approaches for processing high-velocity sensor data rely on stream processing engines. However, the increasing complexity of continuous queries executed on top of high-velocity data has resulted in growing …
Elasticity describes the ability of any distributed system to scale to a varying number of hosts in response to workload changes. It has become a mandatory architectural property for state-of-the-art cloud-based data stream processing systems, as it allows treatment of unexpected load peaks and cost-efficient execution at the same time. Although such …
Over the last few years, the increasing demand for processing streaming data with high throughput and low latency has led to the development of specialized stream processing engines (SPEs). Although existing SPEs show high performance in evaluating stateless operations and stateful operations with small windows, their performance degrades significantly when …