• Corpus ID: 54445966

Megaphone: Live state migration for distributed streaming dataflows

@article{Hoffmann2018MegaphoneLS,
  title={Megaphone: Live state migration for distributed streaming dataflows},
  author={Moritz Hoffmann and Andrea Lattuada and Frank McSherry and Vasiliki Kalavri and John Liagouris and Timothy Roscoe},
  journal={ArXiv},
  year={2018},
  volume={abs/1812.01371}
}
We design and implement Megaphone, a data migration mechanism for stateful distributed dataflow engines with latency objectives. […] Megaphone is implemented as a library on an unmodified timely dataflow implementation, and provides an operator interface compatible with its existing APIs. We evaluate Megaphone on established benchmarks with varying amounts of state and observe that compared to naïve approaches Megaphone reduces service latencies during reconfiguration by orders of magnitude…
InferLine : Prediction Pipeline Provisioning and Management for Tight Latency Objectives
TLDR
InferLine is introduced, a system which provisions and executes ML prediction pipelines subject to end-to-end latency constraints by proactively optimizing and reactively controlling per-model configurations in a fine-grained fashion and generalizes across state-of-the-art model serving frameworks.

References

Latency-conscious dataflow reconfiguration
TLDR
The implementation, prototyped on Timely Dataflow, provides a scalable stateful operator template compatible with existing APIs that carefully reorganizes data to minimize migration overhead and reduces service latencies by orders of magnitude.
ChronoStream: Elastic stateful stream computation in the cloud
  • Yingjun Wu, K. Tan
  • Computer Science
    2015 IEEE 31st International Conference on Data Engineering
  • 2015
TLDR
This work introduces ChronoStream, a distributed system specifically designed for elastic stateful stream computation in the cloud that can scale linearly and achieve transparent elasticity and high availability without sacrificing system performance or affecting collocated tenants.
Chi: A Scalable and Programmable Control Plane for Distributed Stream Processing Systems
TLDR
A novel control-plane design, Chi, is investigated, which supports continuous monitoring and feedback, and enables dynamic re-configuration, and leverages the key insight of embedding control-planes messages in the data-plane channels to achieve a low-latency and flexible control plane for stream-processing systems.
State Management in Apache Flink®: Consistent Stateful Distributed Stream Processing
TLDR
Flink's core pipelined, in-flight mechanism is presented which guarantees the creation of lightweight, consistent, distributed snapshots of application state, progressively, without impacting continuous execution, and the low performance trade-offs of the approach are demonstrated.
Gloss: Seamless Live Reconfiguration and Reoptimization of Stream Programs
TLDR
Gloss, for the first time, avoids periods of zero throughput during the reconfiguration of both stateless and stateful SDF based stream programs and permits it to reoptimize the application for entirely new configurations that it may not have encountered before.
Naiad: a timely dataflow system
TLDR
It is shown that many powerful high-level programming models can be built on Naiad's low-level primitives, enabling such diverse tasks as streaming data analysis, iterative machine learning, and interactive graph mining.
Zephyr: live migration in shared nothing databases for elastic cloud platforms
TLDR
Zephyr is proposed, a technique to efficiently migrate a live database in a shared nothing transactional database architecture that uses phases of on-demand pull and asynchronous push of data, requires minimal synchronization, and provides ACID guarantees during migration and ensures correctness in the presence of failures.
Three steps is all you need: fast, accurate, automatic scaling decisions for distributed streaming dataflows
TLDR
DS2, an automatic scaling controller for large-scale stream processors which combines a general performance model of streaming dataflows with lightweight instrumentation to estimate the true processing and output rates of individual dataflow operators is presented.
Squall: Fine-Grained Live Reconfiguration for Partitioned Main Memory Databases
TLDR
The Squall technique for supporting live reconfiguration in partitioned, main memory DBMSs supports fine-grained repartitioning of databases in the presence of distributed transactions, high throughput client workloads, and replicated data.
Albatross: Lightweight Elasticity in Shared Storage Databases for the Cloud using Live Data Migration
TLDR
Albatross migrates the database cache and the state of active transactions to ensure minimal impact on transaction execution while allowing transactions active during migration to continue execution, and guarantees serializability while ensuring correctness during failures.