Scalable Model-Based Management of Correlated Dimensional Time Series in ModelarDB+

Søren Kejser Jensen, T. Pedersen, Christian Thomsen
2021 IEEE 37th International Conference on Data Engineering (ICDE)
To monitor critical infrastructure, high-quality sensors sampled at a high frequency are increasingly used. However, as they produce huge amounts of data, only simple aggregates are stored. This removes outliers and fluctuations that could indicate problems. As a remedy, we present a model-based approach for managing time series with dimensions that exploits correlation in and among time series. Specifically, we propose compressing groups of correlated time series using an extensible set of…
Demonstration of ModelarDB: Model-Based Management of Dimensional Time Series
This demonstration presents ModelarDB, a model-based Time Series Management System (TSMS) for time series with dimensions and possibly gaps. It provides fast ingestion and a high compression ratio by adaptively compressing time series using a set of models to accommodate changes in the structure of each time series over time.
Extreme-Scale Model-Based Time Series Management with ModelarDB (Invited Talk)
This work presents a model-based approach for managing extreme-scale time series that approximates the time series values using mathematical functions (models) and stores only model coefficients rather than data values. It shows that ModelarDB provides up to 14× faster ingestion due to high compression.
Temporal Models on Time Series Databases
This paper presents how metamodels and their instances, i.e., models, can be partially mapped to time series databases, and shows that derived runtime properties can be applied efficiently as time series queries, even for large model histories.
The Danish National Energy Data Lake: Requirements, Technical Architecture, and Tool Selection
The requirements for FEDDL are described based on a representative LL case study, its technical architecture is presented, and a comparison of relevant tools is provided along with the arguments for which ones the authors selected.


ModelarDB: Modular Model-Based Time Series Management with Spark and Cassandra
An online, adaptive multi-model compression algorithm that maintains data values within a user-defined error bound (possibly zero) is proposed, achieving fast ingestion, good compression, and fast, scalable online aggregate query processing at the same time.
MTSC: An Effective Multiple Time Series Compressing Approach
This paper defines a novel representation model, which uses a base series and a single value to represent each series, and proposes two graph-based algorithms: one achieves a higher compression ratio, while the other is much more efficient at the cost of a slightly lower compression ratio.
Two-Level Data Compression using Machine Learning in Time Series Database
This paper proposes a two-level compression model that selects a proper compression scheme for each individual point, so that diverse patterns can be captured at a fine granularity, and introduces a reinforcement-learning-based approach to learn parameter values automatically.
Towards Online Multi-model Approximation of Time Series
This paper investigates the innovative concept of efficiently combining multiple approximation models in real time and proves that this approach dynamically adapts to the properties of the data stream and approximates each data segment with the most suitable model.
Gorilla: A Fast, Scalable, In-Memory Time Series Database
Gorilla, Facebook's in-memory TSDB, is introduced. Its key insight is that users of monitoring systems do not place much emphasis on individual data points but rather on aggregate analysis, and that recent data points are of much higher value than older points for quickly detecting and diagnosing the root cause of an ongoing problem.
CORAD: Correlation-Aware Compression of Massive Time Series using Sparse Dictionary Coding
This work demonstrates how one can leverage the correlation across several related time series streams to both drastically improve the compression efficiency and reduce the accuracy loss, and introduces a method to threshold the information loss of the compression.
Modeling Large Time Series for Efficient Approximate Query Processing
This paper outlines a new system that does not query complete datasets but instead utilizes models to extract the requested information, and introduces a SQL-compatible query terminology to allow seamless integration of model-based querying into traditional data warehouses.
A time-series compression technique and its application to the smart grid
This article presents a compression technique based on piecewise regression and two methods which describe the performance of the compression, and shows that the proposed compression technique can be implemented in a state-of-the-art database management system.
GAMPS: compressing multi sensor data by grouping and amplitude scaling
This work presents GAMPS, a general framework that addresses the problem of collectively approximating a set of sensor signals using the least amount of space so that any individual signal can be efficiently reconstructed within a given maximum (L∞) error ε.
Capturing sensor-generated time series with quality guarantees
An optimal online algorithm for constructing a piecewise constant approximation (PCA) of a time series which guarantees that the compressed representation satisfies an error bound on the L∞ distance is proposed.
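
Several of the entries above rely on approximating a time series within a maximum (L∞) error bound. The following is a minimal Python sketch of one way such a bound can be enforced with a piecewise constant approximation: a segment is extended while the spread between its smallest and largest value is at most twice the bound, and the segment is then represented by the midpoint of that range. It is an illustration under stated assumptions and not the algorithm of any paper listed here; the names `piecewise_constant` and `reconstruct`, the index-based segment layout, and the sample values are hypothetical.

```python
from typing import List, Tuple

# A segment is (start index, end index inclusive, constant value).
Segment = Tuple[int, int, float]


def piecewise_constant(values: List[float], eps: float) -> List[Segment]:
    """Greedy online piecewise constant approximation (illustrative sketch).

    Each value is represented by a constant that differs from it by at
    most eps, up to floating-point rounding.
    """
    if eps < 0:
        raise ValueError("eps must be non-negative")
    segments: List[Segment] = []
    if not values:
        return segments

    start = 0
    lo = hi = values[0]  # running min and max of the current segment
    for i in range(1, len(values)):
        v = values[i]
        new_lo, new_hi = min(lo, v), max(hi, v)
        if new_hi - new_lo > 2 * eps:
            # Adding v would break the bound: close the current segment
            # at the midpoint of its range and start a new one at v.
            segments.append((start, i - 1, (lo + hi) / 2))
            start, lo, hi = i, v, v
        else:
            lo, hi = new_lo, new_hi
    segments.append((start, len(values) - 1, (lo + hi) / 2))
    return segments


def reconstruct(segments: List[Segment]) -> List[float]:
    """Expand segments back into one approximate value per original point."""
    out: List[float] = []
    for start, end, constant in segments:
        out.extend([constant] * (end - start + 1))
    return out


# Hypothetical sample data, chosen only for illustration.
values = [1.0, 1.5, 2.0, 5.0, 5.5, 9.0]
eps = 0.5
segments = piecewise_constant(values, eps)
approx = reconstruct(segments)
max_error = max(abs(a - b) for a, b in zip(values, approx))
```

With these sample values, six points are stored as three segments, and `max_error` stays within `eps`. Storing one constant per segment instead of one value per point is where the compression comes from; a larger `eps` allows longer segments at the cost of accuracy.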