Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions
This work explores whether and how generative adversarial networks (GANs) can incentivize data sharing by enabling a generic framework for sharing synthetic datasets with minimal expert knowledge. It designs a custom workflow called DoppelGANger, which achieves up to 43% better fidelity than baseline models.
KV-match: An Efficient Subsequence Matching Approach for Large Scale Time Series
A new index structure, KV-index, and the corresponding matching algorithm, KV-match, are proposed. The index is of comparable size to the popular tree-style index, while query processing is orders of magnitude more efficient.
Time Series Data Cleaning: A Survey
This survey classifies time series data cleaning techniques, comprehensively reviews the state-of-the-art methods of each type, and highlights possible directions for time series data cleaning.
PISA: An Index for Aggregating Big Time Series Data
A new segment-tree-based index called PISA is proposed, which has fast insertion performance and low latency for aggregation queries, and uses a forest to overcome the performance disadvantages of insertions in traditional segment trees.
Generating High-fidelity, Synthetic Time Series Datasets with DoppelGANger
DoppelGANger is presented, a synthetic data generation framework based on generative adversarial networks (GANs) that achieves up to 43% better fidelity than baseline models, and captures structural properties of data that baseline methods are unable to learn.
Misplaced Subsequences Repairing with Application to Multivariate Industrial Time Series Data
This work defines an inconsistent subsequences problem in multivariate time series and proposes an integrity data repair approach to solve it. It shows that the method captures and repairs inconsistencies effectively in industrial time series in complex IIoT scenarios.
Pattern Matching with Adaptive Granularity Over Streaming Time Series
A novel approach is proposed for fine-grained matching of streaming time series data from sensors with lower latency and limited computing resources. It outperforms the brute-force method and MSM, a multi-step filter mechanism over the multi-scaled representation, by orders of magnitude.
Exploring Data and Knowledge combined Anomaly Explanation of Multivariate Industrial Data
This paper addresses the anomaly explanation problem for multivariate IoT data and proposes a 3-step self-contained method to discover the anomaly events reflected by violation features, and develops knowledge update algorithms to improve the original knowledge set.
Matching Consecutive Subpatterns Over Streaming Time Series
A novel representation, Equal-Length Block (ELB), is proposed together with two efficient implementations, which work very well under all Lp-norms without false dismissals and outperform the brute-force method and MSM, a multi-step filter mechanism over the multi-scaled representation, by orders of magnitude.
GRAB: Finding Time Series Natural Structures via A Novel Graph-based Scheme
  • Yi Lu, Peng Wang, +4 authors Jianmin Wang
  • Computer Science
  • IEEE 37th International Conference on Data…
  • 1 April 2021
A novel graph-based approach, GRAB, is proposed, which partitions the time series into a set of non-overlapping fragments via the similarity between subsequences, and employs a graph partition method to cluster the fragments into states.