Locality-Sensitive Hashing for Earthquake Detection: A Case Study of Scaling Data-Driven Science

@article{Rong2018LocalitySensitiveHF,
  title={Locality-Sensitive Hashing for Earthquake Detection: A Case Study of Scaling Data-Driven Science},
  author={Kexin Rong and Clara E. Yoon and Karianne J. Bergen and Hashem Elezabi and Peter D. Bailis and Philip Alexander Levis and Gregory C. Beroza},
  journal={ArXiv},
  year={2018},
  volume={abs/1803.09835}
}
In this work, we report on a novel application of Locality Sensitive Hashing (LSH) to seismic data at scale. Based on the high waveform similarity between reoccurring earthquakes, our application identifies potential earthquakes by searching for similar time series segments via LSH. However, a straightforward implementation of this LSH-enabled application has difficulty scaling beyond 3 months of continuous time series data measured at a single seismic station. As a case study of a data-driven… 
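The idea of finding similar time series segments via LSH can be illustrated with a minimal sketch. This is not the authors' pipeline, which the abstract does not describe in detail; it assumes random-hyperplane (sign) hashing over mean-removed sliding windows, and every name and parameter below (`sliding_windows`, `signature`, `candidate_pairs`, `width`, `step`, `bits`, `tables`) is hypothetical and chosen only for illustration.

```python
import math
import random
from collections import defaultdict
from itertools import combinations


def sliding_windows(series, width, step):
    """Cut a 1-D series into overlapping, mean-removed windows."""
    windows = []
    for start in range(0, len(series) - width + 1, step):
        w = series[start:start + width]
        mean = sum(w) / width
        windows.append((start, [x - mean for x in w]))
    return windows


def signature(window, hyperplanes):
    """One bit per random hyperplane: the sign of the dot product."""
    return tuple(
        1 if sum(a * b for a, b in zip(window, h)) >= 0 else 0
        for h in hyperplanes
    )


def candidate_pairs(series, width=32, step=8, bits=8, tables=10, seed=0):
    """Return pairs of window start offsets that share a bucket in any table."""
    rng = random.Random(seed)
    windows = sliding_windows(series, width, step)
    pairs = set()
    for _ in range(tables):
        # Each table uses its own independent set of random hyperplanes.
        hyperplanes = [[rng.gauss(0, 1) for _ in range(width)] for _ in range(bits)]
        buckets = defaultdict(list)
        for start, w in windows:
            buckets[signature(w, hyperplanes)].append(start)
        # Windows that land in the same bucket become candidate matches.
        for starts in buckets.values():
            pairs.update(combinations(starts, 2))
    return pairs


# Synthetic demo (assumed data): the same waveform placed twice in noise.
noise_rng = random.Random(1)
series = [noise_rng.gauss(0, 1) for _ in range(512)]
wave = [5 * math.sin(2 * math.pi * i / 8) for i in range(32)]
series[64:96] = wave
series[320:352] = wave

pairs = candidate_pairs(series)
print((64, 320) in pairs)  # True: identical windows always share a bucket
```

The returned pairs are only candidates: unrelated noise windows can also collide, so a real detector would verify each pair with a direct similarity measure. Using more bits per signature makes collisions rarer, while using more tables recovers similar pairs that a single table would miss.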

Locality-Sensitive Hashing for Earthquake Detection: A Case Study of Scaling Data-Driven Science (Extended Version)

In this work, we report on a novel application of Locality Sensitive Hashing (LSH) to seismic data at scale. Based on the high waveform similarity between reoccurring earthquakes, our application

Parallel Locality Sensitive Hashing for Network Discovery from Time Series

  • Computer Science
  • 2022
This thesis proposes a scalable system based on locality sensitive hashing, implemented in parallel with independent hash functions. It concludes with a discussion of how the similarity measures affect the network discovery results and proposes further investigation of other parts of the parameter space.

Fast and Scalable Mining of Time Series Motifs with Probabilistic Guarantees

This paper improves on a straightforward application of LSH to time series data by developing a self-tuning algorithm that adapts to the data distribution and by adding several optimizations that reduce redundant computations and leverage the structure of time series data to speed up LSH computations.

STanford EArthquake Dataset (STEAD): A Global Data Set of Seismic Signals for AI

A high-quality, large-scale, and global data set of local earthquake and non-earthquake signals recorded by seismic instruments, which contains two categories: local earthquake waveforms and seismic noise waveforms that are free of earthquake signals.

Big Data Seismology

The discipline of seismology is based on observations of ground motion that are inherently undersampled in space and time. Our basic understanding of earthquake processes and our ability to resolve

Unsupervised Large‐Scale Search for Similar Earthquake Signals

Seismology has continuously recorded ground‐motion spanning up to decades. Blind, uninformed search for similar‐signal waveforms within this continuous data can detect small earthquakes missing

Scaling Time Series Motif Discovery with GPUs: Breaking the Quintillion Pairwise Comparisons a Day Barrier

The efficiency of the algorithm makes it possible to exhaustively search datasets that are currently only approximately searchable, revealing subtle precursor earthquakes that had previously escaped attention, along with other novel seismic regularities.

Matrix Profile XIV: Scaling Time Series Motif Discovery with GPUs to Break a Quintillion Pairwise Comparisons a Day and Beyond

This work shows that with several novel insights the motif discovery envelope can be pushed using a novel scalable framework in conjunction with a deployment to commercial GPU clusters in the cloud, allowing us to find subtle precursor earthquakes that had previously escaped attention, and other novel seismic regularities.

DROP: Optimizing Stochastic Dimensionality Reduction via Workload-Aware Progressive Sampling

  • Computer Science
  • 2019
This work shows how accounting for downstream analytics operations during dimensionality reduction via PCA allows stochastic methods to efficiently operate over very small subsamples of input data, thus reducing computational overhead and end-to-end runtime.

CREIME—A Convolutional Recurrent Model for Earthquake Identification and Magnitude Estimation

A multi-tasking deep learning model, the Convolutional Recurrent model for Earthquake Identification and Magnitude Estimation (CREIME), that detects the first earthquake signal against background seismic noise, determines the P-arrival time, and estimates the magnitude from raw P-wave data.

References

Locality-Sensitive Hashing for Earthquake Detection: A Case Study of Scaling Data-Driven Science (Extended Version)

In this work, we report on a novel application of Locality Sensitive Hashing (LSH) to seismic data at scale. Based on the high waveform similarity between reoccurring earthquakes, our application

Scalable Similarity Search in Seismology: A New Approach to Large-Scale Earthquake Detection

FAST, a new earthquake detection method that leverages locality-sensitive hashing to enable waveform-similarity-based earthquake detection in long-duration continuous seismic data, is described.

Earthquake detection through computationally efficient similarity search

FAST detected most (21 of 24) cataloged earthquakes and 68 uncataloged earthquakes in 1 week of continuous data from a station located near the Calaveras Fault in central California, achieving detection performance comparable to that of autocorrelation, with some additional false detections.

Detecting earthquakes over a seismic network using single-station similarity measures

Scalable similarity search in seismology: a new approach to large-scale earthquake detection, in SISAP’16: Proceedings of the 9th Int.

Broadband Seismic Array Deployment and Data Analysis in Alberta

The availability and fidelity of broadband seismic instruments in recent years have greatly accelerated worldwide research on ground motion and seismic structure. Equipped with instruments that are

Streaming Similarity Search over one Billion Tweets using Parallel Locality-Sensitive Hashing

A new variant of LSH is described, called Parallel LSH (PLSH), designed to be extremely efficient, capable of scaling out on multiple nodes and multiple cores, and which supports high-throughput streaming of new data.

Convolutional neural network for earthquake detection and location

This work leverages the recent advances in artificial intelligence and presents ConvNetQuake, a highly scalable convolutional neural network for earthquake detection and location from a single waveform, and applies it to study the induced seismicity in Oklahoma, USA.

A comparison of select trigger algorithms for automated global seismic phase and event detection

While no algorithm was clearly optimal under all source, receiver, path, and noise conditions tested, an STA/LTA algorithm incorporating adaptive window lengths controlled by nonstationary seismogram spectral characteristics was found to provide an output that best met the requirements of a global correlated event-detection and location system.

Temporal variation in the magnitude‐frequency distribution during the Guy‐Greenbrier earthquake sequence

The recent increase in earthquake activity in the central U.S. has led to concerns about the hazard posed by induced earthquakes. Understanding earthquake phenomena and monitoring in all settings can

Perspectives of Cross-Correlation in Seismic Monitoring at the International Data Centre

We demonstrate that several techniques based on waveform cross-correlation are able to significantly reduce the detection threshold of seismic sources worldwide and to improve the reliability of