• Corpus ID: 208310367

LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data

@article{Eshragh2022LSAREL,
  title={LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data},
  author={Ali Eshragh and Fred Roosta and Asef Nazari and Michael W. Mahoney},
  journal={J. Mach. Learn. Res.},
  year={2022},
  volume={23},
  pages={22:1-22:36}
}
• Published 27 November 2019
• Computer Science, Mathematics
• J. Mach. Learn. Res.
We apply methods from randomized numerical linear algebra (RandNLA) to develop improved algorithms for the analysis of large-scale time series data. We first develop a new fast algorithm to estimate the leverage scores of an autoregressive (AR) model in big data regimes. We show that the approximations lie within a factor of $(1+\mathcal{O}(\varepsilon))$ of the true leverage scores with high probability. These theoretical results are subsequently exploited to develop an efficient algorithm…
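Leverage scores themselves have a simple definition: for a tall full-rank matrix $A$ with an orthonormal basis $Q$ of its column space, the $i$-th score is the squared norm of the $i$-th row of $Q$. The sketch below computes exact scores for an AR($p$) design matrix — the quantity LSAR approximates, not the paper's fast algorithm itself; the helper names are illustrative:

```python
import numpy as np

def ar_design_matrix(x, p):
    """Build the (n-p) x p design matrix of an AR(p) model:
    row t holds the p lagged values preceding x[t+p]."""
    n = len(x)
    return np.column_stack([x[p - k - 1 : n - k - 1] for k in range(p)])

def leverage_scores(A):
    """Exact leverage scores: squared row norms of an orthonormal
    basis Q for the column space of A (via thin QR)."""
    Q, _ = np.linalg.qr(A)
    return np.sum(Q**2, axis=1)

# Simulate an AR(2) series and score its design matrix.
rng = np.random.default_rng(0)
x = np.zeros(500)
for t in range(2, 500):
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + rng.standard_normal()
A = ar_design_matrix(x, 2)
scores = leverage_scores(A)
# For a full-rank A, the scores are nonnegative and sum to p.
```

Exact computation via QR costs $\mathcal{O}(np^2)$, which is exactly what becomes prohibitive in the big-data regime the paper targets.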

Citations of this paper

Rollage: Efficient Rolling Average Algorithm to Estimate ARMA Models for Big Time Series Data
• Computer Science
• 2021
Empirical results on large-scale synthetic time series data support the theoretical results and reveal the efficacy of the new efficient algorithm, called Rollage, to estimate the order of an AR model and subsequently fit the model.
Toeplitz Least Squares Problems, Fast Algorithms and Big Data
• Computer Science
ArXiv
• 2021
This work investigates and compares the quality of these two approximation algorithms on large-scale synthetic and real-world data and concludes that RandNLA is effective in the context of big-data time series.
Augmented Tensor Decomposition with Stochastic Alternating Optimization
• Computer Science
• 2021
Tensor decompositions are powerful tools for dimensionality reduction and feature interpretation of multidimensional data such as signals. Existing tensor decomposition objectives (e.g., Frobenius…
Surprise Maximization: A Dynamic Programming Approach
Borwein et al. [1] solved a “surprise maximization” problem by applying results from convex analysis and mathematical programming. Although their proof is elegant, it requires advanced knowledge…
MTC: Multiresolution Tensor Completion from Partial and Coarse Observations
• Computer Science
KDD
• 2021
The proposed Multi-resolution Tensor Completion model (MTC) explores tensor mode properties and leverages the hierarchy of resolutions to recursively initialize an optimization setup, and optimizes on the coupled system using alternating least squares to ensure low computational and space complexity.
Practical Leverage-Based Sampling for Low-Rank Tensor Decomposition
• Computer Science
ArXiv
• 2020
This work presents an application of randomized numerical linear algebra to fitting the CP decomposition of sparse tensors, solving a significantly smaller sampled least squares problem at each iteration with probabilistic guarantees on the approximation errors.

References

Showing 1-10 of 40 references
Information-Based Optimal Subdata Selection for Big Data Linear Regression
• Computer Science
Journal of the American Statistical Association
• 2018
Theoretical results and extensive simulations demonstrate that the IBOSS approach is superior to subsampling-based methods, sometimes by orders of magnitude, and the advantages of the new approach are also illustrated through analysis of real data.
A statistical perspective on algorithmic leveraging
• Computer Science
J. Mach. Learn. Res.
• 2015
This work provides an effective framework to evaluate the statistical properties of algorithmic leveraging in the context of estimating parameters in a linear regression model and shows that from the statistical perspective of bias and variance, neither leverage-based sampling nor uniform sampling dominates the other.
Fast approximation of matrix coherence and statistical leverage
• Computer Science
ICML
• 2012
A randomized algorithm is proposed that takes as input an arbitrary n × d matrix A, with n ≫ d, and returns, as output, relative-error approximations to all n of the statistical leverage scores.
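The idea behind such estimators can be sketched as follows: compress $A$ down to $m \gg d$ rows, take the triangular factor $R$ of the compressed matrix, and read approximate leverage scores off the squared row norms of $AR^{-1}$. A rough illustration — a dense Gaussian sketch stands in for the fast subsampled transform of the actual algorithm, and its second dimension-reduction step is omitted, so this does not attain the advertised running time:

```python
import numpy as np

def approx_leverage_scores(A, m, rng):
    """Sketch-based leverage-score estimates in the spirit of
    Drineas et al. (2012): sketch A to m rows, take R from the QR
    of the sketch, and use squared row norms of A @ inv(R)."""
    n, d = A.shape
    S = rng.standard_normal((m, n)) / np.sqrt(m)  # dense Gaussian sketch
    _, R = np.linalg.qr(S @ A)
    # Solve R.T @ X = A.T, so X.T = A @ inv(R); sum squares per row.
    return np.sum(np.linalg.solve(R.T, A.T) ** 2, axis=0)

rng = np.random.default_rng(1)
A = rng.standard_normal((1000, 5))
approx = approx_leverage_scores(A, 200, rng)
```

Because the sketch only needs to preserve the column space of $A$ up to $(1 \pm \varepsilon)$, the estimates track the exact scores row by row.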
The Importance of Environmental Factors in Forecasting Australian Power Demand
• Economics
Environmental Modeling & Assessment
• 2021
We develop a time series model to forecast weekly peak power demand for three main states of Australia on a yearly timescale, and show the crucial role of environmental factors in improving the…
Randomized Algorithms for Matrices and Data
This monograph will provide a detailed overview of recent work on the theory of randomized matrix algorithms as well as the application of those ideas to the solution of practical problems in large-scale data analysis.
Low-Rank Approximation and Regression in Input Sparsity Time
• Computer Science
ArXiv
• 2012
We design a new distribution over $m \times n$ matrices $S$ so that, for any fixed $n \times d$ matrix $A$ of rank $r$, with probability at least 9/10, $\|SAx\|_2 = (1 \pm \varepsilon)\|Ax\|_2$ simultaneously for all $x \in \mathbb{R}^d$. Here, $m$ is…
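A standard construction achieving this kind of guarantee in input-sparsity time is the CountSketch of Clarkson and Woodruff: each row of $A$ is hashed to one of $m$ buckets with a random sign, so applying $S$ costs time proportional to the number of nonzeros of $A$. A minimal sketch (dense NumPy for clarity; a real implementation would exploit sparsity):

```python
import numpy as np

def countsketch(A, m, rng):
    """Apply an m x n CountSketch to A implicitly: each row of A is
    hashed to one of m buckets and accumulated with a random sign,
    touching every nonzero of A exactly once."""
    n = A.shape[0]
    buckets = rng.integers(0, m, size=n)          # hash h: [n] -> [m]
    signs = rng.choice([-1.0, 1.0], size=n)       # sign function s: [n] -> {-1, +1}
    SA = np.zeros((m, A.shape[1]))
    np.add.at(SA, buckets, signs[:, None] * A)    # scatter-add rows into buckets
    return SA

rng = np.random.default_rng(2)
A = rng.standard_normal((2000, 4))
SA = countsketch(A, 512, rng)
```

With $m$ large enough relative to $d$, norms of vectors in the column space of $A$ are preserved up to a small relative error, which is what makes the sketched least-squares problem a faithful stand-in for the original.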
Assessing stochastic algorithms for large scale nonlinear least squares problems using extremal probabilities of linear combinations of gamma random variables
• Mathematics, Computer Science
SIAM/ASA J. Uncertain. Quantification
• 2015
This paper proposes eight variants of a practical randomized algorithm where the uncertainties in the major stochastic steps are quantified, and proves tight necessary and sufficient conditions on the sample size to satisfy the prescribed probabilistic accuracy.
A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle
This paper models occasional, discrete shifts in the growth rate of a nonstationary series. Algorithms for inferring these unobserved shifts are presented, a byproduct of which permits estimation of…