# LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data

@article{Eshragh2022LSAREL, title={LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data}, author={Ali Eshragh and Fred Roosta and Asef Nazari and Michael W. Mahoney}, journal={J. Mach. Learn. Res.}, year={2022}, volume={23}, pages={22:1-22:36} }

We apply methods from randomized numerical linear algebra (RandNLA) to develop improved algorithms for the analysis of large-scale time series data. We first develop a new fast algorithm to estimate the leverage scores of an autoregressive (AR) model in big data regimes. We show that the accuracy of approximations lies within $(1+\mathcal{O}(\varepsilon))$ of the true leverage scores with high probability. These theoretical results are subsequently exploited to develop an efficient algorithm…
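To make the abstract's terms concrete, here is a minimal sketch of leverage score sampling applied to an AR($p$) least squares fit. It computes *exact* leverage scores with a thin QR factorization, not the fast approximation the paper develops, and every function name in it (`simulate_ar`, `ar_design_matrix`, `exact_leverage_scores`, `sampled_least_squares`) is a hypothetical helper introduced for illustration only.

```python
import numpy as np


def simulate_ar(coeffs, n, rng):
    """Simulate n observations of an AR(p) process with unit-variance Gaussian noise."""
    p = len(coeffs)
    y = np.zeros(n + p)
    noise = rng.standard_normal(n + p)
    for t in range(p, n + p):
        # y[t - p:t][::-1] is (y[t-1], ..., y[t-p]).
        y[t] = np.dot(coeffs, y[t - p:t][::-1]) + noise[t]
    return y[p:]


def ar_design_matrix(y, p):
    """Build the (n - p) x p lagged data matrix and response vector for an AR(p) fit."""
    n = len(y)
    # Column k holds lag k, so row t is (y[t-1], ..., y[t-p]) for t = p, ..., n-1.
    X = np.column_stack([y[p - k:n - k] for k in range(1, p + 1)])
    return X, y[p:]


def exact_leverage_scores(X):
    """Leverage score of row i = squared Euclidean norm of row i of Q in a thin QR of X."""
    Q, _ = np.linalg.qr(X, mode="reduced")
    return np.sum(Q**2, axis=1)


def sampled_least_squares(X, b, s, rng):
    """Solve a smaller least squares problem on s rows sampled by leverage score."""
    lev = exact_leverage_scores(X)
    probs = lev / lev.sum()
    idx = rng.choice(X.shape[0], size=s, replace=True, p=probs)
    # Rescale sampled rows by 1 / sqrt(s * p_i) so the sampled Gram matrix is unbiased.
    scale = 1.0 / np.sqrt(s * probs[idx])
    coef, *_ = np.linalg.lstsq(X[idx] * scale[:, None], b[idx] * scale, rcond=None)
    return coef


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = simulate_ar(np.array([0.5, -0.3]), 20000, rng)  # assumed toy AR(2) model
    X, b = ar_design_matrix(y, 2)
    full, *_ = np.linalg.lstsq(X, b, rcond=None)
    print("full fit:   ", full)
    print("sampled fit:", sampled_least_squares(X, b, 2000, rng))
```

In this sketch the QR step costs as much as solving the full problem, which is exactly the bottleneck a fast leverage score estimator is meant to remove.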

## 7 Citations

Rollage: Efficient Rolling Average Algorithm to Estimate ARMA Models for Big Time Series Data

- Computer Science
- 2021

Empirical results on large-scale synthetic time series data support the theoretical results and reveal the efficacy of the new efficient algorithm, called Rollage, to estimate the order of an AR model and subsequently fit the model.

Toeplitz Least Squares Problems, Fast Algorithms and Big Data

- Computer Science
- ArXiv
- 2021

This work investigates and compares the quality of these two approximation algorithms on large-scale synthetic and real-world data and concludes that RandNLA is effective in the context of big-data time series.

Augmented Tensor Decomposition with Stochastic Alternating Optimization

- Computer Science
- 2021

Tensor decompositions are powerful tools for dimensionality reduction and feature interpretation of multidimensional data such as signals. Existing tensor decomposition objectives (e.g., Frobenius…

Augmented Tensor Decomposition with Stochastic Optimization

- Computer Science
- ArXiv
- 2021

Tensor decompositions are powerful tools for dimensionality reduction and feature interpretation of multidimensional data such as signals. Existing tensor decomposition objectives (e.g., Frobenius…

Surprise Maximization: A Dynamic Programming Approach

- Mathematics
- 2020

Borwein et al. [1] solved a “surprise maximization” problem by applying results from convex analysis and mathematical programming. Although their proof is elegant, it requires advanced knowledge…

MTC: Multiresolution Tensor Completion from Partial and Coarse Observations

- Computer Science
- KDD
- 2021

The proposed Multi-resolution Tensor Completion model (MTC) explores tensor mode properties and leverages the hierarchy of resolutions to recursively initialize an optimization setup, and optimizes on the coupled system using alternating least squares to ensure low computational and space complexity.

Practical Leverage-Based Sampling for Low-Rank Tensor Decomposition

- Computer Science
- ArXiv
- 2020

This work presents an application of randomized numerical linear algebra to fitting the CP decomposition of sparse tensors, solving a significantly smaller sampled least squares problem at each iteration with probabilistic guarantees on the approximation errors.

## References


Online adaptive lasso estimation in vector autoregressive models for high dimensional wind power forecasting

- Computer Science
- International Journal of Forecasting
- 2019

Information-Based Optimal Subdata Selection for Big Data Linear Regression

- Computer Science
- Journal of the American Statistical Association
- 2018

Theoretical results and extensive simulations demonstrate that the IBOSS approach is superior to subsampling-based methods, sometimes by orders of magnitude, and the advantages of the new approach are also illustrated through analysis of real data.

A statistical perspective on algorithmic leveraging

- Computer Science
- J. Mach. Learn. Res.
- 2015

This work provides an effective framework to evaluate the statistical properties of algorithmic leveraging in the context of estimating parameters in a linear regression model and shows that from the statistical perspective of bias and variance, neither leverage-based sampling nor uniform sampling dominates the other.

Fast approximation of matrix coherence and statistical leverage

- Computer Science
- ICML
- 2012

A randomized algorithm is proposed that takes as input an arbitrary n × d matrix A, with n ≫ d, and returns, as output, relative-error approximations to all n of the statistical leverage scores.

The Importance of Environmental Factors in Forecasting Australian Power Demand

- Economics
- Environmental Modeling & Assessment
- 2021

We develop a time series model to forecast weekly peak power demand for three main states of Australia for a yearly timescale, and show the crucial role of environmental factors in improving the…

Randomized Algorithms for Matrices and Data

- Computer Science
- Found. Trends Mach. Learn.
- 2011

This monograph will provide a detailed overview of recent work on the theory of randomized matrix algorithms as well as the application of those ideas to the solution of practical problems in large-scale data analysis.

Low-Rank Approximation and Regression in Input Sparsity Time

- Computer Science
- ArXiv
- 2012

We design a new distribution over $m \times n$ matrices $S$ so that, for any fixed $n \times d$ matrix $A$ of rank $r$, with probability at least 9/10, $\|SAx\|_2 = (1 \pm \varepsilon)\|Ax\|_2$ simultaneously for all $x \in \mathbb{R}^d$. Here, $m$ is…

Assessing stochastic algorithms for large scale nonlinear least squares problems using extremal probabilities of linear combinations of gamma random variables

- Mathematics, Computer Science
- SIAM/ASA J. Uncertain. Quantification
- 2015

This paper proposes eight variants of a practical randomized algorithm where the uncertainties in the major stochastic steps are quantified, and proves tight necessary and sufficient conditions on the sample size to satisfy the prescribed probabilistic accuracy.

A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle

- Economics
- 1989

This paper models occasional, discrete shifts in the growth rate of a nonstationary series. Algorithms for inferring these unobserved shifts are presented, a byproduct of which permits estimation of…