Corpus ID: 197545218

Federated PCA with Adaptive Rank Estimation

  title={Federated PCA with Adaptive Rank Estimation},
  author={Andreas Grammenos and Rodrigo Mendoza-Smith and Cecilia Mascolo and Jon Crowcroft},
In many online machine learning and data science tasks such as data summarisation and feature compression, d-dimensional vectors are usually distributed across a large number of clients in a decentralised network and collected in a streaming fashion. This is increasingly common in modern applications due to the sheer volume of data generated and the clients’ constrained resources. In this setting, some clients are required to compute an update to a centralised target model independently using… Expand
Fast and Sample-Efficient Federated Low Rank Matrix Recovery from Column-wise Linear and Quadratic Projections
A novel gradient descent (GD) based solution called GD-min that needs only Ω((n + q)r log(1/ )) samples and O(mqnr log( 1/ )) time to obtain an -accurate estimate is introduced that is the best achievable sample complexity guarantee for a non-convex solution to the above problem. Expand
Federated Over-the-Air Subspace Learning from Incomplete Data
This work develops a federated over-the-air version of the power method (FedPM) and shows that its iterates converge as long as the channel noise is very small compared to the singular value of the matrix; and the ratio between its $(r+1)$-th and $r-th singular value is smaller than a constant less than one. Expand
Pronto: Federated Task Scheduling
This work presents a federated, asynchronous, memory-limited algorithm for online task scheduling across large-scale networks of hundreds of workers that is able to predict changes in the system responsiveness ahead of time based on the industry standard CPU Ready metric and can lead to better scheduling decisions and overall utilization of the available resources. Expand
Privacy Preserving PCA for Multiparty Modeling
A general multiparty modeling paradigm with Privacy Preserving Principal Component Analysis (PPPCA) for horizontally partitioned data and shows that the accuracy of the model built upon PPPCA is the same as the model with PCA that is built based on centralized plaintext data. Expand
Turning Channel Noise into an Accelerator for Over-the-Air Principal Component Analysis
The novelty of this design lies in exploiting channel noise to accelerate the descent in the region around each saddle point encountered by gradient descent, thereby increasing the convergence speed of over-the-air PCA. Expand
One-shot Distributed Algorithm for Generalized Eigenvalue Problem
A general distributed GEP framework with one-shot communication for GEP with modified method for eigenvalue decomposition if the symmetric data covariance has repeated eigenvalues, and the number of local servers is analyzed. Expand
Federated Over-Air Subspace Tracking from Incomplete and Corrupted Data
This work provides a new simple algorithm and guarantee for both ST with missing data ( ST-miss) and RST-miss and extends the approach and its analysis to provably solving these problems when the raw data is federated and when the over-air data communication modality is used for information exchange between the K peer nodes and the center. Expand


MOSES: A Streaming Algorithm for Linear Dimensionality Reduction
This paper introduces Memory-limited Online Subspace Estimation Scheme (MOSES) for both estimating the principal components of streaming data and reducing its dimension, and finds that MOSES consistently surpasses the state of the art in the authors' numerical experiments with both synthetic and real-world datasets, while being computationally inexpensive. Expand
Federated Learning: Strategies for Improving Communication Efficiency
Two ways to reduce the uplink communication costs are proposed: structured updates, where the user directly learns an update from a restricted space parametrized using a smaller number of variables, e.g. either low-rank or a random mask; and sketched updates, which learn a full model update and then compress it using a combination of quantization, random rotations, and subsampling. Expand
Streaming PCA with Many Missing Entries
We consider the streaming memory-constrained principal component analysis (PCA) problem with missing entries, where the available storage is linear in the dimensionality of the problem, and eachExpand
Memory Limited, Streaming PCA
An algorithm is presented that uses O(kp) memory and is able to compute the k-dimensional spike with O(p log p) sample-complexity - the first algorithm of its kind. Expand
Communication-Efficient Learning of Deep Networks from Decentralized Data
This work presents a practical method for the federated learning of deep networks based on iterative model averaging, and conducts an extensive empirical evaluation, considering five different model architectures and four datasets. Expand
Relative Errors for Deterministic Low-Rank Matrix Approximations
It is shown that Frequent Directions cannot be adapted to a sparse version in an obvious way that retains the l original rows of the matrix, as opposed to a linear combination or sketch of the rows. Expand
COLA: Decentralized Linear Learning
This work proposes COLA, a new decentralized training algorithm with strong theoretical guarantees and superior practical performance, that achieves communication efficiency, scalability, elasticity as well as resilience to changes in data and allows for unreliable and heterogeneous participating devices. Expand
Federated Multi-Task Learning
This work shows that multi-task learning is naturally suited to handle the statistical challenges of this setting, and proposes a novel systems-aware optimization method, MOCHA, that is robust to practical systems issues. Expand
Improved Practical Matrix Sketching with Guarantees
This paper attempts to categorize and compare the most known methods under row-wise streaming updates with provable guarantees, and then to tweak some of these methods to gain practical improvements while retaining guarantees. Expand
A Distributed and Incremental SVD Algorithm for Agglomerative Data Analysis on Large Networks
  • M. Iwen, B. Ong
  • Computer Science, Mathematics
  • SIAM J. Matrix Anal. Appl.
  • 2016
It is shown that the SVD of a matrix can be constructed efficiently in a hierarchical approach and the hierarchical algorithm can be used to recover the largest singular values and left singular vectors with bounded error. Expand