NuPS: A Parameter Server for Machine Learning with Non-Uniform Parameter Access

Alexander Renz-Wieland, Rainer Gemulla, Zoi Kaoudi, Volker Markl
Proceedings of the 2022 International Conference on Management of Data
Parameter servers (PSs) facilitate the implementation of distributed training for large machine learning tasks. In this paper, we argue that existing PSs are inefficient for tasks that exhibit non-uniform parameter access; their performance may even fall behind that of single-node baselines. We identify two major sources of such non-uniform access: skew and sampling. Existing PSs are ill-suited for managing skew because they uniformly apply the same parameter management technique to all…
Good Intentions: Adaptive Parameter Servers via Intent Signaling
This paper proposes a novel intent signaling mechanism that acts as an enabler for adaptivity and naturally integrates into ML tasks, and proposes a fully adaptive, zero-tuning PS called AdaPS based on this mechanism.


Parallel Training of Knowledge Graph Embedding Models: A Comparison of Techniques
It is found that efficient and effective parallel training of large-scale KGE models is indeed achievable but requires a careful choice of techniques, and that most currently implemented training methods tend to have a negative impact on embedding quality.
Just Move It! Dynamic Parameter Allocation in Action
Parameter servers (PSs) ease the implementation of distributed machine learning systems, but their performance can fall behind that of single machine baselines due to communication overhead. We
Understanding Negative Sampling in Graph Representation Learning
This work systematically analyzes the role of negative sampling from the perspectives of both objective and risk, theoretically demonstrating that negative sampling is as important as positive sampling in determining the optimization objective and the resulting variance.
Dynamic parameter allocation in parameter servers
This paper proposes to integrate dynamic parameter allocation into parameter servers, describes an efficient implementation of such a parameter server called Lapse, and experimentally compares its performance to existing parameter servers across a number of machine learning tasks.
FlexPS: Flexible Parallelism Control in Parameter Server Architecture
This work proposes a new system, called FlexPS, which introduces a novel multi-stage abstraction to support flexible parallelism control and achieves significant speedups and resource savings compared with state-of-the-art PS systems such as Petuum and Multiverso.
Backpropagation Applied to Handwritten Zip Code Recognition
This paper demonstrates how constraints from the task domain can be integrated into a backpropagation network through the architecture of the network, successfully applied to the recognition of handwritten zip code digits provided by the U.S. Postal Service.
Analysis of Replication in Distributed Database Systems
An approximate analytical model is developed to study the tradeoffs of replicating data in a distributed database environment, and it is found that the benefit of replicating data and the optimal number of replicates are sensitive to the concurrency control protocol.
Heterogeneity-aware Distributed Parameter Servers
A heterogeneity-aware algorithm that uses a constant learning rate schedule for updates before adding them to the global parameter allows us to suppress stragglers' harm on robust convergence and theoretically prove the valid convergence of both approaches.
node2vec: Scalable Feature Learning for Networks
In node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks, a flexible notion of a node's network neighborhood is defined and a biased random walk procedure is designed, which efficiently explores diverse neighborhoods.
Computing Web-scale Topic Models using an Asynchronous Parameter Server
APS-LDA is presented, which integrates state-of-the-art topic modeling with cluster computing frameworks such as Spark using a novel asynchronous parameter server and can, on a 480-core cluster, process up to 135× more data and 10× more topics without sacrificing model quality.