# The power of slightly more than one sample in randomized load balancing

@article{Ying2015ThePO,
title={The power of slightly more than one sample in randomized load balancing},
author={Lei Ying and Ramakrishnan Srikant and Xiaohan Kang},
journal={2015 IEEE Conference on Computer Communications (INFOCOM)},
year={2015},
pages={1131-1139}
}
• Published 26 April 2015
• Computer Science
• 2015 IEEE Conference on Computer Communications (INFOCOM)
In many computing and networking applications, arriving tasks have to be routed to one of many servers, with the goal of minimizing queueing delays. When the number of processors is very large, a popular routing algorithm works as follows: select two servers at random and route an arriving task to the least loaded of the two. It is well-known that this algorithm dramatically reduces queueing delays compared to an algorithm which routes to a single randomly selected server. In recent cloud…
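The routing rule described in the abstract (sample two servers at random, send the task to the less loaded one) can be sketched as follows. This is a minimal illustration under a simplifying assumption, not the paper's model: tasks are only placed and never depart (a static balls-into-bins setting), and all function names are hypothetical.

```python
import random


def route_power_of_two(queue_lengths, rng):
    """Sample two distinct servers uniformly at random and
    return the index of the one with the shorter queue."""
    i, j = rng.sample(range(len(queue_lengths)), 2)
    return i if queue_lengths[i] <= queue_lengths[j] else j


def route_random(queue_lengths, rng):
    """Baseline: return the index of a single uniformly random server."""
    return rng.randrange(len(queue_lengths))


def max_load_after(num_servers, num_tasks, route, seed=0):
    """Place num_tasks tasks one by one with the given routing rule
    (no departures, an assumption of this sketch) and return the
    longest resulting queue."""
    rng = random.Random(seed)
    queues = [0] * num_servers
    for _ in range(num_tasks):
        queues[route(queues, rng)] += 1
    return max(queues)


if __name__ == "__main__":
    n = 1000
    print("single random choice:", max_load_after(n, n, route_random))
    print("two random choices:  ", max_load_after(n, n, route_power_of_two))
```

Running the script prints the maximum queue length under each rule for the same number of servers and tasks, which gives a rough feel for the gap the abstract refers to.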
83 Citations

## Citations

Randomized load balancing with a helper
• Computer Science
2017 International Conference on Computing, Networking and Communications (ICNC)
• 2017
In this paper, a hybrid scheduling strategy is proposed that consists of a Pod scheduler and a throughput-limited helper; it has bounded maximum queue size in the large-system regime, in sharp contrast to Pod scheduling, whose maximum queue size is unbounded.
Power-of-d-Choices with Memory: Fluid Limit and Optimality
• Computer Science
Math. Oper. Res.
• 2020
This paper considers the power-of-d-choice algorithm with the addition of a local memory that keeps track of the latest observations collected over time on the sampled servers, and shows that this algorithm is asymptotically optimal in the sense that the load balancer can always assign each job to an idle server in the large-system limit.
Power-of-$d$-Choices with Memory: Fluid Limit and Optimality
• Computer Science
• 2018
This paper considers the power-of-d-choice algorithm with the addition of a local memory that keeps track of the latest observations collected over time on the sampled servers, and shows that this algorithm is asymptotically optimal in the sense that the load balancer can always assign each job to an idle server in the large-server limit.
Efficient load balancing in large-scale systems
• Computer Science
2016 Annual Conference on Information Science and Systems (CISS)
• 2016
The threshold-based load balancing scheme can achieve performance similar to that of the power-of-d scheme with d(N) ≫ √N log(N), and thus diffusion-level optimality, with only O(1) rather than O( N) communication overhead per task.
Routing with blinkers: Online throughput maximization without queue length information
• Computer Science
2016 IEEE International Symposium on Information Theory (ISIT)
• 2016
A novel routing policy is proposed that “samples” the servers periodically and achieves maximum throughput, subject to a condition for the service discipline of the server.
On Delay-Optimal Scheduling in Queueing Systems with Replications
• Computer Science
ArXiv
• 2016
Low-complexity scheduling policies are designed and proven to be delay-optimal or near delay-optimal in stochastic ordering among all causal and non-preemptive policies in centralized and distributed multi-server systems.
A lower bound on the queueing delay in resource constrained load balancing
• Mathematics
• 2018
We consider the following distributed service model: jobs with unit mean, general distribution, and independent processing times arrive as a renewal process of rate $\lambda n$, with $0<\lambda<1$,
Delay Asymptotics and Bounds for Multi-Task Parallel Jobs
• Computer Science, Mathematics
PERV
• 2019
The approach converts the job delay problem into a corresponding balls-and-bins problem, and it is proved that the variant exhibits positive correlation, which greatly generalizes the asymptotic-independence type of results in the literature.
Scalable load balancing in networked systems: A survey of recent advances
• Computer Science
ArXiv
• 2018
It is demonstrated how stochastic coupling techniques and stochastic-process limits play an instrumental role in establishing asymptotic optimality, which carries over to infinite-server settings, finite buffers, multiple dispatchers, servers arranged on graph topologies, and token-based load balancing, including the popular Join-the-Idle-Queue (JIQ) scheme.

## References

SHOWING 1-10 OF 21 REFERENCES
The Power of Slightly More than One Sample in Randomized Load Balancing
• Computer Science
Math. Oper. Res.
• 2017
The number of sampled queues can be dramatically reduced by using the fact that tasks arrive in batches (called jobs), in particular by sampling a subset of the queues whose size is slightly larger than the batch size.
On the Power of (Even a Little) Resource Pooling
• Computer Science
• 2012
A multi-server model that captures a performance trade-off between centralized and distributed processing is proposed and analyzed, demonstrating a surprising phase transition in the steady-state delay scaling.
The Power of Two Choices in Randomized Load Balancing
This work uses a limiting, deterministic model representing the behavior as $n \to \infty$ to approximate the behavior of finite systems and provides simulations demonstrating that the method accurately predicts system behavior, even for relatively small systems.
Asymptotic independence of queues under randomized load balancing
• Mathematics
Queueing Syst. Theory Appl.
• 2012
This article considers the least-loaded balancing problem as well as the more difficult problem in which an arriving job is assigned to the queue with the fewest jobs, and demonstrates the ansatz when the service discipline is FIFO and the service time distribution has a decreasing hazard rate.
Load balancing and density dependent jump Markov processes
• M. Mitzenmacher
• Computer Science
Proceedings of 37th Conference on Foundations of Computer Science
• 1996
A new approach is presented for analyzing both static and dynamic randomized load balancing strategies; the model appears more realistic than similar models studied previously in that it is both dynamic and open: customers arrive over time, and the number of customers is not fixed.
Analysis of Load Balancing in Large Heterogeneous Processor Sharing Systems
• Computer Science
ArXiv
• 2013
Numerical results are presented to validate the theoretical results and to show that the power-of-two type schemes considered in this paper may not always result in better behaviour in terms of the mean sojourn time of jobs.
Decay of Tails at Equilibrium for FIFO Join the Shortest Queue Networks
• Mathematics, Computer Science
ArXiv
• 2011
The tail of the equilibrium queue size exhibits a wide range of behavior depending on the relationship between $\beta$ and $D$: it is shown to be doubly exponential when $\beta>D/(D-1)$ and exponentially distributed when $\beta=D/(D-1)$.
Queueing system topologies with limited flexibility
• Mathematics, Computer Science
SIGMETRICS '13
• 2013
This work studies a multi-server model with n flexible servers and rn queues, connected through a fixed bipartite graph, and shows that a large capacity region (robustness) and diminishing queueing delay (performance) are jointly achievable even under very limited flexibility.
The power of two random choices: a survey of techniques and results
• Mathematics
• 2001
The important implication of this result is that even a small amount of choice can lead to drastically different results in load balancing.
Sparrow: distributed, low latency scheduling
• Computer Science
SOSP
• 2013
It is demonstrated that a decentralized, randomized sampling approach provides near-optimal performance while avoiding the throughput and availability limitations of a centralized design.