# Scalable Load Balancing in Networked Systems: Universality Properties and Stochastic Coupling Methods

@article{Boor2019ScalableLB, title={Scalable Load Balancing in Networked Systems: Universality Properties and Stochastic Coupling Methods}, author={Mark van der Boor and Sem C. Borst and Johan van Leeuwaarden and Debankur Mukherjee}, journal={ArXiv}, year={2019}, volume={abs/1712.08555} }

We present an overview of scalable load balancing algorithms which provide favorable delay performance in large-scale systems, and yet only require minimal implementation overhead. Aimed at a broad audience, the paper starts with an introduction to the basic load balancing scenario, consisting of a single dispatcher where tasks arrive that must immediately be forwarded to one of $N$ single-server queues.
A popular class of load balancing algorithms are so-called power-of-$d$ or JSQ($d…
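As a rough illustration of the power-of-$d$ idea named in the abstract, the sketch below assumes the usual reading of JSQ($d$): each arriving task goes to the shortest of $d$ queues sampled uniformly at random. Sampling without replacement, ignoring service completions, and the names `jsq_d_pick` and `dispatch` are all assumptions made for this example and are not taken from the paper.

```python
import random


def jsq_d_pick(queue_lengths, d, rng):
    """Sample d distinct queues uniformly at random and return the index
    of the shortest one (ties go to the earliest sampled index)."""
    sampled = rng.sample(range(len(queue_lengths)), d)
    return min(sampled, key=lambda i: queue_lengths[i])


def dispatch(queue_lengths, d, num_tasks, seed=0):
    """Assign num_tasks arrivals one by one with JSQ(d).

    Simplification (assumption): no task ever departs, so this only shows
    how the dispatcher spreads arrivals, not steady-state delay.
    """
    rng = random.Random(seed)
    lengths = list(queue_lengths)
    for _ in range(num_tasks):
        lengths[jsq_d_pick(lengths, d, rng)] += 1
    return lengths


if __name__ == "__main__":
    n = 10
    for d in (1, 2, n):  # d = 1 is purely random, d = n is full JSQ
        print(d, dispatch([0] * n, d, num_tasks=50))
```

With `d` equal to the number of queues the rule reduces to plain JSQ and needs the state of every queue per task. Small `d` trades some balance for only `d` queue probes per task.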

## 32 Citations

### Scalable load balancing in networked systems

- Computer Science
- 2017

An asymptotic regime where the diversity parameter d(N) depends on N is considered, and it is investigated what growth rate is required to match the optimal performance of the JSQ policy on fluid and diffusion scale, and achieve a vanishing waiting time in the limit.

### Scalable load balancing in networked systems: A survey of recent advances

- Computer Science · SIAM Review
- 2022

It is demonstrated how stochastic coupling techniques and stochastic-process limits play an instrumental role in establishing asymptotic optimality, and how this optimality carries over to infinite-server settings, finite buffers, multiple dispatchers, servers arranged on graph topologies, and token-based load balancing, including the popular Join-the-Idle-Queue (JIQ) scheme.

### Optimal Hyper-Scalable Load Balancing with a Strict Queue Limit

- Computer Science · Perform. Evaluation
- 2021

### Asymptotic Optimality of Speed-Aware JSQ for Heterogeneous Systems

- Mathematics, Computer Science · ArXiv
- 2022

A speed-aware version of the JSQ scheme for heterogeneous systems is considered, and it is shown that it achieves delay optimality in the fluid limit and has a unique and globally attractive fixed point.

### Load Balancing Under Strict Compatibility Constraints

- Computer Science · SIGMETRICS
- 2021

Proportionally sparse random compatibility graphs can be constructed, which reduce the server-degree almost by a factor N/ln(N) compared to the complete bipartite compatibility graph.

### Steady-state analysis of load-balancing algorithms in the sub-Halfin-Whitt regime

- Computer Science · J. Appl. Probab.
- 2020

A sufficient condition under which the probability that an incoming job is routed to an idle server is 1 asymptotically (as $N \to \infty$) at steady state is established.

### Join-Idle-Queue with Service Elasticity: Large-Scale Asymptotics of a Non-monotone System

- Mathematics · ArXiv
- 2018

A novel method is developed to prove that the subcritically loaded system is stable for large enough $N$, and establish convergence of steady-state distributions to the optimal one, as $N \to \infty$.

### Join-the-Shortest Queue diffusion limit in Halfin–Whitt regime: Sensitivity on the heavy-traffic parameter

- Mathematics · The Annals of Applied Probability
- 2020

Consider a system of $N$ parallel single-server queues with unit-exponential service time distribution and a single dispatcher where tasks arrive as a Poisson process of rate $\lambda(N)$. When a…

### Join-the-shortest queue diffusion limit in Halfin–Whitt regime: Tail asymptotics and scaling of extrema

- Mathematics · The Annals of Applied Probability
- 2019

Consider a system of $N$ parallel single-server queues with unit-exponential service time distribution and a single dispatcher where tasks arrive as a Poisson process of rate $\lambda(N)$. When a…

### Economies-of-scale in resource sharing systems: tutorial and partial review of the QED heavy-traffic regime

- Computer Science
- 2017

The mathematics behind the Quality-and-Efficiency Driven (QED) regime, which lets the system operate close to full utilization, while the number of servers grows simultaneously large and delays remain manageable, is reviewed.

## References

### Universality of load balancing schemes on the diffusion scale

- Mathematics · Journal of Applied Probability
- 2016

A stochastic coupling construction is developed to obtain the diffusion limit of the queue process in the Halfin‒Whitt heavy-traffic regime, and it is established that it does not depend on the value of d, implying that assigning tasks to idle servers is sufficient for diffusion level optimality.

### Asymptotic Optimality of Power-of-d Load Balancing in Large-Scale Systems

- Mathematics, Computer Science · Math. Oper. Res.
- 2020

The results indicate that the JSQ optimality can be preserved at the fluid and diffusion levels while reducing the overhead by nearly a factor O(N) and O([Formula: see text]), respectively.

### Load Balancing in the Non-Degenerate Slowdown Regime

- Computer Science
- 2017

A novel diffusion approximation and timescale separation is identified that provides insights into the performance of Join-the-Shortest-Queue and leads to new rules that have identical performance to JSQ but require less communication overhead than power-of-2-choices.

### Delay, Memory, and Messaging Tradeoffs in Distributed Service Systems

- Mathematics · SIGMETRICS
- 2016

It is shown that for any given $\alpha > 0$ (no matter how small), if the policy only uses a linear message rate $\alpha N$, the resulting asymptotic expected queueing delay is positive but upper bounded, uniformly over all $\lambda < 1$.

### Randomized load balancing in heavy traffic

- Mathematics
- 2017

We consider three randomized schemes for load balancing among a fixed number, $N$, of resources, and analyze them in the heavy traffic limit. The first is join the shortest queue, where arrivals are routed…

### Join-Idle-Queue: A novel load balancing algorithm for dynamically scalable web services

- Computer Science · Perform. Evaluation
- 2011

### Hyper-Scalable JSQ with Sparse Feedback

- Computer Science · SIGMETRICS
- 2019

A novel class of load balancing schemes where the various servers provide occasional queue updates to guide the load assignment is introduced, and it is shown that the proposed schemes strongly outperform JSQ(d) strategies with comparable communication overhead per job, and can achieve a vanishing waiting time in the many-server limit with just one message per job.

### Asymptotically Optimal Load Balancing Topologies

- Mathematics, Computer Science · Proc. ACM Meas. Anal. Comput. Syst.
- 2018

It is proved that if $G_N$ is an Erdős-Rényi random graph with average degree $d(N)$, then with high probability it is $N$-optimal and $\sqrt{N}$-optimal if $d(N) \to \infty$ and $d(N) / (\sqrt{N} \log(N)) \to \infty$ as $N \to \infty$, respectively.

### Pull-based load distribution among heterogeneous parallel servers: the case of multiple routers

- Computer Science · Queueing Syst. Theory Appl.
- 2017

A multi-router generalization of the pull-based customer assignment (routing) algorithm PULL, introduced by Stolyar in 2015, is defined, and asymptotic optimality of PULL is proved: as the system scale $n \to \infty$, the steady-state probability of an arriving customer experiencing blocking or waiting vanishes.

### Pull-based load distribution in large-scale heterogeneous service systems

- Computer Science · Queueing Syst. Theory Appl.
- 2015

Assuming subcritical system load, asymptotic optimality of PULL is proved: the steady-state probability of an arriving customer experiencing blocking or waiting vanishes as the system scale grows.