The power of slightly more than one sample in randomized load balancing

@article{Ying2015ThePO,
  title={The power of slightly more than one sample in randomized load balancing},
  author={Lei Ying and Ramakrishnan Srikant and Xiaohan Kang},
  journal={2015 IEEE Conference on Computer Communications (INFOCOM)},
  year={2015},
  pages={1131--1139}
}
In many computing and networking applications, arriving tasks have to be routed to one of many servers, with the goal of minimizing queueing delays. When the number of processors is very large, a popular routing algorithm works as follows: select two servers at random and route an arriving task to the less loaded of the two. It is well known that this algorithm dramatically reduces queueing delays compared to an algorithm that routes to a single randomly selected server. In recent cloud…
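The two-sample routing rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes a simplified static setting in which tasks are placed once and never depart, so it tracks accumulated load rather than queueing delay. The function name `route_tasks` and its parameters are hypothetical.

```python
import random


def route_tasks(num_servers, num_tasks, d, seed=0):
    """Route each task to the least loaded of d servers sampled at random.

    Assumption: static placement with no service or departures, so the
    result is a load count per server, not a queueing delay.
    """
    rng = random.Random(seed)
    loads = [0] * num_servers
    for _ in range(num_tasks):
        # Sample d distinct servers uniformly at random.
        # d = 1 is purely random routing; d = 2 is the two-sample rule.
        candidates = rng.sample(range(num_servers), d)
        target = min(candidates, key=lambda s: loads[s])
        loads[target] += 1
    return loads


if __name__ == "__main__":
    for d in (1, 2):
        loads = route_tasks(num_servers=1000, num_tasks=1000, d=d)
        print(f"d={d}: max load = {max(loads)}")
```

Running the script prints the maximum load under single-sample routing (`d=1`) and two-sample routing (`d=2`), so the two rules can be compared directly on the same number of servers and tasks.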
Randomized load balancing with a helper
TLDR
This paper proposes a hybrid scheduling strategy consisting of a Pod scheduler and a throughput-limited helper. The strategy has a bounded maximum queue size in the large-system regime, in sharp contrast to Pod scheduling, whose maximum queue size is unbounded.
Power-of-d-Choices with Memory: Fluid Limit and Optimality
TLDR
This paper considers the power-of-d-choice algorithm with the addition of a local memory that keeps track of the latest observations collected over time on the sampled servers, and shows that this algorithm is asymptotically optimal in the sense that the load balancer can always assign each job to an idle server in the large-system limit.
Power-of-$d$-Choices with Memory: Fluid Limit and Optimality
TLDR
This paper considers the power-of-d-choice algorithm with the addition of a local memory that keeps track of the latest observations collected over time on the sampled servers, and shows that this algorithm is asymptotically optimal in the sense that the load balancer can always assign each job to an idle server in the large-server limit.
Efficient load balancing in large-scale systems
TLDR
The threshold-based load balancing scheme can achieve performance similar to that of the power-of-d scheme with d(N) ≫ √N log(N), and thus diffusion-level optimality, with only O(1) rather than O(N) communication overhead per task.
Routing with blinkers: Online throughput maximization without queue length information
TLDR
A novel routing policy is proposed that “samples” the servers periodically and achieves maximum throughput, subject to a condition for the service discipline of the server.
Optimal Hyper-Scalable Load Balancing with a Strict Queue Limit
On Delay-Optimal Scheduling in Queueing Systems with Replications
TLDR
Low-complexity scheduling policies are designed and proven to be delay-optimal or near delay-optimal in stochastic ordering among all causal and non-preemptive policies in centralized and distributed multi-server systems.
A lower bound on the queueing delay in resource constrained load balancing
We consider the following distributed service model: jobs with unit mean, general distribution, and independent processing times arrive as a renewal process of rate $\lambda n$, with $0<\lambda<1$,
Delay Asymptotics and Bounds for Multi-Task Parallel Jobs
TLDR
The approach converts the job delay problem into a corresponding balls-and-bins problem, and it is proved that the variant exhibits positive correlation, which greatly generalizes the asymptotic-independence type of results in the literature.
Scalable load balancing in networked systems: A survey of recent advances
TLDR
It is demonstrated how stochastic coupling techniques and stochastic-process limits play an instrumental role in establishing asymptotic optimality, and how this carries over to infinite-server settings, finite buffers, multiple dispatchers, servers arranged on graph topologies, and token-based load balancing, including the popular Join-the-Idle-Queue (JIQ) scheme.

References

The Power of Slightly More than One Sample in Randomized Load Balancing
TLDR
The number of sampled queues can be dramatically reduced by using the fact that tasks arrive in batches (called jobs), in particular by sampling a subset of the queues whose size is slightly larger than the batch size.
On the Power of (Even a Little) Resource Pooling
TLDR
A multi-server model that captures a performance trade-off between centralized and distributed processing is proposed and analyzed, demonstrating a surprising phase transition in the steady-state delay scaling.
The Power of Two Choices in Randomized Load Balancing
TLDR
This work uses a limiting, deterministic model representing the behavior as $n \to \infty$ to approximate the behavior of finite systems and provides simulations that demonstrate that the method accurately predicts system behavior, even for relatively small systems.
Asymptotic independence of queues under randomized load balancing
TLDR
This article considers the least-loaded balancing problem and the more difficult problem in which an arriving job is assigned to the queue with the fewest jobs, and demonstrates the ansatz when the service discipline is FIFO and the service time distribution has a decreasing hazard rate.
Load balancing and density dependent jump Markov processes
  • M. Mitzenmacher
  • Computer Science
    Proceedings of 37th Conference on Foundations of Computer Science
  • 1996
TLDR
A new approach is introduced for analyzing both static and dynamic randomized load balancing strategies. The model appears more realistic than similar models studied previously, in that it is both dynamic and open: customers arrive over time, and the number of customers is not fixed.
Analysis of Load Balancing in Large Heterogeneous Processor Sharing Systems
TLDR
Numerical results are presented to validate the theoretical results and to show that the power-of-two type schemes considered in this paper may not always result in better behaviour in terms of the mean sojourn time of jobs.
Decay of Tails at Equilibrium for FIFO Join the Shortest Queue Networks
TLDR
The tail of the equilibrium queue size exhibits a wide range of behavior depending on the relationship between $\beta$ and $D$, and is shown to be doubly exponential and exponentially distributed when $\beta=D/(D-1)$.
Queueing system topologies with limited flexibility
TLDR
This work studies a multi-server model with n flexible servers and rn queues, connected through a fixed bipartite graph, and shows that a large capacity region (robustness) and diminishing queueing delay (performance) are jointly achievable even under very limited flexibility.
The power of two random choices: a survey of techniques and results
TLDR
The important implication of this result is that even a small amount of choice can lead to drastically different results in load balancing.
Sparrow: distributed, low latency scheduling
TLDR
It is demonstrated that a decentralized, randomized sampling approach provides near-optimal performance while avoiding the throughput and availability limitations of a centralized design.