# Evaluating Load Balancing Performance in Distributed Storage With Redundancy

@article{Akta2019EvaluatingLB,
title={Evaluating Load Balancing Performance in Distributed Storage With Redundancy},
author={Mehmet Fatih Aktaş and Amir Behruzi Far and Emina Soljanin and Philip A. Whiting},
journal={IEEE Transactions on Information Theory},
year={2019},
volume={67},
pages={3623-3644}
}
• Published 13 October 2019
• Computer Science
• IEEE Transactions on Information Theory
To facilitate load balancing, distributed systems store data redundantly. We evaluate the load balancing performance of storage schemes in which each object is stored at $d$ different nodes, and each node stores the same number of objects. In our model, the load offered for the objects is sampled uniformly at random from all the load vectors with a fixed cumulative value. We find that the load balance in a system of…
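The model in the abstract can be explored with a small simulation. The sketch below is only an illustration under stated assumptions, not the paper's method: the cyclic placement (object $i$ on nodes $i, \ldots, i+d-1$ modulo $n$), the even split of each object's load across its $d$ copies, and all function names are hypothetical choices made here. The load vector is drawn uniformly from the non-negative vectors with a fixed sum by normalizing i.i.d. exponential draws.

```python
import random


def sample_load_vector(num_objects, total_load, rng):
    """Draw a load vector uniformly from non-negative vectors summing to total_load."""
    # Normalized i.i.d. exponentials are uniform on the simplex (Dirichlet(1, ..., 1)).
    draws = [rng.expovariate(1.0) for _ in range(num_objects)]
    scale = total_load / sum(draws)
    return [x * scale for x in draws]


def cyclic_placement(num_nodes, d):
    """Assumed scheme: object i lives on nodes i, i+1, ..., i+d-1 (mod num_nodes).

    With as many objects as nodes, every node stores exactly d objects.
    """
    return [[(i + j) % num_nodes for j in range(d)] for i in range(num_nodes)]


def max_node_load(loads, placement, num_nodes):
    """Split each object's load evenly over its copies; return the heaviest node's load.

    Even splitting is a simplification; an optimal split could do better.
    """
    node_load = [0.0] * num_nodes
    for load, nodes in zip(loads, placement):
        for node in nodes:
            node_load[node] += load / len(nodes)
    return max(node_load)


def average_max_load(num_nodes, d, trials=200, seed=0):
    """Average the maximum node load over random load vectors with total load num_nodes."""
    rng = random.Random(seed)
    placement = cyclic_placement(num_nodes, d)
    total = 0.0
    for _ in range(trials):
        loads = sample_load_vector(num_nodes, float(num_nodes), rng)
        total += max_node_load(loads, placement, num_nodes)
    return total / trials


if __name__ == "__main__":
    for d in (1, 2, 4, 8):
        print(d, round(average_max_load(64, d), 3))
```

Because the total load equals the number of nodes, a perfectly balanced system has a maximum node load of 1, so the printed values show how far each choice of $d$ is from perfect balance under these assumptions.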
5 Citations

• Computer Science
IEEE Journal on Selected Areas in Information Theory
• 2022
It turns out that the superior performance of the block design-based policy results from the better expansion properties of its associated graph, which reduces the average waiting time in the queue by up to 25% compared with the random policy and by up to 100% compared with the round-robin policy.
• Computer Science
2022 IEEE International Symposium on Information Theory (ISIT)
• 2022
A new class of balanced trades, important for access balancing of servers, is introduced, along with perturbation-resilient balanced trades, important for studying the stability of server access frequencies with respect to changes in data popularity.
• Computer Science
J. Intell. Fuzzy Syst.
• 2023
To distribute workloads evenly across workers, a resource-oriented load balancing task scheduling (RoLBTS) mechanism for Flink is proposed, and a RoLBTS algorithm based on self-recursive calls is presented for scheduling the resources that tasks need.
• Computer Science, Engineering
Journal of Physics: Conference Series
• 2022
The test results show that the digital grid open source platform based on CEPH open source distributed storage technology has low memory occupancy, high data processing efficiency, and good system load balancing ability, which meets the needs of information security and data processing.

## References

• Computer Science
2019 XVI International Symposium "Problems of Redundancy in Information and Control Systems" (REDUNDANCY)
• 2019
It is found that load balance in a system of $n$ nodes improves multiplicatively with $d$ as long as $d = o(\log n)$, and improves exponentially as soon as $d = \Theta(\log n)$.
• Computer Science
2018 IEEE Information Theory Workshop (ITW)
• 2018
This paper determines the set of request arrival rates for a 3-file coded storage system and provides an algorithm to maximize the rate of requests served for file $K$ given $\lambda_{1}, \ldots, \lambda_{K-1}$ in the general $K$-file case.
• Computer Science
2010 IEEE International Symposium on Information Theory
• 2010
This work considers the problem of distributing a file in a network of storage nodes whose storage budget is limited but at least equal to the file size, and finds the optimal symmetric allocation for all coding redundancy constraints using the equivalent approximate problem.
• Computer Science
2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
• 2017
This analysis demonstrates that erasure coding makes the system more robust to skews in file popularity than simply replicating a file at multiple servers, and that coding and replication together can make the capacity region larger than either alone.
• Computer Science
SIAM J. Discret. Math.
• 2018
It is shown that proper relabelings of points in the Bose and Skolem constructions for Steiner triple systems lead to optimal MaxMin values for the sums of interest; for the duals of the designs, block labelings are exhibited that are within a 3/4 multiplicative factor of the optimum.
• Computer Science
OPSR
• 2010
Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of
This paper allows each ball to have an associated random set of bins and shows that this model captures structure important to two applications, nearby server selection and load balance in distributed hash tables.
• Computer Science
The best previously known results for the multiple-choice processes in the heavily loaded case were obtained using majorization by the single-choice process, so this paper yields an upper bound of the maximum load of bins of $m/n + \mathcal{O}(\sqrt{m \ln n / n})$ with high probability.