Evaluating Load Balancing Performance in Distributed Storage With Redundancy

Mehmet Fatih Aktaş, Amir Behruzi Far, Emina Soljanin, Philip A. Whiting
IEEE Transactions on Information Theory
To facilitate load balancing, distributed systems store data redundantly. We evaluate the load balancing performance of storage schemes in which each object is stored at $d$ different nodes, and each node stores the same number of objects. In our model, the load offered for the objects is sampled uniformly at random from all the load vectors with a fixed cumulative value. We find that the load balance in a system of…
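The model in the abstract can be explored with a small simulation. The sketch below is illustrative only and is not the authors' method. It assumes a cyclic placement (object i is stored on nodes i, i+1, ..., i+d-1 modulo n) and an even split of each object's load across its d nodes. The function names `sample_load_vector`, `cyclic_placement`, `max_node_load`, and `average_max_load` are all hypothetical. Loads are drawn uniformly from the vectors with a fixed sum by normalizing independent exponential draws.

```python
import random


def sample_load_vector(k, total, rng):
    """Sample k nonnegative loads that sum to `total`, uniformly over the simplex.

    Normalizing i.i.d. exponential draws gives a uniform point on the simplex.
    """
    draws = [rng.expovariate(1.0) for _ in range(k)]
    scale = total / sum(draws)
    return [x * scale for x in draws]


def cyclic_placement(k, n, d):
    """Assumed layout: object i is stored on nodes i, i+1, ..., i+d-1 (mod n)."""
    return [[(i + j) % n for j in range(d)] for i in range(k)]


def max_node_load(loads, placement, n):
    """Split each object's load evenly over its nodes (an assumption),
    then return the largest resulting node load."""
    node_load = [0.0] * n
    for load, nodes in zip(loads, placement):
        share = load / len(nodes)
        for node in nodes:
            node_load[node] += share
    return max(node_load)


def average_max_load(n, d, trials=200, seed=0):
    """Average the maximum node load over random load vectors.

    Uses n objects on n nodes with cumulative load n, so a perfectly
    balanced system has load 1.0 on every node.
    """
    rng = random.Random(seed)
    placement = cyclic_placement(n, n, d)
    total = 0.0
    for _ in range(trials):
        loads = sample_load_vector(n, float(n), rng)
        total += max_node_load(loads, placement, n)
    return total / trials
```

Calling `average_max_load(n, d)` for several values of `d` gives a rough feel for how redundancy affects the maximum load under these assumptions. An even split is generally not the best possible assignment, so the numbers are only an upper estimate of what an optimal load split could achieve.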

Balanced Nonadaptive Redundancy Scheduling

It turns out that the superior performance of the block design-based policy results from the better expansion properties of its associated graph, which reduce the average waiting time in the queue by up to 25% compared with the random policy and by up to 100% compared with the round-robin policy.

Balanced and Swap-Robust Trades for Dynamical Distributed Storage

A new class of balanced trades, important for access balancing of servers, is introduced, along with perturbation-resilient balanced trades, which are important for studying the stability of server access frequencies with respect to changes in data popularity.

A resource occupancy ratio-oriented load balancing task scheduling mechanism for Flink

To distribute workloads evenly across workers, a resource-oriented load balancing task scheduling (RoLBTS) mechanism for Flink is proposed and, based on self-recursive calls, a RoLBTS algorithm for scheduling the resources that tasks need is presented.

Research on open source platform of digital power grid based on CEPH open source distributed storage technology

The test results show that the digital grid open source platform based on CEPH open source distributed storage technology has low memory occupancy, high data-processing efficiency, and good system load balancing ability, which meets the needs of information security and data processing.



Load Balancing Performance in Distributed Storage with Regular Balanced Redundancy

It is found that load balance in a system of $n$ nodes improves multiplicatively with $d$ as long as $d = o(\log n)$, and improves exponentially as soon as $d = \Theta(\log n)$.

Service Rate Region of Content Access from Erasure Coded Storage

This paper determines the set of request arrival rates for a 3-file coded storage system and provides an algorithm to maximize the rate of requests served for file $K$ given $\lambda_1, \ldots, \lambda_{K-1}$ in the general $K$-file case.

Memory allocation in distributed storage networks

This work considers the problem of distributing a file in a network of storage nodes whose storage budget is limited but at least equal to the file size, and finds the optimal symmetric allocation for all coding redundancy constraints using the equivalent approximate problem.

On the service capacity region of accessing erasure coded content

This analysis demonstrates that erasure coding makes the system more robust to skews in file popularity than simply replicating a file at multiple servers, and that coding and replication together can make the capacity region larger than either alone.

MaxMinSum Steiner Systems for Access-Balancing in Distributed Storage

It is shown that proper relabelings of points in the Bose and Skolem constructions for Steiner triple systems lead to optimal MaxMin values for the sums of interest; for the duals of the designs, the block labelings are within a 3/4 multiplicative factor of the optimum.

Cassandra: a decentralized structured storage system

Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of

Balls and bins with structure: balanced allocations on hypergraphs

This paper allows each ball to have an associated random set of bins and shows that this model captures structure important to two applications, nearby server selection and load balance in distributed hash tables.

Combinatorial batch codes

This paper focuses on batch codes, which were introduced by Ishai, Kushilevitz, Ostrovsky and Sahai in [4], and presents batch codes that are optimal with respect to the storage requirement, denoted by N.

Balanced allocations: the heavily loaded case

The best previously known results for the multiple-choice processes in the heavily loaded case were obtained using majorization by the single-choice process, so this paper yields an upper bound of the maximum load of bins of $m/n + \mathcal{O}(\sqrt{m \ln n / n})$ with high probability.

Balanced allocation on graphs

The two-choice balls-and-bins process is studied in the setting where balls are not allowed to choose any two random bins, but only bins that are connected by an edge in an underlying graph.
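The graph-restricted two-choice process described here can be sketched in a few lines. This is a minimal illustration, not code from the paper. It assumes each ball picks an edge uniformly at random and goes to the less loaded endpoint, with ties broken toward the first endpoint of the edge (the tie rule is an assumption). The names `two_choice_on_graph` and `cycle_edges` are hypothetical.

```python
import random


def two_choice_on_graph(edges, n, m, seed=0):
    """Throw m balls into n bins, where each ball samples one edge (u, v)
    uniformly at random and joins the less loaded of the two endpoints."""
    rng = random.Random(seed)
    load = [0] * n
    for _ in range(m):
        u, v = rng.choice(edges)
        # Ties go to the first endpoint; this rule is an assumption.
        target = u if load[u] <= load[v] else v
        load[target] += 1
    return load


def cycle_edges(n):
    """Edges of an n-node cycle, used here only as an example graph."""
    return [(i, (i + 1) % n) for i in range(n)]
```

For example, `max(two_choice_on_graph(cycle_edges(100), 100, 100))` gives the maximum bin load on a cycle, which can be compared against denser graphs by passing a different edge list.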