Load Balancing Performance in Distributed Storage with Regular Balanced Redundancy

@article{Akta2019LoadBP,
  title={Load Balancing Performance in Distributed Storage with Regular Balanced Redundancy},
  author={Mehmet Fatih Aktaş and Amir Behrouzi-Far and Emina Soljanin and Philip A. Whiting},
  journal={2019 XVI International Symposium "Problems of Redundancy in Information and Control Systems" (REDUNDANCY)},
  year={2019},
  pages={75--80}
}
  • M. Aktaş, Amir Behrouzi-Far, +1 author P. Whiting
  • Published 1 October 2019
  • Computer Science, Mathematics
  • 2019 XVI International Symposium "Problems of Redundancy in Information and Control Systems" (REDUNDANCY)
Contention at the storage nodes is the main cause of long and variable data access times in distributed storage systems. The offered load on the system must be balanced across the storage nodes to minimize contention, and load balancing should be robust against skews and fluctuations in content popularity. In practice, data objects are replicated across multiple nodes to allow for load balancing. However, redundancy increases the storage requirement and should be used efficiently. We… 
Evaluating Load Balancing Performance in Distributed Storage With Redundancy
TLDR
The load balance in a system of nodes in which each object is stored at different nodes improves multiplicatively with $d$ as long as the spacing between consecutive spacings is consecutive between the ordered statistics of uniform random variables.
Data Freshness in Leader-Based Replicated Storage
TLDR
It is shown that, depending on the relative speed of the write operation to the two groups of nodes, there exists an optimal number of leaders which minimizes the average age of the retrieved data, and that this number increases as the relative speed of writing on leaders increases.
Distributed Multi-User Secret Sharing
TLDR
It is shown how to modify the proposed protocols in order to construct schemes with balanced storage load and communication complexity, thereby demonstrating schemes that are optimal in terms of both parameters.
A Geometric View of the Service Rates of Codes Problem and its Application to the Service Rate of the First Order Reed-Muller Codes
TLDR
This work derives upper bounds on the service rates of the first order Reed-Muller codes and the simplex codes and shows that given the service rate region of a code, a lower bound on the minimum distance of the code can be obtained.
A Combinatorial View of the Service Rates of Codes Problem, its Equivalence to Fractional Matching and its Connection with Batch Codes
TLDR
It is shown that the service capacity of a coded storage system equals the fractional matching number in the graph representation of the code, and thus is lower bounded and upper bounded by the matching number and the vertex cover number, respectively.
Batch Codes for Asynchronous Recovery of Data
We propose a new model of asynchronous batch codes that allow for parallel recovery of information symbols from a coded database in an asynchronous manner, i.e. when queries arrive at random times
Service Rate Region: A New Aspect of Coded Distributed System Design
TLDR
This work shows that erasure coding of data objects can flexibly handle skews in the request rates, and shows the effectiveness of hybrid codes that combine replication and erasure coding in terms of code design.

References

SHOWING 1-10 OF 39 REFERENCES
Memory allocation in distributed storage networks
TLDR
This work considers the problem of distributing a file in a network of storage nodes whose storage budget is limited but at least equal to the file size, and finds the optimal symmetric allocation for all coding redundancy constraints using the equivalent approximate problem.
Cassandra: a decentralized structured storage system
Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of
On the service capacity region of accessing erasure coded content
TLDR
This analysis demonstrates that erasure coding makes the system more robust to skews in file popularity than simply replicating a file at multiple servers, and that coding and replication together can make the capacity region larger than either alone.
Service Rate Region of Content Access from Erasure Coded Storage
TLDR
This paper determines the set of request arrival rates for a 3-file coded storage system and provides an algorithm to maximize the rate of requests served for file $K$ given $\lambda_{1}, \ldots, \lambda_{K-1}$ in a general $K$-file case.
Scarlett: coping with skewed content popularity in mapreduce clusters
TLDR
Scarlett, a system that replicates blocks based on their popularity by accurately predicting file popularity and working within hard bounds on additional storage, causes minimal interference to running jobs.
The Hadoop Distributed File System
TLDR
The architecture of HDFS is described and experience using HDFS to manage 25 petabytes of enterprise data at Yahoo! is reported on.
Balanced allocations: the heavily loaded case
TLDR
It is shown that the multiple-choice processes are fundamentally different from the single-choice variant in that they have "short memory" and the deviation of the multiple-choice processes from the optimal allocation does not increase with the number of balls as in the case of the single-choice process.
Redundancy Scheduling in Systems with Bi-Modal Job Service Time Distributions
  • Amir Behrouzi-Far, E. Soljanin
  • Computer Science, Mathematics
    2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
  • 2019
TLDR
This work develops an analogy to a classical urns and balls problem, and uses it to study the queuing time performance of two non-adaptive classical scheduling policies: random and round-robin.
An analysis of Facebook photo caching
TLDR
This paper instrumented every Facebook-controlled layer of the stack and sampled the resulting event stream to obtain traces covering over 77 million requests for more than 1 million unique photos to study traffic patterns, cache access patterns, geolocation of clients and servers, and to explore correlation between properties of the content and accesses.
Balls and bins with structure: balanced allocations on hypergraphs
TLDR
This paper allows each ball to have an associated random set of bins and shows that this model captures structure important to two applications, nearby server selection and load balance in distributed hash tables.