Learn More
Inspired by Google's BigTable, a variety of scalable, semi-structured, weak-semantic table stores have been developed and optimized for different priorities such as query speed, ingest speed, availability, and interactivity. As these systems mature, performance benchmarking will advance from measuring the rate of simple workloads to understanding and(More)
This paper presents the design and implementation of DOT, a flexible architecture for data transfer. This architecture separates content negotiation from the data transfer itself. Applications determine what data they need to send and then use a new transfer service to send it. This transfer service acts as a common interface between applications and the(More)
— This paper considers serial fusion as a mechanism for collaborative signal detection. The advantage of this technique is that it can use only the sensor observations that are really necessary for signal detection and thus can be very communication efficient. We develop the signal processing mechanisms for serial fusion based on simple models. We also(More)
– We examine the problem of scalable file system directories, motivated by data-intensive applications requiring millions to billions of small files to be ingested in a single directory at rates of hundreds of thousands of file creates every second. We introduce a POSIX-compliant scalable directory design, GIGA+, that distributes directory entries over a(More)
Data-intensive applications fall into two computing styles: Internet services (cloud computing) or high-performance computing (HPC). In both categories, the underlying file system is a key component for scalable application performance. In this paper, we explore the similarities and differences between PVFS, a parallel file system used in HPC at large(More)
There is an increasing use of high-performance computing (HPC) clusters with thousands of compute nodes that, with the advent of multi-core CPUs, will impose a significant challenge for storage systems: The ability to scale to handle I/O generated by applications executing in parallel in tens of thousands of threads. One such challenge is building scalable(More)
Acknowledgements: We would like to thank several people who made significant contributions in improving this paper. Sam Lang and Rob Ross helped us with all PVFS issues; they graciously answered many questions about PVFS internals, provided pointers for performance debugging, and sent us quick bug fixes and patches to keep us moving forward. Julio Lopez,(More)
The growing size of modern storage systems is expected to exceed billions of objects, making metadata scalability critical to overall performance. Many existing distributed file systems only focus on providing highly parallel fast access to file data, and lack a scalable metadata service. In this paper, we introduce a middleware design called IndexFS that(More)