A file is not a file: understanding the I/O behavior of Apple desktop applications
@article{Harter2011AFI, title={A file is not a file: understanding the I/O behavior of Apple desktop applications}, author={Tyler Harter and Chris Dragga and Michael Vaughn and Andrea C. Arpaci-Dusseau and Remzi H. Arpaci-Dusseau}, journal={Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles}, year={2011} }
We analyze the I/O behavior of iBench, a new collection of productivity and multimedia application workloads. Our analysis reveals a number of differences between iBench and typical file-system workload studies, including the complex organization of modern files, the lack of pure sequential access, the influence of underlying frameworks on I/O patterns, the widespread use of file synchronization and atomic operations, and the prevalence of threads. Our results have strong ramifications for the…
59 Citations
The Composite-file File System: Decoupling the One-to-One Mapping of Files and Metadata for Better Performance
- Computer Science
- FAST
- 2016
A composite-file file system is designed, implemented, and evaluated, which allows many-to-one mappings of files to metadata, and the design space of different mapping strategies is explored.
TABLEFS: Embedding a NoSQL database inside the local file system
- Computer Science
- 2012 Digest APMRC
- 2012
This paper examines using techniques adopted from NoSQL databases to manage file system metadata and small files to improve the performance of modern local file systems in Linux for workloads dominated by metadata and tiny files.
Caching or Not: Rethinking Virtual File System for Non-Volatile Main Memory
- Computer Science
- HotStorage
- 2018
ByVFS is presented, an optimization of VFS that directly accesses metadata in PM file systems, bypassing the VFS caching layer; the results show that ByVFS outperforms conventional VFS with a cold cache and provides comparable performance to conventional VFS with a warm cache.
Turn Your Storage Stack into a File System
- Computer Science
- 2017
It is argued that a multi-layer filesystem will be simpler to implement and to use than the complex collection of different storage systems that the authors have now, because many storage system optimizations both at the OS and application layers are designed to hide access latencies.
Strata: A Cross Media File System
- Computer Science
- SOSP
- 2017
Strata is presented, a cross-media file system that leverages the strengths of one storage medium to compensate for the weaknesses of another, and has 20-30% better latency and throughput compared to file systems purpose-built for each layer, while providing synchronous and unified access to the entire storage hierarchy.
Building a Reliable Storage Stack
- Computer Science
- 2016
Loris, the redesign of the storage stack along three dimensions: reliability, heterogeneity and flexibility, is presented and several major problems with the traditional stack are highlighted.
Analysis of HDFS under HBase: a Facebook Messages case study
- Computer Science
- FAST
- 2014
It is examined how layering causes write amplification when HBase is run on top of HDFS and how tighter integration could result in improved write performance, and whether it makes sense to include an SSD to improve performance while keeping costs in check.
Understanding Data Characteristics and Access Patterns in a Cloud Storage System
- Computer Science
- 2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing
- 2013
An analysis of a file system snapshot and a five-month access trace of a campus cloud storage system that has been deployed on the Tsinghua campus for three years finds that cache efficiency can be improved by 5 times using guidance from the observations.
Arrakis: The Operating System is the Control Plane
- Computer Science
- OSDI
- 2013
A new operating system, Arrakis, is designed and implemented that splits the traditional role of the kernel in two, allowing most I/O operations to skip the kernel entirely, while the kernel is re-engineered to provide network and disk protection without kernel mediation of every operation.
Extending the lifetime of flash-based storage through reducing write amplification from file systems
- Computer Science
- FAST
- 2013
An object-based flash translation layer design (OFTL) is presented, in which mechanisms are co-designed with flash memory: it enables lazy persistence of index metadata, eliminates journals while keeping consistency, and uses coarse-grained block state maintenance to reduce persistent free space management overhead.
References
Analysis of file I/O traces in commercial computing environments
- Computer Science
- SIGMETRICS '92/PERFORMANCE '92
- 1992
This paper analyzes file I/O traces of several existing production computer systems to understand file access behavior and observes that although only a third of the active files are sequentially shared, they receive a very large proportion of the total operations.
A Comparison of File System Workloads
- Computer Science
- USENIX Annual Technical Conference, General Track
- 2000
This paper describes the collection and analysis of file system traces from a variety of different environments, including both UNIX and NT systems, clients and servers, and instructional and production systems and develops a new metric for measuring file lifetime that accounts for files that are never deleted.
A trace-driven analysis of the UNIX 4.2 BSD file system
- Computer Science
- SOSP 1985
- 1985
The UNIX 4.2BSD file system is analyzed by recording activity in trace files and writing programs to analyze the traces, and a simulator that uses the traces to predict the performance of caches for disk blocks is written.
File system usage in Windows NT 4.0
- Computer Science
- SOSP
- 1999
This paper reports on the usage details of the Windows NT file system architecture through a detailed comparison with older traces, details on the operational characteristics, and a usage analysis of the file system and cache manager.
A study of file sizes and functional lifetimes
- Computer Science
- SOSP
- 1981
The collection, analysis and interpretation of data pertaining to files in the computing environment of the Computer Science Department at Carnegie-Mellon University (CMU-CSD) is discussed.
Measurements of a distributed file system
- Computer Science
- SOSP '91
- 1991
This work analyzed the user-level file access patterns and caching behavior of the Sprite distributed file system and found that client cache consistency is needed to prevent stale data errors, but that it is not invoked often enough to degrade overall system performance.
Scale and performance in a distributed file system
- Computer Science
- TOCS
- 1988
Observations of a prototype implementation are presented, changes in the areas of cache validation, server process structure, name translation, and low-level storage representation are motivated, and Andrew's ability to scale gracefully is quantitatively demonstrated.
Analysis and Evolution of Journaling File Systems
- Computer Science
- USENIX Annual Technical Conference, General Track
- 2005
We develop and apply two new methods for analyzing file system behavior and evaluating file system changes. First, semantic block-level analysis (SBA) combines knowledge of on-disk data structures…
The Google file system
- Computer Science
- SOSP '03
- 2003
This paper presents file system interface extensions designed to support distributed applications, discusses many aspects of the design, and reports measurements from both micro-benchmarks and real world use.
A large-scale study of file-system contents
- Computer Science
- SIGMETRICS '99
- 1999
It is found that file and directory sizes are fairly consistent across file systems, but file lifetimes vary widely and are significantly affected by the job function of the user.