Corpus ID: 239015947

A Learning-based Approach Towards Automated Tuning of SSD Configurations

@article{Li2021ALA,
  title={A Learning-based Approach Towards Automated Tuning of SSD Configurations},
  author={Daixuan Li and Jian Huang},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.08685}
}
Thanks to mature manufacturing techniques, solid-state drives (SSDs) are highly customizable for applications today, which brings opportunities to further improve their storage performance and resource utilization. However, SSD efficiency is usually determined by many hardware parameters, making it hard for developers to manually tune them and determine the optimal SSD configurations. In this paper, we present an automated learning-based framework, named LearnedSSD, that utilizes both…
KML: Using Machine Learning to Improve Storage Systems
TLDR
KML is an ML architecture that consumes few OS resources, adds negligible latency, and yet can learn patterns that improve I/O throughput by as much as 2.3× or 15× for the two use cases respectively, even for complex, never-before-seen, concurrently running mixed workloads on different storage devices.

References

MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices
TLDR
MQSim faithfully models new high-bandwidth protocol implementations, steady-state SSD conditions, and the full end-to-end latency of requests in modern SSDs, and is released as an open-source tool, which can enable researchers to explore directions in new and different areas.
Detecting I/O Access Patterns of HPC Workloads at Runtime
TLDR
Three machine learning techniques are evaluated to automatically detect the I/O access pattern of HPC applications at runtime: decision trees, random forests, and neural networks, which correctly classify the access pattern with up to 99% precision.
Summarizer: Trading Communication with Computing Near Storage
TLDR
This work designs a set of application programming interfaces (APIs) that can be used by the host application to offload a data intensive task to the SSD processor, and describes how these APIs can be implemented by simple modifications to the existing Non-Volatile Memory Express (NVMe) command interface between the host and the SSD processors.
Essential roles of exploiting internal parallelism of flash memory based solid state drives in high-speed data processing
TLDR
This work shows that by exploiting internal parallelism, SSD performance is no longer highly sensitive to access patterns, but rather to other factors, such as data access interferences and physical data layout, which allows for significantly increasing data processing throughput.
LightNVM: The Linux Open-Channel SSD Subsystem
TLDR
The experience of building LightNVM, the Linux Open-Channel SSD subsystem, is presented, and a new Physical Page Address I/O interface that exposes SSD parallelism and storage media characteristics is introduced. It integrates into traditional storage stacks while also enabling storage engines to take advantage of the new I/O interface.
An Inquiry into Machine Learning-based Automatic Configuration Tuning Services on Real-World Database Management Systems
TLDR
A thorough evaluation of ML-based DBMS knob tuning methods on an enterprise database application using the OtterTune tuning service shows that three state-of-the-art ML algorithms generate knob configurations that improve performance by 45% over enterprise-grade configurations.
FlashBlox: Achieving Both Performance Isolation and Uniform Lifetime for Virtualized SSDs
TLDR
This work proposes allowing the wear of different channels and dies to diverge at fine time granularities in favor of isolation, and adjusting that imbalance at a coarse time granularity in a principled manner.
DFTL: a flash translation layer employing demand-based selective caching of page-level address mappings
TLDR
This work proposes a complete paradigm shift in the design of the core FTL engine from the existing techniques with a Demand-based Flash Translation Layer (DFTL), which selectively caches page-level address mappings and develops a flash simulation framework called FlashSim.
Design Tradeoffs for SSD Performance
TLDR
It is found that SSD performance and lifetime are highly workload-sensitive, and that complex systems problems that normally appear higher in the storage stack, or even in distributed systems, are relevant to device firmware.
Performance Evaluation of Dynamic Page Allocation Strategies in SSDs
TLDR
Using steady-state analysis of SSDs, it is shown that dynamism helps to mitigate the performance and endurance overheads of garbage collection, and that midrange and high-end SSDs with dynamic allocation can improve I/O operations per second by up to 3.3x and 9.6x, respectively.