Standardizing the next generation of bioinformatics software development with BioHDF (HDF5).

Abstract

Next Generation Sequencing technologies are limited by the lack of standard bioinformatics infrastructures that can reduce data storage requirements, increase data processing performance, and integrate diverse information. HDF technologies address these needs and have a long history of use in data-intensive science communities. They include general data file formats, libraries, and tools for working with the data. Compared to emerging standards, such as the SAM/BAM formats, HDF5-based systems scale significantly better, support multiple indexes, store multiple data types, and are self-describing. For these reasons, HDF5 and its BioHDF extension are well suited for implementing data models to support the next generation of bioinformatics applications.

DOI: 10.1007/978-1-4419-5913-3_77

Cite this paper

@article{Mason2010StandardizingTN,
  title={Standardizing the next generation of bioinformatics software development with BioHDF (HDF5).},
  author={Christopher E. Mason and Paul Zumbo and Stephan J. Sanders and Mike Folk and Dana Robinson and Ruth A. Aydt and Martin Gollery and Mark F Welsh and Nels E. Olson and Todd Smith},
  journal={Advances in experimental medicine and biology},
  year={2010},
  volume={680},
  pages={693-700}
}