The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome

@article{McDonald2012TheBO,
  title={The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome},
  author={Daniel McDonald and Jos{\'e} Carlos Clemente and Justin Kuczynski and Jai Ram Rideout and Jesse Stombaugh and Doug Wendel and Andreas Wilke and Susan M. Huse and John Hufnagle and Folker Meyer and Rob Knight and J. Gregory Caporaso},
  journal={GigaScience},
  year={2012},
  volume={1},
  pages={7--7}
}
Background: We present the Biological Observation Matrix (BIOM, pronounced “biome”) format: a JSON-based file format for representing arbitrary observation by sample contingency tables with associated sample and observation metadata. As the number of categories of comparative omics data types (collectively, the “ome-ome”) grows rapidly, a general format to represent and archive this data will facilitate the interoperability of existing bioinformatics tools and future meta-analyses. Findings: The…
biojs-io-biom, a BioJS component for handling data in Biological Observation Matrix (BIOM) format.
TLDR
A BioJS component for parsing BIOM data in all format versions that supports import, modification, and export via a unified interface is presented, and its usefulness is demonstrated by two applications that already use this component.
biojs-io-biom, a BioJS component for handling data in Biological Observation Matrix (BIOM) format
TLDR
This module aims to facilitate the development of web applications that use BIOM data through import, modification, and export via a unified interface and demonstrates its usefulness by two applications that already use this component.
Mian: Interactive Web-Based 16S rRNA Operational Taxonomic Unit Table Data Visualization and Discovery Platform
TLDR
Mian is an open-source web framework to interactively visualize or run a suite of statistical and feature selection tools on the microbiome to identify important taxonomic groups in the context of any provided categorical or numerical metadata.
MicrobiomeDB: a systems biology platform for integrating, mining and analyzing microbiome experiments
TLDR
The current release of the database integrates microbial census data with sample details for nearly 14 000 samples originating from human, animal and environmental sources, including over 9000 samples from healthy human subjects in the Human Microbiome Project.
Phinch: An interactive, exploratory data visualization framework for metagenomic datasets
TLDR
Phinch, an interactive, browser-based visualization framework that can be used to explore and analyze biological patterns in high-throughput -Omic datasets, takes advantage of standard file formats from computational pipelines in order to bridge the gap between biological software and existing data visualization capabilities.
ASaiM: a Galaxy-based framework to analyze raw shotgun data from microbiota
TLDR
Based on the Galaxy framework, ASaiM provides a powerful framework to easily and quickly explore microbiota data in a reproducible and transparent environment and offers sophisticated analyses to scientists without command-line knowledge.
Fizzy: feature subset selection for
TLDR
A new Python command line tool is developed for microbial ecologists that implements information-theoretic subset selection methods for biological data formats that help to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome.
Calour: an Interactive, Microbe-Centric Analysis Tool
TLDR
Calour provides a study-centric data model to store and manipulate sample-by-feature tables (with features typically being operational taxonomic units) and their associated metadata, and generates an interactive heatmap, allowing visualization of microbial patterns and exploration using microbial knowledge databases.
Compact graphical representation of phylogenetic data and metadata with GraPhlAn
TLDR
GraPhlAn (Graphical Phylogenetic Analysis), a computational tool that produces high-quality, compact visualizations of microbial genomes and metagenomes, is developed as an open-source command-driven tool in order to be easily integrated into complex, publication-quality bioinformatics pipelines.
metaXplor: an interactive viral and microbial metagenomic data manager
Abstract Background Efficiently managing large, heterogeneous data in a structured yet flexible way is a challenge to research laboratories working with genomic data. Specifically regarding both

References

The metagenomics RAST server – a public resource for the automatic phylogenetic and functional analysis of metagenomes
TLDR
The open-source metagenomics RAST service provides a new paradigm for the annotation and analysis of metagenomes that is stable, extensible, and freely available to all researchers.
Resources and Costs for Microbial Sequence Analysis Evaluated Using Virtual Machines and Cloud Computing
TLDR
Although bioinformatics requirements for microbial genomics depend on dataset characteristics and the analysis protocols applied, the results suggest that smaller sequencing facilities invested in 16S rRNA amplicon sequencing, microbial single-genome and metagenomics WGS projects can achieve cost-efficient bioinformatics support using CloVR in combination with Amazon EC2 as an alternative to local computing centers.
IMG: the integrated microbial genomes database and comparative analysis system
TLDR
The Integrated Microbial Genomes system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context and provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context.
Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications
TLDR
To establish a unified standard for describing sequence data and to provide a single point of entry for the scientific community to access and learn about GSC checklists, the minimum information about any (x) sequence is presented (MIxS).
Moving pictures of the human microbiome
TLDR
The largest human microbiota time series analysis to date is presented, covering two individuals at four body sites over 396 timepoints and finds that despite stable differences between body sites and individuals, there is pronounced variability in an individual's microbiota across months, weeks and even days.
A human gut microbial gene catalogue established by metagenomic sequencing
TLDR
The Illumina-based metagenomic sequencing, assembly and characterization of 3.3 million non-redundant microbial genes, derived from 576.7 gigabases of sequence, from faecal samples of 124 European individuals are described, indicating that the entire cohort harbours between 1,000 and 1,150 prevalent bacterial species and each individual at least 160 such species.
Badomics words and the power and peril of the ome-meme
TLDR
The spread of one small subset of words that are meant to convey “comprehensiveness” in some way, namely the “omes” and other words derived from “genome” or “genomics”, is discussed.
A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea
TLDR
The results strongly support the need for systematic ‘phylogenomic’ efforts to compile a phylogeny-driven ‘Genomic Encyclopedia of Bacteria and Archaea’ in order to derive maximum knowledge from existing microbial genome data as well as from genome sequences to come.
Wrinkles in the rare biosphere: pyrosequencing errors can lead to artificial inflation of diversity estimates.
TLDR
It is found that 16S rDNA diversity is grossly overestimated unless relatively stringent read quality filtering and low clustering thresholds are applied, and stringent quality-based trimming of 16S pyrotags and clustering threshold no greater than 97% identity should be used to avoid overestimates of the rare biosphere.
The rational exploration of microbial diversity
TLDR
A robust statistical method is applied to large gene sequence libraries from these environments to estimate both diversity and the sequencing effort required to obtain a given fraction of that diversity.