Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

@article{Bowers2017MinimumIA,
  title={Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea},
  author={Robert M. Bowers and Nikos C. Kyrpides and Ramunas Stepanauskas and Miranda Harmon-Smith and Devin F R Doud and T. B. K. Reddy and Frederik Schulz and Jessica K. Jarett and Adam R. Rivers and Emiley A. Eloe-Fadrosh and Susannah G. Tringe and Natalia N. Ivanova and Alex Copeland and Alicia Clum and Eric D. Becraft and Rex R. Malmstrom and Bruce W. Birren and Mircea Podar and Peer Bork and George M. Weinstock and George M. Garrity and Jeremy A. Dodsworth and Shibu Yooseph and Granger G. Sutton and Frank Oliver Gloeckner and Jack A. Gilbert and William C. Nelson and Steven J. Hallam and Sean P. Jungbluth and Thijs J. G. Ettema and Scott W. Tighe and Konstantinos T. Konstantinidis and Wen-Tso Liu and Brett J. Baker and Thomas Rattei and Jonathan A. Eisen and Brian P. Hedlund and Katherine D. McMahon and Noah Fierer and Rob Knight and Robert D. Finn and Guy Cochrane and Ilene Karsch-Mizrachi and Gene W. Tyson and Christian Rinke and Alla L. Lapidus and Folker Meyer and Pelin Yilmaz and Donovan H. Parks and A. Murat Eren and Lynn M. Schriml and Jillian F. Banfield and P. Bernt Hugenholtz and Tanja Woyke},
  journal={Nature Biotechnology},
  year={2017},
  volume={35},
  pages={725--731}
}
We present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the Minimum Information about Any (x) Sequence (MIxS). The standards are the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), including, but not limited to, assembly quality, and estimates of genome completeness and contamination. These standards can be used… 
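As an illustration of how completeness and contamination estimates might be turned into a reporting category, here is a minimal Python sketch. The truncated abstract above does not state any thresholds; the cut-offs below are the draft tiers commonly cited from the MIMAG paper and should be treated as assumptions to verify against its published table. The `GenomeBin` class, its field names, and the function `mimag_draft_tier` are hypothetical and are not part of any GSC tool.

```python
from dataclasses import dataclass


@dataclass
class GenomeBin:
    """Hypothetical container for per-genome quality estimates."""
    completeness: float                 # estimated completeness, percent
    contamination: float                # estimated contamination, percent
    has_rrna_5s_16s_23s: bool = False   # all three rRNA genes detected
    trna_count: int = 0                 # number of distinct tRNAs detected


def mimag_draft_tier(g: GenomeBin) -> str:
    """Assign a draft tier from assumed MIMAG-style thresholds."""
    # Assumed high-quality draft: >90% complete, <5% contaminated,
    # 5S/16S/23S rRNA genes present, and at least 18 tRNAs.
    if (g.completeness > 90 and g.contamination < 5
            and g.has_rrna_5s_16s_23s and g.trna_count >= 18):
        return "high-quality draft"
    # Assumed medium-quality draft: >=50% complete, <10% contaminated.
    if g.completeness >= 50 and g.contamination < 10:
        return "medium-quality draft"
    # Assumed low-quality draft: <50% complete, <10% contaminated.
    if g.contamination < 10:
        return "low-quality draft"
    # Anything else falls outside the assumed tiers.
    return "unclassified"
```

Under these assumptions, a bin that is nearly complete but lacks the rRNA genes falls into the medium tier rather than the high tier, which is why the sketch checks the gene criteria together with the percentages.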
Accurate and Complete Genomes from Metagenomes
TLDR
Analysis of ~7000 published complete bacterial isolate genomes verifies the value of cumulative GC skew, in combination with other metrics, for establishing bacterial genome sequence accuracy; the same analysis identified potential mis-assemblies in some reference genomes of isolated bacteria.
Minimum Information about an Uncultivated Virus Genome (MIUViG)
TLDR
Community-wide adoption of MIUViG standards, which complement the Minimum Information about a Single Amplified Genome and Metagenome-Assembled Genome standards, will improve the reporting of uncultivated virus genomes in public databases, and should enable more robust comparative studies and a systematic exploration of the global virosphere.
Full Shotgun DNA Metagenomics
TLDR
The chapter focuses on MG-RAST, which can use predicted proteins in metagenomics data to predict function or taxonomic relationships, and can also extract 16S rRNA gene sequences to provide more detailed taxonomic information from the specialized databases SILVA, Greengenes, and RDP.
Long-read metagenomics retrieves complete single-contig bacterial genomes from canine feces
TLDR
Using metagenome-assembled genomes retrieved with nanopore long reads from the feces of a healthy dog, it is shown that long reads are essential to detect mobilome functions, which are usually missed in short-read MAGs.
Improved Mobilome Delineation in Fragmented Genomes
TLDR
TIGER2 better captures MAG microdiversity, recovering niche-defining GIs and supporting microbiome research aims such as virus-host linking and ecological assessment, as well as improving cross-scaffold search.
Complete and validated genomes from a metagenome
TLDR
This work presents a strategy to complete and validate multiple MAGs from a bacterial community using a combination of short and ultra long reads, and obtained multiple complete genomes from a naphthenic acid-degrading community, including one from the recently proposed Candidate Phyla Radiation.
Genomes from uncultivated prokaryotes: a comparison of metagenome-assembled and single-amplified genomes
TLDR
The strong agreement between the single-amplified and metagenome-assembled genomes emphasizes that both methods generate accurate genome information from uncultivated bacteria, implying that the research questions and the available resources can determine the choice of genomics approach for microbiome studies.
MetaPlatanus: a metagenome assembler that combines long-range sequence links and species-specific features
TLDR
The study demonstrates that MetaPlatanus could be an effective approach for exploring large-scale structures in metagenomes and can circumvent the limitations of highly fragmented assemblies and the frequent interspecies misassemblies produced by other tools.
Optimizing and evaluating the reconstruction of Metagenome-assembled microbial genomes
TLDR
The optimal reconstruction method is determined for four microbiome projects that varied in sequencing platform, diversity, and environment, using a set of parameters to select the best assembly and binning tools; metagenomes from microbial communities with high coverage of phylogenetically distinct members and low taxonomic diversity yield the highest-quality metagenome-assembled genomes.

References

SHOWING 1-10 OF 96 REFERENCES
Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications
TLDR
To establish a unified standard for describing sequence data and to provide a single point of entry for the scientific community to access and learn about GSC checklists, the minimum information about any (x) sequence (MIxS) is presented.
CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes.
TLDR
An objective measure of genome quality is proposed that can be used to select genomes suitable for specific gene- and genome-centric analyses of microbial communities and is shown to provide accurate estimates of genome completeness and contamination and to outperform existing approaches.
SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing
TLDR
SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies.
Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes
TLDR
Reanalysis of published metagenomes reveals that differential coverage binning facilitates recovery of more complete and higher fidelity genome bins than other currently used methods, which are primarily based on sequence composition.
Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes
TLDR
This work presents a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses and co-inherited genetic entities and aids assembly of microbial genomes without the need for reference sequences.
Identifying contamination with advanced visualization and analysis practices: metagenomic approaches for eukaryotic genome assemblies
TLDR
This re-analysis of sequencing data generated for the tardigrade Hypsibius dujardini creates a holistic display of the eukaryotic genome assembly using DNA data originating from two groups and eleven sequencing libraries, and indicates that most contaminant scaffolds were assembled from Moleculo long-read libraries.
Community-wide analysis of microbial genome sequence signatures
TLDR
It is found that shared environmental pressures and interactions among coevolving organisms do not obscure genome signatures in acid mine drainage communities and genome signatures can be used to assign sequence fragments to populations, an essential prerequisite if metagenomics is to provide ecological and biochemical insights into the functioning of microbial communities.
Fast Identification and Removal of Sequence Contamination from Genomic and Metagenomic Datasets
TLDR
DeconSeq is a robust framework for the rapid, automated identification and removal of sequence contamination in longer-read datasets (150 bp mean read length) and allows scientists to automatically detect and efficiently remove unwanted sequence contamination from their datasets while eliminating critical limitations of current methods.
A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea
TLDR
The results strongly support the need for systematic ‘phylogenomic’ efforts to compile a phylogeny-driven ‘Genomic Encyclopedia of Bacteria and Archaea’ in order to derive maximum knowledge from existing microbial genome data as well as from genome sequences to come.
Genome Project Standards in a New Era of Sequencing
TLDR
There is an urgent need to distinguish good from poor data sets in genome sequences, as there is an ever-widening gap between drafted and finished genomes that only promises to continue.