Characterization, correction and de novo assembly of an Oxford Nanopore genomic dataset from Agrobacterium tumefaciens

@article{Deschamps2016CharacterizationCA,
  title={Characterization, correction and de novo assembly of an Oxford Nanopore genomic dataset from Agrobacterium tumefaciens},
  author={St{\'e}phane Deschamps and Joann Mudge and Connor Cameron and Thiruvarangan Ramaraj and Ajith Anand and Kevin A. Fengler and Kevin Hayes and Victor Llaca and Todd J. Jones and Gregory D. May},
  journal={Scientific Reports},
  year={2016},
  volume={6}
}
The MinION is a portable single-molecule DNA sequencing instrument that was released by Oxford Nanopore Technologies in 2014, producing long sequencing reads by measuring changes in ionic flow when single-stranded DNA molecules translocate through the pores. While MinION long reads have an error rate substantially higher than the ones produced by short-read sequencing technologies, they can generate de novo assemblies of microbial genomes, after an initial correction step that includes… 
Evaluation of assembly methods combining long-reads and short-reads to obtain Paenibacillus sp. R4 high-quality complete genome
TLDR
The results indicated that, for more accurate predictions of open reading frames, contigs in the assemblies using only PacBio reads also needed to be corrected using short reads with high-quality bases, and that repeat regions in genomes did not significantly affect the increase of mispredicted coding sequences via genome polishing.
Evaluation of long-read Nanopore sequencing in genome studies
TLDR
An evaluation of the Nanopore MinION platform for de novo assembly of the Bifidobacterium longum genome, and of the suitability of short AFLP fragments for correcting long reads containing up to 40% erroneous bases, demonstrates the high potential of these fragments for error correction.
De Novo Assembly of a New Solanum pennellii Accession Using Nanopore Sequencing [CC-BY]
TLDR
The generation of a comprehensive nanopore sequencing data set with a median read length of 11,979 bp for a self-compatible accession of the wild tomato species Solanum pennellii indicates that such long read sequencing data can be used to affordably sequence and assemble gigabase-sized plant genomes.
Shotgun metagenome data of a defined mock community using Oxford Nanopore, PacBio and Illumina technologies
TLDR
A comparison of shotgun metagenome sequencing and assembly metrics of a defined microbial mock community using the Oxford Nanopore Technologies (ONT) MinION, PacBio and Illumina sequencing platforms is reported.
Investigation of chimeric reads using the MinION
TLDR
It is found that at least 1.7% of reads prepared using the Nanopore LSK002 2D Ligation Kit include post-amplification chimeric elements, which has potential implications for other amplicon sequencing technologies, as the process is unlikely to be specific to the sample preparation used for nanopore sequencing.
Comparative analysis of targeted long read sequencing approaches for characterization of a plant’s immune receptor repertoire
TLDR
It is demonstrated how MinION data can be used for RenSeq, achieving results similar to those from PacBio, and how novel NLR gene fusions can be identified via a Nanopore RenSeq pipeline.
A world of opportunities with nanopore sequencing.
TLDR
A brief overview of nanopore sequencing technology is provided, the growing range of nanopore bioinformatics tools is described, and some of the most influential publications that have emerged over the last 2 years are highlighted.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
TLDR
This study characterized regions of DNA that remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes; the biological features found to interfere with assembly included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplications.
Resolving plasmid structures in Enterobacteriaceae using the MinION nanopore sequencer: assessment of MinION and MinION/Illumina hybrid data assembly approaches
TLDR
The analysis demonstrated the potential of using MinION sequencing technology to resolve important plasmid structures in Enterobacteriaceae species independent of and in conjunction with Illumina sequencing data.
...

References

Oxford Nanopore Sequencing, Hybrid Error Correction, and de novo Assembly of a Eukaryotic Genome
TLDR
The assembly with the long nanopore reads presents a much more complete representation of the features of the genome and correctly assembles gene cassettes, rRNAs, transposable elements, and other genomic features that were almost entirely absent in the Illumina-only assembly.
Hybrid error correction and de novo assembly of single-molecule sequencing reads
TLDR
This work introduces a correction algorithm and assembly strategy that uses short, high-fidelity sequences to correct the error in single-molecule sequences, leading to substantially better assemblies than current sequencing strategies.
Genome assembly using Nanopore-guided long and error-free DNA reads
TLDR
The hybrid strategy was able to generate NaS (Nanopore Synthetic-long) reads up to 60 kb that aligned entirely and with no error to the reference genome and that spanned highly conserved repetitive regions, in contrast to an Illumina-only assembly.
Evaluation and Validation of Assembling Corrected PacBio Long Reads for Microbial Genome Completion via Hybrid Approaches
TLDR
Evaluation of the contemporary hybrid approaches shows that assembling the ECTools-corrected long reads via runCA generates near complete microbial genomes, suggesting that genome assembly could benefit from re-analyzing the available hybrid datasets that were not assembled in an optimal fashion.
Scaffolding of a bacterial genome using MinION nanopore sequencing
TLDR
It is shown that the MinION system produces long reads with high mapability that can be used for scaffolding bacterial genomes, despite currently producing substantially higher error rates than PacBio reads.
Evaluation of hybrid and non-hybrid methods for de novo assembly of nanopore reads
TLDR
Results show that hybrid methods are highly dependent on the quality of NGS data but much less on the quality and coverage of nanopore data, and perform relatively well at lower nanopore coverage, while non-hybrid methods correctly assemble the E. coli genome when coverage is above 40x.
SSPACE-LongRead: scaffolding bacterial draft genomes using long read sequence information
TLDR
The current work describes the SSPACE-LongRead software, which is designed to upgrade incomplete draft genomes using single-molecule sequences, and concludes that recent advances in PacBio sequencing technology and chemistry, in combination with the limited computational resources required to run the program, allow genomes to be scaffolded in a fast and reliable manner.
De novo sequencing and variant calling with nanopores using PoreSeq
TLDR
PoreSeq is described, an algorithm that identifies and corrects errors in nanopore sequencing data and improves the accuracy of de novo genome assembly with increasing coverage depth.
Pilon: An Integrated Tool for Comprehensive Microbial Variant Detection and Genome Assembly Improvement
TLDR
Pilon is a fully automated, all-in-one tool for correcting draft assemblies and calling sequence variants of multiple sizes, including very large insertions and deletions, which is being used to improve the assemblies of thousands of new genomes and to identify variants from thousands of clinically relevant bacterial strains.
Assessing the performance of the Oxford Nanopore Technologies MinION
...