From sequence mapping to genome assemblies.

@article{Otto2015FromSM,
  title={From sequence mapping to genome assemblies.},
  author={Thomas Dan Otto},
  journal={Methods in molecular biology},
  year={2015},
  volume={1201},
  pages={19-50}
}
  • T. Otto
  • Published 2015
  • Biology
  • Methods in molecular biology
The development of "next-generation" high-throughput sequencing technologies has made it possible for many labs to undertake sequencing-based research projects that were unthinkable just a few years ago. Although the scientific applications are diverse (e.g., new genome projects, gene expression analysis, genome-wide functional screens, or epigenetics), the sequence data are usually processed in one of two ways: sequence reads are either mapped to an existing reference sequence, or they are built…
Resequencing the Escherichia coli genome by GenoCare single molecule sequencing platform
TLDR
The new GenoCare single molecule sequencing platform from Direct Genomics is used to resequence the E. coli genome and shows comparable performance to the Illumina MiSeq system.
An improved Plasmodium cynomolgi genome assembly reveals an unexpected methyltransferase gene expansion
TLDR
The high quality and contiguity of the data have enabled the discovery of a novel expansion of methyltransferase genes in the subtelomeres, and illustrate the new comparative genomics capabilities that are being unlocked by complete reference genomes.
A new reference sequence with improved Plasmodium vivax assembly of the subtelomeres reveals an abundance of pir
TLDR
An extensive repertoire of over 1200 Plasmodium interspersed repeat (pir) genes was identified in PvP01 compared to 346 in Salvador-I, suggesting a vital role in parasite survival or development.
Genomes of all known members of a Plasmodium subgenus reveal paths to virulent human malaria
TLDR
It is concluded that interspecific gene transfers, as well as convergent evolution, were important in the evolution of the Laverania subgenus and features of the human-infecting Plasmodium falciparum species that enable parasite transmission in humans.
Genomes of an entire Plasmodium subgenus reveal paths to virulent human malaria
TLDR
It is concluded that interspecific gene transfers as well as convergent evolution were important in the evolution of these species, and the beginning of speciation is estimated at 40,000-60,000 years ago, followed by a population bottleneck around 4,000-6,000 years ago.
A new Plasmodium vivax reference sequence with improved assembly of the subtelomeres reveals an abundance of pir genes
TLDR
The manually curated PvP01 reference and PvC01 and PvT01 draft assemblies are important new resources to study vivax malaria, suggesting a vital role in parasite survival or development.
Chaos inspired Particle Swarm Optimization with Levy Flight for Genome Sequence Assembly
TLDR
A new variant of PSO is proposed to address this permutation-optimization problem of computational biology, and it is demonstrated that the proposed model attains better performance, with better reliability and consistency, than other competitive methods in all cases.
Characterisation of the rif and stevor multigene families in Plasmodium falciparum isolates sampled from natural infections
TLDR
The first extensive analysis of sequence diversity and expression patterns of rif and stevor variant gene families in African field isolates of P. falciparum shows that RIFINs are additional targets of naturally acquired antibodies that recognize the surface of parasite-infected red blood cells.
Genomes of all known members of a Plasmodium subgenus reveal paths to virulent human malaria
TLDR
Plasmodium falciparum, the most virulent agent of human malaria, is a human-infecting species of the Laverania subgenus.

References

Limitations of next-generation genome sequence assembly
TLDR
It is concluded that high-quality sequencing approaches must be considered in conjunction with high-throughput sequencing for comparative genomics analyses and studies of genome evolution.
A post-assembly genome-improvement toolkit (PAGIT) to obtain annotated genomes from contigs
TLDR
This protocol describes software (PAGIT) that is used to improve the quality of draft genomes and offers flexible functionality to close gaps in scaffolds, correct base errors in the consensus sequence and exploit reference genomes in order to improve scaffolding and generating annotations.
RATT: Rapid Annotation Transfer Tool
TLDR
A method to rapidly provide accurate annotation for new genomes using previously annotated genomes as a reference, implemented in a tool called RATT (Rapid Annotation Transfer Tool), transfers annotations from a high-quality reference to a new genome on the basis of conserved synteny.
Projector 2: contig mapping for efficient gap-closure of prokaryotic genome sequence assemblies
TLDR
It is demonstrated that the use of single or multiple fragments of a template genome in combination with repeat-masking results in mapping success rates close to 100% and makes closing the remaining gaps in prokaryotic genome sequence assemblies very efficient and virtually effortless.
ABACAS: algorithm-based automatic contiguation of assembled sequences
TLDR
ABACAS is intended as a tool to rapidly contiguate (align, order, orientate), visualize and design primers to close gaps on shotgun assembled contigs based on a reference sequence.
Scaffolding pre-assembled contigs using SSPACE
TLDR
A new tool, called SSPACE, which is a stand-alone scaffolder of pre-assembled contigs using paired-read data with a short runtime, multiple library input of paired-end and/or mate pair datasets and possible contig extension with unmapped sequence reads.
ABySS: a parallel assembler for short read sequence data.
TLDR
ABySS (Assembly By Short Sequences), a parallelized sequence assembler, was developed and assembled 3.5 billion paired-end reads from the genome of an African male publicly released by Illumina, Inc, representing 68% of the reference human genome.
BamView: visualizing and interpretation of next-generation sequencing read alignments
TLDR
BamView allows the user to study NGS data in the context of the sequence and annotation of the reference genome, and single nucleotide polymorphism (SNP) density and candidate SNP sites can be highlighted and investigated, and read-pair information can be used to discover large structural insertions and deletions.
Efficient de novo assembly of large genomes using compressed data structures.
TLDR
A new assembler based on the overlap-based string graph model of assembly, SGA (String Graph Assembler), which provides the first practical assembler for a mammalian-sized genome on a low-end computing cluster and is simply parallelizable.
Hierarchical scaffolding with Bambus.
TLDR
This work has developed a general-purpose scaffolder, called Bambus, which affords users significant flexibility in controlling the scaffolding parameters and enables the use of linking data other than that inferred from mate-pair information.