• Corpus ID: 16456320

Maximum Likelihood de novo reconstruction of viral populations using paired end sequencing data

@article{Malhotra2015MaximumLD,
  title={Maximum Likelihood de novo reconstruction of viral populations using paired end sequencing data},
  author={Raunaq Malhotra and Manjari Mukhopadhyay and Steven Wu and Allen G. Rodrigo and Mary Poss and Raj Acharya},
  journal={arXiv: Populations and Evolution},
  year={2015}
}
We present MLEHaplo, a maximum likelihood de novo assembly algorithm for reconstructing viral haplotypes in a virus population from paired-end next generation sequencing (NGS) data. Using the pairing information of reads in our proposed Viral Path Reconstruction Algorithm (ViPRA), we generate a small subset of paths from a De Bruijn graph of reads that serve as candidate paths for true viral haplotypes. Our proposed method MLEHaplo then generates a maximum likelihood estimate of the viral… 
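The abstract describes generating candidate haplotype paths from a De Bruijn graph of reads. As a rough illustration of that general idea only, the sketch below builds a De Bruijn graph from a few toy reads and spells every source-to-sink path. It is not ViPRA or MLEHaplo: it ignores paired-end information and likelihoods, assumes error-free reads and an acyclic graph, and the names `build_de_bruijn` and `enumerate_paths` are hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def enumerate_paths(graph):
    """Spell the sequence of every source-to-sink path.

    Assumption for this toy sketch: the graph is acyclic, so the
    recursion terminates.
    """
    successors = {s for targets in graph.values() for s in targets}
    # Sources are nodes with outgoing edges but no incoming edges.
    sources = sorted(node for node in graph if node not in successors)
    sequences = []

    def walk(node, spelled):
        nexts = sorted(graph.get(node, ()))
        if not nexts:  # sink reached: record the spelled sequence
            sequences.append(spelled)
            return
        for nxt in nexts:
            # Each step extends the sequence by the last base of the next node.
            walk(nxt, spelled + nxt[-1])

    for src in sources:
        walk(src, src)
    return sequences


# Toy reads drawn from two hypothetical haplotypes that differ at one position.
reads = ["ACGTTG", "GTTGCA", "ACGATG", "GATGCA"]
candidates = enumerate_paths(build_de_bruijn(reads, k=4))
```

With a single variant site, the enumerated paths coincide with the two toy haplotypes. With several well-separated variant sites, the number of source-to-sink paths grows multiplicatively, which is one reason a method would want additional evidence, such as read pairing, to narrow the candidate set.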

Evaluation of haplotype callers for next-generation sequencing of viruses
TLDR
It is concluded that haplotype reconstruction from NGS short reads is unreliable due to the high genetic diversity of fast-evolving viruses, and that local haplotype reconstruction from longer reads to phase variants may provide a more reliable estimation of viral variants within a population.
  • A. Eliseev, K. M. Gibson, +5 authors K. Crandall
  • Medicine, Biology
    Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases
  • 2020
De novo haplotype reconstruction in viral quasispecies using paired-end read guided path finding
TLDR
This work developed a de novo haplotype reconstruction tool PEHaplo for viral quasispecies data, which contains a group of related but different viral strains and employs paired-end reads to distinguish highly similar strains.
Full-length de novo viral quasispecies assembly through variation graph construction
TLDR
Virus-VG is presented as a de novo approach to viral haplotype reconstruction from pre-assembled contigs and shows significant improvements in assembly contiguity compared to the input contigs, while preserving low error rates compared to the state-of-the-art viral quasispecies assemblers.
De novo assembly of viral quasispecies using overlap graphs
TLDR
This work presents SAVAGE, a computational tool for reconstructing individual haplotypes of intrahost virus strains without the need for a high-quality reference genome, and applies it on two deep coverage samples of patients infected by the Zika and the hepatitis C virus, which sheds light on the genetic structures of the respective viral quasispecies.
ViQUF: de novo Viral Quasispecies reconstruction using Unitig-based Flow networks
TLDR
ViQUF is a de novo viral quasispecies assembler that addresses haplotype assembly and quantification and is at least four times faster using at most half of the memory than previous methods, while maintaining, and in some cases outperforming, the high quality of assembly and frequency estimation of overlap graph-based methodologies.
De novo assembly of viral quasispecies using overlap graphs.
TLDR
This work presents SAVAGE, a computational tool for reconstructing individual haplotypes of intra-host virus strains without the need for a high-quality reference genome, and applies it on two deep-coverage samples of patients infected by the Zika and the hepatitis C virus, which sheds light on the genetic structures of the respective viruses.
Inference of viral quasispecies with a paired de Bruijn graph
Motivation: RNA viruses exhibit a high mutation rate and thus they exist in infected cells as a population of closely related strains called viral quasispecies. The viral quasispecies assembly problem…
A binning tool to reconstruct viral haplotypes from assembled contigs
TLDR
VirBin is a contig binning tool that clusters contigs into different groups so that each group represents a haplotype, and it competes favorably with other tools on viral contig binning for viral haplotype reconstruction.
A binning tool to reconstruct viral haplotypes from assembled contigs
TLDR
VirBin is a contig binning tool that clusters contigs into different groups so that each group represents a haplotype; the work demonstrates the superior sensitivity and precision of VirBin in contig binning for viral haplotype reconstruction.

References

SHOWING 1-10 OF 49 REFERENCES
Full-length haplotype reconstruction to infer the structure of heterogeneous virus populations
TLDR
It is shown that assembly of whole viral genomes of ∼8600 nucleotides length is feasible from mixtures of heterogeneous HIV-1 strains derived from defined combinations of cloned virus strains and from clinical samples of an HIV-1 superinfected individual.
Reconstruction of viral population structure from next-generation sequencing data using multicommodity flows
TLDR
The problem of viral population reconstruction from amplicon or shotgun NGS reads was solved using the MCF formulation, and two new methods, AmpMCF and ShotMCF, for reconstruction of the whole-genome intra-host viral variants and estimation of their frequencies were developed, based on Multicommodity Flows.
Viral Quasispecies Assembly via Maximal Clique Enumeration
TLDR
HaploClique, a computational approach to reconstruct the structure of a viral quasispecies from next-generation sequencing data as obtained from bulk sequencing of mixed virus samples, is presented and compares favorably to state-of-the-art haplotype inference methods.
QuRe: software for viral quasispecies reconstruction from next-generation sequencing data
TLDR
QuRe is a program for viral quasispecies reconstruction, specifically developed to analyze long read NGS data, and comes with a built-in Poisson error correction method and a post-reconstruction probabilistic clustering, both parameterized on given error rates in homopolymeric and non-homopolymeric regions.
Accurate viral population assembly from ultra-deep sequencing data
TLDR
VGA is the first viral assembly method that scales to millions of sequencing reads and outperforms state-of-the-art methods for genome-wide viral assembly and detects rare variants previously undetectable due to sequencing errors.
Viral Population Estimation Using Pyrosequencing
TLDR
It is demonstrated that pyrosequencing reads allow for effective population reconstruction by extensive simulations and by comparison to 165 sequences obtained directly from clonal sequencing of four independent, diverse HIV populations.
De novo assembly of highly diverse viral populations
TLDR
VICUNA is a publicly available software tool that enables consensus assembly of ultra-deep sequence derived from diverse viral populations; its application to other heterogeneous sequence data sets, such as metagenomic or tumor cell population samples, may prove beneficial in these fields of research.
Combinatorial analysis and algorithms for quasispecies reconstruction using next-generation sequencing
TLDR
The combinatorial analysis provided a description of the difficulty to reconstruct a quasispecies, given a determined amplicon partition and a measure of population diversity, and the reconstruction algorithm showed good performance both considering simulated data and real data, even in presence of sequencing errors.
Read length versus Depth of Coverage for Viral Quasispecies Reconstruction
TLDR
This work investigates how the differences between two common platforms provided by 454/Roche and Illumina affect viral diversity estimation and the reconstruction of viral haplotypes and provides guidance for the design of viral diversity studies.
Benchmarking of viral haplotype reconstruction programmes: an overview of the capacities and limitations of currently available programmes
TLDR
An independent benchmarking study directly compares the currently available viral haplotype reconstruction programmes and develops a novel statistical framework to demonstrate the strengths and limitations of the programmes.