A review of somatic single nucleotide variant calling algorithms for next-generation sequencing data

@article{Xu2018ARO,
  title={A review of somatic single nucleotide variant calling algorithms for next-generation sequencing data},
  author={Chang Xu},
  journal={Computational and Structural Biotechnology Journal},
  year={2018},
  volume={16},
  pages={15--24}
}
  • Chang Xu
  • Published 6 February 2018
  • Computer Science
  • Computational and Structural Biotechnology Journal

Comparison of somatic variant detection algorithms using Ion Torrent targeted deep sequencing data
TLDR
Caution should be taken when applying state-of-the-art somatic variant algorithms to Ion Torrent targeted deep sequencing data to ensure that results from bioinformatics pipelines using Ion Torrent deep sequencing can be robustly applied in cancer research and in the clinic.
Best practices for variant calling in clinical sequencing
TLDR
The relative strengths and weaknesses of panel, exome, and whole-genome sequencing for variant detection and recommended tools and strategies for calling variants of different classes are provided, along with guidance on variant review, validation, and benchmarking.
Identification of single nucleotide variants using position-specific error estimation in deep sequencing data
TLDR
AmpliSolve is a new tool for in-silico estimation of background noise and for detection of low frequency SNVs in targeted deep sequencing data and can, in principle, be applied to other sequencing platforms as well.
Accuracy and reproducibility of somatic point mutation calling in clinical-type targeted sequencing data
TLDR
Reproducibility and accuracy of targeted clinical sequencing results depend less on sequencing platform and panel than on variability between replicates and downstream bioinformatics.
Accuracy and Reproducibility of Somatic Point Mutation Calling in Clinical-Type Targeted Sequencing Data
TLDR
Reproducibility and accuracy of targeted clinical sequencing results depend less on sequencing platform and panel than on downstream bioinformatics and biological variability.
Identification of single nucleotide variants using position-specific error estimation in deep sequencing data
TLDR
AmpliSolve is a new tool for in-silico estimation of background noise and for detection of low frequency SNVs in targeted deep sequencing data that achieves a good trade-off between precision and sensitivity.
Comprehensive Outline of Whole Exome Sequencing Data Analysis Tools Available in Clinical Oncology
TLDR
Analysis tools enabling the use of WES data in clinical and research settings are reviewed, along with the limitations of WES, including its restricted ability to detect CNVs, low coverage compared to targeted sequencing, and the lack of consensus regarding references and minimal application requirements.
SomatoSim: precision simulation of somatic single nucleotide variants
TLDR
SomatoSim is a user-friendly tool that offers a high level of customizability for simulating somatic single nucleotide variants in sequence alignment map (SAM/BAM) files with full control of the specific variant positions, number of variants, variant allele fractions, depth of coverage, read quality, and base quality.

References

SHOWING 1-10 OF 98 REFERENCES
Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data
TLDR
EBCall, Mutect, Virmid and Strelka are reported to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing; for indel calling, EBCall is superior due to high sensitivity and robustness to changes in sequencing depths.
Evaluating Variant Calling Tools for Non-Matched Next-Generation Sequencing Data
TLDR
There is a need to improve reproducibility of the results in the context of multithreading of variant calling tools regarding their ability to call single nucleotide variants and short indels with allelic frequencies as low as 1% in non-matched next-generation sequencing data.
Detecting somatic point mutations in cancer genome sequencing data: a comparison of mutation callers
TLDR
This study explored the typical false-positive and false-negative detections that arise from the use of sSNV-calling tools, and suggested that despite recent progress, these tools have significant room for improvement, especially in the discrimination of low coverage/allelic-frequency sSNVs and sSNVs with alternate alleles in normal samples.
Comparison of somatic mutation calling methods in amplicon and whole exome sequence data
TLDR
It is demonstrated that the five commonly used somatic SNV calling methods are applicable to both targeted amplicon and exome sequencing data; however, the sensitivities of these methods vary based on the allelic fraction of the mutation in the tumor sample.
SNooPer: a machine learning-based method for somatic variant identification from low-pass next-generation sequencing
TLDR
The SNooPer algorithm is shown to be unaffected by low coverage or low VAFs, and can be used to reduce overall sequencing costs while maintaining high specificity and sensitivity in somatic variant calling.
In-depth comparison of somatic point mutation callers based on different tumor next-generation sequencing depth data
TLDR
Four popular somatic single nucleotide variant (SNV) calling methods were carefully evaluated on real whole exome sequencing and ultra-deep targeted sequencing data to provide a valuable benchmark for state-of-the-art SNV calling methods.
A somatic reference standard for cancer genome sequencing
TLDR
Paired PCR-free whole genome sequencing of a matched metastatic melanoma cell line (COLO829) and normal across three lineages and across separate institutions is performed as an initial standard for enabling quantitative evaluation of somatic mutation pipelines across institutions.
VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research
TLDR
VarDict will greatly facilitate application of NGS in clinical cancer research and performs amplicon aware variant calling for polymerase chain reaction (PCR)-based targeted sequencing often used in diagnostic settings, and is able to detect PCR artifacts.
An empirical Bayesian framework for somatic mutation detection from cancer genome sequencing data
TLDR
Empirical Bayesian mutation Calling enables accurate calling of mutations with low allele frequencies harboured within a minor tumour subpopulation, thus allowing for the deciphering of fine substructures within a tumour specimen.
Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications
TLDR
The performance of Platypus is demonstrated by comparing with SAMtools and GATK on whole-genome and exome-capture data, by identifying de novo variation in 15 parent-offspring trios with high sensitivity and specificity, and by estimating human leukocyte antigen genotypes directly from variant calls.