High-throughput discovery of rare insertions and deletions in large cohorts.

@article{Vallania2010HighthroughputDO,
  title={High-throughput discovery of rare insertions and deletions in large cohorts.},
  author={Francesco Vallania and Todd E. Druley and Enrique Ramos and Jue Wang and Ingrid B. Borecki and Michael Province and Robi David Mitra},
  journal={Genome Research},
  year={2010},
  volume={20},
  number={12},
  pages={1711--1718}
}
Pooled-DNA sequencing strategies enable fast, accurate, and cost-effective detection of rare variants, but current approaches are not able to accurately identify short insertions and deletions (indels), despite their pivotal role in genetic disease. Furthermore, the sensitivity and specificity of these methods depend on arbitrary, user-selected significance thresholds, whose optimal values change from experiment to experiment. Here, we present a combined experimental and computational strategy…

Population-based rare variant detection via pooled exome or custom hybridization capture with or without individual indexing
TLDR
This highly scalable methodology enables accurate rare variant detection, with or without individual DNA sample indexing, while reducing the amount of required source DNA and total costs through less hybridization reagent consumption, multi-sample sonication in a standard PCR plate, multiplexed pre-enrichment pooling with a single hybridization and lesser sequencing coverage required to obtain high sensitivity.
Detection and quantification of rare mutations with massively parallel sequencing
TLDR
An approach that can substantially increase the sensitivity of massively parallel sequencing instruments for the identification of rare variants and the utility of this approach for determining the fidelity of a polymerase, the accuracy of oligonucleotides synthesized in vitro, and the prevalence of mutations in the nuclear and mitochondrial genomes of normal cells is described.
LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets
TLDR
It is shown that LoFreq has near-perfect specificity, with significantly improved sensitivity compared with existing methods and can efficiently analyze deep Illumina sequencing datasets without resorting to approximations or heuristics.
Rare variant discovery and calling by sequencing pooled samples with overlaps
TLDR
Successful discovery of rare variants and identification of variant carriers using overlapping pool strategies critically depend on many steps, from generation of design matrixes to decoding algorithms.
Next-Generation Sequencing for the Analysis of Cancer Specimens
TLDR
Amplification-based as well as hybrid capture-based methods for NGS testing can be used for the analysis of cancer specimens, and assays that target a panel of genes, the exome, or the genome have been developed.
Detection of Rare Genomic Variants from Pooled Sequencing Using SPLINTER
TLDR
A pooled sequencing approach that pools genomes from entire populations of affected individuals and surveys the degree of genetic variation at multiple targeted regions in a single sequencing library provides excellent cost and time savings over traditional single-sample sequencing methodology.
An Evaluation of Different Target Enrichment Methods in Pooled Sequencing Designs for Complex Disease Association Studies
TLDR
It is found that pooled resequencing is most usefully applied as a variant discovery tool due to limitations in estimating allele frequency with high enough accuracy for association studies, and that in-solution hybrid-capture performs best among the enrichment methods examined regardless of pool size.
Detection of somatic mutations in tumors using unaligned clonal sequencing data
Most cancers arise and evolve as a consequence of somatic mutations. These mutations influence tumor behavior and clinical outcome. Consequently, there is considerable interest in identifying somatic…

References

Overlapping pools for high-throughput targeted resequencing.
TLDR
This work presents a framework for overlapping pool design, where each individual sample is resequenced in several pools (many individuals to many pools), and guarantees high probability of unambiguous singleton carrier identification while maintaining the features of naïve pools in terms of sensitivity, specificity, and the ability to estimate allele frequencies.
A statistical method for the detection of variants from next-generation resequencing of DNA pools
TLDR
A novel statistical approach, CRISP [Comprehensive Read analysis for Identification of Single Nucleotide Polymorphisms from Pooled sequencing] that is able to identify both rare and common variants by using two approaches: comparing the distribution of allele counts across multiple pools using contingency tables and evaluating the probability of observing multiple non-reference base calls due to sequencing errors alone.
VarScan: variant detection in massively parallel sequencing of individual and pooled samples
TLDR
VarScan, an open source tool for variant detection that is compatible with several short read aligners, is presented and shown to detect SNPs and indels with high sensitivity and specificity, in both Roche/454 sequencing of individuals and deep Illumina/Solexa sequencing of pooled samples.
SNP discovery and allele frequency estimation by deep sequencing of reduced representation libraries
TLDR
An economical, efficient, single-step method for SNP discovery, validation and characterization that uses deep sequencing of reduced representation libraries (RRLs) from specified target populations and may be applied to any species with at least a partially sequenced genome.
Target-enrichment strategies for next-generation sequencing
TLDR
The experiences with the leading target-enrichment technologies, the optimizations that are performed, and typical results that can be obtained using each are described and detailed protocols for each are provided so that end users can find the best compromise between sensitivity, specificity and uniformity for their particular project.
Mapping short DNA sequencing reads and calling variants using mapping quality scores.
TLDR
This work describes MAQ, software that can build assemblies by mapping shotgun short reads to a reference genome, using quality scores to derive genotype calls of the consensus sequence of a diploid genome, e.g., from a human sample.
Genetic Variation in an Individual Human Exome
TLDR
This is the first glimpse of an individual's exome and a snapshot of the current state of personalized genomics, and presents an approach to analyze the coding variation in humans by proposing multiple bioinformatic methods to home in on possible functional variation.
The Human Gene Mutation Database: 2008 update
TLDR
Although originally established for the scientific study of mutational mechanisms in human genes, HGMD has since acquired a much broader utility for researchers, physicians, clinicians and genetic counselors as well as for companies specializing in biopharmaceuticals, bioinformatics and personalized genomics.
DNA Sudoku--harnessing high-throughput sequencing for multiplexed specimen analysis.
TLDR
This work reports a strategy that permits simultaneous analysis of tens of thousands of specimens through the use of combinatorial pooling strategies in which pools rather than individual specimens are assigned barcodes.