Sensitive and fast mapping of di-base encoded reads
@article{Hormozdiari2011SensitiveAF, title={Sensitive and fast mapping of di-base encoded reads}, author={Farhad Hormozdiari and Faraz Hach and S{\"u}leyman Cenk Sahinalp and Evan E. Eichler and Can Alkan}, journal={Bioinform.}, year={2011}, volume={28}, pages={150} }
Previously, we used ‘--seed S20 -k 10000 -v 4’. With this update, PerM now achieves full sensitivity in our simulation experiment. With real datasets (Table 6), PerM tends to map more reads than Bowtie, but slightly fewer than Mapreads and SOCS. We apologize for the parameter sets we previously used for PerM, which resulted from our misinterpretation of its documentation. We now update the relevant rows in Tables 3 and 6 as follows. Table 3. Performance of PerM with simulated…
20 Citations
Accelerating read mapping with FastHASH
- Computer Science
- BMC Genomics
- 2013
A new algorithm, FastHASH, is proposed, which drastically improves the performance of the seed-and-extend type hash table based read mapping algorithms, while maintaining the high sensitivity and comprehensiveness of such methods.
The effects of sampling on the efficiency and accuracy of k-mer indexes: Theoretical and empirical comparisons using the human genome
- Computer Science
- PLoS ONE
- 2017
It is found that soft sampling significantly reduces both index size and query time with relatively small losses in query accuracy when identifying HSLAs, and a new model for sampling with BLAST is provided that predicts empirical retention rates with reasonable accuracy by modeling two key problem factors.
Boosting high throughput sequencing data compression algorithms using reordering
- Computer Science
- 2013
SCALCE is presented, a “boosting” scheme based on the Locally Consistent Parsing technique, which reorganizes the reads in a way that results in a higher compression speed and compression rate, independent of the compression algorithm in use and without using a reference genome.
mrsFAST-Ultra: a compact, SNP-aware mapper for high performance sequencing applications
- Computer Science
- Nucleic Acids Res.
- 2014
High throughput sequencing (HTS) platforms generate unprecedented amounts of data that introduce challenges for processing and downstream analysis. While tools that report the ‘best’ mapping location…
GRIM-Filter: Fast seed location filtering in DNA read mapping using processing-in-memory technologies
- Biology
- BMC Genomics
- 2018
This work proposes a novel seed location filtering algorithm, GRIM-Filter, optimized to exploit 3D-stacked memory systems that integrate computation within a logic layer stacked under memory layers, to perform processing-in-memory (PIM).
Structural Variant Calling
- Biology
- 2013
This work used whole-genome shotgun paired-end sequence data generated with both Illumina and Applied Biosystems SOLiD platforms from the genomes of six canid samples to estimate the fraction of the genome with segmental duplications.
ALPHA: A Novel Algorithm-Hardware Co-design for Accelerating DNA Seed Location Filtering
- Computer Science
- 2021
An algorithm-hardware co-design is proposed that exploits the data-reuse in the seed location filtering operation and, compared to the GRIM-Filter, cuts the number of memory accesses by 22-54%, which improves the overall performance and energy consumption.
Comparing fixed sampling with minimizer sampling when using k-mer indexes to find maximal exact matches
- Computer Science
- PLoS ONE
- 2018
It is argued that for any application where each shared k-mer occurrence must be processed, fixed sampling is the right sampling method.
Novel computational techniques for mapping and classifying Next-Generation Sequencing data. (Nouvelles techniques informatiques pour la localisation et la classification de données de séquençage haut débit)
- Computer Science
- 2016
This thesis presents novel computational techniques for read mapping and taxonomic classification of NGS reads and provides the first comprehensive overview of this method and demonstrates its qualities using Dynamic Mapping Simulator, a pipeline that compares various dynamic mapping scenarios to static mapping and iterative referencing.
References
mrsFAST: a cache-oblivious algorithm for short-read mapping
- Biology
- Nature Methods
- 2010
In almost all recent structural variation discovery (SVD) studies, short reads from a donor genome have been mapped to a reference genome as a first step. The accuracy of such an SVD study is directly correlated to this mapping step, which also provides the main computational bottleneck of the SVD study.
PerM: efficient mapping of short sequencing reads with periodic full sensitive spaced seeds
- Computer Science
- Bioinform.
- 2009
The mapping software, named PerM (Periodic Seed Mapping) is presented that uses periodic spaced seeds to significantly improve mapping efficiency for large reference genomes when compared with state-of-the-art programs.
Ultrafast and memory-efficient alignment of short DNA sequences to the human genome
- Computer Science
- Genome Biology
- 2008
Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds.
BFAST: An Alignment Tool for Large Scale Genome Resequencing
- Computer Science, Biology
- PLoS ONE
- 2009
It is shown BFAST can achieve substantially greater sensitivity of alignment in the context of errors and true variants, especially insertions and deletions, and minimize false mappings, while maintaining adequate speed compared to other current methods.
Fast and accurate short read alignment with Burrows–Wheeler transform
- Computer Science
- Bioinform.
- 2009
Burrows-Wheeler Alignment tool (BWA) is implemented, a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
SOAP: short oligonucleotide alignment program
- Biology
- Bioinform.
- 2008
The program SOAP is designed to handle the huge amounts of short reads generated by parallel sequencing using the new generation Illumina-Solexa sequencing technology. It supports multi-threaded parallel computing and has a batch module for multiple query sets.
Detection and characterization of novel sequence insertions using paired-end next-generation sequencing
- Biology
- Bioinform.
- 2010
The NovelSeq framework can be built as part of a general sequence analysis pipeline to discover multiple types of genetic variation (SNPs, structural variation, etc.); thus it requires significantly fewer computational resources than de novo sequence assembly.
Mapping short DNA sequencing reads and calling variants using mapping quality scores.
- Computer Science
- Genome Research
- 2008
This work describes MAQ, software that can build assemblies by mapping shotgun short reads to a reference genome, using quality scores to derive genotype calls of the consensus sequence of a diploid genome, e.g., from a human sample.
SHRiMP: Accurate Mapping of Short Color-space Reads
- Biology
- PLoS Comput. Biol.
- 2009
It is demonstrated that SHRiMP can accurately map reads to this highly polymorphic genome, while confirming high heterozygosity of C. savignyi in this second individual.
Technology-specific error signatures in the 1000 Genomes Project data
- Biology
- Human Genetics
- 2011
It is highlighted that different NGS platforms suit different practical applications differently well, and that NGS-based studies require stringent data quality control for their results to be valid, while the use of multiple NGS platforms may be more cost-efficient than relying upon a single technology alone.