Even better correction of genome sequencing data
@article{Dlugosz2017EvenBC, title={Even better correction of genome sequencing data}, author={Maciej Dlugosz and Sebastian Deorowicz and Marek Kokot}, journal={ArXiv}, year={2017}, volume={abs/1703.00690} }
We introduce an improved version of RECKONER, an error corrector for Illumina whole genome sequencing data. By modifying its workflow, we reduce the computation time by up to a factor of 10. We also propose a new method for determining the $k$-mer length, the key parameter of the $k$-spectrum-based family of correctors. The correction algorithms are evaluated on very large data sets, namely human and maize genomes, for both Illumina HiSeq and MiSeq instruments.
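To make the role of the $k$-mer length concrete, here is a minimal sketch of the general $k$-spectrum idea: count every length-$k$ substring across the reads and treat rarely seen ones as likely sequencing errors. This is not RECKONER's algorithm; the function names, the toy reads, and the fixed count threshold are all assumptions chosen for illustration.

```python
from collections import Counter


def kmer_spectrum(reads, k):
    """Count every length-k substring (k-mer) over all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def weak_kmers(counts, threshold):
    """Return k-mers seen fewer than `threshold` times (hypothetical cutoff)."""
    return {kmer for kmer, c in counts.items() if c < threshold}


# Toy input: three identical reads and one read with a single substitution.
reads = ["ACGTACGT", "ACGTACGT", "ACGTACGT", "ACGTTCGT"]
spectrum = kmer_spectrum(reads, k=4)
weak = weak_kmers(spectrum, threshold=2)
```

In this toy input, only the k-mers that overlap the substituted base fall below the threshold. Choosing $k$ is a trade-off: a small $k$ makes erroneous k-mers collide with genuine ones, while a large $k$ lowers the counts of genuine k-mers.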
References
RECKONER: read error corrector based on KMC
- Biology, Bioinform.
- 2017
A new correction algorithm is introduced that can process high-error-rate data from eukaryotic genomes close to 500 Mbp in size, using less than 4 GB of RAM in about 35 min on a 16-core computer.
Correcting Illumina data
- Computer Science, Briefings Bioinform.
- 2015
A thorough comparison of the efficiency of the current state-of-the-art programs for correcting Illumina data and research directions for further improvement are provided.
Trowel: a fast and accurate error correction module for Illumina sequencing reads
- Computer Science, Bioinform.
- 2014
Trowel, a massively parallelized and highly efficient error correction module for Illumina read data that both corrects erroneous base calls and boosts base qualities based on the k-mer spectrum, achieves high accuracy for different short read sequencing applications.
RACER: Rapid and accurate correction of errors in reads
- Computer Science, Bioinform.
- 2013
This work proposes RACER (Rapid and Accurate Correction of Errors in Reads), a new software program for correcting errors in sequencing data that has better error-correcting performance than existing programs, is faster and requires less memory.
BFC: correcting Illumina sequencing errors
- Biology, Bioinform.
- 2015
BFC is a free, fast and easy-to-use sequencing error corrector designed for Illumina short reads. It uses a non-greedy algorithm but still maintains a speed comparable to implementations…
Mason – A Read Simulator for Second Generation Sequencing Data
- Computer Science
- 2010
A read simulator software for Illumina, 454 and Sanger reads that has been written with performance in mind and can sample reads from large genomes.
ACE: accurate correction of errors using K-mer tries
- Computer Science, Bioinform.
- 2015
ACE is a tool that uses K-mer tries to correct substitution errors in Illumina archives; it yields higher gains in terms of coverage depth, outperforming state-of-the-art competitors in the majority of cases.
Blue: correcting sequencing errors using consensus and context
- Computer Science, Bioinform.
- 2014
Blue is an error-correction algorithm based on k-mer consensus and context that can correct substitution, deletion and insertion errors, as well as uncalled bases, and is usable on large sequencing datasets.
Musket: a multistage k-mer spectrum-based error corrector for Illumina sequence data
- Computer Science, Bioinform.
- 2013
Musket uses the k-mer spectrum approach and introduces three correction techniques in a multistage workflow: two-sided conservative correction, one-sided aggressive correction and voting-based refinement. The article reports that Musket is consistently one of the top performing correctors for Illumina short-read data.
Karect: accurate correction of substitution, insertion and deletion errors for next-generation sequencing data
- Computer Science, Bioinform.
- 2015
Karect is a novel error correction technique based on multiple alignment that supports substitution, insertion and deletion errors and can handle non-uniform coverage as well as moderately covered areas of the sequenced genome.