Which is faster: bowtie2GP > bowtie > bowtie2 > BWA

@article{Langdon2013WhichIF,
  title={Which is faster: bowtie2GP > bowtie > bowtie2 > BWA},
  author={William B. Langdon},
  journal={Proceedings of the 15th annual conference companion on Genetic and evolutionary computation},
  year={2013}
}
  • W. Langdon
  • Published 22 January 2013
  • Biology
  • Proceedings of the 15th annual conference companion on Genetic and evolutionary computation
We have recently used genetic programming to automatically generate an improved version of Langmead's DNA read alignment tool Bowtie2 [RN/12/09, Sect.5.3]. We find that it runs more than four times faster than the bioinformatics sequencing tool (BWA) currently used with short next-generation paired-end DNA sequences by the Cancer Institute, takes less memory, and yet finds similar matches in the human genome.


Performance of genetic programming optimised Bowtie2 on genome comparison and analytic testing (GCAT) benchmarks

A genetic programming based automated technique was demonstrated which generated a version of the state-of-the-art alignment tool Bowtie2 which was considerably faster on short sequences produced by a scanner at the Broad Institute and released as part of The Thousand Genome Project.

Chloroplast Genome Sequencing, Comparative Analysis, and Discovery of Unique Cytoplasmic Variants in Pomegranate (Punica granatum L.)

The contraction and expansion analysis revealed that the structural variations in IRs, LSC, and SSC have significantly accounted for the evolution of cp genomes of Punica and L. intermedia over the periods.

A search for improved performance in regular expressions

In this paper, Genetic Programming is used to find performance improvements in regular expressions for an array of target programs, representing the first application of automated software improvement for run-time performance in the Regular Expression language.

RN/12/03: Evolving Human Competitive Spectra-Based Fault Localisation Techniques (08/05/2012)

GP-evolved equations can consistently outperform many of the human-designed formulæ, such as Tarantula, Ochiai, Jaccard, Ample, and Wong1/2, up to 5.9 times and even outperform it against other program structures.

Evolving Human Competitive Spectra-Based Fault Localisation Techniques

  • S. Yoo
  • Computer Science
  • SSBSE
  • 2012
Equations evolved by Genetic Programming can consistently outperform many of the human-designed formulae, such as Tarantula, Ochiai, Jaccard, Ample, and Wong1/2, up to 6 times and can perform equally as well as Op2, which was recently proved to be optimal against If-Then-Else-2 structure.

Investigating the Evolvability of Web Page Load Time

By exploring Javascript code changes and exploiting combinations of non-destructive changes, this work can optimise page load time by 41% in the authors' benchmark web page.

Using Genetic Programming to Model Software

It is shown genetic programming (GP) can evolve models of aspects of BLAST's output when it is used to map Solexa Next-Gen DNA sequences to the human genome.

References


Fast gapped-read alignment with Bowtie 2

Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

Fast and accurate long-read alignment with Burrows–Wheeler transform

A new algorithm, Burrows-Wheeler Aligner's Smith-Waterman Alignment (BWA-SW), to align long sequences up to 1 Mb against a large sequence database with a few gigabytes of memory is designed and implemented.

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.

Tools for mapping high-throughput sequencing data

This survey focuses on classifying mappers through a wide number of characteristics to allow practitioners to compare the mappers more easily and find those that are most suitable for their specific problem.

A Field Guide to Genetic Programming

A unique overview of this exciting technique is written by three of the most active scientists in GP, which starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination until high-fitness solutions emerge.

Genetically Improving 50000 Lines of C

This work evolved a widely-used and highly complex 50000 line system, seeking improved versions that are faster than the original, yet at least as good semantically, and found a version that is 70 times faster (on average) and is also a small semantic improvement on the original.

6.2-r131; Bowtie 0.12.7; Bowtie 2 2.0.0-beta2; Bowtie2 GP 2.0.0-beta2, updated by a 7-line patch as described in technical report [1]


Fonseca, Johan Rung, Alvis Brazma, and John C. Marioni. Tools for mapping high-throughput sequencing data

  • Bioinformatics
  • 2012