Generic accelerated sequence alignment in SeqAn using vectorization and multi‐threading

Authors: René Rahn, Stefan Budach, Pascal Costanza, Marcel Ehrhardt, Jonny Hancox, Knut Reinert
Motivation: Pairwise sequence alignment is undoubtedly a central tool in many bioinformatics analyses. [...] Key Method: In our module, we unified the standard dynamic programming kernel used for pairwise sequence alignments and extended it with a generalized inter‐sequence vectorization layout, such that many alignments can be computed simultaneously by exploiting SIMD (single instruction multiple data) instructions of modern processors.
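
The inter-sequence layout mentioned above can be illustrated with a small sketch. This is not SeqAn's implementation or API. It is a minimal C++ example, assuming linear gap costs, score-only global alignment, and a batch in which all queries share one length and all subjects share one length. The names `kLanes`, `Lanes`, and `batch_global_scores` are hypothetical. Each DP cell stores one score per alignment, so the innermost loop over lanes performs the same operation on independent data, which is the pattern that SIMD instructions (or a compiler's auto-vectorizer) can exploit.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical lane count; real SIMD widths depend on the instruction set.
constexpr std::size_t kLanes = 4;
using Lanes = std::array<int, kLanes>;

// Scores kLanes global alignments (linear gap costs) at once.
// Assumption for this sketch: all queries share one length and all
// subjects share one length, so every lane walks the same matrix shape.
Lanes batch_global_scores(const std::array<std::string, kLanes>& queries,
                          const std::array<std::string, kLanes>& subjects,
                          int match, int mismatch, int gap)
{
    const std::size_t n = queries[0].size();
    const std::size_t m = subjects[0].size();
    for (std::size_t l = 0; l < kLanes; ++l) {
        assert(queries[l].size() == n && subjects[l].size() == m);
    }

    // One DP column; each cell holds one score per lane.
    std::vector<Lanes> col(n + 1);
    for (std::size_t i = 0; i <= n; ++i) {
        col[i].fill(static_cast<int>(i) * gap);
    }

    for (std::size_t j = 1; j <= m; ++j) {
        Lanes diag = col[0];                      // H(0, j-1) for all lanes
        col[0].fill(static_cast<int>(j) * gap);   // H(0, j)
        for (std::size_t i = 1; i <= n; ++i) {
            const Lanes left = col[i];            // H(i, j-1), not yet overwritten
            const Lanes& up = col[i - 1];         // H(i-1, j), already updated
            Lanes cur;
            // Lane loop: identical arithmetic on independent alignments.
            for (std::size_t l = 0; l < kLanes; ++l) {
                const int sub =
                    (queries[l][i - 1] == subjects[l][j - 1]) ? match : mismatch;
                cur[l] = std::max({diag[l] + sub, up[l] + gap, left[l] + gap});
            }
            diag = left;                          // becomes H(i, j-1) for row i+1
            col[i] = cur;
        }
    }
    return col[n];                                // H(n, m) for every lane
}
```

Calling `batch_global_scores` with four query/subject pairs returns four scores in one pass over the matrix. A production layout would also need to handle sequences of different lengths, for example by padding or by grouping similar lengths into the same batch.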

AnySeq: A High Performance Sequence Alignment Library based on Partial Evaluation
The approach combines high performance with an intuitively understandable implementation, which is achieved through the concept of partial evaluation, and enables the compilation of algorithmic variants that are highly optimized for specific usage scenarios and hardware targets with a single, uniform codebase.
A Review of Parallel Implementations for the Smith–Waterman Algorithm
The current research status of parallel local alignment is summarized and the data layouts used in these works are described, with an emphasis on large-scale genomic comparisons.
Fast gap-affine pairwise alignment using the wavefront algorithm
The wavefront alignment algorithm (WFA) is presented, an exact gap-affine algorithm that takes advantage of homologous regions between the sequences to accelerate the alignment process and exhibits simple data dependencies that can be easily vectorized.
BGSA: a bit-parallel global sequence alignment toolkit for multi-core and many-core architectures
The BGSA toolkit for optimized implementations of popular bit-parallel global pairwise alignment algorithms on modern microprocessors is presented; it outperforms Edlib, SeqAn, and BitPAl for pairwise edit distance computations.
ADEPT: a domain independent sequence alignment strategy for gpu architectures
ADEPT is a new sequence alignment strategy for GPU architectures that is domain independent, supporting alignment of sequences from both genomes and proteins, and demonstrates performance that is comparable to or better than existing GPU strategies.
AnySeq/GPU: a novel approach for faster sequence alignment on GPUs
AnySeq/GPU is presented, a sequence alignment library that augments the AnySeq 1 library with a novel approach for accelerating dynamic programming (DP) alignment on GPUs by minimizing memory accesses using warp shuffles and half-precision arithmetic and achieves over 80% of the peak performance on both NVIDIA and AMD GPUs.
SLPal: Accelerating Long Sequence Alignment on Many-Core and Multi-Core Architectures
SLPal, a fast bit-parallel algorithm for accelerating long DNA sequence comparison on Intel many-core and multi-core architectures, is proposed, which achieves a stable performance for all benchmark data and yields a performance of up to 511.7 (617.2) GCUPS on a server with a single Xeon Phi 7210 processor.
GASAL2: a GPU accelerated sequence alignment library for high-throughput NGS data
The paper shows how to use GASAL2 to accelerate BWA-MEM, speeding up the local alignment by 20x, which gives an overall application speedup of 1.3x vs. CPU with up to 12 threads.
Parallel Fine-Grained Comparison of Long DNA Sequences in Homogeneous and Heterogeneous GPU Platforms With Pruning
The proposed MultiBP, a sequence comparison solution for multiple GPUs with block pruning, was integrated into MASA-CUDAlign and tested on homogeneous and heterogeneous platforms with different NVIDIA GPU architectures.
Vargas: heuristic-free alignment for assessing linear and graph read aligners
Vargas implements a heuristic-free algorithm guaranteed to find the highest-scoring alignment for real sequencing reads to a linear or graph genome, and it is demonstrated how these “gold standard” Vargas alignments can be used to improve heuristic alignment accuracy by optimizing command-line parameters in Bowtie 2, BWA-MEM, and vg to align more reads correctly.


Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments
  • J. Daily
  • Computer Science
    BMC Bioinformatics
  • 2016
For the first time, SIMD global, semi-global, and local alignments are available in a stand-alone C library and applications that require optimal alignment scores could benefit from the improved performance.
Flexbar 3.0 ‐ SIMD and multicore parallelization
Flexbar 3.0, the successor of the popular program Flexbar, now employs twofold parallelism: multi‐threading and, additionally, SIMD vectorization. The latter speeds up the computation of the pairwise sequence alignments used to detect barcodes and adapters.
Protein alignment algorithms with an efficient backtracking routine on multiple GPUs
The article shows that the backtracking procedure of the sequence alignment algorithms may be designed to fit in with the GPU architecture, which opens a wide range of new possibilities, allowing other methods from the area of molecular biology to take advantage of the new computational architecture.
Parallel biological sequence alignments on the Cell Broadband Engine
  • Abhinav Sarje, S. Aluru
  • Biology, Computer Science
    2008 IEEE International Symposium on Parallel and Distributed Processing
  • 2008
A comprehensive study of developing sequence alignment algorithms on the Cell that exploit its thread- and data-level parallelism features is presented, along with Cell implementations of two advanced alignment techniques: spliced alignments and syntenic alignments.
Retrieving Smith-Waterman Alignments with Optimizations for Megabase Biological Sequences Using GPU
This paper proposes and evaluates CUDAlign 2.1, a parallel algorithm that uses GPU to align huge sequences, executing the Smith-Waterman algorithm combined with Myers-Miller, with linear space complexity, and proposes optimizations which are able to reduce significantly the amount of data processed, while enforcing full parallelism most of the time.
SWAPHI-LS: Smith-Waterman Algorithm on Xeon Phi coprocessors for Long DNA Sequences
SWAPHI-LS is presented, the first parallel SW algorithm exploiting emerging Xeon Phi coprocessors to accelerate the alignment of long DNA sequences and achieves a stable performance of up to 30.1 billion cell updates per second on a single Xeon Phi and up to 111.4 GCUPS on four Xeon Phis sharing the same host.
SW#–GPU-enabled exact alignments on genome scale
SW#, a new CUDA graphics processing unit-enabled and memory-efficient implementation of a dynamic programming algorithm for local alignment, is proposed, which is the only publicly available one that can produce sequence alignments on a genome-wide scale.
Segment-based multiple sequence alignment
The main problem is to define segments of the sequences in such a way that a graph-based alignment is possible, and the consistency idea can be extended to align multiple genomic sequences.
SWPS3 – fast multi-threaded vectorized Smith-Waterman for IBM Cell/B.E. and ×86/SSE2
Benchmarking results show that swps3 is currently the fastest implementation of a vectorized Smith-Waterman on the Cell/BE, outperforming the only other known implementation by a factor of at least 4.