GenPIP: In-Memory Acceleration of Genome Analysis via Tight Integration of Basecalling and Read Mapping

@article{Mao2022GenPIPIA,
  title={GenPIP: In-Memory Acceleration of Genome Analysis via Tight Integration of Basecalling and Read Mapping},
  author={Haiyu Mao and Mohammed H. Alser and Mohammad Sadrosadati and Can Firtina and Akanksha Baranwal and Damla Senol Cali and Aditya Manglik and Nour Almadhoun Alserr and Onur Mutlu},
  journal={2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO)},
  year={2022},
  pages={710--726}
}
  • Haiyu Mao, M. Alser, …, O. Mutlu
  • Published 18 September 2022
  • Biology, Engineering, Computer Science
  • 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO)
Nanopore sequencing is a widely used high-throughput genome sequencing technology that can sequence long fragments of a genome into raw electrical signals at low cost. Nanopore sequencing requires two computationally costly processing steps for accurate downstream genome analysis. The first step, basecalling, translates the raw electrical signals into nucleotide bases (i.e., A, C, G, T). The second step, read mapping, finds the correct location of a read in a reference genome. In existing…

ETH 263-2210-00L Computer Architecture, Fall 2022 HW 1: Processing-in-memory (SOLUTIONS)

• Hand in Critical Paper Reviews (1). You need to submit your reviews to https://safari.ethz.ch/review/architecture22/. Please check your inbox; you should have received an email with the password

RawHash: Enabling Fast and Accurate Real-Time Analysis of Raw Nanopore Signals for Large Genomes

This work proposes RawHash, the first mechanism that can accurately and efficiently perform real-time analysis of nanopore raw signals for large genomes using a hash-based similarity search, and shows that RawHash is the only tool that can provide high accuracy and high throughput for analyzing large genomes in real time.

References


GenStore: a high-performance in-storage processing system for genome sequence analysis

GenStore is proposed, the first in-storage processing system designed for genome sequence analysis that greatly reduces both data movement and computational overheads of genome sequence analysis by exploiting low-cost and accurate in-storage filters.

GenAx: A Genome Sequencing Accelerator

GenAx is presented, an accelerator for read alignment, a time-consuming step in genome sequencing. It achieves a 31.7× speedup over the standard BWA-MEM sequence aligner running on a 56-thread dual-socket 14-core Xeon E5 server processor, while reducing power consumption and area.

GenASM: A High-Performance, Low-Power Approximate String Matching Acceleration Framework for Genome Sequence Analysis

GenASM is proposed, the first ASM acceleration framework for genome sequence analysis that accelerates read alignment for both long reads and short reads and accelerates pre-alignment filtering for short reads, and it is demonstrated that GenASM provides significant performance and power benefits for three different use cases in genome sequence analysis.

GateKeeper: a new hardware architecture for accelerating pre‐alignment in DNA short read mapping

GateKeeper is the first design to accelerate pre‐alignment using Field‐Programmable Gate Arrays (FPGAs), which can perform pre‐alignment much faster than software. It maintains high accuracy while providing 90‐fold and 130‐fold speedup over the state‐of‐the‐art software pre‐alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively.

SeedEx: A Genome Sequencing Accelerator for Optimal Alignments in Subminimal Space

This paper presents SeedEx, a read-alignment accelerator focused on the seed-extension step that achieves a 6.0× iso-area throughput speedup compared to a banded Smith-Waterman baseline, and achieves 43.9 M seed extensions/s on an AWS f1.2xlarge instance.

BRAWL: A Spintronics-Based Portable Basecalling-in-Memory Architecture for Nanopore Genome Sequencing

BRAWL, a portable basecalling-in-memory architecture, is proposed to translate electrical signals to digital DNA symbols in SOT-MRAMs for Nanopore portable sequencers, and improves basecalling throughput per Watt by 3.88 times.

Efficient Architecture-Aware Acceleration of BWA-MEM for Multicore Systems

This work improves the performance of the three kernels of BWA-MEM by using techniques to improve cache reuse, simplifying the algorithms, and replacing many small memory allocations with a few large contiguous ones to improve hardware prefetching of data, with a focus on performance improvements on a single-socket multicore processor.

Hardware Acceleration of Long Read Pairwise Overlapping in Genome Sequencing: A Race Between FPGA and GPU

A method is proposed to reorder the operation sequence, transforming the algorithm into a hardware-friendly form that achieves significant performance improvement, together with a customized fine-grained task dispatching scheme that can keep parallel PEs busy while satisfying the on-chip memory restriction.

SeGraM: a universal hardware accelerator for genomic sequence-to-graph and sequence-to-sequence mapping

This work proposes SeGraM, a universal algorithm/hardware co-designed genomic mapping accelerator that can effectively and efficiently support both sequence-to-graph mapping and sequence-to-sequence mapping, for both short and long reads.

Accelerating Genome Analysis: A Primer on an Ongoing Journey

The ongoing journey in significantly improving the performance of read mapping is described, state-of-the-art algorithmic methods and hardware-based acceleration approaches are explained, and the challenges of adopting hardware-accelerated read mappers are described.
...