Kang Su Gatlin

The Smith-Waterman algorithm is a computationally intensive string-matching operation that is fundamental to the analysis of proteins and genes. In this paper, we explore the use of some standard and novel techniques for improving its performance. We begin by tuning the algorithm using conventional techniques. These make modest performance improvements by …
This paper explores the interplay between algorithm design and a computer's memory hierarchy. Matrix transpose and the bit-reversal reordering are important scientific subroutines which often exhibit severe performance degradation due to cache and TLB associativity problems. We give lower bounds that show that, for typical memory hierarchy designs, extra data …
The Tera MTA is a revolutionary commercial computer based on a multithreaded processor architecture. In contrast to many other parallel architectures, the Tera MTA can effectively use a high degree of parallelism on a single processor. By running multiple threads on a single processor, it can tolerate memory latency and keep the processor saturated. If …
The fast Fourier transform (FFT) is the cornerstone of many supercomputer applications and therefore needs careful performance tuning. Most often, however, the real performance of FFT implementations is far below acceptable figures. In this paper, we explore several strategies for performance optimisation of the FFT computation, such as enhancing …
HPF is a data-parallel Fortran dialect currently implemented on diverse hardware platforms ranging from workstations to massively parallel processors. To date, performance data are sparse. We will present preliminary measurements of selected benchmarks, comparing HPF applications against equivalent SPMD implementations and the same HPF implementation …