Yongpeng Zhang

Learn More
This paper develops and evaluates search and optimization techniques for auto-tuning 3D stencil (nearest-neighbor) computations on GPUs. Observations indicate that parameter tuning is necessary for heterogeneous GPUs to achieve optimal performance with respect to a search space. Our proposed framework takes a most concise specification of stencil behavior(More)
We present an efficient, robust and fully GPU-accelerated aggregation-based algebraic multigrid preconditioning technique for the solution of large sparse linear systems. These linear systems arise from the discretization of elliptic PDEs. The method involves two stages, setup and solve. In the setup stage, hierarchical coarse grids are constructed through(More)
Emerging accelerating architectures, such as GPUs, have proved successful in providing significant performance gains to various application domains. However, their viability to operate on general streaming data is still ambiguous. In this paper, we propose GStream, a general-purpose, scalable data streaming framework on GPUs. The contributions of GStream(More)
Document clustering plays an important role in data mining systems. Recently, a flocking-based document clustering algorithm has been proposed to solve the problem through simulation resembling the flocking behavior of birds in nature. This method is superior to other clustering algorithms, including k-means, in the sense that the outcome is not sensitive(More)
Problem domains are commonly decomposed hierarchically to fully utilize parallel resources in modern microprocessors. Such decompositions can be provided as library routines, written by experienced experts, for general algorithmic patterns. But such APIs tend to be constrained to certain architectures or data sizes. Integrating them with application code is(More)
A new methodology is proposed to design digital PID controllers for multivariable systems with time delays. Except for a few parameters that are preliminarily selected, most of the PID parameters are systematically tuned using the developed plant state-feedback and controller state-feedforward LQR approach, such that satisfactory performance with guaranteed(More)
The exponential growth of high-throughput DNA sequence data has posed great challenges to genomic data storage, retrieval and transmission. Compression is a critical tool to address these challenges, where many methods have been developed to reduce the storage size of the genomes and sequencing data (reads, quality scores and metadata). However, genomic(More)