Corpus ID: 15373234

An Implementation of a FIR Filter on a GPU

@inproceedings{Smirnov2005AnIO,
  title={An Implementation of a FIR Filter on a GPU},
  author={Alexey Smirnov and Tzi-cker Chiueh},
  year={2005}
}
In this paper we describe an implementation of the Finite Impulse Response (FIR) filter on a modern graphics processing unit (GPU). [...] We used a GeForce 6600 video card and a Pentium 4-HT 3.2 GHz processor-based PC in our experiments. We varied several parameters of the GPU implementation to achieve the highest performance. The results indicate that the GPU implementation is faster than the SSE-optimized implementation when the FIR filter has a large number of taps and therefore requires a large…
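For readers unfamiliar with the operation being accelerated, the following is a minimal sketch of a direct-form FIR filter in plain Python. It is not the paper's GPU or SSE implementation; the function name `fir_filter` and its arguments are hypothetical and chosen only for illustration. The sketch assumes input samples before the start of the signal are zero. Each output sample needs one multiply-accumulate per tap, so the work grows with the product of signal length and tap count, and each output can be computed independently of the others.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * x[n - k].

    Samples before the start of the input are treated as zero
    (an assumption of this sketch, not something the paper states).
    """
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:  # skip taps that reach before the first sample
                acc += h * samples[n - k]
        out.append(acc)
    return out


# Usage: a 2-tap filter that sums each sample with its predecessor.
print(fir_filter([1.0, 2.0, 3.0], [1.0, 1.0]))
```

The inner loop over taps is what dominates the cost when the filter has many taps, which is the regime the abstract reports as favorable to the GPU.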
FPGA vs GPU Performance Comparison on the Implementation of FIR Filters
  • 2013
FIR filters find place in digital signal processing applications that require stopping a frequency band while passing another band or removing noise. Due to the complex structure and parallelism…
High performance finite impulse response filter on graphics processors
A high performance FIR filtering algorithm on the GPU is presented based on the traditional overlapped-save method for the fast FIR filter. This algorithm exploits a symmetric segmentation approach…
HIGH-PERFORMANCE REAL-TIME FIR-FILTERING USING FAST CONVOLUTION ON GRAPHICS HARDWARE
An implementation and detailed analysis of a frequency-domain fast convolution method on GPUs that achieves outstanding real-time filtering performance and identifies bottlenecks.
GPU Acceleration of DSP for Communication Receivers.
GPU implementations of several algorithms encountered in a wide range of high-data-rate communication receivers are described, including filters, multirate filters, numerically controlled oscillators, and multi-stage digital down converters.
A polyphase filter for GPUs and multi-core processors
An optimized implementation of the polyphase filter bank used by LOFAR is discussed, and a novel way to compute polyphase filters efficiently on GPUs is presented.
Optimal Data Distribution for Versatile Finite Impulse Response Filtering on Next-Generation Graphics Hardware Using CUDA
This paper investigates discrete finite impulse response (FIR) filtering of images, while harnessing the powerful computational resources of next-generation GPUs and presents multiple convolution implementation techniques that are able to cope with the hard platform constraints in different situations, while still being able to optimize the implementation to the underlying architecture.
USING PROGRAMMABLE GRAPHICS HARDWARE FOR AURALIZATION
Over the last 10 years, the architecture of graphics accelerators (GPUs) has dramatically evolved, outpacing traditional general purpose processors (CPUs) with an average 2.25-fold increase in…
Multichannel massive audio processing for a generalized crosstalk cancellation and equalization application using GPUs
The design and implementation of all the processing blocks of a multichannel convolution on a GPU for real-time applications are presented, and a very efficient filtering method using specific data structures is proposed, which takes advantage of overlap-save filtering and filter fragmentation.
Use of GPUs in room acoustic modeling and auralization
GPU architectures and the issues in designing suitable algorithms that map well to them are addressed, and a brief overview is given of recent methods for geometric and numeric sound propagation that offer one order of magnitude speedup over CPU-based algorithms.
Intelligent Visual Supercomputing on Hybrid Graphical Multiprocessor Environments, Computer Engineering
A novel stereo matching algorithm is introduced that is able to extract depth from two parallel input camera images, which is designed from the ground up with the fundamentals of GPU computing in mind, and achieves a very high algorithmic throughput while jointly providing superior quality to many of its competitors.
