• Corpus ID: 6560270

Efficient Histogram Algorithms for NVIDIA CUDA Compatible Devices

@inproceedings{Shams2007EfficientHA,
  title={Efficient Histogram Algorithms for NVIDIA CUDA Compatible Devices},
  author={Ramtin Shams and Rodney A. Kennedy},
  year={2007}
}
We present two efficient histogram algorithms designed for NVIDIA’s compute unified device architecture (CUDA) compatible graphics processor units (GPUs). Our algorithms can be used for parallel computation of histograms on large data sets and for thousands of bins. Traditionally, histogram computation has been difficult and inefficient on the GPU. This often means that GPU-based implementations of algorithms that require histogram calculation as part of their computation need to transfer… 
Efficient Computation of Joint Histograms and Normalized Mutual Information on CUDA Compatible Devices
We present new strategies for a highly optimized joint histogram computation of large datasets on NVIDIA’s compute unified device architecture (CUDA) compatible graphics processor units (GPUs). By
Parallelizing general histogram application for CUDA architectures
TLDR
Two approaches for implementing general purpose histogramming on GPUs are compared based on private copies of bin counters stored in shared memory for each block of threads and the Thrust library to sort the input elements and then to search for upper bounds according to bin widths.
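The two strategies named in this summary can be illustrated with a minimal CPU-side sketch in plain Python. It is not the cited paper's code: the function names, the `block_size` parameter, and the assumption that the input is already a list of integer bin indices are all hypothetical choices made for illustration. The first function mimics per-block private counters that are merged afterwards; the second sorts the input and uses an upper-bound search per bin, much as one might with a sort followed by a vectorized search on the GPU.

```python
from bisect import bisect_right


def histogram_privatized(bin_ids, num_bins, block_size=4):
    """Each simulated 'block' fills a private histogram; copies are then summed.

    block_size is a hypothetical stand-in for the number of inputs a
    thread block would process.
    """
    total = [0] * num_bins
    for start in range(0, len(bin_ids), block_size):
        private = [0] * num_bins  # stands in for per-block shared memory
        for b in bin_ids[start:start + block_size]:
            private[b] += 1
        for i, count in enumerate(private):  # merge step (a reduction on a GPU)
            total[i] += count
    return total


def histogram_sort_search(bin_ids, num_bins):
    """Sort the input, then find each bin's upper bound in the sorted data."""
    ordered = sorted(bin_ids)
    # cumulative[b] = number of elements with bin index <= b
    cumulative = [bisect_right(ordered, b) for b in range(num_bins)]
    # Adjacent differences of the cumulative counts give the per-bin counts.
    return [c - p for c, p in zip(cumulative, [0] + cumulative[:-1])]


data = [3, 0, 1, 3, 3, 2, 0, 1, 3, 0]
print(histogram_privatized(data, 4))   # [3, 2, 1, 4]
print(histogram_sort_search(data, 4))  # [3, 2, 1, 4]
```

Both functions return the same counts. The privatized version avoids contention on a single shared set of counters at the cost of a merge step, while the sort-based version does more work per element but needs no concurrent updates at all.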
Improving GPU Performance: Reducing Memory Conflicts and Latency
TLDR
A set of software techniques to improve the parallel updating of the output bins in so-called ‘voting algorithms’, such as the histogram and the Hough transform, is analyzed, implemented and optimized on GPUs.
Speeding up Mutual Information Computation Using NVIDIA CUDA Hardware
  • R. Shams, Nick Barnes
  • Computer Science
    9th Biennial Conference of the Australian Pattern Recognition Society on Digital Image Computing Techniques and Applications (DICTA 2007)
  • 2007
TLDR
This work approximates the pmfs using a down-sampled version of the joint histogram, which avoids memory update problems and improves the efficiency of MI calculations by a factor of 25 compared to a standard CPU-based implementation.
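For context, the relationship between a joint histogram and mutual information (MI) can be sketched as follows. This is a plain-Python illustration of the standard definition, MI = Σ p(x,y) log₂(p(x,y) / (p(x) p(y))), and not the method of the cited work: it does not down-sample the joint histogram or run on the GPU, and the function name and the integer-bin-index input format are assumptions.

```python
from math import log2


def mutual_information(xs, ys, num_bins):
    """Mutual information (in bits) of two equal-length sequences of bin indices."""
    # Joint histogram: joint[i][j] counts pairs with x in bin i and y in bin j.
    joint = [[0] * num_bins for _ in range(num_bins)]
    for x, y in zip(xs, ys):
        joint[x][y] += 1
    n = len(xs)
    # Marginal pmfs are the row and column sums of the joint pmf.
    px = [sum(row) / n for row in joint]
    py = [sum(joint[i][j] for i in range(num_bins)) / n for j in range(num_bins)]
    mi = 0.0
    for i in range(num_bins):
        for j in range(num_bins):
            if joint[i][j]:  # empty cells contribute nothing
                pxy = joint[i][j] / n
                mi += pxy * log2(pxy / (px[i] * py[j]))
    return mi


identical = mutual_information([0, 1, 0, 1], [0, 1, 0, 1], 2)
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1], 2)
print(identical, independent)
```

Two identical, evenly split binary sequences share one bit of information, while the second pair is statistically independent and shares none. The joint-histogram loop is the part that suffers from conflicting memory updates when parallelized.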
Efficient Weighted Histogramming on GPUs with CUDA
TLDR
A new method for histogramming on GPUs is presented, which reduces the collision intensity by rearranging the input, and provides predictable performance over data sets with different statistics, and shows improved performance over the state-of-the-art implementations.
Practical examples of GPU computing optimization principles
TLDR
The effect and optimization principles of memory coalescing, bandwidth reduction, processor occupancy, bank conflict reduction, local memory elimination and instruction optimization, as well as comparison with optimized and unoptimized algorithms are provided.
An optimized approach to histogram computation on GPU
TLDR
This paper proposes a highly optimized approach to histogram calculation that uses histogram replication for eliminating position conflicts, padding to reduce bank conflicts, and an improved access to input data called interleaved read access.
Compiling generalized histograms for GPU
TLDR
It is shown that the histogram implementation taken in isolation outperforms similar primitives from CUB, and that it is competitive or outperforms the hand-written code of several application benchmarks, even when the latter is specialized for a class of datasets.
Compiling Generalized Histograms for GPU
We present and evaluate an implementation technique for histogram-like computations on GPUs that ensures both work-efficient asymptotic cost, support for arbitrary associative and commutative
Selected Issues on Histograming on GPUs
TLDR
Results of performance measurements are presented for various configurations in which the histograms are allocated in different parts of the memory of the devices used.

References

Optimal Data-Based Binning for Histograms
Histograms are convenient non-parametric density estimators, which continue to be used ubiquitously. Summary quantities estimated from histogram-based probability density models depend on the choice
Image Registration in Hough Space Using Gradient of Images
  • R. Shams, N. Barnes, R. Hartley
  • Computer Science
    9th Biennial Conference of the Australian Pattern Recognition Society on Digital Image Computing Techniques and Applications (DICTA 2007)
  • 2007
TLDR
It is shown that the combination of estimating registration parameters in the Hough domain and fine tuning the results in the intensity domain significantly improves performance of the application compared to the conventional intensity-based multi-resolution methods.
Efficient image registration by decoupled parameter estimation using gradient-based techniques and mutual information
TLDR
This work presents an efficient and accurate method for similarity (rigid+scale) registration of 3D images by decoupled estimation of transformation parameters that improves the efficiency of the overall registration task and makes the registration more robust and less sensitive to the shape of the cost function.
Gradient Intensity: A New Mutual Information-Based Registration Method
TLDR
This work introduces the concept of 'gradient intensity' as a measure of spatial strength of an image in a given direction and determines the rotation parameter by maximizing the MI between gradient intensity histograms.
Gradient Intensity-Based Registration of Multi-Modal Images of the Brain
We present a fast and accurate framework for registration of multi-modal volumetric images based on decoupled estimation of registration parameters utilizing spatial information in the form of
On optimal and data based histograms
SUMMARY In this paper the formula for the optimal histogram bin width is derived which asymptotically minimizes the integrated mean squared error. Monte Carlo methods are used to verify the
Alignment by Maximization of Mutual Information
TLDR
A new information-theoretic approach is presented for finding the pose of an object in an image that works well in domains where edge or gradient-magnitude based methods have difficulty, yet it is more robust than traditional correlation.
A Mathematical Theory of Communication
This paper opened the new area of information theory. Before this paper, most people believed that the only way to make the error probability of transmission as small as desired is to reduce the
64-bin histogram
  • NVIDIA, Tech. Rep., 2007.
  • 2007
Compute Unified Device Architecture (CUDA) Programming Guide
  • 2007