Corpus ID: 239768382

Accelerating Compact Fractals with Tensor Core GPUs

@article{Quezada2021AcceleratingCF,
  title={Accelerating Compact Fractals with Tensor Core GPUs},
  author={Felipe A. Quezada and Crist{\'o}bal A. Navarro},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.12952}
}
This work presents a GPU thread mapping approach that allows fast parallel stencil-like computations on discrete fractals using their compact representation. The intuition behind it is to employ two GPU tensor-core accelerated thread maps, λ(ω) and ν(ω), which act as threadspace-to-dataspace and dataspace-to-threadspace functions, respectively. By combining these maps, threads can access the compact space and interact with their neighbors. The cost of the maps is O(log log(n)) time, with n being… 
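The abstract leaves the internals of λ(ω) and ν(ω) open, so the following is only a minimal arithmetic sketch of the idea for a Sierpiński-gasket-like fractal of side n = 2^k: λ sends a compact (linear) index to its (x, y) cell in the embedding, ν inverts it so a thread can find a neighbor's compact location, and a stencil-like kernel composes the two. The function and kernel names are hypothetical, and these loop-based maps run in O(k) steps; the paper's maps are instead evaluated with tensor-core operations in O(log log(n)) time.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// lambda: compact (linear) index -> (x, y) cell of the embedded gasket.
// Reads w in base 3, most significant digit first; each digit selects one
// of the three sub-triangles at the corresponding scale.
__host__ __device__ void lambda_map(unsigned w, int k, int *x, int *y) {
    int X = 0, Y = 0;
    for (int level = k - 1; level >= 0; --level) {
        unsigned pow3 = 1;
        for (int i = 0; i < level; ++i) pow3 *= 3;
        unsigned d = (w / pow3) % 3;
        int s = 1 << level;                 // cell size at this level
        if (d == 1) X += s;
        else if (d == 2) Y += s;
    }
    *x = X; *y = Y;
}

// nu: (x, y) cell of the embedding -> compact (linear) index (inverse of lambda).
__host__ __device__ unsigned nu_map(int x, int y, int k) {
    unsigned w = 0;
    for (int level = k - 1; level >= 0; --level) {
        unsigned d = ((x >> level) & 1) ? 1u : (((y >> level) & 1) ? 2u : 0u);
        w = w * 3 + d;
    }
    return w;
}

// Stencil-like access in compact space: each thread owns one stored cell,
// uses lambda to locate it in the embedding and nu to fetch a neighbor
// back from the compact array.
__global__ void touch_right_neighbor(const float *in, float *out,
                                     unsigned m, int k, int n) {
    unsigned w = blockIdx.x * blockDim.x + threadIdx.x;
    if (w >= m) return;
    int x, y;
    lambda_map(w, k, &x, &y);
    float v = in[w];
    int nx = x + 1;
    if (nx < n && (nx & y) == 0)            // neighbor exists in the gasket
        v += in[nu_map(nx, y, k)];
    out[w] = v;
}

int main() {
    const int k = 3, n = 1 << k;            // 8 x 8 embedding
    unsigned m = 1;
    for (int i = 0; i < k; ++i) m *= 3;     // 3^k = 27 stored cells
    float *in, *out;
    cudaMallocManaged(&in, m * sizeof(float));
    cudaMallocManaged(&out, m * sizeof(float));
    for (unsigned w = 0; w < m; ++w) in[w] = 1.0f;
    touch_right_neighbor<<<(m + 255) / 256, 256>>>(in, out, m, k, n);
    cudaDeviceSynchronize();
    printf("out[0] = %f\n", out[0]);        // cell (0,0) plus neighbor (1,0): 2.0
    return 0;
}
```

Launching only m = 3^k threads, one per stored cell, rather than n² is what makes the compact representation attractive; the maps are the price paid for recovering neighborhood information.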

References

Showing 1-10 of 16 references

Block-Space GPU Mapping for Embedded Sierpiński Gasket Fractals

  • C. Navarro, R. Vega, B. Bustos, N. Hitschfeld-Kahler
  • Mathematics
    2017 IEEE 19th International Conference on High Performance Computing and Communications; IEEE 15th International Conference on Smart City; IEEE 3rd International Conference on Data Science and Systems (HPCC/SmartCity/DSS)
  • 2017
A block-space map λ : Z²_E ↦ Z²_F is proposed, from Euclidean parallel space E to embedded fractal space F, that maps in O(log₂ log₂(n)) time and uses no more than O(n^H) threads, with H ≈ 1.58 being the Hausdorff dimension, making it parallel space efficient.

GPU Maps for the Space of Computation in Triangular Domain Problems

  • C. Navarro, N. Hitschfeld-Kahler
  • Computer Science
    2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS)
  • 2014
Experimental results using different Nvidia Kepler GPUs show that for computing the Euclidean distance matrix, g(λ) achieves an improvement of up to 18% over the basic bounding box (BB) strategy, runs faster than the UTM and REC strategies, and is almost as fast as RB.
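For context, a common way to build such a map is the inverse triangular-root formula, sketched below for a lower-triangular n × n domain so that only n(n+1)/2 threads are launched instead of the n² of a bounding-box kernel. The kernel name and the rounding guard are illustrative assumptions; the g(λ) map of this reference additionally works at block rather than thread granularity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Folds a 1D thread index t onto the lower-triangular part of an n x n
// domain by inverting t = i(i+1)/2 + j, so only n(n+1)/2 threads are needed
// instead of the n^2 of a bounding-box kernel.
__global__ void triangular_touch(float *dist, int n, unsigned total) {
    unsigned t = blockIdx.x * blockDim.x + threadIdx.x;
    if (t >= total) return;
    unsigned i = (unsigned)((sqrtf(8.0f * t + 1.0f) - 1.0f) * 0.5f);
    unsigned j = t - i * (i + 1) / 2;
    if (j > i) { ++i; j = t - i * (i + 1) / 2; }   // guard against sqrt rounding down
    dist[i * n + j] = (float)(i - j);              // example work: one matrix entry
}

int main() {
    const int n = 1024;
    unsigned total = (unsigned)n * (n + 1) / 2;    // lower-triangular cells
    float *dist;
    cudaMallocManaged(&dist, (size_t)n * n * sizeof(float));
    triangular_touch<<<(total + 255) / 256, 256>>>(dist, n, total);
    cudaDeviceSynchronize();
    printf("dist[3][1] = %f\n", dist[3 * n + 1]);  // expected 2.0
    return 0;
}
```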

Analyzing GPU Tensor Core Potential for Fast Reductions

This work presents the idea of using tensor cores for a different purpose, the parallel arithmetic reduction problem, proposes a new GPU tensor-core based algorithm, and analyzes its potential performance benefits in comparison to a traditional GPU-based one.
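A minimal sketch of that idea, assuming the standard CUDA WMMA API (Volta or newer): multiplying a 16×16 tile by a matrix of ones places the tile's column sums in every row of the accumulator, so one tensor-core MMA plus a 16-element add reduces 256 values. The kernel below only illustrates this trick and is not the algorithm proposed in the referenced work.

```cuda
#include <cstdio>
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

// One warp reduces a 16x16 half-precision tile with a single tensor-core MMA:
// ones(16x16) * A(16x16) places the column sums of A in every row of the
// accumulator; a final 16-element add yields the total. Requires sm_70+.
__global__ void tile_sum_tc(const half *A, float *out) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> ones;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> a;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc;

    wmma::fill_fragment(ones, __float2half(1.0f));
    wmma::fill_fragment(acc, 0.0f);
    wmma::load_matrix_sync(a, A, 16);              // leading dimension 16
    wmma::mma_sync(acc, ones, a, acc);             // acc[i][j] = sum_k A[k][j]

    __shared__ float colsum[16 * 16];
    wmma::store_matrix_sync(colsum, acc, 16, wmma::mem_row_major);
    __syncthreads();
    if (threadIdx.x == 0) {
        float total = 0.0f;
        for (int j = 0; j < 16; ++j) total += colsum[j];   // row 0 = column sums
        *out = total;
    }
}

int main() {
    half hA[256];
    for (int i = 0; i < 256; ++i) hA[i] = __float2half(1.0f);   // expected sum 256
    half *dA; float *dOut, hOut;
    cudaMalloc(&dA, sizeof(hA));
    cudaMalloc(&dOut, sizeof(float));
    cudaMemcpy(dA, hA, sizeof(hA), cudaMemcpyHostToDevice);
    tile_sum_tc<<<1, 32>>>(dA, dOut);              // exactly one warp
    cudaMemcpy(&hOut, dOut, sizeof(float), cudaMemcpyDeviceToHost);
    printf("tile sum = %f\n", hOut);
    return 0;
}
```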

Potential benefits of a block-space GPU approach for discrete tetrahedral domains

The analysis shows that a block-based succinct data re-organization can provide up to 2× improved performance over a linear data organization, while the map can be up to 6× more efficient than a bounding-box approach.

Efficient GPU Thread Mapping on Embedded 2D Fractals

Triangular matrix inversion on Graphics Processing Unit

This paper demonstrates how triangular matrix inversion (TMI) can be accelerated considerably by using commercial Graphics Processing Units (GPUs) in a standard PC, and shows how inversion of an L- and a U-matrix can be performed concurrently on a GTX 295 based dual-GPU system at up to 90 Gflop/s.

Fractal-Based Description of Natural Scenes

  • A. Pentland
  • Environmental Science
    IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 1984
The 3-D fractal model provides a characterization of 3-D surfaces and their images for which the appropriateness of the model is verifiable, and this characterization is stable over transformations of scale and linear transforms of intensity.

Real time design and animation of fractal plants and trees

The program provides a study in the structure of branching objects that is both scientific and artistic and suggests that organisms and computers deal with complexity in similar ways.

Competitiveness of a Non-Linear Block-Space GPU Thread Map for Simplex Domains

Performance results show that the proposed map for efficiently mapping GPU threads onto simplex domains is competitive, and is even the fastest map when run on recent GPU architectures such as the Tesla V100, where it reaches up to 1.25× speedup in 2-simplex tests.

Algorithmic Self-Assembly of DNA Sierpinski Triangles

This work reports the molecular realization, using two-dimensional self-assembly of DNA tiles, of a cellular automaton whose update rule computes the binary function XOR and thus fabricates a fractal pattern—a Sierpinski triangle—as it grows.
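The XOR update rule mentioned here is, in software terms, the elementary cellular automaton whose new cell is the XOR of its two upper neighbors; run from a single seed it grows a Sierpinski triangle row by row. The toy kernel below (names hypothetical) prints that pattern and is unrelated to the molecular implementation itself.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One step of the XOR (rule-90-style) cellular automaton: each new cell is
// the XOR of its two upper neighbors, which grows a Sierpinski triangle
// from a single seed.
__global__ void xor_row(const int *prev, int *next, int width) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    if (x >= width) return;
    int left  = (x > 0)         ? prev[x - 1] : 0;
    int right = (x + 1 < width) ? prev[x + 1] : 0;
    next[x] = left ^ right;
}

int main() {
    const int width = 32, rows = 16;
    int *a, *b;
    cudaMallocManaged(&a, width * sizeof(int));
    cudaMallocManaged(&b, width * sizeof(int));
    for (int x = 0; x < width; ++x) a[x] = 0;
    a[width / 2] = 1;                              // single seed tile
    for (int r = 0; r < rows; ++r) {
        for (int x = 0; x < width; ++x) putchar(a[x] ? '#' : ' ');
        putchar('\n');
        xor_row<<<1, width>>>(a, b, width);
        cudaDeviceSynchronize();
        int *tmp = a; a = b; b = tmp;              // swap current and next rows
    }
    return 0;
}
```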