Reduced Precision for Hardware Ray Tracing in GPUs

@inproceedings{Keely2014ReducedPF,
  title={Reduced Precision for Hardware Ray Tracing in GPUs},
  author={Sean Keely},
  booktitle={High Performance Graphics},
  year={2014}
}
  • S. Keely
  • Published in High Performance Graphics 23 June 2014
  • Computer Science
We propose a high performance, GPU integrated, hardware ray tracing system. [...] By making use of existing GPU resources we are able to keep all rays and scheduling traffic on chip and out of caches. We used simulations to estimate the performance of our architecture. Our system achieves an average ray rate of 3.4 billion rays per second while path tracing our test scenes.

Dual streaming for hardware-accelerated ray tracing
TLDR
This work provides a different approach to hardware-accelerated ray tracing, beginning with modifying the order of rendering operations, inspired by the streaming character of rasterization, and organizes the memory access of ray tracing into two predictable data streams.
Toward Real-Time Ray Tracing
TLDR
This article is aimed at providing a timely survey on hardware techniques to accelerate the ray-tracing algorithm by reviewing hardware techniques for the main functional blocks in a ray-tracing pipeline following a systematic taxonomy.
Half-precision Floating-point Ray Traversal
TLDR
A new ray traversal algorithm for bounding volume hierarchies is presented that reduces the required memory bandwidth and energy usage, but requires extra computations with a new kind of utilization of half-precision floating-point numbers, which are used to store axis aligned bounding boxes in a hierarchical manner.
Mach-RT: a many chip architecture for ray tracing
We propose an unconventional solution to high-performance ray tracing that combines a ray ordering scheme that minimizes access to the scene data with a large on-chip buffer acting as near-compute
Efficient incoherent ray traversal on GPUs through compressed wide BVHs
TLDR
A GPU-based ray traversal algorithm that operates on compressed wide BVHs and maintains the traversal stack in a compressed format and an improved method for ordering the child nodes at build time for the purposes of octant-aware fixed-order traversal are presented.
Mach-RT: A Many Chip Architecture for High Performance Ray Tracing
TLDR
This article proposes an unconventional solution that combines a ray ordering scheme that minimizes access to the scene data with a large on-chip buffer acting as near-compute storage that is spread over multiple chips.
Fast Hardware Construction and Refitting of Quantized Bounding Volume Hierarchies
TLDR
It is found that in a hardware‐accelerated tree update, significant memory traffic and runtime savings are available from streaming, bottom‐up compression, and novel algorithmic techniques of modulo encoding and treelet‐based compression are proposed to reduce the backtracking inherent in bottom‐up compression.
Reduced Precision Ray-Triangle Intersection Filtering
TLDR
A new leaf intersection scheme for hardware accelerated ray tracing, focusing on the rays which do not intersect a triangle, which can be implemented entirely with branch free, low precision, integer arithmetic, making it well suited for hardware implementation.
Wide BVH traversal with a short stack
TLDR
An algorithm for wide bounding volume hierarchy (BVH) traversal that uses a short stack of just a few entries and an extension to efficiently cull leaf nodes when a closer intersection has been found, which reduces ray primitive intersections by up to 14%.
Watertight ray traversal with reduced precision
TLDR
A novel traversal algorithm is introduced that achieves a significant reduction in the computational complexity of traversal compared to previous approaches and guarantees watertight intersections which is a key requirement for robust image quality, especially with reduced precision traversal where numerical errors can be large.

References

SHOWING 1-10 OF 36 REFERENCES
Realtime Ray Tracing on GPU with BVH-based Packet Traversal
TLDR
This paper presents a BVH-based GPU ray tracer with a parallel packet traversal algorithm using a shared stack, and presents a fast, CPU-based BVH construction algorithm which very accurately approximates the surface area heuristic using streamed binning while still being one order of magnitude faster than previously published results.
An energy and bandwidth efficient ray tracing architecture
TLDR
A streaming data model is used and part of the L2 cache is configured into a ray stream memory to enable efficient data processing through ray reordering to decrease energy consumption on massively parallel graphics processors for ray tracing while keeping performance high.
Real-time ray tracing on future mobile computing platform
TLDR
Simulation results show that this platform is potentially a versatile graphics solution for future application processors as it provides a real-time ray tracing performance at full HD resolution that can compete with that of existing desktop GPU ray tracers.
SGRT: a mobile GPU architecture for real-time ray tracing
TLDR
Simulation results show that SGRT is potentially a versatile graphics solution for future application processors as it provides a real-time ray tracing performance at full HD resolution that can compete with that of existing desktop GPU ray tracers.
Understanding the efficiency of ray traversal on GPUs
TLDR
A simple solution is proposed that significantly narrows the gap between simulation and measurement, and results in the fastest GPU ray tracer to date.
Dynamic Ray Scheduling to Improve Ray Coherence and Bandwidth Utilization
TLDR
A novel ray tracing algorithm is presented that both improves cache utilization and reduces DRAM-to-cache bandwidth usage and creates units of work that are more amenable to parallelization than traditional Whitted-style ray tracers.
A Mobile Accelerator Architecture for Ray Tracing
TLDR
A novel multi-core MIMD graphics accelerator architecture that is well suited to ray tracing on mobile platforms is presented, and it is shown that a small-footprint version of this architecture is suitable for the mobile computing space, and has performance up to 13 times faster than an existing mobile graphics solution for ray tracing.
SaarCOR: a hardware architecture for ray tracing
TLDR
A new, scalable, modular, and highly efficient hardware architecture for real-time ray tracing that achieves high performance with extremely low memory bandwidth requirements by efficiently tracing bundles of rays.
Memory efficient ray tracing with hierarchical mesh quantization
TLDR
A lossily compressed acceleration structure for ray tracing that encodes the bounding volume hierarchy (BVH) and the triangles of a scene together in a single unified data structure that achieves performance similar to the fastest uncompressed data structures.
Combining Single and Packet-Ray Tracing for Arbitrary Ray Distributions on the Intel MIC Architecture
TLDR
A single-ray tracing scheme for incoherent rays that uses just one traversal stack on 16-wide SIMD hardware is introduced, and it is shown that on the Intel Many Integrated Core architecture this hybrid scheme consistently, and over a wide range of scenes and ray distributions, outperforms both packet and single-ray tracing.