FastTree: A hardware KD-tree construction acceleration engine for real-time ray tracing

@article{Liu2015FastTreeAH,
  title={FastTree: A hardware KD-tree construction acceleration engine for real-time ray tracing},
  author={Xingyu Liu and Yangdong Deng and Yufei Ni and Zonghui Li},
  journal={2015 Design, Automation \& Test in Europe Conference \& Exhibition (DATE)},
  year={2015},
  pages={1595--1598}
}
The ray tracing algorithm is well-known for its ability to generate photo-realistic rendering effects. Recent years have witnessed a renewed momentum in pushing it to real-time for better user experience. Today the construction of acceleration structures, e.g., kd-tree, has become the bottleneck of ray tracing. A dedicated hardware architecture, FastTree, was proposed for kd-tree construction by adopting a fully parallel construction algorithm. FastTree was validated by an FPGA prototype and…
MergeTree: A Fast Hardware HLBVH Constructor for Animated Ray Tracing
Ray tracing is a computationally intensive rendering technique traditionally used in offline high-quality rendering. Powerful hardware accelerators have been recently developed that put real-time ray…
Toward Real-Time Ray Tracing
This article is aimed at providing a timely survey on hardware techniques to accelerate the ray-tracing algorithm by reviewing hardware techniques for the main functional blocks in a ray-tracing pipeline following a systematic taxonomy.
PLOCTree: A Fast, High-Quality Hardware BVH Builder
This paper proposes PLOCTree, an accelerator for tree construction based on the Parallel Locally-Ordered Clustering (PLOC) algorithm, which is nearly as fast as a state-of-the-art low-quality linear builder, while producing trees of similar Surface Area Heuristic (SAH) cost as a comparatively expensive binned SAH sweep builder.
MergeTree: a HLBVH constructor for mobile systems
Powerful hardware accelerators have been recently developed that put interactive ray-tracing even in the reach of mobile devices. However, supplying the rendering unit with up-to-date acceleration…
Ray Tracing on Single FPGA
Hardware accelerators have been reported for implementing ray tracing to achieve high realism in 3D graphics rendering based on the Central Processing Unit (CPU), Graphics Processing Unit (GPU) and…
Evaluating a CPU/GPU Implementation for Real-Time Ray Tracing
This paper attempts to evaluate if it is possible to implement a hybrid CPU/GPU ray tracer that performs better than either a CPU or GPU implementation in isolation, and an existing framework is profiled, analyzed, and altered to give insight into this problem.
A Survey on Bounding Volume Hierarchies for Ray Tracing
The basic principles of bounding volume hierarchies are reviewed, as well as advanced state-of-the-art methods, with a focus on construction and traversal.
Evaluation of a BVH Construction Accelerator Architecture for High-Quality Visualization
This work studies a heterogeneous system-on-chip solution for constructing a highly important data structure in computer graphics, the bounding volume hierarchy; the solution incorporates conventional CPU cores alongside a fixed-function accelerator prototyped on a reconfigurable logic fabric.
Hardware-Accelerated Dual-Split Trees
This work introduces hardware acceleration for dual-split trees and shows that the performance advantages over BVHs are emphasized in a hardware ray tracing context that can take advantage of such acceleration.
Efficient Progressive Radiance Estimation Engine Architecture and Implementation for Progressive Photon Mapping
The presented PREE architecture consists of four progressive radiance estimation units (PREUs), an approximate full task schedule-oriented hit-point update operation controller (AFTSO-HpUOC), and an approximate data-independent schedule-oriented radiance evaluation controller (ADISO-REC).

References

T&I engine: traversal and intersection engine for hardware accelerated ray tracing
Ray tracing naturally supports high-quality global illumination effects, but it is computationally costly. Traversal and intersection operations dominate the computation of ray tracing. To accelerate…
Parallel SAH k-D tree construction
Two new parallel algorithms for building precise SAH-optimized k-D trees, with different tradeoffs between the total work done and parallel scalability, are presented, yielding the best reported speedups so far for precise-SAH k-D tree construction.
Fully parallel kd-tree construction for real-time ray tracing
Experimental results on a set of frequently used scenes show that the proposed kd-tree construction algorithm outperforms a state-of-the-art kd-tree construction algorithm by over one order of magnitude.
Highly Parallel Fast KD‐tree Construction for Interactive Ray Tracing of Dynamic Scenes
A highly parallel, linearly scalable technique of kd‐tree construction for ray tracing of dynamic geometry, compatible with high-performing algorithms such as MLRTA or frustum tracing, is presented.
A feasibility study of ray tracing on mobile GPUs
This work investigates the feasibility of ray tracing on mobile GPUs and shows that the Tegra K1 GPU already allows constructing the acceleration structure of a 1M-triangle scene in around 120 ms and performing traversal at a throughput of 15 to 70 million rays per second.
SAH KD-tree construction on GPU
This paper presents a kd-tree construction algorithm that is precisely SAH-optimized and runs entirely on GPU. It designs a parallel scheme based on the standard parallel scan primitive to count the triangle numbers for all split candidates, and a bucket-based algorithm to sort the AABBs of the clipped triangles of the child nodes.
A hardware unit for fast SAH-optimised BVH construction
This work has developed the first dedicated microarchitecture for the construction of binned SAH BVHs, and concludes that such a design would be useful in the context of a heterogeneous graphics processor, and may help future graphics processor designs to reduce predicted technology-imposed utilisation limits.
Simpler and faster HLBVH with work queues
This work presents a simpler and faster variant of HLBVH, where all the complex book-keeping of prefix sums, compaction and partial breadth-first tree traversal needed for spatial partitioning has been replaced with an elegant pipeline built on top of efficient work queues and binary search.
Fast 4-way parallel radix sorting on GPUs
Efficient sorting is a key requirement for many computer science algorithms. Acceleration of existing techniques as well as developing new sorting approaches is crucial for many realtime graphics…
Fast Four‐Way Parallel Radix Sorting on GPUs
This paper presents a hardware‐optimized parallel implementation of the radix sort algorithm that results in a significant speed up over existing sorting implementations and makes this algorithm not only the fastest, but also the first general GPU sorting solution.