We introduce a pragmatic algorithm for real-time adaptive supersampling in games. It extends temporal antialiasing of rasterized images with adaptive ray tracing, and conforms to the constraints of a commercial game engine and today's GPU ray tracing APIs. The algorithm removes blurring and ghosting artifacts associated with standard temporal antialiasing and achieves quality approaching 8x supersampling of geometry, shading, and materials while staying within the 33ms frame budget required of most games.
Geometric aliasing is a persistent challenge for real-time rendering. Hardware multisampling remains limited to 8x, analytic coverage fails to capture correlated visibility samples, and spatial and temporal postfiltering primarily target edges of superpixel primitives.
We describe a novel semi-analytic representation of coverage designed to make progress on geometric antialiasing for subpixel primitives and pixels containing many edges while handling correlated subpixel coverage. Although not yet fast enough to deploy, it crosses three critical thresholds: image quality comparable to 256x MSAA, performance faster than 64x MSAA, and constant space per pixel.
A primary advantage of deferred shading is eliminating wasted shading operations due to overdraw. We present a new algorithm, which we call Deferred Adaptive Compute Shading, that further reduces shading computations. Our method hierarchically shades the image while reducing the number of required shading operations to below one shading computation per pixel on average. We determine whether to shade a pixel or approximate it from previously shaded pixels around it, based on an estimate of the image variance at the pixel location. The algorithm is designed to dynamically reconfigure itself to achieve optimal warp coherence and measurable performance gain. We extensively evaluate our algorithm, demonstrating that it produces high-quality results and is robust and highly scalable while providing significant performance improvements in complex scenes.
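The shade-or-approximate decision described above can be illustrated with a minimal CPU sketch. It is not the authors' compute-shader implementation: the two-level scheme, the function name `adaptive_shade`, the scalar shading values, and the `threshold` parameter are all assumptions made for illustration. It shades a coarse grid first, then fills each remaining pixel by averaging its coarse neighbors when their variance is low, and shades it directly otherwise.

```python
from statistics import fmean, pvariance


def adaptive_shade(shade, width, height, threshold=1e-3):
    """Fill a width x height image, calling shade(x, y) only where needed.

    Hypothetical two-level scheme for illustration only.
    Returns (image, number_of_shade_calls).
    """
    image = [[None] * width for _ in range(height)]
    calls = 0

    # Level 0: shade every second pixel in x and y (the coarse grid).
    for y in range(0, height, 2):
        for x in range(0, width, 2):
            image[y][x] = shade(x, y)
            calls += 1

    # Level 1: decide per remaining pixel whether to interpolate or shade.
    for y in range(height):
        for x in range(width):
            if x % 2 == 0 and y % 2 == 0:
                continue  # already shaded at level 0
            # Collect only coarse-grid samples in the 3x3 neighborhood.
            neighbors = [
                image[ny][nx]
                for ny in (y - 1, y, y + 1)
                for nx in (x - 1, x, x + 1)
                if 0 <= ny < height and 0 <= nx < width
                and ny % 2 == 0 and nx % 2 == 0
            ]
            # Low variance among neighbors: approximate by their mean.
            if len(neighbors) >= 2 and pvariance(neighbors) < threshold:
                image[y][x] = fmean(neighbors)
            else:
                # High variance (or too few samples): shade this pixel.
                image[y][x] = shade(x, y)
                calls += 1
    return image, calls
```

On smooth regions this sketch invokes `shade` for roughly a quarter of the pixels, and it falls back to full shading only near discontinuities such as a hard edge. A GPU version would also need to consider warp coherence, which this sketch ignores.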
In this short paper we present a machine learning approach to detect visual artifacts in rendered image sequences. Specifically, we train a deep neural network using example aliased and antialiased image sequences exported from a real-time renderer. The trained network learns to identify and locate aliasing artifacts in an input sequence, without comparing it against a ground truth. Thus, it is useful as a fully automated tool for evaluating image quality.
We demonstrate the effectiveness of our approach in detecting aliasing in several rendered sequences. The trained network correctly predicts aliasing in 64 x 64 x 4 animated sequences with more than 90% accuracy for images it has not seen before. The output of our network is a single scalar between 0 and 1, which is usable as a quality metric for aliasing. It follows the same trend as (1-SSIM) for images with increasing sample counts.
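For readers unfamiliar with the (1-SSIM) baseline mentioned above, the following sketch computes it for two grayscale images given as flat lists of values in [0, 1]. It is a simplification: standard SSIM averages the index over local windows, whereas this version uses a single global window. The function names and the example pixel values are hypothetical; the constants 0.01 and 0.03 are the conventional SSIM defaults.

```python
def global_ssim(a, b, dynamic_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equal-length lists of pixel values."""
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and the same size")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    var_a = sum((p - mean_a) ** 2 for p in a) / n
    var_b = sum((q - mean_b) ** 2 for q in b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n
    # Small constants that stabilize the division.
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    numerator = (2 * mean_a * mean_b + c1) * (2 * cov + c2)
    denominator = (mean_a ** 2 + mean_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


def aliasing_dissimilarity(image, reference):
    """1 - SSIM: 0 for identical images, larger for more dissimilar ones."""
    return 1.0 - global_ssim(image, reference)


# Hypothetical 4-pixel edge: a reference and two coarser approximations.
reference = [0.0, 0.25, 0.75, 1.0]
one_sample = [0.0, 0.0, 1.0, 1.0]     # hard, aliased edge
more_samples = [0.0, 0.2, 0.8, 1.0]   # closer to the reference
```

Here `aliasing_dissimilarity(one_sample, reference)` is larger than `aliasing_dissimilarity(more_samples, reference)`, which is the kind of decreasing trend with sample count that the network output is compared against. Unlike the network, this metric needs the reference image.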
Computing today relies heavily on accelerators; however, the cost of the hardware prevents equal access to heterogeneous programming. In this work we present Brook GLES Pi, a port of the accelerator programming language Brook. Our solution, primarily focused on the educational platform Raspberry Pi, makes it possible to teach, experiment with, and take advantage of heterogeneous programming on any low-cost embedded device featuring an OpenGL ES 2 GPU, democratising access to accelerator programming.
We propose and evaluate what we call Compressed-Leaf Bounding Volume Hierarchies (CLBVH), which strike a balance between compressed and non-compressed BVH layouts. Our CLBVH layout introduces dedicated compressed multi-leaf nodes where most effective at reducing memory use, and uses regular BVH nodes for inner nodes and small, isolated leaves. We show that when implemented within the Embree ray tracing framework, this approach achieves roughly the same memory savings as Embree's compressed BVH layout, while maintaining almost the full performance of its fastest non-compressed BVH.
In this paper we describe and evaluate an implementation of CPU-style SIMD ray traversal on the GPU. We show how spreading moderately wide BVHs (up to a branching factor of eight) across multiple threads in a warp can improve performance while not requiring expensive pre-processing. The presented ray-traversal method exhibits improved traversal performance especially for increasingly incoherent rays.
We introduce moment transparency, a new solution to real-time order-independent transparency. It expands upon existing approximate transmittance function techniques by using moments to capture and reconstruct the transmittance function. Because the moment-based transmittance function can be processed analytically using standard hardware blend operations, it is efficient and overcomes limitations of previous techniques.
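The capture step can be sketched as follows, assuming that each fragment contributes its optical depth, -ln(1 - alpha), weighted by powers of its normalized depth; this formulation, the function names, and the choice of four moments are assumptions for illustration, not details taken from the abstract. Because the moments are plain sums, additive blending can accumulate them in any fragment order. The reconstruction of transmittance at intermediate depths is not shown.

```python
import math


def accumulate_moments(fragments, order=4):
    """Sum optical-depth moments b_k = sum_i d_i * z_i**k for k = 0..order.

    fragments: iterable of (depth, alpha), depth in [0, 1], alpha in [0, 1).
    Each += below mimics what an additive hardware blend would do.
    """
    moments = [0.0] * (order + 1)
    for depth, alpha in fragments:
        if not 0.0 <= alpha < 1.0:
            raise ValueError("alpha must be in [0, 1)")
        optical_depth = -math.log(1.0 - alpha)
        for k in range(order + 1):
            moments[k] += optical_depth * depth ** k
    return moments


def total_transmittance(moments):
    """Transmittance behind all fragments: exp(-b_0) = product of (1 - alpha)."""
    return math.exp(-moments[0])


# Hypothetical fragments submitted in arbitrary order.
fragments = [(0.2, 0.5), (0.7, 0.25), (0.4, 0.1)]
moments = accumulate_moments(fragments)
```

Reversing or shuffling `fragments` yields the same moments up to floating-point rounding, which is what makes the approach order-independent. The zeroth moment alone recovers the total transmittance, here the product 0.5 × 0.75 × 0.9.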