TL;DR Unlike Gaussian splatting, Gaussian ray tracing is sensitive to per-pixel intersection counts.
We designed a ray tracer that minimizes intersections efficiently, primarily by initializing Gaussians small.
Our ray tracer optimizes and renders near 3DGS's speed while maintaining most of 3DGRT's quality.
Abstract
3D Gaussian Splatting (3DGS) is a popular representation for radiance field reconstruction, distinguished by the rendering speed of its rasterization-based renderer. While 3D Gaussians can also be ray traced, this approach has so far been slower, with 3D Gaussian Ray Tracing (3DGRT) taking nearly one order of magnitude longer to optimize. To address this, we present GRay, a fast ray tracer for 3D Gaussians designed to close this performance gap.
Our method leverages the algorithmic difference between the two approaches: unlike rasterization, ray tracing evaluates only Gaussians that are actually intersected by a ray, leading to potentially logarithmic rather than linear scaling in the number of primitives. This property allows ray tracing to better exploit dense scenes composed of numerous tiny Gaussians, a configuration which has largely been overlooked. Notably, we show that dense initialization, which creates many small Gaussians, slows down rasterization but speeds up ray tracing. Designed to leverage this effect, our ray tracer GRay renders nearly 4× faster and optimizes nearly 10× faster than 3DGRT while maintaining similar quality. Its speed is competitive with 3DGS, albeit with somewhat lower quality.
INSIGHT 1
Gaussian Splatting’s tile-based rasterizer scales linearly with the number of Gaussians in a tile, while ray tracing’s complexity can be logarithmic. This is because rasterization processes all Gaussians assigned to a tile, while ray tracing uses a BVH to select only the Gaussians intersected by each ray.
In practice, this means that shrinking Gaussians speeds up ray tracing, while splatting’s performance stays constant past a certain point. The figure on the right illustrates this with a toy experiment rendering a single 16×16 tile.
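The scaling argument can be illustrated with a minimal counting sketch. It is not the toy experiment from the figure: it assumes isotropic 2D footprints modeled as disks, a hypothetical count of 200 Gaussians with centers inside one 16×16 tile, and it ignores BVH traversal cost entirely. The tile count charges every pixel for every Gaussian overlapping the tile, while the ray count charges a pixel only for Gaussians that cover its center.

```python
import random

TILE = 16  # tile side length in pixels, matching the 16x16 tile in the text

def count_evaluations(centers, radius):
    """Return (tile_evals, ray_evals) for disk footprints of the given radius.

    tile_evals: every pixel evaluates every Gaussian overlapping the tile.
    ray_evals:  each pixel evaluates only Gaussians covering its center.
    """
    in_tile = []
    for cx, cy in centers:
        # Distance from the disk center to the closest point of the tile.
        dx = max(0.0 - cx, 0.0, cx - TILE)
        dy = max(0.0 - cy, 0.0, cy - TILE)
        if dx * dx + dy * dy <= radius * radius:
            in_tile.append((cx, cy))

    tile_evals = len(in_tile) * TILE * TILE

    ray_evals = 0
    for py in range(TILE):
        for px in range(TILE):
            x, y = px + 0.5, py + 0.5  # pixel center
            for cx, cy in in_tile:
                if (x - cx) ** 2 + (y - cy) ** 2 <= radius * radius:
                    ray_evals += 1
    return tile_evals, ray_evals

random.seed(0)
# Hypothetical scene: 200 Gaussians with centers inside the tile.
centers = [(random.uniform(0, TILE), random.uniform(0, TILE)) for _ in range(200)]
RADII = (8.0, 4.0, 2.0, 1.0, 0.5)
for radius in RADII:
    tile_evals, ray_evals = count_evaluations(centers, radius)
    print(f"radius={radius:4.1f}  tile={tile_evals:6d}  ray={ray_evals:6d}")
```

Because every center lies inside the tile, the tile count stays fixed however small the disks become, while the per-ray count keeps falling as the radius shrinks.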
INSIGHT 2
Dense initialization initializes scenes with millions of tiny Gaussians; it was designed for 3DGS to alleviate the need for densification and reduce the number of necessary optimization steps. However, it also increases the computational cost of optimization steps due to larger initial Gaussian counts.
We found that ray tracing directly benefits from dense initialization’s smaller Gaussian sizes, since they reduce intersection counts. While splatting slows down, ray tracing speeds up under dense initialization.
Method
Based on these insights, we propose a method that better exploits ray tracing's algorithmic complexity and its affinity for smaller Gaussians. We build our method on 3DGRT and our previous project by curating their techniques and modifying the training regimen to better exploit large initial counts of smaller Gaussians, obtained with dense initialization.
We initialize Gaussians tiny and keep them small, reducing intersection counts directly.
Minimizing Gaussian size reduces unnecessary overlap that bloats intersection counts. We initialize Gaussians small with dense initialization and regularize size with scale decay and early stopping. We further use learning rate schedules to match convergence at half the iteration count given dense initialization.
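As a rough illustration of the two ingredients named above, the sketch below pairs a log-linear learning rate schedule (the common exponential decay used in Gaussian splatting codebases) with a multiplicative scale decay. The function names, the decay factor, and the learning rate values are all assumptions made for illustration; the actual schedules and regularizer may differ.

```python
import math

def expon_lr(step, lr_init, lr_final, max_steps):
    """Log-linear interpolation from lr_init (step 0) to lr_final (max_steps)."""
    t = min(max(step / max_steps, 0.0), 1.0)
    return math.exp((1.0 - t) * math.log(lr_init) + t * math.log(lr_final))

def decay_scales(scales, decay=0.999):
    """Hypothetical scale decay: shrink every Gaussian scale by a constant factor."""
    return [s * decay for s in scales]

# Hypothetical learning rates, chosen only for illustration.
LR_INIT, LR_FINAL = 1.6e-4, 1.6e-6

for step in (0, 7_500, 15_000):
    full = expon_lr(step, LR_INIT, LR_FINAL, max_steps=30_000)
    half = expon_lr(step, LR_INIT, LR_FINAL, max_steps=15_000)
    print(f"step={step:6d}  30K schedule={full:.2e}  15K schedule={half:.2e}")
```

Halving `max_steps` compresses the same decay curve into half as many iterations, so the 15K schedule reaches at step 7,500 the rate that the 30K schedule reaches at step 15,000.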
Dense initialization bloats initial Gaussian counts; we address this to keep training fast.
Dense initialization starts with millions of Gaussians directly, roughly 40× more than before. We address these counts with initialization binning and by pruning aggressively with a weight-based criterion. We also show that bounding Gaussians with oriented boxes is key to keeping BVH updates fast in this regime.
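The sketch below illustrates only the geometric difference between an oriented box and a world-axis-aligned box around an anisotropic Gaussian. It does not model BVH update cost, which is the benefit the text refers to. The 3-sigma cutoff, the example scales, and the function names are assumptions made for illustration.

```python
import math

def rotation_z(angle):
    """3x3 rotation matrix about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def obb_volume(scales, k=3.0):
    """Box aligned with the Gaussian's own axes, with half-extents k * scale."""
    sx, sy, sz = scales
    return 8.0 * k**3 * sx * sy * sz

def aabb_volume(rotation, scales, k=3.0):
    """Tight world-axis-aligned box around the k-sigma ellipsoid.

    The half-extent along world axis i is k * sqrt(sum_j (R[i][j] * s[j])^2).
    """
    volume = 8.0
    for row in rotation:
        volume *= k * math.sqrt(sum((r * s) ** 2 for r, s in zip(row, scales)))
    return volume

# Hypothetical thin, elongated Gaussian rotated 45 degrees about z.
scales = (1.0, 0.05, 0.05)
rotation = rotation_z(math.pi / 4)
print("OBB volume :", obb_volume(scales))
print("AABB volume:", aabb_volume(rotation, scales))
```

For an unrotated Gaussian the two volumes coincide. Once an elongated Gaussian is rotated away from the world axes, the axis-aligned box grows while the oriented box does not, and that extra empty space can produce candidate hits that miss the Gaussian itself.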
We readjust and revalidate existing design choices for the dense initialization regime.
Dense initialization impacts the performance profile of most components, so we revalidate existing design choices and improve them where possible. For instance, we optimize our ray tracer's in-memory storage for intersected Gaussians and improve the truncated-Gaussian approximation used in early ray termination.
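For context, early ray termination in its generic form stops compositing once the remaining transmittance falls below a threshold. The sketch below shows only that generic form, with scalar colors and a hypothetical threshold; it does not model the truncated-Gaussian approximation or the in-memory storage mentioned above.

```python
def composite(samples, min_transmittance=0.1):
    """Front-to-back alpha compositing over (color, alpha) samples sorted by depth.

    Returns (color, transmittance, number of samples evaluated).
    """
    color, transmittance, used = 0.0, 1.0, 0
    for sample_color, alpha in samples:
        # Early termination: later samples can no longer contribute much.
        if transmittance < min_transmittance:
            break
        color += transmittance * alpha * sample_color
        transmittance *= 1.0 - alpha
        used += 1
    return color, transmittance, used

# Ten identical, half-opaque samples along one ray (hypothetical values).
samples = [(1.0, 0.5)] * 10
color, transmittance, used = composite(samples)
print(f"color={color}, transmittance={transmittance}, evaluated={used} of {len(samples)}")
```

With these values the loop stops after four of the ten samples, since the transmittance has dropped to 0.0625 by then.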
RESULTS 1
Our method achieves quality close to 3DGRT at speeds comparable to 3DGS with and without dense initialization. We report averages over all 13 benchmarking scenes, measured on an RTX 4090.
| Method | Init | #Iters | PSNR↑ | SSIM↑ | LPIPS↓ | Init Time↓ | Opt Time↓ | FPS↑ |
|---|---|---|---|---|---|---|---|---|
| 3DGS | Sparse | 30K | 27.10 | 0.831 | 0.262 | 00:00 | 06:18 | 253 |
| 3DGRT | Sparse | 30K | 26.77 | 0.828 | 0.258 | 00:00 | 55:01 | 68 |
| GRay | Dense | 15K | 26.47 | 0.819 | 0.236 | 01:58 | 05:40 | 248 |
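The speedup figures quoted in the abstract can be checked against this table with a few lines of arithmetic. The sketch assumes the time columns are formatted as minutes:seconds; the helper name `to_seconds` is ours.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string to seconds (assumed time format)."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# Values copied from the table above.
rows = {
    "3DGS":  {"init": "00:00", "opt": "06:18", "fps": 253},
    "3DGRT": {"init": "00:00", "opt": "55:01", "fps": 68},
    "GRay":  {"init": "01:58", "opt": "05:40", "fps": 248},
}

def total_seconds(name):
    """Initialization time plus optimization time, in seconds."""
    return to_seconds(rows[name]["init"]) + to_seconds(rows[name]["opt"])

opt_speedup = to_seconds(rows["3DGRT"]["opt"]) / to_seconds(rows["GRay"]["opt"])
total_speedup = total_seconds("3DGRT") / total_seconds("GRay")
fps_speedup = rows["GRay"]["fps"] / rows["3DGRT"]["fps"]

print(f"Optimization speedup over 3DGRT: {opt_speedup:.1f}x")
print(f"Speedup including init time:     {total_speedup:.1f}x")
print(f"Rendering speedup over 3DGRT:    {fps_speedup:.2f}x")
```

Under that assumption, optimization alone is about 9.7× faster than 3DGRT and rendering about 3.65× faster, in line with the "nearly 10×" and "nearly 4×" figures. Counting the dense initialization time, the end-to-end speedup is about 7.2×.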
RESULTS 2
Selected scenes from the results page are presented in the following project video. In general, our method's quality is close to that of 3DGRT with dense initialization, notwithstanding existing limitations.
Discussion
Our method speeds up 3DGRT significantly while staying near its quality. Yet, splatting continues to outperform ray tracing, achieving better reconstruction quality at comparable frame rates, or higher frame rates at equal quality. Not only is splatting usually faster at higher resolutions, concurrent work also shows its performance can still be improved. For these reasons, rasterization is unlikely to be fully replaced.
Regardless, Gaussian ray tracing naturally addresses several key limitations of 3D Gaussian Splatting, namely popping artifacts, the lack of perspective-correct rendering, and limited support for arbitrary camera models. More importantly, it opens the door to light transport simulation via path tracing. As such, we expect ray tracing to see wider adoption in the future and hope the significant speed increase provided by our method will become a key enabler for future work.
BibTeX
@article{poirierginter2026gray,
author = {Poirier-Ginter, Yohan and Lalonde, Jean-Fran\c{c}ois and Drettakis, George},
title = {GRay: Ray Tracing 3D Gaussians Near the Speed of Splats},
year = {2026},
issue_date = {May 2026},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {9},
number = {1},
url = {https://doi.org/10.1145/3804496},
doi = {10.1145/3804496},
month = may,
articleno = {14},
numpages = {19}
}
Acknowledgments
Thanks to Jeffrey Hu for helping with the code and pointing us towards dense initialization.
Thanks to Ishaan Shah for the Gaussian Viewer.
This research was co-funded by the European Union (EU) ERC Advanced Grant NERPHYS No 101141721. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the EU or the European Research Council. Neither the EU nor the granting authority can be held responsible for them. Experiments presented in this paper were carried out using the Grid'5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations. This research was also supported by NSERC grant RGPIN-2020-04799 and the Digital Research Alliance of Canada. The authors are grateful to Adobe and NVIDIA for generous donations.