High-performance GPU ray tracer with path tracing, BVH acceleration, and real-time ray sorting
- Preview
- Overview
- Features
- Quick Start
- Command Line Reference
- Architecture
- Testing
- Troubleshooting
- Roadmap
- Contributing
- License
| Demo Scene (Phong) | Cornell Box (Path Tracing) | Random Scene |
|---|---|---|
| 800×600, 1 spp, <1s | 640×480, 256 spp, ~60s | 800×600, 64 spp, ~30s |
💡 Note: Run the commands below to generate your own preview images:
./build/bin/ray_tracer -w 800 -h 600 -s 1 --scene demo -o assets/images/demo.ppm
./build/bin/ray_tracer -w 640 -h 480 -s 256 -d 10 -p --scene cornell -o assets/images/cornell.ppm
./build/bin/ray_tracer -w 800 -h 600 -s 64 -d 8 -p --scene random -o assets/images/random.ppm
A production-grade CUDA/C++ ray tracer featuring Blinn-Phong shading, Monte Carlo path tracing, BVH acceleration, and an optional ray sorting optimization for improved warp efficiency. Built from the ground up with GPU parallelism in mind.
| Resolution | Samples | Mode | Time | Notes |
|---|---|---|---|---|
| 800×600 | 1 | Phong | < 1s | ⚡ Instant |
| 1920×1080 | 256 | Path Tracing | ~60s | 🚀 Fast |
| 3840×2160 | 512 | Path Tracing | ~5min | 💎 4K Ready |
Test Environment: NVIDIA RTX 3080 (10GB), CUDA 12.4, Driver 550.54, Ubuntu 22.04, Release build
BVH acceleration reduces ray-object intersection queries from O(N) brute force to O(log N) on average, which is 10-100× faster for complex scenes.
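The O(log N) figure comes from pruning: a BVH traversal tests the ray against one bounding box per visited node and skips the whole subtree on a miss. Below is a minimal host-side C++ sketch of that box test (the slab method). The `Ray` and `Aabb` types and the `hit_aabb` function are illustrative names, not the project's actual API, and the code assumes IEEE 754 floats.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical minimal types for illustration; the project's own
// ray and AABB types may look different.
struct Ray  { float origin[3]; float dir[3]; };
struct Aabb { float lo[3]; float hi[3]; };

// Slab test: clip the ray's parameter interval [t_min, t_max] against the
// pair of axis-aligned planes on each axis. The box is hit if the interval
// is still non-empty after all three axes.
bool hit_aabb(const Aabb& box, const Ray& ray, float t_min, float t_max) {
    assert(t_min <= t_max);
    for (int axis = 0; axis < 3; ++axis) {
        // A zero direction component gives +/-inf here (IEEE 754 assumed)
        const float inv_d = 1.0f / ray.dir[axis];
        float t0 = (box.lo[axis] - ray.origin[axis]) * inv_d;
        float t1 = (box.hi[axis] - ray.origin[axis]) * inv_d;
        if (inv_d < 0.0f) std::swap(t0, t1);  // keep t0 as the entry point
        t_min = std::max(t_min, t0);
        t_max = std::min(t_max, t1);
        if (t_max < t_min) return false;      // interval became empty: miss
    }
    return true;
}
```

A traversal would call `hit_aabb` at each node and descend only into children whose boxes are hit, which is why most of the scene is never tested for a typical ray.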
| 💡 Rendering | 🎯 Acceleration | 🛠️ Engineering |
|---|---|---|
| Blinn-Phong shading | BVH tree structure | Comprehensive test suite |
| Monte Carlo path tracing | O(log N) queries | Unit + property tests |
| Cosine-weighted hemisphere sampling | Warp-divergence reduction | CI/CD pipeline |
| Russian roulette termination | Ray sorting optimization | clang-format enforced |
| ACES & Reinhard tone mapping | GPU memory optimization | Multi-scene support |
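Two of the rendering features listed above, cosine-weighted hemisphere sampling and Russian roulette termination, can be sketched on the host in a few lines. This is an illustration under assumptions rather than the project's kernel code: `Vec3`, `sample_cosine_hemisphere`, `russian_roulette`, and the 0.05 lower bound on the survival probability are all hypothetical choices.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };  // hypothetical stand-in for the project's vec3

// Map two uniform random numbers u1, u2 in [0, 1) to a unit direction in the
// local frame where the surface normal is +z. The resulting pdf is
// cos(theta) / pi, so the cosine term cancels for diffuse surfaces.
Vec3 sample_cosine_hemisphere(float u1, float u2) {
    assert(u1 >= 0.0f && u1 < 1.0f && u2 >= 0.0f && u2 < 1.0f);
    const float pi = 3.14159265358979f;
    const float r = std::sqrt(u1);         // radius of a point on the unit disk
    const float phi = 2.0f * pi * u2;      // angle of that point
    const float z = std::sqrt(1.0f - u1);  // lift the disk point onto the hemisphere
    return Vec3{r * std::cos(phi), r * std::sin(phi), z};
}

// Russian roulette: continue the path with probability p and divide the
// throughput by p so the estimator stays unbiased. Returns false when the
// path is terminated (throughput is then left unchanged).
// u is a uniform random number in [0, 1).
bool russian_roulette(Vec3& throughput, float u) {
    const float max_c = std::max(throughput.x, std::max(throughput.y, throughput.z));
    const float p = std::min(std::max(max_c, 0.05f), 1.0f);  // survival probability
    if (u >= p) return false;
    throughput = Vec3{throughput.x / p, throughput.y / p, throughput.z / p};
    return true;
}
```

The sampled direction still has to be rotated from the local frame into world space around the actual surface normal before it is used as a bounce direction.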
- 🎲 Path Tracing: Global illumination with cosine-weighted hemisphere sampling and Russian roulette termination for unbiased physically-based rendering
- 🌳 BVH Acceleration: Bounding volume hierarchy with SAH-based construction for efficient O(log N) ray-object intersection queries
- ⚡ Ray Sorting: Groups primary rays by hit object ID to reduce warp divergence by 20-40% in Phong single-sample mode
- 🎨 Material System: Predefined materials (matte, plastic, metal, mirror) with easy extension via the `Material` struct
- 🖼️ Tone Mapping: Reinhard and ACES filmic operators for HDR to LDR conversion with customizable exposure
- 📦 Scene Factory: Demo, Cornell Box, and random sphere scenes via the `SceneBuilder` fluent API
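The ray sorting idea from the list above can be illustrated on the host: reorder ray indices so that rays hitting the same object are adjacent, then shade in that order. The sketch below is a CPU analogue with hypothetical names (`sort_rays_by_hit_id`, `count_id_switches`); on the GPU a key-value sort such as Thrust's `sort_by_key` is a common way to do the same reordering, though which method this project uses is not stated here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Given the object ID hit by each primary ray (-1 = miss), return a
// permutation of ray indices in which rays hitting the same object are
// adjacent. Shading in this order lets neighbouring threads in a warp
// follow the same material branch.
std::vector<int> sort_rays_by_hit_id(const std::vector<int>& hit_ids) {
    std::vector<int> order(hit_ids.size());
    std::iota(order.begin(), order.end(), 0);  // 0, 1, 2, ...
    // stable_sort keeps the original pixel order inside each object group
    std::stable_sort(order.begin(), order.end(),
                     [&hit_ids](int a, int b) { return hit_ids[a] < hit_ids[b]; });
    return order;
}

// Number of adjacent rays (in the given order) that hit different objects:
// a rough proxy for how often neighbouring threads take different branches.
int count_id_switches(const std::vector<int>& hit_ids, const std::vector<int>& order) {
    assert(order.size() == hit_ids.size());
    int switches = 0;
    for (std::size_t i = 1; i < order.size(); ++i)
        if (hit_ids[order[i]] != hit_ids[order[i - 1]]) ++switches;
    return switches;
}
```

Comparing `count_id_switches` for the original pixel order and the sorted order gives a quick feel for how much coherence the sort buys on a given scene.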
- Graphics Research: Experiment with different rendering algorithms on GPU
- Education: Learn ray tracing, path tracing, and GPU programming concepts
- Procedural Content: Generate scene variations programmatically
- Performance Analysis: Benchmark BVH construction and traversal strategies
| Dependency | Version | Notes |
|---|---|---|
| CUDA Toolkit | 11.0+ | Required for GPU computation |
| CMake | 3.18+ | Build system |
| C++ Compiler | C++17 | Host compiler (GCC 9+, Clang 10+, MSVC 2019+) |
| NVIDIA GPU | Compute 7.5+ | Turing architecture or newer recommended |
| GPU Memory | 2GB+ | Minimum for 1080p rendering |
# Clone the repository
git clone https://github.com/LessUp/ray-tracer.git
cd ray-tracer
# Configure and build
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j$(nproc)
# Basic Phong rendering (800×600, 1 sample)
./build/bin/ray_tracer -w 800 -h 600 -s 1 --scene demo -o output.ppm
# Path tracing with 16 samples (Cornell Box)
./build/bin/ray_tracer -w 640 -h 480 -s 16 -d 5 -p --scene cornell -o cornell.ppm
# Ray sorting optimization (Phong + single sample only)
./build/bin/ray_tracer --scene demo --sort -w 640 -h 480 -s 1 -o sorted.ppm
# macOS
open output.ppm
# Linux (with ImageMagick)
convert output.ppm output.png && xdg-open output.png
# Or use your favorite image viewer that supports PPM
cd build
ctest --output-on-failure # Run all tests
| Flag | Description | Default | Range |
|---|---|---|---|
| `-w <width>` | Image width in pixels | 800 | 1 - 4096 |
| `-h <height>` | Image height in pixels | 600 | 1 - 4096 |
| `-s <samples>` | Samples per pixel (path tracing) | 1 | 1 - 4096 |
| `-d <depth>` | Maximum ray bounce depth | 5 | 1 - 100 |
| `-p` | Enable path tracing mode | Off | - |
| `--scene <name>` | Scene: `demo`, `cornell`, `random` | `demo` | - |
| `--sort` | Enable ray sorting (Phong + single sample) | Off | - |
| `-o <file>` | Output PPM file path | `output.ppm` | - |
| `--help` | Show help message | - | - |
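The ranges in the flag table imply that numeric arguments are validated before rendering starts. As a rough sketch of what such a check could look like (the helper `parse_in_range` is hypothetical and not taken from the project's `main.cu`):

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <string>

// Parse `text` as a base-10 integer and accept it only if it lies in
// [lo, hi]. Returns true and writes `out` on success; leaves `out`
// untouched otherwise. Hypothetical helper, not the project's parser.
bool parse_in_range(const std::string& text, long lo, long hi, long& out) {
    assert(lo <= hi);
    if (text.empty()) return false;
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(text.c_str(), &end, 10);
    if (errno != 0 || *end != '\0') return false;  // overflow or trailing junk
    if (value < lo || value > hi) return false;    // outside the documented range
    out = value;
    return true;
}
```

For example, `parse_in_range(argv[i], 1, 4096, width)` would accept `-w 1920` and reject `-w 0` or `-w 5000`, leaving the default in place so the caller can print an error.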
# Fast preview (low-res, Phong only)
./build/bin/ray_tracer -w 320 -h 240 -s 1 --scene demo
# High-quality path tracing (4K, 512 samples)
./build/bin/ray_tracer -w 3840 -h 2160 -s 512 -d 10 -p --scene cornell
# Test ray sorting performance
./build/bin/ray_tracer --scene demo --sort -w 1920 -h 1080 -s 1
# Memory-constrained rendering (reduce frame buffer size)
./build/bin/ray_tracer -w 640 -h 480 -s 1 --scene demo
┌─────────┐    ┌──────────────┐    ┌─────────────────┐    ┌───────────┐    ┌─────────┐
│ Camera  │ -> │ Primary Rays │ -> │ BVH/Brute       │ -> │ Shading   │ -> │ PPM     │
│         │    │ Generation   │    │ Force Intersect │    │ (Phong/PT)│    │ Output  │
└─────────┘    └──────────────┘    └─────────────────┘    └───────────┘    └─────────┘
                                                                │
                                                      ┌─────────▼────────┐
                                                      │ Tone Mapping     │
                                                      │ (Reinhard/ACES)  │
                                                      └──────────────────┘
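The tone mapping stage in the diagram compresses unbounded HDR radiance into the displayable [0, 1] range. A minimal per-channel sketch of both operators follows. The function names are illustrative, and the ACES curve uses Krzysztof Narkowicz's widely used fitted approximation, which is an assumption: the project may use a different ACES fit.

```cpp
#include <algorithm>
#include <cassert>

// Reinhard operator: maps [0, inf) smoothly onto [0, 1).
float tonemap_reinhard(float hdr, float exposure) {
    assert(exposure > 0.0f);
    const float x = hdr * exposure;
    return x / (1.0f + x);
}

// ACES filmic curve via Narkowicz's fitted approximation.
// Assumption: the project may use a different ACES fit.
float tonemap_aces(float hdr, float exposure) {
    assert(exposure > 0.0f);
    const float x = hdr * exposure;
    const float a = 2.51f, b = 0.03f, c = 2.43f, d = 0.59f, e = 0.14f;
    const float y = (x * (a * x + b)) / (x * (c * x + d) + e);
    return std::min(std::max(y, 0.0f), 1.0f);  // clamp to the displayable range
}
```

Each function would be applied to the red, green, and blue channels separately before the result is quantized to 8 bits for the PPM file.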
include/
├── core/ # vec3, ray, constants, CUDA utilities
├── geometry/ # AABB, BVH, sphere/plane intersections
├── rendering/ # Camera, materials, lights, kernels
├── scene/ # Scene management & factory
└── image/ # PPM output & tone mapping
src/
└── main.cu # CLI entry point
tests/
├── unit/ # Component tests (GoogleTest)
└── property/ # Mathematical invariants
All scene data is allocated on GPU once at scene creation. The frame buffer is transferred to CPU only at render completion, minimizing host-device communication overhead.
| Resolution | Frame Buffer Size | Scene Data |
|---|---|---|
| 1080p | ~24 MB | ~1-10 MB |
| 4K | ~96 MB | ~1-10 MB |
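The frame buffer sizes in the table are consistent with three 32-bit float channels per pixel. That layout is an inference from the numbers, not something the source states, and `frame_buffer_bytes` is a hypothetical helper that reproduces the arithmetic:

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for a width x height frame buffer, assuming three 32-bit
// float channels (RGB) per pixel. The layout is an assumption chosen
// because it reproduces the documented sizes.
std::size_t frame_buffer_bytes(std::size_t width, std::size_t height) {
    assert(width > 0 && height > 0);
    const std::size_t bytes_per_pixel = 3 * sizeof(float);  // 12 bytes with 4-byte floats
    return width * height * bytes_per_pixel;
}
```

With this layout, 1920×1080 needs 24,883,200 bytes (about 24 MB) and 3840×2160 needs 99,532,800 bytes, four times as much, which matches the roughly 96 MB quoted for 4K.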
📚 See Architecture Documentation for detailed design decisions.
| Test Type | Purpose | Examples |
|---|---|---|
| Unit Tests | Verify individual components | vec3 operations, ray creation, camera setup |
| Property Tests | Verify mathematical invariants | Dot product properties, BVH correctness |
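To make the difference from a unit test concrete: a property test checks an invariant over many generated inputs instead of one hand-picked case. The project's tests use GoogleTest; the sketch below is plain C++ so that it stands alone, and `Vec3`, `dot`, and `dot_properties_hold` are illustrative names rather than the project's test code.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct Vec3 { float x, y, z; };  // hypothetical stand-in for the project's vec3

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Check two invariants on `trials` random vectors with components in [-1, 1):
//   commutativity  dot(a, b) == dot(b, a)
//   homogeneity    dot(k * a, b) ~= k * dot(a, b)   (up to rounding)
// Returns true if every trial satisfies both.
bool dot_properties_hold(unsigned seed, int trials) {
    assert(trials > 0);
    std::mt19937 rng(seed);  // fixed seed keeps failures reproducible
    std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
    for (int i = 0; i < trials; ++i) {
        const Vec3 a{dist(rng), dist(rng), dist(rng)};
        const Vec3 b{dist(rng), dist(rng), dist(rng)};
        const float k = dist(rng);
        if (dot(a, b) != dot(b, a)) return false;
        const Vec3 ka{k * a.x, k * a.y, k * a.z};
        if (std::fabs(dot(ka, b) - k * dot(a, b)) > 1e-5f) return false;
    }
    return true;
}
```

In a GoogleTest suite the same loop would sit inside a `TEST` body with `EXPECT_EQ` and `EXPECT_NEAR` in place of the early returns.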
cd build
./bin/unit_tests # Run unit tests only
./bin/property_tests # Run property tests only
ctest --output-on-failure # Run all via CTest
❌ CUDA Error: "no CUDA-capable device is detected"
Solution:
- Check NVIDIA driver: `nvidia-smi`
- Verify CUDA installation: `nvcc --version`
- Ensure your GPU has Compute Capability 7.5+ (Turing or newer recommended)
- Check if GPU is available in Docker: `--gpus all` flag
❌ Compile Error: "nvcc not found"
Solution:
# Add CUDA to PATH
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
For Ubuntu, install CUDA Toolkit:
sudo apt-get install nvidia-cuda-toolkit
❌ Runtime Error: "out of memory"
Solution: Reduce resolution or samples:
# Use smaller resolution
./build/bin/ray_tracer -w 640 -h 480 -s 1 --scene demo
# Monitor GPU memory
nvidia-smi
❌ Output image is blank/black
Possible causes:
- Camera position may be inside geometry
- Scene may have no lights (check scene definition)
- Ray depth too low for path tracing (`-d` parameter)
Debug: Try Phong mode first (without -p) to verify scene geometry.
❌ Tests fail with "CUDA error"
GPU tests require an NVIDIA GPU. On GitHub Actions (CPU-only), only build and format checks run.
Run tests locally:
cd build
cmake -DCMAKE_BUILD_TYPE=Debug ..
make -j$(nproc)
ctest --output-on-failure
| Version | Feature | Status |
|---|---|---|
| 2.0.1 | ✅ Current Release | Released |
| 2.1.0 | 🔄 Multi-GPU support | Planned |
| 2.1.0 | 🔄 Denoising (OIDN integration) | Planned |
| 2.2.0 | 📝 Textures & UV mapping | Backlog |
| 2.2.0 | 📝 Triangle mesh support (OBJ/GLTF) | Backlog |
| 3.0.0 | 📝 Real-time preview window | Backlog |
We welcome contributions! Please see our Contributing Guide for:
- Development workflow
- Code style guidelines (clang-format enforced)
- Testing requirements
- Pull request process
# 1. Fork and clone
git clone https://github.com/YOUR_USERNAME/ray-tracer.git
cd ray-tracer
# 2. Create a branch
git checkout -b feature/your-feature-name
# 3. Make changes and format code
# (clang-format will run automatically on commit)
# 4. Test locally
cd build && ctest --output-on-failure
# 5. Push and create PR
git push origin feature/your-feature-name
🔄 CI/CD Pipeline Details
GitHub Actions validates every push:
| Check | Environment | Status |
|---|---|---|
| Build | CUDA 12.4.1 (Ubuntu 22.04) | ✅ Enforced |
| clang-format | Latest Ubuntu | ✅ Enforced |
| Tests | Local GPU required | Not run in CI |
Note: GPU tests are not run on GitHub-hosted runners (no GPU available). Run `ctest` locally on a CUDA-capable machine.
| Resource | Description |
|---|---|
| 📖 Quick Start Guide | Get rendering in 5 minutes |
| 📋 API Reference | Complete API documentation |
| 🏗️ Architecture Design | Rendering pipeline & design decisions |
Distributed under the MIT License. See LICENSE for details.
- Inspired by Ray Tracing in One Weekend by Peter Shirley
- Built with CUDA and C++17
- Tested with GoogleTest framework