🎨 CUDA Ray Tracer

High-performance GPU ray tracer with path tracing, BVH acceleration, and optional ray sorting



🖼️ Preview

| Demo Scene (Phong) | Cornell Box (Path Tracing) | Random Scene |
|---|---|---|
| [image: Demo] | [image: Cornell] | [image: Random] |
| 800×600, 1 spp, <1s | 640×480, 256 spp, ~60s | 800×600, 64 spp, ~30s |

💡 Note: Run the commands below to generate your own preview images:

./build/bin/ray_tracer -w 800 -h 600 -s 1 --scene demo -o assets/images/demo.ppm
./build/bin/ray_tracer -w 640 -h 480 -s 256 -d 10 -p --scene cornell -o assets/images/cornell.ppm
./build/bin/ray_tracer -w 800 -h 600 -s 64 -d 8 -p --scene random -o assets/images/random.ppm

🌟 Overview

A production-grade CUDA/C++ ray tracer featuring Blinn-Phong shading, Monte Carlo path tracing, BVH acceleration, and an optional ray sorting optimization for improved warp efficiency. Built from the ground up with GPU parallelism in mind.

📊 Performance Benchmarks (RTX 3080)

| Resolution | Samples | Mode | Time | Notes |
|---|---|---|---|---|
| 800×600 | 1 | Phong | < 1s | ⚡ Instant |
| 1920×1080 | 256 | Path Tracing | ~60s | 🚀 Fast |
| 3840×2160 | 512 | Path Tracing | ~5min | 💎 4K Ready |

Test Environment: NVIDIA RTX 3080 (10GB), CUDA 12.4, Driver 550.54, Ubuntu 22.04, Release build

BVH acceleration provides O(log N) intersection queries vs O(N) brute-force — 10-100× faster for complex scenes.
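
The logarithmic behaviour comes from pruning: a ray descends into a BVH node only if it hits that node's bounding box. The standard slab test behind that check can be sketched in host-side C++ as follows. This is an illustration under stated assumptions, not the project's kernel code: the `Vec3`, `Aabb`, and `hit_aabb` names are hypothetical, and the real types under `include/core` and `include/geometry` may look different.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical minimal types for illustration only; the project's own
// vec3, ray, and AABB types may differ.
struct Vec3 { float x, y, z; };
struct Aabb { Vec3 lo, hi; };

// Slab test: returns true if the ray origin + t * dir overlaps the box
// for some t in [t_min, t_max].
inline bool hit_aabb(const Aabb& box, const Vec3& origin, const Vec3& dir,
                     float t_min, float t_max) {
    assert(t_min <= t_max);
    const float o[3]  = {origin.x, origin.y, origin.z};
    const float d[3]  = {dir.x, dir.y, dir.z};
    const float lo[3] = {box.lo.x, box.lo.y, box.lo.z};
    const float hi[3] = {box.hi.x, box.hi.y, box.hi.z};
    for (int axis = 0; axis < 3; ++axis) {
        // For a zero direction component, IEEE 754 division gives +/-inf,
        // which the comparisons below handle in the common cases.
        const float inv = 1.0f / d[axis];
        float t0 = (lo[axis] - o[axis]) * inv;
        float t1 = (hi[axis] - o[axis]) * inv;
        if (inv < 0.0f) std::swap(t0, t1);  // keep t0 as the entry distance
        t_min = std::max(t_min, t0);
        t_max = std::min(t_max, t1);
        if (t_max < t_min) return false;    // slab intervals do not overlap
    }
    return true;
}
```

A traversal would call a test like this at each node and skip the whole subtree on a miss, which is where the saving over testing every object comes from.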


✨ Features

| 💡 Rendering | 🎯 Acceleration | 🛠️ Engineering |
|---|---|---|
| Blinn-Phong shading | BVH tree structure | Comprehensive test suite |
| Monte Carlo path tracing | O(log N) queries | Unit + property tests |
| Cosine-weighted hemisphere sampling | Warp-divergence reduction | CI/CD pipeline |
| Russian roulette termination | Ray sorting optimization | clang-format enforced |
| ACES & Reinhard tone mapping | GPU memory optimization | Multi-scene support |

🔬 Technical Highlights

  • 🎲 Path Tracing: Global illumination with cosine-weighted hemisphere sampling and Russian roulette termination for unbiased physically-based rendering
  • 🌳 BVH Acceleration: Bounding volume hierarchy with SAH-based construction for efficient O(log N) ray-object intersection queries
  • ⚡ Ray Sorting: Groups primary rays by hit object ID to reduce warp divergence by 20-40% in Phong single-sample mode
  • 🎨 Material System: Predefined materials (matte, plastic, metal, mirror) with easy extension via Material struct
  • 🖼️ Tone Mapping: Reinhard and ACES filmic operators for HDR to LDR conversion with customizable exposure
  • 📦 Scene Factory: Demo, Cornell Box, and random sphere scenes via SceneBuilder fluent API
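
The two sampling techniques named in the path tracing highlight can be sketched in a few lines of host-side C++. This is a minimal sketch under assumptions, not the project's device code: `Vec3`, `sample_cosine_hemisphere`, `RouletteResult`, and `russian_roulette` are hypothetical names, the hemisphere is oriented around +Z (a real renderer would rotate the result into the surface's local frame), and the 0.95 cap on the survival probability is an illustrative choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };  // hypothetical stand-in for the project's vec3

// Maps two uniform random numbers u1, u2 in [0, 1) to a unit direction on the
// hemisphere around +Z with pdf(dir) = cos(theta) / pi. Uses Malley's method:
// sample the unit disk uniformly, then project up onto the hemisphere.
inline Vec3 sample_cosine_hemisphere(float u1, float u2) {
    assert(u1 >= 0.0f && u1 < 1.0f && u2 >= 0.0f && u2 < 1.0f);
    const float kTwoPi = 6.28318530718f;
    const float r   = std::sqrt(u1);
    const float phi = kTwoPi * u2;
    const float z   = std::sqrt(std::max(0.0f, 1.0f - u1));
    return {r * std::cos(phi), r * std::sin(phi), z};
}

struct RouletteResult {
    bool survive;
    Vec3 throughput;
};

// Russian roulette: continue the path with probability p (taken from the
// largest throughput channel, capped at 0.95) and divide the surviving
// throughput by p so the expected contribution is unchanged.
inline RouletteResult russian_roulette(Vec3 throughput, float u) {
    const float p = std::min(
        0.95f, std::max({throughput.x, throughput.y, throughput.z}));
    if (p <= 0.0f || u >= p) return {false, {0.0f, 0.0f, 0.0f}};
    const float inv = 1.0f / p;
    return {true, {throughput.x * inv, throughput.y * inv, throughput.z * inv}};
}
```

Dividing by the survival probability is what keeps the estimator unbiased: paths are terminated at random, and the survivors are weighted up to compensate.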

🎯 Use Cases

  • Graphics Research: Experiment with different rendering algorithms on GPU
  • Education: Learn ray tracing, path tracing, and GPU programming concepts
  • Procedural Content: Generate scene variations programmatically
  • Performance Analysis: Benchmark BVH construction and traversal strategies

🚀 Quick Start

Prerequisites

| Dependency | Version | Notes |
|---|---|---|
| CUDA Toolkit | 11.0+ | Required for GPU computation |
| CMake | 3.18+ | Build system |
| C++ Compiler | C++17 | Host compiler (GCC 9+, Clang 10+, MSVC 2019+) |
| NVIDIA GPU | Compute 7.5+ | Turing architecture or newer recommended |
| GPU Memory | 2GB+ | Minimum for 1080p rendering |

1️⃣ Build

# Clone the repository
git clone https://github.com/LessUp/ray-tracer.git
cd ray-tracer

# Configure and build
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j$(nproc)

2️⃣ Run

# Basic Phong rendering (800×600, 1 sample)
./build/bin/ray_tracer -w 800 -h 600 -s 1 --scene demo -o output.ppm

# Path tracing with 16 samples (Cornell Box)
./build/bin/ray_tracer -w 640 -h 480 -s 16 -d 5 -p --scene cornell -o cornell.ppm

# Ray sorting optimization (Phong + single sample only)
./build/bin/ray_tracer --scene demo --sort -w 640 -h 480 -s 1 -o sorted.ppm

3️⃣ View Output

# macOS
open output.ppm

# Linux (with ImageMagick)
convert output.ppm output.png && xdg-open output.png

# Or use your favorite image viewer that supports PPM

4️⃣ Test

cd build
ctest --output-on-failure  # Run all tests

📖 Command Line Reference

| Flag | Description | Default | Range |
|---|---|---|---|
| `-w <width>` | Image width in pixels | 800 | 1 - 4096 |
| `-h <height>` | Image height in pixels | 600 | 1 - 4096 |
| `-s <samples>` | Samples per pixel (path tracing) | 1 | 1 - 4096 |
| `-d <depth>` | Maximum ray bounce depth | 5 | 1 - 100 |
| `-p` | Enable path tracing mode | Off | |
| `--scene <name>` | Scene: demo, cornell, random | demo | |
| `--sort` | Enable ray sorting (Phong + single sample) | Off | |
| `-o <file>` | Output PPM file path | output.ppm | |
| `--help` | Show help message | | |

💡 Pro Tips

# Fast preview (low-res, Phong only)
./build/bin/ray_tracer -w 320 -h 240 -s 1 --scene demo

# High-quality path tracing (4K, 512 samples)
./build/bin/ray_tracer -w 3840 -h 2160 -s 512 -d 10 -p --scene cornell

# Test ray sorting performance
./build/bin/ray_tracer --scene demo --sort -w 1920 -h 1080 -s 1

# Memory-constrained rendering (reduce frame buffer size)
./build/bin/ray_tracer -w 640 -h 480 -s 1 --scene demo

🏗️ Architecture

Rendering Pipeline

┌─────────┐    ┌──────────────┐    ┌─────────────────┐    ┌───────────┐    ┌─────────┐
│ Camera  │ -> │ Primary Rays │ -> │ BVH/Brute       │ -> │ Shading   │ -> │ PPM     │
│         │    │ Generation   │    │ Force Intersect │    │ (Phong/PT)│    │ Output  │
└─────────┘    └──────────────┘    └─────────────────┘    └───────────┘    └─────────┘
                                                                                │
                                                                      ┌─────────▼────────┐
                                                                      │ Tone Mapping     │
                                                                      │ (Reinhard/ACES)  │
                                                                      └──────────────────┘
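
For the tone mapping stage, the two operators can be sketched per channel in host-side C++. The function names, the `ToneMap` enum, and the gamma 2.2 encode are assumptions made for illustration. The ACES curve shown is the widely used Narkowicz fit, which may not be the exact variant this project implements.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Reinhard operator: maps [0, inf) smoothly into [0, 1).
inline float tonemap_reinhard(float x) {
    assert(x >= 0.0f);
    return x / (1.0f + x);
}

// ACES filmic approximation (Narkowicz curve fit), clamped to [0, 1].
// Assumption: the project's ACES operator may use different constants.
inline float tonemap_aces(float x) {
    assert(x >= 0.0f);
    const float a = 2.51f, b = 0.03f, c = 2.43f, d = 0.59f, e = 0.14f;
    const float y = (x * (a * x + b)) / (x * (c * x + d) + e);
    return std::min(1.0f, std::max(0.0f, y));
}

enum class ToneMap { Reinhard, Aces };  // hypothetical selector

// Converts one linear HDR channel to an 8-bit value:
// exposure scale -> tone map -> gamma 2.2 encode -> quantize.
inline unsigned char to_ldr(float hdr, float exposure, ToneMap op) {
    const float scaled = std::max(0.0f, hdr * exposure);
    const float mapped = (op == ToneMap::Aces) ? tonemap_aces(scaled)
                                               : tonemap_reinhard(scaled);
    const float encoded = std::pow(mapped, 1.0f / 2.2f);
    return static_cast<unsigned char>(std::lround(encoded * 255.0f));
}
```

Reinhard never quite reaches white, while the ACES fit saturates to 1.0 for bright inputs, so the same exposure setting can look noticeably different between the two.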

Project Structure

include/
├── core/           # vec3, ray, constants, CUDA utilities
├── geometry/       # AABB, BVH, sphere/plane intersections
├── rendering/      # Camera, materials, lights, kernels
├── scene/          # Scene management & factory
└── image/          # PPM output & tone mapping

src/
└── main.cu         # CLI entry point

tests/
├── unit/           # Component tests (GoogleTest)
└── property/       # Mathematical invariants

Memory Model

All scene data is allocated on GPU once at scene creation. The frame buffer is transferred to CPU only at render completion, minimizing host-device communication overhead.

| Resolution | Frame Buffer Size | Scene Data |
|---|---|---|
| 1080p | ~24 MB | ~1-10 MB |
| 4K | ~96 MB | ~1-10 MB |
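
The frame buffer figures are consistent with three 32-bit float channels per pixel. That layout is an assumption inferred from the numbers, not something this README states, and `frame_buffer_bytes` is a hypothetical helper used only to show the arithmetic:

```cpp
#include <cstddef>

// Assumption: the frame buffer stores 3 float channels (RGB) per pixel.
// The project's actual layout may differ.
constexpr std::size_t frame_buffer_bytes(
    std::size_t width, std::size_t height, std::size_t channels = 3,
    std::size_t bytes_per_channel = sizeof(float)) {
    return width * height * channels * bytes_per_channel;
}

// 1920 * 1080 * 3 * 4 bytes
static_assert(frame_buffer_bytes(1920, 1080) == 24883200,
              "1080p is about 24.9 MB");
// 3840 * 2160 * 3 * 4 bytes
static_assert(frame_buffer_bytes(3840, 2160) == 99532800,
              "4K is about 99.5 MB");
```

Under that assumption a 1080p buffer is about 24.9 MB and a 4K buffer about 99.5 MB (roughly 95 MiB), close to the rounded values in the table. Frame buffer size grows with pixel count only, which is why lowering `-w` and `-h` is the direct way to reduce it.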

📚 See Architecture Documentation for detailed design decisions.


🧪 Testing Strategy

| Test Type | Purpose | Examples |
|---|---|---|
| Unit Tests | Verify individual components | vec3 operations, ray creation, camera setup |
| Property Tests | Verify mathematical invariants | Dot product properties, BVH correctness |

cd build
./bin/unit_tests       # Run unit tests only
./bin/property_tests   # Run property tests only
ctest --output-on-failure  # Run all via CTest
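
To make "mathematical invariants" concrete, a property test draws many random inputs and checks that a law holds for all of them. The sketch below checks two dot product properties in plain C++. It is illustrative only: the project's tests use GoogleTest, and `Vec3`, `dot`, and `dot_product_properties_hold` are hypothetical names rather than the repository's API.

```cpp
#include <cassert>
#include <cmath>
#include <random>

struct Vec3 { float x, y, z; };  // hypothetical stand-in for the project's vec3

inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Checks two invariants on `trials` random vector pairs:
//   1. commutativity:   a.b == b.a
//   2. Cauchy-Schwarz:  (a.b)^2 <= (a.a)(b.b), within a float tolerance
inline bool dot_product_properties_hold(unsigned seed, int trials) {
    assert(trials > 0);
    std::mt19937 rng(seed);  // fixed seed keeps failures reproducible
    std::uniform_real_distribution<float> dist(-10.0f, 10.0f);
    for (int i = 0; i < trials; ++i) {
        const Vec3 a{dist(rng), dist(rng), dist(rng)};
        const Vec3 b{dist(rng), dist(rng), dist(rng)};
        if (dot(a, b) != dot(b, a)) return false;
        const float lhs = dot(a, b) * dot(a, b);
        const float rhs = dot(a, a) * dot(b, b);
        if (lhs > rhs * (1.0f + 1e-4f) + 1e-4f) return false;
    }
    return true;
}
```

In a GoogleTest suite the same check would typically sit inside a `TEST` body with `EXPECT_TRUE(dot_product_properties_hold(42u, 1000));`.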

🔧 Troubleshooting

❌ CUDA Error: "no CUDA-capable device is detected"

Solution:

  1. Check NVIDIA driver: nvidia-smi
  2. Verify CUDA installation: nvcc --version
  3. Ensure your GPU has Compute Capability 7.5+ (Turing or newer recommended)
  4. If running inside Docker, start the container with the --gpus all flag

❌ Compile Error: "nvcc not found"

Solution:

# Add CUDA to PATH
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

For Ubuntu, install CUDA Toolkit:

sudo apt-get install nvidia-cuda-toolkit
❌ Runtime Error: "out of memory"

Solution: Reduce resolution or samples:

# Use smaller resolution
./build/bin/ray_tracer -w 640 -h 480 -s 1 --scene demo

# Monitor GPU memory
nvidia-smi
❌ Output image is blank/black

Possible causes:

  1. Camera position may be inside geometry
  2. Scene may have no lights (check scene definition)
  3. Ray depth too low for path tracing (-d parameter)

Debug: Try Phong mode first (without -p) to verify scene geometry.

❌ Tests fail with "CUDA error"

GPU tests require an NVIDIA GPU. On GitHub Actions (CPU-only), only build and format checks run.

Run tests locally:

cd build
cmake -DCMAKE_BUILD_TYPE=Debug ..
make -j$(nproc)
ctest --output-on-failure

🗺️ Roadmap

| Version | Feature | Status |
|---|---|---|
| 2.0.1 | ✅ Current Release | Released |
| 2.1.0 | 🔄 Multi-GPU support | Planned |
| 2.1.0 | 🔄 Denoising (OIDN integration) | Planned |
| 2.2.0 | 📝 Textures & UV mapping | Backlog |
| 2.2.0 | 📝 Triangle mesh support (OBJ/GLTF) | Backlog |
| 3.0.0 | 📝 Real-time preview window | Backlog |

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for:

  • Development workflow
  • Code style guidelines (clang-format enforced)
  • Testing requirements
  • Pull request process

Quick Start for Contributors

# 1. Fork and clone
git clone https://github.com/YOUR_USERNAME/ray-tracer.git
cd ray-tracer

# 2. Create a branch
git checkout -b feature/your-feature-name

# 3. Make changes and format code
# (clang-format will run automatically on commit)

# 4. Rebuild and test locally
cmake --build build -j$(nproc)
cd build && ctest --output-on-failure

# 5. Push and create PR
git push origin feature/your-feature-name
🔄 CI/CD Pipeline Details

GitHub Actions validates every push:

| Check | Environment | Status |
|---|---|---|
| Build | CUDA 12.4.1 (Ubuntu 22.04) | ✅ Enforced |
| clang-format | Latest Ubuntu | ✅ Enforced |
| Tests | Local GPU required | ⚠️ Manual |

Note: GPU tests are not run on GitHub-hosted runners (no GPU available). Run ctest locally on a CUDA-capable machine.


📚 Documentation

| Resource | Description |
|---|---|
| 📖 Quick Start Guide | Get rendering in 5 minutes |
| 📋 API Reference | Complete API documentation |
| 🏗️ Architecture Design | Rendering pipeline & design decisions |

📝 License

Distributed under the MIT License. See LICENSE for details.


Built with ❤️ using CUDA/C++ · ⭐ Star this repo · Report issues