English | 简体中文
# mini-image-pipe

A high-performance DAG-based GPU image processing pipeline with multi-stream scheduling, a pinned memory pool, and CUDA-accelerated operators. Designed for real-time video and batch image processing workflows.
- Highlights
- Quick Start
- Requirements
- Installation
- Build
- Usage
- Operators
- GPU Architecture Support
- Project Structure
- Architecture
- Documentation
- Engineering Quality
- License
## Highlights

- GPU Accelerated: Full CUDA implementation with async kernel execution
- DAG Scheduling: Directed acyclic graph-based task dependency management with automatic parallelization
- Multi-Stream Execution: Concurrent CUDA stream execution for independent tasks
- Memory Efficient: Pinned/Device memory pools with best-fit allocation strategy
- Separable Filtering: Gaussian blur optimized with separable horizontal + vertical passes
- Error Propagation: Task failures automatically propagate downstream along the DAG
## Quick Start

```bash
# Clone the repository
git clone https://github.com/LessUp/mini-image-pipe.git
cd mini-image-pipe

# Build with CMake Presets (recommended)
cmake --preset release
cmake --build --preset release

# Run the demo
./build/demo_pipeline

# Run tests
./build/mini_image_pipe_tests
```

## Requirements

- OS: Linux (Ubuntu 20.04+ recommended)
- CMake: >= 3.18
- CUDA Toolkit: >= 11.0 with `nvcc` in PATH
- C++ Compiler: GCC 7+, Clang 7+, or MSVC 2019+
- GPU: NVIDIA GPU with Compute Capability >= 7.0 (Volta or newer)
- GTest: v1.14.0 (auto-fetched via FetchContent, no manual installation needed)
## Installation

Ensure the CUDA Toolkit is installed:

```bash
# Verify CUDA installation
nvcc --version
```

If it is not installed, download it from https://developer.nvidia.com/cuda-downloads, then add CUDA to your PATH (if it is not already there):

```bash
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
```

## Build
```bash
# Debug build
cmake --preset default
cmake --build --preset default

# Release build (optimized for performance)
cmake --preset release
cmake --build --preset release

# Native GPU arch only (faster compile)
cmake --preset minimal
cmake --build --preset minimal
```

Without presets:

```bash
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build . -j$(nproc)
```

Run the demo:

```bash
./build/demo_pipeline
```

Run the tests:

```bash
# Using ctest
ctest --preset release

# Or run directly
./build/mini_image_pipe_tests
```

## Usage

```cpp
#include "pipeline.h"
#include "operators/resize.h"
#include "operators/color_convert.h"
#include "operators/gaussian_blur.h"
#include "operators/sobel.h"

#include <cuda_runtime.h>

#include <cstdint>
#include <memory>

using namespace mini_image_pipe;

int main() {
    // Configuration
    PipelineConfig config;
    config.numStreams = 4;
    Pipeline pipeline(config);

    // Add operators
    auto resize = std::make_shared<ResizeOperator>(320, 240, InterpolationMode::BILINEAR);
    auto gray = std::make_shared<ColorConvertOperator>(ColorConversionType::RGB_TO_GRAY);
    auto blur = std::make_shared<GaussianBlurOperator>(GaussianKernelSize::KERNEL_5x5);
    auto sobel = std::make_shared<SobelOperator>();

    int n1 = pipeline.addOperator("Resize", resize);
    int n2 = pipeline.addOperator("Gray", gray);
    int n3 = pipeline.addOperator("Blur", blur);
    int n4 = pipeline.addOperator("Sobel", sobel);

    // Connect: Resize -> Gray -> Blur -> Sobel
    pipeline.connect(n1, n2);
    pipeline.connect(n2, n3);
    pipeline.connect(n3, n4);

    // Allocate GPU memory for input
    int width = 640, height = 480, channels = 3;
    size_t inputSize = width * height * channels * sizeof(uint8_t);
    uint8_t* d_input;
    cudaMalloc(&d_input, inputSize);
    // (Load your image data into d_input here)

    // Set input and execute
    pipeline.setInput(n1, d_input, width, height, channels);
    pipeline.execute();

    // Get output
    void* output = pipeline.getOutput(n4);

    // Cleanup
    cudaFree(d_input);
    return 0;
}
```

See `examples/demo_pipeline.cpp` for a complete working example.
## Operators

| Operator | Function | Features |
|---|---|---|
| GaussianBlur | Gaussian blur | 3×3/5×5/7×7 separable filter, reflection boundary padding |
| Sobel | Edge detection | 3×3 Sobel kernels, gradient magnitude output |
| Resize | Image scaling | Bilinear / nearest-neighbor interpolation |
| ColorConvert | Color conversion | RGB↔Gray, BGR↔RGB, RGBA→RGB |
## GPU Architecture Support

| Architecture | Compute Capability | Example GPUs |
|---|---|---|
| Volta | sm_70 | V100 |
| Turing | sm_75 | RTX 2080, T4 |
| Ampere | sm_80, sm_86 | A100, RTX 3090 |
| Ada Lovelace | sm_89 | RTX 4090, L40 |
| Hopper | sm_90 | H100 |
## Project Structure

```text
mini-image-pipe/
├── include/
│ ├── types.h # Data types, enums, KernelConfig
│ ├── operator.h # IOperator abstract base class
│ ├── memory_manager.h # Pinned/Device memory pool manager
│ ├── task_graph.h # DAG task graph (topological sort, cycle detection)
│ ├── scheduler.h # CUDA multi-stream DAG scheduler
│ ├── pipeline.h # Pipeline builder and execution entry
│ └── operators/
│ ├── color_convert.h # Color space conversion operator
│ ├── resize.h # Image resize operator
│ ├── sobel.h # Sobel edge detection operator
│ └── gaussian_blur.h # Gaussian blur operator (separable filter)
├── src/
│ ├── memory_manager.cu # Memory pool (best-fit strategy)
│ ├── task_graph.cpp # Kahn topological sort, DFS cycle detection
│ ├── scheduler.cu # Stream assignment, event sync, error propagation
│ ├── pipeline.cpp # Buffer allocation, dimension inference, batch processing
│ └── operators/
│ ├── color_convert.cu # RGB/BGR/RGBA/Gray conversion kernels
│ ├── resize.cu # Nearest-neighbor / bilinear interpolation kernels
│ ├── sobel.cu # 3×3 Sobel gradient kernel (__constant__ weights)
│ └── gaussian_blur.cu # Separable Gaussian kernel (horizontal + vertical pass)
├── tests/ # GTest property tests (100 random iterations per operator)
├── examples/
│ └── demo_pipeline.cpp # End-to-end pipeline demo
├── .clang-format # Code format rules
├── .editorconfig # Editor format rules
├── CMakeLists.txt # Build configuration
└── CMakePresets.json # CMake presets (default/release/minimal)
```
## Architecture

```text
┌───────────────────────────────────────────────────────┐
│ Pipeline API │
├───────────────────────────────────────────────────────┤
│ TaskGraph │ DAGScheduler │ MemoryManager │
├───────────────────────────────────────────────────────┤
│ Operators: Gaussian │ Sobel │ Resize │ ColorConvert │
├───────────────────────────────────────────────────────┤
│ CUDA Streams │ CUDA Events │ Shared Memory │
└───────────────────────────────────────────────────────┘
```
## Documentation

- Getting Started Guide - Build and run your first project
- Usage Examples - Common usage patterns and best practices
- Architecture Overview - System design and component overview
- API Reference - Complete API documentation
- Contributing Guide - How to contribute to the project
## Engineering Quality

- Modern CMake: `target_include_directories`, generator expressions, FetchContent, MSVC compatibility
- CI/CD: GitHub Actions (CUDA container build + clang-format check + ctest)
- Memory Safety: Pooled memory management, best-fit allocation, automatic reuse
- Error Handling: Full CUDA API error checking, DAG failure propagation
- Code Standards: `.clang-format` (Google style, 4-space indent, 100 col)
- Test Coverage: 100-iteration randomized property tests per operator/component
## License

MIT License - see LICENSE for details.