Thrust (CUDA) by example: transform/zip, scan, reduce, sort—minimal C++/CMake samples that run on WSL2/RTX.

FlosMume/cpp-cuda-thust-intro

🧩 C++ CUDA Thrust (Windows + WSL2)

A minimal, production-ready CUDA C++ project that introduces NVIDIA Thrust, a high-level parallel algorithms library for CUDA. The project is set up for reproducible builds, clean debugging, and educational clarity, and it runs under Windows + WSL2.


🚀 Key Features

  • CUDA C++ example: src/thrust_intro.cu
  • Demonstrates thrust::transform with zip_iterator and a functor (SAXPY example)
  • Cross-platform build via CMake
  • GPU architecture targeting (Compute Capability 8.9, RTX 4070 SUPER)
  • Optional .vscode/launch.json for GDB debugging
  • Optional CTest integration for regression testing
  • Status check script for verifying CUDA environment

💡 Example: SAXPY with thrust::transform

#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/tuple.h>

struct saxpy_functor {
  float a;
  __host__ __device__
  float operator()(const thrust::tuple<float,float>& t) const {
    return a * thrust::get<0>(t) + thrust::get<1>(t);
  }
};

int main(){
  const int N = 1 << 20;
  thrust::device_vector<float> x(N, 1.f), y(N, 2.f), z(N);

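  saxpy_functor f{3.f};  // a = 3
  // Zip x and y so that each element is seen as a tuple (x[i], y[i]).
  auto first = thrust::make_zip_iterator(thrust::make_tuple(x.begin(), y.begin()));
  auto last  = thrust::make_zip_iterator(thrust::make_tuple(x.end(),   y.end()));
  // Apply f to every tuple on the device: z[i] = a * x[i] + y[i].
  thrust::transform(first, last, z.begin(), f);

  // Reading a device_vector element copies that value back to the host.
  float z0 = z[0], zN = z[N-1];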
  printf("z[0]=%.1f  z[N-1]=%.1f  (expect 5.0)\n", z0, zN);
  printf("Success!\n");
  return 0;
}

Output:

z[0]=5.0  z[N-1]=5.0  (expect 5.0)
Success!
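
The zip iterator is one way to hand two inputs to a unary functor. As a hedged alternative sketch, thrust::transform also has an overload that takes two input ranges and a binary functor. The names saxpy_binary, differs_from, and count_mismatches below are illustrative and not part of the repository; the check with thrust::count_if is likewise an assumption about how one might verify every element instead of spot-checking two.

```cuda
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/count.h>

// Hypothetical binary functor (not in the repository): computes a*x + y.
struct saxpy_binary {
  float a;
  __host__ __device__
  float operator()(float x, float y) const { return a * x + y; }
};

// Hypothetical predicate: true when a value differs from the expected result.
struct differs_from {
  float expected;
  __host__ __device__
  bool operator()(float v) const { return v != expected; }
};

// Runs SAXPY with x=1, y=2, a=3 on n elements and counts results that are not 5.
long long count_mismatches(int n) {
  thrust::device_vector<float> x(n, 1.f), y(n, 2.f), z(n);

  // Two-input overload: reads x[i] and y[i], writes z[i].
  thrust::transform(x.begin(), x.end(), y.begin(), z.begin(), saxpy_binary{3.f});

  // The count runs on the device; only the final number comes back to the host.
  return static_cast<long long>(
      thrust::count_if(z.begin(), z.end(), differs_from{5.f}));
}

int main() {
  const long long bad = count_mismatches(1 << 20);
  std::printf("mismatches: %lld\n", bad);
  return bad == 0 ? 0 : 1;  // non-zero exit code signals a failed check
}
```

Returning a non-zero exit code on mismatch makes a program like this usable as a regression test, since a test runner can treat the exit status as pass or fail.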

🧱 Project Structure

cpp-cuda-thust-intro/
├── src/
│   └── thrust_intro.cu
├── build/                # Auto-generated by CMake
├── CMakeLists.txt
├── README.md
├── check_thrust_intro_status.sh
└── (optional) .vscode/   # Launch/debug configuration

⚙️ Build & Run

From project root:

# Configure
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release

# Build
cmake --build build -j

# Run
./build/thrust_intro

Expected output:

z[0]=5.0  z[N-1]=5.0  (expect 5.0)
Success!

✅ Requires: CUDA 12.0+, a GPU with Compute Capability ≥ 8.9 (e.g. RTX 4070 SUPER), CMake 3.24+, and a C++17 compiler.
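
The repository's CMakeLists.txt is not reproduced in this README. As a rough sketch only, a configuration consistent with the requirements above might look like the following. The target name thrust_intro comes from the run command; the rest (standard settings, the override guard) is assumed and may differ from the actual file.

```cmake
# Hypothetical minimal CMakeLists.txt; the repository's actual file may differ.
cmake_minimum_required(VERSION 3.24)

# Default to Compute Capability 8.9 (RTX 4070 SUPER) unless the caller
# passes -DCMAKE_CUDA_ARCHITECTURES=... on the command line.
if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
  set(CMAKE_CUDA_ARCHITECTURES 89)
endif()

project(thrust_intro LANGUAGES CXX CUDA)

# Assumption: C++17 for both host and device code, per the requirements above.
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CUDA_STANDARD 17)
set(CMAKE_CUDA_STANDARD_REQUIRED ON)

# Thrust ships with the CUDA Toolkit, so no extra package lookup is needed here.
add_executable(thrust_intro src/thrust_intro.cu)
```

With a guard like this, a different GPU can be targeted at configure time, for example with -DCMAKE_CUDA_ARCHITECTURES=native, a value that CMake accepts from version 3.24.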


🧪 Testing (Optional)

If you enable CTest in your CMake configuration, you can run tests with:

cd build
ctest

Expected result:

100% tests passed, 0 tests failed out of 1
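
How CTest is enabled is not shown in this README. One possible wiring, given here as an assumption rather than the repository's actual configuration, registers the executable as a single test and passes only when the "Success!" line appears. The test name thrust_intro_runs is made up for illustration.

```cmake
# Hypothetical CTest wiring for the top-level CMakeLists.txt.
enable_testing()

# Run the built executable as one test.
add_test(NAME thrust_intro_runs COMMAND thrust_intro)

# Pass only if the program prints the "Success!" line from the expected output.
set_tests_properties(thrust_intro_runs PROPERTIES
  PASS_REGULAR_EXPRESSION "Success!")
```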

⚡ Status Check Script

A helper script validates your CUDA environment and runs a quick benchmark.

./check_thrust_intro_status.sh

Typical output:

GPU: NVIDIA GeForce RTX 4070 SUPER (Compute Capability 8.9)
Driver Version: 560.xx, CUDA 12.8
Vector Add completed in 0.5 ms
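
The script itself is not listed here. For readers who want the same kind of information from inside a program, the sketch below queries the CUDA runtime API directly. It is not the script's implementation, and the helper names version_major and version_minor are hypothetical. Note that cudaDriverGetVersion reports the newest CUDA version the installed driver supports, not the driver release number (such as 560.xx).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// CUDA encodes versions as 1000*major + 10*minor, e.g. 12080 for 12.8.
int version_major(int v) { return v / 1000; }
int version_minor(int v) { return (v % 1000) / 10; }

int main() {
  int device_count = 0;
  const cudaError_t err = cudaGetDeviceCount(&device_count);
  if (err != cudaSuccess || device_count == 0) {
    std::printf("No usable CUDA device: %s\n", cudaGetErrorString(err));
    return 1;
  }

  // Report name and compute capability of device 0.
  cudaDeviceProp prop{};
  cudaGetDeviceProperties(&prop, 0);
  std::printf("GPU: %s (Compute Capability %d.%d)\n",
              prop.name, prop.major, prop.minor);

  // Highest CUDA version supported by the driver, and the runtime in use.
  int driver = 0, runtime = 0;
  cudaDriverGetVersion(&driver);
  cudaRuntimeGetVersion(&runtime);
  std::printf("CUDA supported by driver: %d.%d, runtime: %d.%d\n",
              version_major(driver), version_minor(driver),
              version_major(runtime), version_minor(runtime));
  return 0;
}
```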

🧠 Learning Focus

  • Understand how Thrust abstracts GPU kernels into STL-like functions.
  • Learn to use thrust::transform with zip iterators to apply one operation across multiple device vectors.
  • Practice building CUDA C++ projects with CMake on Linux/WSL2.
  • Compare explicit CUDA kernel programming vs. Thrust abstractions.
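
To make the last comparison concrete, here is a sketch of the same SAXPY written as an explicit CUDA kernel. It is not taken from the repository; the names saxpy_kernel and saxpy_on_device are hypothetical, error checking is omitted for brevity, and the block size of 256 is an arbitrary but common choice. Everything Thrust handled implicitly (allocation, copies, launch configuration, bounds checks) appears here by hand.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hand-written kernel: one thread per element, z[i] = a*x[i] + y[i].
__global__ void saxpy_kernel(int n, float a, const float* x, const float* y, float* z) {
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) z[i] = a * x[i] + y[i];  // guard threads past the end
}

// Hypothetical host wrapper. Assumes x and y are non-empty and of equal length.
std::vector<float> saxpy_on_device(float a, const std::vector<float>& x,
                                   const std::vector<float>& y) {
  const int n = static_cast<int>(x.size());
  const size_t bytes = n * sizeof(float);
  std::vector<float> z(n);

  // Manual device allocation and host-to-device copies.
  float *dx = nullptr, *dy = nullptr, *dz = nullptr;
  cudaMalloc(&dx, bytes);
  cudaMalloc(&dy, bytes);
  cudaMalloc(&dz, bytes);
  cudaMemcpy(dx, x.data(), bytes, cudaMemcpyHostToDevice);
  cudaMemcpy(dy, y.data(), bytes, cudaMemcpyHostToDevice);

  // Manual launch configuration: enough 256-thread blocks to cover n.
  const int threads = 256;
  const int blocks = (n + threads - 1) / threads;
  saxpy_kernel<<<blocks, threads>>>(n, a, dx, dy, dz);

  // This blocking copy on the default stream waits for the kernel to finish.
  cudaMemcpy(z.data(), dz, bytes, cudaMemcpyDeviceToHost);

  cudaFree(dx);
  cudaFree(dy);
  cudaFree(dz);
  return z;
}

int main() {
  const int N = 1 << 20;
  const std::vector<float> x(N, 1.f), y(N, 2.f);
  const std::vector<float> z = saxpy_on_device(3.f, x, y);
  std::printf("z[0]=%.1f  z[N-1]=%.1f\n", z[0], z[N - 1]);
  return 0;
}
```

The Thrust version replaces the kernel, the index arithmetic, and the memory management with a functor, device_vector, and a single transform call, which is the trade-off this learning goal refers to.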

🧩 Topics

  • CUDA
  • Thrust (NVIDIA)
  • GPU computing
  • Parallel programming
  • C++ / CMake / WSL2
  • Examples: transform · zip iterator · device vector

🛠 Environment

| Component | Recommended Version |
| --- | --- |
| CUDA Toolkit | 12.8+ |
| CMake | ≥ 3.24 |
| Compiler | g++ (host) and nvcc (CUDA) |
| GPU | RTX 4070 SUPER (SM 8.9) |
| OS | Windows 11 + WSL2 (Ubuntu 22.04) |

🛠 VS Code Integration

The launch configuration lives in .vscode/launch.json. The entry below belongs inside that file's "configurations" array:

{
  "name": "Run thrust_intro",
  "type": "cppdbg",
  "request": "launch",
  "program": "${workspaceFolder}/build/thrust_intro",
  "cwd": "${workspaceFolder}",
  "MIMode": "gdb"
}

Run or debug directly with F5 inside VS Code.


📚 References

  • CUDA by Example — Sanders & Kandrot
  • NVIDIA CUDA Toolkit Docs
  • Thrust Quick Start Guide
  • CMake + CUDA Language Guide

📜 License

MIT License © 2025 Samuel Huang (FlosMume)
