CUDA Kernel

A CUDA kernel is a developer-written function that executes on a single GPU, where many threads run it in parallel, each operating on data local to that GPU.
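To make the definition concrete, here is a minimal sketch of a kernel that adds two vectors element by element, with one thread per element. The names `vector_add` and `run_vector_add` are hypothetical, chosen for illustration only. The block size of 256 is an arbitrary assumption, the two inputs are assumed to have the same length, and error checking on the CUDA runtime calls is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// The kernel: every thread runs this same function on a different element.
__global__ void vector_add(const float* a, const float* b, float* out, int n) {
    // Global index of this thread across all blocks in the grid.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // The grid may contain more threads than elements, so guard the access.
    if (i < n) {
        out[i] = a[i] + b[i];
    }
}

// Hypothetical host-side helper: copies the inputs into GPU memory,
// launches the kernel, and copies the result back to the host.
// Assumes a.size() == b.size(); runtime errors are not checked here.
std::vector<float> run_vector_add(const std::vector<float>& a,
                                  const std::vector<float>& b) {
    int n = static_cast<int>(a.size());
    std::vector<float> out(n);
    if (n == 0) {
        return out;  // a launch with zero blocks is not a valid configuration
    }
    size_t bytes = static_cast<size_t>(n) * sizeof(float);

    float* d_a = nullptr;
    float* d_b = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_out, bytes);

    cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;                         // threads per block (assumed)
    int blocks = (n + threads - 1) / threads;  // round up to cover all n
    vector_add<<<blocks, threads>>>(d_a, d_b, d_out, n);

    // This blocking copy on the default stream waits for the kernel to finish.
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_out);
    return out;
}
```

Calling `run_vector_add` on two equal-length host vectors returns their element-wise sum. Everything in this sketch happens on one GPU and touches only that GPU's memory.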

Communication between GPUs is generally handled by NCCL (NVIDIA Collective Communications Library).