Common Issue RuntimeError: CUDA error

ShrutiDongare · September 19, 2023, 7:50pm

I keep running into ‘RuntimeError: CUDA error: no kernel image is available for execution on the device’. I tried manually installing PyTorch together with a compatible CUDA version. The combinations that did not work are listed below:

PyTorch==1.12.1 with CUDA 11.3, 11.6, etc.
PyTorch==1.13.1 with CUDA 11.7, 11.2, etc.

I partially understand the issue: I am using a machine with an RTX 3080 GPU, which has a compute capability of 8.6 and requires CUDA binaries compiled for sm_86 to work properly. As far as I can tell, PyTorch 2 is the minimum version that officially supports it, and that version is not compatible with the sim. PyTorch 1 built with CUDA 11 has a highest compute capability of sm_75, but I do not have access to a 2000 series GPU. Is there any alternative solution to this problem?

Jobair.16 · November 2, 2023, 8:20pm

The “RuntimeError: CUDA error: no kernel image is available for execution on the device” error occurs when the installed software contains no GPU kernels compiled for the architecture of the GPU it is running on. The NVIDIA RTX 3080 has a compute capability of 8.6, so it needs software compiled with support for that compute capability.
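Before changing anything, it may help to confirm the mismatch on your machine. The sketch below compares the GPU's compute capability with the architectures the installed PyTorch build was compiled for, using torch.cuda.get_device_capability() and torch.cuda.get_arch_list(). The helper names parse_arch, build_supports, and report are hypothetical, and the compatibility rule is a simplification (an sm_XY binary runs on a device with the same major version and an equal or higher minor version; compute_XY PTX can be JIT-compiled on any equal or newer device), so treat the result as a hint rather than a definitive answer.

```python
def parse_arch(arch):
    """Split a name like 'sm_86' or 'compute_75' into (kind, major, minor)."""
    kind, _, digits = arch.partition("_")
    # Drop suffix letters such as the 'a' in 'sm_90a' (simplification).
    digits = "".join(ch for ch in digits if ch.isdigit())
    return kind, int(digits[:-1]), int(digits[-1])


def build_supports(capability, arch_list):
    """Return True if a build compiled for arch_list can run on the device.

    Simplified rule, assumed for illustration:
    - 'sm_XY' binaries need the same major version and minor <= device minor.
    - 'compute_XY' PTX can be JIT-compiled on any device >= X.Y.
    """
    dev_major, dev_minor = capability
    for arch in arch_list:
        kind, major, minor = parse_arch(arch)
        if kind == "sm" and major == dev_major and minor <= dev_minor:
            return True
        if kind == "compute" and (major, minor) <= (dev_major, dev_minor):
            return True
    return False


def report():
    """Print what the installed PyTorch build and the first GPU look like."""
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment.")
        return
    if not torch.cuda.is_available():
        print("No usable CUDA device is visible to PyTorch.")
        return
    capability = torch.cuda.get_device_capability(0)  # e.g. (8, 6) on an RTX 3080
    arch_list = torch.cuda.get_arch_list()            # archs this build was compiled for
    print("torch", torch.__version__, "built with CUDA", torch.version.cuda)
    print("GPU capability:", capability, "compiled archs:", arch_list)
    print("supported:", build_supports(capability, arch_list))


if __name__ == "__main__":
    report()
```

If the printed architecture list has no entry with major version 8 and no compute_ entry at or below 8.6, the installed build cannot run kernels on the RTX 3080, which matches the error above.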

Given your constraints, here are some solutions:

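Build PyTorch from source against a CUDA 11 toolkit, targeting the sm_86 compute capability (for example, by setting the TORCH_CUDA_ARCH_LIST environment variable to 8.6 before building). This ensures compatibility between your GPU and your PyTorch version, but the build can be complex.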
Use NVIDIA GPU Cloud (NGC) Docker containers for deep learning frameworks, including PyTorch. NGC containers are optimized for NVIDIA hardware, but they may not have the exact version you need.
Switch to a 2000 series GPU (e.g., RTX 2080), which has a compute capability of 7.5. This would be compatible with PyTorch built with CUDA 11 targeting sm_75.
Try TensorFlow or another deep learning framework, which may have different compatibility points or more recent builds that support sm_86.
Seek community builds or contact PyTorch developers for potential solutions or workarounds.