sophytoeat

Posted on Dec 14, 2024

Resolving CUDA Version and GPU Architecture Issues in ContourCraft

#simulation #linux

While setting up ContourCraft on a Windows 11 machine with an RTX 4090 under WSL2, I ran into two CUDA compatibility errors when building the CCCollisions module, which handles static and dynamic collisions. Below, I outline each error, its root cause, and the fix that led to a successful installation.

Environment

OS: Windows 11
GPU: RTX 4090
Platform: WSL2

Error 1: CUDA Version Mismatch

Error Message

      RuntimeError:
      The detected CUDA version (12.6) mismatches the version that was used to compile
      PyTorch (11.8). Please make sure to use the same CUDA versions.

      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for cccollisions
  Running setup.py clean for cccollisions
Failed to build cccollisions
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (cccollisions)

Root Cause

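The CCCollisions module is built as a PyTorch CUDA extension, so the CUDA toolkit used for the build has to be compatible with the CUDA version PyTorch was compiled with. According to the error message, PyTorch was compiled with CUDA 11.8, while my environment had CUDA 12.6 installed, causing a mismatch.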

Solution

Install CUDA 11.7: I reinitialized my WSL2 environment and installed CUDA version 11.7.

Verify CUDA Installation: Check the installed CUDA version:

nvcc --version

Ensure the output reports release 11.7. With the toolkit on the same major version as PyTorch's CUDA build, the CUDA version mismatch error was resolved.
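The comparison behind this error can also be checked by hand before starting a build. The following is a minimal sketch, not part of ContourCraft: parse_nvcc_release and same_cuda_major are hypothetical helper names, and treating two versions as compatible when their major numbers agree is an assumption about how strict the check needs to be. It reads the toolkit version from nvcc and the build version from torch.version.cuda.

```shell
# Hypothetical helper: extract a version such as "11.7" from
# `nvcc --version` output read on stdin.
parse_nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed when two versions such as "11.7" and "11.8"
# share the same major number (assumed compatibility rule).
same_cuda_major() {
    [ "${1%%.*}" = "${2%%.*}" ]
}

# Only run the comparison when both nvcc and python are available.
if command -v nvcc >/dev/null 2>&1 && command -v python >/dev/null 2>&1; then
    toolkit=$(nvcc --version | parse_nvcc_release)
    torch_cuda=$(python -c 'import torch; print(torch.version.cuda)' 2>/dev/null)
    if [ -n "$toolkit" ] && [ -n "$torch_cuda" ] && [ "$torch_cuda" != "None" ]; then
        if same_cuda_major "$toolkit" "$torch_cuda"; then
            echo "OK: nvcc $toolkit, PyTorch built with CUDA $torch_cuda"
        else
            echo "Mismatch: nvcc $toolkit, PyTorch built with CUDA $torch_cuda"
        fi
    fi
fi
```

Running this inside the same environment used for the build makes it clear which of the two versions needs to change.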

Error 2: Unsupported GPU Architecture

Error Message

      nvcc warning : incompatible redefinition for option 'compiler-bindir', the last value of this option was used
      nvcc fatal   : Unsupported gpu architecture 'compute_89'
      error: command '/usr/local/cuda-11.7/bin/nvcc' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for cccollisions
  Running setup.py clean for cccollisions
Failed to build cccollisions
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (cccollisions)

Root Cause

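The RTX 4090 is an Ada Lovelace GPU with compute capability 8.9, so the build targets the compute_89 architecture. The nvcc shipped with CUDA 11.7 predates that architecture (support for compute_89 was added in CUDA 11.8), so it does not recognize compute_89 and aborts the build with a fatal error.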
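One way to confirm this on a given machine is to ask nvcc which virtual architectures it can target. This is a rough sketch, assuming the installed nvcc accepts the --list-gpu-arch flag; arch_supported is a hypothetical helper introduced only for this example.

```shell
# Hypothetical helper: succeed when the architecture named in $1
# (for example compute_89) appears as a whole line in the list on stdin.
arch_supported() {
    grep -qx -- "$1"
}

# Only query nvcc when it is actually installed.
if command -v nvcc >/dev/null 2>&1; then
    for arch in compute_86 compute_89; do
        if nvcc --list-gpu-arch 2>/dev/null | arch_supported "$arch"; then
            echo "$arch: supported by this nvcc"
        else
            echo "$arch: not supported by this nvcc"
        fi
    done
fi
```

If compute_89 is missing from the list while compute_86 is present, the toolkit is too old to target the GPU's native architecture.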

Solution

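To resolve this, I explicitly set the target architecture to compute_86, which CUDA 11.7 supports. Code compiled for compute capability 8.6 can still run on the RTX 4090, because GPU binaries are compatible with devices of the same major version and a higher minor version: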

Temporary Fix

Set Environment Variables: Execute the following commands to set the target architecture and put the CUDA 11.7 toolkit first on the search paths:

export TORCH_CUDA_ARCH_LIST="8.6"
export PATH=/usr/local/cuda-11.7/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64:$LD_LIBRARY_PATH

Validate the Fix: Run the build process again and confirm the error no longer occurs.
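Before rebuilding, it can help to confirm that the overrides are actually active in the current shell. The sketch below assumes the /usr/local/cuda-11.7 install location used in the commands above; is_first_path_entry is a hypothetical helper, not an existing tool.

```shell
# Hypothetical helper: succeed when $1 is the first entry of the
# colon-separated list given in $2.
is_first_path_entry() {
    [ "${2%%:*}" = "$1" ]
}

# Show the architecture override, or a placeholder when it is not set.
echo "TORCH_CUDA_ARCH_LIST=${TORCH_CUDA_ARCH_LIST:-<unset>}"

# Check that the CUDA 11.7 toolkit takes precedence over any other install.
if is_first_path_entry /usr/local/cuda-11.7/bin "$PATH"; then
    echo "CUDA 11.7 bin directory is first on PATH"
else
    echo "CUDA 11.7 bin directory is not first on PATH"
fi

# Print which nvcc the shell resolves, if any.
command -v nvcc || echo "nvcc not found on PATH"
```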

Permanent Fix

To make this change permanent, add the environment variables to your shell configuration file:

Edit ~/.bashrc (or ~/.zshrc if you use zsh) by appending the exports:

echo 'export TORCH_CUDA_ARCH_LIST="8.6"' >> ~/.bashrc
echo 'export PATH=/usr/local/cuda-11.7/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc

Reload the Configuration:

source ~/.bashrc

Re-login: Log out and log back in to apply the changes globally.
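Note that running the echo commands above more than once appends duplicate lines to the configuration file. A small sketch of an idempotent alternative follows; append_once is a hypothetical helper written for this example, and the commented usage lines assume ~/.bashrc as the target.

```shell
# Hypothetical helper: append the line in $1 to the file in $2
# only if that exact line is not already present.
append_once() {
    touch "$2"
    grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Example usage (uncomment to apply to your own shell configuration):
# append_once 'export TORCH_CUDA_ARCH_LIST="8.6"' ~/.bashrc
# append_once 'export PATH=/usr/local/cuda-11.7/bin:$PATH' ~/.bashrc
# append_once 'export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64:$LD_LIBRARY_PATH' ~/.bashrc
```

The single quotes keep $PATH and $LD_LIBRARY_PATH unexpanded, so they are resolved each time the shell starts rather than at the moment the line is written.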

After implementing this solution, I successfully installed the CCCollisions module without further issues.

DEV Community
