Troubleshooting: H26xDec_0_0: Could not open codec/Nvidia driver Failed to initialize NVML
Due to a known issue in Nvidia Container Toolkit versions 1.17.8 through 1.18.0, we recommend using Nvidia Container Toolkit version 1.17.7 or earlier.
Nvidia documents the issue here: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/troubleshooting.html#containers-losing-access-to-gpus-with-error-failed-to-initialize-nvml-unknown-error
In Live Transcoder, the issue shows up as one or both of the following errors:

H26xDec_0_0: Could not open codec

Nvidia driver Failed to initialize NVML
In both cases, the cause is that the Nvidia Container Toolkit is unable to pass the GPU to the container.
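Before changing anything, it can help to confirm that the host is actually running an affected toolkit version. The following is a minimal sketch, not an official check: the function name is_affected_version is hypothetical, it assumes GNU sort with -V (version sort) is available, and it reads the affected range as 1.17.8 through 1.18.0 inclusive.

```shell
# Hypothetical helper: succeeds (exit 0) when the given version falls
# inside the assumed affected range 1.17.8 .. 1.18.0, inclusive.
is_affected_version() {
    ver="${1%%-*}"          # drop a package revision suffix such as "-1"
    range_lo="1.17.8"
    range_hi="1.18.0"
    # ver >= range_lo: range_lo sorts first (or the two are equal)
    [ "$(printf '%s\n' "$range_lo" "$ver" | sort -V | head -n1)" = "$range_lo" ] &&
    # ver <= range_hi: ver sorts first (or the two are equal)
    [ "$(printf '%s\n' "$ver" "$range_hi" | sort -V | head -n1)" = "$ver" ]
}

# Example on Ubuntu/Debian (not run here):
#   installed="$(dpkg-query -W -f='${Version}' nvidia-container-toolkit)"
#   if is_affected_version "$installed"; then
#       echo "Toolkit $installed is in the affected range"
#   fi
```

If the installed version is outside this range, the errors above probably have a different cause and the steps below may not help.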
Downgrade Nvidia Container Toolkit to 1.17.7 (Ubuntu/Debian)
Note: on other distributions, you will need to adapt the script to your package manager.
# Remove the currently installed toolkit packages
sudo apt-get purge -y nvidia-container-toolkit libnvidia-container-tools libnvidia-container1 nvidia-container-toolkit-base
# Add Nvidia's signing key and apt repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
# Install all four packages at the same version
export NVIDIA_CONTAINER_TOOLKIT_VERSION=1.17.7-1
sudo apt-get install -y \
  nvidia-container-toolkit=${NVIDIA_CONTAINER_TOOLKIT_VERSION} \
  nvidia-container-toolkit-base=${NVIDIA_CONTAINER_TOOLKIT_VERSION} \
  libnvidia-container-tools=${NVIDIA_CONTAINER_TOOLKIT_VERSION} \
  libnvidia-container1=${NVIDIA_CONTAINER_TOOLKIT_VERSION}
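After the install, you may want to confirm that all four packages really ended up on the same version. The sketch below is one way to do that, under stated assumptions: the helper name all_at_version is hypothetical, and it expects the default "package<TAB>version" output that dpkg-query -W prints.

```shell
# Hypothetical helper: reads "package<TAB>version" lines on stdin and
# fails if any version differs from the expected one, or if no lines
# were read at all (for example, because nothing is installed).
all_at_version() {
    awk -F'\t' -v want="$1" '
        $2 != want { printf "mismatch: %s is %s\n", $1, $2; bad = 1 }
        END { exit (bad || NR == 0) }
    '
}

# Example on Ubuntu/Debian (not run here):
#   dpkg-query -W nvidia-container-toolkit nvidia-container-toolkit-base \
#       libnvidia-container-tools libnvidia-container1 \
#     | all_at_version 1.17.7-1 && echo "all packages at 1.17.7-1"
#
# Optional, and an assumption about your upgrade policy: keep apt from
# upgrading the packages back into the affected range.
#   sudo apt-mark hold nvidia-container-toolkit nvidia-container-toolkit-base \
#       libnvidia-container-tools libnvidia-container1
```

Held packages stay at 1.17.7-1 until you release them with apt-mark unhold, so remember to lift the hold once a fixed toolkit version is available.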
Unable to downgrade?
If your image relies on Nvidia Container Toolkit versions 1.17.8 through 1.18.0, or you prefer not to downgrade, the following workaround is available:
- SSH to the host instance
- Edit the docker-compose.yml in a text editor
- Add the following entries under devices:
    - /dev/nvidia0:/dev/nvidia0
    - /dev/nvidiactl:/dev/nvidiactl
    - /dev/nvidia-uvm:/dev/nvidia-uvm
- Here is an example of the complete docker-compose.yml:
version: "3.9"
services:
  transcoder:
    image: "comprimato/live-transcoder:latest"
    container_name: transcoder-0
    hostname: transcoder-0
    network_mode: host
    tty: true
    stdin_open: true
    shm_size: "1gb"
    stop_signal: SIGRTMIN+3
    restart: "always"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    tmpfs:
      - /run:exec
    logging:
      driver: "journald"
    volumes:
      - /var/lib/systemd/coredump:/var/lib/systemd/coredump:rw
      - /sys/fs/cgroup:/sys/fs/cgroup:rw
      - transcoder0-data:/etc/transcoder/transcoder
      - /var/log:/var/log/host:ro
    environment:
      - NVIDIA_DRIVER_CAPABILITIES=utility,compute,video
      - NVIDIA_VISIBLE_DEVICES=all
      - TRC_LICENSE_KEY=yourlicensekey
      # uncomment to enable sending data to the Monitoring Dashboard
      # - TRC_monitoring_enabled=1
    cap_add:
      - CAP_SYS_RESOURCE
      - CAP_SYS_PTRACE
      - CAP_SYSLOG
      - CAP_SYS_RAWIO
      - CAP_SYS_NICE
    devices:
      - /dev/mem:/dev/mem
      - /dev/nvidia0:/dev/nvidia0
      - /dev/nvidiactl:/dev/nvidiactl
      - /dev/nvidia-uvm:/dev/nvidia-uvm
    ulimits:
      core: -1
      memlock: -1
      nofile: 65536
volumes:
  transcoder0-data:
    name: transcoder0-data
- Save the changes to the docker-compose.yml
- Recreate the docker container:
docker-compose up -d --force-recreate
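The workaround only helps if the device nodes mapped under devices actually exist on the host. As a rough pre-flight check, the sketch below lists any of the given paths that are not character devices. The helper name missing_devices is hypothetical. Hosts with more than one GPU typically expose further nodes such as /dev/nvidia1, which this example does not cover; listing /dev/nvidia* on the host shows what is present.

```shell
# Hypothetical helper: prints every argument that is not an existing
# character device; prints nothing when all of them are present.
missing_devices() {
    for dev in "$@"; do
        [ -c "$dev" ] || printf '%s\n' "$dev"
    done
}

# Example, run on the host before recreating the container (not run here):
#   missing="$(missing_devices /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm)"
#   if [ -n "$missing" ]; then
#       echo "Missing device nodes:"
#       echo "$missing"
#   fi
```

If a node is reported missing, mapping it in docker-compose.yml will not work until the Nvidia driver has created it on the host.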
Updated 6 days ago
