Docker fails to use the host GPU

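I applied the settings from the tutorial "Install deepseek + open-webui with Docker and accelerate it with an NVIDIA Tesla P4". I am using the driver from the app store, together with nvidia-container-toolkit.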

root@NAS:~# ls -lh /usr/lib/x86_64-linux-gnu/libnvidia-ml.so*
lrwxrwxrwx 1 root  root    51 Sep 17 11:30 /usr/lib/x86_64-linux-gnu/libnvidia-ml.so -> /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.560.28.03
lrwxrwxrwx 1 root  root    51 Sep 17 11:30 /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 -> /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.560.28.03
-rw-rw---- 1 admin Users 2.1M Mar  4  2025 /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1.bak
-rwxrwxrwx 1 admin Users 2.1M Mar  4  2025 /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.560.28.03
-rw-rw---- 1 admin Users 2.1M Mar  4  2025 /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.bak
root@NAS:~# ls -lh /usr/lib/x86_64-linux-gnu/libcuda.so*
lrwxrwxrwx 1 root  root   46 Sep 17 11:30 /usr/lib/x86_64-linux-gnu/libcuda.so -> /usr/lib/x86_64-linux-gnu/libcuda.so.560.28.03
lrwxrwxrwx 1 root  root   46 Sep 17 11:30 /usr/lib/x86_64-linux-gnu/libcuda.so.1 -> /usr/lib/x86_64-linux-gnu/libcuda.so.560.28.03
-rw-rw---- 1 admin Users 34M Mar  4  2025 /usr/lib/x86_64-linux-gnu/libcuda.so.1.bak
-rwxrwxrwx 1 admin Users 34M Mar  4  2025 /usr/lib/x86_64-linux-gnu/libcuda.so.560.28.03
-rw-rw---- 1 admin Users 34M Mar  4  2025 /usr/lib/x86_64-linux-gnu/libcuda.so.bak
root@NAS:~# 

root@NAS:~# nvidia-smi 
Wed Sep 17 12:24:27 2025     
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.28.03              Driver Version: 560.28.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla P4                       Off |   00000000:05:00.0 Off |                    0 |
| N/A   49C    P0             22W /   75W |       0MiB /   7680MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                       
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
root@NAS:~#  apt install nvidia-container-toolkit
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
nvidia-container-toolkit is already the newest version (1.13.5-1).
0 upgraded, 0 newly installed, 0 to remove and 166 not upgraded.
root@NAS:~# 
root@NAS:~# 
root@NAS:~# cat /etc/docker
docker/    dockermgr/ 
root@NAS:~# cat /etc/docker/daemon.json
{
  "data-root": "/vol1/docker",
  "live-restore": true,
  "log-driver": "json-file",
  "log-opts": {
    "max-file": "5",
    "max-size": "100m"
  },
  "proxies": {},
  "registry-mirrors": [
    "https://docker.1ms.run"
  ],
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
root@NAS:~# 
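
For anyone comparing their own setup against the file above, here is a minimal sketch that checks a daemon.json for the two NVIDIA-related settings it contains. The function name `check_nvidia_runtime_config` is hypothetical, and the check is a plain text match rather than a real JSON parse, so treat it as a quick sanity check only.

```shell
# Hypothetical helper: look for the NVIDIA runtime settings in a daemon.json.
# Assumption: a simple text match is good enough; this does not parse JSON.
check_nvidia_runtime_config() {
    file="$1"
    # Is "nvidia" set as the default runtime?
    if grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$file"; then
        echo "default-runtime: nvidia"
    else
        echo "default-runtime: not nvidia"
    fi
    # Does a runtime entry point at nvidia-container-runtime?
    if grep -Eq '"path"[[:space:]]*:[[:space:]]*"nvidia-container-runtime"' "$file"; then
        echo "runtime path: nvidia-container-runtime"
    else
        echo "runtime path: missing"
    fi
}
```

It would be called as `check_nvidia_runtime_config /etc/docker/daemon.json`. Once Docker has been restarted, `docker info` should also list `nvidia` among the runtimes if the daemon picked the file up.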

Then I also followed the official instructions:

Fix for NVIDIA calls failing in Docker on the latest system version

Creating the container reports: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running prestart hook #0: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'

nvidia-container-cli: detection error: open failed: /usr/lib/x86_64-linux-gnu/libnvoptix.so.1: permission denied: unknown

Exited:0

root@NAS:~# docker run --rm --gpus all docker.1ms.run/nvidia/cuda:12.2.2-base-ubuntu22.04 nvidia-smi
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running prestart hook #0: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: detection error: open failed: /usr/lib/x86_64-linux-gnu/libnvoptix.so.1: permission denied: unknown

Run 'docker run --help' for more information
root@NAS:~# 
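
The error names /usr/lib/x86_64-linux-gnu/libnvoptix.so.1, and the `ls` listings earlier show some driver files owned by admin:Users with mode rw-rw----. One thing that may be worth checking is which NVIDIA libraries in that directory are not world-readable. The sketch below assumes that file permissions are what trips up nvidia-container-cli, which the message suggests but does not prove. The function name `list_non_world_readable` and the `libnv*` pattern are my own choices for illustration.

```shell
# Hypothetical helper: list regular files in a directory that match a name
# pattern and lack the world-read permission bit.
# -L follows symlinks, so for a link like libfoo.so.1 the target's mode is tested.
list_non_world_readable() {
    dir="$1"
    pattern="$2"
    find -L "$dir" -maxdepth 1 -type f -name "$pattern" ! -perm -o=r
}
```

It would be called as `list_non_world_readable /usr/lib/x86_64-linux-gnu 'libnv*'`. An empty result would mean every matching file is world-readable, so the cause would have to be looked for elsewhere.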

Original poster, yesterday 12:31:

The system version is 0.9.26.
