
ollama cannot use the GPU

Posted 2025-4-14 13:13:56

Environment: bare-metal machine, system version 0.8.45, GPU is a Tesla P4, GPU driver is 560.28.03 installed from the App Center.

Bug: I installed Ollama with the official script (curl -fsSL https://ollama.com/install.sh | sh), but it never uses the GPU for inference; everything runs entirely on the CPU.

Frequency: always reproducible.

Logs:

#deviceQuery

root@Server-666:/# cd /usr/local/cuda-12.6/extras/demo_suite/
root@Server-666:/usr/local/cuda-12.6/extras/demo_suite# sudo make
make: *** No targets specified and no makefile found. Stop.
root@Server-666:/usr/local/cuda-12.6/extras/demo_suite# ./deviceQuery
./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Tesla P4"
CUDA Driver Version / Runtime Version 12.6 / 12.6
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 7599 MBytes (7968522240 bytes)
(20) Multiprocessors, (128) CUDA Cores/MP: 2560 CUDA Cores
GPU Max Clock rate: 1114 MHz (1.11 GHz)
Memory Clock rate: 3003 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Enabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 132 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.6, CUDA Runtime Version = 12.6, NumDevs = 1, Device0 = Tesla P4
Result = PASS

#ollama

root@Server-666:/# ollama ps
NAME ID SIZE PROCESSOR UNTIL
deepseek-r1:latest 0a8c26691023 5.5 GB 100% CPU 4 minutes from now
root@Server-666:/# ollama serve
Error: listen tcp 127.0.0.1:11434: bind: address already in use
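
The bind error above is expected: the official install script already runs Ollama as a systemd service listening on 127.0.0.1:11434, so a second, manually started `ollama serve` cannot bind to the same port. A small sketch of restarting the managed service instead (plain systemd commands, nothing Ollama-specific):

# Restart the systemd-managed instance instead of starting a second one,
# then check whether the loaded model is placed on the GPU or the CPU.
sudo systemctl restart ollama
sudo systemctl status ollama --no-pager
ollama ps    # the PROCESSOR column should read "GPU", not "100% CPU"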
root@Server-666:/# journalctl -u ollama
Apr 14 11:48:09 Server-666 systemd[1]: Started ollama.service - Ollama Service.
Apr 14 11:48:09 Server-666 ollama[42317]: Couldn't find '/usr/share/ollama/.ollama/id_ed25519'. Generating new private key.
Apr 14 11:48:09 Server-666 ollama[42317]: Your new public key is:
Apr 14 11:48:09 Server-666 ollama[42317]: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAbViHYIWtcIKqg0gC3BKaSKXS4HkdWCKph7zNiSVL6N
Apr 14 11:48:09 Server-666 ollama[42317]: 2025/04/14 11:48:09 routes.go:1231: INFO server config env="map[CUDA_VISIBLE_DEVI>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.169+08:00 level=INFO source=images.go:458 msg="total blo>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.169+08:00 level=INFO source=images.go:465 msg="total unu>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.170+08:00 level=INFO source=routes.go:1298 msg="Listenin>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.171+08:00 level=INFO source=gpu.go:217 msg="looking for >
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.180+08:00 level=INFO source=gpu.go:612 msg="Unable to lo>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.180+08:00 level=INFO source=gpu.go:612 msg="Unable to lo>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.180+08:00 level=INFO source=gpu.go:612 msg="Unable to lo>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.183+08:00 level=INFO source=gpu.go:377 msg="no compatibl>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.183+08:00 level=INFO source=types.go:130 msg="inference >
Apr 14 11:52:17 Server-666 systemd[1]: Stopping ollama.service - Ollama Service...
Apr 14 11:52:17 Server-666 systemd[1]: ollama.service: Deactivated successfully.
Apr 14 11:52:17 Server-666 systemd[1]: Stopped ollama.service - Ollama Service.
Apr 14 11:52:17 Server-666 systemd[1]: Started ollama.service - Ollama Service.
Apr 14 11:52:17 Server-666 ollama[50794]: 2025/04/14 11:52:17 routes.go:1231: INFO server config env="map[CUDA_VISIBLE_DEVI>
Apr 14 11:52:17 Server-666 ollama[50794]: time=2025-04-14T11:52:17.989+08:00 level=INFO source=images.go:458 msg="total blo>
Apr 14 11:52:17 Server-666 ollama[50794]: time=2025-04-14T11:52:17.989+08:00 level=INFO source=images.go:465 msg="total unu>
Apr 14 11:52:17 Server-666 ollama[50794]: time=2025-04-14T11:52:17.989+08:00 level=INFO source=routes.go:1298 msg="Listenin>
Apr 14 11:52:17 Server-666 ollama[50794]: time=2025-04-14T11:52:17.989+08:00 level=INFO source=gpu.go:217 msg="looking for >
Apr 14 11:52:17 Server-666 ollama[50794]: time=2025-04-14T11:52:17.999+08:00 level=INFO source=gpu.go:612 msg="Unable to lo>
Apr 14 11:52:17 Server-666 ollama[50794]: time=2025-04-14T11:52:17.999+08:00 level=INFO source=gpu.go:612 msg="Unable to lo>
Apr 14 11:52:17 Server-666 ollama[50794]: time=2025-04-14T11:52:17.999+08:00 level=INFO source=gpu.go:612 msg="Unable to lo>
Apr 14 11:52:18 Server-666 ollama[50794]: time=2025-04-14T11:52:18.001+08:00 level=INFO source=gpu.go:377 msg="no compatibl>
Apr 14 11:52:18 Server-666 ollama[50794]: time=2025-04-14T11:52:18.001+08:00 level=INFO source=types.go:130 msg="inference >
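
The journal lines above are cut off at the terminal width (hence the trailing `>`), so the interesting part of the "Unable to lo..." and "no compatibl..." messages is lost. A hedged way to dump the full, unwrapped lines, which usually name the exact library that GPU discovery failed to load:

# Print complete log lines and keep only the GPU-discovery related messages.
journalctl -u ollama --no-pager -o cat | grep -i -E "unable|cuda|nvml|gpu"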

#lspci | grep -i nvidia

root@Server-666:/# lspci | grep -i nvidia
84:00.0 3D controller: NVIDIA Corporation GP104GL [Tesla P4] (rev a1)

#nvidia-smi and nvcc --version

root@Server-666:/# nvidia-smi
Mon Apr 14 12:34:28 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.28.03 Driver Version: 560.28.03 CUDA Version: 12.6 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 Tesla P4 Off | 00000000:84:00.0 Off | 0 |
| N/A 37C P0 23W / 75W | 0MiB / 7680MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
root@Server-666:/# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Fri_Jun_14_16:34:21_PDT_2024
Cuda compilation tools, release 12.6, V12.6.20
Build cuda_12.6.r12.6/compiler.34431801_0
root@Server-666:/# find / -name libcuda.so 2>/dev/null
/usr/lib/x86_64-linux-gnu/libcuda.so
/usr/local/cuda-12.6/targets/x86_64-linux/lib/stubs/libcuda.so
root@Server-666:/# find / -name libnvidia-nvvm.so* 2>/dev/null
/usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.560.28.03
/usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.4
/usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so
root@Server-666:/# find / -name libnvidia-ml.so* 2>/dev/null
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.560.28.03
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
/usr/local/cuda-12.6/targets/x86_64-linux/lib/stubs/libnvidia-ml.so
root@Server-666:/# ls -l /dev/nvidia*
crw-rw-rw- 1 root root 195, 0 Apr 14 11:10 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Apr 14 11:10 /dev/nvidiactl
crw-rw-rw- 1 root root 236, 0 Apr 14 11:10 /dev/nvidia-uvm
crw-rw-rw- 1 root root 236, 1 Apr 14 11:10 /dev/nvidia-uvm-tools

/dev/nvidia-caps:
total 0
cr-------- 1 root root 239, 1 Apr 14 11:10 nvidia-cap1
cr--r--r-- 1 root root 239, 2 Apr 14 11:10 nvidia-cap2
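
Everything above points to a healthy driver stack (deviceQuery passes, nvidia-smi sees the Tesla P4, the driver libraries and /dev/nvidia* nodes are present), so the failure looks like the ollama service not finding libcuda.so.1 / libnvidia-ml.so.1 during GPU discovery. A minimal troubleshooting sketch, assuming the driver libraries really live in /usr/lib/x86_64-linux-gnu as shown above: point the service at them with a systemd drop-in, enable debug logging, restart, and re-read the logs.

# Open an editor for a systemd drop-in on the ollama service.
sudo systemctl edit ollama
# In the drop-in, add the following, then save and exit:
#   [Service]
#   Environment="OLLAMA_DEBUG=1"
#   Environment="LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu"
sudo systemctl daemon-reload
sudo systemctl restart ollama
journalctl -u ollama --no-pager -o cat | grep -i -E "cuda|gpu"
ollama run deepseek-r1 "hello" && ollama ps   # PROCESSOR should now show GPU

If the debug log still reports that libcuda cannot be loaded, re-running the official install script after the driver is in place is also worth trying, since it checks for an NVIDIA GPU during installation.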


Reply from OP, 2025-4-14 17:28:15:

Strangely, the AI Photos app (AI相册) can use the GPU just fine.
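
Since the AI Photos app does reach the GPU (presumably through the container runtime), one possible workaround is to run Ollama in Docker with GPU access instead of the bare-metal install. This sketch follows the official Docker instructions and assumes the NVIDIA container toolkit is already working, which the functioning AI Photos app suggests:

# Run the official ollama image with all GPUs exposed, then start a model in it.
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama run deepseek-r1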

Reply from administrator, 2025-4-15 15:32:55:
The GPU cannot be used at the moment.


Reply from OP, 2025-4-15 18:52:45:

Fair enough. I'm running 飞牛 (fnOS) on a server, and pure CPU inference is hard to keep cool in summer.