
ollama cannot use the GPU

Posted 2025-4-14 13:13:56

Environment: bare-metal machine, system version 0.8.45, GPU is a Tesla P4, GPU driver is 560.28.03 installed from the App Center.

Bug: I installed Ollama with the official script (curl -fsSL https://ollama.com/install.sh | sh), but it never uses the GPU for inference; everything runs entirely on the CPU.

Frequency: always reproducible.

Logs:

#deviceQuery

root@Server-666:/# cd /usr/local/cuda-12.6/extras/demo_suite/
root@Server-666:/usr/local/cuda-12.6/extras/demo_suite# sudo make
make: *** No targets specified and no makefile found. Stop.
root@Server-666:/usr/local/cuda-12.6/extras/demo_suite# ./deviceQuery
./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Tesla P4"
CUDA Driver Version / Runtime Version 12.6 / 12.6
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 7599 MBytes (7968522240 bytes)
(20) Multiprocessors, (128) CUDA Cores/MP: 2560 CUDA Cores
GPU Max Clock rate: 1114 MHz (1.11 GHz)
Memory Clock rate: 3003 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Enabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 132 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.6, CUDA Runtime Version = 12.6, NumDevs = 1, Device0 = Tesla P4
Result = PASS

#ollama

root@Server-666:/# ollama ps
NAME ID SIZE PROCESSOR UNTIL
deepseek-r1:latest 0a8c26691023 5.5 GB 100% CPU 4 minutes from now
root@Server-666:/# ollama serve
Error: listen tcp 127.0.0.1:11434: bind: address already in use
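
The bind error above is expected: the official install script already runs Ollama as a systemd service listening on 127.0.0.1:11434, so a second, manually started `ollama serve` cannot bind to the same port. A small sketch of restarting the managed service instead (plain systemd commands, nothing Ollama-specific):

# Restart the systemd-managed instance instead of starting a second one,
# then check whether the loaded model is placed on the GPU or the CPU.
sudo systemctl restart ollama
sudo systemctl status ollama --no-pager
ollama ps    # the PROCESSOR column should read "GPU", not "100% CPU"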
root@Server-666:/# journalctl -u ollama
Apr 14 11:48:09 Server-666 systemd[1]: Started ollama.service - Ollama Service.
Apr 14 11:48:09 Server-666 ollama[42317]: Couldn't find '/usr/share/ollama/.ollama/id_ed25519'. Generating new private key.
Apr 14 11:48:09 Server-666 ollama[42317]: Your new public key is:
Apr 14 11:48:09 Server-666 ollama[42317]: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAbViHYIWtcIKqg0gC3BKaSKXS4HkdWCKph7zNiSVL6N
Apr 14 11:48:09 Server-666 ollama[42317]: 2025/04/14 11:48:09 routes.go:1231: INFO server config env="map[CUDA_VISIBLE_DEVI>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.169+08:00 level=INFO source=images.go:458 msg="total blo>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.169+08:00 level=INFO source=images.go:465 msg="total unu>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.170+08:00 level=INFO source=routes.go:1298 msg="Listenin>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.171+08:00 level=INFO source=gpu.go:217 msg="looking for >
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.180+08:00 level=INFO source=gpu.go:612 msg="Unable to lo>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.180+08:00 level=INFO source=gpu.go:612 msg="Unable to lo>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.180+08:00 level=INFO source=gpu.go:612 msg="Unable to lo>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.183+08:00 level=INFO source=gpu.go:377 msg="no compatibl>
Apr 14 11:48:09 Server-666 ollama[42317]: time=2025-04-14T11:48:09.183+08:00 level=INFO source=types.go:130 msg="inference >
Apr 14 11:52:17 Server-666 systemd[1]: Stopping ollama.service - Ollama Service...
Apr 14 11:52:17 Server-666 systemd[1]: ollama.service: Deactivated successfully.
Apr 14 11:52:17 Server-666 systemd[1]: Stopped ollama.service - Ollama Service.
Apr 14 11:52:17 Server-666 systemd[1]: Started ollama.service - Ollama Service.
Apr 14 11:52:17 Server-666 ollama[50794]: 2025/04/14 11:52:17 routes.go:1231: INFO server config env="map[CUDA_VISIBLE_DEVI>
Apr 14 11:52:17 Server-666 ollama[50794]: time=2025-04-14T11:52:17.989+08:00 level=INFO source=images.go:458 msg="total blo>
Apr 14 11:52:17 Server-666 ollama[50794]: time=2025-04-14T11:52:17.989+08:00 level=INFO source=images.go:465 msg="total unu>
Apr 14 11:52:17 Server-666 ollama[50794]: time=2025-04-14T11:52:17.989+08:00 level=INFO source=routes.go:1298 msg="Listenin>
Apr 14 11:52:17 Server-666 ollama[50794]: time=2025-04-14T11:52:17.989+08:00 level=INFO source=gpu.go:217 msg="looking for >
Apr 14 11:52:17 Server-666 ollama[50794]: time=2025-04-14T11:52:17.999+08:00 level=INFO source=gpu.go:612 msg="Unable to lo>
Apr 14 11:52:17 Server-666 ollama[50794]: time=2025-04-14T11:52:17.999+08:00 level=INFO source=gpu.go:612 msg="Unable to lo>
Apr 14 11:52:17 Server-666 ollama[50794]: time=2025-04-14T11:52:17.999+08:00 level=INFO source=gpu.go:612 msg="Unable to lo>
Apr 14 11:52:18 Server-666 ollama[50794]: time=2025-04-14T11:52:18.001+08:00 level=INFO source=gpu.go:377 msg="no compatibl>
Apr 14 11:52:18 Server-666 ollama[50794]: time=2025-04-14T11:52:18.001+08:00 level=INFO source=types.go:130 msg="inference >
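
The journal lines above are cut off at the terminal width (hence the trailing `>`), so the interesting part of the "Unable to lo..." and "no compatibl..." messages is lost. A hedged way to dump the full, unwrapped lines, which usually name the exact library that GPU discovery failed to load:

# Print complete log lines and keep only the GPU-discovery related messages.
journalctl -u ollama --no-pager -o cat | grep -i -E "unable|cuda|nvml|gpu"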

#lspci | grep -i nvidia

root@Server-666:/# lspci | grep -i nvidia
84:00.0 3D controller: NVIDIA Corporation GP104GL [Tesla P4] (rev a1)

#nvidia-smi and nvcc --version

root@Server-666:/# nvidia-smi
Mon Apr 14 12:34:28 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.28.03 Driver Version: 560.28.03 CUDA Version: 12.6 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 Tesla P4 Off | 00000000:84:00.0 Off | 0 |
| N/A 37C P0 23W / 75W | 0MiB / 7680MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
root@Server-666:/# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Fri_Jun_14_16:34:21_PDT_2024
Cuda compilation tools, release 12.6, V12.6.20
Build cuda_12.6.r12.6/compiler.34431801_0
root@Server-666:/# find / -name libcuda.so 2>/dev/null
/usr/lib/x86_64-linux-gnu/libcuda.so
/usr/local/cuda-12.6/targets/x86_64-linux/lib/stubs/libcuda.so
root@Server-666:/# find / -name libnvidia-nvvm.so* 2>/dev/null
/usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.560.28.03
/usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.4
/usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so
root@Server-666:/# find / -name libnvidia-ml.so* 2>/dev/null
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.560.28.03
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
/usr/local/cuda-12.6/targets/x86_64-linux/lib/stubs/libnvidia-ml.so
root@Server-666:/# ls -l /dev/nvidia*
crw-rw-rw- 1 root root 195, 0 Apr 14 11:10 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Apr 14 11:10 /dev/nvidiactl
crw-rw-rw- 1 root root 236, 0 Apr 14 11:10 /dev/nvidia-uvm
crw-rw-rw- 1 root root 236, 1 Apr 14 11:10 /dev/nvidia-uvm-tools

/dev/nvidia-caps:
total 0
cr-------- 1 root root 239, 1 Apr 14 11:10 nvidia-cap1
cr--r--r-- 1 root root 239, 2 Apr 14 11:10 nvidia-cap2
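
Everything above points to a healthy driver stack (deviceQuery passes, nvidia-smi sees the Tesla P4, the driver libraries and /dev/nvidia* nodes are present), so the failure looks like the ollama service not finding libcuda.so.1 / libnvidia-ml.so.1 during GPU discovery. A minimal troubleshooting sketch, assuming the driver libraries really live in /usr/lib/x86_64-linux-gnu as shown above: point the service at them with a systemd drop-in, enable debug logging, restart, and re-read the logs.

# Open an editor for a systemd drop-in on the ollama service.
sudo systemctl edit ollama
# In the drop-in, add the following, then save and exit:
#   [Service]
#   Environment="OLLAMA_DEBUG=1"
#   Environment="LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu"
sudo systemctl daemon-reload
sudo systemctl restart ollama
journalctl -u ollama --no-pager -o cat | grep -i -E "cuda|gpu"
ollama run deepseek-r1 "hello" && ollama ps   # PROCESSOR should now show GPU

If the debug log still reports that libcuda cannot be loaded, re-running the official install script after the driver is in place is also worth trying, since it checks for an NVIDIA GPU during installation.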


Reply from OP, 2025-4-14 17:28:15:

Strangely, the AI Photos app (AI相册) can use the GPU just fine.
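
Since the AI Photos app does reach the GPU (presumably through the container runtime), one possible workaround is to run Ollama in Docker with GPU access instead of the bare-metal install. This sketch follows the official Docker instructions and assumes the NVIDIA container toolkit is already working, which the functioning AI Photos app suggests:

# Run the official ollama image with all GPUs exposed, then start a model in it.
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama run deepseek-r1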

Reply from administrator, 2025-4-15 15:32:55:
The GPU cannot be used at the moment.


Reply from OP, 2025-4-15 18:52:45:

Fair enough. I'm running 飞牛 (fnOS) on a server, and pure CPU inference is hard to keep cool in summer.