NVIDIA P620 GPU: nvidia-smi hangs after driver installation, and the boot screen freezes after reboot

Posted 2025-2-20 16:04:36

Environment: virtual machine, LAN, system/app version 0.8.36-645

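Bug description: After installing the NVIDIA 560 driver from the App Center, running the nvidia-smi command from the shell hangs. After a reboot, the screen looks as shown below and cannot be operated, and SSH stops responding once the password is entered.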

image.png

Frequency: intermittent

Contact: WeChat-fnOS141-SuPerC
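Because the symptom above is a command that never returns, a diagnostic script can guard the call with a timeout so the calling shell stays usable. The following is a minimal sketch and not part of the original report: the helper name `probe` and the 10-second default are assumptions chosen for illustration. It only detects the hang; it does not address the driver fault itself.

```python
import subprocess


def probe(cmd, timeout_s=10.0):
    """Run cmd; return 'ok', 'failed', 'timeout', or 'missing'.

    `probe` is a hypothetical helper name, and the 10 s default is an assumption.
    """
    try:
        # Discard output so a full pipe can never block the child process.
        proc = subprocess.Popen(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except FileNotFoundError:
        return "missing"
    try:
        returncode = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort only: a process blocked inside the kernel may not react
        # to SIGKILL, so we deliberately do not wait for it again here.
        proc.kill()
        return "timeout"
    return "ok" if returncode == 0 else "failed"
```

For example, `probe(["nvidia-smi"])` would return `"timeout"` if the command hangs as described. The sketch uses `Popen.wait` rather than `subprocess.run(..., timeout=...)` because the latter waits for the child again after killing it, which could itself block if the child is stuck in the kernel.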

syslog log:

2025-02-20T15:48:47.396872+08:00 fnos-test trim_app_center[1437]: #015
2025-02-20T15:48:47.396917+08:00 fnos-test trim_app_center[1437]: 2025/02/20 15:48:47 #033[31;1m/app/core/storage/dao/download_file.go:31 #033[35;1mrecord not found
2025-02-20T15:48:47.396933+08:00 fnos-test trim_app_center[1437]: #033[0m#033[33m[0.343ms] #033[34;1m[rows:0]#033[0m SELECT * FROM "download_file" WHERE app_name = 'Nvidia-Driver-560' and version = '1.0.13' LIMIT 1
2025-02-20T15:49:50.719025+08:00 fnos-test systemd[1]: libvirtd.service: Deactivated successfully.
2025-02-20T15:50:01.812468+08:00 fnos-test trim_app_center[1437]: #015
2025-02-20T15:50:01.812508+08:00 fnos-test trim_app_center[1437]: 2025/02/20 15:50:01 #033[31;1m/app/core/storage/dao/system_config.go:32 #033[35;1mrecord not found
2025-02-20T15:50:01.812535+08:00 fnos-test trim_app_center[1437]: #033[0m#033[33m[0.212ms] #033[34;1m[rows:0]#033[0m SELECT * FROM "system_config" WHERE type = 'volume' and k = 'lastUsed' ORDER BY "system_config"."id" LIMIT 1
2025-02-20T15:52:00.449808+08:00 fnos-test trim_app_center[1437]: #015
2025-02-20T15:52:00.449865+08:00 fnos-test trim_app_center[1437]: 2025/02/20 15:52:00 #033[31;1m/app/core/storage/dao/app.go:64 #033[35;1mrecord not found
2025-02-20T15:52:00.449881+08:00 fnos-test trim_app_center[1437]: #033[0m#033[33m[0.347ms] #033[34;1m[rows:0]#033[0m SELECT * FROM "app" WHERE app_name = 'Nvidia-Driver-560' LIMIT 1
2025-02-20T15:52:00.618260+08:00 fnos-test trim[1326]: [Lingual]Cannot find event id:app.installing of service trim.app-center
2025-02-20T15:52:00.692909+08:00 fnos-test sysinfo_service[1345]: [ne]max connections: 100 (25 per worker), workers: 4
2025-02-20T15:52:00.692952+08:00 fnos-test sysinfo_service[1345]: [ne]socket timeout: 0s
2025-02-20T15:52:00.692972+08:00 fnos-test sysinfo_service[1345]: app id:com.trim.sysinfo
2025-02-20T15:52:00.692989+08:00 fnos-test sysinfo_service[1345]: app key:A81B3FE7-B257-4CB9-31AC-7DF89F038568
2025-02-20T15:52:06.234435+08:00 fnos-test nmbd[1065]: [2025/02/20 15:52:06.233301,  0] ../../source3/nmbd/nmbd_workgroupdb.c:279(dump_workgroups)
2025-02-20T15:52:06.234480+08:00 fnos-test nmbd[1065]:   dump_workgroups()
2025-02-20T15:52:06.234496+08:00 fnos-test nmbd[1065]:    dump workgroup on subnet      172.17.0.1: netmask=    255.255.0.0:
2025-02-20T15:52:06.234510+08:00 fnos-test nmbd[1065]:   #011WORKGROUP(1) current master browser = UNKNOWN
2025-02-20T15:52:06.234638+08:00 fnos-test nmbd[1065]:   #011#011FNOS-TEST 40819a03 (fnos-test server (Samba TRIM))
2025-02-20T15:52:06.234657+08:00 fnos-test nmbd[1065]: [2025/02/20 15:52:06.233356,  0] ../../source3/nmbd/nmbd_workgroupdb.c:279(dump_workgroups)
2025-02-20T15:52:06.234671+08:00 fnos-test nmbd[1065]:   dump_workgroups()
2025-02-20T15:52:06.234689+08:00 fnos-test nmbd[1065]:    dump workgroup on subnet 192.168.100.240: netmask=  255.255.255.0:
2025-02-20T15:52:06.234703+08:00 fnos-test nmbd[1065]:   #011WORKGROUP(1) current master browser = UNKNOWN
2025-02-20T15:52:06.234716+08:00 fnos-test nmbd[1065]:   #011#011FNOS-TEST 40819a03 (fnos-test server (Samba TRIM))
2025-02-20T15:52:06.250113+08:00 fnos-test nmbd[1065]: [2025/02/20 15:52:06.248852,  0] ../../source3/nmbd/nmbd_workgroupdb.c:279(dump_workgroups)
2025-02-20T15:52:06.250165+08:00 fnos-test nmbd[1065]:   dump_workgroups()
2025-02-20T15:52:06.250182+08:00 fnos-test nmbd[1065]:    dump workgroup on subnet      172.17.0.1: netmask=    255.255.0.0:
2025-02-20T15:52:06.250196+08:00 fnos-test nmbd[1065]:   #011WORKGROUP(1) current master browser = UNKNOWN
2025-02-20T15:52:06.250210+08:00 fnos-test nmbd[1065]:   #011#011FNOS-TEST 40819a03 (fnos-test server (Samba TRIM))
2025-02-20T15:52:06.250222+08:00 fnos-test nmbd[1065]: [2025/02/20 15:52:06.249555,  0] ../../source3/nmbd/nmbd_workgroupdb.c:279(dump_workgroups)
2025-02-20T15:52:06.250234+08:00 fnos-test nmbd[1065]:   dump_workgroups()
2025-02-20T15:52:06.250246+08:00 fnos-test nmbd[1065]:    dump workgroup on subnet 192.168.100.240: netmask=  255.255.255.0:
2025-02-20T15:52:06.250258+08:00 fnos-test nmbd[1065]:   #011WORKGROUP(1) current master browser = UNKNOWN
2025-02-20T15:52:06.250270+08:00 fnos-test nmbd[1065]:   #011#011FNOS-TEST 40819a03 (fnos-test server (Samba TRIM))
2025-02-20T15:52:06.862102+08:00 fnos-test NetworkManager[734]: <info>  [1740037926.8570] manager: kernel firmware directory '/lib/firmware' changed
2025-02-20T15:52:09.042403+08:00 fnos-test trim[1326]: #033[1;34m# cache_passwd_group#033[0m
2025-02-20T15:52:09.042441+08:00 fnos-test trim[1326]: #033[1;34m# cache_passwd_group#033[0m
2025-02-20T15:52:28.350536+08:00 fnos-test kernel: [  286.891610] nvidia-nvlink: Nvlink Core is being initialized, major device number 243
2025-02-20T15:52:28.350549+08:00 fnos-test kernel: [  286.891616]
2025-02-20T15:52:28.350550+08:00 fnos-test kernel: [  286.893113] nvidia 0000:07:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
2025-02-20T15:52:28.470562+08:00 fnos-test kernel: [  287.009714] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  560.28.03  Thu Jul 18 19:32:18 UTC 2024
2025-02-20T15:52:28.502553+08:00 fnos-test kernel: [  287.042353] [drm] [nvidia-drm] [GPU ID 0x00000700] Loading driver
2025-02-20T15:52:28.502566+08:00 fnos-test kernel: [  287.042355] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:07:00.0 on minor 1
2025-02-20T15:52:29.110589+08:00 fnos-test systemd[1]: Stopping mediasrv.service - Mediasrv Service...
2025-02-20T15:52:29.111522+08:00 fnos-test systemd[1]: mediasrv.service: Deactivated successfully.
2025-02-20T15:52:29.111641+08:00 fnos-test systemd[1]: Stopped mediasrv.service - Mediasrv Service.
2025-02-20T15:52:29.134960+08:00 fnos-test systemd[1]: Started mediasrv.service - Mediasrv Service.
2025-02-20T15:52:29.138647+08:00 fnos-test systemd[1]: Stopping resmon_service.service - trim resmon service...
2025-02-20T15:52:29.139791+08:00 fnos-test systemd[1]: resmon_service.service: Deactivated successfully.
2025-02-20T15:52:29.139893+08:00 fnos-test systemd[1]: Stopped resmon_service.service - trim resmon service.
2025-02-20T15:52:29.141269+08:00 fnos-test systemd[1]: Started resmon_service.service - trim resmon service.
2025-02-20T15:52:29.149766+08:00 fnos-test resmon_service[2413]: SPDLOG: create spdlog success!
2025-02-20T15:52:29.150165+08:00 fnos-test rpc_broker[1338]: #033[1;33m[ne]post_close_event to non-existent socket 21, session: 93921607703.
2025-02-20T15:52:31.182938+08:00 fnos-test TRIMEVENT[2413]: [Publisher]: command channel:ipc:///run/trim_message/trim.service.resmon.mounts.cmd, message channel:ipc:///run/trim_message/trim.service.resmon.mounts.msg
2025-02-20T15:52:31.183098+08:00 fnos-test TRIMEVENT[2413]: Initialize the command thread!
2025-02-20T15:52:31.183961+08:00 fnos-test TRIMEVENT[2413]: [Publisher]: command channel:ipc:///run/trim_message/trim.service.resmon.passwd.cmd, message channel:ipc:///run/trim_message/trim.service.resmon.passwd.msg
2025-02-20T15:52:31.183994+08:00 fnos-test TRIMEVENT[2413]: Initialize the command thread!
2025-02-20T15:52:33.183464+08:00 fnos-test TRIMEVENT[2413]: Start the command loop!
2025-02-20T15:52:33.184079+08:00 fnos-test TRIMEVENT[2413]: Start the command loop!
2025-02-20T15:52:33.545880+08:00 fnos-test kernel: [  292.078836] BUG: kernel NULL pointer dereference, address: 0000000000000028
2025-02-20T15:52:33.545894+08:00 fnos-test kernel: [  292.078839] #PF: error_code(0x0000) - not-present page
2025-02-20T15:52:33.545896+08:00 fnos-test kernel: [  292.078840] PGD 0 P4D 0
2025-02-20T15:52:33.545897+08:00 fnos-test kernel: [  292.078842] Oops: 0000 [#1] PREEMPT SMP NOPTI
2025-02-20T15:52:33.545897+08:00 fnos-test kernel: [  292.079258] RDX: ffff90f98983fe48 RSI: ffff90f98426e008 RDI: 0000000000000005
2025-02-20T15:52:33.545898+08:00 fnos-test kernel: [  292.079259] R10: 0000000000800004 R11: ffffae4f03d73000 R12: ffff90f98426e008
2025-02-20T15:52:33.545899+08:00 fnos-test kernel: [  292.079260] R13: ffff90f98a088008 R14: 00000000fffefc10 R15: 0000000000000000
2025-02-20T15:52:33.545899+08:00 fnos-test kernel: [  292.079261] FS:  00007f337ebf96c0(0000) GS:ffff90f9fbc00000(0000) knlGS:0000000000000000
2025-02-20T15:52:33.545900+08:00 fnos-test kernel: [  292.079262] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2025-02-20T15:52:33.545901+08:00 fnos-test kernel: [  292.079263] CR2: 0000000000000028 CR3: 0000000108df4000 CR4: 0000000000750ef0
2025-02-20T15:52:33.545913+08:00 fnos-test kernel: [  292.079267] PKRU: 55555554
2025-02-20T15:52:33.545914+08:00 fnos-test kernel: [  292.079267] Call Trace:
2025-02-20T15:52:33.545914+08:00 fnos-test kernel: [  292.079269]  <TASK>
2025-02-20T15:52:33.545915+08:00 fnos-test kernel: [  292.079271]  ? __die+0x1f/0x70
2025-02-20T15:52:33.545915+08:00 fnos-test kernel: [  292.079274]  ? page_fault_oops+0x159/0x450
2025-02-20T15:52:33.545916+08:00 fnos-test kernel: [  292.079276]  ? __slab_free+0xb4/0x2d0
2025-02-20T15:52:33.545917+08:00 fnos-test kernel: [  292.079645]  ? _nv025000rm+0x789/0xf80 [nvidia]
2025-02-20T15:52:33.545917+08:00 fnos-test kernel: [  292.079993]  ? _nv035746rm+0x1f3/0x290 [nvidia]
2025-02-20T15:52:33.545918+08:00 fnos-test kernel: [  292.080337]  ? _nv000727rm+0x522/0x599 [nvidia]
2025-02-20T15:52:33.545919+08:00 fnos-test kernel: [  292.080689]  ? _nv017895rm+0x6a/0x9b [nvidia]
2025-02-20T15:52:33.545919+08:00 fnos-test kernel: [  292.081028]  ? _raw_spin_lock_irqsave+0x23/0x50
2025-02-20T15:52:33.545920+08:00 fnos-test kernel: [  292.081030]  ? _nv017873rm+0x76/0xa0 [nvidia]
2025-02-20T15:52:33.545920+08:00 fnos-test kernel: [  292.081384]  ? _nv004241rm+0xd/0x20 [nvidia]
2025-02-20T15:52:33.545921+08:00 fnos-test kernel: [  292.081682]  ? _nv006121rm+0x1e/0xb0 [nvidia]
2025-02-20T15:52:33.545922+08:00 fnos-test kernel: [  292.082029]  ? _nv018077rm+0x59c/0x680 [nvidia]
2025-02-20T15:52:33.545922+08:00 fnos-test kernel: [  292.082326]  ? _nv048176rm+0xb3/0xe0 [nvidia]
2025-02-20T15:52:33.545923+08:00 fnos-test kernel: [  292.082577]  ? _nv049950rm+0xb3/0x180 [nvidia]
2025-02-20T15:52:33.545923+08:00 fnos-test kernel: [  292.082878]  ? _nv049949rm+0x4ae/0x660 [nvidia]
2025-02-20T15:52:33.545924+08:00 fnos-test kernel: [  292.083178]  ? _nv048068rm+0xdd/0x190 [nvidia]
2025-02-20T15:52:33.545924+08:00 fnos-test kernel: [  292.083436]  ? _nv017727rm+0x105/0x220 [nvidia]
2025-02-20T15:52:33.545925+08:00 fnos-test kernel: [  292.083832]  ? _nv004137rm+0xd/0x20 [nvidia]
2025-02-20T15:52:33.545925+08:00 fnos-test kernel: [  292.084128]  ? _nv006121rm+0x1e/0xb0 [nvidia]
2025-02-20T15:52:33.545926+08:00 fnos-test kernel: [  292.084426]  ? _nv049232rm+0xd1/0x210 [nvidia]
2025-02-20T15:52:33.545927+08:00 fnos-test kernel: [  292.084799]  ? _nv038824rm+0xc2/0xf0 [nvidia]
2025-02-20T15:52:33.545927+08:00 fnos-test kernel: [  292.085183]  ? _nv038620rm+0x3d/0x80 [nvidia]
2025-02-20T15:52:33.545928+08:00 fnos-test kernel: [  292.085558]  ? _nv035811rm+0x51/0x150 [nvidia]
2025-02-20T15:52:33.545928+08:00 fnos-test kernel: [  292.085933]  ? _nv035802rm+0x64/0x1d0 [nvidia]
2025-02-20T15:52:33.545929+08:00 fnos-test kernel: [  292.086309]  ? _nv027909rm+0x97/0x1e0 [nvidia]
2025-02-20T15:52:33.545929+08:00 fnos-test kernel: [  292.086738]  ? _nv000786rm+0x1f0/0x344 [nvidia]
2025-02-20T15:52:33.545930+08:00 fnos-test kernel: [  292.086980]  ? _nv000735rm+0x486/0x2030 [nvidia]
2025-02-20T15:52:33.545930+08:00 fnos-test kernel: [  292.087225]  ? rm_init_adapter+0xcd/0xf0 [nvidia]
2025-02-20T15:52:33.545931+08:00 fnos-test kernel: [  292.087477]  ? nv_open_device+0x407/0xa70 [nvidia]
2025-02-20T15:52:33.545931+08:00 fnos-test kernel: [  292.087710]  ? nvidia_open+0x252/0x4b0 [nvidia]
2025-02-20T15:52:33.545932+08:00 fnos-test kernel: [  292.087943]  ? chrdev_open+0xc7/0x240
2025-02-20T15:52:33.545932+08:00 fnos-test kernel: [  292.087946]  ? __pfx_chrdev_open+0x10/0x10
2025-02-20T15:52:33.545933+08:00 fnos-test kernel: [  292.087948]  ? do_dentry_open+0x219/0x530
2025-02-20T15:52:33.545933+08:00 fnos-test kernel: [  292.087949]  ? path_openat+0xd1e/0x1140
2025-02-20T15:52:33.545934+08:00 fnos-test kernel: [  292.087951]  ? do_filp_open+0x131/0x160
2025-02-20T15:52:33.545934+08:00 fnos-test kernel: [  292.087952]  ? __virt_addr_valid+0xb0/0x140
2025-02-20T15:52:33.545935+08:00 fnos-test kernel: [  292.087954]  ? __check_object_size+0x16a/0x2c0
2025-02-20T15:52:33.545935+08:00 fnos-test kernel: [  292.087957]  ? do_sys_openat2+0x91/0xc0
2025-02-20T15:52:33.545936+08:00 fnos-test kernel: [  292.087959]  ? __x64_sys_openat+0x6a/0xa0
2025-02-20T15:52:33.545936+08:00 fnos-test kernel: [  292.087961]  ? do_syscall_64+0x34/0x80
2025-02-20T15:52:33.545937+08:00 fnos-test kernel: [  292.087963]  ? entry_SYSCALL_64_after_hwframe+0x78/0xe2
2025-02-20T15:52:33.545938+08:00 fnos-test kernel: [  292.087965]  </TASK>
2025-02-20T15:52:33.545938+08:00 fnos-test kernel: [  292.087966] Modules linked in: nvidia_drm(POE) nvidia_uvm(POE) nvidia_modeset(POE) nvidia(POE) video(E) xt_conntrack(E) nft_chain_nat(E) xt_MASQUERADE(E) nf_conntrack_netlink(E) xfrm_user(E) xfrm_algo(E) xt_addrtype(E) nft_compat(E) nf_tables(E) nfnetlink(E) br_netfilter(E) bridge(E) stp(E) llc(E) overlay(E) openvswitch(E) nsh(E) nf_conncount(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) rfkill(E) qrtr(E) binfmt_misc(E) intel_rapl_msr(E) intel_rapl_common(E) kvm_intel(E) kvm(E) irqbypass(E) ghash_clmulni_intel(E) sha512_ssse3(E) sha512_generic(E) sha256_ssse3(E) sha1_ssse3(E) aesni_intel(E) libaes(E) crypto_simd(E) cryptd(E) rapl(E) snd_pcsp(E) snd_pcm(E) snd_timer(E) cec(E) bochs(E) snd(E) iTCO_wdt(E) drm_vram_helper(E) intel_pmc_bxt(E) soundcore(E) iTCO_vendor_support(E) drm_ttm_helper(E) wmi(E) watchdog(E) virtio_console(E) virtio_balloon(E) ttm(E) drm_kms_helper(E) evdev(E) joydev(E) serio_raw(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) msr(E) bonding(E) drm(E) fuse(E) efi_pstore(E)
2025-02-20T15:52:33.545939+08:00 fnos-test kernel: [  292.088028] CR2: 0000000000000028
2025-02-20T15:52:33.545940+08:00 fnos-test kernel: [  292.088415] Code: 4d 48 8b 82 50 07 00 00 48 85 c0 74 69 48 8b 97 00 21 00 00 48 8b 92 98 08 00 00 48 85 d2 74 49 8b b8 98 06 00 00 48 8b 42 50 <48> 8b 3c f8 48 85 ff 74 36 48 8b 47 78 48 89 f2 48 83c4 08 4c 89
2025-02-20T15:52:33.545941+08:00 fnos-test kernel: [  292.088416] RSP: 0018:ffffae4f0286f720 EFLAGS: 00010286
2025-02-20T15:52:33.545941+08:00 fnos-test kernel: [  292.088418] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff90f985f2d008
2025-02-20T15:52:33.545942+08:00 fnos-test kernel: [  292.088418] RDX: ffff90f98983fe48 RSI: ffff90f98426e008 RDI: 0000000000000005
2025-02-20T15:52:33.545942+08:00 fnos-test kernel: [  292.088419] RBP: ffff90f9868ed790 R08: ffff90f98a088008 R09: ffff90f9926fbf08
2025-02-20T15:52:33.545943+08:00 fnos-test kernel: [  292.088420] R10: 0000000000800004 R11: ffffae4f03d73000 R12: ffff90f98426e008
2025-02-20T15:52:33.545943+08:00 fnos-test kernel: [  292.088420] R13: ffff90f98a088008 R14: 00000000fffefc10 R15: 0000000000000000
2025-02-20T15:52:33.545944+08:00 fnos-test kernel: [  292.088421] FS:  00007f337ebf96c0(0000) GS:ffff90f9fbc00000(0000) knlGS:0000000000000000
2025-02-20T15:52:33.545944+08:00 fnos-test kernel: [  292.088422] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2025-02-20T15:52:33.545945+08:00 fnos-test kernel: [  292.088423] CR2: 0000000000000028 CR3: 0000000108df4000 CR4: 0000000000750ef0
2025-02-20T15:52:33.545945+08:00 fnos-test kernel: [  292.088426] PKRU: 55555554
2025-02-20T15:52:33.545946+08:00 fnos-test kernel: [  292.088427] note: mediasrv[2448] exited with irqs disabled
2025-02-20T15:52:34.042793+08:00 fnos-test trim[1326]: [DEBUG] cgi 'resmon' exit.
2025-02-20T15:52:34.042836+08:00 fnos-test trim[1326]: [DEBUG] start cgi 'resmon'.
2025-02-20T15:52:34.086394+08:00 fnos-test TRIMEVENT[1437]: TRIMEVENT:{"data":{"APP_GROUP":"","APP_ID":1,"APP_NAME":"Nvidia-Driver-560","APP_USERNAME":"","DISPLAY_NAME":"Nvidia-Driver-560","INSTALL_VOLUME_ID":0,"META_VOLUME_ID":0,"PORT_USAGE":0},"datetime":1740037954,"eventId":"APP_INSTALLED","from":"trim.app-center","level":0}
2025-02-20T15:52:34.086460+08:00 fnos-test trim[1326]: [Lingual]Cannot find event id:app.installed of service trim.app-center
2025-02-20T15:52:39.150544+08:00 fnos-test rpc_broker[1338]: #033[0mSave done!
2025-02-20T15:53:10.389432+08:00 fnos-test nmbd[1065]: [2025/02/20 15:53:10.389197,  0] ../../source3/nmbd/nmbd_namequery.c:109(query_name_response)
2025-02-20T15:53:10.389499+08:00 fnos-test nmbd[1065]:   query_name_response: Multiple (2) responses received for a query on subnet 192.168.100.240 for name WORKGROUP<1d>.
2025-02-20T15:53:10.389524+08:00 fnos-test nmbd[1065]:   This response was from IP 192.168.100.106, reporting an IP address of 192.168.100.106.
2025-02-20T15:53:10.606824+08:00 fnos-test systemd[1]: Started session-3.scope - Session 3 of User admin.
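The kernel fault is buried among unrelated service messages in the log above. As a rough sketch (the function names `extract_oops` and `trace_frames` are hypothetical, and the regular expressions assume exactly the line layout seen in this syslog), the fault report can be isolated and its call-trace frames listed with their module like this:

```python
import re

# Kernel lines as they appear in this syslog, e.g.
# "2025-02-20T15:52:33.545880+08:00 fnos-test kernel: [  292.078836] BUG: ..."
KERNEL_LINE = re.compile(r"^\S+ \S+ kernel: \[\s*(\d+\.\d+)\]\s*(.*)$")

# Call-trace frames such as "? nvidia_open+0x252/0x4b0 [nvidia]".
FRAME = re.compile(
    r"^\??\s*([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?$"
)


def extract_oops(lines):
    """Return kernel messages from the first 'BUG:' line to the end of
    that uninterrupted run of kernel lines."""
    block = []
    for line in lines:
        m = KERNEL_LINE.match(line.rstrip("\n"))
        if m is None:
            if block:
                break  # the first non-kernel line ends the report
            continue
        message = m.group(2)
        if block or message.startswith("BUG:"):
            block.append(message)
    return block


def trace_frames(messages):
    """Return (symbol, module) pairs for call-trace frames.

    module is None for symbols that carry no [module] tag.
    """
    frames = []
    for msg in messages:
        m = FRAME.match(msg.strip())
        if m:
            frames.append((m.group(1), m.group(2)))
    return frames
```

Applied to the log in this post, the frames tagged `[nvidia]` include `nvidia_open`, `nv_open_device`, and `rm_init_adapter`, and the closing `note:` line names `mediasrv`, which suggests the fault occurred while that service was opening the NVIDIA device.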

Administrator, 2025-2-21 17:14:49
Thanks for the feedback. We will follow up on this.