Kernel crash triggered after enabling NFS sharing

江湖小虾

2025-1-4 23:14:35

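Device environment: (A PVE physical host runs a 飞牛 system (V0.8.29) and a separate Ubuntu machine. The 飞牛 NAS exports an NFS share for the Ubuntu machine to use. In effect, the 飞牛 NAS owns the disk and every other machine uses it through an NFS mount.)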

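Bug symptoms:

The Ubuntu machine runs a gitea service. Gitea operations (for example, creating a new repo) currently time out, and the cause is that the 飞牛 machine hits a kernel crash at that moment. The dmesg output from the 飞牛 machine is as follows: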

 fh_update: jit.git/tdmLnbh still negative!
[  491.227618] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  491.227628] #PF: supervisor read access in kernel mode
[  491.227637] #PF: error_code(0x0000) - not-present page
[  491.227644] PGD 0 P4D 0
[  491.227650] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  491.227659] CPU: 2 PID: 1275 Comm: nfsd Tainted: G            E      6.6.38-trim #80
[  491.227671] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 4.2023.08-4 02/15/2024
[  491.227682] RIP: 0010:fh_verify+0x19e/0x7a0 [nfsd]
[  491.227743] Code: 03 00 00 4c 89 ee 48 89 ef 44 89 54 24 08 e8 09 fd ff ff 41 89 c7 85 c0 0f 85 d1 fe ff ff 49 8b 46 30 44 8b 54 24 08 41 8b 16 <0f> b7 00 66 45 85 d2 0f 84 75 02 00 00 66 25 00 f0 66 41 39 c2 0f
[  491.227768] RSP: 0018:ffffbbdd4212fd90 EFLAGS: 00010246
[  491.227777] RAX: 0000000000000000 RBX: ffff91fdc7b1e028 RCX: 0000000000000001
[  491.227787] RDX: 0000000000000088 RSI: ffff91fdc717b2d8 RDI: 0000000000000000
[  491.227797] RBP: ffff91fdc2ed0000 R08: ffffbbdd4212fc70 R09: ffff91fdc717b2d8
[  491.227807] R10: 0000000000000000 R11: ffffffffaeccaae8 R12: 0000000000000000
[  491.227817] R13: ffff91fdc5ff1c00 R14: ffff91fdc05939c0 R15: 0000000000000000
[  491.227827] FS:  0000000000000000(0000) GS:ffff91ff37d00000(0000) knlGS:0000000000000000
[  491.227839] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  491.227847] CR2: 0000000000000000 CR3: 00000001063b6003 CR4: 0000000000770ee0
[  491.227860] PKRU: 55555554
[  491.227866] Call Trace:
[  491.227872]  <TASK>
[  491.227878]  ? __die+0x1f/0x70
[  491.227887]  ? page_fault_oops+0x159/0x450
[  491.227896]  ? do_user_addr_fault+0x65/0x620
[  491.227904]  ? exc_page_fault+0x73/0x170
[  491.227913]  ? asm_exc_page_fault+0x22/0x30
[  491.227923]  ? fh_verify+0x19e/0x7a0 [nfsd]
[  491.227974]  nfsd4_getattr+0x16/0x80 [nfsd]
[  491.228021]  nfsd4_proc_compound+0x352/0x670 [nfsd]
[  491.228068]  nfsd_dispatch+0xe2/0x200 [nfsd]
[  491.228358]  ? __pfx_nfsd+0x10/0x10 [nfsd]
[  491.228657]  svc_process_common+0x2ed/0x6f0 [sunrpc]
[  491.228958]  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[  491.229243]  ? __pfx_nfsd+0x10/0x10 [nfsd]
[  491.229533]  svc_process+0x12d/0x170 [sunrpc]
[  491.229832]  nfsd+0x80/0xd0 [nfsd]
[  491.230118]  kthread+0xf0/0x120
[  491.230370]  ? __pfx_kthread+0x10/0x10
[  491.230625]  ret_from_fork+0x2d/0x50
[  491.230882]  ? __pfx_kthread+0x10/0x10
[  491.231127]  ret_from_fork_asm+0x1b/0x30
[  491.231372]  </TASK>
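
For anyone comparing their own dmesg against this report, here is a minimal sketch that pulls the identifying fields out of an oops dump. The function name `summarize_oops` and its output keys are hypothetical, and the regular expressions only assume the line layout seen in the dump above. The bytes at the `<..>` marker on the `Code:` line start with `0f b7 00`, which on x86-64 encodes `movzwl (%rax),%eax`; with RAX shown as 0 in the register dump, that read matches the reported NULL dereference.

```python
import re

# Lines copied verbatim from the first dump in this thread.
SAMPLE = """\
[  491.227618] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  491.227659] CPU: 2 PID: 1275 Comm: nfsd Tainted: G            E      6.6.38-trim #80
[  491.227682] RIP: 0010:fh_verify+0x19e/0x7a0 [nfsd]
[  491.227743] Code: 03 00 00 4c 89 ee 48 89 ef 44 89 54 24 08 e8 09 fd ff ff 41 89 c7 85 c0 0f 85 d1 fe ff ff 49 8b 46 30 44 8b 54 24 08 41 8b 16 <0f> b7 00 66 45 85 d2 0f 84 75 02 00 00 66 25 00 f0 66 41 39 c2 0f
"""


def summarize_oops(log: str, n_bytes: int = 3) -> dict:
    """Extract a few identifying fields from a kernel oops dump (hypothetical helper)."""
    summary = {}

    m = re.search(r"NULL pointer dereference, address: ([0-9a-f]+)", log)
    if m:
        summary["fault_address"] = int(m.group(1), 16)

    m = re.search(r"PID: (\d+) Comm: (\S+)", log)
    if m:
        summary["pid"] = int(m.group(1))
        summary["comm"] = m.group(2)

    # The kernel release is the token right before the "#<build>" number.
    m = re.search(r"Comm: \S+ .*?(\S+) #\d+", log)
    if m:
        summary["kernel"] = m.group(1)

    # "RIP: 0010:symbol+0xoff/0xsize [module]"; the module part is optional.
    m = re.search(r"RIP: [0-9a-f]{4}:([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?", log)
    if m:
        summary["rip_symbol"] = m.group(1)
        summary["rip_module"] = m.group(2)

    # <xx> marks the first byte of the faulting instruction. The instruction
    # length is not decoded here; we simply keep the first n_bytes bytes.
    m = re.search(r"Code: .*?<([0-9a-f]{2})>((?: [0-9a-f]{2})*)", log)
    if m:
        summary["bytes_at_fault"] = [m.group(1)] + m.group(2).split()[: n_bytes - 1]

    return summary


if __name__ == "__main__":
    for key, value in summarize_oops(SAMPLE).items():
        print(f"{key}: {value}")
```

Feeding both dumps in this thread through the same function makes it easy to see whether the faulting symbol, module and instruction bytes match.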

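After some searching, this looks like it could be a kernel bug or an NFSv4 bug, but I am not sure which. Could the official team please look into it?

For now my gitea service no longer uses the NFS-mounted file system. After switching to a local disk it works normally.

Frequency: (always reproducible)

Contact: (group 216, 上牛牛)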

Administrator

2025-1-6 17:49:43
Thanks for the feedback. We will follow up on this.

江湖小虾

2025-7-26 19:31:47

Same problem here. In my case it happens when creating a symbolic link.

[  427.858843] fh_update: DISK01/fds still negative!
[  427.858898] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  427.858914] #PF: supervisor read access in kernel mode
[  427.858926] #PF: error_code(0x0000) - not-present page
[  427.858937] PGD 0 P4D 0 
[  427.858952] Oops: Oops: 0000 [#1] PREEMPT SMP PTI
[  427.858967] CPU: 3 UID: 0 PID: 1394 Comm: nfsd Tainted: G     U      E      6.12.18-trim #5
[  427.858989] Tainted: [U]=USER, [E]=UNSIGNED_MODULE
[  427.858999] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 11/24/2018
[  427.859014] RIP: 0010:__fh_verify+0x1ad/0x7c0 [nfsd]
[  427.859222] Code: cd 4c 89 fa 48 89 de e8 51 87 fe ff 65 ff 0d 52 c5 1b 3f 0f 85 d6 fe ff ff 0f 1f 44 00 00 e9 cc fe ff ff 49 8b 46 30 41 8b 16 <0f> b7 00 66 45 85 ed 0f 84 b5 02 00 00 66 25 00 f0 66 41 39 c5 0f
[  427.859246] RSP: 0018:ffffba0bc0f77d10 EFLAGS: 00010246
[  427.859261] RAX: 0000000000000000 RBX: ffff8ef28bd8c000 RCX: ffff8ef28b6c6000
[  427.859274] RDX: 0000000000000088 RSI: ffff8ef280d4bf98 RDI: 0000000000000000
[  427.859287] RBP: ffff8ef2fe296600 R08: ffffba0bc0f77c10 R09: 0000000000000000
[  427.859299] R10: ffffba0bc0f77c68 R11: ffff8ef315a2c840 R12: ffff8ef28bd8c160
[  427.859312] R13: 0000000000000000 R14: ffff8ef28c968480 R15: ffff8ef2912c4028
[  427.859325] FS:  0000000000000000(0000) GS:ffff8ef3b7d80000(0000) knlGS:0000000000000000
[  427.859340] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  427.859352] CR2: 0000000000000000 CR3: 000000001b01c000 CR4: 00000000001026f0
[  427.859366] Call Trace:
[  427.859378]  <TASK>
[  427.859391]  ? __die+0x1f/0x60
[  427.859411]  ? page_fault_oops+0x155/0x510
[  427.859428]  ? __fh_verify+0x1ad/0x7c0 [nfsd]
[  427.859619]  ? search_bpf_extables+0x5b/0x80
[  427.859638]  ? exc_page_fault+0x72/0x190
[  427.859656]  ? asm_exc_page_fault+0x22/0x30
[  427.859676]  ? __fh_verify+0x1ad/0x7c0 [nfsd]
[  427.859868]  ? nfsd4_encode_nfs_fh4+0x45/0x80 [nfsd]
[  427.860062]  fh_verify+0x3d/0x60 [nfsd]
[  427.860253]  nfsd4_getattr+0x16/0x80 [nfsd]
[  427.860449]  nfsd4_proc_compound+0x34c/0x670 [nfsd]
[  427.860648]  nfsd_dispatch+0xf4/0x210 [nfsd]
[  427.860842]  svc_process_common+0x30c/0x720 [sunrpc]
[  427.861069]  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[  427.861264]  svc_process+0x12d/0x1c0 [sunrpc]
[  427.861481]  svc_recv+0x806/0x9e0 [sunrpc]
[  427.861694]  ? __pfx_nfsd+0x10/0x10 [nfsd]
[  427.861883]  nfsd+0x9f/0x100 [nfsd]
[  427.862071]  kthread+0xdd/0x110
[  427.862087]  ? __pfx_kthread+0x10/0x10
[  427.862102]  ret_from_fork+0x30/0x50
[  427.862118]  ? __pfx_kthread+0x10/0x10
[  427.862132]  ret_from_fork_asm+0x1a/0x30
[  427.862153]  </TASK>
[  427.862161] Modules linked in: rpcsec_gss_krb5(E) tun(E) veth(E) xt_nat(E) xt_tcpudp(E) xt_conntrack(E) nft_chain_nat(E) xt_MASQUERADE(E) nf_conntrack_netlink(E) xfrm_user(E) xfrm_algo(E) xt_addrtype(E) nft_compat(E) nf_tables(E) br_netfilter(E) bridge(E) stp(E) llc(E) cls_bpf(E) sch_ingress(E) overlay(E) nfnetlink_cttimeout(E) nfnetlink(E) openvswitch(E) nsh(E) nf_conncount(E) nf_nat(E) rfkill(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) psample(E) qrtr(E) intel_rapl_msr(E) intel_rapl_common(E) intel_soc_dts_thermal(E) intel_soc_dts_iosf(E) intel_powerclamp(E) coretemp(E) snd_hda_codec_hdmi(E) kvm_intel(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) binfmt_misc(E) snd_hda_scodec_component(E) kvm(E) crct10dif_pclmul(E) ghash_clmulni_intel(E) cryptd(E) nls_ascii(E) sha512_ssse3(E) nls_cp437(E) sha512_generic(E) sha256_ssse3(E) mei_pxp(E) mei_hdcp(E) evdev(E) sha1_ssse3(E) vfat(E) fat(E) i915(E) intel_cstate(E) iTCO_wdt(E) intel_pmc_bxt(E) at24(E) serio_raw(E) iTCO_vendor_support(E) watchdog(E)
[  427.862355]  snd_pcsp(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_intel_sdw_acpi(E) cec(E) drm_buddy(E) snd_hda_codec(E) drm_display_helper(E) snd_hda_core(E) snd_hwdep(E) ttm(E) button(E) snd_pcm(E) snd_timer(E) mei_txe(E) snd(E) drm_kms_helper(E) soundcore(E) mei(E) i2c_algo_bit(E) sg(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) msr(E) grace(E) bonding(E) loop(E) fuse(E) drm(E) efi_pstore(E) configfs(E) sunrpc(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) btrfs(E) blake2b_generic(E) efivarfs(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) xor(E) raid6_pq(E) libcrc32c(E) crc32c_generic(E) raid0(E) linear(E) dm_mod(E) raid1(E) md_mod(E) sd_mod(E) ahci(E) libahci(E) ehci_pci(E) e1000e(E) ptp(E) libata(E) crc32_pclmul(E) ehci_hcd(E) pps_core(E) crc32c_intel(E) lpc_ich(E) usbcore(E) i2c_i801(E) usb_common(E) i2c_smbus(E) scsi_mod(E) scsi_common(E) video(E) wmi(E)
[  427.862711] CR2: 0000000000000000
[  427.862723] ---[ end trace 0000000000000000 ]---
[  427.899490] RIP: 0010:__fh_verify+0x1ad/0x7c0 [nfsd]
[  427.899701] Code: cd 4c 89 fa 48 89 de e8 51 87 fe ff 65 ff 0d 52 c5 1b 3f 0f 85 d6 fe ff ff 0f 1f 44 00 00 e9 cc fe ff ff 49 8b 46 30 41 8b 16 <0f> b7 00 66 45 85 ed 0f 84 b5 02 00 00 66 25 00 f0 66 41 39 c5 0f
[  427.899726] RSP: 0018:ffffba0bc0f77d10 EFLAGS: 00010246
[  427.899742] RAX: 0000000000000000 RBX: ffff8ef28bd8c000 RCX: ffff8ef28b6c6000
[  427.899756] RDX: 0000000000000088 RSI: ffff8ef280d4bf98 RDI: 0000000000000000
[  427.899769] RBP: ffff8ef2fe296600 R08: ffffba0bc0f77c10 R09: 0000000000000000
[  427.899781] R10: ffffba0bc0f77c68 R11: ffff8ef315a2c840 R12: ffff8ef28bd8c160
[  427.899794] R13: 0000000000000000 R14: ffff8ef28c968480 R15: ffff8ef2912c4028
[  427.899807] FS:  0000000000000000(0000) GS:ffff8ef3b7d80000(0000) knlGS:0000000000000000
[  427.899822] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  427.899834] CR2: 0000000000000000 CR3: 00000001565ce000 CR4: 00000000001026f0
[  427.899848] note: nfsd[1394] exited with irqs disabled
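
Both dumps in this thread reach the fault through the NFSv4 GETATTR handler. The sketch below, assuming only the trace layout shown above, lists the frames of a call trace while skipping entries prefixed with `?` (addresses found on the stack that the unwinder could not confirm as part of the call chain), then intersects two traces. The helper names `reliable_frames` and `shared_frames` are hypothetical, and the two sample traces are excerpts of the dumps posted here.

```python
import re

TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
FRAME = re.compile(r"^(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")

# Excerpt of the 6.6.38-trim trace from the first post.
TRACE_6_6 = """\
[  491.227872]  <TASK>
[  491.227878]  ? __die+0x1f/0x70
[  491.227923]  ? fh_verify+0x19e/0x7a0 [nfsd]
[  491.227974]  nfsd4_getattr+0x16/0x80 [nfsd]
[  491.228021]  nfsd4_proc_compound+0x352/0x670 [nfsd]
[  491.228068]  nfsd_dispatch+0xe2/0x200 [nfsd]
[  491.231372]  </TASK>
"""

# Excerpt of the 6.12.18-trim trace from the reply.
TRACE_6_12 = """\
[  427.859378]  <TASK>
[  427.859676]  ? __fh_verify+0x1ad/0x7c0 [nfsd]
[  427.860062]  fh_verify+0x3d/0x60 [nfsd]
[  427.860253]  nfsd4_getattr+0x16/0x80 [nfsd]
[  427.860449]  nfsd4_proc_compound+0x34c/0x670 [nfsd]
[  427.860648]  nfsd_dispatch+0xf4/0x210 [nfsd]
[  427.862153]  </TASK>
"""


def reliable_frames(log: str) -> list:
    """Return symbols between <TASK> and </TASK> that are not marked with '?'."""
    frames = []
    in_trace = False
    for raw in log.splitlines():
        line = TIMESTAMP.sub("", raw).strip()
        if line == "<TASK>":
            in_trace = True
            continue
        if line == "</TASK>":
            break
        if not in_trace:
            continue
        m = FRAME.match(line)
        if m and m.group(1) is None:  # group(1) is set only for '?' entries
            frames.append(m.group(2))
    return frames


def shared_frames(a: list, b: list) -> list:
    """Frames of trace a that also appear in trace b, in a's order."""
    return [f for f in a if f in b]


if __name__ == "__main__":
    print(shared_frames(reliable_frames(TRACE_6_6), reliable_frames(TRACE_6_12)))
```

A shared path like this suggests, but does not prove, that the two reports have the same root cause.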

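江湖小虾

2025-7-27 17:13:06

I tested the 飞牛 kernels: both 6.12 and 6.6 show the same problem. For now I have switched back to the Debian kernel 6.1.0-37. NFS works normally and creating symbolic links no longer fails, although the NFS exports (/etc/exports) had to be reconfigured. Swapping the kernel may affect some 飞牛 features, though, so I hope the official team can fix this soon.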
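
As a rough aid for readers checking their own machine, the sketch below compares the running kernel release against the versions mentioned in this thread. It only reflects what posters reported here, not an official list of affected kernels. The function name `classify_kernel` and the category labels are hypothetical, and the assumption that the Debian release string may carry a suffix after `6.1.0-37` (for example `-amd64`) is mine; the post only says "6.1.0-37".

```python
import platform

# Releases taken from the two dumps in this thread.
REPORTED_CRASHING = ("6.6.38-trim", "6.12.18-trim")
# Debian kernel the last poster switched to; any suffix after it is an assumption.
REPORTED_WORKING = "6.1.0-37"


def classify_kernel(release: str) -> str:
    """Classify a kernel release string using only the reports in this thread."""
    if release in REPORTED_CRASHING:
        return "reported crashing"
    if release == REPORTED_WORKING or release.startswith(REPORTED_WORKING + "-"):
        return "reported working"
    if release.endswith("-trim"):
        return "trim kernel, not reported"
    return "unknown"


if __name__ == "__main__":
    release = platform.release()
    print(release, "->", classify_kernel(release))
```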
