Hardware environment:
- Device: Lenovo M720Q mini PC
- NIC: Intel I219-V (onboard, e1000e driver)
- OS: 飞牛 OS (fnOS), kernel 6.12.18-trim
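Problem symptoms:
- With Open vSwitch (OVS) enabled, the NIC frequently hangs while virtual machines (including an OP (OpenWrt) router) are running and network request volume is high. The visible result is loss of connectivity and connection timeouts.
- The kernel log (/var/log/syslog) keeps reporting the following error:
  e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
- The NIC model and the driver match (lspci reports an Intel I219-V, and the modinfo e1000e alias list includes the corresponding device ID), but the hang reproduces every time under high load, such as many connected devices or large transfers.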
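To put a number on how often the hang fires, the error line quoted above can be counted per interface. The following is a minimal sketch, assuming the log keeps the "<interface>: Detected Hardware Unit Hang" wording shown in this post; the function name count_unit_hangs is made up for illustration.

```shell
#!/bin/sh
# Count "Detected Hardware Unit Hang" events per interface.
# Reads kernel log text on stdin and prints "<interface> <count>" lines.
count_unit_hangs() {
    grep 'Detected Hardware Unit Hang' |
        awk '{
                 # The interface name is the token right before "Detected",
                 # e.g. "eno1:"; strip the trailing colon.
                 for (i = 2; i <= NF; i++) {
                     if ($i == "Detected") {
                         name = $(i - 1)
                         sub(/:$/, "", name)
                         hits[name]++
                     }
                 }
             }
             END { for (k in hits) print k, hits[k] }'
}
```

It can be fed from either source, for example `count_unit_hangs < /var/log/syslog` or `dmesg | count_unit_hangs`, before and after a configuration change, so the two runs can be compared.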
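Troubleshooting done so far:
- Confirmed the physical link is fine (swapped the cable and the switch port);
- Watched dmesg and confirmed the NIC comes up normally (NIC Link is Up 1000 Mbps Full Duplex);
- Tried temporarily disabling the NIC's offload features (TSO/GSO); the improvement did not last.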
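For reference, the TSO/GSO step is usually done with `ethtool -K`. The sketch below only builds and prints the command unless DRY_RUN=0 is set; the function name and the DRY_RUN switch are made up for illustration. Settings applied this way are runtime-only and are lost on reboot, which may be one reason the effect did not last, though that is a guess rather than something the logs confirm.

```shell
#!/bin/sh
# Turn off TCP segmentation offload (TSO) and generic segmentation
# offload (GSO) on one interface.
# With DRY_RUN=1 (the default here) the command is printed, not executed.
disable_segmentation_offload() {
    iface=$1
    cmd="ethtool -K $iface tso off gso off"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$cmd"
    else
        # Needs root; word splitting of $cmd is intentional.
        $cmd
    fi
}
```

After running `DRY_RUN=0 disable_segmentation_offload eno1`, the result can be checked with `ethtool -k eno1`, where tcp-segmentation-offload and generic-segmentation-offload should both read off.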
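Help requested
- Temporary mitigation:
  - Can OVS be reconfigured to reduce the load on the physical NIC? For example, by changing the forwarding mode or rate-limiting the VMs.
  - Are there e1000e driver parameters suited to 飞牛 OS (such as InterruptThrottleRate or multi-queue settings)?
- Driver and OS level:
  - Does 飞牛 OS ship any driver patches for the Intel I219-V? The e1000e banner only shows a 1999 - 2015 copyright string, and modinfo reports intree: Y, so this is the driver bundled with the kernel. Is it worth upgrading to the latest release from Intel's site (e.g. 5.10+)?
  - Is there a compatibility problem between kernel 6.12.18-trim and the e1000e driver? Would downgrading or upgrading the kernel help?
- OVS and VM configuration:
  - Can a VM running an OP router trigger the NIC's hardware protection under high load? How can OVS and the e1000e NIC be tuned to work together (for example, by adjusting the OVS forwarding mode or turning off unnecessary traffic statistics)?
- Hardware alternatives:
  - If the onboard NIC is simply not up to the load, is there a recommended discrete NIC compatible with 飞牛 OS (for example, an Intel gigabit or 2.5G card)?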
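On the InterruptThrottleRate question: the boot log in this post reports "dynamic conservative mode", which is the driver's documented default (value 3). One way to try a different value is a modprobe options line. The sketch below only prints that line after checking the value against the range the e1000e documentation lists (0, 1, 3, 4, or a fixed rate of 100 to 100000 interrupts per second). The function name is made up for illustration, and the target path /etc/modprobe.d/e1000e.conf and the need to regenerate the initramfs are assumptions about how 飞牛 OS loads the module. Whether any value avoids the hang is untested here.

```shell
#!/bin/sh
# Print a modprobe.d options line for e1000e's InterruptThrottleRate.
# Accepts the mode values 0, 1, 3, 4 or a fixed rate from 100 to 100000.
itr_options_line() {
    rate=$1
    case $rate in
        '' | *[!0-9]*)
            echo "rate must be a non-negative integer" >&2
            return 1
            ;;
    esac
    if [ "$rate" -eq 0 ] || [ "$rate" -eq 1 ] ||
       [ "$rate" -eq 3 ] || [ "$rate" -eq 4 ] ||
       { [ "$rate" -ge 100 ] && [ "$rate" -le 100000 ]; }; then
        printf 'options e1000e InterruptThrottleRate=%s\n' "$rate"
    else
        echo "unsupported InterruptThrottleRate: $rate" >&2
        return 1
    fi
}
```

For example, `itr_options_line 3000 > /etc/modprobe.d/e1000e.conf` (as root) would request a fixed 3000 interrupts per second the next time the module is loaded; the "Interrupt Throttling Rate" line in dmesg shows which mode actually took effect.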
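Additional information
- Temporarily restarting the NIC (ifdown eno1 && ifup eno1) restores the network briefly, but the hang recurs under high load.
- OVS uses the default bridge mode; VMs attach to OVS through virtual NICs (vnet0 and so on).
If there are further debugging directions, or if more logs are needed (such as ethtool eno1 output or OVS configuration details), I can add them at any time.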
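The extra logs mentioned above could be gathered in one pass. This is a rough sketch, assuming ethtool and ovs-vsctl are installed on the host; the function names and the report file are made up for illustration, and the command list is only a starting point.

```shell
#!/bin/sh
# Print the diagnostic commands for one interface, one per line.
diag_commands() {
    iface=$1
    printf '%s\n' \
        "ethtool $iface" \
        "ethtool -i $iface" \
        "ethtool -k $iface" \
        "ethtool -S $iface" \
        "ovs-vsctl show"
}

# Run each command in turn and write a labelled report to a file.
# A command that is missing or fails is recorded and does not stop the run.
collect_diag() {
    iface=$1
    report=$2
    diag_commands "$iface" | while IFS= read -r cmd; do
        printf '### %s\n' "$cmd"
        # Word splitting of $cmd is intentional.
        $cmd 2>&1
        printf '\n'
    done > "$report"
}
```

Running `collect_diag eno1 /tmp/eno1-diag.txt` as root right after a hang, before restarting the interface, should capture link state, driver information, offload settings, NIC counters, and the OVS bridge layout in one file that can be attached to the thread.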
root@M720Q:# lspci | grep -i ethernet
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (7) I219-V (rev 10)
root@M720Q:#
root@M720Q:~# modinfo e1000e
filename: /lib/modules/6.12.18-trim/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko
license: GPL v2
description: Intel(R) PRO/1000 Network Driver
alias: pci:v00008086d000057BAsv*sd*bc*sc*i*
alias: pci:v00008086d000057B9sv*sd*bc*sc*i*
alias: pci:v00008086d000057B8sv*sd*bc*sc*i*
alias: pci:v00008086d000057B7sv*sd*bc*sc*i*
alias: pci:v00008086d000057B6sv*sd*bc*sc*i*
alias: pci:v00008086d000057B5sv*sd*bc*sc*i*
alias: pci:v00008086d000057B4sv*sd*bc*sc*i*
alias: pci:v00008086d000057B3sv*sd*bc*sc*i*
alias: pci:v00008086d000057A1sv*sd*bc*sc*i*
alias: pci:v00008086d000057A0sv*sd*bc*sc*i*
alias: pci:v00008086d00005511sv*sd*bc*sc*i*
alias: pci:v00008086d00005510sv*sd*bc*sc*i*
alias: pci:v00008086d0000550Fsv*sd*bc*sc*i*
alias: pci:v00008086d0000550Esv*sd*bc*sc*i*
alias: pci:v00008086d0000550Bsv*sd*bc*sc*i*
alias: pci:v00008086d0000550Asv*sd*bc*sc*i*
alias: pci:v00008086d0000550Dsv*sd*bc*sc*i*
alias: pci:v00008086d0000550Csv*sd*bc*sc*i*
alias: pci:v00008086d00000DC8sv*sd*bc*sc*i*
alias: pci:v00008086d00000DC7sv*sd*bc*sc*i*
alias: pci:v00008086d00001A1Dsv*sd*bc*sc*i*
alias: pci:v00008086d00001A1Csv*sd*bc*sc*i*
alias: pci:v00008086d00001A1Fsv*sd*bc*sc*i*
alias: pci:v00008086d00001A1Esv*sd*bc*sc*i*
alias: pci:v00008086d00000DC6sv*sd*bc*sc*i*
alias: pci:v00008086d00000DC5sv*sd*bc*sc*i*
alias: pci:v00008086d000015F5sv*sd*bc*sc*i*
alias: pci:v00008086d000015F4sv*sd*bc*sc*i*
alias: pci:v00008086d000015FAsv*sd*bc*sc*i*
alias: pci:v00008086d000015F9sv*sd*bc*sc*i*
alias: pci:v00008086d000015FCsv*sd*bc*sc*i*
alias: pci:v00008086d000015FBsv*sd*bc*sc*i*
alias: pci:v00008086d00000D55sv*sd*bc*sc*i*
alias: pci:v00008086d00000D53sv*sd*bc*sc*i*
alias: pci:v00008086d00000D4Dsv*sd*bc*sc*i*
alias: pci:v00008086d00000D4Csv*sd*bc*sc*i*
alias: pci:v00008086d00000D4Fsv*sd*bc*sc*i*
alias: pci:v00008086d00000D4Esv*sd*bc*sc*i*
alias: pci:v00008086d000015E2sv*sd*bc*sc*i*
alias: pci:v00008086d000015E1sv*sd*bc*sc*i*
alias: pci:v00008086d000015E0sv*sd*bc*sc*i*
alias: pci:v00008086d000015DFsv*sd*bc*sc*i*
alias: pci:v00008086d000015BCsv*sd*bc*sc*i*
alias: pci:v00008086d000015BBsv*sd*bc*sc*i*
alias: pci:v00008086d000015BEsv*sd*bc*sc*i*
alias: pci:v00008086d000015BDsv*sd*bc*sc*i*
alias: pci:v00008086d000015D6sv*sd*bc*sc*i*
alias: pci:v00008086d000015E3sv*sd*bc*sc*i*
alias: pci:v00008086d000015D8sv*sd*bc*sc*i*
alias: pci:v00008086d000015D7sv*sd*bc*sc*i*
alias: pci:v00008086d000015B9sv*sd*bc*sc*i*
alias: pci:v00008086d000015B8sv*sd*bc*sc*i*
alias: pci:v00008086d000015B7sv*sd*bc*sc*i*
alias: pci:v00008086d00001570sv*sd*bc*sc*i*
alias: pci:v00008086d0000156Fsv*sd*bc*sc*i*
alias: pci:v00008086d000015A3sv*sd*bc*sc*i*
alias: pci:v00008086d000015A2sv*sd*bc*sc*i*
alias: pci:v00008086d000015A1sv*sd*bc*sc*i*
alias: pci:v00008086d000015A0sv*sd*bc*sc*i*
alias: pci:v00008086d00001559sv*sd*bc*sc*i*
alias: pci:v00008086d0000155Asv*sd*bc*sc*i*
alias: pci:v00008086d0000153Bsv*sd*bc*sc*i*
alias: pci:v00008086d0000153Asv*sd*bc*sc*i*
alias: pci:v00008086d00001503sv*sd*bc*sc*i*
alias: pci:v00008086d00001502sv*sd*bc*sc*i*
alias: pci:v00008086d000010F0sv*sd*bc*sc*i*
alias: pci:v00008086d000010EFsv*sd*bc*sc*i*
alias: pci:v00008086d000010EBsv*sd*bc*sc*i*
alias: pci:v00008086d000010EAsv*sd*bc*sc*i*
alias: pci:v00008086d00001525sv*sd*bc*sc*i*
alias: pci:v00008086d000010DFsv*sd*bc*sc*i*
alias: pci:v00008086d000010DEsv*sd*bc*sc*i*
alias: pci:v00008086d000010CEsv*sd*bc*sc*i*
alias: pci:v00008086d000010CDsv*sd*bc*sc*i*
alias: pci:v00008086d000010CCsv*sd*bc*sc*i*
alias: pci:v00008086d000010CBsv*sd*bc*sc*i*
alias: pci:v00008086d000010F5sv*sd*bc*sc*i*
alias: pci:v00008086d000010BFsv*sd*bc*sc*i*
alias: pci:v00008086d000010E5sv*sd*bc*sc*i*
alias: pci:v00008086d0000294Csv*sd*bc*sc*i*
alias: pci:v00008086d000010BDsv*sd*bc*sc*i*
alias: pci:v00008086d000010C3sv*sd*bc*sc*i*
alias: pci:v00008086d000010C2sv*sd*bc*sc*i*
alias: pci:v00008086d000010C0sv*sd*bc*sc*i*
alias: pci:v00008086d00001501sv*sd*bc*sc*i*
alias: pci:v00008086d00001049sv*sd*bc*sc*i*
alias: pci:v00008086d0000104Dsv*sd*bc*sc*i*
alias: pci:v00008086d0000104Bsv*sd*bc*sc*i*
alias: pci:v00008086d0000104Asv*sd*bc*sc*i*
alias: pci:v00008086d000010C4sv*sd*bc*sc*i*
alias: pci:v00008086d000010C5sv*sd*bc*sc*i*
alias: pci:v00008086d0000104Csv*sd*bc*sc*i*
alias: pci:v00008086d000010BBsv*sd*bc*sc*i*
alias: pci:v00008086d00001098sv*sd*bc*sc*i*
alias: pci:v00008086d000010BAsv*sd*bc*sc*i*
alias: pci:v00008086d00001096sv*sd*bc*sc*i*
alias: pci:v00008086d0000150Csv*sd*bc*sc*i*
alias: pci:v00008086d000010F6sv*sd*bc*sc*i*
alias: pci:v00008086d000010D3sv*sd*bc*sc*i*
alias: pci:v00008086d0000109Asv*sd*bc*sc*i*
alias: pci:v00008086d0000108Csv*sd*bc*sc*i*
alias: pci:v00008086d0000108Bsv*sd*bc*sc*i*
alias: pci:v00008086d0000107Fsv*sd*bc*sc*i*
alias: pci:v00008086d0000107Esv*sd*bc*sc*i*
alias: pci:v00008086d0000107Dsv*sd*bc*sc*i*
alias: pci:v00008086d000010B9sv*sd*bc*sc*i*
alias: pci:v00008086d000010D5sv*sd*bc*sc*i*
alias: pci:v00008086d000010DAsv*sd*bc*sc*i*
alias: pci:v00008086d000010D9sv*sd*bc*sc*i*
alias: pci:v00008086d00001060sv*sd*bc*sc*i*
alias: pci:v00008086d000010A5sv*sd*bc*sc*i*
alias: pci:v00008086d000010BCsv*sd*bc*sc*i*
alias: pci:v00008086d000010A4sv*sd*bc*sc*i*
alias: pci:v00008086d0000105Fsv*sd*bc*sc*i*
alias: pci:v00008086d0000105Esv*sd*bc*sc*i*
depends: ptp
intree: Y
name: e1000e
retpoline: Y
vermagic: 6.12.18-trim SMP preempt mod_unload modversions
parm: debug:Debug level (0=none,...,16=all) (int)
parm: copybreak:Maximum size of packet that is copied to a new buffer on receive (uint)
parm: TxIntDelay:Transmit Interrupt Delay (array of int)
parm: TxAbsIntDelay:Transmit Absolute Interrupt Delay (array of int)
parm: RxIntDelay:Receive Interrupt Delay (array of int)
parm: RxAbsIntDelay:Receive Absolute Interrupt Delay (array of int)
parm: InterruptThrottleRate:Interrupt Throttling Rate (array of int)
parm: IntMode:Interrupt Mode (array of int)
parm: SmartPowerDownEnable:Enable PHY smart power down (array of int)
parm: KumeranLockLoss:Enable Kumeran lock loss workaround (array of int)
parm: WriteProtectNVM:Write-protect NVM [WARNING: disabling this can lead to corrupted NVM] (array of int)
parm: CrcStripping:Enable CRC Stripping, disable if your BMC needs the CRC (array of int)
root@M720Q:~# ^C
root@M720Q:~# dmesg -w | grep -i eno1
[ 1.503927] e1000e 0000:00:1f.6 eno1: renamed from eth0
[ 5.035888] e1000e 0000:00:1f.6 eno1: entered promiscuous mode
[ 5.071053] eno1-ovs: entered promiscuous mode
[ 7.703902] e1000e 0000:00:1f.6 eno1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
^C
root@M720Q:~# dmesg -w | grep -i e1000e
[ 1.102565] e1000e: Intel(R) PRO/1000 Network Driver
[ 1.102567] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[ 1.103353] e1000e 0000:00:1f.6: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[ 1.299104] e1000e 0000:00:1f.6 0000:00:1f.6 (uninitialized): registered PHC clock
[ 1.366062] e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) f8:75:a4:bd:50:0a
[ 1.366074] e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
[ 1.366175] e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No: FFFFFF-0FF
[ 1.503927] e1000e 0000:00:1f.6 eno1: renamed from eth0
[ 5.035888] e1000e 0000:00:1f.6 eno1: entered promiscuous mode
[ 7.703902] e1000e 0000:00:1f.6 eno1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
dmesg -w | grep -iE 'eth|eno|net|link|e1000e'
^C
root@M720Q:~# dmesg -w | grep -iE 'eth|eno|net|link|e1000e'
[ 0.000000] DMI: LENOVO 10T7002CUS/312D, BIOS M1UKT65A 03/03/2021
[ 0.011881] ACPI: RSDP 0x000000009AAA6000 000024 (v02 LENOVO)
[ 0.011886] ACPI: XSDT 0x000000009AAA60C0 0000FC (v01 LENOVO TC-M1U 00001650 AMI 00010013)
[ 0.011893] ACPI: FACP 0x000000009AAF0E18 000114 (v06 LENOVO TC-M1U 00001650 AMI 00010013)
[ 0.011900] ACPI: DSDT 0x000000009AAA6248 04ABCD (v02 LENOVO TC-M1U 00001650 INTL 20160527)
[ 0.011907] ACPI: APIC 0x000000009AAF0F30 0000A0 (v04 LENOVO TC-M1U 00001650 AMI 00010013)
[ 0.011911] ACPI: FPDT 0x000000009AAF0FD0 000044 (v01 LENOVO TC-M1U 00001650 AMI 00010013)
[ 0.011915] ACPI: FIDT 0x000000009AAF1018 00009C (v01 LENOVO TC-M1U 00001650 AMI 00010013)
[ 0.011918] ACPI: MCFG 0x000000009AAF10B8 00003C (v01 LENOVO TC-M1U 00001650 MSFT 00000097)
[ 0.011922] ACPI: SSDT 0x000000009AAF10F8 0003A3 (v01 LENOVO TC-M1U 00001650 INTL 20160527)
[ 0.011926] ACPI: SSDT 0x000000009AAF14A0 001B5A (v02 LENOVO TC-M1U 00001650 INTL 20160527)
[ 0.011930] ACPI: SLIC 0x000000009AAF3000 000176 (v01 LENOVO TC-M1U 00001650 AMI 00010013)
[ 0.011934] ACPI: MSDM 0x000000009AAF3178 000055 (v03 LENOVO TC-M1U 00001650 AMI 00010013)
[ 0.011938] ACPI: SSDT 0x000000009AAF31D0 0031C6 (v02 LENOVO TC-M1U 00001650 INTL 20160527)
[ 0.011942] ACPI: HPET 0x000000009AAF6398 000038 (v01 LENOVO TC-M1U 00001650 01000013)
[ 0.011946] ACPI: SSDT 0x000000009AAF63D0 001C01 (v02 LENOVO TC-M1U 00001650 INTL 20160527)
[ 0.011950] ACPI: SSDT 0x000000009AAF7FD8 000FAE (v02 LENOVO TC-M1U 00001650 INTL 20160527)
[ 0.011954] ACPI: UEFI 0x000000009AAF8F88 000042 (v01 LENOVO TC-M1U 00001650 01000013)
[ 0.011957] ACPI: LPIT 0x000000009AAF8FD0 00005C (v01 LENOVO TC-M1U 00001650 01000013)
[ 0.011961] ACPI: SSDT 0x000000009AAF9030 0027DE (v02 LENOVO TC-M1U 00001650 INTL 20160527)
[ 0.011965] ACPI: SSDT 0x000000009AAFB810 0014E2 (v02 LENOVO TC-M1U 00001650 INTL 20160527)
[ 0.011968] ACPI: DBGP 0x000000009AAFCCF8 000034 (v01 LENOVO TC-M1U 00001650 01000013)
[ 0.011972] ACPI: DBG2 0x000000009AAFCD30 000054 (v00 LENOVO TC-M1U 00001650 01000013)
[ 0.011976] ACPI: SSDT 0x000000009AAFCD88 001B67 (v02 LENOVO TC-M1U 00001650 INTL 20160527)
[ 0.011979] ACPI: DMAR 0x000000009AAFE8F0 0000A8 (v01 LENOVO TC-M1U 00001650 01000013)
[ 0.011983] ACPI: SSDT 0x000000009AAFE998 000144 (v02 LENOVO TC-M1U 00001650 INTL 20160527)
[ 0.011987] ACPI: NHLT 0x000000009AAFEAE0 00002D (v00 LENOVO TC-M1U 00001650 01000013)
[ 0.011991] ACPI: BGRT 0x000000009AAFEB10 000038 (v01 LENOVO TC-M1U 00001650 AMI 00010013)
[ 0.011995] ACPI: TPM2 0x000000009AAFEB48 000034 (v04 LENOVO TC-M1U 00001650 AMI 00000000)
[ 0.011999] ACPI: LUFT 0x000000009AAFEB80 034CC2 (v01 LENOVO TC-M1U 00001650 AMI 00010013)
[ 0.012003] ACPI: WSMT 0x000000009AB33848 000028 (v01 LENOVO TC-M1U 00001650 AMI 00010013)
[ 0.158200] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[ 0.158200] audit: initializing netlink subsys (disabled)
[ 0.364040] ACPI: PCI: Interrupt link LNKA configured for IRQ 0
[ 0.364153] ACPI: PCI: Interrupt link LNKB configured for IRQ 1
[ 0.364260] ACPI: PCI: Interrupt link LNKC configured for IRQ 0
[ 0.364392] ACPI: PCI: Interrupt link LNKD configured for IRQ 0
[ 0.364500] ACPI: PCI: Interrupt link LNKE configured for IRQ 0
[ 0.364606] ACPI: PCI: Interrupt link LNKF configured for IRQ 0
[ 0.364713] ACPI: PCI: Interrupt link LNKG configured for IRQ 0
[ 0.364820] ACPI: PCI: Interrupt link LNKH configured for IRQ 0
[ 0.374277] NetLabel: Initializing
[ 0.374277] NetLabel: domain hash size = 128
[ 0.374277] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO
[ 0.374277] NetLabel: unlabeled traffic allowed by default
[ 0.408891] NET: Registered PF_INET protocol family
[ 0.413815] NET: Registered PF_UNIX/PF_LOCAL protocol family
[ 0.413824] NET: Registered PF_XDP protocol family
[ 0.839622] NET: Registered PF_INET6 protocol family
[ 0.844083] NET: Registered PF_PACKET protocol family
[ 0.850261] integrity: Loaded X.509 cert 'Lenovo UEFI CA 2014: 4b91a68732eaefdd2c8ffffc6b027ec3449e9c8f'
[ 0.872740] integrity: Loaded X.509 cert 'Trust - Lenovo Certificate: bc19ccf68446c18b4a08dce9b1cb4deb'
[ 1.102565] e1000e: Intel(R) PRO/1000 Network Driver
[ 1.102567] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[ 1.103353] e1000e 0000:00:1f.6: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[ 1.299104] e1000e 0000:00:1f.6 0000:00:1f.6 (uninitialized): registered PHC clock
[ 1.366062] e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) f8:75:a4:bd:50:0a
[ 1.366074] e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
[ 1.366175] e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No: FFFFFF-0FF
[ 1.474303] ata6: SATA link down (SStatus 4 SControl 300)
[ 1.474350] ata5: SATA link down (SStatus 4 SControl 300)
[ 1.474395] ata3: SATA link down (SStatus 4 SControl 300)
[ 1.474442] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 1.474484] ata2: SATA link down (SStatus 4 SControl 300)
[ 1.474527] ata4: SATA link down (SStatus 4 SControl 300)
[ 1.503927] e1000e 0000:00:1f.6 eno1: renamed from eth0
[ 3.666770] NET: Registered PF_BLUETOOTH protocol family
[ 4.592758] NET: Registered PF_QIPCRTR protocol family
[ 5.035888] e1000e 0000:00:1f.6 eno1: entered promiscuous mode
[ 5.071053] eno1-ovs: entered promiscuous mode
[ 5.121301] i915 0000:00:02.0: [drm] [ENCODER:98:DDI A/PHY A] failed to retrieve link info, disabling eDP
[ 7.703902] e1000e 0000:00:1f.6 eno1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 13.328812] netfs: FS-Cache loaded
[ 13.888931] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[ 13.970072] Initializing XFRM netlink socket
[ 14.820185] br-d483056e65b4: port 1(vethfbb09a8) entered blocking state
[ 14.820189] br-d483056e65b4: port 1(vethfbb09a8) entered disabled state
[ 14.820195] vethfbb09a8: entered allmulticast mode
[ 14.820313] vethfbb09a8: entered promiscuous mode
[ 14.820521] br-121ceae6d523: port 1(veth64f1fda) entered blocking state
[ 14.820525] br-121ceae6d523: port 1(veth64f1fda) entered disabled state
[ 14.820533] veth64f1fda: entered allmulticast mode
[ 14.820572] veth64f1fda: entered promiscuous mode
[ 14.838089] br-d483056e65b4: port 2(veth80c7afc) entered blocking state
[ 14.838094] br-d483056e65b4: port 2(veth80c7afc) entered disabled state
[ 14.838099] veth80c7afc: entered allmulticast mode
[ 14.838139] veth80c7afc: entered promiscuous mode
[ 14.863574] br-121ceae6d523: port 2(vethaafddf0) entered blocking state
[ 14.863579] br-121ceae6d523: port 2(vethaafddf0) entered disabled state
[ 14.863732] vethaafddf0: entered allmulticast mode
[ 14.863781] vethaafddf0: entered promiscuous mode
[ 15.062730] eth0: renamed from vethf89ae09
[ 15.063816] br-121ceae6d523: port 1(veth64f1fda) entered blocking state
[ 15.063821] br-121ceae6d523: port 1(veth64f1fda) entered forwarding state
[ 15.095106] eth0: renamed from vethef7a7e1
[ 15.122354] br-d483056e65b4: port 1(vethfbb09a8) entered blocking state
[ 15.122359] br-d483056e65b4: port 1(vethfbb09a8) entered forwarding state
[ 15.122553] eth0: renamed from veth2b8d01d
[ 15.122993] br-d483056e65b4: port 2(veth80c7afc) entered blocking state
[ 15.122996] br-d483056e65b4: port 2(veth80c7afc) entered forwarding state
[ 15.155682] eth0: renamed from veth7bf51c0
[ 15.157290] br-121ceae6d523: port 2(vethaafddf0) entered blocking state
[ 15.157299] br-121ceae6d523: port 2(vethaafddf0) entered forwarding state
[ 44.155975] vnet0: entered promiscuous mode
^C
root@M720Q:~# ^C
root@M720Q:~#

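The main trigger is the high connection concurrency from the OP router VM. The same machine ran fine under PVE before, and ESXi works too.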