
NetworkManager service keeps restarting after creating an 802.3ad bond, causing network outages.

2025-4-12 16:12:29

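Environment: physical machine, N100 / 16G, dual 10G NICs, LAN, system version 0.8.45

Bug symptom: after creating an 802.3ad bond, the NetworkManager service keeps restarting

Frequency: always reproducible

Contact: me@debugo.com

The UI and the switch configuration both look normal, and another physical server (Ubuntu, using netplan) works fine. The syslog output is below. NetworkManager keeps restarting, which interrupts the network.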

image.png

image1.png

2025-04-12T15:24:02.578921+08:00 mi-d0 systemd[1]: Stopping getty@tty1.service - Getty on tty1...
2025-04-12T15:24:02.594740+08:00 mi-d0 systemd[1]: getty@tty1.service: Deactivated successfully.
2025-04-12T15:24:02.595300+08:00 mi-d0 systemd[1]: Stopped getty@tty1.service - Getty on tty1.
2025-04-12T15:24:02.638594+08:00 mi-d0 systemd[1]: Started getty@tty1.service - Getty on tty1.
2025-04-12T15:24:02.652303+08:00 mi-d0 nm-dispatcher[52713]: STARTUP: getty@tty1 is active
2025-04-12T15:24:02.652413+08:00 mi-d0 nm-dispatcher[52713]: STARTUP: Cleaning up
2025-04-12T15:24:02.776663+08:00 mi-d0 kernel: [ 2568.365616] bond1: (slave enp1s0f0np0): Removing an active aggregator
2025-04-12T15:24:02.776684+08:00 mi-d0 kernel: [ 2568.365666] bond1: (slave enp1s0f0np0): Releasing backup interface
2025-04-12T15:24:02.776686+08:00 mi-d0 kernel: [ 2568.365668] bond1: (slave enp1s0f0np0): the permanent HWaddr of slave - b8:59:9f:37:d7:ac - is still in use by bond - set the HWaddr of slave to a different address to avoid conflicts
2025-04-12T15:24:02.776687+08:00 mi-d0 kernel: [ 2568.365715] mlx5_core 0000:01:00.0: lag map: port 1:1 port 2:2
2025-04-12T15:24:02.787468+08:00 mi-d0 systemd[1]: Stopped target rdma-hw.target - RDMA Hardware.
2025-04-12T15:24:02.788764+08:00 mi-d0 systemd[1]: Stopping rdma-ndd.service - RDMA Node Description Daemon...
2025-04-12T15:24:02.788955+08:00 mi-d0 systemd[1]: rdma-ndd.service: Deactivated successfully.
2025-04-12T15:24:02.789099+08:00 mi-d0 systemd[1]: Stopped rdma-ndd.service - RDMA Node Description Daemon.
2025-04-12T15:24:02.816191+08:00 mi-d0 NetworkManager[44852]: [1744442642.8159] device (bond1): bond port enp1s0f0np0 was detached
2025-04-12T15:24:02.816616+08:00 mi-d0 NetworkManager[44852]: [1744442642.8160] device (enp1s0f0np0): state change: activated -> deactivating (reason 'connection-assumed', sys-iface-state: 'managed')
2025-04-12T15:24:02.818579+08:00 mi-d0 NetworkManager[44852]: [1744442642.8184] device (enp1s0f0np0): state change: deactivating -> disconnected (reason 'connection-assumed', sys-iface-state: 'managed')
2025-04-12T15:24:02.823232+08:00 mi-d0 nm-dispatcher[52918]: STARTUP: NetworkChanged of enp1s0f0np0, event:down
2025-04-12T15:24:02.823636+08:00 mi-d0 NetworkManager[44852]: [1744442642.8234] policy: auto-activating connection 'bond1-slave1' (38aa0aec-fe97-4e28-a764-0c0902fe4b64)
2025-04-12T15:24:02.824005+08:00 mi-d0 NetworkManager[44852]: [1744442642.8237] device (enp1s0f0np0): Activation: starting connection 'bond1-slave1' (38aa0aec-fe97-4e28-a764-0c0902fe4b64)
2025-04-12T15:24:02.824320+08:00 mi-d0 NetworkManager[44852]: [1744442642.8237] device (enp1s0f0np0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
2025-04-12T15:24:02.824376+08:00 mi-d0 nm-dispatcher[52918]: STARTUP: checking ips
2025-04-12T15:24:02.824415+08:00 mi-d0 NetworkManager[44852]: [1744442642.8240] device (enp1s0f0np0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
2025-04-12T15:24:02.937733+08:00 mi-d0 NetworkManager[44852]: [1744442642.9369] device (enp1s0f0np0): carrier: link connected
2025-04-12T15:24:02.942035+08:00 mi-d0 NetworkManager[44852]: [1744442642.9419] device (enp1s0f0np0): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
2025-04-12T15:24:03.094096+08:00 mi-d0 NetworkManager[44852]: [1744442643.0938] device (bond1): attached bond port enp1s0f0np0
2025-04-12T15:24:03.094367+08:00 mi-d0 NetworkManager[44852]: [1744442643.0939] device (enp1s0f0np0): Activation: connection 'bond1-slave1' enslaved, continuing activation
2025-04-12T15:24:03.106280+08:00 mi-d0 NetworkManager[44852]: [1744442643.1061] device (enp1s0f0np0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
2025-04-12T15:24:03.107841+08:00 mi-d0 NetworkManager[44852]: [1744442643.1077] device (enp1s0f0np0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
2025-04-12T15:24:03.108069+08:00 mi-d0 NetworkManager[44852]: [1744442643.1079] device (enp1s0f0np0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
2025-04-12T15:24:03.108760+08:00 mi-d0 NetworkManager[44852]: [1744442643.1086] device (enp1s0f0np0): Activation: successful, device activated.
2025-04-12T15:24:04.046258+08:00 mi-d0 nm-dispatcher[52918]: bond1 is not a physical interface, skip
2025-04-12T15:24:04.046819+08:00 mi-d0 nm-dispatcher[52957]: cat: /tmp/52918/ips.txt: No such file or directory
2025-04-12T15:24:04.968716+08:00 mi-d0 systemd[1]: Starting rdma-ndd.service - RDMA Node Description Daemon...
2025-04-12T15:24:04.988128+08:00 mi-d0 systemd[1]: Started rdma-ndd.service - RDMA Node Description Daemon.
2025-04-12T15:24:04.988568+08:00 mi-d0 systemd[1]: Reached target rdma-hw.target - RDMA Hardware.
2025-04-12T15:24:05.007796+08:00 mi-d0 rdma-ndd[52963]: set Node Description failed on mlx5_0
2025-04-12T15:24:05.062428+08:00 mi-d0 nm-dispatcher[52918]: STARTUP: restart tty
2025-04-12T15:24:05.079976+08:00 mi-d0 nm-dispatcher[52918]: STARTUP: getty@tty0 is disabled. Skip it
2025-04-12T15:24:05.088958+08:00 mi-d0 nm-dispatcher[52918]: STARTUP: getty@tty1 is enabled. Restarting...
2025-04-12T15:24:05.094954+08:00 mi-d0 systemd[1]: Stopping getty@tty1.service - Getty on tty1...
2025-04-12T15:24:05.109866+08:00 mi-d0 systemd[1]: getty@tty1.service: Deactivated successfully.
2025-04-12T15:24:05.110402+08:00 mi-d0 systemd[1]: Stopped getty@tty1.service - Getty on tty1.
2025-04-12T15:24:05.138096+08:00 mi-d0 systemd[1]: Started getty@tty1.service - Getty on tty1.
2025-04-12T15:24:05.147116+08:00 mi-d0 nm-dispatcher[52918]: STARTUP: getty@tty1 is active
2025-04-12T15:24:05.147222+08:00 mi-d0 nm-dispatcher[52918]: STARTUP: Cleaning up
2025-04-12T15:24:05.326446+08:00 mi-d0 NetworkManager[44852]: [1744442645.3258] device (bond1): bond port enp1s0f1np1 was detached
2025-04-12T15:24:05.326543+08:00 mi-d0 NetworkManager[44852]: [1744442645.3260] device (enp1s0f1np1): state change: activated -> deactivating (reason 'connection-assumed', sys-iface-state: 'managed')
2025-04-12T15:24:05.328201+08:00 mi-d0 NetworkManager[44852]: [1744442645.3281] device (enp1s0f1np1): state change: deactivating -> disconnected (reason 'connection-assumed', sys-iface-state: 'managed')
2025-04-12T15:24:05.331500+08:00 mi-d0 nm-dispatcher[53040]: STARTUP: NetworkChanged of enp1s0f1np1, event:down
2025-04-12T15:24:05.332696+08:00 mi-d0 nm-dispatcher[53040]: STARTUP: checking ips
2025-04-12T15:24:05.333689+08:00 mi-d0 NetworkManager[44852]: [1744442645.3329] policy: auto-activating connection 'bond1-slave2' (56f5f904-823e-43ea-9f82-904a84c343fb)
2025-04-12T15:24:05.333762+08:00 mi-d0 NetworkManager[44852]: [1744442645.3332] device (enp1s0f1np1): Activation: starting connection 'bond1-slave2' (56f5f904-823e-43ea-9f82-904a84c343fb)
2025-04-12T15:24:05.333807+08:00 mi-d0 NetworkManager[44852]: [1744442645.3333] device (enp1s0f1np1): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
2025-04-12T15:24:05.333847+08:00 mi-d0 NetworkManager[44852]: [1744442645.3334] device (enp1s0f1np1): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
2025-04-12T15:24:05.444668+08:00 mi-d0 kernel: [ 2571.034460] mlx5_core 0000:01:00.1 enp1s0f1np1: Link up
2025-04-12T15:24:05.447041+08:00 mi-d0 NetworkManager[44852]: [1744442645.4468] device (enp1s0f1np1): carrier: link connected
2025-04-12T15:24:05.449768+08:00 mi-d0 NetworkManager[44852]: [1744442645.4495] device (enp1s0f1np1): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
2025-04-12T15:24:05.603487+08:00 mi-d0 NetworkManager[44852]: [1744442645.6032] device (bond1): attached bond port enp1s0f1np1
2025-04-12T15:24:05.603790+08:00 mi-d0 NetworkManager[44852]: [1744442645.6033] device (enp1s0f1np1): Activation: connection 'bond1-slave2' enslaved, continuing activation
2025-04-12T15:24:05.604724+08:00 mi-d0 kernel: [ 2571.191706] bond1: (slave enp1s0f1np1): Enslaving as a backup interface with an up link
2025-04-12T15:24:05.613172+08:00 mi-d0 NetworkManager[44852]: [1744442645.6130] device (enp1s0f1np1): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
2025-04-12T15:24:05.614614+08:00 mi-d0 NetworkManager[44852]: [1744442645.6145] device (enp1s0f1np1): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
2025-04-12T15:24:05.614840+08:00 mi-d0 NetworkManager[44852]: [1744442645.6147] device (enp1s0f1np1): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
2025-04-12T15:24:05.615420+08:00 mi-d0 NetworkManager[44852]: [1744442645.6153] device (enp1s0f1np1): Activation: successful, device activated.
2025-04-12T15:24:05.899420+08:00 mi-d0 systemd[1]: Stopped target rdma-hw.target - RDMA Hardware.
2025-04-12T15:24:05.900465+08:00 mi-d0 systemd[1]: Stopping rdma-ndd.service - RDMA Node Description Daemon...
2025-04-12T15:24:05.901691+08:00 mi-d0 systemd[1]: rdma-ndd.service: Deactivated successfully.
2025-04-12T15:24:05.902347+08:00 mi-d0 systemd[1]: Stopped rdma-ndd.service - RDMA Node Description Daemon.
2025-04-12T15:24:06.556671+08:00 mi-d0 nm-dispatcher[53040]: bond1 is not a physical interface, skip
2025-04-12T15:24:06.557218+08:00 mi-d0 nm-dispatcher[53079]: cat: /tmp/53040/ips.txt: No such file or directory
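
One way to quantify the flapping in the log above is to count, per port, how often NetworkManager reports a detach and a re-attach, and to collect the NetworkManager PIDs that appear. The sketch below does this for syslog lines in the format shown. The function name `summarize_bond_flaps` and the `SAMPLE` list are illustrative only and not part of any tool. If a single PID shows up while the counters keep growing, that would suggest the ports are being cycled rather than the daemon itself restarting, though this excerpt is too short to be sure.

```python
import re
from collections import Counter

# Patterns modelled on the NetworkManager messages in the excerpt above.
DETACHED = re.compile(r"device \((?P<bond>[^)]+)\): bond port (?P<port>\S+) was detached")
ATTACHED = re.compile(r"device \((?P<bond>[^)]+)\): attached bond port (?P<port>\S+)")
NM_PID = re.compile(r"NetworkManager\[(?P<pid>\d+)\]")


def summarize_bond_flaps(lines):
    """Count detach/attach events per bond port and collect NetworkManager PIDs."""
    detached = Counter()
    attached = Counter()
    pids = set()
    for line in lines:
        pid_match = NM_PID.search(line)
        if pid_match:
            pids.add(int(pid_match.group("pid")))
        m = DETACHED.search(line)
        if m:
            detached[f"{m.group('bond')}/{m.group('port')}"] += 1
            continue
        m = ATTACHED.search(line)
        if m:
            attached[f"{m.group('bond')}/{m.group('port')}"] += 1
    return {
        "detached": dict(detached),
        "attached": dict(attached),
        "nm_pids": sorted(pids),
    }


# Four lines copied from the excerpt, used here as sample input.
SAMPLE = [
    "2025-04-12T15:24:02.816191+08:00 mi-d0 NetworkManager[44852]: [1744442642.8159] device (bond1): bond port enp1s0f0np0 was detached",
    "2025-04-12T15:24:03.094096+08:00 mi-d0 NetworkManager[44852]: [1744442643.0938] device (bond1): attached bond port enp1s0f0np0",
    "2025-04-12T15:24:05.326446+08:00 mi-d0 NetworkManager[44852]: [1744442645.3258] device (bond1): bond port enp1s0f1np1 was detached",
    "2025-04-12T15:24:05.603487+08:00 mi-d0 NetworkManager[44852]: [1744442645.6032] device (bond1): attached bond port enp1s0f1np1",
]

if __name__ == "__main__":
    print(summarize_bond_flaps(SAMPLE))
```

To run it against a full log, pass an open file instead of `SAMPLE`, for example `summarize_bond_flaps(open("/var/log/syslog", encoding="utf-8", errors="replace"))`; the path is an assumption about where syslog is written on this system.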


Administrator

2025-4-15 18:48:03
Thanks for the report. We will follow up on this.

Original poster

2025-4-16 13:54:25

Could the installer offer the option of using systemd-networkd with netplan instead of NetworkManager? That combination is much more stable; I have used it for many years.
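
For readers unfamiliar with the setup being requested, here is a rough sketch of what an 802.3ad bond could look like as a netplan file rendered by systemd-networkd. It is written as a small Python generator so the interface names can be swapped easily. The bond and port names come from the log earlier in this thread; `dhcp4: true`, `lacp-rate: fast`, and `mii-monitor-interval: 100` are assumptions, since the thread does not state the addressing or LACP settings, and the helper names `build_bond_config` and `to_yaml` are made up for this example.

```python
def build_bond_config(bond, ports):
    """Return a netplan-style dict for an 802.3ad bond (values partly assumed)."""
    return {
        "network": {
            "version": 2,
            "renderer": "networkd",
            "ethernets": {port: {} for port in ports},
            "bonds": {
                bond: {
                    "interfaces": list(ports),
                    "parameters": {
                        "mode": "802.3ad",
                        "lacp-rate": "fast",          # assumption
                        "mii-monitor-interval": 100,  # assumption
                    },
                    "dhcp4": True,                    # assumption
                }
            },
        }
    }


def to_yaml(node, indent=0):
    """Minimal block-style YAML emitter for nested dicts, flat lists and scalars."""
    pad = "  " * indent
    lines = []
    for key, value in node.items():
        if isinstance(value, dict):
            if value:
                lines.append(f"{pad}{key}:")
                lines.extend(to_yaml(value, indent + 1))
            else:
                lines.append(f"{pad}{key}: {{}}")
        elif isinstance(value, list):
            lines.append(f"{pad}{key}: [{', '.join(str(v) for v in value)}]")
        elif isinstance(value, bool):
            lines.append(f"{pad}{key}: {'true' if value else 'false'}")
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


if __name__ == "__main__":
    config = build_bond_config("bond1", ["enp1s0f0np0", "enp1s0f1np1"])
    print("\n".join(to_yaml(config)))
```

The printed YAML would normally be saved under `/etc/netplan/` and activated with `netplan apply` on a system that ships netplan; whether this system does is exactly what the request above is asking for.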
