作者:勋酥-osh海州吴氏 | 来源:互联网 | 2024-11-11 09:31
在分析和解决KeepalivedVIP漂移故障的过程中,我们发现主备节点配置如下:主节点IP为172.16.30.31,备份节点IP为172.16.30.32,虚拟IP为172.16.30.10。故障表现为监控系统显示Keepalived主节点状态异常,导致VIP漂移到备份节点。通过详细检查配置文件和日志,我们发现主节点上的Keepalived进程未能正常运行,最终通过优化配置和重启服务解决了该问题。此外,我们还增加了健康检查机制,以提高系统的稳定性和可靠性。
keepalived + lvs
172.16.30.31 master
172.16.30.32 backup
172.16.30.10 vip
故障:监控显示keepalived master主机故障;通过ping查看vip存在。master重启以后,VIP漂移回master.
偶然间,重启master网卡发现,vip漂移backup以后,无法漂移回master;需重启master-keealived服务,VIP才可漂移回master.
master:
message
May 9 10:47:06 sd3031 Keepalived_healthcheckers[31967]: Netlink reflector reports IP 172.16.30.31 removed
May 9 10:47:06 sd3031 Keepalived_vrrp[31968]: Netlink reflector reports IP 172.16.30.31 removed
May 9 10:47:06 sd3031 Keepalived_vrrp[31968]: Netlink reflector reports IP 172.16.30.10 removed
May 9 10:47:06 sd3031 Keepalived_healthcheckers[31967]: Netlink reflector reports IP 172.16.30.10 removed
May 9 10:47:06 sd3031 avahi-daemon[3924]: Withdrawing address record for 172.16.30.31 on eth0.
May 9 10:47:06 sd3031 avahi-daemon[3924]: Leaving mDNS multicast group on interface eth0.IPv4 with address 172.16.30.31.
May 9 10:47:06 sd3031 avahi-daemon[3924]: Joining mDNS multicast group on interface eth0.IPv4 with address 172.16.30.10.
May 9 10:47:06 sd3031 avahi-daemon[3924]: IP_ADD_MEMBERSHIP failed: No such device
May 9 10:47:06 sd3031 avahi-daemon[3924]: Withdrawing address record for 172.16.30.10 on eth0.
May 9 10:47:06 sd3031 avahi-daemon[3924]: Interface eth0.IPv4 no longer relevant for mDNS.
May 9 10:47:06 sd3031 kernel: bnx2 0000:0b:00.0: eth0: using MSIX
May 9 10:47:06 sd3031 Keepalived_vrrp[31968]: Kernel is reporting: interface eth0 DOWN
May 9 10:47:06 sd3031 Keepalived_vrrp[31968]: VRRP_Instance(VI_1) Entering FAULT STATE
May 9 10:47:06 sd3031 Keepalived_vrrp[31968]: VRRP_Instance(VI_1) removing protocol VIPs.
May 9 10:47:06 sd3031 Keepalived_vrrp[31968]: Netlink: error: Cannot assign requested address, type=(21), seq=1494298005, pid=0
May 9 10:47:06 sd3031 Keepalived_vrrp[31968]: VRRP_Instance(VI_1) Now in FAULT state
May 9 10:47:07 sd3031 Keepalived_healthcheckers[31967]: TCP socket bind failed. Rescheduling.
May 9 10:47:09 sd3031 kernel: bnx2 0000:0b:00.0: eth0: NIC Copper Link is Up, 1000 Mbps full duplex
May 9 10:47:09 sd3031 Keepalived_healthcheckers[31967]: TCP socket bind failed. Rescheduling.
May 9 10:47:09 sd3031 last message repeated 2 times
May 9 10:47:10 sd3031 Keepalived_vrrp[31968]: Kernel is reporting: interface eth0 UP
May 9 10:47:10 sd3031 Keepalived_vrrp[31968]: cant do IP_ADD_MEMBERSHIP errno=No such device (19)
May 9 10:47:10 sd3031 Keepalived_vrrp[31968]: Netlink reflector reports IP 172.16.30.31 added
May 9 10:47:10 sd3031 Keepalived_healthcheckers[31967]: Netlink reflector reports IP 172.16.30.31 added
May 9 10:47:10 sd3031 avahi-daemon[3924]: New relevant interface eth0.IPv4 for mDNS.
May 9 10:47:10 sd3031 avahi-daemon[3924]: Joining mDNS multicast group on interface eth0.IPv4 with address 172.16.30.31.
May 9 10:47:10 sd3031 avahi-daemon[3924]: Registering new address record for 172.16.30.31 on eth0.
master重启网卡以后,keepalived在eth0网卡up状态,但是未获取到ip地址的时候,无法进行组播。
出现异常
May 9 10:47:10 sd3031 Keepalived_vrrp[31968]: Kernel is reporting: interface eth0 UP
May 9 10:47:10 sd3031 Keepalived_vrrp[31968]: cant do IP_ADD_MEMBERSHIP errno=No such device (19)
May 9 10:47:10 sd3031 Keepalived_vrrp[31968]: Netlink reflector reports IP 172.16.30.31 added
keepalived进行服务之间内部通信,需要网卡IP做支撑,此出,网卡未有IP,keepalived之间通信失败。
master与backup之间通信失败,导致vip未进行漂移。
修改了keepalived.conf
vrrp_instance ** {
advert_int 2 #修改此项,为keepalived之间组播间隔时间。
}
将默认1修改为2
重启master网卡
master_message.log
May 9 11:03:30 sd3031 Keepalived_healthcheckers[5463]: Netlink reflector reports IP 172.16.30.31 removed
May 9 11:03:30 sd3031 avahi-daemon[3924]: Withdrawing address record for 172.16.30.31 on eth0.
May 9 11:03:30 sd3031 Keepalived_healthcheckers[5463]: Netlink reflector reports IP 172.16.30.10 removed
May 9 11:03:30 sd3031 avahi-daemon[3924]: Leaving mDNS multicast group on interface eth0.IPv4 with address 172.16.30.31.
May 9 11:03:30 sd3031 Keepalived_vrrp[5464]: Netlink reflector reports IP 172.16.30.31 removed
May 9 11:03:30 sd3031 Keepalived_vrrp[5464]: Netlink reflector reports IP 172.16.30.10 removed
May 9 11:03:30 sd3031 avahi-daemon[3924]: Joining mDNS multicast group on interface eth0.IPv4 with address 172.16.30.10.
May 9 11:03:30 sd3031 avahi-daemon[3924]: IP_ADD_MEMBERSHIP failed: No such device
May 9 11:03:30 sd3031 avahi-daemon[3924]: Withdrawing address record for 172.16.30.10 on eth0.
May 9 11:03:30 sd3031 avahi-daemon[3924]: Interface eth0.IPv4 no longer relevant for mDNS.
May 9 11:03:30 sd3031 Keepalived_vrrp[5464]: Kernel is reporting: interface eth0 DOWN
May 9 11:03:30 sd3031 Keepalived_vrrp[5464]: VRRP_Instance(VI_1) Entering FAULT STATE
May 9 11:03:30 sd3031 Keepalived_vrrp[5464]: VRRP_Instance(VI_1) removing protocol VIPs.
May 9 11:03:30 sd3031 Keepalived_vrrp[5464]: Netlink: error: Cannot assign requested address, type=(21), seq=1494298989, pid=0
May 9 11:03:30 sd3031 Keepalived_vrrp[5464]: VRRP_Instance(VI_1) Now in FAULT state
May 9 11:03:30 sd3031 kernel: bnx2 0000:0b:00.0: eth0: using MSIX
May 9 11:03:32 sd3031 Keepalived_healthcheckers[5463]: TCP socket bind failed. Rescheduling.
May 9 11:03:33 sd3031 kernel: bnx2 0000:0b:00.0: eth0: NIC Copper Link is Up, 1000 Mbps full duplex
May 9 11:03:34 sd3031 Keepalived_healthcheckers[5463]: TCP socket bind failed. Rescheduling.
May 9 11:03:34 sd3031 Keepalived_healthcheckers[5463]: TCP socket bind failed. Rescheduling.
May 9 11:03:34 sd3031 Keepalived_healthcheckers[5463]: Netlink reflector reports IP 172.16.30.31 added
May 9 11:03:34 sd3031 Keepalived_vrrp[5464]: Netlink reflector reports IP 172.16.30.31 added
May 9 11:03:34 sd3031 avahi-daemon[3924]: New relevant interface eth0.IPv4 for mDNS.
May 9 11:03:34 sd3031 avahi-daemon[3924]: Joining mDNS multicast group on interface eth0.IPv4 with address 172.16.30.31.
May 9 11:03:35 sd3031 avahi-daemon[3924]: Registering new address record for 172.16.30.31 on eth0.
May 9 11:03:37 sd3031 Keepalived_vrrp[5464]: Kernel is reporting: interface eth0 UP
May 9 11:03:37 sd3031 Keepalived_vrrp[5464]: VRRP_Instance(VI_1) Transition to MASTER STATE
May 9 11:03:37 sd3031 Keepalived_vrrp[5464]: VRRP_Instance(VI_1) Entering MASTER STATE
May 9 11:03:37 sd3031 Keepalived_vrrp[5464]: VRRP_Instance(VI_1) setting protocol VIPs.
May 9 11:03:37 sd3031 Keepalived_vrrp[5464]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 172.16.30.10
May 9 11:03:37 sd3031 Keepalived_healthcheckers[5463]: Netlink reflector reports IP 172.16.30.10 added
May 9 11:03:37 sd3031 avahi-daemon[3924]: Registering new address record for 172.16.30.10 on eth0.
May 9 11:03:37 sd3031 Keepalived_vrrp[5464]: Netlink reflector reports IP 172.16.30.10 added
backup_message-log
May 9 11:03:35 sd3032 Keepalived_vrrp[26277]: VRRP_Instance(VI_1) Transition to MASTER STATE
May 9 11:03:35 sd3032 Keepalived_vrrp[26277]: VRRP_Instance(VI_1) Entering MASTER STATE
May 9 11:03:35 sd3032 Keepalived_vrrp[26277]: VRRP_Instance(VI_1) setting protocol VIPs.
May 9 11:03:35 sd3032 Keepalived_vrrp[26277]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 172.16.30.10
May 9 11:03:35 sd3032 Keepalived_vrrp[26277]: Netlink reflector reports IP 172.16.30.10 added
May 9 11:03:35 sd3032 Keepalived_healthcheckers[26275]: Netlink reflector reports IP 172.16.30.10 added
May 9 11:03:35 sd3032 avahi-daemon[3963]: Registering new address record for 172.16.30.10 on eth0.
May 9 11:03:40 sd3032 Keepalived_vrrp[26277]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 172.16.30.10
May 9 11:04:05 sd3032 Keepalived_vrrp[26277]: VRRP_Instance(VI_1) Received higher prio advert
May 9 11:04:05 sd3032 Keepalived_vrrp[26277]: VRRP_Instance(VI_1) Entering BACKUP STATE
May 9 11:04:05 sd3032 Keepalived_vrrp[26277]: VRRP_Instance(VI_1) removing protocol VIPs.
May 9 11:04:05 sd3032 Keepalived_healthcheckers[26275]: Netlink reflector reports IP 172.16.30.10 removed
May 9 11:04:05 sd3032 Keepalived_vrrp[26277]: Netlink reflector reports IP 172.16.30.10 removed
May 9 11:04:05 sd3032 avahi-daemon[3963]: Withdrawing address record for 172.16.30.10 on eth0.
VIP正常漂移
本文出自 “运维笔记” 博客,请务必保留此出处http://phospherus.blog.51cto.com/7824598/1923725
keepalived漂移VIP故障