Hello everyone! Today I (大表哥) am sharing a real-world case of restarting MGR in a test environment.
Recently our company's test database environment lost power and the machines shut down unexpectedly. Most of the test environments run 3-node MySQL MGR. After the restart, we found that some nodes were in an abnormal state and could not rejoin the MGR group.
Today let's reproduce this in a lab environment.
When all MGR nodes are down, there are two typical scenarios:
the MGR cluster crashed unexpectedly, possibly after several primary/secondary switchovers, and in the end all nodes were shut down;
or
it was planned manual maintenance: the secondaries were shut down first and the primary last.
How should we restart the MGR cluster?
On which node should we run the bootstrap (group_replication_bootstrap_group) so that the group starts cleanly and no data is lost?
Our three lab nodes:
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 519dc1a6-be2b-11ec-9f85-525400743c33 | 82.156.200.136 | 3307 | ONLINE | SECONDARY | 8.0.20 |
| group_replication_applier | 9915aff5-bbd6-11ec-9110-525400743c33 | 82.156.200.136 | 3306 | ONLINE | SECONDARY | 8.0.20 |
| group_replication_applier | 9aa1cbcd-be2c-11ec-90a8-525400743c33 | 82.156.200.136 | 3308 | ONLINE | PRIMARY | 8.0.20 |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
On the primary node, 3308, we create a table and insert test data:
mysql> create database mgrdb;
Query OK, 1 row affected (0.01 sec)
mysql> use mgrdb;
Database changed
mysql> create table t_mgr(id int not null primary key, msg varchar(100));
Query OK, 0 rows affected (0.06 sec)
mysql> insert into t_mgr value ('1','before mgr crash');
Query OK, 1 row affected (0.02 sec)
Now we simulate an OOM kill by killing the mysqld process of the secondary on 3307:
[jason@VM-24-9-centos ~]$ kill -9 27476 28812
At this point the MGR group membership view is:
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 9915aff5-bbd6-11ec-9110-525400743c33 | 82.156.200.136 | 3306 | ONLINE | SECONDARY | 8.0.20 |
| group_replication_applier | 9aa1cbcd-be2c-11ec-90a8-525400743c33 | 82.156.200.136 | 3308 | ONLINE | PRIMARY | 8.0.20 |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
2 rows in set (0.00 sec)
The MGR cluster is still usable at this point, so let's keep inserting data:
mysql> insert into t_mgr value ('2','one node crash');
Query OK, 1 row affected (0.01 sec)
Next we shut down node 3306 cleanly:
mysql> shutdown;
Query OK, 0 rows affected (0.00 sec)
At this point only one node, 3308, is still alive,
and we can still insert data successfully.
(This is a characteristic of MGR: a clean shutdown is treated as the member leaving the group gracefully, so the group view simply shrinks, and the single remaining node stays readable and writable.)
mysql> use mgrdb;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> insert into t_mgr value ('3','one node crash,one was shutdown');
Query OK, 1 row affected (0.00 sec)
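As a quick sanity check (a sketch added here for illustration; the original run did not include it), we could confirm on 3308 that the group still considers itself healthy with a single member:
-- On 3308, the sole surviving member: it should still report itself as ONLINE / PRIMARY,
-- since the departures above shrank the group view rather than costing it quorum.
select member_host, member_port, member_state, member_role
  from performance_schema.replication_group_members;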
Finally, we kill the mysqld process of 3308, the only surviving node.
[jason@VM-24-9-centos ~]$ kill -9 5780 7115
At this point all three MGR nodes are down. Now let's try to recover.
First, start mysqld on all three nodes:
nohup /opt/mysql/8/bin/mysqld_safe --defaults-file=/data/mysql3306/my3306.cnf --user=jason > /dev/null 2>&1 &
nohup /opt/mysql/8/bin/mysqld_safe --defaults-file=/data/mysql_mgr3307/my3307.cnf --user=jason > /dev/null 2>&1 &
nohup /opt/mysql/8/bin/mysqld_safe --defaults-file=/data/mysql_mgr3308/my3308.cnf --user=jason > /dev/null 2>&1 &
We can see that the MGR state of all three nodes is OFFLINE:
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 9915aff5-bbd6-11ec-9110-525400743c33 | 82.156.200.136 | 3306 | OFFLINE | | |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
1 row in set (0.00 sec)
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 519dc1a6-be2b-11ec-9f85-525400743c33 | 82.156.200.136 | 3307 | OFFLINE | | |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
1 row in set (0.01 sec)
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 9aa1cbcd-be2c-11ec-90a8-525400743c33 | 82.156.200.136 | 3308 | OFFLINE | | |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
1 row in set (0.01 sec)
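Before deciding where to bootstrap, it is also worth checking on every node (a sanity check added here, not part of the original run) that no instance still has the bootstrap flag left ON from a previous recovery, because a node started with it ON would form a brand-new, independent group:
-- Run on each of 3306 / 3307 / 3308:
select @@group_replication_start_on_boot   as start_on_boot,
       @@group_replication_bootstrap_group as bootstrap_group;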
On which node should we run the bootstrap (group_replication_bootstrap_group)? Here we assume we do not know the order in which the cluster went down.
We can only use @@gtid_executed to determine which node is the most up to date; the node with the latest data was the last primary.
mysql> select @@port ,@@gtid_executed;
+--------+--------------------------------------------------------------------------------------+
| @@port | @@gtid_executed |
+--------+--------------------------------------------------------------------------------------+
| 3306 | 9915aff5-bbd6-11ec-9110-525400743c33:1-12,
b965e3bb-be27-11ec-ba80-525400743c33:1-22 |
+--------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> select @@port,@@gtid_executed;
+--------+--------------------------------------------------------------------------------------+
| @@port | @@gtid_executed |
+--------+--------------------------------------------------------------------------------------+
| 3307 | 9915aff5-bbd6-11ec-9110-525400743c33:1-12,
b965e3bb-be27-11ec-ba80-525400743c33:1-21 |
+--------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> select @@port,@@gtid_executed;
+--------+--------------------------------------------------------------------------------------+
| @@port | @@gtid_executed |
+--------+--------------------------------------------------------------------------------------+
| 3308 | 9915aff5-bbd6-11ec-9110-525400743c33:1-12,
b965e3bb-be27-11ec-ba80-525400743c33:1-23 |
+--------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
We can see that 3308 has the largest GTID set, so it was the last primary; we should bootstrap the group (group_replication_bootstrap_group) on 3308.
3306 -> b965e3bb-be27-11ec-ba80-525400743c33:1-22
3307 -> b965e3bb-be27-11ec-ba80-525400743c33:1-21
3308 -> b965e3bb-be27-11ec-ba80-525400743c33:1-23
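Besides @@gtid_executed, the official restart procedure also suggests comparing the transactions each member has received on the group replication channel but not yet applied; a sketch of that extra check (not performed in the original run) would be:
-- Run on each node, alongside @@gtid_executed:
select received_transaction_set
  from performance_schema.replication_connection_status
 where channel_name = 'group_replication_applier';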
On node 3308, run:
mysql> set global group_replication_bootstrap_group = ON;
Query OK, 0 rows affected (0.00 sec)
mysql> start group_replication;
Query OK, 0 rows affected (3.14 sec)
mysql> set global group_replication_bootstrap_group = OFF;
Query OK, 0 rows affected (0.00 sec)
On secondary node 3306, run:
mysql> start group_replication;
Query OK, 0 rows affected (3.32 sec)
On secondary node 3307, run:
mysql> start group_replication;
Query OK, 0 rows affected (3.32 sec)
Once everything is started, the MGR cluster is back at full strength:
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 519dc1a6-be2b-11ec-9f85-525400743c33 | 82.156.200.136 | 3307 | ONLINE | SECONDARY | 8.0.20 |
| group_replication_applier | 9915aff5-bbd6-11ec-9110-525400743c33 | 82.156.200.136 | 3306 | ONLINE | SECONDARY | 8.0.20 |
| group_replication_applier | 9aa1cbcd-be2c-11ec-90a8-525400743c33 | 82.156.200.136 | 3308 | ONLINE | PRIMARY | 8.0.20 |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
We check the data: it is all there.
mysql> select * from t_mgr;
+----+---------------------------------+
| id | msg |
+----+---------------------------------+
| 1 | before mgr crash |
| 2 | one node crash |
| 3 | one node crash,one was shutdown |
+----+---------------------------------+
3 rows in set (0.00 sec)
If we are unlucky and bootstrap the group (group_replication_bootstrap_group) on the wrong node (one with a lower GTID), we risk losing data, and the node with the higher GTID will refuse to join the group.
Let's look at this concretely, in a test environment where losing data is acceptable.
First we drop the table from the previous experiment:
mysql> drop table t_mgr;
Query OK, 0 rows affected (0.04 sec)
The original MGR member/role information:
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 519dc1a6-be2b-11ec-9f85-525400743c33 | 82.156.200.136 | 3307 | ONLINE | SECONDARY | 8.0.20 |
| group_replication_applier | 9915aff5-bbd6-11ec-9110-525400743c33 | 82.156.200.136 | 3306 | ONLINE | SECONDARY | 8.0.20 |
| group_replication_applier | 9aa1cbcd-be2c-11ec-90a8-525400743c33 | 82.156.200.136 | 3308 | ONLINE | PRIMARY | 8.0.20 |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
Now we insert records on the primary, 3308:
mysql> create table t_mgr(id int not null primary key, msg varchar(100));
Query OK, 0 rows affected (0.04 sec)
mysql> insert into t_mgr value ('1','before mgr crash');
Query OK, 1 row affected (0.01 sec)
Then we cleanly shut down node 3306:
mysql> select @@port;
+--------+
| @@port |
+--------+
| 3306 |
+--------+
1 row in set (0.00 sec)
mysql> shutdown;
Query OK, 0 rows affected (0.00 sec)
We insert data on the primary, 3308, again:
mysql> insert into t_mgr value ('2','one node shutdown');
Query OK, 1 row affected (0.01 sec)
Then we also shut down the secondary on 3307:
mysql> select @@port;
+--------+
| @@port |
+--------+
| 3307 |
+--------+
1 row in set (0.00 sec)
mysql> shutdown;
Query OK, 0 rows affected (0.00 sec)
Once more we insert data on the primary, 3308:
mysql> insert into t_mgr value ('3','two nodes shutdown');
Query OK, 1 row affected (0.01 sec)
Finally we shut down the primary, 3308:
mysql> shutdown;
Query OK, 0 rows affected (0.01 sec)
At this point all MGR nodes are down.
Now we bring all three nodes back up:
[jason@VM-24-9-centos ~]$ nohup /opt/mysql/8/bin/mysqld_safe --defaults-file=/data/mysql3306/my3306.cnf > /dev/null --user=jason 2>&1 &
[1] 26839
[jason@VM-24-9-centos ~]$ nohup /opt/mysql/8/bin/mysqld_safe --defaults-file=/data/mysql_mgr3307/my3307.cnf > /dev/null --user=jason 2>&1 &
[2] 26840
[jason@VM-24-9-centos ~]$ nohup /opt/mysql/8/bin/mysqld_safe --defaults-file=/data/mysql_mgr3308/my3308.cnf > /dev/null --user=jason 2>&1 &
[3] 29051
We log in to each node and check its GTID set:
mysql> select @@port ,@@gtid_executed;
+--------+--------------------------------------------------------------------------------------+
| @@port | @@gtid_executed |
+--------+--------------------------------------------------------------------------------------+
| 3306 | 9915aff5-bbd6-11ec-9110-525400743c33:1-13,
b965e3bb-be27-11ec-ba80-525400743c33:1-56 |
+--------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> select @@port ,@@gtid_executed;
+--------+--------------------------------------------------------------------------------------+
| @@port | @@gtid_executed |
+--------+--------------------------------------------------------------------------------------+
| 3307 | 9915aff5-bbd6-11ec-9110-525400743c33:1-13,
b965e3bb-be27-11ec-ba80-525400743c33:1-57 |
+--------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> select @@port ,@@gtid_executed;
+--------+--------------------------------------------------------------------------------------+
| @@port | @@gtid_executed |
+--------+--------------------------------------------------------------------------------------+
| 3308 | 9915aff5-bbd6-11ec-9110-525400743c33:1-13,
b965e3bb-be27-11ec-ba80-525400743c33:1-58 |
+--------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
The GTID of each MGR node:
3308->b965e3bb-be27-11ec-ba80-525400743c33:1-58
3307->b965e3bb-be27-11ec-ba80-525400743c33:1-57
3306->b965e3bb-be27-11ec-ba80-525400743c33:1-56
Clearly 3308 was the last primary in the cluster, with the largest GTID value: 1-58.
This time, by human error, we treat node 3306 as the primary and bootstrap the group (group_replication_bootstrap_group) on it:
mysql> set global group_replication_bootstrap_group = ON;
Query OK, 0 rows affected (0.00 sec)
mysql> start group_replication;
Query OK, 0 rows affected (3.14 sec)
mysql> set global group_replication_bootstrap_group = OFF;
Query OK, 0 rows affected (0.00 sec)
From the log we can see that node 3306 has been elected as the primary:
2022-04-20T14:04:27.161451+08:00 0 [Note] [MY-013519] [Repl] Plugin group_replication reported: 'Elected primary member gtid_executed: 9915aff5-bbd6-11ec-9110-525400743c33:1-13,
b965e3bb-be27-11ec-ba80-525400743c33:1-57'
2022-04-20T14:04:27.161470+08:00 0 [Note] [MY-013519] [Repl] Plugin group_replication reported: 'Elected primary member applier channel received_transaction_set: 9915aff5-bbd6-11ec-9110-525400743c33:1-12,
b965e3bb-be27-11ec-ba80-525400743c33:1-28:32:36-39:46-49:53-57'
2022-04-20T14:04:28.161566+08:00 0 [Note] [MY-011507] [Repl] Plugin group_replication reported: 'A new primary with address 82.156.200.136:3306 was elected. The new primary will execute all previous group transactions before allowing writes.'
2022-04-20T14:04:28.162211+08:00 24 [Note] [MY-011485] [Repl] Plugin group_replication reported: 'Primary had applied all relay logs, disabled conflict detection.'
Now we start node 3308, the one with the highest GTID:
mysql> start group_replication;
Query OK, 0 rows affected (4.06 sec)
The start fails with an error:
ERROR 3092 (HY000): The server is not configured properly to be an active member of the group. Please see more details on error log.
The detailed error log shows:
2022-04-20T14:05:29.120482+08:00 0 [ERROR] [MY-011526] [Repl] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: 9915aff5-bbd6-11ec-9110-525400743c33:1-13,
b965e3bb-be27-11ec-ba80-525400743c33:1-58 > Group transactions: 9915aff5-bbd6-11ec-9110-525400743c33:1-13,
b965e3bb-be27-11ec-ba80-525400743c33:1-57'
2022-04-20T14:05:29.120514+08:00 0 [ERROR] [MY-011522] [Repl] Plugin group_replication reported: 'The member contains transactions not present in the group. The member will now exit the group.'
The error message is clear: the joining node's GTID (1-58) is greater than the group's (1-57).
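In a production environment, the cleaner fix at this point (provided no new writes have hit the wrongly bootstrapped group yet) is not to touch GTIDs at all, but to stop group replication on the members that were started against the wrong bootstrap node and bootstrap again from 3308, the node that really holds the latest data. A sketch, reusing the same statements as in the first experiment:
-- On 3306 (and any other member already ONLINE in the wrongly bootstrapped group):
stop group_replication;
-- On 3308, the node with the highest GTID:
set global group_replication_bootstrap_group = ON;
start group_replication;
set global group_replication_bootstrap_group = OFF;
-- Then run start group_replication on 3306 and 3307 as usual.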
But if the data can be ignored, as in our test environment, we can manually set GTID_NEXT on the primary, 3306, and commit an empty transaction to bring it up to GTID 58:
mysql> set GTID_NEXT='b965e3bb-be27-11ec-ba80-525400743c33:58';
Query OK, 0 rows affected (0.00 sec)
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> commit;
Query OK, 0 rows affected (0.00 sec)
mysql> set GTID_NEXT='AUTOMATIC';
Query OK, 0 rows affected (0.00 sec)
mysql> select @@port ,@@gtid_executed;
+--------+--------------------------------------------------------------------------------------+
| @@port | @@gtid_executed |
+--------+--------------------------------------------------------------------------------------+
| 3306 | 9915aff5-bbd6-11ec-9110-525400743c33:1-13,
b965e3bb-be27-11ec-ba80-525400743c33:1-58 |
+--------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
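Note that in this case the gap is a single transaction (GTID 58). If the diverged member were further ahead, say 1-60 against the group's 1-57, an empty transaction would have to be committed for every missing GTID; a hypothetical illustration of the same trick (the extra GTID numbers below are made up for the example):
set GTID_NEXT='b965e3bb-be27-11ec-ba80-525400743c33:58'; begin; commit;
set GTID_NEXT='b965e3bb-be27-11ec-ba80-525400743c33:59'; begin; commit;
set GTID_NEXT='b965e3bb-be27-11ec-ba80-525400743c33:60'; begin; commit;
set GTID_NEXT='AUTOMATIC';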
Now we start group replication on node 3308 again:
mysql> start group_replication;
Query OK, 0 rows affected (4.39 sec)
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 9915aff5-bbd6-11ec-9110-525400743c33 | 82.156.200.136 | 3306 | ONLINE | PRIMARY | 8.0.20 |
| group_replication_applier | 9aa1cbcd-be2c-11ec-90a8-525400743c33 | 82.156.200.136 | 3308 | ONLINE | SECONDARY | 8.0.20 |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
2 rows in set (0.01 sec)
This time it starts successfully and the node rejoins the MGR cluster normally.
We bring up the remaining node, 3307, as well:
mysql> start group_replication;
Query OK, 0 rows affected (4.54 sec)
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 519dc1a6-be2b-11ec-9f85-525400743c33 | 82.156.200.136 | 3307 | ONLINE | SECONDARY | 8.0.20 |
| group_replication_applier | 9915aff5-bbd6-11ec-9110-525400743c33 | 82.156.200.136 | 3306 | ONLINE | PRIMARY | 8.0.20 |
| group_replication_applier | 9aa1cbcd-be2c-11ec-90a8-525400743c33 | 82.156.200.136 | 3308 | ONLINE | SECONDARY | 8.0.20 |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
The cluster only looks healthy; the data already differs between the nodes,
which is unacceptable from a data-consistency point of view.
mysql> select @@port;
+--------+
| @@port |
+--------+
| 3306 |
+--------+
1 row in set (0.00 sec)
mysql> select * from mgrdb.t_mgr;
+----+------------------+
| id | msg |
+----+------------------+
| 1 | before mgr crash |
+----+------------------+
1 row in set (0.00 sec)
mysql> select @@port;
+--------+
| @@port |
+--------+
| 3308 |
+--------+
1 row in set (0.00 sec)
mysql> select * from mgrdb.t_mgr;
+----+--------------------+
| id | msg |
+----+--------------------+
| 1 | before mgr crash |
| 2 | one node shutdown |
| 3 | two nodes shutdown |
+----+--------------------+
3 rows in set (0.00 sec)
So manually setting GTID_NEXT and committing an empty transaction to skip the GTID caused data inconsistency, which can have unpredictable effects on subsequent business testing.
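If the wrongly bootstrapped group had already taken writes, so that simply re-bootstrapping was no longer an option, a more robust way to bring the diverged member back would be to rebuild it from a good member instead of skipping GTIDs, for example with the clone plugin. This is a rough sketch only, assuming the plugin is available on both donor and recipient and that a suitably privileged replication account exists (the user name and password below are placeholders):
-- On the diverged member (3308 in this scenario):
install plugin clone soname 'mysql_clone.so';
set global clone_valid_donor_list = '82.156.200.136:3306';
clone instance from 'recovery_user'@'82.156.200.136':3306 identified by '***';
-- After the clone finishes and mysqld restarts, run start group_replication again.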
Final conclusion: after an MGR cluster goes down (whether through planned maintenance or an unexpected crash), the correct approach is to identify the node with the largest GTID set and bootstrap the group (group_replication_bootstrap_group) on that node as the primary, which keeps the data consistent across all members.
This is also confirmed by the official documentation: https://dev.mysql.com/doc/refman/8.0/en/group-replication-restarting-group.html
It is well worth a read; the official docs are the most rigorous source.