Percona XtraDB Cluster Fails to Start After a Forced Shutdown

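Copyright notice: This is an original article by Buddy Yuan. Please credit the source when reposting. Original article: Percona XtraDB Cluster Fails to Start After a Forced Shutdown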

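Here is what happened. In a rush to leave work, I forcibly powered off the three virtual machines running my PXC cluster. When I turned them back on at home, the cluster would not come up, and the command to start the service simply failed.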

[root@10 mysqld]# systemctl start mysql@bootstrap.service
Job for mysql@bootstrap.service failed because the control process exited with error code. See "systemctl status mysql@bootstrap.service" and "journalctl -xe" for details.

Following the hint in the error message, I checked the service status:

[root@10 mysqld]# systemctl status mysql@bootstrap.service
● mysql@bootstrap.service - Percona XtraDB Cluster with config /etc/sysconfig/mysql.bootstrap
Loaded: loaded (/usr/lib/systemd/system/mysql@.service; disabled; vendor preset: disabled)
Active: failed (Result: exit-code) since 二 2019-07-02 20:57:54 CST; 2min 1s ago
Process: 4896 ExecStopPost=/usr/bin/mysql-systemd stop-post (code=exited, status=0/SUCCESS)
Process: 4867 ExecStop=/usr/bin/mysql-systemd stop (code=exited, status=2)
Process: 4296 ExecStartPost=/usr/bin/mysql-systemd start-post $MAINPID (code=exited, status=1/FAILURE)
Process: 4295 ExecStart=/usr/bin/mysqld_safe --basedir=/usr ${EXTRA_ARGS} (code=exited, status=1/FAILURE)
Process: 4255 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
Main PID: 4295 (code=exited, status=1/FAILURE)

7月 02 20:57:54 10.0.2.15 mysql-systemd[4296]: ERROR! mysqld_safe with PID 4295 has already exited: FAILURE
7月 02 20:57:54 10.0.2.15 systemd[1]: mysql@bootstrap.service: control process exited, code=exited status=1
7月 02 20:57:54 10.0.2.15 mysql-systemd[4867]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
7月 02 20:57:54 10.0.2.15 mysql-systemd[4867]: ERROR! mysql already dead
7月 02 20:57:54 10.0.2.15 systemd[1]: mysql@bootstrap.service: control process exited, code=exited status=2
7月 02 20:57:54 10.0.2.15 mysql-systemd[4896]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
7月 02 20:57:54 10.0.2.15 mysql-systemd[4896]: WARNING: mysql may be already dead
7月 02 20:57:54 10.0.2.15 systemd[1]: Failed to start Percona XtraDB Cluster with config /etc/sysconfig/mysql.bootstrap.
7月 02 20:57:54 10.0.2.15 systemd[1]: Unit mysql@bootstrap.service entered failed state.
7月 02 20:57:54 10.0.2.15 systemd[1]: mysql@bootstrap.service failed.
[root@10 mysqld]# systemctl start mysql@bootstrap.service
Job for mysql@bootstrap.service failed because the control process exited with error code. See "systemctl status mysql@bootstrap.service" and "journalctl -xe" for details.

These messages do not reveal much, so I went on to check the MySQL error log, which had more to say:

2019-07-02T13:00:21.399956Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2019-07-02T13:00:21.400952Z 0 [Note] /usr/sbin/mysqld (mysqld 5.7.26-29-57-log) starting as process 7045 ...
2019-07-02T13:00:21.402588Z 0 [Warning] No argument was provided to --log-bin, and --log-bin-index was not used; so replication may break when this MySQL server acts as a master and has his hostname changed!! Please use '--log-bin=10-bin' to avoid this problem.
2019-07-02T13:00:21.402718Z 0 [Note] WSREP: Setting wsrep_ready to false
2019-07-02T13:00:21.402729Z 0 [Note] WSREP: No pre-stored wsrep-start position found. Skipping position initialization.
2019-07-02T13:00:21.402732Z 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera3/libgalera_smm.so'
2019-07-02T13:00:21.404735Z 0 [Note] WSREP: wsrep_load(): Galera 3.37(rff05089) by Codership Oy <info@codership.com> loaded successfully.
2019-07-02T13:00:21.404791Z 0 [Note] WSREP: CRC-32C: using hardware acceleration.
2019-07-02T13:00:21.405060Z 0 [Note] WSREP: Found saved state: 428f9095-9980-11e9-b8b6-1322440f5dbe:14, safe_to_bootstrap: 0
2019-07-02T13:00:21.406406Z 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.56.161; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 10; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 4; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.freeze_purge_at_seqno = -1; gcache.keep_pages_count = 0; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 100; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recovery = true; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = PT30S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 9; socket.checksum = 2; socket.recv_buf_size = 212992;
2019-07-02T13:00:21.413402Z 0 [Note] WSREP: GCache history reset: 428f9095-9980-11e9-b8b6-1322440f5dbe:0 -> 428f9095-9980-11e9-b8b6-1322440f5dbe:14
2019-07-02T13:00:21.414049Z 0 [Note] WSREP: Assign initial position for certification: 14, protocol version: -1
2019-07-02T13:00:21.414070Z 0 [Note] WSREP: Preparing to initiate SST/IST
2019-07-02T13:00:21.414072Z 0 [Note] WSREP: Starting replication
2019-07-02T13:00:21.414081Z 0 [Note] WSREP: Setting initial position to 428f9095-9980-11e9-b8b6-1322440f5dbe:14
2019-07-02T13:00:21.414085Z 0 [ERROR] WSREP: It may not be safe to bootstrap the cluster from this node. It was not the last one to leave the cluster and may not contain all the updates. To force cluster bootstrap with this node, edit the grastate.dat file manually and set safe_to_bootstrap to 1 .
2019-07-02T13:00:21.414088Z 0 [ERROR] WSREP: Provider/Node (gcomm://192.168.56.161,192.168.56.162,192.168.56.163) failed to establish connection with cluster (reason: 7)
2019-07-02T13:00:21.414093Z 0 [ERROR] Aborting

2019-07-02T13:00:21.414096Z 0 [Note] Giving 0 client threads a chance to die gracefully
2019-07-02T13:00:21.414100Z 0 [Note] WSREP: Waiting for active wsrep applier to exit
2019-07-02T13:00:21.414104Z 0 [Note] WSREP: Service disconnected.
2019-07-02T13:00:21.414105Z 0 [Note] WSREP: Waiting to close threads......
2019-07-02T13:00:26.414590Z 0 [Note] WSREP: Some threads may fail to exit.
2019-07-02T13:00:26.414701Z 0 [Note] Binlog end
2019-07-02T13:00:26.414959Z 0 [Note] /usr/sbin/mysqld: Shutdown complete
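
In a log this long, the lines that matter are easy to miss. As a minimal sketch, a small filter can print only the error lines. The function name `wsrep_errors` is hypothetical, and the log path has to be supplied by the caller because it depends on the server's `log_error` setting.

```shell
# Print only the [ERROR] lines from a MySQL error log.
# wsrep_errors is a hypothetical helper name; pass the path to your own
# error log as the first argument.
wsrep_errors() {
    # -F treats the pattern as a fixed string, so the brackets are literal.
    grep -F '[ERROR]' "$1"
}
```

Run against the log above, this would leave just the three `[ERROR]` lines.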

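The key part of the log is the first [ERROR] line raised during startup. It says that bootstrapping the cluster from this node may not be safe: the node was not the last one to leave the cluster, so it may not contain all the updates. To force a bootstrap from this node anyway, you would edit the grastate.dat file by hand and set safe_to_bootstrap to 1. So my plan was to bootstrap from whichever node left the cluster last. You could of course edit the file and change safe_to_bootstrap to 1, but I do not recommend it, since a node that was not the last to leave may be missing recent updates. This is what the file looks like on the node I tried first: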

[root@10 /]# vi /var/lib/mysql/grastate.dat
# GALERA saved state
version: 2.1
uuid: 428f9095-9980-11e9-b8b6-1322440f5dbe
seqno: 14
safe_to_bootstrap: 0
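
Instead of opening the file in an editor, a single field can be read directly. This is a minimal sketch that assumes the simple `key: value` layout shown above; `grastate_field` is a hypothetical helper name.

```shell
# Print the value of one field from a Galera grastate.dat file.
# grastate_field is a hypothetical helper name.
# Usage: grastate_field FILE FIELD
grastate_field() {
    # Split each line on the colon (plus any following spaces) and print
    # the value of the first line whose key matches exactly.
    awk -v key="$2" -F': *' '$1 == key { print $2; exit }' "$1"
}
```

For example, `grastate_field /var/lib/mysql/grastate.dat safe_to_bootstrap` prints `0` for the file shown above.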

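Comparing this file across the three nodes, I found that on node 3 the value was 1. So I used node 3 to bootstrap the cluster, and it came up successfully.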
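
The comparison across nodes can also be scripted. The following is a rough sketch, not a definitive procedure. It assumes each node's grastate.dat has been copied to one machine under a separate name; the function name `pick_bootstrap_node` and any file names passed to it are hypothetical. It prefers a copy marked `safe_to_bootstrap: 1` and otherwise only reports the copy with the highest `seqno`. It never modifies any file.

```shell
# Pick the grastate.dat copy that looks safest to bootstrap from.
# pick_bootstrap_node is a hypothetical helper; pass one copied
# grastate.dat per node as arguments.
pick_bootstrap_node() {
    best_file=""
    best_seqno=""
    for f in "$@"; do
        safe=$(awk -F': *' '$1 == "safe_to_bootstrap" { print $2; exit }' "$f")
        seqno=$(awk -F': *' '$1 == "seqno" { print $2; exit }' "$f")
        # A node marked safe_to_bootstrap: 1 was the last to leave: use it.
        if [ "$safe" = "1" ]; then
            echo "$f"
            return 0
        fi
        # Otherwise remember the copy with the highest seqno seen so far,
        # skipping files where no seqno line was found.
        if [ -n "$seqno" ]; then
            if [ -z "$best_seqno" ] || [ "$seqno" -gt "$best_seqno" ]; then
                best_file=$f
                best_seqno=$seqno
            fi
        fi
    done
    echo "$best_file"
}
```

Called as `pick_bootstrap_node node1.dat node2.dat node3.dat`, it prints the name of the copy to look at first. In the situation described here it would point at node 3's copy, since that is the one with `safe_to_bootstrap: 1`.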

[root@10 ~]# systemctl start mysql@bootstrap.service