MySQL集群搭建(5)-MHA高可用架構(gòu)

Michael_Lin 發(fā)布于2019-06-25 19:01 / 3890人閱讀

摘要：前面的文章介紹了怎么從單點(diǎn)開始搭建集群，列表如下安裝二進(jìn)制版集群搭建主備搭建集群搭建主主從模式集群搭建高可用架構(gòu)集群搭建今天說(shuō)另一個(gè)常用的高可用方案概述簡(jiǎn)介是由實(shí)現(xiàn)的一款高可用程序，出現(xiàn)故障時(shí)，以最小的停機(jī)時(shí)間通常秒執(zhí)行的故障轉(zhuǎn)

前面的文章介紹了怎么從單點(diǎn)開始搭建MySQL集群，列表如下

MySQL 安裝(二進(jìn)制版)

MySQL集群搭建(1)-主備搭建

MySQL集群搭建(2)-主主從模式

MySQL集群搭建(3)-MMM高可用架構(gòu)

MySQL集群搭建(4)-MMM+LVS+Keepalived

今天說(shuō)另一個(gè)常用的高可用方案: MHA

1 概述 1.1 MHA 簡(jiǎn)介

MHA - Master High Availability 是由 Perl 實(shí)現(xiàn)的一款高可用程序，出現(xiàn)故障時(shí)，MHA 以最小的停機(jī)時(shí)間(通常10-30秒)執(zhí)行 master 的故障轉(zhuǎn)移以及 slave 的升級(jí)。MHA 可防止復(fù)制一致性問(wèn)題，并且易于安裝，不需要改變現(xiàn)有部署。

MHA 由MHA manager和MHA node組成, MHA manager是一個(gè)監(jiān)控管理程序，用于監(jiān)控MySQL master狀態(tài); MHA node是具有故障轉(zhuǎn)移的工具腳本，如解析 MySQL 二進(jìn)制/中繼日志，傳輸應(yīng)用事件到Slave, MHA node在每個(gè)MySQL服務(wù)器上運(yùn)行。

出自 MHA Wiki

MHA manager調(diào)用MHA node工具腳本的方式是SSH到主機(jī)上然后執(zhí)行命令，所以各節(jié)點(diǎn)需要做等效驗(yàn)證。

1.2 MHA 怎么保證數(shù)據(jù)不丟失

當(dāng)Master宕機(jī)后，MHA會(huì)嘗試保存宕機(jī)Master的二進(jìn)制日志，然后自動(dòng)判斷MySQL集群中哪個(gè)實(shí)例的中繼日志是最新的，并將有最新日志的實(shí)例的差異日志傳到其他實(shí)例補(bǔ)齊，從而實(shí)現(xiàn)所有實(shí)例數(shù)據(jù)一致。然后把宕機(jī)Master的二進(jìn)制日志應(yīng)用到選定節(jié)點(diǎn)，并提升為 Master。

具體流程如下：

嘗試從宕機(jī)Master中保存二進(jìn)制日志

找到含有最新中繼日志的Slave

把最新中繼日志應(yīng)用到其他實(shí)例，實(shí)現(xiàn)各實(shí)例數(shù)據(jù)一致

應(yīng)用從Master保存的二進(jìn)制日志事件

提升一個(gè)Slave為Master

其他Slave向該新Master同步

從切換流程流程可以看到，如果宕機(jī)Master主機(jī)無(wú)法SSH登錄，那么第一步就沒(méi)辦法實(shí)現(xiàn)，對(duì)于MySQL5.5以前的版本，數(shù)據(jù)還是有丟失的風(fēng)險(xiǎn)。對(duì)于5.5后的版本，開啟半同步復(fù)制后，真正有助于避免數(shù)據(jù)丟失，半同步復(fù)制保證至少一個(gè) （不是所有）slave 在 master 提交時(shí)接收到二進(jìn)制日志事件。因此，對(duì)于可以處理一致性問(wèn)題的MHA 可以實(shí)現(xiàn)"幾乎沒(méi)有數(shù)據(jù)丟失"和"從屬一致性"。

1.3 MHA 優(yōu)點(diǎn)和限制 優(yōu)點(diǎn)

開源，用Perl編寫

方案成熟，故障切換時(shí)，MHA會(huì)做日志補(bǔ)齊操作，盡可能減少數(shù)據(jù)丟失，保證數(shù)據(jù)一

部署不需要改變現(xiàn)有架構(gòu)

限制

各個(gè)節(jié)點(diǎn)要打通SSH信任，有一定的安全隱患

沒(méi)有 Slave 的高可用

自帶的腳本不足，例如虛IP配置需要自己寫命令或者依賴其他軟件

需要手動(dòng)清理中繼日志

1.4 MHA 常用兩種復(fù)制配置 單 master，多 slave

        M(RW)
        |
+-------+-------+
S1(R)  S2(R)   S3(R)

這種復(fù)制方式非常常見，當(dāng)Master宕機(jī)時(shí)，MHA會(huì)選一個(gè)日志最新的主機(jī)升級(jí)為Master, 如果不希望個(gè)節(jié)點(diǎn)成為Master，把no_master設(shè)為1就可以。

多 master, 多 slave

        M(RW)----M2(R, candidate_master=1)
        |
+-------+-------+
S1(R)          S2(R)

雙主結(jié)構(gòu)也是常見的復(fù)制模式，如果當(dāng)前Master崩潰, MHA會(huì)選擇只讀Master成為新的Master

2 數(shù)據(jù)庫(kù)環(huán)境準(zhǔn)備

本次演示使用復(fù)制方式是主主從，主主從數(shù)據(jù)庫(kù)搭建方式參考以前文章

2.1 節(jié)點(diǎn)信息

IP	系統(tǒng)	端口	MySQL版本	節(jié)點(diǎn)	讀寫	說(shuō)明
10.0.0.247	Centos6.5	3306	5.7.9	Master	讀寫	主節(jié)點(diǎn)
10.0.0.248	Centos6.5	3306	5.7.9	Standby	只讀,可切換為讀寫	備主節(jié)點(diǎn)
10.0.0.249	Centos6.5	3306	5.7.9	Slave	只讀	從節(jié)點(diǎn)
10.0.0.24	Centos6.5	-	-	manager	-	MHA Manager
10.0.0.237	-	-	-	-	-	VIP

2.2 架構(gòu)圖

2.3 參考配置

Master1

[client]
port = 3306
default-character-set=utf8mb4
socket = /data/mysql_db/test_db/mysql.sock

[mysqld]
datadir = /data/mysql_db/test_db
basedir = /usr/local/mysql57
tmpdir = /tmp
socket = /data/mysql_db/test_db/mysql.sock
pid-file = /data/mysql_db/test_db/mysql.pid
skip-external-locking = 1
skip-name-resolve = 1
port = 3306
server_id = 2473306

default-storage-engine = InnoDB
character-set-server = utf8mb4
default_password_lifetime=0

auto_increment_offset = 1
auto_increment_increment = 2

#### log ####
log_timestamps=system
log_bin = /data/mysql_log/test_db/mysql-bin
log_bin_index = /data/mysql_log/test_db/mysql-bin.index
binlog_format = row
relay_log_recovery=ON
relay_log=/data/mysql_log/test_db/mysql-relay-bin
relay_log_index=/data/mysql_log/test_db/mysql-relay-bin.index
log_error = /data/mysql_log/test_db/mysql-error.log

#### replication ####
log_slave_updates = 1
replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.%

#### semi sync replication settings #####
plugin_dir=/usr/local/mysql57/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1

Master2

[client]
port = 3306
default-character-set=utf8mb4
socket = /data/mysql_db/test_db/mysql.sock

[mysqld]
datadir = /data/mysql_db/test_db
basedir = /usr/local/mysql57
tmpdir = /tmp
socket = /data/mysql_db/test_db/mysql.sock
pid-file = /data/mysql_db/test_db/mysql.pid
skip-external-locking = 1
skip-name-resolve = 1
port = 3306
server_id = 2483306

default-storage-engine = InnoDB
character-set-server = utf8mb4
default_password_lifetime=0

auto_increment_offset = 2
auto_increment_increment = 2

#### log ####
log_timestamps=system
log_bin = /data/mysql_log/test_db/mysql-bin
log_bin_index = /data/mysql_log/test_db/mysql-bin.index
binlog_format = row
relay_log_recovery=ON
relay_log=/data/mysql_log/test_db/mysql-relay-bin
relay_log_index=/data/mysql_log/test_db/mysql-relay-bin.index
log_error = /data/mysql_log/test_db/mysql-error.log

#### replication ####
log_slave_updates = 1
replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.%

#### semi sync replication settings #####
plugin_dir=/usr/local/mysql57/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1

Slave

[client]
port = 3306
default-character-set=utf8mb4
socket = /data/mysql_db/test_db/mysql.sock

[mysqld]
datadir = /data/mysql_db/test_db
basedir = /usr/local/mysql57
tmpdir = /tmp
socket = /data/mysql_db/test_db/mysql.sock
pid-file = /data/mysql_db/test_db/mysql.pid
skip-external-locking = 1
skip-name-resolve = 1
port = 3306
server_id = 2493306

default-storage-engine = InnoDB
character-set-server = utf8mb4
default_password_lifetime=0

read_only=1

#### log ####
log_timestamps=system
log_bin = /data/mysql_log/test_db/mysql-bin
log_bin_index = /data/mysql_log/test_db/mysql-bin.index
binlog_format = row
relay_log_recovery=ON
relay_log=/data/mysql_log/test_db/mysql-relay-bin
relay_log_index=/data/mysql_log/test_db/mysql-relay-bin.index
log_error = /data/mysql_log/test_db/mysql-error.log

#### replication ####
log_slave_updates = 1
replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.%

#### semi sync replication settings #####
plugin_dir=/usr/local/mysql57/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1

3 安裝配置 MHA 3.1 下載 MHA

進(jìn)入 MHA 下載頁(yè)面 Downloads, 下載Manager和Node節(jié)點(diǎn)安裝包，由于我的服務(wù)器是centos6，所以下載了MHA Manager 0.56 rpm RHEL6和MHA Node 0.56 rpm RHEL6

3.2 安裝 MHA

Node安裝

在所有主機(jī)(包括Manager)上執(zhí)行

# 安裝依賴
yum install perl perl-devel perl-DBD-MySQL
# 安裝 node 工具
rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm

Manager安裝

在 Manager 主機(jī)上執(zhí)行

# 安裝依賴
yum install -y perl perl-devel perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
# 安裝 manager
rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm

MHA Installation

3.3 創(chuàng)建 MHA 管理用戶

管理用戶需要執(zhí)行一些數(shù)據(jù)庫(kù)管理命令包括STOP SLAVE, CHANGE MASTER, RESET SLAVE

create user mha_manager@"%" identified by "mha_manager";
grant all on *.* to mha_manager@"%";
flush privileges;

3.4 增加 MySQL 用戶 sudo 權(quán)限

配置 VIP 需要有 sudo 權(quán)限

打開/etc/sudoers文件, 增加一條

root    ALL=(ALL)       ALL
# 這個(gè)是增加的
mysql   ALL=(ALL)       NOPASSWD: ALL

然后把Defaults requiretty注釋掉

# Defaults    requiretty

3.5 配置各主機(jī)免密碼登陸

所有主機(jī)執(zhí)行

# 進(jìn)入 mysql 用戶
su - mysql

# 生成密鑰對(duì), 執(zhí)行命令，然后按回車
ssh-keygen -t rsa

# 復(fù)制公鑰到相應(yīng)主機(jī)
ssh-copy-id [email protected]
ssh-copy-id [email protected]
ssh-copy-id [email protected]
ssh-copy-id [email protected]

3.6 配置 Manager

新建/etc/masterha目錄，我們把配置文件放到這里

mkdir /etc/masterha

創(chuàng)建配置文件/etc/masterha/app1.cnf, 寫上配置

[server default]
manager_workdir=/etc/masterha                  # 設(shè)置 manager 的工作目錄, 可以自己調(diào)整
manager_log=/etc/masterha/manager.log          # 設(shè)置 manager 的日志文件
master_binlog_dir=/data/mysql_log/test_db      # 設(shè)置 master binlog 的日志的位置
master_ip_failover_script= /etc/masterha/script/master_ip_failover            # 設(shè)置自動(dòng) failover 時(shí)的切換腳本, 腳本參考附件
master_ip_online_change_script= /etc/masterha/script/master_ip_online_change  # 設(shè)置手動(dòng)切換時(shí)執(zhí)行的切換腳本, 腳本參考附件

user=mha_manager            # 設(shè)置管理用戶, 用來(lái)監(jiān)控、配置 MySQL(STOP SLAVE, CHANGE MASTER, RESET SLAVE), 默認(rèn)為 root
password=mha_manager        # 設(shè)置管理用戶密碼

repl_user=repl              # 設(shè)置復(fù)制環(huán)境中的復(fù)制用戶名
repl_password=repl          # 設(shè)置復(fù)制用戶的密碼

ping_interval=1             # 發(fā)送 ping 包的時(shí)間間隔，三次沒(méi)有回應(yīng)就自動(dòng)進(jìn)行 failover
remote_workdir=/tmp         # 設(shè)置遠(yuǎn)端 MySQL 的工作目錄

report_script=/etc/masterha/script/send_report    # 設(shè)置發(fā)生切換后執(zhí)行的腳本

# 檢查腳本
secondary_check_script= /usr/bin/masterha_secondary_check-s 10.0.0.247 -s 10.0.0.248            

shutdown_script=""              #設(shè)置故障發(fā)生后關(guān)閉故障主機(jī)腳本(可以用于防止腦裂)

ssh_user=mysql                  #設(shè)置 ssh 的登錄用戶名

[server1]
hostname=10.0.0.247
port=3306

[server2]
hostname=10.0.0.248
port=3306
candidate_master=1   # 設(shè)置為候選 master, 如果發(fā)生宕機(jī)切換，會(huì)把該節(jié)點(diǎn)設(shè)為新 Master，即使它不是數(shù)據(jù)最新的節(jié)點(diǎn)
check_repl_delay=0   # 默認(rèn)情況下，一個(gè) Slave 落后 Master 100M 的中繼日志，MHA 不會(huì)選擇它作為新的 Master，因?yàn)檫@對(duì)于 Slave 恢復(fù)數(shù)據(jù)要很長(zhǎng)時(shí)間，check_repl_delay=0 的時(shí)候會(huì)忽略延遲，可以和 candidate_master=1 配合用

[server3]
hostname=10.0.0.249
port=3306
no_master=1         # 從不將這臺(tái)主機(jī)升級(jí)為 Master
ignore_fail=1       # 默認(rèn)情況下，如果有 Slave 節(jié)點(diǎn)掛了， 就不進(jìn)行切換，設(shè)置 ignore_fail=1 可以忽然它

創(chuàng)建配置文件/etc/masterha/app2.cnf, 以備用Master 為 Master, 方便切換后啟動(dòng)MHA

[server default]
manager_workdir=/etc/masterha                  # 設(shè)置 manager 的工作目錄, 可以自己調(diào)整
manager_log=/etc/masterha/manager.log          # 設(shè)置 manager 的日志文件
master_binlog_dir=/data/mysql_log/test_db      # 設(shè)置 master binlog 的日志的位置
master_ip_failover_script= /etc/masterha/script/master_ip_failover            # 設(shè)置自動(dòng) failover 時(shí)的切換腳本
master_ip_online_change_script= /etc/masterha/script/master_ip_online_change  # 設(shè)置手動(dòng)切換時(shí)執(zhí)行的切換腳本

user=mha_manager            # 設(shè)置管理用戶, 用來(lái)監(jiān)控、配置 MySQL(STOP SLAVE, CHANGE MASTER, RESET SLAVE), 默認(rèn)為 root
password=mha_manager        # 設(shè)置管理用戶密碼

repl_user=repl              # 設(shè)置復(fù)制環(huán)境中的復(fù)制用戶名
repl_password=repl          # 設(shè)置復(fù)制用戶的密碼

ping_interval=1             # 發(fā)送 ping 包的時(shí)間間隔，三次沒(méi)有回應(yīng)就自動(dòng)進(jìn)行 failover
remote_workdir=/tmp         # 設(shè)置遠(yuǎn)端 MySQL 的工作目錄

report_script=/etc/masterha/script/send_report    # 設(shè)置發(fā)生切換后執(zhí)行的腳本

# 檢查腳本
secondary_check_script= /usr/bin/masterha_secondary_check -s 10.0.0.248 -s 10.0.0.247

shutdown_script=""              #設(shè)置故障發(fā)生后關(guān)閉故障主機(jī)腳本(可以用于防止腦裂)

ssh_user=mysql                  #設(shè)置 ssh 的登錄用戶名

[server1]
hostname=10.0.0.248
port=3306

[server2]
hostname=10.0.0.247
port=3306
candidate_master=1   # 設(shè)置為候選 master, 如果發(fā)生宕機(jī)切換，會(huì)把該節(jié)點(diǎn)設(shè)為新 Master，即使它不是數(shù)據(jù)最新的節(jié)點(diǎn)
check_repl_delay=0   # 默認(rèn)情況下，一個(gè) Slave 落后 Master 100M 的中繼日志，MHA 不會(huì)選擇它作為新的 Master，因?yàn)檫@對(duì)于 Slave 恢復(fù)數(shù)據(jù)要很長(zhǎng)時(shí)間，check_repl_delay=0 的時(shí)候會(huì)忽略延遲，可以和 candidate_master=1 配合用

[server3]
hostname=10.0.0.249
port=3306
no_master=1         # 從不將這臺(tái)主機(jī)升級(jí)為 Master
ignore_fail=1       # 默認(rèn)情況下，如果有 Slave 節(jié)點(diǎn)掛了， 就不進(jìn)行切換，設(shè)置 ignore_fail=1 可以忽然它

注意：使用的時(shí)候去掉注釋

3.7 配置切換腳本 管理 VIP 方式

MHA管理VIP有兩種方案，一種是使用Keepalived，另一種是自己寫命令實(shí)現(xiàn)增刪VIP，由于Keepalived容易受到網(wǎng)絡(luò)波動(dòng)造成VIP切換，而且無(wú)法在多實(shí)例機(jī)器上使用，所以建議寫腳本管理VIP。

當(dāng)前主機(jī)的網(wǎng)卡是eth0, 可以通過(guò)下列命令增刪 VIP

up VIP

sudo /sbin/ifconfig eth0:1 10.0.0.237 netmask 255.255.255.255

down VIP

sudo /sbin/ifconfig eth0:1 down

配置切換腳本

master_ip_failover , master_ip_online_change和send_report腳本在附錄里面

更改 mysql 配置

MHA的檢測(cè)比較嚴(yán)格，所以我們把除Master外的節(jié)點(diǎn)設(shè)為read_only, 有必要可以寫進(jìn)配置文件里面

# mysql shell
set global read_only=1;

MHA需要使用中繼日志來(lái)實(shí)現(xiàn)數(shù)據(jù)一致性，所以所有節(jié)點(diǎn)要設(shè)置不自動(dòng)清理中繼日志

# mysql shell
set global relay_log_purge=0;

也可以寫入配置文件

# my.cnf
relay_log_purge=0

MHA 常用命令

Manager

masterha_check_ssh              檢查 MHA 的 SSH 配置狀況    
masterha_check_repl             檢查 MySQL 復(fù)制狀況
masterha_manger                 啟動(dòng) MHA
masterha_stop                   停止 MHA
masterha_check_status           檢測(cè)當(dāng)前 MHA 運(yùn)行狀態(tài)
masterha_master_monitor         檢測(cè) master 是否宕機(jī)
masterha_master_switch          手動(dòng)故障轉(zhuǎn)移
masterha_conf_host              添加或刪除配置的 server 信息

Node

save_binary_logs                保存 master 的二進(jìn)制日志
apply_diff_relay_logs           對(duì)比識(shí)別中繼日志的差異部分
purge_relay_logs                清除中繼日志(MHA中繼日志需要使用這個(gè)命令清除)

命令的使用方法可以通過(guò)執(zhí)行命令 --help 得到

驗(yàn)證 SSH 是否成功、主從狀態(tài)是否正常

在 manager 節(jié)點(diǎn)執(zhí)行 masterha_check_ssh --conf=/etc/masterha/app1.cnf 檢測(cè)SSH狀態(tài)，下面是執(zhí)行結(jié)果

[mysql@chengqm ~]$ masterha_check_ssh --conf=/etc/masterha/app1.cnf
Thu Dec 20 19:47:18 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Dec 20 19:47:18 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Dec 20 19:47:18 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Dec 20 19:47:18 2018 - [info] Starting SSH connection tests..
Thu Dec 20 19:47:19 2018 - [debug] 
Thu Dec 20 19:47:18 2018 - [debug]  Connecting via SSH from [email protected](10.0.0.247:22) to [email protected](10.0.0.248:22)..
Thu Dec 20 19:47:19 2018 - [debug]   ok.
Thu Dec 20 19:47:19 2018 - [debug]  Connecting via SSH from [email protected](10.0.0.247:22) to [email protected](10.0.0.249:22)..
Thu Dec 20 19:47:19 2018 - [debug]   ok.
Thu Dec 20 19:47:19 2018 - [debug] 
Thu Dec 20 19:47:19 2018 - [debug]  Connecting via SSH from [email protected](10.0.0.248:22) to [email protected](10.0.0.247:22)..
Thu Dec 20 19:47:19 2018 - [debug]   ok.
Thu Dec 20 19:47:19 2018 - [debug]  Connecting via SSH from [email protected](10.0.0.248:22) to [email protected](10.0.0.249:22)..
Thu Dec 20 19:47:19 2018 - [debug]   ok.
Thu Dec 20 19:47:20 2018 - [debug] 
Thu Dec 20 19:47:19 2018 - [debug]  Connecting via SSH from [email protected](10.0.0.249:22) to [email protected](10.0.0.247:22)..
Thu Dec 20 19:47:20 2018 - [debug]   ok.
Thu Dec 20 19:47:20 2018 - [debug]  Connecting via SSH from [email protected](10.0.0.249:22) to [email protected](10.0.0.248:22)..
Thu Dec 20 19:47:20 2018 - [debug]   ok.
Thu Dec 20 19:47:20 2018 - [info] All SSH connection tests passed successfully.

在 manager 節(jié)點(diǎn)執(zhí)行 masterha_check_repl --conf=/etc/masterha/app1.cnf 檢測(cè)同步狀態(tài)，下面是執(zhí)行結(jié)果

[mysql@chengqm ~]$ masterha_check_repl --conf=/etc/masterha/app1.cnf
Thu Dec 20 20:05:03 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Dec 20 20:05:03 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Dec 20 20:05:03 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Dec 20 20:05:03 2018 - [info] MHA::MasterMonitor version 0.56.
Thu Dec 20 20:05:03 2018 - [info] Multi-master configuration is detected. Current primary(writable) master is 10.0.0.247(10.0.0.247:3306)
Thu Dec 20 20:05:03 2018 - [info] Master configurations are as below: 
Master 10.0.0.247(10.0.0.247:3306), replicating from 10.0.0.248(10.0.0.248:3306)
Master 10.0.0.248(10.0.0.248:3306), replicating from 10.0.0.247(10.0.0.247:3306), read-only
================ 省略 ==================
Thu Dec 20 20:05:08 2018 - [info]   /etc/masterha/script/master_ip_failover --command=status --ssh_user=mysql --orig_master_host=10.0.0.247 --orig_master_ip=10.0.0.247 --orig_master_port=3306 
Thu Dec 20 20:05:08 2018 - [info]  OK.
Thu Dec 20 20:05:08 2018 - [warning] shutdown_script is not defined.
Thu Dec 20 20:05:08 2018 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

出現(xiàn) MySQL Replication Health is OK. 表示成功

如果出現(xiàn)Failed to get master_ip_failover_script status with return code 255:0這個(gè)錯(cuò)誤，就注釋掉master_ip_failover腳本的FIXME_xxx

注意：要想正常運(yùn)行，系統(tǒng)路徑必須要有 mysqlbinlog 和 mysql 命令

4 啟動(dòng)和測(cè)試 4.1 啟動(dòng)

使用腳本管理 VIP 不會(huì)自動(dòng)設(shè)置 VIP，所以先手動(dòng)在 Master 設(shè)置 VIP

[root@cluster01 ~]# /sbin/ifconfig eth0:1 10.0.0.237 netmask 255.255.255.255
[root@cluster01 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr FA:16:3E:DE:80:33  
          inet addr:10.0.0.247  Bcast:10.0.255.255  Mask:255.255.0.0
          inet6 addr: fe80::f816:3eff:fede:8033/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:17333247 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5472004 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1476157398 (1.3 GiB)  TX bytes:1064253754 (1014.9 MiB)

eth0:1    Link encap:Ethernet  HWaddr FA:16:3E:DE:80:33  
          inet addr:10.0.0.237  Bcast:10.0.0.237  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
...

啟動(dòng) MHA Manager

[mysql@chengqm ~]$ nohup /usr/bin/masterha_manager --conf=/etc/masterha/app1.cnf --ignore_last_failover &
[1] 21668

--ignore_last_failover 忽略上次切換。MHA每次故障切換后都會(huì)生成一個(gè)app1.failover.complete這樣的文件，如果不加這個(gè)參數(shù)，需要?jiǎng)h除這個(gè)文件才能再次啟動(dòng)

檢查啟動(dòng)日志

[mysql@chengqm ~]$ tail -18 /etc/masterha/manager.log 
Fri Dec 21 13:56:39 2018 - [info] 
10.0.0.247(10.0.0.247:3306) (current master)
 +--10.0.0.248(10.0.0.248:3306)
 +--10.0.0.249(10.0.0.249:3306)

Fri Dec 21 13:56:39 2018 - [info] Checking master_ip_failover_script status:
Fri Dec 21 13:56:39 2018 - [info]   /etc/masterha/script/master_ip_failover --command=status --ssh_user=mysql --orig_master_host=10.0.0.247 --orig_master_ip=10.0.0.247 --orig_master_port=3306 


 VIP Command: start=sudo /sbin/ifconfig eth0:1 10.0.0.237 netmask 255.255.255.255 stop=sudo /sbin/ifconfig eth0:1 down

Check script.. OK 
Fri Dec 21 13:56:39 2018 - [info]  OK.
Fri Dec 21 13:56:39 2018 - [warning] shutdown_script is not defined.
Fri Dec 21 13:56:39 2018 - [info] Set master ping interval 1 seconds.
Fri Dec 21 13:56:39 2018 - [info] Set secondary check script: /usr/bin/masterha_secondary_check -s 10.0.0.247 -s 10.0.0.248
Fri Dec 21 13:56:39 2018 - [info] Starting ping health check on 10.0.0.247(10.0.0.247:3306)..
Fri Dec 21 13:56:39 2018 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn"t respond..

日志中顯示 Ping(SELECT) succeeded, waiting until MySQL doesn"t respond 表示啟動(dòng)成功

如果查看Master的general日志，會(huì)發(fā)現(xiàn)MHA不斷執(zhí)行SELECT 1 As Value檢查命令

4.2 失效轉(zhuǎn)移

我們模擬Master數(shù)據(jù)庫(kù)宕機(jī)的情況

[root@cluster01 ~]# ps -ef | grep mysql
mysql    20061     1  0 11:19 pts/0    00:00:00 /bin/sh /usr/local/mysql57/bin/mysqld_safe --defaults-file=/data/mysql_db/test_db/my.cnf --datadir=/data/mysql_db/test_db --pid-file=/data/mysql_db/test_db/mysql.pid
mysql    20494 20061  0 11:19 pts/0    00:00:21 /usr/local/mysql57/bin/mysqld --defaults-file=/data/mysql_db/test_db/my.cnf --basedir=/usr/local/mysql57 --datadir=/data/mysql_db/test_db --plugin-dir=/usr/local/mysql57/lib/plugin --log-error=/data/mysql_log/test_db/mysql-error.log --pid-file=/data/mysql_db/test_db/mysql.pid --socket=/data/mysql_db/test_db/mysql.sock --port=3306
[root@cluster01 ~]# kill -9 20061 20494

查看MHA日志可以看到整個(gè)切換過(guò)程

Fri Dec 21 14:04:49 2018 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Fri Dec 21 14:04:49 2018 - [info] Executing secondary network check script: /usr/bin/masterha_secondary_check -s 10.0.0.247 -s 10.0.0.248  --user=mysql  --master_host=10.0.0.247  --master_ip=10.0.0.247  --master_port=3306 --master_user=mha_manager --master_password=mha_manager --ping_type=SELECT
Fri Dec 21 14:04:49 2018 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql_log/test_db --output_file=/tmp/save_binary_logs_test --manager_version=0.56 --binlog_prefix=mysql-bin
Monitoring server 10.0.0.247 is reachable, Master is not reachable from 10.0.0.247. OK.
Fri Dec 21 14:04:49 2018 - [info] HealthCheck: SSH to 10.0.0.247 is reachable.
Monitoring server 10.0.0.248 is reachable, Master is not reachable from 10.0.0.248. OK.
Fri Dec 21 14:04:49 2018 - [info] Master is not reachable from all other monitoring servers. Failover should start.
=============== 省略 ================
Fri Dec 21 14:04:52 2018 - [info] Forcing shutdown so that applications never connect to the current master..
Fri Dec 21 14:04:52 2018 - [info] Executing master IP deactivation script:
Fri Dec 21 14:04:52 2018 - [info]   /etc/masterha/script/master_ip_failover --orig_master_host=10.0.0.247 --orig_master_ip=10.0.0.247 --orig_master_port=3306 --command=stopssh --ssh_user=mysql  


 VIP Command: start=sudo /sbin/ifconfig eth0:1 10.0.0.237 netmask 255.255.255.255 stop=sudo /sbin/ifconfig eth0:1 down

Disabling the VIP on old master: 10.0.0.247 
SIOCSIFFLAGS: Cannot assign requested address
Fri Dec 21 14:04:52 2018 - [info]  done.
=============== 省略 ================
Fri Dec 21 14:04:53 2018 - [info] Starting master failover..
Fri Dec 21 14:04:53 2018 - [info] 
From:
10.0.0.247(10.0.0.247:3306) (current master)
 +--10.0.0.248(10.0.0.248:3306)
 +--10.0.0.249(10.0.0.249:3306)

To:
10.0.0.248(10.0.0.248:3306) (new master)
 +--10.0.0.249(10.0.0.249:3306)
Fri Dec 21 14:04:53 2018 - [info]
=============== 省略 ================
Fri Dec 21 14:04:53 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST="10.0.0.248", MASTER_PORT=3306, MASTER_LOG_FILE="mysql-bin.000005", MASTER_LOG_POS=154, MASTER_USER="repl", MASTER_PASSWORD="xxx";
Fri Dec 21 14:04:53 2018 - [info] Executing master IP activate script:
Fri Dec 21 14:04:53 2018 - [info]   /etc/masterha/script/master_ip_failover --command=start --ssh_user=mysql --orig_master_host=10.0.0.247 --orig_master_ip=10.0.0.247 --orig_master_port=3306 --new_master_host=10.0.0.248 --new_master_ip=10.0.0.248 --new_master_port=3306 --new_master_user="mha_manager" --new_master_password="mha_manager"  


 VIP Command: start=sudo /sbin/ifconfig eth0:1 10.0.0.237 netmask 255.255.255.255 stop=sudo /sbin/ifconfig eth0:1 down

Set read_only=0 on the new master.
Enabling the VIP - 10.0.0.237 on the new master - 10.0.0.248 
=============== 省略 ================
Fri Dec 21 14:04:55 2018 - [info]  10.0.0.248: Resetting slave info succeeded.
Fri Dec 21 14:04:55 2018 - [info] Master failover to 10.0.0.248(10.0.0.248:3306) completed successfully.

查看新Master VIP

[mysql@cluster02 ~]$ ifconfig
eth0      Link encap:Ethernet  HWaddr FA:16:3E:66:7E:E8  
          inet addr:10.0.0.248  Bcast:10.0.255.255  Mask:255.255.0.0
          inet6 addr: fe80::f816:3eff:fe66:7ee8/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:40197173 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10470689 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:4063358126 (3.7 GiB)  TX bytes:2269241789 (2.1 GiB)

eth0:1    Link encap:Ethernet  HWaddr FA:16:3E:66:7E:E8  
          inet addr:10.0.0.237  Bcast:10.0.0.237  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

可以看到VIP已經(jīng)成功切換

查看新Master的general日志，可以看到MHA的操作過(guò)程, 下面展示部分日志

...
2018-12-21T14:04:41.782336+08:00 5525 Query    SHOW SLAVE STATUS
2018-12-21T14:04:41.788318+08:00 5525 Query    STOP SLAVE IO_THREAD
2018-12-21T14:04:41.900734+08:00 5525 Query    SHOW SLAVE STATUS
2018-12-21T14:04:42.044801+08:00 5525 Query    SHOW SLAVE STATUS
2018-12-21T14:04:42.668581+08:00 5525 Query    SHOW SLAVE STATUS
2018-12-21T14:04:42.670336+08:00 5525 Query    STOP SLAVE SQL_THREAD
...
2018-12-21T14:04:42.863904+08:00 5526 Query    SET GLOBAL read_only=0
...
2018-12-21T14:04:43.950986+08:00 5527 Query    SET @rpl_semi_sync_slave= 1
...

查看Slave的general日志，可以看到Slave會(huì)重新指向

2018-12-21T14:04:04.835218+08:00   90 Query    STOP SLAVE IO_THREAD
2018-12-21T14:04:04.955706+08:00   90 Query    SHOW SLAVE STATUS
2018-12-21T14:04:05.092123+08:00   90 Query    SHOW SLAVE STATUS
2018-12-21T14:04:06.018838+08:00   90 Query    SHOW SLAVE STATUS
2018-12-21T14:04:06.034225+08:00   90 Query    SHOW SLAVE STATUS
2018-12-21T14:04:06.036613+08:00   90 Query    SHOW SLAVE STATUS
2018-12-21T14:04:06.038475+08:00   90 Query    STOP SLAVE SQL_THREAD
2018-12-21T14:04:06.160142+08:00   90 Query    SHOW SLAVE STATUS
2018-12-21T14:04:06.162224+08:00   90 Query    STOP SLAVE
2018-12-21T14:04:06.163171+08:00   90 Query    SHOW SLAVE STATUS
2018-12-21T14:04:06.164554+08:00   90 Query    RESET SLAVE
2018-12-21T14:04:06.825564+08:00   90 Query    CHANGE MASTER TO MASTER_HOST = "10.0.0.248" MASTER_USER = "repl" MASTER_PASSWORD =  MASTER_PORT = 3306 MASTER_LOG_FILE = "mysql-bin.000005" MASTER_LOG_POS = 154
2018-12-21T14:04:06.981718+08:00   90 Query    SET GLOBAL relay_log_purge=0
2018-12-21T14:04:06.982802+08:00   90 Query    START SLAVE

注意: MHA在切換完成后會(huì)結(jié)束 Manager 進(jìn)程

4.3 手動(dòng)切換

切換后Master為Cluster2, 把Cluster1重新指向Cluster2，現(xiàn)在測(cè)試一下手動(dòng)切換，把Master切回Cluster1, 命令如下

masterha_master_switch --conf=/etc/masterha/app2.cnf --master_state=alive --new_master_host=10.0.0.247 --new_master_port=3306 --orig_master_is_new_slave

--orig_master_is_new_slave 是將原master切換為新主的slave，默認(rèn)情況下，是不添加的。

下面是執(zhí)行過(guò)程, 有兩個(gè)地方要回答 yes/no

[mysql@chengqm ~]$ masterha_master_switch --conf=/etc/masterha/app2.cnf --master_state=alive --new_master_host=10.0.0.247 --new_master_port=3306 --orig_master_is_new_slave

......

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 10.0.0.248(10.0.0.248:3306)? (YES/no): yes

......

Sun Dec 23 16:50:48 2018 - [info] 
From:
10.0.0.248(10.0.0.248:3306) (current master)
 +--10.0.0.247(10.0.0.247:3306)
 +--10.0.0.249(10.0.0.249:3306)

To:
10.0.0.247(10.0.0.247:3306) (new master)
 +--10.0.0.249(10.0.0.249:3306)
 +--10.0.0.248(10.0.0.248:3306)

Starting master switch from 10.0.0.248(10.0.0.248:3306) to 10.0.0.247(10.0.0.247:3306)? (yes/NO): yes

......

Sun Dec 23 16:51:36 2018 - [info]  10.0.0.247: Resetting slave info succeeded.
Sun Dec 23 16:51:36 2018 - [info] Switching master to 10.0.0.247(10.0.0.247:3306) completed successfully.

切換成功，查看Cluster1的VIP

[mysql@cluster01 ~]$ ifconfig
eth0      Link encap:Ethernet  HWaddr FA:16:3E:DE:80:33  
          inet addr:10.0.0.247  Bcast:10.0.255.255  Mask:255.255.0.0
          inet6 addr: fe80::f816:3eff:fede:8033/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:20585872 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5519122 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1785787985 (1.6 GiB)  TX bytes:1068115408 (1018.6 MiB)

eth0:1    Link encap:Ethernet  HWaddr FA:16:3E:DE:80:33  
          inet addr:10.0.0.237  Bcast:10.0.0.237  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

注意：手動(dòng)切換的時(shí)候先把 MHA Manager 停了

4.4 停止 MHA

停止 MHA 的命令如下，就不演示了

masterha_stop --conf=配置文件

5 總結(jié)

總的來(lái)說(shuō)，MHA是一套非常優(yōu)秀而且使用比較廣的高可用程序，它可以自動(dòng)補(bǔ)齊日志使得一致性有保證，部署的時(shí)候不需要改變?cè)屑軜?gòu)就可以使用。但是使用起來(lái)還是有一點(diǎn)復(fù)雜的，因?yàn)?b>MHA不接管VIP，所以要自己寫腳本實(shí)現(xiàn)，而且只保證Master高可用，沒(méi)有Slave高可用，還有就是中繼日志要自己設(shè)定時(shí)任務(wù)來(lái)清理。

不管怎么說(shuō)，在沒(méi)有更好的方案下，MHA還是值得使用的。

附 master_ip_failover 腳本

#!/usr/bin/env perl
use strict;
use warnings FATAL => "all";

use Getopt::Long;
use MHA::DBHelper;

my (
  $command,        $ssh_user,         $orig_master_host,
  $orig_master_ip, $orig_master_port, $new_master_host,
  $new_master_ip,  $new_master_port,  $new_master_user,
  $new_master_password
);

my $vip = "10.0.0.237";
my $key = "1";
my $ssh_start_vip = "sudo /sbin/ifconfig eth0:$key $vip netmask 255.255.255.255";
my $ssh_stop_vip = "sudo /sbin/ifconfig eth0:$key down";

GetOptions(
  "command=s"             => $command,
  "ssh_user=s"            => $ssh_user,
  "orig_master_host=s"    => $orig_master_host,
  "orig_master_ip=s"      => $orig_master_ip,
  "orig_master_port=i"    => $orig_master_port,
  "new_master_host=s"     => $new_master_host,
  "new_master_ip=s"       => $new_master_ip,
  "new_master_port=i"     => $new_master_port,
  "new_master_user=s"     => $new_master_user,
  "new_master_password=s" => $new_master_password,
);

exit &main();

sub main {

  print "

 VIP Command: start=$ssh_start_vip stop=$ssh_stop_vip

";
 
  if ( $command eq "stop" || $command eq "stopssh" ) {

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
    # If you manage master ip address at global catalog database,
    # invalidate orig_master_ip here.
    my $exit_code = 1;
    eval {
      print "Disabling the VIP on old master: $orig_master_host 
";
      &stop_vip();
      # updating global catalog, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@
";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {

    # all arguments are passed.
    # If you manage master ip address at global catalog database,
    # activate new_master_ip here.
    # You can also grant write access (create user, set read_only=0, etc) here.
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print "Set read_only=0 on the new master.
";
      $new_master_handler->disable_read_only();
      $new_master_handler->disconnect();

      print "Enabling the VIP - $vip on the new master - $new_master_host 
";
      &start_vip();

      $exit_code = 0;
    };
    if ($@) {
      warn $@;

      # If you want to continue failover, exit 10.
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {
    print "Check script.. OK 
";
    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub start_vip() {
    `ssh $ssh_user@$new_master_host " $ssh_start_vip "`;
}
sub stop_vip() {
     return 0  unless  ($ssh_user);
    `ssh $ssh_user@$orig_master_host " $ssh_stop_vip "`;
}

sub usage {
  print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port
";
}

master_ip_online_change 腳本

#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => "all";

use Getopt::Long;
use MHA::DBHelper;
use MHA::NodeUtil;
use Time::HiRes qw( sleep gettimeofday tv_interval );
use Data::Dumper;

my $_tstart;
my $_running_interval = 0.1;
my (
  $command,              $orig_master_is_new_slave, $orig_master_host,
  $orig_master_ip,       $orig_master_port,         $orig_master_user,
  $orig_master_password, $orig_master_ssh_user,     $new_master_host,
  $new_master_ip,        $new_master_port,          $new_master_user,
  $new_master_password,  $new_master_ssh_user
);
GetOptions(
  "command=s"                => $command,
  "orig_master_is_new_slave" => $orig_master_is_new_slave,
  "orig_master_host=s"       => $orig_master_host,
  "orig_master_ip=s"         => $orig_master_ip,
  "orig_master_port=i"       => $orig_master_port,
  "orig_master_user=s"       => $orig_master_user,
  "orig_master_password=s"   => $orig_master_password,
  "orig_master_ssh_user=s"   => $orig_master_ssh_user,
  "new_master_host=s"        => $new_master_host,
  "new_master_ip=s"          => $new_master_ip,
  "new_master_port=i"        => $new_master_port,
  "new_master_user=s"        => $new_master_user,
  "new_master_password=s"    => $new_master_password,
  "new_master_ssh_user=s"    => $new_master_ssh_user,
);

my $vip = "10.0.0.237";
my $key = "1";
my $ssh_start_vip = "sudo /sbin/ifconfig eth0:$key $vip netmask 255.255.255.255";
my $ssh_stop_vip = "sudo /sbin/ifconfig eth0:$key down";

exit &main();

sub current_time_us {
  my ( $sec, $microsec ) = gettimeofday();
  my $curdate = localtime($sec);
  return $curdate . " " . sprintf( "%06d", $microsec );
}

sub sleep_until {
  my $elapsed = tv_interval($_tstart);
  if ( $_running_interval > $elapsed ) {
    sleep( $_running_interval - $elapsed );
  }
}

sub get_threads_util {
  my $dbh                    = shift;
  my $my_connection_id       = shift;
  my $running_time_threshold = shift;
  my $type                   = shift;
  $running_time_threshold = 0 unless ($running_time_threshold);
  $type                   = 0 unless ($type);
  my @threads;

  my $sth = $dbh->prepare("SHOW PROCESSLIST");
  $sth->execute();

  while ( my $ref = $sth->fetchrow_hashref() ) {
    my $id         = $ref->{Id};
    my $user       = $ref->{User};
    my $host       = $ref->{Host};
    my $command    = $ref->{Command};
    my $state      = $ref->{State};
    my $query_time = $ref->{Time};
    my $info       = $ref->{Info};
    $info =~ s/^s*(.*?)s*$/$1/ if defined($info);
    next if ( $my_connection_id == $id );
    next if ( defined($query_time) && $query_time < $running_time_threshold );
    next if ( defined($command)    && $command eq "Binlog Dump" );
    next if ( defined($user)       && $user eq "system user" );
    next
      if ( defined($command)
      && $command eq "Sleep"
      && defined($query_time)
      && $query_time >= 1 );

    if ( $type >= 1 ) {
      next if ( defined($command) && $command eq "Sleep" );
      next if ( defined($command) && $command eq "Connect" );
    }

    if ( $type >= 2 ) {
      next if ( defined($info) && $info =~ m/^select/i );
      next if ( defined($info) && $info =~ m/^show/i );
    }

    push @threads, $ref;
  }
  return @threads;
}

sub main {
  if ( $command eq "stop" ) {
    ## Gracefully killing connections on the current master
    # 1. Set read_only= 1 on the new master
    # 2. DROP USER so that no app user can establish new connections
    # 3. Set read_only= 1 on the current master
    # 4. Kill current queries
    # * Any database access failure will result in script die.
    my $exit_code = 1;
    eval {
      ## Setting read_only=1 on the new master (to avoid accident)
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error(die_on_error)_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );
      print current_time_us() . " Set read_only on the new master.. ";
      $new_master_handler->enable_read_only();
      if ( $new_master_handler->is_read_only() ) {
        print "ok.
";
      }
      else {
        die "Failed!
";
      }
      $new_master_handler->disconnect();

      # Connecting to the orig master, die if any database error happens
      my $orig_master_handler = new MHA::DBHelper();
      $orig_master_handler->connect( $orig_master_ip, $orig_master_port,
        $orig_master_user, $orig_master_password, 1 );

      $orig_master_handler->disable_log_bin_local();

      ## Waiting for N * 100 milliseconds so that current connections can exit
      my $time_until_read_only = 15;
      $_tstart = [gettimeofday];
      my @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_read_only > 0 && $#threads >= 0 ) {
        if ( $time_until_read_only % 5 == 0 ) {
          printf
"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)
",
            current_time_us(), $#threads + 1, $time_until_read_only * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "
"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_read_only--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Setting read_only=1 on the current master so that nobody(except SUPER) can write
      print current_time_us() . " Set read_only=1 on the orig master.. ";
      $orig_master_handler->enable_read_only();
      if ( $orig_master_handler->is_read_only() ) {
        print "ok.
";
      }
      else {
        die "Failed!
";
      }

      ## Waiting for M * 100 milliseconds so that current update queries can complete
      my $time_until_kill_threads = 5;
      @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
        if ( $time_until_kill_threads % 5 == 0 ) {
          printf
"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)
",
            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "
"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_kill_threads--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Terminating all threads
      print current_time_us() . " Killing all application threads..
";
      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
      print current_time_us() . " done.
";
      $orig_master_handler->enable_log_bin_local();
      $orig_master_handler->disconnect();

      print "Disabling the VIP on old master: $orig_master_host 
";
      &stop_vip();

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@
";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {
    ## Activating master ip on the new master
    # 1. Create app user with write privileges
    # 2. Moving backup script if needed
    # 3. Register new master"s ip to the catalog database

# We don"t return error even though activating updatable accounts/ip failed so that we don"t interrupt slaves" recovery.
# If exit code is 0 or 10, MHA does not abort
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print current_time_us() . " Set read_only=0 on the new master.
";
      $new_master_handler->disable_read_only();

      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      print "Enabling the VIP - $vip on the new master - $new_master_host 
";
      &start_vip();

      ## Update master ip on the catalog database, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@
";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub start_vip() {
    return 0  unless  ($new_master_ssh_user);
    `ssh $new_master_ssh_user@$new_master_host " $ssh_start_vip "`;
}
sub stop_vip() {
     return 0  unless  ($orig_master_ssh_user);
    `ssh $orig_master_ssh_user@$orig_master_host " $ssh_stop_vip "`;
}

sub usage {
  print
"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port
";
  die;
}

send_report 腳本

#!/usr/bin/perl
use strict;
use warnings FATAL => "all";

use Getopt::Long;

#new_master_host and new_slave_hosts are set only when recovering master succeeded
my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body, $title, $content);
GetOptions(
  "orig_master_host=s" => $dead_master_host,
  "new_master_host=s"  => $new_master_host,
  "new_slave_hosts=s"  => $new_slave_hosts,
  "subject=s"          => $subject,
  "body=s"             => $body,
);

# 調(diào)用外部腳本
$title="[mha switch]";
$content="`date +"%Y-%m-%d %H:%M"` old_master=".$dead_master_host." new_master=".$new_master_host;
system("sh /etc/masterha/script/send_report.sh $title $content");

exit 0;

清理中繼日志定時(shí)任務(wù)

下面是我的定時(shí)任務(wù)，參數(shù)自行替換, workdir 需要和中繼日志在同一個(gè)盤

# 每小時(shí)清理一次
0 * * * * (/usr/bin/purge_relay_logs --user=mha_manager --password=mha_manager --disable_relay_log_purge --port=3306 --workdir=/tmp/relaylogtmp >> /var/log/purge_relay_logs.log 2>&1)

MHA Wiki: https://github.com/yoshinorim...

GPU云服務(wù)器云服務(wù)器 mysql高可用架構(gòu) 高可用集群雙機(jī)高可用集群高可用集群軟件

文章版權(quán)歸作者所有，未經(jīng)允許請(qǐng)勿轉(zhuǎn)載,若此文章存在違規(guī)行為，您可以聯(lián)系管理員刪除。

轉(zhuǎn)載請(qǐng)注明本文地址：http://systransis.cn/yun/17878.html

發(fā)表評(píng)論

登陸后可評(píng)論

0條評(píng)論

Michael_Lin

男|高級(jí)講師

我要關(guān)注我要私信

TA的文章

【黑五】RackNerd：$10.88/年/768MB內(nèi)存/12GB SSD空間/1TB流量/1Gb

閱讀 1211·2021-11-24 11:16
程序人生之周記

閱讀 3440·2021-11-15 11:38
DMIT：香港cn2 gia vps，lite輕量版大帶寬vps，1核/0.75G內(nèi)存/10G SS

閱讀 1952·2021-10-20 13:47
Java和c計(jì)算器

閱讀 559·2021-09-29 09:35
云主機(jī)和服務(wù)器哪個(gè)好-想建一個(gè)服務(wù)器，使用云服務(wù)器好些還是個(gè)人電腦好些？

閱讀 2207·2021-09-22 15:17
編寫一個(gè)并發(fā)性能測(cè)試的小程序

閱讀 1028·2021-09-07 09:59
下拉滑動(dòng)二級(jí)菜單

閱讀 3395·2019-08-30 13:21
博客引入漂亮字體二三事

閱讀 2919·2019-08-30 12:47

成人国产在线小视频_日韩寡妇人妻调教在线播放_色成人www永久在线观看_2018国产精品久久_亚洲欧美高清在线30p_亚洲少妇综合一区_黄色在线播放国产_亚洲另类技巧小说校园_国产主播xx日韩_a级毛片在线免费

資訊專欄INFORMATION COLUMN

上云采購(gòu)季！| 2核2G4M爆款云服務(wù)器低至59元/年，更有多臺(tái)、長(zhǎng)期優(yōu)惠，快來(lái)選購(gòu)！

MySQL集群搭建(5)-MHA高可用架構(gòu)

相關(guān)文章

發(fā)表評(píng)論

0條評(píng)論

Michael_Lin

男|高級(jí)講師

TA的文章

【黑五】RackNerd：$10.88/年/768MB內(nèi)存/12GB SSD空間/1TB流量/1Gb

程序人生之周記

DMIT：香港cn2 gia vps，lite輕量版大帶寬vps，1核/0.75G內(nèi)存/10G SS

Java和c計(jì)算器

云主機(jī)和服務(wù)器哪個(gè)好-想建一個(gè)服務(wù)器，使用云服務(wù)器好些還是個(gè)人電腦好些？

編寫一個(gè)并發(fā)性能測(cè)試的小程序

下拉滑動(dòng)二級(jí)菜單

博客引入漂亮字體二三事

最新活動(dòng)

資訊專欄INFORMATION COLUMN

上云采購(gòu)季！| 2核2G4M爆款云服務(wù)器低至59元/年，更有多臺(tái)、長(zhǎng)期優(yōu)惠，快來(lái)選購(gòu)！

MySQL集群搭建(5)-MHA高可用架構(gòu)

相關(guān)文章

發(fā)表評(píng)論

0條評(píng)論

男|高級(jí)講師

TA的文章

最新活動(dòng)

上云采購(gòu)季！| 2核2G4M爆款云服務(wù)器低至59元/年，更有多臺(tái)、長(zhǎng)期優(yōu)惠，快來(lái)選購(gòu)！