摘要:上個(gè)文章集群搭建主主從模式中我們知道如何搭建主主從模式,今天這個(gè)文章正式進(jìn)入高可用的架構(gòu)。由開(kāi)發(fā),用來(lái)管理和監(jiān)控雙主復(fù)制,雖然是雙主架構(gòu),但是業(yè)務(wù)上同一時(shí)間只允許一個(gè)節(jié)點(diǎn)進(jìn)行寫入操作。包含兩類角色和分別對(duì)應(yīng)讀寫節(jié)點(diǎn)和只讀節(jié)點(diǎn)。
上個(gè)文章 MySQL集群搭建(2)-主主從模式 中我們知道如何搭建 MySQL 主主從模式,今天這個(gè)文章正式進(jìn)入 MySQL 高可用的架構(gòu)。
1 MMM 介紹 1.1 簡(jiǎn)介MMM 是一套支持雙主故障切換以及雙主日常管理的第三方軟件。MMM 由 Perl 開(kāi)發(fā),用來(lái)管理和監(jiān)控雙主復(fù)制,雖然是雙主架構(gòu),但是業(yè)務(wù)上同一時(shí)間只允許一個(gè)節(jié)點(diǎn)進(jìn)行寫入操作。
MMM 包含兩類角色: writer 和 reader, 分別對(duì)應(yīng)讀寫節(jié)點(diǎn)和只讀節(jié)點(diǎn)。
使用 MMM 管理雙主節(jié)點(diǎn)的情況下,當(dāng) writer 節(jié)點(diǎn)出現(xiàn)宕機(jī)(假定是 master1),程序會(huì)自動(dòng)移除該節(jié)點(diǎn)上的讀寫 VIP,切換到 Master2 ,并設(shè)置 Master2 為 read_only = 0, 同時(shí),所有 Slave 節(jié)點(diǎn)會(huì)指向 Master2。
除了管理雙主節(jié)點(diǎn),MMM 也會(huì)管理 Slave 節(jié)點(diǎn),在出現(xiàn)宕機(jī)、復(fù)制延遲或復(fù)制錯(cuò)誤,MMM 會(huì)移除該節(jié)點(diǎn)的 VIP,直到節(jié)點(diǎn)恢復(fù)正常。
1.2 組件MMM 由兩類程序組成
monitor: 監(jiān)控集群內(nèi)數(shù)據(jù)庫(kù)的狀態(tài),在出現(xiàn)異常時(shí)發(fā)布切換命令,一般和數(shù)據(jù)庫(kù)分開(kāi)部署
agent: 運(yùn)行在每個(gè) MySQL 服務(wù)器上的代理進(jìn)程,monitor 命令的執(zhí)行者,完成監(jiān)控的探針工作和具體服務(wù)設(shè)置,例如設(shè)置 VIP、指向新同步節(jié)點(diǎn)
其架構(gòu)如下:
以上述架構(gòu)為例,描述一下故障轉(zhuǎn)移的流程,現(xiàn)在假設(shè) Master1 宕機(jī)
Monitor 檢測(cè)到 Master1 連接失敗
Monitor 發(fā)送 set_offline 指令到 Master1 的 Agent
Master1 Agent 如果存活,下線寫 VIP,嘗試把 Master1 設(shè)置為 read_only=1
Moniotr 發(fā)送 set_online 指令到 Master2
Master2 Agent 接收到指令,執(zhí)行 select master_pos_wait() 等待同步完畢
Master2 Agent 上線寫 VIP,把 Master2 節(jié)點(diǎn)設(shè)為 read_only=0
Monitor 發(fā)送更改同步對(duì)象的指令到各個(gè) Slave 節(jié)點(diǎn)的 Agent
各個(gè) Slave 節(jié)點(diǎn)向新 Master 同步數(shù)據(jù)
從整個(gè)流程可以看到,如果主節(jié)點(diǎn)出現(xiàn)故障,MMM 會(huì)自動(dòng)實(shí)現(xiàn)切換,不需要人工干預(yù),同時(shí)我們也能看出一些問(wèn)題,就是數(shù)據(jù)庫(kù)掛掉后,只是做了切換,不會(huì)主動(dòng)補(bǔ)齊丟失的數(shù)據(jù),所以 MMM 會(huì)有數(shù)據(jù)不一致性的風(fēng)險(xiǎn)。
2 MMM 安裝 2.1 yum 安裝如果服務(wù)器能連網(wǎng)或者有合適 yum 源,直接執(zhí)行以下命令安裝
# 增加 yum 源(如果默認(rèn) yum 源有,這一步可以忽略) yum install epel-release.noarch # 在 agent 節(jié)點(diǎn)執(zhí)行 yum install -y mysql-mmm-agent # 在 monitor 節(jié)點(diǎn)執(zhí)行 yum install -y mysql-mmm-monitor
執(zhí)行該安裝命令,會(huì)安裝以下軟件包或依賴
mysql-mmm-agent.noarch 0:2.2.1-1.el5 libart_lgpl.x86_64 0:2.3.17-4 mysql-mmm.noarch 0:2.2.1-1.el5 perl-Algorithm-Diff.noarch 0:1.1902-2.el5 perl-DBD-mysql.x86_64 0:4.008-1.rf perl-DateManip.noarch 0:5.44-1.2.1 perl-IPC-Shareable.noarch 0:0.60-3.el5 perl-Log-Dispatch.noarch 0:2.20-1.el5 perl-Log-Dispatch-FileRotate.noarch 0:1.16-1.el5 perl-Log-Log4perl.noarch 0:1.13-2.el5 perl-MIME-Lite.noarch 0:3.01-5.el5 perl-Mail-Sender.noarch 0:0.8.13-2.el5.1 perl-Mail-Sendmail.noarch 0:0.79-9.el5.1 perl-MailTools.noarch 0:1.77-1.el5 perl-Net-ARP.x86_64 0:1.0.6-2.1.el5 perl-Params-Validate.x86_64 0:0.88-3.el5 perl-Proc-Daemon.noarch 0:0.03-1.el5 perl-TimeDate.noarch 1:1.16-5.el5 perl-XML-DOM.noarch 0:1.44-2.el5 perl-XML-Parser.x86_64 0:2.34-6.1.2.2.1 perl-XML-RegExp.noarch 0:0.03-2.el5 rrdtool.x86_64 0:1.2.27-3.el5 rrdtool-perl.x86_64 0:1.2.27-3.el5
其他系統(tǒng)安裝方式可以參考官網(wǎng)
2.2 手動(dòng)安裝 1). 下載安裝包進(jìn)入 MMM 下載頁(yè)面 Downloads MMM for MySQL,點(diǎn)擊下載,如圖
下載完成上傳到服務(wù)器上
2). 安裝依賴yum install -y wget perl openssl gcc gcc-c++ wget http://xrl.us/cpanm --no-check-certificate mv cpanm /usr/bin chmod 755 /usr/bin/cpanm cat > /root/list << EOF install Algorithm::Diff install Class::Singleton install DBI install DBD::mysql install File::Basename install File::stat install File::Temp install Log::Dispatch install Log::Log4perl install Mail::Send install Net::ARP install Net::Ping install Proc::Daemon install Thread::Queue install Time::HiRes EOF for package in `cat /root/list` do cpanm $package done3). 安裝
tar -xvf mysql-mmm-2.2.1.tar.gz cd mysql-mmm-2.2.1 make install
ps: 大部分時(shí)候,數(shù)據(jù)庫(kù)機(jī)器都是不允許連接外網(wǎng)的,這個(gè)時(shí)候只能把上述依賴的 RPM 包一個(gè)個(gè)下載下來(lái)拷到服務(wù)器上3 數(shù)據(jù)庫(kù)環(huán)境準(zhǔn)備
操作前已經(jīng)準(zhǔn)備好了一套主主從架構(gòu)的數(shù)據(jù)庫(kù),搭建方法可以參考以往文章,具體信息如下
節(jié)點(diǎn)信息IP | 系統(tǒng) | 端口 | MySQL版本 | 節(jié)點(diǎn) | 讀寫 | 說(shuō)明 |
---|---|---|---|---|---|---|
10.0.0.247 | Centos6.5 | 3306 | 5.7.9 | Master | 讀寫 | 主節(jié)點(diǎn) |
10.0.0.248 | Centos6.5 | 3306 | 5.7.9 | Standby | 只讀,可切換為讀寫 | 備主節(jié)點(diǎn) |
10.0.0.249 | Centos6.5 | 3306 | 5.7.9 | Slave | 只讀 | 從節(jié)點(diǎn) |
10.0.0.24 | Centos6.5 | - | - | monitor | - | MMM Monitor |
簡(jiǎn)稱 | VIP | 類型 |
---|---|---|
RW-VIP | 10.0.0.237 | 讀寫VIP |
RO-VIP1 | 10.0.0.238 | 讀VIP |
RO-VIP2 | 10.0.0.239 | 讀VIP |
Master1
[client] port = 3306 default-character-set=utf8mb4 socket = /data/mysql_db/test_mmm/mysql.sock [mysqld] datadir = /data/mysql_db/test_mmm basedir = /usr/local/mysql57 tmpdir = /tmp socket = /data/mysql_db/test_mmm/mysql.sock pid-file = /data/mysql_db/test_mmm/mysql.pid skip-external-locking = 1 skip-name-resolve = 1 port = 3306 server_id = 2473306 default-storage-engine = InnoDB character-set-server = utf8mb4 default_password_lifetime=0 auto_increment_offset = 1 auto_increment_increment = 2 #### log #### log_timestamps=system log_bin = /data/mysql_log/test_mmm/mysql-bin log_bin_index = /data/mysql_log/test_mmm/mysql-bin.index binlog_format = row relay_log_recovery=ON relay_log=/data/mysql_log/test_mmm/mysql-relay-bin relay_log_index=/data/mysql_log/test_mmm/mysql-relay-bin.index log_error = /data/mysql_log/test_mmm/mysql-error.log #### replication #### log_slave_updates = 1 replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.% #### semi sync replication settings ##### plugin_dir=/usr/local/mysql57/lib/plugin plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so" loose_rpl_semi_sync_master_enabled = 1 loose_rpl_semi_sync_slave_enabled = 1
Master2
[client] port = 3306 default-character-set=utf8mb4 socket = /data/mysql_db/test_mmm/mysql.sock [mysqld] datadir = /data/mysql_db/test_mmm basedir = /usr/local/mysql57 tmpdir = /tmp socket = /data/mysql_db/test_mmm/mysql.sock pid-file = /data/mysql_db/test_mmm/mysql.pid skip-external-locking = 1 skip-name-resolve = 1 port = 3306 server_id = 2483306 default-storage-engine = InnoDB character-set-server = utf8mb4 default_password_lifetime=0 auto_increment_offset = 2 auto_increment_increment = 2 #### log #### log_timestamps=system log_bin = /data/mysql_log/test_mmm/mysql-bin log_bin_index = /data/mysql_log/test_mmm/mysql-bin.index binlog_format = row relay_log_recovery=ON relay_log=/data/mysql_log/test_mmm/mysql-relay-bin relay_log_index=/data/mysql_log/test_mmm/mysql-relay-bin.index log_error = /data/mysql_log/test_mmm/mysql-error.log #### replication #### log_slave_updates = 1 replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.% #### semi sync replication settings ##### plugin_dir=/usr/local/mysql57/lib/plugin plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so" loose_rpl_semi_sync_master_enabled = 1 loose_rpl_semi_sync_slave_enabled = 1
Slave
[client] port = 3306 default-character-set=utf8mb4 socket = /data/mysql_db/test_mmm/mysql.sock [mysqld] datadir = /data/mysql_db/test_mmm basedir = /usr/local/mysql57 tmpdir = /tmp socket = /data/mysql_db/test_mmm/mysql.sock pid-file = /data/mysql_db/test_mmm/mysql.pid skip-external-locking = 1 skip-name-resolve = 1 port = 3306 server_id = 2493306 default-storage-engine = InnoDB character-set-server = utf8mb4 default_password_lifetime=0 read_only=1 #### log #### log_timestamps=system log_bin = /data/mysql_log/test_mmm/mysql-bin log_bin_index = /data/mysql_log/test_mmm/mysql-bin.index binlog_format = row relay_log_recovery=ON relay_log=/data/mysql_log/test_mmm/mysql-relay-bin relay_log_index=/data/mysql_log/test_mmm/mysql-relay-bin.index log_error = /data/mysql_log/test_mmm/mysql-error.log #### replication #### log_slave_updates = 1 replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.% #### semi sync replication settings ##### plugin_dir=/usr/local/mysql57/lib/plugin plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so" loose_rpl_semi_sync_master_enabled = 1 loose_rpl_semi_sync_slave_enabled = 1新建用戶
在主節(jié)點(diǎn)中執(zhí)行下列建立 MMM 用戶的命令,由于是測(cè)試環(huán)境,密碼就設(shè)為和賬號(hào)一樣
CREATE USER "mmm_monitor"@"%" IDENTIFIED BY "mmm_monitor"; CREATE USER "mmm_agent"@"%" IDENTIFIED BY "mmm_agent"; GRANT REPLICATION CLIENT ON *.* TO "mmm_monitor"@"%"; GRANT SUPER, REPLICATION CLIENT, PROCESS ON *.* TO "mmm_agent"@"%"; FLUSH PRIVILEGES;4 配置 MMM 4.1 配置文件
MMM 有3個(gè)配置文件,分別是 mmm_agent.conf, mmm_common.conf, mmm_mon.conf, 在目錄 /etc/mysql-mmm 下。如果區(qū)分集群,也就是說(shuō)一臺(tái)服務(wù)器跑多個(gè) MMM,那么配置文件可以這樣命名 mmm_agent_cluster.conf, mmm_common_cluster.conf, mmm_mon_cluster.conf, 其中 cluster 表示集群名稱
mmm_common.conf , 通用配置,在所有 MMM 節(jié)點(diǎn)都需要
mmm_agent.conf, agent 配置,在 MMM Agent 節(jié)點(diǎn)需要
mmm_mon.conf, monitor 配置,在 MMM Monitor 節(jié)點(diǎn)需要
這次配置,我們把集群名命名為 test_mmm, 下面是具體配置
mmm_common在所有節(jié)點(diǎn)新建 /etc/mysql-mmm/mmm_common_test_mmm.conf, 根據(jù)實(shí)際情況寫上
active_master_role writermmm_agentcluster_interface eth0 # 群集的網(wǎng)絡(luò)接口 agent_port 9989 # agent 監(jiān)聽(tīng)端口,如果有多個(gè) agent,需要更改默認(rèn)端口 mysql_port 3306 # 數(shù)據(jù)庫(kù)端口,默認(rèn)為3306 pid_path /var/run/mysql-mmm/mmm_agentd_test_mmm.pid # pid路徑, 要和啟動(dòng)文件對(duì)應(yīng) bin_path /usr/libexec/mysql-mmm # bin 文件路徑 replication_user repl # 復(fù)制用戶 replication_password repl # 復(fù)制用戶密碼 agent_user mmm_agent # 代理用戶,用來(lái)設(shè)置 `read_only` 等 agent_password mmm_agent # 代理用戶密碼 # master1 的 host 名 ip 10.0.0.247 # master1 的 ip mode master # 角色屬性,master 代表是主節(jié)點(diǎn) peer cluster02 # 與 master1 對(duì)等的服務(wù)器的 host 名,雙主中另一個(gè)的主機(jī)名 # master2 的 host 名 ip 10.0.0.248 # master2 的 ip mode master # 角色屬性,master 代表是主節(jié)點(diǎn) peer cluster01 # 與 master2 對(duì)等的服務(wù)器的 host 名,雙主中另一個(gè)的主機(jī)名 # slave 的 host 名 ip 10.0.0.249 # slave 的 ip mode slave # 角色屬性,slave 代表是從節(jié)點(diǎn) # writer 角色配置 hosts cluster01, cluster02 # 能進(jìn)行寫操作的服務(wù)器的 host 名 ips 10.0.0.237 # writer 的 VIP mode exclusive # exclusive 代表只允許存在一個(gè)主節(jié)點(diǎn)(寫節(jié)點(diǎn)),也就是只能提供一個(gè)寫的 VIP # writer 角色配置 hosts cluster01, cluster02, cluster03 # 能進(jìn)行讀操作的服務(wù)器的 host 名 ips 10.0.0.238,10.0.0.239 # reader 的 VIP mode balanced # balanced 代表負(fù)載均衡可以多個(gè) host 同時(shí)擁有此角色
在所有 agent 的節(jié)點(diǎn)新建 /etc/mysql-mmm/mmm_agent_test_mmm.conf 文件,寫上以下內(nèi)容
Cluster1
include mmm_common_test_mmm.conf # common 文件名,對(duì)應(yīng)上述寫下的文件 this cluster01 # 當(dāng)前節(jié)點(diǎn)名稱,對(duì)應(yīng) common 文件 host 名
Cluster2
include mmm_common_test_mmm.conf this cluster02
Cluster3
include mmm_common_test_mmm.conf this cluster03mmm_mon
在 monitor 節(jié)點(diǎn)新建 /etc/mysql-mmm/mmm_mon_test_mmm.conf 文件,寫下監(jiān)控節(jié)點(diǎn)配置
include mmm_common_test_mmm.conf # common 文件名ip 127.0.0.1 # 監(jiān)聽(tīng) IP port 9992 # 監(jiān)聽(tīng)端口 pid_path /var/run/mysql-mmm/mmm_mond_test_mmm.pid # PID 文件位置, 要和啟動(dòng)文件對(duì)應(yīng) bin_path /usr/libexec/mysql-mmm # bin目錄 status_path /var/lib/mysql-mmm/mmm_mond_test_mmm.status # 狀態(tài)文件位置 ping_ips 10.0.0.247, 10.0.0.248, 10.0.0.249 # 需要監(jiān)控的主機(jī) IP,對(duì)應(yīng) MySQL 節(jié)點(diǎn) IP auto_set_online 30 # 自動(dòng)恢復(fù) online 的時(shí)間 monitor_user mmm_monitor # 監(jiān)控用的 MySQL 賬號(hào) monitor_password mmm_monitor # 監(jiān)控用的 MySQL 密碼 check_period 2 # 監(jiān)控周期 trap_period 4 # 一個(gè)節(jié)點(diǎn)被檢測(cè)不成功的時(shí)間持續(xù) trap_period 秒,就認(rèn)為失去連接 max_backlog 900 # 主從延遲超過(guò)這個(gè)值就會(huì)設(shè)為 offline debug 0 # 是否開(kāi)啟 debug 模式
PS1: 以上配置文件在使用的時(shí)候需要去掉注釋
PS2: 如果只有一個(gè)集群,可以在默認(rèn)配置文件上改
安裝成功后,會(huì)在 /etc/init.d/ 下生成配置啟動(dòng)文件
[root@chengqm ~]# ls /etc/init.d/mysql* /etc/init.d/mysqld /etc/init.d/mysql-mmm-agent /etc/init.d/mysql-mmm-monitormysql-mmm-agent
在所有 agent 節(jié)點(diǎn)執(zhí)行
cp /etc/init.d/mysql-mmm-agent /etc/init.d/mysql-mmm-agent-test-mmm
打開(kāi) /etc/init.d/mysql-mmm-agent-test-mmm, 如果你的配置文件頭部是這樣的
CLUSTER="" #----------------------------------------------------------------------- # Paths if [ "$CLUSTER" != "" ]; then MMM_AGENTD_BIN="/usr/sbin/mmm_agentd @$CLUSTER" MMM_AGENTD_PIDFILE="/var/run/mmm_agentd-$CLUSTER.pid" else MMM_AGENTD_BIN="/usr/sbin/mmm_agentd" MMM_AGENTD_PIDFILE="/var/run/mmm_agentd.pid" fi echo "Daemon bin: "$MMM_AGENTD_BIN"" echo "Daemon pid: "$MMM_AGENTD_PIDFILE""
改為
CLUSTER="test_mmm" #----------------------------------------------------------------------- # Paths if [ "$CLUSTER" != "" ]; then MMM_AGENTD_BIN="/usr/sbin/mmm_agentd @$CLUSTER" MMM_AGENTD_PIDFILE="/var/run/mysql-mmm/mmm_agentd_$CLUSTER.pid" else MMM_AGENTD_BIN="/usr/sbin/mmm_agentd" MMM_AGENTD_PIDFILE="/var/run/mysql-mmm/mmm_agentd.pid" fi echo "Daemon bin: "$MMM_AGENTD_BIN"" echo "Daemon pid: "$MMM_AGENTD_PIDFILE""
如果打開(kāi)發(fā)現(xiàn)是這樣的
MMMD_AGENT_BIN="/usr/sbin/mmm_agentd" MMMD_AGENT_PIDFILE="/var/run/mysql-mmm/mmm_agentd.pid" LOCKFILE="/var/lock/subsys/mysql-mmm-agent" prog="MMM Agent Daemon"
改為
... CLUSTER="test_mmm" MMMD_AGENT_BIN="/usr/sbin/mmm_agentd @$CLUSTER" MMMD_AGENT_PIDFILE="/var/run/mysql-mmm/mmm_agentd_$CLUSTER.pid" LOCKFILE="/var/lock/subsys/mysql-mmm-agent_CLUSTER$" prog="MMM Agent Daemon"mysql-mmm-monitor
在 monitor 節(jié)點(diǎn)執(zhí)行
cp /etc/init.d/mysql-mmm-monitor /etc/init.d/mysql-mmm-monitor-test-mmm
打開(kāi) /etc/init.d/mysql-mmm-monitor-test-mmm, 把文件開(kāi)始部分改為
# Cluster name (it can be empty for default cases) CLUSTER="test_mmm" LOCKFILE="/var/lock/subsys/mysql-mmm-monitor-${CLUSTER}" prog="MMM Monitor Daemon" if [ "$CLUSTER" != "" ]; then MMMD_MON_BIN="/usr/sbin/mmm_mond @$CLUSTER" MMMD_MON_PIDFILE="/var/run/mysql-mmm/mmm_mond_$CLUSTER.pid" else MMMD_MON_BIN="/usr/sbin/mmm_mond" MMMD_MON_PIDFILE="/var/run/mysql-mmm/mmm_mond.pid" fi start() { ...
如果打開(kāi)啟動(dòng)文件發(fā)現(xiàn)和本文的啟動(dòng)文件有出入,可以根據(jù)實(shí)際情況進(jìn)行修改,確保啟動(dòng) monitor 命令為 /usr/sbin/mmm_mond @$CLUSTER 且 pid 文件和配置文件一致即可
PS: 如果只有一個(gè)集群,可以直接使用默認(rèn)啟動(dòng)文件
注意: 配置文件的 PID 文件位置要和啟動(dòng)文件的 PID 文件位置要一致,如果不一致就改為一致
啟動(dòng) MMM 的順序是
啟動(dòng) MMM Monitor
啟動(dòng) MMM Agent
關(guān)閉 MMM 的順序則反過(guò)來(lái)執(zhí)行
5.1 啟動(dòng) Monitor在 monitor 節(jié)點(diǎn)上執(zhí)行啟動(dòng)命令,示例如下
[root@chengqm ~]# /etc/init.d/mysql-mmm-monitor-test-mmm start Starting MMM Monitor Daemon: [ OK ]
如果啟動(dòng)有報(bào)錯(cuò)查看 mmm 日志,mmm 日志放在 /var/log/mysql-mmm/ 目錄下
5.2 啟動(dòng) Agent在所有 agent 節(jié)點(diǎn)執(zhí)行啟動(dòng)命令,示例如下
[root@cluster01 ~]# /etc/init.d/mysql-mmm-agent-test-mmm start Daemon bin: "/usr/sbin/mmm_agentd @test_mmm" Daemon pid: "/var/run/mmm_agentd-test_mmm.pid" Starting MMM Agent daemon... Ok5.3 觀察 mmm 狀態(tài)
在 monitor 節(jié)點(diǎn)執(zhí)行 mmm_control @cluster show 命令查看各節(jié)點(diǎn)狀態(tài)
[root@chengqm ~]# mmm_control @test_mmm show cluster01(10.0.0.247) master/ONLINE. Roles: reader(10.0.0.238), writer(10.0.0.237) cluster02(10.0.0.248) master/ONLINE. Roles: reader(10.0.0.239) cluster03(10.0.0.249) slave/ONLINE. Roles:
在 monitor 節(jié)點(diǎn)執(zhí)行 mmm_control @cluster checks all 命令檢測(cè)所有節(jié)點(diǎn)
[root@chengqm ~]# mmm_control @test_mmm checks all cluster01 ping [last change: 2018/12/05 20:06:35] OK cluster01 mysql [last change: 2018/12/05 20:23:59] OK cluster01 rep_threads [last change: 2018/12/05 20:24:14] OK cluster01 rep_backlog [last change: 2018/12/05 20:24:14] OK: Backlog is null cluster02 ping [last change: 2018/12/05 20:06:35] OK cluster02 mysql [last change: 2018/12/05 20:23:59] OK cluster02 rep_threads [last change: 2018/12/05 20:24:14] OK cluster02 rep_backlog [last change: 2018/12/05 20:24:14] OK cluster03 ping [last change: 2018/12/05 20:06:35] OK cluster03 mysql [last change: 2018/12/05 20:23:59] OK cluster03 rep_threads [last change: 2018/12/05 20:24:14] OK cluster03 rep_backlog [last change: 2018/12/05 20:24:14] OK: Backlog is null
在 Cluster1 主機(jī)查看 VIP 情況
[root@cluster01 ~]# ip addr 1: lo:mtu 16436 qdisc noqueue state UNKNOWN link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether fa:16:3e:de:80:33 brd ff:ff:ff:ff:ff:ff inet 10.0.0.247/16 brd 10.0.255.255 scope global eth0 inet 10.0.0.238/32 scope global eth0 inet 10.0.0.237/32 scope global eth0 inet6 fe80::f816:3eff:fede:8033/64 scope link valid_lft forever preferred_lft forever
可以看到 VIP 和 MMM 描述的一致
6 MMM 切換MMM 切換有兩種方式,手動(dòng)切換和自動(dòng)切換
6.1 直接切換 role相關(guān)命令: mmm_control [@cluster] move_role [writer/reader] host 給某個(gè)節(jié)點(diǎn)增加角色
讓我們測(cè)試一下
當(dāng)前節(jié)點(diǎn)狀態(tài)
[root@chengqm ~]# mmm_control @test_mmm show cluster01(10.0.0.247) master/ONLINE. Roles: reader(10.0.0.238), writer(10.0.0.237) cluster02(10.0.0.248) master/ONLINE. Roles: reader(10.0.0.239) cluster03(10.0.0.249) slave/ONLINE. Roles:
Cluster1 VIP
[mysql@cluster01 ~]$ ip addr 1: lo:mtu 16436 qdisc noqueue state UNKNOWN link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether fa:16:3e:de:80:33 brd ff:ff:ff:ff:ff:ff inet 10.0.0.247/16 brd 10.0.255.255 scope global eth0 inet 10.0.0.238/32 scope global eth0 inet 10.0.0.237/32 scope global eth0 inet6 fe80::f816:3eff:fede:8033/64 scope link valid_lft forever preferred_lft forever
Master1 read_only 狀態(tài)
[mysql@cluster01 ~]$ /usr/local/mysql57/bin/mysql -S /data/mysql_db/test_mmm/mysql.sock -e "show variables like "read_only""; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | read_only | OFF | +---------------+-------+
Cluster2 VIP
[mysql@cluster02 ~]$ ip addr 1: lo:mtu 16436 qdisc noqueue state UNKNOWN link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether fa:16:3e:66:7e:e8 brd ff:ff:ff:ff:ff:ff inet 10.0.0.248/16 brd 10.0.255.255 scope global eth0 inet 10.0.0.239/32 scope global eth0 inet6 fe80::f816:3eff:fe66:7ee8/64 scope link valid_lft forever preferred_lft forever
Master2 read_only 狀態(tài)
[mysql@cluster02 ~]$ /usr/local/mysql57/bin/mysql -S /data/mysql_db/test_mmm/mysql.sock -e "show variables like "read_only""; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | read_only | ON | +---------------+-------+
Slave 同步指向
[mysql@cluster03 ~]$ /usr/local/mysql57/bin/mysql -S /data/mysql_db/test_mmm/mysql.sock -e "show slave status G"; *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.0.0.247 Master_User: repl Master_Port: 3306 Connect_Retry: 60 ... ....切換
執(zhí)行 mmm_control @test_mmm move_role writer cluster02 切換
[root@chengqm ~]# mmm_control @test_mmm move_role writer cluster02 OK: Role "writer" has been moved from "cluster01" to "cluster02". Now you can wait some time and check new roles info! [root@chengqm ~]# mmm_control @test_mmm show cluster01(10.0.0.247) master/ONLINE. Roles: reader(10.0.0.238) cluster02(10.0.0.248) master/ONLINE. Roles: reader(10.0.0.239), writer(10.0.0.237) cluster03(10.0.0.249) slave/ONLINE. Roles:
切換后 cluster2 VIP
[mysql@cluster02 ~]$ ip addr 1: lo:mtu 16436 qdisc noqueue state UNKNOWN link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether fa:16:3e:66:7e:e8 brd ff:ff:ff:ff:ff:ff inet 10.0.0.248/16 brd 10.0.255.255 scope global eth0 inet 10.0.0.239/32 scope global eth0 inet 10.0.0.237/32 scope global eth0 inet6 fe80::f816:3eff:fe66:7ee8/64 scope link valid_lft forever preferred_lft forever
切換后 Master2 read_only 狀態(tài)
[mysql@cluster02 ~]$ /usr/local/mysql57/bin/mysql -S /data/mysql_db/test_mmm/mysql.sock -e "show variables like "read_only""; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | read_only | OFF | +---------------+-------+
切換后 Slave 同步指向
[mysql@cluster03 ~]$ /usr/local/mysql57/bin/mysql -S /data/mysql_db/test_mmm/mysql.sock -e "show slave status G"; *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.0.0.248 Master_User: repl Master_Port: 3306 Connect_Retry: 60
可以看到切換成功
6.2 使用"上線""下線"功能切換切換操作也可以用以下兩個(gè)命令完成
mmm_control [@cluster] set_offline host 下線節(jié)點(diǎn)
mmm_control [@cluster] set_online host 上線節(jié)點(diǎn)
現(xiàn)在我們想把寫節(jié)點(diǎn)從 Master2 切換到 Master1,可以進(jìn)行如下操作
mmm_control @test_mmm set_offline cluster02 mmm_control @test_mmm set_online cluster02
切換后的效果是一樣的,就不演示了
6.3 宕機(jī)自動(dòng)切換現(xiàn)在我們演示一下 Master2 數(shù)據(jù)庫(kù)掛掉后自動(dòng)切換情況
kill master2
查看 MMM monitor 日志,看到切換過(guò)程
[root@chengqm ~]# tail -8 /var/log/mysql-mmm/mmm_mond_test_mmm.log 2018/12/06 18:09:27 WARN Check "rep_backlog" on "cluster02" is in unknown state! Message: UNKNOWN: Connect error (host = 10.0.0.248:3306, user = mmm_monitor)! Lost connection to MySQL server at "reading initial communication packet", system error: 111 2018/12/06 18:09:30 ERROR Check "mysql" on "cluster02" has failed for 4 seconds! Message: ERROR: Connect error (host = 10.0.0.248:3306, user = mmm_monitor)! Lost connection to MySQL server at "reading initial communication packet", system error: 111 2018/12/06 18:09:31 FATAL State of host "cluster02" changed from ONLINE to HARD_OFFLINE (ping: OK, mysql: not OK) 2018/12/06 18:09:31 INFO Removing all roles from host "cluster02": 2018/12/06 18:09:31 INFO Removed role "reader(10.0.0.238)" from host "cluster02" 2018/12/06 18:09:31 INFO Removed role "writer(10.0.0.237)" from host "cluster02" 2018/12/06 18:09:31 INFO Orphaned role "writer(10.0.0.237)" has been assigned to "cluster01" 2018/12/06 18:09:31 INFO Orphaned role "reader(10.0.0.238)" has been assigned to "cluster01"
查看節(jié)點(diǎn)狀態(tài)
[root@chengqm ~]# mmm_control @test_mmm show cluster01(10.0.0.247) master/ONLINE. Roles: reader(10.0.0.238), reader(10.0.0.239), writer(10.0.0.237) cluster02(10.0.0.248) master/HARD_OFFLINE. Roles: cluster03(10.0.0.249) slave/ONLINE. Roles:
Cluster1 VIP 情況
[mysql@cluster01 ~]$ ip addr 1: lo:mtu 16436 qdisc noqueue state UNKNOWN link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether fa:16:3e:de:80:33 brd ff:ff:ff:ff:ff:ff inet 10.0.0.247/16 brd 10.0.255.255 scope global eth0 inet 10.0.0.238/32 scope global eth0 inet 10.0.0.237/32 scope global eth0 inet6 fe80::f816:3eff:fede:8033/64 scope link valid_lft forever preferred_lft forever
切換后 Slave 同步指向
[mysql@cluster03 ~]$ /usr/local/mysql57/bin/mysql -S /data/mysql_db/test_mmm/mysql.sock -e "show slave status G"; *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.0.0.247 Master_User: repl Master_Port: 3306 Connect_Retry: 60
可以看到數(shù)據(jù)庫(kù)宕機(jī)后, MMM 會(huì)自動(dòng)切換, 從而實(shí)現(xiàn)高可用
7. 總結(jié) 7.1 MMM 優(yōu)點(diǎn)MMM 可以管理主備節(jié)點(diǎn),并實(shí)現(xiàn)全節(jié)點(diǎn)高可用
當(dāng)節(jié)點(diǎn)出現(xiàn)問(wèn)題的時(shí)候自動(dòng)切換,恢復(fù)后自動(dòng)上線
7.2 MMM 缺點(diǎn)在進(jìn)行主從切換時(shí), 容易造成數(shù)據(jù)丟失。
MMM Monitor 服務(wù)存在單點(diǎn)故障 ,也就是說(shuō), MMM 本身不是高可用的,所以監(jiān)控端要和數(shù)據(jù)庫(kù)分開(kāi)部署以防數(shù)據(jù)庫(kù)和監(jiān)控都出現(xiàn)問(wèn)題
筆者在實(shí)際使用過(guò)程中發(fā)現(xiàn):
主備切換偶爾會(huì)造成從節(jié)點(diǎn)同步失敗(主鍵沖突、記錄不存在)
宕機(jī)切換恢復(fù)后節(jié)點(diǎn)有數(shù)據(jù)丟失
7.3 MMM 適用場(chǎng)景對(duì)數(shù)據(jù)一致性要求不高,允許丟失少量數(shù)據(jù),比如說(shuō)評(píng)論、資訊類數(shù)據(jù)
讀操作頻繁,需要在所有節(jié)點(diǎn)上進(jìn)行讀操作負(fù)載均衡(后續(xù)文章會(huì)說(shuō)到怎么做負(fù)載均衡)
到此, MMM 高可用架構(gòu)搭建完畢
8. 附 8.1 問(wèn)題及解決方案 1). 配置文件讀寫權(quán)限問(wèn)題描述
FATAL Configuration file /etc/mysql-mmm/mmm_agent*.conf is world writable! FATAL Configuration file /etc/mysql-mmm/mmm_agent*.conf is world readable!
解決方案
chmod 664 /etc/mysql-mmm/*2). 重復(fù)監(jiān)聽(tīng)
問(wèn)題描述
這個(gè)問(wèn)題容易出現(xiàn)在多個(gè) MMM 監(jiān)控實(shí)例的情況下, 報(bào)錯(cuò)如下
FATAL Listener: Can’t create socket!
解決方案
檢查配置文件端口是否沖突
檢查機(jī)器端口是否被占用
3). 網(wǎng)卡配置不對(duì)問(wèn)題描述
FATAL Couldn’t configure IP ‘192.168.1.202’ on interface ‘em1’: undef
解決方案
ifconfig 命令查看網(wǎng)卡,更改配置文件
8.2 mmm 6 種狀態(tài)及變化原因 狀態(tài)online
admin_offline
hard_offline
awaiting_recovery
replication_delay
replication_fail
變化原因:ONLINE: Host is running without any problems.
ADMIN_OFFLINE: host was set to offline manually.
HARD_OFFLINE: Host is offline (Check ping and/or mysql failed)
AWAITING_RECOVERY: Host is awaiting recovery
REPLICATION_DELAY: replication backlog is too big (Check rep_backlog failed)
REPLICATION_FAIL: replication threads are not running (Check rep_threads failed)
其他說(shuō)明Only hosts with state ONLINE may have roles. When a host switches from ONLINE to any other state, all roles will be removed from it.
A host that was in state REPLICATION_DELAY or REPLICATION_FAIL will be switched back to ONLINE if everything is OK again, unless it is flapping (see Flapping).
A host that was in state HARD_OFFLINE will be switched to AWAITING_RECOVERY if everything is OK again. If its downtime was shorter than 60 seconds and it wasn"t rebooted or auto_set_online is > 0 it will be switched back to ONLINE automatically, unless it is flapping (see Flapping again).
Replication backlog or failure on the active master isn"t considered to be a problem, so the active master will never be in state REPLICATION_DELAY or REPLICATION_FAIL.
Replication backlog or failure will be ignored on hosts whos peers got ONLINE less than 60 seconds ago (That"s the default value of master-connect-retry).
If both checks rep_backlog and rep_threads fail, the state will change to REPLICATION_FAIL.
If auto_set_online is > 0, flapping hosts will automatically be set to ONLINE after flap_duration seconds.
參考: mmm 官方文檔
文章版權(quán)歸作者所有,未經(jīng)允許請(qǐng)勿轉(zhuǎn)載,若此文章存在違規(guī)行為,您可以聯(lián)系管理員刪除。
轉(zhuǎn)載請(qǐng)注明本文地址:http://systransis.cn/yun/17833.html
閱讀 3363·2021-10-13 09:40
閱讀 2601·2021-10-08 10:17
閱讀 4006·2021-09-28 09:45
閱讀 938·2021-09-28 09:35
閱讀 1819·2019-08-30 10:51
閱讀 2910·2019-08-26 12:11
閱讀 1655·2019-08-26 10:41
閱讀 3103·2019-08-23 17:10