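Tags: monitoring MySQL master-slave replication for abnormal status
Stage 1: develop a daemon script that runs a check every 30 seconds.
Stage 2: if replication fails with one of the following error numbers (1158, 1159, 1008, 1007, 1062), skip the error.
Stage 3: implement the script above using arrays (for reading the master-slave status and for checking the error numbers).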
[root@slave ~]# mysql -u root -proot -e "show slave status\G"
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 172.16.1.2 # host of the current MySQL master server
Master_User: myslave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.000003
Read_Master_Log_Pos: 471
Relay_Log_File: relay-log-bin.000002
Relay_Log_Pos: 252
Relay_Master_Log_File: master-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Master_SSL_Key:
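Seconds_Behind_Master: 0 # replication delay behind the master, in seconds
Preparation (slave.log holds the saved output shown above):
egrep "_Running|Behind_Master" slave.log # filter the relevant fields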
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 0
[root@slave ~]# egrep "_Running|Behind_Master" slave.log | awk '{print $NF}'
Yes
Yes
0
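The filter above relies on the order of the matched lines, and some MySQL versions print additional fields whose names also contain `_Running`. A minimal sketch of an alternative, assuming the same "Field: value" layout, looks a field up by its exact name. The helper `slave_field` and the `sample` text are hypothetical and used only for illustration.

```shell
#!/bin/bash
# Sketch only: slave_field is a hypothetical helper, not part of the original script.
# It reads "Field: value" lines on stdin and prints the value of the field
# whose name matches $1 exactly.
slave_field() {
    awk -v name="$1" '
        {
            pos = index($0, ":")
            if (pos == 0) next                      # skip lines without a colon
            key = substr($0, 1, pos - 1)
            val = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)        # trim the field name
            gsub(/^[ \t]+|[ \t]+$/, "", val)        # trim the value
            if (key == name) { print val; exit }
        }'
}

# Sample text standing in for slave.log (assumed content).
sample='Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 0
Last_SQL_Errno: 0'

# Build the array from named lookups instead of line positions.
array=(
    "$(printf '%s\n' "$sample" | slave_field Slave_IO_Running)"
    "$(printf '%s\n' "$sample" | slave_field Slave_SQL_Running)"
    "$(printf '%s\n' "$sample" | slave_field Seconds_Behind_Master)"
)
echo "${array[@]}"
```

With a real slave, the same helper could read the output of the mysql command or the saved slave.log instead of the sample text.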
Stage 1: develop a daemon script that runs a check every 30 seconds.
#!/bin/bash
while true
do
    # Collect the values of Slave_IO_Running, Slave_SQL_Running and Seconds_Behind_Master.
    array=($(egrep "_Running|Behind_Master" slave.log | awk '{print $NF}'))
    if [ "${array[0]}" = "Yes" ] && [ "${array[1]}" = "Yes" ] && [ "${array[2]}" = "0" ]
    then
        echo "MySQL slave is ok"
    else
        char="MySQL slave is not ok"
        echo "$char"
        # Send an alert mail, then stop monitoring.
        echo "$char" | mail -s "$char" 995345781@qq.com
        break
    fi
    sleep 30
done
Execution result:
[root@slave ~]# sh test.sh
MySQL slave is ok
MySQL slave is ok
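The Stage 1 check treats any non-zero Seconds_Behind_Master as a failure, which may be stricter than needed on a busy slave. As a rough sketch, the three-way comparison could be wrapped in a function that accepts an optional lag threshold. The name `slave_is_healthy` and the `max_lag` parameter are assumptions for illustration, not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only when both replication threads run
# and the lag is a number no greater than max_lag (default 0, as in the script).
slave_is_healthy() {
    local io="$1" sql="$2" lag="$3" max_lag="${4:-0}"
    [ "$io" = "Yes" ] || return 1
    [ "$sql" = "Yes" ] || return 1
    # Seconds_Behind_Master can be NULL; reject anything that is not a plain number.
    case "$lag" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$lag" -le "$max_lag" ]
}

# Example values as they would come from the array in the script.
array=(Yes Yes 5)
if slave_is_healthy "${array[0]}" "${array[1]}" "${array[2]}" 10
then
    echo "MySQL slave is ok"
else
    echo "MySQL slave is not ok"
fi
```

Calling the function without the fourth argument keeps the original behaviour of requiring a lag of exactly 0.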
Final version:
#!/bin/bash
#Date:2017-7-3
#Author:xcn(baishuchao@yeah.net)
#version 1.0
mysql_cmd="mysql -u root -proot"
# Replication error numbers to skip (see Stage 2).
errorno=(1158 1159 1008 1007 1062)
while true
do
    # Expected order of values: Slave_IO_Running, Slave_SQL_Running,
    # Seconds_Behind_Master, Last_SQL_Errno.
    array=($($mysql_cmd -e "show slave status\G" | egrep '_Running|Behind_Master|Last_SQL_Errno' | awk '{print $NF}'))
    if [ "${array[0]}" = "Yes" ] && [ "${array[1]}" = "Yes" ] && [ "${array[2]}" = "0" ]
    then
        echo "MySQL slave is ok"
    else
        # If the last SQL error is in the skip list, skip one event and restart the slave.
        for ((i=0;i<${#errorno[*]};i++))
        do
            if [ "${array[3]}" = "${errorno[$i]}" ];then
                $mysql_cmd -e "stop slave; set global sql_slave_skip_counter=1; start slave;"
            fi
        done
        char="MySQL slave is not ok"
        echo "$char"
        # Send an alert mail, then stop monitoring.
        echo "$char" | mail -s "$char" 995345781@qq.com
        break
    fi
    sleep 30
done
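Tip: this script can be used in production to monitor whether MySQL master-slave replication is abnormal. It makes its decision based on the fields matched by
'_Running|Behind_Master|Last_SQL_Errno'
If the status is abnormal, the script goes on to check the error number, prints a message, and sends an email (or SMS) to the operations staff.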
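The error-number check is a membership test against an array. A minimal sketch of that test as a reusable function follows. The names `skippable_errnos` and `errno_is_skippable` are hypothetical, and 1032 is used only as an example of a number that is not in the list.

```shell
#!/bin/bash
# Error numbers from Stage 2 that should be skipped.
skippable_errnos=(1158 1159 1008 1007 1062)

# Hypothetical helper: returns 0 if $1 is one of the skippable error numbers.
errno_is_skippable() {
    local errno="$1" code
    for code in "${skippable_errnos[@]}"; do
        [ "$code" = "$errno" ] && return 0
    done
    return 1
}

# Try a skippable number, a non-listed number, and 0 (no error).
for errno in 1062 1032 0; do
    if errno_is_skippable "$errno"; then
        echo "$errno: skip"
    else
        echo "$errno: alert"
    fi
done
```

Returning as soon as a match is found avoids scanning the rest of the list, and keeping the test in one function makes it easy to change the list later.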