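MySQL Master-Slave Replication: Skipping Errors on the Slave
Contents
I. Skipping a specified number of transactions
II. Modifying the MySQL configuration file
III. Simulating an error scenario
MySQL master-slave replication often hits an error that stops replication on the slave. When that happens, manual intervention is usually needed to skip the error so that replication can continue. There are two ways to skip errors: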
I. Skipping a specified number of transactions
mysql> STOP SLAVE;
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;  # skip one transaction
mysql> START SLAVE;
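II. Modifying the MySQL configuration file
Use the slave_skip_errors parameter to skip all errors or only errors of specific types. The parameter is read at server startup, so mysqld must be restarted after changing it.
vim /etc/my.cnf
[mysqld]
#slave-skip-errors=1062,1053,1146  # skip errors with the listed error numbers
#slave-skip-errors=all  # skip all errors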
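Notes:
1) If the configuration file contains two lines:
slave-skip-errors=1062
slave-skip-errors=1032
then the second line overrides the first. When several error codes need to be skipped, write them all on one line, separated by commas.
2) Although the slave skips these errors and keeps replicating, they are still written to the error log as warnings, for example: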
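As a small illustration of the single-line rule above, the following Python sketch collapses several slave-skip-errors lines into the one comma-separated line the option file needs. The function name merge_skip_errors is hypothetical and is not part of MySQL; the code only manipulates text and does not touch a server.

```python
def merge_skip_errors(lines):
    """Combine several slave-skip-errors lines into one comma-separated line."""
    codes = []
    for line in lines:
        # Drop any trailing comment and surrounding whitespace.
        setting = line.split("#", 1)[0].strip()
        if not setting:
            continue
        name, _, value = setting.partition("=")
        # Accept both the dash and underscore spellings of the option name.
        if name.strip().replace("_", "-") != "slave-skip-errors":
            continue
        for code in value.split(","):
            code = code.strip()
            if code and code not in codes:
                codes.append(code)
    return "slave-skip-errors=" + ",".join(codes)


print(merge_skip_errors(["slave-skip-errors=1062", "slave-skip-errors=1032"]))
# prints: slave-skip-errors=1062,1032
```

The merged line can then replace the separate lines under [mysqld].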
160620 10:40:17 [Warning] Slave SQL: Could not execute Write_rows event on table dba.t; Duplicate entry '10' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000033, end_log_pos 1224, Error_code: 1062
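III. Simulating an error scenario
1. Environment (a master-slave replication setup that is already configured)
Master database IP: 192.168.247.128
Slave database IP: 192.168.247.130
MySQL version: 5.6.14
binlog-do-db = mydb
2. Run the following statements on the master:
mysql> use mysql;
mysql> create table t1 (id int);
mysql> use mydb;
mysql> insert into mysql.t1 select 1;
3. Check the replication status on the slave:
mysql> show slave status\G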
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.247.128
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000017
Read_Master_Log_Pos: 2341
Relay_Log_File: DBtest1-relay-bin.000011
Relay_Log_Pos: 494
Relay_Master_Log_File: mysql-bin.000017
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1146
Last_Error: Error 'Table 'mysql.t1' doesn't exist' on query. Default database: 'mydb'. Query: 'insert into mysql.t1 select 1'
Skip_Counter: 0
Exec_Master_Log_Pos: 1919
Relay_Log_Space: 1254
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1146
Last_SQL_Error: Error 'Table 'mysql.t1' doesn't exist' on query. Default database: 'mydb'. Query: 'insert into mysql.t1 select 1'
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: f0f7faf6-51a8-11e3-9759-000c29eed3ea
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 131210 21:37:19
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.00 sec)
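The output shows:
Read_Master_Log_Pos is 2341 but Exec_Master_Log_Pos is 1919: the slave has read events up to position 2341, but execution stopped at position 1919, where the error occurred.
Last_SQL_Error: Error 'Table 'mysql.t1' doesn't exist' on query.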
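The same reading of the status output can be sketched in Python. This is only an illustration: parse_slave_status and sql_thread_stalled are hypothetical helper names, and the sample text is a shortened copy of the fields from the output above.

```python
def parse_slave_status(text):
    """Parse 'Key: value' lines from SHOW SLAVE STATUS\\G output into a dict."""
    status = {}
    for line in text.splitlines():
        # Skip the row separator and anything that is not a key/value line.
        if ":" not in line or line.lstrip().startswith("*"):
            continue
        # Split on the first colon only; values such as Last_Error contain colons.
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def sql_thread_stalled(status):
    """True when the SQL thread is not running, i.e. replication is broken."""
    return status.get("Slave_SQL_Running") != "Yes"


sample = """\
Master_Log_File: mysql-bin.000017
Read_Master_Log_Pos: 2341
Relay_Master_Log_File: mysql-bin.000017
Slave_IO_Running: Yes
Slave_SQL_Running: No
Exec_Master_Log_Pos: 1919
Last_SQL_Errno: 1146
"""

status = parse_slave_status(sample)
# The subtraction is meaningful only because both positions refer to the
# same binlog file (mysql-bin.000017) in this sample.
backlog = int(status["Read_Master_Log_Pos"]) - int(status["Exec_Master_Log_Pos"])
print(sql_thread_stalled(status), backlog)  # prints: True 422
```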
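Cause: the master is configured with binlog-do-db = mydb, so it writes a statement to the binlog only when the default database is mydb. The CREATE TABLE was run with mysql as the default database and was not logged, so mysql.t1 was never created on the slave. The INSERT was run with mydb as the default database and was logged, so it failed on the slave because the table does not exist there. The transaction contents can be inspected in the master's binlog. Each row below is one event, and a transaction spans the events from BEGIN to COMMIT: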
mysql> SHOW BINLOG EVENTS in 'mysql-bin.000017' from 1919\G
*************************** 1. row ***************************
Log_name: mysql-bin.000017
Pos: 1919
Event_type: Query
Server_id: 1
End_log_pos: 1999
Info: BEGIN
*************************** 2. row ***************************
Log_name: mysql-bin.000017
Pos: 1999
Event_type: Query
Server_id: 1
End_log_pos: 2103
Info: use `mydb`; insert into mysql.t1 select 1
*************************** 3. row ***************************
Log_name: mysql-bin.000017
Pos: 2103
Event_type: Xid
Server_id: 1
End_log_pos: 2134
Info: COMMIT /* xid=106 */
*************************** 4. row ***************************
Log_name: mysql-bin.000017
Pos: 2134
Event_type: Query
Server_id: 1
End_log_pos: 2213
Info: BEGIN
*************************** 5. row ***************************
Log_name: mysql-bin.000017
Pos: 2213
Event_type: Query
Server_id: 1
End_log_pos: 2310
Info: use `mydb`; insert into t1 select 9
*************************** 6. row ***************************
Log_name: mysql-bin.000017
Pos: 2310
Event_type: Xid
Server_id: 1
End_log_pos: 2341
Info: COMMIT /* xid=107 */
6 rows in set (0.00 sec)
From the output above, two transactions need to be skipped:
(Pos 1999 insert, Pos 2103 commit),
(Pos 2213 insert, Pos 2310 commit)
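Counting the transactions by hand is error-prone, so here is a minimal Python sketch of the same count. It assumes the events have already been copied out of SHOW BINLOG EVENTS as (Event_type, Info) pairs; count_event_groups is a hypothetical helper, not a MySQL function.

```python
def count_event_groups(events):
    """Count BEGIN ... COMMIT groups in a list of (event_type, info) tuples."""
    groups = 0
    in_group = False
    for event_type, info in events:
        text = info.strip().upper()
        if event_type == "Query" and text == "BEGIN":
            in_group = True
        elif in_group and (event_type == "Xid" or text.startswith("COMMIT")):
            # An Xid event (or an explicit COMMIT) closes the current group.
            groups += 1
            in_group = False
    return groups


# The six events shown in the binlog output above.
events = [
    ("Query", "BEGIN"),
    ("Query", "use `mydb`; insert into mysql.t1 select 1"),
    ("Xid", "COMMIT /* xid=106 */"),
    ("Query", "BEGIN"),
    ("Query", "use `mydb`; insert into t1 select 9"),
    ("Xid", "COMMIT /* xid=107 */"),
]
print(count_event_groups(events))  # prints: 2
```

The result, 2, matches the number of transactions identified above.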
4. On the slave
mysql> STOP SLAVE;  # stop the slave
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 2;  # skip two transactions
mysql> START SLAVE;  # start the slave
mysql> show slave status\G  # check the slave status
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 3
Current database: mydb
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.247.128
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000017
Read_Master_Log_Pos: 3613
Relay_Log_File: DBtest1-relay-bin.000018
Relay_Log_Pos: 283
Relay_Master_Log_File: mysql-bin.000017
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 3613
Relay_Log_Space: 458
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: f0f7faf6-51a8-11e3-9759-000c29eed3ea
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.01 sec)
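Note: count the transactions to skip carefully. If the counter is set higher than the number of failing transactions, the slave SQL thread also skips the valid transactions that follow, including ones the master writes afterwards, so those writes never show up on the slave.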
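The effect of an oversized counter can be illustrated with a deliberately simplified Python model. It assumes one counter unit is consumed per transaction and is not how the server is implemented; apply_with_skip and the third entry in the pending list ("later valid write") are made up for the illustration.

```python
def apply_with_skip(groups, skip_counter):
    """Return (applied_groups, remaining_counter) for a simplified SQL-thread model."""
    applied = []
    for group in groups:
        if skip_counter > 0:
            # The group is discarded and one unit of the counter is used up.
            skip_counter -= 1
            continue
        applied.append(group)
    return applied, skip_counter


pending = [
    "insert into mysql.t1 select 1",
    "insert into t1 select 9",
    "later valid write",  # hypothetical follow-up transaction from the master
]
print(apply_with_skip(pending, 2))  # prints: (['later valid write'], 0)
print(apply_with_skip(pending, 3))  # prints: ([], 0)
```

With a counter of 2 the later write is applied; with a counter of 3 it is silently dropped, which is the situation the note warns about.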