Restart slave thread in MySQL when a specific error occurs

On one of my slaves I got this error:

write failed: No space left on device (28)

I also found that the slave had hit error 1062 (“Duplicate entry”) and stopped. I freed up some space by removing old logs. When I tried to restart replication with pt-slave-restart, I found that the IO_Thread was downloading binlogs from the master and using up all the free space again.

As a workaround, I decided to start just the SQL_Thread, let it process all the relay logs, and then start the IO_Thread again.
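That sequence can be sketched in shell roughly like this. It is only a sketch under a few assumptions: the function names (status_field, relay_logs_applied, drain_relay_logs) are made up for illustration, mysql is assumed to connect without extra options, and the relay logs are treated as fully processed once the executed position (Relay_Master_Log_File, Exec_Master_Log_Pos) matches the position the IO_Thread had already read (Master_Log_File, Read_Master_Log_Pos).

```shell
# Print the value of one field from "SHOW SLAVE STATUS\G" text read on stdin.
status_field() {
    sed -n "s/^ *$1: //p" | head -n 1
}

# Succeed when the SQL thread has executed everything the IO thread had read.
# Assumption: equal file names and equal positions mean the relay logs are drained.
relay_logs_applied() {
    status=$(cat)
    read_file=$(printf '%s\n' "$status" | status_field Master_Log_File)
    exec_file=$(printf '%s\n' "$status" | status_field Relay_Master_Log_File)
    read_pos=$(printf '%s\n' "$status" | status_field Read_Master_Log_Pos)
    exec_pos=$(printf '%s\n' "$status" | status_field Exec_Master_Log_Pos)
    [ -n "$read_file" ] && [ "$read_file" = "$exec_file" ] && [ "$read_pos" = "$exec_pos" ]
}

# Start only the SQL thread, wait for the relay logs to drain,
# then let the IO thread fetch new binlogs again.
drain_relay_logs() {
    mysql -e "START SLAVE SQL_THREAD;"
    until mysql -e "SHOW SLAVE STATUS\G" | relay_logs_applied; do
        sleep 5
    done
    mysql -e "START SLAVE IO_THREAD;"
}
```

Calling drain_relay_logs runs the whole sequence. If the SQL_Thread stops on an error such as 1062 in the meantime, the wait loop never finishes on its own, so the error still has to be skipped separately.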

This is a quick bash one-liner I wrote. It checks replication and, if error 1062 is present, skips the offending event and starts the SQL_Thread again.

while true; do if mysql -e "show slave status\G" | grep -q "Last_SQL_Errno: 1062"; then mysql -e "set global sql_slave_skip_counter=1; start slave sql_thread;"; fi; sleep 1; done

This was enough to fix the issue.

Syncing MySQL tables by pt-table-checksum when there is no unique key

Today I needed to checksum and sync tables in a simple master-slave replication setup, so I used percona-toolkit.

The following checksum run (the command should be run on the master) shows that the repl.t9 and repl.t1 tables have differences:

$ pt-table-checksum --replicate=percona.checksums --create-replicate-table --empty-replicate-table --ask-pass h=localhost,u=root
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
05-26T01:50:30      0      1        4       1       0   0.016 repl.t1
05-26T01:50:30      0      1        6       1       0   0.016 repl.t9
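With only two tables the DIFFS column is easy to read by eye. On a larger schema, a small filter like the one below could pull out just the tables that differ. This is a sketch that assumes the report keeps the column layout shown above (DIFFS third, TABLE last); the function name tables_with_diffs and the file name checksum-report.txt are hypothetical.

```shell
# Print the TABLE column for every report row whose DIFFS column is a non-zero number.
# The header row is skipped because its third field ("DIFFS") is not numeric.
tables_with_diffs() {
    awk '$3 ~ /^[0-9]+$/ && $3 > 0 { print $NF }'
}

# Example, with the report saved to a file first (hypothetical file name):
#   tables_with_diffs < checksum-report.txt
```

For the report above this would list repl.t1 and repl.t9.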

Let’s run the sync and check the result: