A problem we occasionally see is Relay Log corruption, which is most frequently caused by network errors. The replication IO thread does not currently checksum incoming data (this is scheduled for MySQL 6.x). In the meantime, we have a relatively easy workaround: encrypt the replication connection. Encrypted connections have to verify the integrity of each packet, so data damaged in transit is detected instead of being written to the Relay Log.
Solution 1: Replication over SSH Tunnel
This is the easiest to set up. You simply need to run the following on the Slave:
shell> ssh -f email@example.com -L 4306:master.server:3306 -N
This sets up the tunnel: slave.server:4306 is now a tunnelled link to master.server:3306. Now you just need to alter the Slave to go through the tunnel:
mysql> STOP SLAVE;
mysql> CHANGE MASTER TO master_host='localhost', master_port=4306;
mysql> START SLAVE;
Everything else stays the same. Your Slave is still connecting to the same Master, just in a different manner.
This solution does have a couple of downsides, however:
- If the SSH tunnel goes down, it won’t automatically reconnect. This can be fixed with a small script that restarts the connection if it fails. The script can be added to your init.d setup, so the tunnel opens automatically on server startup.
- If you use MySQL Enterprise Monitor, it won’t be able to recognize that the Master/Slave pair go together.
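The restart script mentioned in the first point could look something like the sketch below. It is only a starting point, and it assumes key-based SSH authentication so that no password prompt is needed. The variable names, the retry delay and the keepalive settings are illustrative choices. Unlike the one-off command above, it runs ssh without -f, so the loop can see when the tunnel exits.

```shell
#!/bin/sh
# Keep the replication SSH tunnel open, reopening it whenever ssh exits.
# All values are placeholders; override them through the environment.
SSH_LOGIN="${SSH_LOGIN:-email@example.com}"   # login from the ssh command above
LOCAL_PORT="${LOCAL_PORT:-4306}"              # port the Slave connects to
MASTER_HOST="${MASTER_HOST:-master.server}"
MASTER_PORT="${MASTER_PORT:-3306}"
RETRY_DELAY="${RETRY_DELAY:-5}"               # seconds to wait before reopening

# Print the -L forwarding spec, e.g. 4306:master.server:3306
forward_spec() {
    printf '%s:%s:%s\n' "$LOCAL_PORT" "$MASTER_HOST" "$MASTER_PORT"
}

# Run ssh in the foreground (no -f) so this loop notices when it exits.
keep_tunnel_open() {
    while true; do
        ssh -N \
            -o ExitOnForwardFailure=yes \
            -o ServerAliveInterval=30 \
            -o ServerAliveCountMax=3 \
            -L "$(forward_spec)" "$SSH_LOGIN"
        echo "tunnel closed (ssh exit status $?), retrying in ${RETRY_DELAY}s" >&2
        sleep "$RETRY_DELAY"
    done
}

# Only start the loop when asked to, so the file can also be sourced.
if [ "${1:-}" = "run" ]; then
    keep_tunnel_open
fi
```

Saved under a name of your choosing (say, replication-tunnel.sh), it can be started from an init script with "sh replication-tunnel.sh run &". The Slave IO thread retries on its own, so replication should resume once the tunnel is back.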
Solution 2: Replication with SSL
Replication with SSL can be trickier to set up, but it removes the two downsides of the previous solution. Luckily, the MySQL Documentation Team have done all the hard work for you.
- Step 1: Create the certificates
- Step 2: Set up the servers to recognize the certificates
- Step 3: Change the Slave to use SSL
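To give a rough idea of what Step 3 involves, here is a small sketch that prints the statements for switching an existing Slave over to SSL. It assumes Steps 1 and 2 are already done. The certificate paths and the function name are made up for illustration, so substitute the files you actually created.

```shell
#!/bin/sh
# Hypothetical certificate locations on the Slave; adjust to your own files.
SSL_CA="${SSL_CA:-/etc/mysql/certs/ca-cert.pem}"
SSL_CERT="${SSL_CERT:-/etc/mysql/certs/client-cert.pem}"
SSL_KEY="${SSL_KEY:-/etc/mysql/certs/client-key.pem}"

# Print the statements that switch an existing Slave over to SSL.
ssl_change_master_sql() {
    cat <<EOF
STOP SLAVE;
CHANGE MASTER TO
  MASTER_SSL=1,
  MASTER_SSL_CA='$SSL_CA',
  MASTER_SSL_CERT='$SSL_CERT',
  MASTER_SSL_KEY='$SSL_KEY';
START SLAVE;
EOF
}

# Show the statements so they can be reviewed before use.
ssl_change_master_sql
```

Once the output looks right, it can be pasted at the mysql> prompt or piped into the mysql client on the Slave. Afterwards, SHOW SLAVE STATUS should report Master_SSL_Allowed as Yes.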
If you’re seeing corruption problems in your Relay Log, but not in your Master Binary Log, try Solution 1. It’s quick to set up and will determine whether encryption is the solution to your problem. If it works, set up Solution 2. It will take a little bit of fiddling around, but is certainly worth the effort.