MySQL Master-Slave Synchronization Principle

Replication threads

MySQL Replication is an asynchronous process that copies data from one MySQL instance (which we call the Master) to another MySQL instance (which we call the Slave). The whole replication process between Master and Slave is carried out mainly by three threads: two of them (the SQL thread and the IO thread) run on the Slave, and the third (an IO thread) runs on the Master.

To set up MySQL Replication, you must first enable the Binary Log (mysql-bin.xxxxxx) on the Master; without it replication is impossible, because the whole process amounts to the Slave fetching the log from the Master and then executing the operations recorded in it, completely and in the recorded order. The Binary Log can be enabled by passing the "--log-bin" option when starting MySQL Server, or by adding the "log-bin" parameter to the mysqld group of the my.cnf configuration file (the parameters that follow the [mysqld] section marker).
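
As a small illustration of the configuration-file route, the following Python sketch checks whether a my.cnf-style text enables "log-bin" in its [mysqld] group. The helper name binlog_enabled and the sample file contents are assumptions made for this example, not part of MySQL. The standard-library configparser is only an approximation of how MySQL reads option files (for instance, it does not understand !include directives).

```python
import configparser


def binlog_enabled(cnf_text):
    """Return the log-bin base name if [mysqld] enables the binary log, else None."""
    # allow_no_value: my.cnf options may appear without "= value".
    # interpolation=None: treat "%" in values literally.
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return None
    # MySQL accepts both the dashed and the underscored spelling.
    for key in ("log-bin", "log_bin"):
        if parser.has_option("mysqld", key):
            value = parser.get("mysqld", key)
            # A bare "log-bin" line has no value; report it as enabled anyway.
            return value if value else "(default name)"
    return None


# Hypothetical sample configuration, for illustration only.
SAMPLE = """
[mysqld]
server-id = 1
log-bin = mysql-bin
"""

print(binlog_enabled(SAMPLE))  # prints: mysql-bin
```

A file with no [mysqld] group, or with a [mysqld] group that lacks the option, yields None, which corresponds to the case where replication cannot work.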

The basic MySQL replication process is as follows:

  1. The IO thread on the Slave connects to the Master and requests the log contents that follow a specified position in a specified log file (or the contents from the very beginning of the log);

  2. When the Master receives the request from the Slave's IO thread, its own IO thread (the one responsible for replication) reads the log from the specified position in the specified file according to the request and returns it to the Slave's IO thread. Besides the log contents themselves, the returned information includes the name of the Master's Binary Log file that was read and the position within that Binary Log;

  3. After receiving the information, the Slave's IO thread appends the received log contents, in order, to the end of the Relay Log file on the Slave (mysql-relay-bin.xxxxxx), and records the Master bin-log file name and position it has read up to in the master-info file, so that on the next read it can tell the Master clearly: "I need the log contents starting from this position in this bin-log; please send them to me";

  4. When the Slave's SQL thread detects newly added content in the Relay Log, it immediately parses that content into the executable Query statements that were actually run on the Master, and executes those Queries itself. In this way the same Queries are in effect executed on both the Master and the Slave, so the data on the two ends is identical.
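
The four steps can be mimicked with a toy in-memory model. This is only a sketch under strong assumptions: the Master and Slave classes, their method names, and the use of plain Python lists for the Binary Log and Relay Log are all invented for illustration, and real MySQL uses network connections, files, and concurrently running threads.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    binlog: list = field(default_factory=list)  # stands in for mysql-bin.xxxxxx
    data: dict = field(default_factory=dict)

    def execute(self, key, value):
        """Apply a change and record it in the binary log, in order."""
        self.data[key] = value
        self.binlog.append((key, value))

    def dump(self, position):
        """Master-side IO thread: return events after `position` and the new position."""
        return self.binlog[position:], len(self.binlog)


@dataclass
class Slave:
    relay_log: list = field(default_factory=list)  # stands in for mysql-relay-bin.xxxxxx
    master_pos: int = 0  # stands in for the position kept in the master-info file
    applied: int = 0     # how far the SQL thread has replayed the relay log
    data: dict = field(default_factory=dict)

    def io_thread_step(self, master):
        events, new_pos = master.dump(self.master_pos)  # steps 1 and 2
        self.relay_log.extend(events)                   # step 3: append to the relay log
        self.master_pos = new_pos                       # step 3: remember where to resume

    def sql_thread_step(self):
        for key, value in self.relay_log[self.applied:]:  # step 4: replay in order
            self.data[key] = value
        self.applied = len(self.relay_log)


master, slave = Master(), Slave()
master.execute("a", 1)
master.execute("b", 2)
slave.io_thread_step(master)
slave.sql_thread_step()
master.execute("a", 3)  # a later change is picked up from the saved position
slave.io_thread_step(master)
slave.sql_thread_step()
print(slave.data == master.data, slave.master_pos)  # prints: True 3
```

Because the Slave remembers the position it has read up to, the second fetch returns only the one new event instead of the whole log.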

In fact, in older versions, replication on the Slave was not done by the IO thread and the SQL thread working together; a single thread did all the work. MySQL's engineers soon found that this approach carried considerable risk and performance problems, chiefly the following:

First, if a single thread does all the work alone, then copying the Binary Log from the Master, parsing the log, and executing it become one serial process. Performance is naturally quite limited, and replication under this architecture naturally has a relatively long delay.

Second, after the replication thread on the Slave fetches the Binary Log from the Master, it has to parse the contents, restore them to the original Queries that the Master executed, and then execute them itself. During this time the Master has probably made many more changes and generated a large amount of new Binary Log information. If the Master's storage system suffers an unrecoverable failure at this stage, all the changes made during the stage are lost forever and cannot be retrieved. This potential risk is especially pronounced when the Slave is under heavy load, because the heavier the load, the longer it takes to parse and apply the logs, and the more data may be lost.

So, in a later redesign, newer versions of MySQL changed replication on the Slave to be done by two threads, the IO thread and the SQL thread mentioned above, in order to minimize this risk and improve replication performance. The improvement was first proposed by Jeremy Zawodny, an engineer at Yahoo!. The change not only largely solved the performance problem and shortened the asynchronous delay, it also reduced the amount of data that could potentially be lost.
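
To make the difference concrete, here is a deliberately idealized tick-based sketch. The function name, the parameters, and the assumption that a fetch copies everything available in one tick are all invented for illustration; it does not measure real MySQL behavior. It counts how many Master events have not yet been copied to the Slave at the moment the Master fails.

```python
def unfetched_at_failure(ticks, produce_per_tick, apply_per_tick, two_threads):
    """Count Master events the Slave has not copied when the Master fails after `ticks`."""
    produced = fetched = applied = 0
    for _ in range(ticks):
        produced += produce_per_tick
        backlog = fetched - applied  # events copied but not yet executed
        if two_threads or backlog == 0:
            # IO work: copy everything currently available on the Master.
            # A single thread may only do this once its backlog is fully applied.
            fetched = produced
        if two_threads or backlog > 0:
            # SQL work: execute a limited number of copied events per tick.
            applied = min(fetched, applied + apply_per_tick)
    return produced - fetched


single = unfetched_at_failure(10, 5, 2, two_threads=False)
double = unfetched_at_failure(10, 5, 2, two_threads=True)
print(single, double)  # prints: 25 0
```

In this model the single thread spends most ticks applying its backlog, so changes pile up on the Master uncopied, while the separate IO thread keeps copying regardless of how slowly the SQL thread applies. The zero for the two-thread case is an artifact of the idealized instant fetch; in practice the loss is reduced, not eliminated.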

Of course, even with two threads now cooperating, Slave data delay and data loss are still possible, since this replication is after all asynchronous. As long as the data changes are not made within a single transaction, these problems will exist.

If you want to avoid these problems completely, the only option is MySQL Cluster. However, at the time I wrote this section, MySQL Cluster was still an in-memory database solution: all data, including indexes, had to be loaded into memory, so its memory requirements were very large and it was not very practical for typical applications. That said, when I spoke earlier with David, MySQL's CTO, he said that MySQL was continuously improving its Cluster implementation. One major change is to no longer require all data to be loaded into memory, only all the indexes. I believe that once this change is complete, MySQL Cluster will be more popular and more practical.
