Double Sync Replication

Double Sync Replication slides from OOW16

Published in: Engineering

  1. Double Sync Replication: Enhancing Data Durability
     Lixun Peng @ Alibaba Cloud Compute
  2. About me
     • Name: Lixun Peng
     • Location: Hangzhou, China
     • Occupation: Staff Database Kernel Engineer @ Alibaba Cloud
     • Interests: MySQL Replication & InnoDB
     • Experience: I started out as a DBA, then began modifying the code to make better use of it, and gradually became a MySQL kernel engineer.
  3. Agenda
     • The problems with Async/Semi-Sync
     • How to solve these problems
     • How to implement Double-Sync
     • How to use Double-Sync
     • Several cases
  4. Agenda
     • The problems with Async/Semi-Sync
     • How to solve these problems
     • How to implement Double-Sync
     • Several cases
  5. Problem of Async Replication
     • The master does not wait for an ACK from the slave.
     • The slave does not know whether it has dumped the latest binary logs.
     • When the master crashes, the slave cannot tell whether it has caught up with the master.
     • The major problem: the slave does not know the master’s status.
  6. Semi-Sync Replication
     • With Semi-Sync, the master waits for the ACK from the slave.
  7. Problem of SemiSync
     • The master has to wait for an ACK from the slave.
     • The slave is downgraded to async when a timeout happens.
     • If the timeout is too small, timeouts happen too often.
     • If the timeout is too large, the master blocks for a long time.
     • After it recovers from a network failure, the slave dumps the binary logs generated during the timeout asynchronously.
     • If the master crashes, the slave does not know which mode replication was running in (Async or SemiSync).
     • In that case, the slave still does not know whether it has dumped the latest binary logs.
     • Conclusion: SemiSync does not solve the major problem.
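
The timeout trade-off on this slide can be illustrated with a toy model. This is only a sketch, not MySQL's implementation: the class `SemiSyncMaster`, the `ack_delay_ms` parameter, and the numbers are hypothetical, and the model deliberately ignores the switch back to semi-sync once the slave catches up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SemiSyncMaster:
    """Toy master that waits up to `timeout_ms` for the slave's ACK."""
    timeout_ms: int
    semi_sync: bool = True
    blocked_ms: int = 0                                # total time commits spent waiting
    unacked: List[str] = field(default_factory=list)   # committed without an ACK

    def commit(self, txn: str, ack_delay_ms: Optional[int]) -> None:
        """ack_delay_ms: time the slave needs to ACK; None means it never answers."""
        if not self.semi_sync:
            self.unacked.append(txn)            # async mode: the master does not wait
            return
        if ack_delay_ms is not None and ack_delay_ms <= self.timeout_ms:
            self.blocked_ms += ack_delay_ms     # ACK arrived in time
        else:
            self.blocked_ms += self.timeout_ms  # waited for the whole timeout
            self.semi_sync = False              # downgrade to async
            self.unacked.append(txn)


master = SemiSyncMaster(timeout_ms=100)
master.commit("t1", ack_delay_ms=5)
master.commit("t2", ack_delay_ms=None)  # network failure: the timeout fires
master.commit("t3", ack_delay_ms=5)     # already async, so not acknowledged
print(master.semi_sync, master.blocked_ms, master.unacked)
```

In this model the slave has no way to learn that `t2` and `t3` were committed without an ACK, which is the gap the slide points out.
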
  8. Problem of Async/SemiSync
  9. Flow Chart (Async/Semi-Sync)
  10. Background & Target
      • Background
        • The SA team guarantees server availability of 99.999%.
        • The Net Ops team guarantees network availability of 99.999%.
        • Assumption: the master and the network do not fail at the same time.
      • Target
        • The slave knows whether it has caught up with the master.
        • The slave knows how much of the master’s data it is missing.
      • Key point: clarify the slave’s status!
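
A rough back-of-the-envelope check shows why the "no simultaneous failure" assumption is plausible. The sketch below treats server and network failures as independent and uses a 365-day year; both are assumptions made here, not statements from the slides.

```python
# Rough arithmetic behind the "master and network do not fail together" assumption.
# Independence of the two failure sources is an assumption of this sketch.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_minutes_per_year(availability: float) -> float:
    """Expected unavailable minutes per year for a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def joint_failure_probability(avail_a: float, avail_b: float) -> float:
    """Probability that both components are down at the same instant, if independent."""
    return (1.0 - avail_a) * (1.0 - avail_b)


server = network = 0.99999
print(f"downtime per component: {downtime_minutes_per_year(server):.2f} min/year")
print(f"both down at once:      {joint_failure_probability(server, network):.1e}")
```

Under these assumptions each component is down for roughly five minutes a year, and the chance of both being down at the same instant is on the order of one in ten billion.
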
  11. Agenda
      • The problems with Async/Semi-Sync
      • How to solve these problems
      • How to implement Double-Sync
      • Several cases
  12. Solve the weak point of SemiSync
      • Even after the network recovers from a failure, the slave still has to dump the binary logs generated during the timeout asynchronously.
      • What happens if, after a timeout, the slave gives up the binary logs generated during the timeout and the master sends only the latest position and logs?
      • Even after the network has been down, the slave always knows the latest position.
      • The slave can tell whether its data is the same as the master’s.
      • How does the slave catch up on the data modifications made while the network was down?
      • Async replication can still dump the binary logs.
      • So we can use Async replication to do a full log apply.
  13. Combine the Async and SemiSync
      • Async Replication (Async Channel)
        • Dumps continuous binary logs from the master.
        • Applies logs immediately after the slave receives them.
      • SemiSync Replication (Sync Channel)
        • Dumps the latest binary logs and position.
        • Does not apply logs immediately. Expired logs are purged automatically.
      • Analyzing consistency
        • Compares the logs and positions from the two channels.
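
The division of labour between the two channels might be modelled as below. This is a minimal sketch, not AliSQL code: events are reduced to integer sequence numbers, and the `keep_last` retention policy stands in for the unspecified rule by which expired logs are purged.

```python
from collections import deque


class AsyncChannel:
    """Receives a continuous log stream and applies each event immediately."""

    def __init__(self):
        self.applied = []  # sequence numbers already applied to the slave's data

    def receive(self, seq: int) -> None:
        self.applied.append(seq)


class SyncChannel:
    """Receives only the latest events, keeps them unapplied, purges old ones."""

    def __init__(self, keep_last: int):
        # keep_last is a hypothetical retention policy for this sketch
        self.buffer = deque(maxlen=keep_last)

    def receive(self, seq: int) -> None:
        self.buffer.append(seq)  # the deque drops the oldest entry when full


async_ch, sync_ch = AsyncChannel(), SyncChannel(keep_last=3)
for seq in range(1, 6):  # normal operation: both channels see events 1..5
    async_ch.receive(seq)
    sync_ch.receive(seq)
for seq in (8, 9):       # after an outage the sync channel gets only the newest events
    sync_ch.receive(seq)

print(async_ch.applied[-1], list(sync_ch.buffer))
```

Comparing the last applied event (5) with the newest event in the sync buffer (9) tells the slave that it is behind, which is the consistency analysis the slide refers to.
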
  14. Combine the Async and SemiSync
  15. Flow Chart (Double Sync)
  16. Agenda
      • The problems with Async/Semi-Sync
      • How to solve these problems
      • How to implement Double-Sync
      • Several cases
  17. How to create two channels (1)
      • Multi-Source replication enables N channels in one slave.
      • Problem: when the master receives two dump requests from servers with the same server-id, it disconnects the previous one.
      • Solution: set up a special Server-ID (0xFFFFFF) for the Sync Channel.
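
The server-id collision and its workaround can be sketched with a toy registry. `MasterDumpRegistry` and the slave id `101` are hypothetical names for illustration; only the special id `0xFFFFFF` comes from the slide.

```python
class MasterDumpRegistry:
    """Toy model of a master that allows one dump connection per server-id."""

    def __init__(self):
        self.connections = {}  # server_id -> channel name

    def register(self, server_id: int, channel: str):
        """Register a dump request; return the channel that was disconnected, if any."""
        kicked = self.connections.get(server_id)
        self.connections[server_id] = channel
        return kicked


SYNC_CHANNEL_SERVER_ID = 0xFFFFFF  # special id from the slide

# Both channels use the slave's own id: the second request kicks out the first.
same_id = MasterDumpRegistry()
same_id.register(101, "async")
kicked = same_id.register(101, "sync")

# The sync channel uses the special id: both connections survive.
special_id = MasterDumpRegistry()
special_id.register(101, "async")
not_kicked = special_id.register(SYNC_CHANNEL_SERVER_ID, "sync")

print(kicked, not_kicked, sorted(special_id.connections.values()))
```
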
  18. How to create two channels (2)
      • Problem: one slave has both a SemiSync and a non-SemiSync channel, but the SemiSync settings are global.
      • Solution: move the SemiSyncSlave class into Master_info.
  19. Analyzing consistency
      • Using the GTID
      • Using the Log_file_name and Log_file_pos
      • The following pictures illustrate the process.
  20. Analyzing consistency
      • Needn’t repair: just use it!
      • Can’t repair: something will be lost.
      • Can repair: use it after the repair.
  21. Agenda
      • The problems with Async/Semi-Sync
      • How to solve these problems
      • How to implement Double-Sync
      • Several cases
  22. CASE 1: Needn’t Fix
      • The GTIDs of the Sync and Async Channels are the same.
  23. CASE 2: Can’t Fix
      • There is a gap between the logs of the Sync and Async Channels.
  24. CASE 3: Can Repair
      • Combining the two channels’ logs makes the logs continuous.
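
The three cases can be expressed as a small classification function. This is a sketch under simplifying assumptions: GTIDs are reduced to consecutive integer sequence numbers, and `analyze`, `async_last`, and `sync_seqs` are hypothetical names rather than anything from AliSQL.

```python
from enum import Enum


class Verdict(Enum):
    NEED_NOT_REPAIR = "needn't repair"
    CAN_REPAIR = "can repair"
    CANNOT_REPAIR = "can't repair"


def analyze(async_last, sync_seqs):
    """Compare the async channel's last event with the sync channel's buffer.

    async_last: last sequence number received by the async channel.
    sync_seqs:  non-empty collection of sequence numbers held by the sync channel.
    """
    latest = max(sync_seqs)
    if latest <= async_last:
        return Verdict.NEED_NOT_REPAIR  # case 1: both channels agree
    # Every event after async_last up to the latest one must be in the sync buffer.
    needed = set(range(async_last + 1, latest + 1))
    if needed <= set(sync_seqs):
        return Verdict.CAN_REPAIR       # case 3: combined logs are continuous
    return Verdict.CANNOT_REPAIR        # case 2: a gap remains


print(analyze(9, [7, 8, 9]).value)
print(analyze(5, [8, 9]).value)
print(analyze(5, [5, 6, 7, 8, 9]).value)
```
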
  25. How to Repair
      • The slave waits for the Async Channel to apply all the logs it has received, then starts the SQL thread of the Sync Channel.
      • GTID filters out the events that the Async Channel has already applied.
      • A REPAIR SLAVE command is provided to do this automatically.
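
The two repair steps could look roughly like this. The function is a hypothetical model of the procedure described on the slide, not the implementation of the REPAIR SLAVE command: events are integer sequence numbers, and a Python set stands in for the slave's executed GTID set. It assumes the "can repair" case, where the combined logs are continuous.

```python
def repair(applied, async_pending, sync_buffer):
    """Return the slave's applied event sequence after a repair.

    applied:       events the slave has already applied
    async_pending: events received by the async channel but not yet applied
    sync_buffer:   unapplied events held by the sync channel
    """
    executed = set(applied)  # stands in for the executed GTID set
    log = list(applied)

    # Step 1: let the async channel finish applying everything it has received.
    for seq in async_pending:
        if seq not in executed:
            log.append(seq)
            executed.add(seq)

    # Step 2: start the sync channel's apply step, skipping events already applied.
    for seq in sync_buffer:
        if seq in executed:
            continue  # filtered out: already applied through the async channel
        log.append(seq)
        executed.add(seq)
    return log


print(repair(applied=[1, 2, 3], async_pending=[4, 5], sync_buffer=[4, 5, 6, 7]))
```

Events 4 and 5 reach the slave through both channels but are applied only once, because the second pass skips anything already in the executed set.
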
  26. FAQs (1)
      • Q1: Will Alibaba release this feature?
      • A1: Of course! Alibaba will release all the patches.
      • Q2: When will Alibaba release the source code?
      • A2: Check AliSQL’s roadmap.
      • Q3: How can I access AliSQL’s source code?
      • A3: https://github.com/alibaba/AliSQL Currently the project is private. If you want access, please email me your GitHub account.
  27. FAQs (2)
      • Q4: What is the difference between two Semi-Sync slaves and double sync replication?
      • A4: They do the same job, and performance is much the same too. But double sync replication needs one slave fewer than the two-Semi-Sync-slave architecture. As the number of MySQL servers grows, that saves a lot of money.
  28. Any other questions? penglixun@gmail.com
