Ce diaporama a bien été signalé.
Nous utilisons votre profil LinkedIn et vos données d’activité pour vous proposer des publicités personnalisées et pertinentes. Vous pouvez changer vos préférences de publicités à tout moment.

Galera Replication Demystified: How Does It Work?

1 138 vues

Publié le

We always say Galera replication is synchronous, but is it really? Always? For every step? What are the caveats of such replication? What does virtual-synchronous mean?

In this talk, I'll discuss what certification and group communication are, and how they're performed. I'll also cover the difference between MySQL 5.6 GTID and Galera GTID. The goal of this presentation is to explain how a transaction is replicated using Galera.

After this presentation, the audience should be comfortable with terms like "brute force abort," "flow control," "local certification failure," etc.

The plan is to demystify Galera technology with easy illustrations and analogies.

Publié dans : Logiciels
  • Soyez le premier à commenter

Galera Replication Demystified: How Does It Work?

  1. 1. Galera Replication Demystified how does it work ?     1 / 262
  2. 2. about.me/lefred Who am I ?     2 / 262
  3. 3. Frédéric Descamps @lefred     3 / 262
  4. 4. Frédéric Descamps @lefred Working for Percona since 2011     4 / 262
  5. 5. Frédéric Descamps @lefred Working for Percona since 2011 Senior Architect     5 / 262
  6. 6. Frédéric Descamps @lefred Working for Percona since 2011 Senior Architect Managing MySQL since 3.23     6 / 262
  7. 7. Frédéric Descamps @lefred Working for Percona since 2011 Senior Architect Managing MySQL since 3.23 devops believer     7 / 262
  8. 8. Frédéric Descamps @lefred Working for Percona since 2011 Senior Architect Managing MySQL since 3.23 devops believer and I installed my first Galera Cluster in February 2010 ;-)     8 / 262
  9. 9. Galera Replication Cluster     9 / 262
  10. 10. Galera Replication - Cluster What is it ? What does it handle ?     10 / 262
  11. 11. Standard asyncrhonous replication is server-centric, one server streams data to another one. All the nodes have a specific role.     11 / 262
  12. 12. In Galera, the dataset is synchronized between one or more servers: data-centric     12 / 262
  13. 13. You can write to any node in your cluster No need to worry about eventual out-of-sync     13 / 262
  14. 14. Write events/transactions are sent in parallel     14 / 262
  15. 15. Cluster Membership determined by the cluster wsrep_cluster_address is just a pointer any node is permitted to join that knows the cluster name can find a single active cluster node     15 / 262
  16. 16. Cluster Membership determined by the cluster wsrep_cluster_address is just a pointer any node is permitted to join that knows the cluster name can find a single active cluster node 㫤㫠㫟㫜㫝㫘㫚㫙㫢㫠㫡㫜㫟㫘㫘㫛㫛㫟㫜㫠㫠㫔㫖㫔㫞㫚㫜㫚㫚㫣㫘㫘㫚㫢㫛㫗㫚㫟㫡㫗㫚㫗㫚㫕㫚㫢㫛㫗㫚㫟㫡㫗㫚㫗㫛㫕㫚㫢㫛㫗㫚㫟㫡㫗㫚㫗㫜     16 / 262
  17. 17. The cluster manages quorum and has split-brain protection.     17 / 262
  18. 18.     18 / 262
  19. 19.     19 / 262
  20. 20.     20 / 262
  21. 21.     21 / 262
  22. 22.     22 / 262
  23. 23.     23 / 262
  24. 24. Replication     24 / 262
  25. 25. Replication Delivers the writeset to all nodes in the cluster     25 / 262
  26. 26. Replication Delivers the writeset to all nodes in the cluster and all nodes acknowledge the writeset     26 / 262
  27. 27. Replication Delivers the writeset to all nodes in the cluster and all nodes acknowledge the writeset     27 / 262
  28. 28. Replication Delivers the writeset to all nodes in the cluster and all nodes acknowledge the writeset Cost is ~roundtrip latency to furthest node     28 / 262
  29. 29. Replication Delivers the writeset to all nodes in the cluster and all nodes acknowledge the writeset Cost is ~roundtrip latency to furthest node Serialized by Group Communication     29 / 262
  30. 30. GTID     30 / 262
  31. 31. GTID Not the same as 5.6 Aynchronous GTID's     31 / 262
  32. 32. GTID Not the same as 5.6 Aynchronous GTID's though they appear the same     32 / 262
  33. 33. GTID Not the same as 5.6 Aynchronous GTID's though they appear the same 939aac77-f7d1-11e3-bd5e-b211d6ab1ec6:1534285     33 / 262
  34. 34. GTID Not the same as 5.6 Aynchronous GTID's though they appear the same 939aac77-f7d1-11e3-bd5e-b211d6ab1ec6:1534285 GTIDs ensure cluster members are consistent with each other     34 / 262
  35. 35. GTID Not the same as 5.6 Aynchronous GTID's though they appear the same 939aac77-f7d1-11e3-bd5e-b211d6ab1ec6:1534285 GTIDs ensure cluster members are consistent with each other nodes joining a cluster have their GTIDs checked     35 / 262
  36. 36. GTID Not the same as 5.6 Aynchronous GTID's though they appear the same 939aac77-f7d1-11e3-bd5e-b211d6ab1ec6:1534285 GTIDs ensure cluster members are consistent with each other nodes joining a cluster have their GTIDs checked GTIDs can be used to compare downed nodes to each other     36 / 262
  37. 37. GTID The highest GTID is the most recently written Generally the best practice it to bootstrap the node with the most recent data     37 / 262
  38. 38. GTID     38 / 262
  39. 39. GTID     39 / 262
  40. 40. GTID     40 / 262
  41. 41. Global Transaction IDs initial dataset bfb912e5-f560-11e2-0800-1eefab05e57d:0     41 / 262
  42. 42. Global Transaction IDs initial dataset bfb912e5-f560-11e2-0800-1eefab05e57d:0 first change/transaction/writeset bfb912e5-f560-11e2-0800-1eefab05e57d:1     42 / 262
  43. 43. Global Transaction IDs initial dataset bfb912e5-f560-11e2-0800-1eefab05e57d:0 first change/transaction/writeset bfb912e5-f560-11e2-0800-1eefab05e57d:1 undefined GTID 00000000-0000-0000-0000-000000000000:-1     43 / 262
  44. 44. Global Transaction IDs : Galera vs MySQL 5.6     44 / 262
  45. 45. Global Transaction IDs : Galera vs MySQL 5.6     45 / 262
  46. 46. Global Transaction IDs : Galera vs MySQL 5.6     46 / 262
  47. 47. Global Transaction IDs : Galera vs MySQL 5.6 In MySQL 5.6 㫢㫘㫞㫚㫚㫙㫠㫙㫖㫠㫙㫞㫢㫖㫚㫚㫜㫛㫖㫢㫘㫛㫝㫖㫙㫡㫙㫙㫛㫠㫟㫛㫙㫡㫘㫝㫣㫜㫛 㫢㫘㫞㫚㫚㫙㫠㫙㫖㫠㫙㫞㫢㫖㫚㫚㫜㫛㫖㫢㫘㫛㫝㫖㫙㫡㫙㫙㫛㫠㫟㫛㫙㫡㫘㫝㫣㫜㫜 㫢㫘㫞㫚㫚㫙㫠㫙㫖㫠㫙㫞㫢㫖㫚㫚㫜㫛㫖㫢㫘㫛㫝㫖㫙㫡㫙㫙㫛㫠㫟㫛㫙㫡㫘㫝㫣㫜㫝 㫘㫔㫛㫜㫤㫔㫚㫘㫠㫡㫜㫟㫔㫝㫟㫜㫚㫜㫡㫜㫛㫔㫘 㫜㫜㫚㫚㫝㫘㫝㫠㫖㫠㫚㫚㫘㫖㫚㫚㫜㫛㫖㫢㫜㫜㫜㫖㫚㫡㫙㫘㫘㫢㫝㫛㫢㫞㫟㫛㫣㫚 㫜㫜㫚㫚㫝㫘㫝㫠㫖㫠㫚㫚㫘㫖㫚㫚㫜㫛㫖㫢㫜㫜㫜㫖㫚㫡㫙㫘㫘㫢㫝㫛㫢㫞㫟㫛㫣㫛 㫜㫜㫚㫚㫝㫘㫝㫠㫖㫠㫚㫚㫘㫖㫚㫚㫜㫛㫖㫢㫜㫜㫜㫖㫚㫡㫙㫘㫘㫢㫝㫛㫢㫞㫟㫛㫣㫜     47 / 262
  48. 48. Global Transaction IDs : Galera vs MySQL 5.6 In Galera 㫙㫝㫙㫢㫚㫛㫜㫞㫖㫝㫞㫟㫙㫖㫚㫚㫜㫛㫖㫙㫡㫙㫙㫖㫚㫜㫜㫝㫘㫙㫙㫞㫜㫞㫠㫛㫘㫣㫚㫚㫚㫡 㫙㫝㫙㫢㫚㫛㫜㫞㫖㫝㫞㫟㫙㫖㫚㫚㫜㫛㫖㫙㫡㫙㫙㫖㫚㫜㫜㫝㫘㫙㫙㫞㫜㫞㫠㫛㫘㫣㫚㫚㫚㫢 㫙㫝㫙㫢㫚㫛㫜㫞㫖㫝㫞㫟㫙㫖㫚㫚㫜㫛㫖㫙㫡㫙㫙㫖㫚㫜㫜㫝㫘㫙㫙㫞㫜㫞㫠㫛㫘㫣㫚㫚㫛㫙     48 / 262
  49. 49. Global Transaction IDs : Galera vs MySQL 5.6 In Galera 㫙㫝㫙㫢㫚㫛㫜㫞㫖㫝㫞㫟㫙㫖㫚㫚㫜㫛㫖㫙㫡㫙㫙㫖㫚㫜㫜㫝㫘㫙㫙㫞㫜㫞㫠㫛㫘㫣㫚㫚㫚㫡 㫙㫝㫙㫢㫚㫛㫜㫞㫖㫝㫞㫟㫙㫖㫚㫚㫜㫛㫖㫙㫡㫙㫙㫖㫚㫜㫜㫝㫘㫙㫙㫞㫜㫞㫠㫛㫘㫣㫚㫚㫚㫢 㫙㫝㫙㫢㫚㫛㫜㫞㫖㫝㫞㫟㫙㫖㫚㫚㫜㫛㫖㫙㫡㫙㫙㫖㫚㫜㫜㫝㫘㫙㫙㫞㫜㫞㫠㫛㫘㫣㫚㫚㫛㫙 㫘㫔㫛㫜㫤㫔㫚㫘㫠㫡㫜㫟㫔㫝㫟㫜㫚㫜㫡㫜㫛㫔㫘     49 / 262
  50. 50. Global Transaction IDs : Galera vs MySQL 5.6 In Galera 㫙㫝㫙㫢㫚㫛㫜㫞㫖㫝㫞㫟㫙㫖㫚㫚㫜㫛㫖㫙㫡㫙㫙㫖㫚㫜㫜㫝㫘㫙㫙㫞㫜㫞㫠㫛㫘㫣㫚㫚㫚㫡 㫙㫝㫙㫢㫚㫛㫜㫞㫖㫝㫞㫟㫙㫖㫚㫚㫜㫛㫖㫙㫡㫙㫙㫖㫚㫜㫜㫝㫘㫙㫙㫞㫜㫞㫠㫛㫘㫣㫚㫚㫚㫢 㫙㫝㫙㫢㫚㫛㫜㫞㫖㫝㫞㫟㫙㫖㫚㫚㫜㫛㫖㫙㫡㫙㫙㫖㫚㫜㫜㫝㫘㫙㫙㫞㫜㫞㫠㫛㫘㫣㫚㫚㫛㫙 㫘㫔㫛㫜㫤㫔㫚㫘㫠㫡㫜㫟㫔㫝㫟㫜㫚㫜㫡㫜㫛㫔㫘 㫙㫝㫙㫢㫚㫛㫜㫞㫖㫝㫞㫟㫙㫖㫚㫚㫜㫛㫖㫙㫡㫙㫙㫖㫚㫜㫜㫝㫘㫙㫙㫞㫜㫞㫠㫛㫘㫣㫚㫚㫛㫚 㫙㫝㫙㫢㫚㫛㫜㫞㫖㫝㫞㫟㫙㫖㫚㫚㫜㫛㫖㫙㫡㫙㫙㫖㫚㫜㫜㫝㫘㫙㫙㫞㫜㫞㫠㫛㫘㫣㫚㫚㫛㫛 㫙㫝㫙㫢㫚㫛㫜㫞㫖㫝㫞㫟㫙㫖㫚㫚㫜㫛㫖㫙㫡㫙㫙㫖㫚㫜㫜㫝㫘㫙㫙㫞㫜㫞㫠㫛㫘㫣㫚㫚㫛㫜     50 / 262
  51. 51. GTID Assignment UUID The UUID section, 128-bit, is generated during bootstrapping to identify the cluster. Generated by mixing the timer value and pseudo-random numbers (depending primarily from the timer and PID), but, currently, there is no use of NIC's MAC-address although in theory it may be done that way. More info: https://gist.github.com/lefred/88a2cec88d03854d9934     51 / 262
  52. 52. GTID Assignment (3) SEQNO The seqno, 64-bit, is incremented only when the transaction passes certification and is ready for commit. For the curious, the algorithm used to derive sequence number is Totem Single-ring Ordering protocol. Before that, there is already communication between the nodes, group communication is used to define a group-channel id which is a locally maintained counter by each node in sync with the group.     52 / 262
  53. 53. Serialization of writesets     53 / 262
  54. 54. Serialization of writesets     54 / 262
  55. 55. Serialization of writesets     55 / 262
  56. 56. Serialization of writesets     56 / 262
  57. 57. Serialization of writesets     57 / 262
  58. 58. Serialization of writesets     58 / 262
  59. 59. Serialization of writesets     59 / 262
  60. 60. Serialization of writesets     60 / 262
  61. 61. Serialization of writesets     61 / 262
  62. 62. Serialization of writesets     62 / 262
  63. 63. Serialization of writesets     63 / 262
  64. 64. Serialization of writesets     64 / 262
  65. 65. Serialization of writesets     65 / 262
  66. 66. Serialization of writesets     66 / 262
  67. 67. Roles We have 4 distinct roles in Galera: 2 for replication 2 for state transfer     67 / 262
  68. 68. Replication Roles Within the cluster, all nodes are equal     68 / 262
  69. 69. Replication Roles Within the cluster, all nodes are equal 'master/donor node'     69 / 262
  70. 70. Replication Roles Within the cluster, all nodes are equal 'master/donor node' The node a given transaction was written and committed on.     70 / 262
  71. 71. Replication Roles Within the cluster, all nodes are equal 'master/donor node' The node a given transaction was written and committed on. 'slave/joiner node'     71 / 262
  72. 72. Replication Roles Within the cluster, all nodes are equal 'master/donor node' The node a given transaction was written and committed on. 'slave/joiner node' The node that received the given transaction via Galera replication.     72 / 262
  73. 73. Replication Roles Within the cluster, all nodes are equal 'master/donor node' The node a given transaction was written and committed on. 'slave/joiner node' The node that received the given transaction via Galera replication. The terms master and slave in this context are only relevant for a given transaction     73 / 262
  74. 74. Replication Roles Within the cluster, all nodes are equal 'master/donor node' The node a given transaction was written and committed on. 'slave/joiner node' The node that received the given transaction via Galera replication. The terms master and slave in this context are only relevant for a given transaction Writeset: Galera’s term for a transaction. One or more RBR row changes     74 / 262
  75. 75. State Transfer Roles New nodes joining an existing cluster get provisioned automatically     75 / 262
  76. 76. State Transfer Roles New nodes joining an existing cluster get provisioned automatically Joiner = Mew node     76 / 262
  77. 77. State Transfer Roles New nodes joining an existing cluster get provisioned automatically Joiner = Mew node Donor = Node giving a copy of the datadir     77 / 262
  78. 78. State Transfer Roles New nodes joining an existing cluster get provisioned automatically Joiner = Mew node Donor = Node giving a copy of the datadir State Snapshot transfer     78 / 262
  79. 79. State Transfer Roles New nodes joining an existing cluster get provisioned automatically Joiner = Mew node Donor = Node giving a copy of the datadir State Snapshot transfer Full backup of Donor to Joiner     79 / 262
  80. 80. State Transfer Roles New nodes joining an existing cluster get provisioned automatically Joiner = Mew node Donor = Node giving a copy of the datadir State Snapshot transfer Full backup of Donor to Joiner Incremental Snapshot transfer     80 / 262
  81. 81. State Transfer Roles New nodes joining an existing cluster get provisioned automatically Joiner = Mew node Donor = Node giving a copy of the datadir State Snapshot transfer Full backup of Donor to Joiner Incremental Snapshot transfer Only changes since node left cluster     81 / 262
  82. 82. Writeset RBR payload (black box to Galera)     82 / 262
  83. 83. Writeset RBR payload (black box to Galera) Replication keys (generated by master/donor node)     83 / 262
  84. 84. Writeset RBR payload (black box to Galera) Replication keys (generated by master/donor node) Primary keys     84 / 262
  85. 85. Writeset RBR payload (black box to Galera) Replication keys (generated by master/donor node) Primary keys Unique keys     85 / 262
  86. 86. Writeset RBR payload (black box to Galera) Replication keys (generated by master/donor node) Primary keys Unique keys Foreign Keys     86 / 262
  87. 87. Writeset RBR payload (black box to Galera) Replication keys (generated by master/donor node) Primary keys Unique keys Foreign Keys Table names     87 / 262
  88. 88. Writeset RBR payload (black box to Galera) Replication keys (generated by master/donor node) Primary keys Unique keys Foreign Keys Table names Schema names     88 / 262
  89. 89. Writeset RBR payload (black box to Galera) Replication keys (generated by master/donor node) Primary keys Unique keys Foreign Keys Table names Schema names Keys are what make certification possible     89 / 262
  90. 90. Replication ? It consists in 4 operations: Apply Replication Certification Commit The order differs with the node's role     90 / 262
  91. 91. Replication Order on Master/Donor 1. Apply 2. Replication 3. Certification 4. Commit     91 / 262
  92. 92. Replication Order on Slave/Joiner 1. Replication (from master/donor) 2. Certification 3. Apply 4. Commit     92 / 262
  93. 93. Certification Can this writeset be applied?     93 / 262
  94. 94. Certification Can this writeset be applied? Based on unapplied earlier transactions on master/donor     94 / 262
  95. 95. Certification Can this writeset be applied? Based on unapplied earlier transactions on master/donor Such conflicts must come from other nodes     95 / 262
  96. 96. Certification Can this writeset be applied? Based on unapplied earlier transactions on master/donor Such conflicts must come from other nodes Happens on every node     96 / 262
  97. 97. Certification Can this writeset be applied? Based on unapplied earlier transactions on master/donor Such conflicts must come from other nodes Happens on every node Should be deterministic on every node     97 / 262
  98. 98. Certification Can this writeset be applied? Based on unapplied earlier transactions on master/donor Such conflicts must come from other nodes Happens on every node Should be deterministic on every node Results are not reported to the cluster     98 / 262
  99. 99. Certification Can this writeset be applied? Based on unapplied earlier transactions on master/donor Such conflicts must come from other nodes Happens on every node Should be deterministic on every node Results are not reported to the cluster Pass: enter apply queue (commit success on master/donor)     99 / 262
  100. 100. Certification Can this writeset be applied? Based on unapplied earlier transactions on master/donor Such conflicts must come from other nodes Happens on every node Should be deterministic on every node Results are not reported to the cluster Pass: enter apply queue (commit success on master/donor) Fail: drop transaction (or return deadlock on master/donor)     100 / 262
  101. 101. Certification Can this writeset be applied? Based on unapplied earlier transactions on master/donor Such conflicts must come from other nodes Happens on every node Should be deterministic on every node Results are not reported to the cluster Pass: enter apply queue (commit success on master/donor) Fail: drop transaction (or return deadlock on master/donor) Serialized by group communication sequence (and GTID will be synchronized following the same sequence)     101 / 262
  102. 102. Certification Can this writeset be applied? Based on unapplied earlier transactions on master/donor Such conflicts must come from other nodes Happens on every node Should be deterministic on every node Results are not reported to the cluster Pass: enter apply queue (commit success on master/donor) Fail: drop transaction (or return deadlock on master/donor) Serialized by group communication sequence (and GTID will be synchronized following the same sequence) Cost based on # of keys or # of rows     102 / 262
  103. 103. Apply Apply is done on slave nodes after certification     103 / 262
  104. 104. Apply Apply is done on slave nodes after certification Can be parallelized     104 / 262
  105. 105. Apply Apply is done on slave nodes after certification Can be parallelized if wsrep_slave_threads > 1     105 / 262
  106. 106. Apply Apply is done on slave nodes after certification Can be parallelized if wsrep_slave_threads > 1 if there are no other writesets with conflicting keys also being applied     106 / 262
  107. 107. Apply Apply is done on slave nodes after certification Can be parallelized if wsrep_slave_threads > 1 if there are no other writesets with conflicting keys also being applied Cost: size of transaction     107 / 262
  108. 108. Apply Apply is done on slave nodes after certification Can be parallelized if wsrep_slave_threads > 1 if there are no other writesets with conflicting keys also being applied Cost: size of transaction Generates brute force aborts on local node for conflicts     108 / 262
  109. 109. Commit Final local InnoDB commit     109 / 262
  110. 110. Commit Final local InnoDB commit i.e., innodb_flush_log_at_trx_commit     110 / 262
  111. 111. Commit Final local InnoDB commit i.e., innodb_flush_log_at_trx_commit GTID gets generated     111 / 262
  112. 112. Commit Final local InnoDB commit i.e., innodb_flush_log_at_trx_commit GTID gets generated Done by applier threads on slaves/joiners     112 / 262
  113. 113. Commit Final local InnoDB commit i.e., innodb_flush_log_at_trx_commit GTID gets generated Done by applier threads on slaves/joiners Done by client thread on master/donor     113 / 262
  114. 114. Commit Final local InnoDB commit i.e., innodb_flush_log_at_trx_commit GTID gets generated Done by applier threads on slaves/joiners Done by client thread on master/donor innodb_flush_log_at_trx_commit=1 not required generally for PXC!     114 / 262
  115. 115. Galera Replication (autocommit)     115 / 262
  116. 116. Galera Replication (autocommit)     116 / 262
  117. 117. Galera Replication (autocommit)     117 / 262
  118. 118. Galera Replication (autocommit)     118 / 262
  119. 119. Galera Replication (autocommit)     119 / 262
  120. 120. Galera Replication (autocommit)     120 / 262
  121. 121. Galera Replication (autocommit)     121 / 262
  122. 122. Galera Replication (full transaction)     122 / 262
  123. 123. Galera Replication (full transaction)     123 / 262
  124. 124. Galera Replication (full transaction)     124 / 262
  125. 125. Galera Replication (full transaction)     125 / 262
  126. 126. Galera Replication (full transaction)     126 / 262
  127. 127. Galera Replication (full transaction)     127 / 262
  128. 128. Galera Replication (full transaction)     128 / 262
  129. 129. Galera Replication (full transaction)     129 / 262
  130. 130. Galera Replication (full transaction)     130 / 262
  131. 131. Optimistic Locking Traditional locking     131 / 262
  132. 132. Optimistic Locking Traditional locking     132 / 262
  133. 133. Optimistic Locking Traditional locking     133 / 262
  134. 134. Optimistic Locking Traditional locking     134 / 262
  135. 135. Optimistic Locking Traditional locking     135 / 262
  136. 136. Optimistic Locking Traditional locking     136 / 262
  137. 137. Optimistic Locking Optimistic Locking     137 / 262
  138. 138. Optimistic Locking Optimistic Locking     138 / 262
  139. 139. Optimistic Locking Optimistic Locking     139 / 262
  140. 140. Optimistic Locking Optimistic Locking     140 / 262
  141. 141. Optimistic Locking Optimistic Locking     141 / 262
  142. 142. Optimistic Locking Optimistic Locking     142 / 262
  143. 143. Optimistic Locking Optimistic Locking     143 / 262
  144. 144. Certification Failure     144 / 262
  145. 145. Certification Failure Trx1 is open on 㫙㫜㫛㫜㫚 Trx2 is open on 㫙㫜㫛㫜㫛     145 / 262
  146. 146. Certification Failure 㫙㫜㫛㫜㫚 gets 㫖㫚㫘㫘㫗㫗     146 / 262
  147. 147. Certification Failure Synchronous replication     147 / 262
  148. 148. Certification Failure Certification tests run in isolation on each node     148 / 262
  149. 149. Certification Failure Certification tests: asynchronous     149 / 262
  150. 150. Certification Failure Synchronous replication deterministic     150 / 262
  151. 151. Certification Failure Certification succeeds     151 / 262
  152. 152. Certification Failure Certified transaction goes to the apply queue     152 / 262
  153. 153. Certification Failure     153 / 262
  154. 154. Certification Failure On 㫙㫜㫛㫜㫚, a successful cert test, means an actual 㫚㫜㫚㫚㫠㫡     154 / 262
  155. 155. Certification Failure and transactions in the apply queue (㫙㫜㫛㫜㫛 & 㫙㫜㫛㫜㫜) are executed asynchronously     155 / 262
  156. 156. Certification Failure On 㫛㫜㫛㫜㫛 we commit the transaction     156 / 262
  157. 157. Certification Failure Synchronous replication     157 / 262
  158. 158. Certification Failure     158 / 262
  159. 159. Certification Failure     159 / 262
  160. 160. Certification Failure     160 / 262
  161. 161. Certification Failure     161 / 262
  162. 162. Certification Failure     162 / 262
  163. 163. Certification Failure     163 / 262
  164. 164. Certification Failure     164 / 262
  165. 165. Brute Force Abort (bfa)     165 / 262
  166. 166. Brute Force Abort (bfa)     166 / 262
  167. 167. Brute Force Abort (bfa)     167 / 262
  168. 168. Brute Force Abort (bfa)     168 / 262
  169. 169. Brute Force Abort (bfa)     169 / 262
  170. 170. Brute Force Abort (bfa)     170 / 262
  171. 171. Brute Force Abort (bfa)     171 / 262
  172. 172. Local Certification Failure (lcf)     172 / 262
  173. 173. Local Certification Failure (lcf)     173 / 262
  174. 174. Local Certification Failure (lcf)     174 / 262
  175. 175. Local Certification Failure (lcf)     175 / 262
  176. 176. Local Certification Failure (lcf)     176 / 262
  177. 177. Local Certification Failure (lcf)     177 / 262
  178. 178. Local Certification Failure (lcf)     178 / 262
  179. 179. Local Certification Failure (lcf)     179 / 262
  180. 180. Local Certification Failure (lcf)     180 / 262
  181. 181. Local Certification Failure (lcf)     181 / 262
  182. 182. Local Certification Failure (lcf)     182 / 262
  183. 183. Local Certification Failure (lcf)     183 / 262
  184. 184. Certification Errors: summary     184 / 262
  185. 185. Certification Errors: summary     185 / 262
  186. 186. Certification Errors: summary     186 / 262
  187. 187. Flow Control Ability of any node in the cluster to ask the rest of the nodes to pause writes while it catches up     187 / 262
  188. 188. Flow Control Ability of any node in the cluster to ask the rest of the nodes to pause writes while it catches up Feedback mechanism for replication process     188 / 262
  189. 189. Flow Control Ability of any node in the cluster to ask the rest of the nodes to pause writes while it catches up Feedback mechanism for replication process ONLY caused by 㫤㫠㫟㫜㫝㫘㫙㫜㫚㫘㫙㫘㫟㫜㫚㫣㫘㫞㫢㫜㫢㫜 exceeding a node's 㫝㫚㫘㫙㫠㫚㫠㫡     189 / 262
  190. 190. Flow Control Ability of any node in the cluster to ask the rest of the nodes to pause writes while it catches up Feedback mechanism for replication process ONLY caused by 㫤㫠㫟㫜㫝㫘㫙㫜㫚㫘㫙㫘㫟㫜㫚㫣㫘㫞㫢㫜㫢㫜 exceeding a node's 㫝㫚㫘㫙㫠㫚㫠㫡 CAN pause the entire cluster and look like a cluster stall !     190 / 262
  191. 191. Tuning Flow Control Too low:     191 / 262
  192. 192. Tuning Flow Control Too low: frequent FC from any and all nodes in the cluster     192 / 262
  193. 193. Tuning Flow Control Too low: frequent FC from any and all nodes in the cluster Too high:     193 / 262
  194. 194. Tuning Flow Control Too low: frequent FC from any and all nodes in the cluster Too high: increase in replication conflicts in multi-node writings     194 / 262
  195. 195. Tuning Flow Control Too low: frequent FC from any and all nodes in the cluster Too high: increase in replication conflicts in multi-node writings increase in apply lag on nodes     195 / 262
  196. 196. Tuning Flow Control Too low: frequent FC from any and all nodes in the cluster Too high: increase in replication conflicts in multi-node writings increase in apply lag on nodes increase in commit lag on nodes     196 / 262
  197. 197. Tuning Flow Control Too low: frequent FC from any and all nodes in the cluster Too high: increase in replication conflicts in multi-node writings increase in apply lag on nodes increase in commit lag on nodes One node with FC issues     197 / 262
  198. 198. Tuning Flow Control Too low: frequent FC from any and all nodes in the cluster Too high: increase in replication conflicts in multi-node writings increase in apply lag on nodes increase in commit lag on nodes One node with FC issues deal with that node -bad hardware? too slow ?     198 / 262
  199. 199. Flow Control     199 / 262
  200. 200. Flow Control     200 / 262
  201. 201. Flow Control     201 / 262
  202. 202. Flow Control     202 / 262
  203. 203. Flow Control     203 / 262
  204. 204. Flow Control     204 / 262
  205. 205. Flow Control     205 / 262
  206. 206. Flow Control     206 / 262
  207. 207. Flow Control     207 / 262
  208. 208. Flow Control     208 / 262
  209. 209. Flow Control     209 / 262
  210. 210. Flow Control     210 / 262
  211. 211. Flow Control     211 / 262
  212. 212. Flow Control     212 / 262
  213. 213. Flow Control     213 / 262
  214. 214. Flow Control     214 / 262
  215. 215. Flow Control     215 / 262
  216. 216. Flow Control     216 / 262
  217. 217. Flow Control     217 / 262
  218. 218. Flow Control     218 / 262
  219. 219. Flow Control     219 / 262
  220. 220. State Transfer There are two types of State Transfer in Galera:     220 / 262
  221. 221. State Transfer There are two types of State Transfer in Galera: 1. SST (Snapshot State Transfer): full data copy     221 / 262
  222. 222. State Transfer There are two types of State Transfer in Galera: 1. SST (Snapshot State Transfer): full data copy rsync mysqldump xtrabackup     222 / 262
  223. 223. State Transfer There are two types of State Transfer in Galera: 1. SST (Snapshot State Transfer): full data copy rsync mysqldump xtrabackup 2. IST (Incremental State Transfer): only copy the missing events     223 / 262
  224. 224. State Transfer There are two types of State Transfer in Galera: 1. SST (Snapshot State Transfer): full data copy rsync mysqldump xtrabackup 2. IST (Incremental State Transfer): only copy the missing events It's always better to try to avoid SST!     224 / 262
  225. 225. State Transfer There are two types of State Transfer in Galera: 1. SST (Snapshot State Transfer): full data copy rsync mysqldump xtrabackup 2. IST (Incremental State Transfer): only copy the missing events It's always better to try to avoid SST! 㫤㫠㫟㫜㫝㫘㫠㫠㫡㫘㫛㫜㫛㫜㫟 can be used to specify the donor     225 / 262
  226. 226. State Transfer: grastate.dat When MySQL starts it checks 㫞㫟㫘㫠㫡㫘㫡㫜㫗㫛㫘㫡 file (in datadir)     226 / 262
  227. 227. State Transfer: grastate.dat When MySQL starts it checks 㫞㫟㫘㫠㫡㫘㫡㫜㫗㫛㫘㫡 file (in datadir) This file placeholds GTID between MySQL restarts     227 / 262
  228. 228. State Transfer: grastate.dat When MySQL starts it checks 㫞㫟㫘㫠㫡㫘㫡㫜㫗㫛㫘㫡 file (in datadir) This file placeholds GTID between MySQL restarts Contains UUID and Seqno     228 / 262
  229. 229. State Transfer: grastate.dat When MySQL starts it checks 㫞㫟㫘㫠㫡㫘㫡㫜㫗㫛㫘㫡 file (in datadir) This file placeholds GTID between MySQL restarts Contains UUID and Seqno 㫕㫔㫗㫖㫗㫖㫗㫖㫔㫠㫘㫣㫜㫛㫔㫠㫡㫘㫡㫜 㫣㫜㫟㫠㫠㫜㫛㫣㫔㫛㫗㫚 㫢㫢㫠㫛㫣㫔㫔㫔㫔㫛㫙㫛㫞㫘㫚㫙㫝㫖㫞㫚㫘㫛㫖㫚㫚㫜㫞㫖㫡㫙㫢㫘㫖㫜㫛㫠㫢㫘㫟㫡㫛㫞㫝㫙㫠 㫠㫜㫞㫛㫜㫣㫔㫔㫔㫚㫝㫙㫙㫚 㫚㫜㫟㫡㫘㫠㫛㫛㫜㫥㫣     229 / 262
  230. 230. State Transfer: grastate.dat When MySQL starts it checks 㫞㫟㫘㫠㫡㫘㫡㫜㫗㫛㫘㫡 file (in datadir) This file placeholds GTID between MySQL restarts Contains UUID and Seqno 㫕㫔㫗㫖㫗㫖㫗㫖㫔㫠㫘㫣㫜㫛㫔㫠㫡㫘㫡㫜 㫣㫜㫟㫠㫠㫜㫛㫣㫔㫛㫗㫚 㫢㫢㫠㫛㫣㫔㫔㫔㫔㫛㫙㫛㫞㫘㫚㫙㫝㫖㫞㫚㫘㫛㫖㫚㫚㫜㫞㫖㫡㫙㫢㫘㫖㫜㫛㫠㫢㫘㫟㫡㫛㫞㫝㫙㫠 㫠㫜㫞㫛㫜㫣㫔㫔㫔㫚㫝㫙㫙㫚 㫚㫜㫟㫡㫘㫠㫛㫛㫜㫥㫣 node shutdown cleanly     230 / 262
  231. 231. grastate.dat - example 㫕㫔㫗㫖㫗㫖㫗㫖㫔㫠㫘㫣㫜㫛㫔㫠㫡㫘㫡㫜 㫣㫜㫟㫠㫠㫜㫛㫣㫔㫛㫗㫚 㫢㫢㫠㫛㫣㫔㫔㫔㫔㫛㫙㫛㫞㫘㫚㫙㫝㫖㫞㫚㫘㫛㫖㫚㫚㫜㫞㫖㫡㫙㫢㫘㫖㫜㫛㫠㫢㫘㫟㫡㫛㫞㫝㫙㫠 㫠㫜㫞㫛㫜㫣㫔㫔㫔㫖㫚 㫚㫜㫟㫡㫘㫠㫛㫛㫜㫥㫣     231 / 262
  232. 232. grastate.dat - example 㫕㫔㫗㫖㫗㫖㫗㫖㫔㫠㫘㫣㫜㫛㫔㫠㫡㫘㫡㫜 㫣㫜㫟㫠㫠㫜㫛㫣㫔㫛㫗㫚 㫢㫢㫠㫛㫣㫔㫔㫔㫔㫛㫙㫛㫞㫘㫚㫙㫝㫖㫞㫚㫘㫛㫖㫚㫚㫜㫞㫖㫡㫙㫢㫘㫖㫜㫛㫠㫢㫘㫟㫡㫛㫞㫝㫙㫠 㫠㫜㫞㫛㫜㫣㫔㫔㫔㫖㫚 㫚㫜㫟㫡㫘㫠㫛㫛㫜㫥㫣 node is running or did not shutdown cleanly     232 / 262
  233. 233. grastate.dat - example 㫕㫔㫗㫖㫗㫖㫗㫖㫔㫠㫘㫣㫜㫛㫔㫠㫡㫘㫡㫜 㫣㫜㫟㫠㫠㫜㫛㫣㫔㫛㫗㫚 㫢㫢㫠㫛㫣㫔㫔㫔㫔㫙㫙㫙㫙㫙㫙㫙㫙㫖㫙㫙㫙㫙㫖㫙㫙㫙㫙㫙㫙㫙㫙㫙㫖㫙㫙㫙㫙㫙㫙㫙㫙㫙㫙㫙㫙㫔㫔㫔㫔㫔㫔㫔㫔㫔 㫠㫜㫞㫛㫜㫣㫔㫔㫔㫖㫚 㫚㫜㫟㫡㫘㫠㫛㫛㫜㫥㫣     233 / 262
  234. 234. grastate.dat - example 㫕㫔㫗㫖㫗㫖㫗㫖㫔㫠㫘㫣㫜㫛㫔㫠㫡㫘㫡㫜 㫣㫜㫟㫠㫠㫜㫛㫣㫔㫛㫗㫚 㫢㫢㫠㫛㫣㫔㫔㫔㫔㫙㫙㫙㫙㫙㫙㫙㫙㫖㫙㫙㫙㫙㫖㫙㫙㫙㫙㫙㫙㫙㫙㫙㫖㫙㫙㫙㫙㫙㫙㫙㫙㫙㫙㫙㫙㫔㫔㫔㫔㫔㫔㫔㫔㫔 㫠㫜㫞㫛㫜㫣㫔㫔㫔㫖㫚 㫚㫜㫟㫡㫘㫠㫛㫛㫜㫥㫣 node aborted, SST on next restart     234 / 262
  235. 235. State Transfer: IST When a node starts, it knowns the UUID of the cluster it belonged and the last sequence number it applied. So, it sends that position to the other members of the cluster and if a node can send the next events (ws/trx), IST will be performed, if none, then SST will be triggered.     235 / 262
  236. 236. Galera Cache Those events are stored on the 㫞㫘㫙㫜㫟㫘㫗㫚㫘㫚㫟㫜 file.     236 / 262
  237. 237. Galera Cache Those events are stored on the 㫞㫘㫙㫜㫟㫘㫗㫚㫘㫚㫟㫜 file. preallocated file with a specific size     237 / 262
  238. 238. Galera Cache Those events are stored on the 㫞㫘㫙㫜㫟㫘㫗㫚㫘㫚㫟㫜 file. preallocated file with a specific size used to store the writesets in circular buffer style     238 / 262
  239. 239. Galera Cache Those events are stored on the 㫞㫘㫙㫜㫟㫘㫗㫚㫘㫚㫟㫜 file. preallocated file with a specific size used to store the writesets in circular buffer style default size is 128M     239 / 262
  240. 240. Galera Cache Those events are stored on the 㫞㫘㫙㫜㫟㫘㫗㫚㫘㫚㫟㫜 file. preallocated file with a specific size used to store the writesets in circular buffer style default size is 128M can be increase via provider option 㫞㫚㫘㫚㫟㫜㫗㫠㫠㫚㫜     240 / 262
  241. 241. Galera Cache Those events are stored on the 㫞㫘㫙㫜㫟㫘㫗㫚㫘㫚㫟㫜 file. preallocated file with a specific size used to store the writesets in circular buffer style default size is 128M can be increase via provider option 㫞㫚㫘㫚㫟㫜㫗㫠㫠㫚㫜 㫤㫠㫟㫜㫝㫘㫝㫟㫜㫣㫠㫛㫜㫟㫘㫜㫝㫡㫠㫜㫛㫠㫔㫖㫔㫔㫞㫚㫘㫚㫟㫜㫗㫠㫠㫚㫜㫖㫚㫗㫔     241 / 262
  242. 242. Galera Cache Those events are stored on the 㫞㫘㫙㫜㫟㫘㫗㫚㫘㫚㫟㫜 file. preallocated file with a specific size used to store the writesets in circular buffer style default size is 128M can be increase via provider option 㫞㫚㫘㫚㫟㫜㫗㫠㫠㫚㫜 㫤㫠㫟㫜㫝㫘㫝㫟㫜㫣㫠㫛㫜㫟㫘㫜㫝㫡㫠㫜㫛㫠㫔㫖㫔㫔㫞㫚㫘㫚㫟㫜㫗㫠㫠㫚㫜㫖㫚㫗㫔 Galare Cache is mmaped (I/O buffered to memory)     242 / 262
  243. 243. Galera Cache Those events are stored on the 㫞㫘㫙㫜㫟㫘㫗㫚㫘㫚㫟㫜 file. preallocated file with a specific size used to store the writesets in circular buffer style default size is 128M can be increase via provider option 㫞㫚㫘㫚㫟㫜㫗㫠㫠㫚㫜 㫤㫠㫟㫜㫝㫘㫝㫟㫜㫣㫠㫛㫜㫟㫘㫜㫝㫡㫠㫜㫛㫠㫔㫖㫔㫔㫞㫚㫘㫚㫟㫜㫗㫠㫠㫚㫜㫖㫚㫗㫔 Galare Cache is mmaped (I/O buffered to memory) 㫤㫠㫟㫜㫝㫘㫙㫜㫚㫘㫙㫘㫚㫘㫚㫟㫜㫛㫘㫛㫜㫤㫛㫡㫜 provide the first seqno present in the cache for that node     243 / 262
  244. 244. Galera Cache & IST     244 / 262
  245. 245. Galera Cache & IST     245 / 262
  246. 246. Galera Cache & IST     246 / 262
  247. 247. Galera Cache & IST     247 / 262
  248. 248. Galera Cache & IST     248 / 262
  249. 249. Galera Cache & IST     249 / 262
  250. 250. Galera Cache & IST     250 / 262
  251. 251. Galera Cache & IST     251 / 262
  252. 252. Galera Cache & IST     252 / 262
  253. 253. Galera Cache & IST     253 / 262
  254. 254. Galera Cache & IST     254 / 262
  255. 255. Galera Cache & IST     255 / 262
  256. 256. Galera Cache & IST     256 / 262
  257. 257. Galera Cache & IST     257 / 262
  258. 258. Galera Cache & IST     258 / 262
  259. 259. Galera Cache & IST     259 / 262
  260. 260. Galera Cache & IST     260 / 262
  261. 261. Galera Cache & IST     261 / 262
  262. 262. Thank you !     262 / 262

×