Azure BCDR in Action: From Setup to Failover and Back

  1. Azure BCDR in Action: From Setup to Failover and Back. Yung Chou, Cloud Solution Architect, US West
  2. References • Microsoft Cloud Workshop • Building a resilient IaaS architecture • Selected Readings • Multi-tier web application built for HA/DR • Tutorial: AG in multiple subnets - SQL Server on Azure VMs • Azure Application Architecture Fundamentals • Microsoft Azure Well-Architected Framework • Security documentation • General questions about the Azure Site Recovery service (3/17/2023)
  3. Agenda • The app • Architecture and Deployment • SQL AlwaysOn • DR Approach • DR Settings • SQL Back End • Automation and DR Plan • IIS Front End • Failover Test. This delivery covers most of Exercise 2 and beyond of the workshop.
  4. The following slides are: • Taken from multiple deployments of the workshop • Intended as additional information to facilitate your workshop deployment by providing a reference for the context and expected results. These slides are: • NOT a replacement for the workshop instructions • Most valuable when referenced alongside the workshop instructions for the relevant exercises and tasks. Note that although slides taken from different deployments may show resource names that are inconsistent from one section to another, the process flows and expected resource states are depicted correctly.
  6. Contoso Insurance App (diagram labels: West US3, East US, ContosoWebLBPrimary, ContosoWebLBSecondary, Contoso Front Door) • Front Door pointing to Contoso origin • External LB • HTTP only on port 80 • Source and DR sites • Front Door routes • Web/IIS Tier • Zone redundancy • Backup with CRR • Internal LB • Port 1443 from IIS Tier only • Source and DR sites • Data/SQL Tier • Three-node failover cluster • Cloud Witness • Zone redundancy • AlwaysOn with listener on 1443 • One vnet with a DC in each zone • Vnet peering between westus3 and eastus • RSV • Backup RSV in westus3 • Site Recovery RSV in eastus
  7. Disaster Recovery Approach (Tier: DR Strategy) • Web: Failover using Azure Site Recovery • SQL: Secondary SQL AlwaysOn Availability Group replica with asynchronous replication. Failover steps are integrated into Azure Site Recovery using Azure Automation. • AD: Active-active domain controllers
  8. Create a Cloud Witness for SQL Failover Cluster
  9. Add SQLVMs to Load Balancer Backend Pool
  10. Create a SQL Failover Cluster (in SQLVM1): New-Cluster -Name AOGCLUSTER -Node SQLVM1,SQLVM2 -StaticAddress 10.0.2.99
  11. Configure Quorum
  12. Verify the Quorum
  13. Enable SQLVM1 AlwaysOn for High Availability
  15. Enable SQLVM2 AlwaysOn for High Availability
  16. Configure SQL AlwaysOn (with SQLVM1)
  22. Add WebVMs to Load Balancer Backend Pool
  23. RDP into WebVM1
  26. RDP into WebVM2
  27. Contoso Insurance App • External LB • HTTP only on port 80 • Web/IIS Tier • Zone redundancy • Internal LB • Port 1443 from IIS Tier only • Data/SQL Tier • Two-node failover cluster • Cloud Witness • Zone redundancy • AlwaysOn with listener on 1443 • One vnet with one DC in each zone (diagram label: HTTP Requests)
  28. Contoso Insurance App Deployment Highlights • Contoso.ins • Source in westus3 • contoso-vnet-westus3 • Contoso.ins.DR-Site • DR site in eastus • contoso-vnet-eastus • Contoso.ins.RSV • Backup: contoso-RSV-westus3 • IIS1, IIS2, SQL1 and SQL2 backup • DR: contoso-RSV-eastus • Automation account • Failover runbooks
  38. Create an Automation (RunAs) Account for Executing DR Runbooks
  39. DR Runbooks: MCW-Building-a-resilient-IaaS-architecture/studentfiles.zip at master · benstegink/MCW-Building-a-resilient-IaaS-architecture (github.com)
  40. Import SQL DR Runbooks
  43. Import Web DR Runbook
  45. Repeat the steps to import and publish both runbooks.
  46. Add Automation Variables
      {
        "PrimarySiteRG": "contoso-insurance-w3",
        "PrimarySiteSQLVM1Name": "SQLVM1",
        "PrimarySiteSQLVM2Name": "SQLVM2",
        "PrimarySiteSQLPath": "SQLSERVER:\\Sql\\SQLVM1\\DEFAULT\\AvailabilityGroups\\sqlAO",
        "PrimarySiteVNetName": "contoso-insurance-hub-w3",
        "PrimarySiteWebSubnetName": "Apps",
        "PrimarySiteWebLBName": "ContosoWebLBPrimary",
        "SecondarySiteRG": "contoso-insurance-east",
        "SecondarySiteSQLVMName": "SQLVM3",
        "SecondarySiteSQLPath": "SQLSERVER:\\Sql\\SQLVM3\\DEFAULT\\AvailabilityGroups\\sqlAO",
        "SecondarySiteVNetName": "contoso-spoke-east",
        "SecondarySiteWebSubnetName": "Apps",
        "SecondarySiteWebLBName": "ContosoWebLBSecondary"
      }
  48. Extend the SQL AlwaysOn Availability Group Created Earlier to Include SQLVM3 as an Asynchronous Replica
      1. In the Azure portal, add SQLVM3 to the load-balancer backend pool in the DR site.
      2. In SQLVM1, add SQLVM3 to the existing Windows Server Failover Cluster.
      3. In SQLVM3, enable AlwaysOn and set the domain login credentials.
      4. In SQLVM1: • update the Availability Group Listener to include the SQLVM3 IP address • add SQLVM3 as an asynchronous replica in the existing Always On Availability Group.
      5. In SQLVM1, run the PowerShell script to update the failover cluster with the Listener IP addresses.
  49. Add SQLVM3 to DR SQL LB Back End Pool
  50. Add SQLVM3 to the Failover Cluster • Restart the AD domain controllers as needed to ensure DNS entries are current • RDP into SQLVM1 and run: Add-ClusterNode -Name SQLVM3
  51. Enable SQL AlwaysOn on SQLVM3
  52. Back to SQLVM1 to Add SQLVM3 as a Replica
  56. Create Backup and DR Recovery Services Vaults
  65. Verify DR Readiness
  160. Failover/Failback Routine (Step: Description. Documentation reference, where given)
      • Test: Test the ASR configuration routinely and often for failing over from the source site to the DR one to ensure it works as expected. Reference: Run a test failover (disaster recovery drill) to ASR
      • Failover: Initiate failover to switch over to the replicated environment for DR or planned maintenance. Reference: About failover and failback in ASR - Modernized - ASR
      • Commit: Commit the changes made during the failover process to the replicated environment to ensure it's up-to-date. Reference: Run a failover during disaster recovery with ASR
      • Re-protect: Re-protect the production environment to ensure it's ready for the next failover. Reference: Reprotect Azure VMs to the primary region with ASR
      • Re-test: Re-test the ASR configuration to ensure it works as expected after re-protecting.
      • Fall back: Fall back to the production environment if the failover was initiated for planned maintenance or testing.
      • Re-commit: Re-commit the changes made during the failover process to the production environment to ensure it's up-to-date.
      • Re-protect: Re-protect the replicated environment to ensure it's ready for the next failover after falling back.
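
The failover/failback routine on the last slide is an ordered cycle, and skipping a step (for example, committing before failing over) is the kind of mistake a drill is meant to catch. The following is a minimal Python sketch of that ordering only. It calls no Azure or ASR API, and the names `ROUTINE`, `DrRoutineTracker`, and `run_full_cycle` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass, field

# Step names taken from the Failover/Failback Routine slide, in slide order.
ROUTINE = [
    "Test",
    "Failover",
    "Commit",
    "Re-protect",
    "Re-test",
    "Fall back",
    "Re-commit",
    "Re-protect",
]


@dataclass
class DrRoutineTracker:
    """Hypothetical helper: tracks progress through one failover/failback cycle."""

    completed: list = field(default_factory=list)

    def next_step(self):
        """Return the step expected next, or None when the cycle is finished."""
        index = len(self.completed)
        return ROUTINE[index] if index < len(ROUTINE) else None

    def record(self, step):
        """Record a finished step; refuse steps performed out of order."""
        expected = self.next_step()
        if expected is None:
            raise ValueError("cycle already complete; start a new tracker")
        if step != expected:
            raise ValueError(f"expected {expected!r}, got {step!r}")
        self.completed.append(step)


def run_full_cycle():
    """Walk the whole routine in order and return the recorded steps."""
    tracker = DrRoutineTracker()
    while (step := tracker.next_step()) is not None:
        tracker.record(step)
    return tracker.completed


if __name__ == "__main__":
    print(" -> ".join(run_full_cycle()))
```

A tracker like this could sit in a drill checklist or runbook wrapper; recording "Commit" right after "Test" raises an error instead of silently passing.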

Editor's Notes

  1. In my experience, many companies have viewed implementing Business Continuity and Disaster Recovery (BCDR) as too technically complex and financially unfeasible, so it became more of an academic exercise than an attainable, predictable, measurable, and verifiable business process. With Azure Recovery Services, I have found this perception is no longer accurate. I used the Microsoft Cloud Workshop to showcase Azure BCDR with step-by-step guidance to: • Configure a DR plan for a database app in the Azure West US 3 region • Drill/rehearse the plan to fail over the app to the Azure East US region in a DR scenario • Execute a failover to mimic conducting a DR episode • Commit the failover upon verifying that the plan executed with expected results • Later, follow a series of steps to reverse the failover and fall back the app to its original region, West US 3 • Re-enable the protection, i.e., the DR plan, and ensure readiness for future DR needs. The slide deck includes screen captures of relevant processes and resource settings and serves as a reference for context and expected results. The deck is not intended to replace the workshop instructions, and resource names are inconsistent in some sections, but the process flows and expected resource states are accurately depicted. It may be handy for understanding the how and what of executing the workshop exercises and tasks.
  2. Once the application is deployed, ContosoWebLBPrimaryIP holds the public IP and the DNS name of the app. The landing page here is slightly different from that provided by the workshop.
  3. Policy page
  4. Customer info page
  5. Policy Holder page
  6. The presented demo infrastructure is not necessarily a recommendation. For instance, instead of an internal LB, another option is described at https://learn.microsoft.com/en-us/azure/architecture/example-scenario/infrastructure/multi-tier-app-disaster-recovery
  7. Enable Disaster Recovery for the Contoso application https://github.com/microsoft/MCW-Building-a-resilient-IaaS-architecture/blob/master/Hands-on%20lab/HOL%20step-by%20step%20-%20Building%20a%20resilient%20IaaS%20architecture.md#exercise-2-enable-disaster-recovery-for-the-contoso-application
  8. Configure HA for the SQL Server tier https://github.com/microsoft/MCW-Building-a-resilient-IaaS-architecture/blob/master/Hands-on%20lab/HOL%20step-by%20step%20-%20Building%20a%20resilient%20IaaS%20architecture.md#task-3-configure-ha-for-the-sql-server-tier
  9. https://github.com/microsoft/MCW-Building-a-resilient-IaaS-architecture/blob/master/Hands-on%20lab/HOL%20step-by%20step%20-%20Building%20a%20resilient%20IaaS%20architecture.md#task-3-configure-ha-for-the-sql-server-tier
  10. https://github.com/microsoft/MCW-Building-a-resilient-IaaS-architecture/blob/master/Hands-on%20lab/HOL%20step-by%20step%20-%20Building%20a%20resilient%20IaaS%20architecture.md#task-3-configure-ha-for-the-sql-server-tier BCDR demo GitHub repo: https://github.com/Microsoft/MCW-Building-A-Resilient-IaaS-Architecture Suggested regions: westus3 and eastus. Cluster command: New-Cluster -Name AOGCLUSTER -Node SQLVM1,SQLVM2 -StaticAddress 10.0.2.99
  11. Select the subnet 10.0.2.0/24, add the IPv4 address 10.0.2.100, and select OK. This is the IP address of the Internal Load Balancer in front of SQLVM1 and SQLVM2 in the Data subnet of the Primary Site.
  12. SQLAlwaysOn
  13. 10.0.2.100
  14. The automation account and associated runbooks can be placed in any region other than the source/primary region, because during a DR event the source/primary region is expected to be experiencing an outage.
  15. Bastion host names differ because of an unplanned redeployment of Bastion in westus3.
  16. Location: Any region that supports Automation except for your primary region.
  17. Repeat the steps to import and publish both runbooks.
  18. Configure DR for the SQL Server tier https://github.com/microsoft/MCW-Building-a-resilient-IaaS-architecture/blob/master/Hands-on%20lab/HOL%20step-by%20step%20-%20Building%20a%20resilient%20IaaS%20architecture.md#task-3-configure-dr-for-the-sql-server-tier
  19. Add-ClusterNode -Name SQLVM3
  20. Enable Disaster Recovery for the Contoso application https://github.com/microsoft/MCW-Building-a-resilient-IaaS-architecture/blob/master/Hands-on%20lab/HOL%20step-by%20step%20-%20Building%20a%20resilient%20IaaS%20architecture.md#exercise-2-enable-disaster-recovery-for-the-contoso-application
  21. https://github.com/microsoft/MCW-Building-a-resilient-IaaS-architecture/blob/master/Hands-on%20lab/HOL%20step-by%20step%20-%20Building%20a%20resilient%20IaaS%20architecture.md#task-4-configure-dr-for-the-web-tier
  22. Exercise 3: Enable Backup for the Contoso application https://github.com/microsoft/MCW-Building-a-resilient-IaaS-architecture/blob/master/Hands-on%20lab/HOL%20step-by%20step%20-%20Building%20a%20resilient%20IaaS%20architecture.md#exercise-3-enable-backup-for-the-contoso-application
  23. Task 3: Enable Backup for the SQL Server tier https://github.com/microsoft/MCW-Building-a-resilient-IaaS-architecture/blob/master/Hands-on%20lab/HOL%20step-by%20step%20-%20Building%20a%20resilient%20IaaS%20architecture.md#task-3-enable-backup-for-the-sql-server-tier As needed:
      Register-AzResourceProvider -ProviderNamespace Microsoft.SqlVirtualMachine
      New-AzSqlVM -Name 'ci-sql1' -ResourceGroupName 'ci-w3' -SqlManagementType Full -Location 'westus3' -LicenseType PAYG
      New-AzSqlVM -Name 'ci-sql2' -ResourceGroupName 'ci-w3' -SqlManagementType Full -Location 'westus3' -LicenseType PAYG
      New-AzSqlVM -Name 'ci-sql3' -ResourceGroupName 'ci-eus' -SqlManagementType Full -Location 'eastus' -LicenseType PAYG
  24. Task 5: Configure a public endpoint using Azure Front Door https://github.com/microsoft/MCW-Building-a-resilient-IaaS-architecture/blob/master/Hands-on%20lab/HOL%20step-by%20step%20-%20Building%20a%20resilient%20IaaS%20architecture.md#task-5-configure-a-public-endpoint-using-azure-front-door
  25. Task 2: Validate Disaster Recovery - Failover IaaS region to region https://github.com/microsoft/MCW-Building-a-resilient-IaaS-architecture/blob/master/Hands-on%20lab/HOL%20step-by%20step%20-%20Building%20a%20resilient%20IaaS%20architecture.md#task-2-validate-disaster-recovery---failover-iaas-region-to-region
  26. Task 2: Validate Disaster Recovery - Failover IaaS region to region https://github.com/microsoft/MCW-Building-a-resilient-IaaS-architecture/blob/master/Hands-on%20lab/HOL%20step-by%20step%20-%20Building%20a%20resilient%20IaaS%20architecture.md#task-2-validate-disaster-recovery---failover-iaas-region-to-region
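
The automation variables on slide 46 feed the DR runbooks, so a missing name or a malformed SQL path would only surface at failover time. Below is a minimal Python sketch of a pre-flight check, assuming the variables are kept as one JSON document with the key names shown on that slide. The function name `check_variables` is hypothetical, and the expected path shape (`SQLSERVER:\Sql\<machine>\<instance>\AvailabilityGroups\<group>`) is an assumption based on the SQL Server PowerShell provider convention, not something the workshop prescribes.

```python
import json
import re

# Key names copied from the "Add Automation Variables" slide.
REQUIRED_KEYS = {
    "PrimarySiteRG", "PrimarySiteSQLVM1Name", "PrimarySiteSQLVM2Name",
    "PrimarySiteSQLPath", "PrimarySiteVNetName", "PrimarySiteWebSubnetName",
    "PrimarySiteWebLBName", "SecondarySiteRG", "SecondarySiteSQLVMName",
    "SecondarySiteSQLPath", "SecondarySiteVNetName",
    "SecondarySiteWebSubnetName", "SecondarySiteWebLBName",
}

# Assumed shape of a provider path to an availability group:
# SQLSERVER:\Sql\<machine>\<instance>\AvailabilityGroups\<group>
SQL_PATH = re.compile(
    r"^SQLSERVER:\\Sql\\[^\\]+\\[^\\]+\\AvailabilityGroups\\[^\\]+$"
)


def check_variables(raw_json):
    """Return a list of problems found; an empty list means none were found."""
    variables = json.loads(raw_json)
    # Report every required variable that is absent, in a stable order.
    problems = [
        f"missing variable: {key}"
        for key in sorted(REQUIRED_KEYS - set(variables))
    ]
    # Check the two SQL paths against the assumed provider-path shape.
    for key in ("PrimarySiteSQLPath", "SecondarySiteSQLPath"):
        if key in variables and not SQL_PATH.match(variables[key]):
            problems.append(f"{key} is not a provider path: {variables[key]}")
    return problems
```

Passing the JSON text to `check_variables` before creating the variables in the Automation account gives a list of human-readable problems; a path whose backslashes were lost in copy and paste is reported rather than accepted.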