Deploying a highly available Messaging Hub using the IBM MQ Appliance
Session #3465
Anthony Beardsmore (abeards@uk.ibm.com), IBM MQ Appliance Architect
The MQ Appliance includes two key facilities for maintaining the availability of your messaging infrastructure across expected and unexpected outages. This session looks in depth at the HA and DR capabilities of the MQ Appliance and application considerations when using them.

1. Deploying a highly available Messaging Hub using the IBM MQ Appliance
   Session #3465
   Anthony Beardsmore (abeards@uk.ibm.com), IBM MQ Appliance Architect
2. Introduction
   High availability versus disaster recovery
   Timeline
3. Very high level overview and terminology
   • High availability: providing continuity of service after a ‘component’ failure (e.g. disk, software bug, network switch, power supply)
   • Disaster recovery: providing a means to re-establish service after a catastrophic failure (e.g. entire data center power loss, flood, fire)
   • Primary and secondary – which instance is currently ‘active’
     – Has a direct impact on the commands you execute and on behaviour
     – N.B. in the Disaster Recovery features, ‘active’ doesn’t necessarily mean ‘running’
   • Main and recovery – which instance is usually ‘active’
     – Doesn’t affect behaviour, but a useful concept in discussions and documentation
4. HA and DR – comparison of aims
   High Availability
   • 100% data retention
   • Short distances (meters or miles)
   • Automatic failover
   Disaster recovery
   • Some (minimal) data loss acceptable
   • Long distances (out of region)
   • Manual failover
   In both cases we want:
   • Minimal possible effect on performance
   • As little impact on applications as possible
5. Timeline / Versions
   • This presentation is based on the capabilities of the 8.0.0.4 firmware for the IBM MQ Appliance
     – Released December 2015
   • Support for High Availability (HA) was a key feature of the IBM MQ Appliance from the first release and has been improved in every Fix Pack since
     – We strongly recommend moving to the latest release to get all enhancements
   • Support for Disaster Recovery (DR) was added in 8.0.0.4 and reuses some of the technology used for HA
6. HA in the MQ appliance
7. Setting up HA
   Implementing HA is simple with the MQ Appliance!
   1. Connect the two appliances together
   2. On Appliance #1, issue the following command:
      prepareha -s <some random text> -a <address of appliance2>
   3. On Appliance #2, issue the following command:
      crthagrp -s <the same random text> -a <address of appliance1>
   4. Then create an HA queue manager:
      crtmqm -sx HAQM1
   ✓ That’s it!
   ✓ Note that there is no need to run strmqm. Queue managers will start and keep running unless explicitly ended with endmqm
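   For example, here is that sequence end to end as you might type it in the appliance CLI – a minimal sketch in which the addresses (10.10.1.1 / 10.10.1.2) and the shared secret are hypothetical placeholders:

      # On Appliance #1 (HA address of Appliance #2 assumed to be 10.10.1.2):
      prepareha -s s0me-rand0m-text -a 10.10.1.2

      # On Appliance #2 (HA address of Appliance #1 assumed to be 10.10.1.1):
      crthagrp -s s0me-rand0m-text -a 10.10.1.1

      # On the appliance that should normally run the queue manager – create
      # the replicated queue manager; no strmqm needed:
      crtmqm -sx HAQM1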
8. High Availability – Physical layout
   [Diagram: two appliances linked by a Replication Connection (10 Gb Ethernet) and Heartbeat Connections (1 Gb Ethernet)]
9. [Diagram: QM1, QM2 and QM3 replicated between the two appliances]
   Fully synchronous replication
   Key design points:
   - No (persistent) message loss
   - No external dependencies
   - Transparent to application
10. HA Groups
    [Diagram: QM1, QM2 and QM3 on both appliances in one group]
    • Exactly two appliances
    • Manages automatic failover
11. [Diagram: the HA group restarting queue managers on the surviving appliance]
    • Queue manager or entire appliance level failures
    • Will restart on local system if possible
12. Designing a group
13. HA Queue managers in MQ Console
    [Screenshot: MQ Console showing the HA Group on Appliance #1 and on Appliance #2]
14. HA Queue managers in MQ Console: After Failover
    • Appliance #1 is now in Standby
    • All HA queue managers are now running on Appliance #2
    • The console shows the High Availability alert, and a menu that lets you see the status and suspend or resume the appliance in the HA group.
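    From the CLI, the same suspend/resume operation uses the sethagrp command shown in the command table later. A sketch, assuming -s suspends and -r resumes – the flag letters are an assumption, so check your firmware’s reference for the exact syntax:

       # Suspend this appliance in the HA group (assumed flag), e.g. before maintenance:
       sethagrp -s

       # Resume it afterwards (assumed flag); queue managers fail back as appropriate:
       sethagrp -r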
15. HA failover (CLI view)
    Before failover, on Appliance #2:
    • HAQM2 is running there, in its primary and preferred location
    • HAQM1 is running on its primary – Appliance #1 – so is secondary on Appliance #2
    After failover, on Appliance #2:
    • HAQM1 is now running on Appliance #2
16. Appliance HA commands
    Command         Description
    crthagrp        Create a high availability group of appliances.
    dsphagrp        Display the status of the appliances in the HA group.
    makehaprimary   Specify that an appliance is the ‘winner’ when resolving a partitioned situation in the HA group.
    prepareha       Prepare an appliance to be part of an HA group.
    sethagrp        Pause and resume an appliance in an HA group.
    status          Per-queue-manager detailed information, including replication status (e.g. percentage complete when re-syncing after a lost connection).
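    As a quick health check after setup, you might combine two of these – a sketch, assuming status accepts the queue manager name as its argument:

       # Overall state of both appliances in the HA group:
       dsphagrp

       # Per-queue-manager detail, including replication/resync progress:
       status HAQM1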
17. Notes: Designing a group
    • This image is straight from the Infocenter and gives a good overview of the possible combinations of Queue managers in an HA Group.
    • As long as IP Addresses etc. for the three HA interfaces are correctly pre-configured, defining a group is as simple as executing the ‘crthagrp’/‘prepareha’ commands.
      – Appliances can only be in exactly one group of exactly two appliances
    • Queue managers may be added to the group at crtmqm time, or after creation using the ‘sethagrp’ command
      – Queue managers can be active on either appliance (both appliances can simultaneously be running different active queue managers). Up to 16 active/passive instances per appliance are permitted.
    • Unlimited (other than by storage capacity etc.) non-HA queue managers may also be present on either appliance.
      – This might be desirable, for example, if you have applications/queue managers with different QOS agreements, or test and production environments on the same system.
18. HA – requirements / restrictions
    • MUST ensure redundancy between heartbeat (1 Gb) links
      – Shared nothing – power, switches, routers, cables
      – Minimises risk of partitioning (‘split brain’)
    • Less than 10 ms latency between appliances (replication interface)
      – For good application performance you may find far lower latency is required.
      – 1 or 2 ms is a good target – practical at ‘metro area’ distances
    • Sufficient bandwidth for all message data being transferred through HA queue managers (replication interface)
    • No native VLAN tagging or link aggregation on these connections
    • Ideally, physical cabling between the systems
      – In a single datacentre, co-located rack scenario, this avoids all infrastructure concerns
19. DR in the MQ appliance
20. Setting up Disaster Recovery
    DR has different goals (asynchronous replication, manual failover), so its externals differ slightly from HA, but the process is similar:
    1. Connect the two appliances together (only one, 10 Gb, connection needed)
    2. On the ‘live’ appliance, convert the queue manager to a Disaster Recovery ‘source’:
       crtdrprimary -m <name> -r <standby> -i <ip address> -p <port>
    3. On the ‘standby’ appliance, simply paste the text provided by the command above:
       crtdrsecondary <some provided parameters>
    ✓ Synchronization begins immediately (the ‘status’ command shows progress)
    ✓ In the event that you need to start the standby instance, ‘makedrprimary’ takes ownership of the queue manager, and then it is business as usual
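    Concretely, the flow might look like the following sketch, in which the queue manager name, standby appliance identifier, IP address and port are all hypothetical placeholders (the real crtdrsecondary parameters are whatever crtdrprimary prints for you to paste):

       # On the live appliance – make DRQM1 a DR source, replicating to the
       # standby appliance 'applianceB' at 192.0.2.2 port 2015 (placeholders):
       crtdrprimary -m DRQM1 -r applianceB -i 192.0.2.2 -p 2015

       # crtdrprimary outputs a complete crtdrsecondary command – paste that
       # on the standby appliance exactly as provided.

       # In a disaster, on the standby appliance (assumed -m flag):
       makedrprimary -m DRQM1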
21. Disaster Recovery: Physical connection
    Asynchronous DR replication:
    • Single 10 Gb Ethernet connection
    • Eth20 interface
22. Disaster Recovery: Flexible topologies
    [Diagram: production appliances replicating asynchronously to an off-site DR appliance and to a mixed Test/DR appliance]
    • Because DR has no concept of a group, Disaster Recovery configurations can be even more flexible than HA
      • though of course there is no automated management and failover
    • Each queue manager independently configures replication to a particular appliance.
    • E.g. you could configure a single ‘DR’ site covering live appliances at multiple sites
23. Replication, synchronization and snapshots (1)
    • There are two modes in which data can be sent from the Primary instance to the Secondary instance:
      1. Replication – when the two instances are connected, each individual write is replicated from the Primary to the Secondary in the order in which the writes are made on the Primary
      2. Synchronization – used when the connection is lost and then restored, to get the Secondary back in step as quickly as possible – but this means that the Secondary is inconsistent until the synchronization completes, and the queue manager would not be able to start from it
24. Replication, synchronization and snapshots (2)
    • To resolve this issue, a ‘snapshot’ is taken of the queue manager whenever synchronization is started
    • If the connection is lost again (or the Primary appliance fails completely) while synchronizing, you can still issue the makedrprimary command on the standby to recover. However:
      – This will revert the queue manager to its state at the beginning of the synchronization
      – It can take a long time (hours for a large queue manager)
      – Updates made to the Primary since the original outage will be lost
    • Space is reserved for this process whenever DR queue managers are configured
      – So you may be surprised to see less disk available than you thought!
25. DR – requirements / restrictions
    • The maximum latency for the replication link is 100 ms.
    • The IP addresses used for the two Eth20 ports should belong to the same subnet, and that subnet should only be used for the disaster recovery configuration.
    • Native VLAN (trunked) and link aggregation are not supported on the replication interface
26. Communication considerations
    MQ channels, application reconnection, security
27. Channel reconnection
    • The same approach is used whether in HA or DR mode
      – While the underlying technology differs, it uses the same ‘externals’ as multi-instance queue manager support (as included since MQ 7.0.1)
    • Client applications (and other queue managers) reconnect to the standby instance after a failure by configuring multiple IP addresses for the channels (see the example after this list)
      – Either explicitly in CONNAME (a comma-separated list)
      – Or by defining a CCDT with multiple endpoints
      – Or using a preconnect exit
    • Don’t forget that cluster receivers define their own ‘multiple homes’
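    For instance, a client connection channel that names both appliances in its CONNAME looks like this in MQSC (the host names and port are hypothetical):

       * Comma-separated CONNAME: try appliance1 first, then appliance2
       DEFINE CHANNEL(HAQM1.SVRCONN) CHLTYPE(CLNTCONN) +
              CONNAME('appliance1.example.com(1414),appliance2.example.com(1414)') +
              QMNAME(HAQM1)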
28. Client reconnect implications
    • When a queue manager ‘fails over’ using HA or DR, from the point of view of an application or remote queue manager this queue manager has effectively been restarted.
      – Ordinarily, the application would receive MQRC_CONNECTION_BROKEN
    • This can typically be hidden from applications using 7.0.1 or higher client libraries via the ‘client auto-reconnect’ feature, but there are some limitations to be aware of:
      – A failure during the initial connect will still result in MQCONN receiving a bad return code and having to retry
      – Browse cursors are reset
      – An in-doubt ‘commit’ cannot be transparently handled.
    • This can allow existing applications to exploit HA with no change or minimal change, but consult the documentation
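    One low-effort way to switch this on for existing client applications is the client configuration file rather than code changes; a sketch of an mqclient.ini stanza that makes connections reconnectable by default:

       ; mqclient.ini – request automatic reconnection for client connections
       CHANNELS:
          DefRecon=YES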
29. Security
    • Most security data – for example certificate stores and authority records – is replicated alongside the queue manager in an HA or DR configuration
    • However: users and groups are NOT replicated between appliances
      – because not all configuration has to be identical – you may have queue managers on either device, and associated users, which you do not wish replicated
      – So… strongly consider LDAP for messaging users on replicated queue managers
    • Group configuration/heartbeat/management is ‘secure by default’
      – Prevents e.g. another device configuring itself as a replication target
    • But there is currently no encryption on the replication link
      – possibly acceptable for single data centre HA; needs careful consideration for DR
      – If necessary, using AMS will ensure all message data is encrypted both at rest and across the replication link
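    As an illustration of the LDAP suggestion, connection authentication against a directory is configured in MQSC along these lines (a sketch – the server name, base DN and attribute choices are hypothetical and depend on your directory schema):

       * Hypothetical LDAP user repository for connection authentication
       DEFINE AUTHINFO(HUB.LDAP) AUTHTYPE(IDPWLDAP) +
              CONNAME('ldap.example.com(389)') +
              SHORTUSR('cn') BASEDNU('ou=users,o=example')
       ALTER QMGR CONNAUTH(HUB.LDAP)
       REFRESH SECURITY TYPE(CONNAUTH)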
30. Performance
31. HA and DR Performance
    [Chart: Round Trips/sec and CPU% against number of requester clients – 10 applications, 10 queue managers, 2 KB persistent request/responder on M2000A FP4; series: HA and HA with 2 ms latency]
    • At just 2 ms latency, HA shows significant degradation compared to a ‘direct’ connection
    • For a similar scenario, even at 50 ms latency DR achieves 90% of ‘direct’
    • See report MPA2 for the full analysis, including comparisons to non-HA
32. Notices and Disclaimers
    Copyright © 2016 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission from IBM.
    U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.
    Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided.
    Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.
    Performance data contained herein was generally obtained in controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary.
    References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business.
    Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation.
    It is the customer's responsibility to ensure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer's business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law.
33. Notices and Disclaimers (cont'd)
    Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM's products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
    The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right.
    IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document Management System™, FASP®, FileNet®, Global Business Services®, Global Technology Services®, IBM ExperienceOne™, IBM SmartCloud®, IBM Social Business®, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®, StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force®, System z® and z/OS are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.
34. Thank You
    Your Feedback is Important! Access the InterConnect 2016 Conference Attendee Portal to complete your session surveys from your smartphone, laptop or conference kiosk.
