White Paper




USING VPLEX™ METRO WITH VMWARE
HIGH AVAILABILITY AND FAULT
TOLERANCE FOR ULTIMATE AVAILABILITY




              Abstract
              This white paper discusses using best-of-breed technologies from
              VMware® and EMC® to create federated continuous availability
              solutions. The following topics are reviewed:

                  •   Choosing between federated Fault Tolerance and
                      federated High Availability
                  •   Design considerations and constraints
                  •   Operational best practices
September 2012




         Copyright © 2012 EMC Corporation. All Rights Reserved.

         EMC believes the information in this publication is
         accurate as of its publication date. The information is
         subject to change without notice.

         The information in this publication is provided “as is.”
         EMC Corporation makes no representations or
         warranties of any kind with respect to the information in
         this publication, and specifically disclaims implied
         warranties of merchantability or fitness for a particular
         purpose.

         Use, copying, and distribution of any EMC software
         described in this publication requires an applicable
         software license.

         For the most up-to-date listing of EMC product names,
         see EMC Corporation Trademarks on EMC.com.




Table of Contents
Executive summary
   Audience
   Document scope and limitations
Introduction
EMC VPLEX technology
   VPLEX terms and Glossary
   EMC VPLEX architecture
   EMC VPLEX Metro overview
   Understanding VPLEX Metro active/active distributed volumes
   VPLEX Witness – An introduction
   Protecting VPLEX Witness using VMware FT
   VPLEX Metro HA
   VPLEX Metro cross cluster connect
Unique VPLEX benefits for availability and I/O response time
   Uniform and non-uniform I/O access
   Uniform access (non-VPLEX)
   Non-Uniform Access (VPLEX IO access pattern)
   VPLEX with cross-connect and non-uniform mode
   VPLEX with cross-connect and forced uniform mode
Combining VPLEX HA with VMware HA and/or FT
   vSphere HA and VPLEX Metro HA (federated HA)
   Use Cases for federated HA
   Datacenter pooling using DRS with federated HA
   Avoiding downtime and disasters using federated HA and vMotion
   Failure scenarios and recovery using federated HA
   vSphere FT and VPLEX Metro (federated FT)
   Use cases for a federated FT solution
   Failure scenarios and recovery using federated FT
   Choosing between federated availability or disaster recovery (or both)
   Augmenting DR with federated HA and/or FT
   Environments where federated HA and/or FT should not replace DR
Best Practices and considerations when combining VPLEX HA with VMware HA and/or FT
   VMware HA and FT best practice requirements
   Networking principles and pre-requisites
   vCenter placement options
   Path loss handling semantics (PDL and APD)
   Cross-connect Topologies and Failure Scenarios
   Cross-connect and multipathing
   VPLEX site preference rules
   DRS and site affinity rules
   Additional best practices and considerations for VMware FT
   Secondary VM placement considerations
   DRS affinity and cluster node count
   VPLEX preference rule considerations for FT
   Other generic recommendations for FT
Conclusion
References
Appendix A - vMotioning over longer distances (10ms)




Executive summary
The EMC® VPLEX™ family removes physical barriers within, across, and
between datacenters. VPLEX Local provides simplified management and
non-disruptive data mobility for heterogeneous arrays. VPLEX Metro and
Geo provide data access and mobility between two VPLEX clusters within
synchronous and asynchronous distances respectively. With a unique
scale-out architecture, VPLEX’s advanced data caching and distributed
cache coherency provide workload resiliency, automatic sharing,
balancing and failover of storage domains, and enable both local and
remote data access with predictable service levels.
VMware vSphere makes it simpler and less expensive to provide higher
levels of availability for important applications. With vSphere, organizations
can easily increase the baseline level of availability provided for all
applications, as well as provide higher levels of availability more easily and
cost-effectively. vSphere makes it possible to reduce both planned and
unplanned downtime. The revolutionary VMware vMotion™ (vMotion)
capabilities in vSphere make it possible to perform planned maintenance
with zero application downtime.
VMware High Availability (HA), a feature of vSphere, reduces unplanned
downtime by leveraging multiple VMware ESX® and VMware ESXi™ hosts
configured as a cluster, to provide automatic recovery from outages as
well as cost-effective high availability for applications running in virtual
machines.
VMware Fault Tolerance (FT) leverages the well-known encapsulation
properties of virtualization by building fault tolerance directly into the ESXi
hypervisor in order to deliver hardware style fault tolerance to virtual
machines. Guest operating systems and applications do not require
modifications or reconfiguration. In fact, they remain unaware of the
protection transparently delivered by ESXi and the underlying architecture.
By leveraging distance, VPLEX Metro builds on the strengths of VMware FT
and HA to provide solutions that go beyond traditional “Disaster
Recovery”. These solutions provide a new type of deployment which
achieves the absolute highest levels of continuous availability over
distance for today’s enterprise storage and cloud environments. When
using such technologies, it is now possible to provide a solution that has
both a zero Recovery Point Objective (RPO) and a zero "storage" Recovery
Time Objective (RTO) (and a zero "application" RTO when using VMware FT).
This white paper is designed to give technology decision-makers a deeper
understanding of VPLEX Metro in conjunction with VMware Fault Tolerance




and/or High Availability, discussing design, features, functionality and
benefits. This paper also highlights the key technical considerations for
implementing VMware Fault Tolerance and/or High Availability with VPLEX
Metro technology to achieve "Federated Availability" over distance.

Audience
This white paper is intended for technology architects, storage
administrators and EMC professional services partners who are responsible
for architecting, creating, managing and using IT environments that utilize
EMC VPLEX and VMware Fault Tolerance and/or High Availability
technologies (FT and HA respectively). The white paper assumes that the
reader is familiar with EMC VPLEX and VMware technologies and
concepts.

Document scope and limitations
This document applies to EMC VPLEX Metro configured with VPLEX Witness.
The details provided in this white paper are based on the following
configurations:


   •   VPLEX GeoSynchrony 5.1 (patch 2) or higher
   •   VPLEX Metro HA only (VPLEX Local and VPLEX Geo are not supported with
       FT or HA in a stretched configuration)
   •   VPLEX clusters are within 5 milliseconds (ms) round trip time (RTT) of
       each other for VMware HA
   •   VPLEX clusters are within 1 millisecond (ms) round trip time (RTT) of
       each other for VMware FT
   •   Cross-connected configurations can optionally be deployed for
       VMware HA solutions (not mandatory)
   •   For VMware FT configurations, VPLEX cross cluster connect is in place
       (mandatory requirement)
   •   VPLEX Witness is deployed in a third failure domain (mandatory). The
       Witness functionality is required for "VPLEX Metro" to become a true
       active/active continuously available storage cluster.
   •   ESXi and vSphere 5.0 Update 1 or later are used
   •   Any qualified pair of arrays (both EMC and non-EMC) listed on the
       EMC Simple Support Matrix (ESSM) found here:
       https://elabnavigator.emc.com/vault/pdf/EMC_VPLEX.pdf



   •   The configuration is in full compliance with VPLEX best practices
       found here:
       http://powerlink.emc.com/km/live1/en_US/Offering_Technical/Tech
       nical_Documentation/h7139-implementation-planning-vplex-tn.pdf


Please consult with your local EMC Support representative if you are
uncertain as to the applicability of these requirements.


Note: While out of scope for this document, it should be noted that, in
addition to all best practices within this paper, all federated FT and HA
solutions also carry the same best practices and limitations imposed by the
VMware HA and FT technologies themselves. For instance, at the time of writing
VMware FT technology is only capable of supporting a single vCPU per VM
(VMware HA does not carry the same vCPU limitation) and this limitation
will prevail when federating a VMware FT cluster. Please ensure you review
the VMware best practice documentation as well as the limitations and
considerations documentation (please see the References section) for
further information.




Introduction
Increasingly, customers wish to protect their business services from any
event imaginable that would lead to downtime.
Previously (i.e. prior to VPLEX) solutions to prevent downtime fell into two
camps:
   1. Highly available and fault tolerant systems within a datacenter
   2. Disaster recovery solutions outside of a datacenter.
The benefit of FT and HA solutions is that they provide automatic
recovery in the event of a failure. However, the geographical protection
range is limited to a single datacenter, so business services are not
protected from a datacenter failure.
On the other hand, disaster recovery solutions typically protect business
services using geographic dispersion so that if a datacenter fails, recovery
would be achieved using another datacenter in a separate fault domain
from the primary. Some of the drawbacks with disaster recovery
solutions, however, are that they are human-decision based (i.e. not
automatic) and typically require a second, disruptive failback once the primary
site is repaired. In other words, should a primary datacenter fail the
business would need to make a non-trivial decision to invoke disaster
recovery.
Since disaster recovery is decision-based (i.e. manually invoked), it can
lead to extended outages since the very decision itself takes time, and this
is generally made at the business level involving key stakeholders. As most
site outages are caused by recoverable events (e.g. an elongated power
outage), faced with the “Invoke DR” decision some businesses choose not
to invoke DR and to ride through the outage instead. This means that
critical business IT services remain offline for the duration of the event.
These types of scenarios are not uncommon in these "disaster" situations
and non-invocation can be for various reasons. The two biggest ones are:
   1. The primary site that failed can be recovered within 24-48 hours
      therefore not warranting the complexity and risk of invoking DR.
   2. Invoking DR will require a “failback” at some point in the future
      which in turn will bring more disruption.
Other potential concerns to invoking disaster recovery include complexity,
lack of testing, lack of resources, lack of skill sets and lengthy recovery
time.
To avoid such pitfalls, VPLEX and VMware offer a more comprehensive
answer to safeguarding your environments. By combining the benefits of
HA and FT, a new category of availability is created. This new type of



category provides the automatic (non-decision-based) benefits of FT and
HA, but allows them to be leveraged over distance by using VPLEX Metro.
This brings the geographical distance benefits normally associated with
disaster recovery to the table, enhancing the HA and FT propositions
significantly.
The new category is known as “Federated Availability” and enables bulletproof
availability, which in turn significantly lessens the chance of downtime
for both planned and unplanned events.




EMC VPLEX technology

VPLEX encapsulates traditional physical storage array devices and applies
three layers of logical abstraction to them. The logical relationships of each
layer are shown in Figure 1.
Extents are the mechanism VPLEX uses to divide storage volumes. Extents
may be all or part of the underlying storage volume. EMC VPLEX
aggregates extents and applies RAID protection in the device layer.
Devices are constructed using one or more extents and can be combined
into more complex RAID schemes and device structures as desired. At the
top layer of the VPLEX storage structures are virtual volumes. Virtual
volumes are created from devices and inherit the size of the underlying
device. Virtual volumes are the elements VPLEX exposes to hosts using its
Front End (FE) ports. Access to virtual volumes is controlled using storage
views. Storage views are comparable to Auto-provisioning Groups on EMC
Symmetrix® or to storage groups on EMC VNX®. They act as logical
containers determining host initiator access to VPLEX FE ports and virtual
volumes.




               Figure 1 EMC VPLEX Logical Storage Structures




VPLEX terms and Glossary


Term                       Definition

VPLEX Virtual Volume       Unit of storage presented by the VPLEX
                           front-end ports to hosts

VPLEX Distributed          A single unit of storage presented by the
Volume                     VPLEX front-end ports of both VPLEX clusters
                           in a VPLEX Metro configuration separated by
                           distance

VPLEX Director             The central processing and intelligence of the
                           VPLEX solution. There are redundant (A and B)
                           directors in each VPLEX Engine

VPLEX Engine               Consists of two directors and is the unit of
                           scale for the VPLEX solution

VPLEX cluster              A collection of VPLEX engines in one rack

VPLEX Metro                The cooperation of two VPLEX clusters, each
                           serving their own storage domain, over
                           synchronous distance, forming active/active
                           distributed volume(s)

VPLEX Metro HA             As per VPLEX Metro, but configured with VPLEX
                           Witness to provide fully automatic recovery from
                           the loss of any failure domain. This can also be
                           thought of as an active/active continuously
                           available storage cluster over distance

AccessAnywhere             The term used to describe a distributed volume
                           using VPLEX Metro which has active/active
                           characteristics

Federation                 The cooperation of storage elements at a peer
                           level over distance, enabling mobility,
                           availability and collaboration

Automatic                  No human intervention whatsoever (e.g. HA and FT)

Automated                  No human intervention required once a decision
                           has been made (e.g. disaster recovery with
                           VMware's SRM technology)




EMC VPLEX architecture
EMC VPLEX represents the next-generation architecture for data mobility
and information access. The new architecture is based on EMC’s more
than 20 years of expertise in designing, implementing, and perfecting
enterprise-class intelligent cache and distributed data protection solutions.
As shown in Figure 2, VPLEX is a solution for virtualizing and federating both
EMC and non-EMC storage systems. VPLEX resides between
servers and heterogeneous storage assets (abstracting the storage
subsystem from the host) and introduces a new architecture with these
unique characteristics:
   •   Scale-out clustering hardware, which lets customers start small and
       grow big with predictable service levels
   •   Advanced data caching, which utilizes large-scale SDRAM cache to
       improve performance and reduce I/O latency and array contention
   •   Distributed cache coherence for automatic sharing, balancing, and
       failover of I/O across the cluster
   •   A consistent view of one or more LUNs across VPLEX clusters
       separated either by a few feet within a datacenter or across
       synchronous distances, enabling new models of high availability and
       workload relocation


         [Figure: physical host layer, virtual storage layer (VPLEX), and
         physical storage layer]

         Figure 2 Capability of an EMC VPLEX local system to abstract
                            Heterogeneous Storage




EMC VPLEX Metro overview
VPLEX Metro brings mobility and access across two locations separated by
an inter-site round trip time of up to 5 milliseconds (host application
permitting). VPLEX Metro uses two VPLEX clusters (one at each location)
and includes the unique capability to support synchronous distributed
volumes that mirror data between the two clusters using write-through
caching.
Since a VPLEX Metro Distributed volume is under the control of the VPLEX
Metro advanced cache coherency algorithms, active data I/O access to
the distributed volume is possible at either VPLEX cluster. VPLEX Metro
therefore is a truly active/active solution which goes far beyond traditional
active/passive legacy replication solutions.
VPLEX Metro distributes the same block volume to more than one location
and ensures standard HA cluster environments (e.g. VMware HA and FT)
can simply leverage this capability and therefore be deployed easily and
transparently over distance too.
The key to this is to make the host cluster believe there is no distance
between the nodes, so they behave exactly as they would in a single
data center. This is known as “dissolving distance” and is a key deliverable
of VPLEX Metro.
The other piece to delivering truly active/active FT or HA environments is an
active/active network topology whereby Layer 2 of the same network
resides in each location, giving truly seamless datacenter pooling. Whilst
layer 2 network stretching is a pre-requisite for any FT or HA solution based
on VPLEX Metro, it is outside the scope of this document. Throughout the rest
of this document it is assumed that there is a stretched layer 2
network between the datacenters where a VPLEX Metro resides.


Note: For further information on stretching a layer 2 network over distance,
please see Cisco Overlay Transport Virtualization (OTV), found here:
http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/DCI/
whitepaper/DCI_1.html, and Brocade Virtual Private LAN Service (VPLS), found
here:
http://www.brocade.com/downloads/documents/white_papers/Offering_
Scalable_Layer2_Services_with_VPLS_and_VLL.pdf




Understanding VPLEX Metro active/active distributed volumes
Unlike traditional legacy replication, where access to a replicated volume
is either in one location or another (i.e. an active/passive only paradigm),
VPLEX distributes a virtual device over distance, which ultimately means
host access is now possible in more than one location to the same
(distributed) volume.
In engineering terms, the distributed volume that is presented from VPLEX
Metro is said to have “single disk semantics”, meaning that in every way
(including failure) the disk behaves as one object, just as any traditional
block device would. This therefore means that all the rules associated with
a single disk are fully applicable to a VPLEX Metro distributed volume.
For instance, the following figure shows a single host accessing a single
JBOD type volume:




                 Figure 3 Single host access to a single disk

Clearly the host in the diagram is the only host initiator accessing the single
volume.
The next figure shows a local two node cluster.
                Figure 4 Multiple host access to a single disk

As shown in the diagram there are now two hosts contending for the single
volume. The dashed orange rectangle shows that each of the nodes is




required to be in a cluster or utilize a cluster file system so they can
effectively coordinate locking to ensure the volume remains consistent.
The next figure shows the same two node cluster but now connected to a
VPLEX distributed volume using VPLEX cache coherency technology.



        Figure 5 Multiple host access to a VPLEX distributed volume

In this example there is no difference in the fundamental dynamics of the
two node cluster's access pattern to the single volume. Additionally, as far as
the hosts are concerned, they cannot see any difference between this and
the previous example, since VPLEX is distributing the device between
datacenters via AccessAnywhere™ (which is a type of federation).
This means that the hosts are still required to coordinate locking to ensure
the volume remains consistent.
For ESXi this mechanism is controlled by the Virtual Machine File System
(VMFS), the cluster file system within each datastore. In this case each
distributed volume is presented to the ESXi hosts and formatted with the
VMFS file system.
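
The following is a minimal, hypothetical sketch (not from the original paper) of
how formatting such a device with VMFS might be scripted using the open-source
pyVmomi library; in practice this is normally done from the vSphere Client. The
device path and datastore name are illustrative placeholders, and "host" is
assumed to be a connected vim.HostSystem object.

# Hypothetical pyVmomi sketch: format a VPLEX distributed volume (seen by ESXi
# as a regular Fibre Channel device) with VMFS.
from pyVmomi import vim

def create_vmfs_on_distributed_volume(host, device_path, datastore_name):
    ds_system = host.configManager.datastoreSystem
    # Ask the host for valid VMFS create specifications for this device.
    options = ds_system.QueryVmfsDatastoreCreateOptions(devicePath=device_path)
    if not options:
        raise RuntimeError("Device not available for VMFS creation: %s" % device_path)
    spec = options[0].spec
    spec.vmfs.volumeName = datastore_name
    # Create the datastore; ESXi hosts at both sites will see the same VMFS
    # volume because the underlying VPLEX device is distributed.
    return ds_system.CreateVmfsDatastore(spec=spec)

# Example usage (placeholder device path):
# create_vmfs_on_distributed_volume(host,
#     "/vmfs/devices/disks/naa.60001440000000000000000000000001",
#     "VPLEX-Distributed-DS01")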
The figure below shows a high-level physical topology of a VPLEX Metro
distributed device.

   [Figure: ESXi hosts at Site A and Site B accessing a VPLEX AccessAnywhere™
   distributed volume across the inter-cluster link]

        Figure 6 Multiple host access to a VPLEX distributed volume

This figure is a physical representation of the logical configuration shown in
Figure 5. Effectively, with this topology deployed, the distributed volume




can be treated just like any other volume, the only difference being it is
now distributed and available in two locations at the same time.
Another benefit of this type of architecture is “extreme simplicity” since it is
no more difficult to configure a cluster across distance than it is in a single
data center.


Note: VPLEX Metro can use either 8 Gb/s FC or native 10 Gb/s Ethernet WAN
connectivity (where the word "link" is written in the figure). When using FC
connectivity this can be configured either as a dedicated channel (i.e. separate,
non-merged fabrics) or ISL based (i.e. where fabrics have been merged across
sites). It is assumed that any WAN link will have a second physically
redundant circuit.


Note: It is vital that VPLEX Metro has enough bandwidth between clusters
to meet requirements. EMC can assist in the qualification of this through
the Business Continuity Solution Designer (BCSD) tool. Please engage your
EMC account team to perform a sizing exercise.
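
As a rough, illustrative rule of thumb only (the BCSD sizing exercise remains the
authoritative method, and the figures below are assumptions rather than
measurements), the inter-cluster link must at least sustain the aggregate peak
write throughput of all distributed volumes, because every host write is
mirrored synchronously across the link by the write-through cache:

   B_link  >=  W_peak x (1 + h)

where W_peak is the aggregate peak host write throughput and h is an allowance
for protocol overhead and headroom. For example, W_peak = 50 MB/s (roughly
400 Mb/s) with h = 0.5 would call for at least 600 Mb/s of link bandwidth,
before considering latency and resiliency requirements.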


For further details on VPLEX Metro architecture, please see the VPLEX HA Techbook
found here: http://www.emc.com/collateral/hardware/technical-
documentation/h7113-vplex-architecture-deployment.pdf




VPLEX Witness – An introduction
As mentioned previously, VPLEX Metro goes beyond the realms of legacy
active/passive replication technologies since it can deliver true
active/active storage over distance as well as federated availability.
There are three main items that are required to deliver true "Federated
Availability".
   1. True active/active fibre channel block storage over distance.
   2. Synchronous mirroring to ensure both locations are in lock step with
      each other from a data perspective.
   3. External arbitration to ensure that under all failure conditions
      automatic recovery is possible.
In the previous sections we discussed items 1 and 2; now we will look at
external arbitration, which is enabled by VPLEX Witness.
VPLEX Witness is delivered as a zero-cost VMware virtual appliance (vApp)
which runs on a customer-supplied ESXi server. This ESXi server resides in a
failure domain physically separate from either VPLEX cluster and uses
storage separate from that of either VPLEX cluster.
Using VPLEX Witness ensures that true Federated Availability can be
delivered. This means that regardless of site or link/WAN failure a copy of
the data will automatically remain online in at least one of the locations.
When setting up a single or a group of distributed volumes the user will
choose a “preference rule” which is a special property that each
individual or group of distributed volumes has. It is the preference rule that
determines the outcome after failure conditions such as site failure or link
partition. The preference rule can either be set to cluster A preferred,
cluster B preferred or no automatic winner.
At a high level this has the following effect to a single or group of
distributed volumes under different failure conditions as listed below:




Preference rule /       VPLEX cluster partition      Site A fails                 Site B fails
scenario                Site A       Site B          Site A       Site B          Site A       Site B

Cluster A preferred     ONLINE       SUSPENDED       FAILED       SUSPENDED       ONLINE       FAILED
                        GOOD                         BAD (by design)              GOOD

Cluster B preferred     SUSPENDED    ONLINE          FAILED       ONLINE          SUSPENDED    FAILED
                        GOOD                         GOOD                         BAD (by design)

No automatic winner     SUSPENDED (by design)        SUSPENDED (by design)        SUSPENDED (by design)

                  Table 1 Failure scenarios without VPLEX Witness


As we can see in Table 1 (above), if we only used the preference rules
without VPLEX Witness then under some scenarios manual intervention
would be required to bring the volume online at a given VPLEX cluster (e.g.
if site A is the preferred site, and site A fails, site B would also suspend).
This is where VPLEX Witness assists, since it can better diagnose failures due
to its network triangulation, and it ensures that at any time at least one of
the VPLEX clusters has an active path to the data, as shown in the table
below:
Preference rule         VPLEX cluster partition      Site A fails                 Site B fails
                        Site A       Site B          Site A       Site B          Site A       Site B

Cluster A preferred     ONLINE       SUSPENDED       FAILED       ONLINE          ONLINE       FAILED
                        GOOD                         GOOD                         GOOD

Cluster B preferred     SUSPENDED    ONLINE          FAILED       ONLINE          ONLINE       FAILED
                        GOOD                         GOOD                         GOOD

No automatic winner     SUSPENDED (by design)        SUSPENDED (by design)        SUSPENDED (by design)

                    Table 2 Failure scenarios with VPLEX Witness


As one can see from Table 2, VPLEX Witness converts a VPLEX Metro from an
active/active mobility and collaboration solution into an active/active continuously
available storage cluster. Furthermore, once VPLEX Witness is deployed, failure
scenarios become self-managing (i.e. fully automatic), which makes the solution
extremely simple since there is nothing to do regardless of the failure condition.




Figure 7 below shows the high level topology of VPLEX Witness




               Figure 7 VPLEX configured for VPLEX Witness

As depicted in Figure 7 we can see that the Witness VM is deployed in a
separate fault domain (as defined by the customer) and connected into
both VPLEX management stations via an IP network.


Note: The fault domain is decided by the customer and can range from
different racks in the same datacenter all the way up to VPLEX clusters
separated by 5 ms of latency (measured as round trip time, i.e. typical
synchronous distance). VPLEX Witness can be placed even further from the
two VPLEX clusters; the currently supported maximum round trip latency
for this is 1 second.




Figure 8 below shows a more detailed connectivity diagram of VPLEX
Witness




               [Figure: VPLEX Witness network connectivity; the Witness must reside
               in a separate fault domain from both VPLEX clusters]

                Figure 8 Detailed VPLEX Witness network layout


The witness network is physically separate from the VPLEX inter-cluster
network and also uses storage that is physically separate from either VPLEX
cluster. As stated previously, it is critical to deploy VPLEX Witness into a third
failure domain. The definition of this domain changes depending on where
the VPLEX clusters are deployed. For instance if the VPLEX Metro clusters
are to be deployed into the same physical building but perhaps different
areas of the datacenter, then the failure domain here would be deemed
the VPLEX rack itself. Therefore VPLEX Witness could also be deployed into
the same physical building but in a separate rack.
If, however, the VPLEX clusters were deployed 50 miles apart in totally
different buildings, then the failure domain here would be the physical
building and/or town. Therefore in this scenario it would make sense to
deploy VPLEX Witness in another town altogether; and since the maximum
round trip latency can be as much as one second, you could
effectively pick any city in the world, especially given the bandwidth
requirement is as low as 3 Kb/sec.




For more in depth VPLEX Witness architecture details please refer to the
VPLEX HA Techbook that can be found here:
http://www.emc.com/collateral/hardware/technical-
documentation/h7113-vplex-architecture-deployment.pdf


Note: Always deploy VPLEX Witness in a third failure domain and ensure that
all distributed volumes reside in a consistency group with the witness
function enabled. Also ensure that the EMC Secure Remote Support (ESRS)
Gateway is fully configured and that the witness has the capability to alert
if it fails for whatever reason (there is no impact to I/O if only the witness
fails).



Protecting VPLEX Witness using VMware FT
Under normal operating conditions VPLEX Witness is not a vital
component required to drive active/active I/O (i.e. if the Witness is
disconnected or lost, I/O still continues). It does, however, become a crucial
component for ensuring availability in the event of site loss at either of the
locations where the VPLEX clusters reside.
If, for whatever reason, the VPLEX Witness were lost and soon afterwards there
was a catastrophic failure at a site containing a VPLEX cluster, then the
hosts at the surviving site would also lose access to the VPLEX
volumes, since the surviving VPLEX cluster would consider itself isolated, the
VPLEX Witness also being unavailable.
To minimize this risk, it is considered best practice to disable the VPLEX
Witness function if it has been lost and will remain offline for a long time.
Another way to ensure availability is to minimize the risk of a VPLEX Witness
loss in the first place by increasing the availability of the VPLEX Witness VM
running in the third location.
A way to significantly boost availability for this individual VM is to use
VMware FT to protect VPLEX Witness at the third location. This ensures that
the VPLEX Witness remains unaffected should a hardware failure occur on the
ESXi server in the third failure domain that is supporting the VPLEX Witness
VM.
To deploy this functionality, simply enable HA clustering for the VPLEX
Witness VM across two or more ESXi hosts (in the same location), and once
this has been configured, right-click the VPLEX Witness VM and enable Fault
Tolerance.
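
For illustration only, the following is a minimal, hypothetical pyVmomi sketch
(not from the original paper) showing how the same "Turn On Fault Tolerance"
action could be scripted against vCenter; the vCenter address, credentials and
the VM name "VPLEX-Witness" are placeholder assumptions.

# Hypothetical sketch: turn on VMware FT for the VPLEX Witness VM via pyVmomi.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def find_vm_by_name(content, name):
    # Walk the inventory and return the first VM whose name matches.
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    try:
        return next((vm for vm in view.view if vm.name == name), None)
    finally:
        view.Destroy()

ctx = ssl._create_unverified_context()   # lab use only; validate certificates in production
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="changeme",
                  sslContext=ctx)
try:
    witness_vm = find_vm_by_name(si.RetrieveContent(), "VPLEX-Witness")
    if witness_vm is None:
        raise RuntimeError("Witness VM not found")
    # Equivalent of "Turn On Fault Tolerance": vCenter creates and registers
    # the secondary VM on another host within the same (local) cluster.
    task = witness_vm.CreateSecondaryVM_Task()
finally:
    Disconnect(si)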




Note: At the time of writing, the FT configuration on VPLEX Witness is only
within one location and not a stretched / federated FT configuration. The
storage that the VPLEX Witness uses should be physically contained within
the boundaries of the third failure domain on local (i.e. not VPLEX Metro
distributed) volumes. Additionally, note that HA alone is currently not
supported for the Witness VM; only FT-protected or unprotected configurations
are supported.




VPLEX Metro HA
As discussed in the two previous sections, VPLEX Metro is able to provide
active/active distributed storage; however, we have seen that without a witness,
loss of access to the storage volume could occur in some failure cases, i.e. if
the preferred site fails, the non-preferred site suspends too. Using VPLEX
Witness overcomes this scenario and ensures that access to the data through a
surviving VPLEX cluster is always maintained regardless of which site fails.
VPLEX Metro HA describes a VPLEX Metro solution that has also been
deployed with VPLEX Witness. As the name suggests, VPLEX Metro HA
effectively delivers continuously available distributed storage volumes over
distance and forms a solid foundation for additional layers of VMware
technology such as HA and FT.


Note: It is assumed that all topologies discussed within this white paper use
VPLEX Metro HA (i.e. use VPLEX Metro and VPLEX Witness). This is
mandatory to ensure fully automatic (i.e. decision less) recovery under all
the failure conditions outlined within this document.



VPLEX Metro cross cluster connect
Another important feature of VPLEX Metro that can be optionally
deployed within a campus topology (i.e. up to 1ms) is cross cluster
connect.


Note: At the time of writing cross-connect is a mandatory requirement for
VMware FT implementations.


This feature pushes VPLEX Metro HA to an even greater level of availability
than before, since now an entire VPLEX cluster failure at a single location would
not cause an interruption to host I/O at either location (using either
VMware FT or HA).
Figure 9 below shows the topology of a cross-connected configuration:




           [Figure: ESXi hosts at Site A and Site B with optional cross-connect paths
           to the remote VPLEX cluster, the AccessAnywhere™ distributed volume across
           the inter-cluster link, and VPLEX Witness connected to both sites over IP]

            Figure 9 VPLEX Metro deployment with cross-connect


As we can see in the diagram, the cross-connect offers an alternate path or paths
from each ESXi server to the remote VPLEX cluster.
This ensures that if for any reason an entire VPLEX cluster were to fail (which
is unlikely since there is no single point of failure within a cluster), there would
be no interruption to I/O, since the remaining VPLEX cluster would continue to
service I/O across the remote cross-connect link (alternate path).
When deploying cross-connect it is recommended that, rather than
merging fabrics and using an Inter Switch Link (ISL), additional host bus
adapters (HBAs) be used to connect directly to the remote data
center's switch fabric. This ensures that fabrics do not merge and span
failure domains.
Another important note to remember for cross-connect is that it is only
supported for campus environments up to 1ms round trip time.


Note: When setting up cross-connect, each ESXi server will see double the
paths to the datastore (50% local and 50% remote). It is best practice to
set the pathing policy to Fixed and to mark the remote paths to the other
cluster as passive. This keeps the workload balanced and ensures I/O is
committed to only a single VPLEX cluster at any one time.
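
For illustration only, the following is a minimal, hypothetical pyVmomi sketch
(not from the original paper) of how the Fixed policy with a preferred local
path might be applied from a script; the device identifier and path runtime
name are placeholder assumptions, and the same result can be achieved through
the vSphere Client or esxcli.

# Hypothetical sketch: set the Fixed path selection policy with a preferred
# (local) path for a VPLEX distributed device. "host" is a vim.HostSystem.
from pyVmomi import vim

def set_fixed_policy_with_preferred_path(host, device_naa, preferred_path_name):
    storage = host.configManager.storageSystem
    policy = vim.host.MultipathInfo.FixedLogicalUnitPolicy(
        policy="VMW_PSP_FIXED",          # Fixed path selection policy
        prefer=preferred_path_name)      # a path through the local VPLEX cluster
    for lun in storage.storageDeviceInfo.multipathInfo.lun:
        # Match the logical unit whose identifier contains the device's NAA name.
        if device_naa in lun.id:
            storage.SetMultipathLunPolicy(lunId=lun.id, policy=policy)
            return True
    return False

# Example usage (placeholder identifiers):
# set_fixed_policy_with_preferred_path(host,
#     "naa.60001440000000000000000000000001",
#     "vmhba2:C0:T1:L10")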




Unique VPLEX benefits for availability and I/O response
time
VPLEX is built from the ground up to perform block storage distribution over
long distances at enterprise scale and performance. One of the unique
core principles of VPLEX that enables this, is its underlying and extremely
efficient cache coherency algorithms which enable an active/active
topology without compromise.
Since VPLEX is architecturally unique from other virtual storage products,
two simple categories are used to easily distinguish between the
architectures.

Uniform and non-uniform I/O access
Essentially these two categories are a way to describe the I/O access
pattern from the host to the storage system when using a stretched or
distributed cluster configuration. VPLEX Metro (under normal conditions)
follows what is known technically as a non-uniform access pattern,
whereas other products that function differently from VPLEX follow what is
known as a uniform I/O access pattern. On the surface, both types of
topology seem to deliver active/active storage over distance; however, at
the simplest level it is only the non-uniform category that delivers true
active/active access, which carries some
significant benefits over uniform-type solutions.
The terms are defined as follows:
   1. Uniform access
      All I/O is serviced by the same single storage controller therefore all
      I/O is sent to or received from the same location, hence the term
      "uniform". Typically this involves "stretching" dual controller
      active/passive architectures.
   2. Non Uniform access
      I/O can be serviced by any available storage controller at any given
      location; therefore I/O can be sent to or received from any storage
      target location, hence the term "non-uniform". This is derived from
      "distributing" multiple active controllers/directors in each location.
To understand this in greater detail and to quantify the benefits of non-uniform
access we must first understand uniform access.

Uniform access (non-VPLEX)
Uniform Access works in a very similar way to a dual controller array that
uses an active/passive storage controller. With such an array a host would




generally be connected to both controllers in an HA configuration so that if one
failed the other would continue to process I/O. However, since the
secondary storage controller is passive, no write or read I/O can be
propagated to it or from it under normal operations. The other thing to
understand here is that these types of
architectures typically use cache mirroring, whereby any write I/O to the
primary controller/director is synchronously mirrored to the secondary
controller for redundancy.
Next imagine taking a dual controller active/passive array and physically
splitting the nodes/controllers apart, thereby stretching it over distance, so
that the active controller/node resides in site A and the secondary
controller/node resides in site B.
The first thing to note here is that we now only have a single controller at
either location so we have already compromised the local HA ability of
the solution since each location now has a single point of failure.
The next challenge here is to maintain host access to both controllers from
either location.
Let's suppose we have an ESXi server in site A and a second one in site B. If
the only active storage controller resides at A, then we need to ensure that
hosts in both site A and site B have access to the storage controller in site A
(uniform access). This is important since if we want to run a host workload
at site B we will need an active path to connect it back to the active
director in site A since the controller at site B is passive. This may be
handled by a standard FC ISL which stretches the fabric across sites.
Additionally we will also require a physical path from the ESXi hosts in site A
to the passive controller at site B. The reason for this is that in the event of a
controller failure at site A, the controller at site B should still be able to service
I/O.
As discussed in the previous section this type of configuration is known as
"Uniform Access" since all I/O will be serviced uniformly by the exact same
controller for any given storage volume, passing all I/O to and from the
same location. The diagram in Figure 10 below shows a typical example of
a uniform architecture.




                [Figure: an active/passive array with its controllers split across Site A
                (active) and Site B (passive), front-end fabrics stretched via ISL, and
                cache and backend mirrored over a proprietary or dedicated ISL]

                                 Figure 10 A typical uniform layout

As we can see in the above diagram, hosts at each site connect to both
controllers by way of the stretched fabric; however the active controller
(for any given LUN) is only at one of the sites (in this case site A).
While not as efficient (in bandwidth and latency) as VPLEX, under normal
operating conditions (i.e. where the active host is at the same location as
the active controller) this type of configuration functions satisfactorily.
However, this access pattern starts to become sub-optimal if the
active host is issuing I/O from the location where the passive
controller resides.
Figure 11 shows the numbered sequence of I/O flow for a host connected
to a uniform configuration at the local (i.e. active) site.
      [Figure: numbered write I/O flow (steps 1-5) for a host at Site A writing to
      the active controller at Site A, with cache and backend data mirrored
      synchronously to the passive controller and array at Site B]

            Figure 11 Uniform write I/O flow example at local site



The steps below correspond to the numbers in the diagram.
   1. I/O is generated by the host at site A and sent to the active controller in site
      A.

   2. The I/O is committed to local cache, and synchronously mirrored to remote
      cache over the WAN.

   3. The local/active controller’s backend now mirrors the I/O to the back end
      disks. It does this by committing a copy to the local array as well as sending
      another copy of the I/O across the WAN to the remote array.

   4. The acknowledgments from the back-end disks at both locations return to the
      owning (active) storage controller.

   5. Acknowledgement is received by the host and the I/O is complete.
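
To make the latency cost of this flow concrete, here is a minimal sketch that models only the WAN contribution of each step. It assumes (as the Table 3 example later in this paper does) an FC inter-site link with a 3 ms round-trip time and two protocol round trips per I/O crossing; local hops are treated as negligible. These are illustrative assumptions, not measured values.

    RTT_MS = 3.0          # assumed WAN round-trip time
    TRIPS_PER_IO = 2      # FC write = 2 round trips without fast-write acceleration

    def wan_crossing_ms() -> float:
        """Added latency for one synchronous I/O crossing of the WAN."""
        return RTT_MS * TRIPS_PER_IO

    def uniform_local_write_ms(sync_disk_mirror: bool = True) -> float:
        """Host writes at the active-controller site (steps 1-5 above)."""
        latency = 0.0
        # step 1: host -> local active controller (local hop, ~0 ms)
        latency += wan_crossing_ms()       # step 2: cache mirrored to the remote cache
        if sync_disk_mirror:
            latency += wan_crossing_ms()   # step 3: back-end mirror to the remote array
        # steps 4-5: acknowledgements ride back inside the round trips counted above
        return latency

    print(uniform_local_write_ms())        # 12.0 ms added response time (sync mirror)
    print(uniform_local_write_ms(False))   #  6.0 ms if the disk mirror is asynchronous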

Now, let's look at a write I/O initiated from the ESXi host at location B where
the controller for the LUN receiving I/O resides at site A.
The concern here is that each write issued at the passive site B must first
traverse the link to the active controller at site A. Before that controller
can acknowledge the write back to the host at site B, the storage system has
to synchronously mirror the I/O back to the controller at site B (both cache
and disk), incurring further round trips across the WAN. This ultimately
increases the response time (i.e. negatively impacts performance) and
bandwidth utilization.
The numbered sequence in Figure 12 shows a typical I/O flow of a host
connected to a uniform configuration at the remote (i.e. passive) site.




Figure 12 Uniform write I/O flow example at remote site (the host at Site B sends I/O across the stretched ISL to the active controller at Site A; cache and back end are then mirrored synchronously back to Site B)


The following steps correspond to the numbers in the diagram.
   1. I/O is generated by the host at site B and sent across the ISL to the
      active controller at site A.
   2. The I/O is received by the active controller at site A from the ISL.
   3. The I/O is committed to local cache, mirrored to the remote cache over
      the WAN, and acknowledged back to the active controller at site A.
   4. The active controller's back end now mirrors the I/O to the back-end
      disks at both locations. It does this by committing a copy to the local
      array as well as sending another copy of the I/O across the WAN to the
      remote array (this step may sometimes be asynchronous).
   5. Both write acknowledgments are returned to the active controller (the
      remote array's acknowledgment travels back across the WAN).
   6. The acknowledgment is sent back to the host across the ISL and the I/O
      is complete.


Clearly, if a uniform-access device backs a VMware datastore with ESXi hosts
at both locations, I/O can be generated at both locations simultaneously (for
example, if a VM is vMotioned to the remote location while at least one VM in
the same datastore remains online at the original location). Therefore, in a
uniform deployment, I/O response time at the passive location will always be
worse (perhaps significantly) than I/O response time at the active location.
Additionally, an I/O at the passive site can consume up to three times the
bandwidth of an I/O at the active controller site, due to the need to mirror
the disk and cache back across the WAN as well as send the I/O across the ISL
in the first place.
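
As a rough illustration of these two effects, the sketch below extends the same assumed model (3 ms RTT, two round trips per crossing) to a 128 KB write issued at the passive site B; the figures it prints line up with the Site B columns of Tables 3 and 4 later in this paper.

    RTT_MS, TRIPS_PER_IO, IO_KB = 3.0, 2, 128

    def crossing_ms():
        # one synchronous I/O crossing of the stretched fabric / WAN
        return RTT_MS * TRIPS_PER_IO

    def uniform_remote_write():
        latency_ms = crossing_ms()    # steps 1-2: host at B -> active controller at A (ISL)
        latency_ms += crossing_ms()   # step 3: cache mirrored back to site B
        latency_ms += crossing_ms()   # step 4: back-end mirror back to the site B array
        wan_kb = IO_KB * 3            # the payload crosses the link three times in total
        return latency_ms, wan_kb

    print(uniform_remote_write())     # (18.0 ms added response time, 384 KB of WAN traffic)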

Non-Uniform Access (VPLEX IO access pattern)
While VPLEX can be configured to provide uniform access, the typical
VPLEX Metro deployment uses non-uniform access. VPLEX was built from
the ground up for extremely efficient non-uniform access. This means it
has a different hardware and cache architecture relative to uniform
access solutions and, contrary to what you might have already read about
non-uniform access clusters, provides significant advantages over uniform
access for several reasons:
   1. All controllers in a VPLEX distributed cluster are fully active. Therefore
      if an I/O is initiated at site A, the write will happen to the director in
      site A directly and be mirrored to B before the acknowledgement is
      given. This ensures minimal (up to 3x better compared to uniform
      access) response time and bandwidth regardless of where the
      workload is running.
   2. A cross-connection where hosts at site A connect to the storage
      controllers at site B is not a mandatory requirement (unless using
      VMware FT). Additionally, with VPLEX if a cross-connect is deployed,
      it is only used as a last resort in the unlikely event that a full VPLEX
      cluster has been lost (this would be deemed a double failure since a
      single VPLEX cluster has no SPOFs) or the WAN has failed/been
      partitioned.
   3. Non-uniform access uses less bandwidth and gives better response
      times when compared to uniform access since under normal
      conditions all I/O is handled by the local active controller (all
      controllers are active) and sent across to the remote site only once.
      It is important to note that read and write I/O is serviced locally
      within VPLEX Metro.
   4. Interestingly, due to the active/active nature of VPLEX, should a full
      site outage occur VPLEX does not need to perform a failover since
      the remaining copy of the data was already active. This is another
      key difference when compared to uniform access since if the
      primary active node is lost a failover to the passive node is required.
The diagram below shows a high-level architecture of VPLEX when
distributed over a Metro distance:




Figure 13 VPLEX non-uniform access layout (two fully active VPLEX clusters with distributed cache, one per site, connected by IP or FC; each host and back-end array connects only to its local cluster)


As we can see in Figure 13, each host connects only to its local VPLEX
cluster, ensuring that I/O from either location is always serviced by the
local storage controllers. VPLEX can achieve this because all of the
controllers (at both sites) are active and able to service I/O.
Some other key differences to observe from the diagram are:
   1. Storage devices behind VPLEX are only connected to each
      respective local VPLEX cluster and are not connected across the
      WAN, dramatically simplifying fabric design.
   2. VPLEX has dedicated, redundant WAN ports that can be connected
      natively to either 10 Gb Ethernet or 8 Gb Fibre Channel.
   3. VPLEX has multiple active controllers in each location ensuring there
      are no local single points of failure. With up to eight controllers in
      each location, VPLEX provides N+1 redundancy.
   4. VPLEX uses and maintains single disk semantics across clusters at two
      different locations.
I/O flow is also very different, and more efficient, than with uniform
access, as the diagram below highlights.




Figure 14 High-level VPLEX non-uniform write I/O flow (steps 1-4: host write to a local director, inter-cluster communication, write-through to each local array, acknowledgment to the host)


The steps below correspond to the numbers in the Figure 14:
   1. Write I/O is generated by the host at either site and sent to one of
      the local VPLEX controllers (depending on path policy).
   2. The write I/O is duplicated and sent to the remote VPLEX cluster.
   3. Each VPLEX cluster now has a copy of the write I/O which is written
      through to the backend array at each location. Site A VPLEX does
      this for the array in site A, while site B VPLEX does this for the array in
      site B.
   4. Once the remote VPLEX cluster has acknowledged back to the local
      cluster, the acknowledgment is sent to the host and the I/O is
      complete.
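
Under the same illustrative assumptions used earlier (3 ms RTT, two round trips per crossing, 128 KB write), a minimal sketch of this non-uniform flow shows why the cost is identical wherever the write originates:

    RTT_MS, TRIPS_PER_IO, IO_KB = 3.0, 2, 128

    def vplex_nonuniform_write(originating_site):
        # step 1: host writes to a local (active) VPLEX director -- local hop, ~0 ms
        latency_ms = RTT_MS * TRIPS_PER_IO   # step 2: single crossing to the peer cluster
        wan_kb = IO_KB                       # the data payload crosses the WAN exactly once
        # step 3: each cluster writes through to its own local array (no WAN traffic)
        # step 4: host is acknowledged once the peer cluster has acknowledged
        return latency_ms, wan_kb

    print(vplex_nonuniform_write("A"))   # (6.0, 128)
    print(vplex_nonuniform_write("B"))   # (6.0, 128) -- symmetric, both clusters are active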


Note: Under some conditions, depending on the access pattern, VPLEX may
encounter what is known as a local write-miss condition. This does not
necessarily add another step, since the remote cache-page owner is
invalidated as part of the write-through caching activity. In effect, VPLEX
accomplishes several distinct tasks through a single cache-update messaging
step.


The table below shows a broad comparison of the expected increase in
response time (in milliseconds) for both uniform and non-uniform layouts when
using an FC link with a 3 ms round-trip time (and without any form of
external WAN acceleration / fast-write technology). These numbers represent
additional overhead compared to a local storage system of the same hardware,
since I/O now has to be sent across the link.

     (Based on 3 ms RTT and 2 round trips per I/O)       Site A             Site B
     Additional RT overhead (ms)                      read   write      read   write
     Full Uniform (sync mirror)                         0      12         6      18
     Full Uniform (async mirror)                        0       6         6      12
     Non-Uniform (owner hit)                            0       6*        0       6*
     * This is comparable to standard synchronous Active/Passive replication

     Key (cell shading): Optimal; Acceptable, but not efficient; Sub-optimal

            Table 3 Uniform vs. non-uniform response time increase



Note: Table 3 only shows the expected additional latency of the I/O on the
WAN and does not include any other overheads, such as data propagation delay
or additional machine time at either location for remote-copy processing.
Your mileage will vary.


As we can see in Table 3, topologies that use a uniform access pattern and a
synchronous disk mirror can add significantly more time to each I/O,
increasing the response time by as much as 3x compared to non-uniform.


Note: VPLEX Metro environments can also be configured using native IP
connectivity between sites. This type of topology carries further
response-time efficiencies, since each I/O across the WAN typically incurs
only a single round trip.
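
The following small calculator, offered purely as an illustration of the arithmetic behind Table 3, reproduces those figures from the stated assumptions (3 ms RTT, two round trips per I/O over FC); setting trips=1 models the native-IP case mentioned in the note above.

    def added_latency_ms(topology, site, op, rtt_ms=3.0, trips=2):
        crossing = rtt_ms * trips
        if topology == "uniform-sync":
            crossings = {"A": {"read": 0, "write": 2},   # cache mirror + disk mirror
                         "B": {"read": 1, "write": 3}}[site][op]
        elif topology == "uniform-async":
            crossings = {"A": {"read": 0, "write": 1},   # cache mirror only
                         "B": {"read": 1, "write": 2}}[site][op]
        else:                                            # "non-uniform"
            crossings = 1 if op == "write" else 0        # reads are always serviced locally
        return crossing * crossings

    for topo in ("uniform-sync", "uniform-async", "non-uniform"):
        row = [added_latency_ms(topo, s, o) for s in ("A", "B") for o in ("read", "write")]
        print(topo, row)   # [0, 12, 6, 18], [0, 6, 6, 12], [0, 6, 0, 6]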


Another factor to consider when comparing the two topologies is the amount
of WAN bandwidth used. The table below compares a full uniform topology with
a VPLEX non-uniform topology for bandwidth utilization. The example I/O size
is 128 KB and the results are also shown in KB.




                                                         Site A             Site B
     WAN bandwidth used for a 128 KB I/O              read   write      read   write
     Full Uniform (sync or async mirror)                0     256        128    384
     Non-Uniform                                        0     128*        0     128*
     * This is comparable to standard synchronous Active/Passive replication

     Key (cell shading): Optimal; Acceptable, but not efficient; Sub-optimal

               Table 4 Uniform vs. non-uniform bandwidth usage

As one can see from Table 4, non-uniform access always performs local reads
and only has to send the data payload across the WAN once for a write I/O,
regardless of where the data was written. This is in stark contrast to a
uniform topology, especially if the write occurs at the site with the passive
controller: the data must first be sent across the WAN (ISL) to the active
controller, which then mirrors the cache page (synchronously over the WAN
again) and also mirrors the underlying storage back over the WAN, giving an
overall 3x increase in WAN traffic compared to non-uniform.
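
A companion sketch (same illustrative assumptions, 128 KB I/O) reproduces the Table 4 bandwidth figures and makes the 3x write amplification at the passive site explicit:

    def wan_kb(topology, site, op, io_kb=128):
        if op == "read":
            # uniform access must fetch remote-site reads across the ISL;
            # non-uniform access always reads from the local VPLEX cluster
            return io_kb if (topology == "uniform" and site == "B") else 0
        if topology == "non-uniform":
            return io_kb                           # payload crosses the WAN once
        return io_kb * (2 if site == "A" else 3)   # cache + disk mirror (+ ISL hop at B)

    for topo in ("uniform", "non-uniform"):
        row = [wan_kb(topo, s, o) for s in ("A", "B") for o in ("read", "write")]
        print(topo, row)   # uniform: [0, 256, 128, 384]   non-uniform: [0, 128, 0, 128]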


VPLEX with cross-connect and non-uniform mode
A VPLEX Metro cross-cluster connect configuration (supported up to 1 ms
round-trip time) is sometimes referred to as "VPLEX in uniform mode", since
each ESXi host is now connected to both the local and the remote VPLEX
clusters.
While on the surface this looks similar to uniform mode, it still typically
functions in a non-uniform manner. This is because, under the covers, all
VPLEX directors remain active and able to serve data locally, maintaining the
efficiencies of the VPLEX cache-coherent architecture. Additionally, when
using cross-connected clusters, it is recommended to configure the ESXi
servers so that the cross-connected paths are standby paths only.
Therefore even with a VPLEX cross-connected configuration, I/O flow is still
locally serviced from each local VPLEX cluster and does not traverse the
link.
The diagram below shows an example of this:




Figure 15 High-level VPLEX cross-connect with non-uniform I/O access (each ESXi host also has standby paths to the remote VPLEX cluster; the underlying VPLEX layout remains identical to Figure 13)


In Figure 15, each ESXi host now has an alternate path to the remote VPLEX
cluster. Compared to the typical uniform diagram in the previous section,
however, we can still see that the underlying VPLEX architecture differs
significantly since it remains identical to the non-uniform layout, servicing
I/O locally at either location.
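
The behaviour described above can be summarised with a small path-selection sketch. The path names and states here are hypothetical, not ESXi's actual multipathing implementation; it simply models the recommendation that cross-connected paths stay in standby and are used only when every local path is gone.

    from dataclasses import dataclass

    @dataclass
    class Path:
        target: str          # "local-vplex" or "remote-vplex" (cross-connect)
        alive: bool = True

    def active_paths(paths):
        local = [p for p in paths if p.target == "local-vplex" and p.alive]
        if local:
            return local     # normal case: I/O is serviced by the local VPLEX cluster
        # last resort: fail over to the standby cross-connect paths
        return [p for p in paths if p.target == "remote-vplex" and p.alive]

    paths = [Path("local-vplex"), Path("local-vplex"),
             Path("remote-vplex"), Path("remote-vplex")]
    print([p.target for p in active_paths(paths)])   # ['local-vplex', 'local-vplex']
    for p in paths[:2]:
        p.alive = False                              # e.g. the local VPLEX cluster is lost
    print([p.target for p in active_paths(paths)])   # ['remote-vplex', 'remote-vplex']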

VPLEX with cross-connect and forced uniform mode

Although VPLEX functions primarily in a non-uniform model, there are
certain conditions where VPLEX can sustain a type of uniform access
mode. One such condition is if cross-connect is used and certain failures
occur causing the uniform mode to be forced.
One of the scenarios where this may occur is when VPLEX and the cross-
connect network are using physically separate channels and the VPLEX
clusters are partitioned while the cross-connect network remains in place.
The diagram below shows an example of this:




Figure 16 Forced uniform mode due to WAN partition (the inter-cluster link is partitioned while the cross-connect remains in place; the distributed cache at Site B is shown as passive/suspended)

As illustrated in Figure 16, VPLEX invokes the "site preference rule",
suspending access to a given distributed virtual volume at one of the
locations (in this case site B). This ultimately means that I/O at site B has
to traverse the cross-connect to site A, since the VPLEX controller paths at
site B are now suspended due to the preference rule.
Another scenario where this might occur is if one of the VPLEX clusters at
either location becomes isolated or destroyed. The diagram below shows
an example of a localized rack failure at site B which has taken the VPLEX
cluster offline at site B.




Figure 17 VPLEX forced uniform mode due to cluster failure (a localized rack failure takes the VPLEX cluster at Site B offline; hosts at Site B reach the surviving cluster at Site A over the cross-connect)

In this scenario the VPLEX cluster at site A remains online (guided by VPLEX
Witness) and any I/O at site B automatically accesses the VPLEX cluster at
site A over the cross-connect, thereby turning the standby paths into active
paths.
In summary, VPLEX can use ‘forced uniform’ mode as a failsafe to ensure
that the highest possible level of availability is maintained at all times.
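
The decision sketch below condenses the outcomes shown in Figures 16 and 17, assuming a per-volume preference rule favouring site A and guidance from VPLEX Witness; it illustrates the behaviour discussed above rather than VPLEX's actual internal algorithm.

    def volume_state(site, wan_up, peer_cluster_alive, preferred_site="A"):
        if wan_up and peer_cluster_alive:
            return "active (normal non-uniform access)"
        if not peer_cluster_alive:
            # Figure 17: the peer cluster is destroyed; Witness keeps the survivor
            # online and remote hosts reach it over the cross-connect
            return "active (serves both sites via cross-connect)"
        # Figure 16: WAN partition with both clusters alive -- the preference rule
        # suspends the non-preferred site, whose hosts fall back to the cross-connect
        return ("active" if site == preferred_site
                else "suspended (hosts use cross-connect to the preferred site)")

    print(volume_state("B", wan_up=False, peer_cluster_alive=True))    # suspended at B
    print(volume_state("A", wan_up=False, peer_cluster_alive=False))   # active at A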


Note: Cross-connected VPLEX clusters are only supported with distances
up to 1 ms round trip time.




Combining VPLEX HA with VMware HA and/or FT
Due to its core design, EMC VPLEX Metro provides the perfect foundation for
VMware Fault Tolerance and High Availability clustering over distance,
ensuring simple and transparent deployment of stretched clusters without
added complexity.

vSphere HA and VPLEX Metro HA (federated HA)
VPLEX Metro takes a single block storage device in one location and
"distributes" it to provide single-disk semantics across two locations. This
enables a "distributed" VMFS datastore to be created on that virtual
volume.
On top of this, if the layer 2 network has also been "stretched", then a
single vSphere instance (including a single logical datacenter) can also be
"distributed" across more than one location, with HA enabled for any given
vSphere cluster. This is possible because the storage federation layer of
VPLEX is completely transparent to ESXi, which allows the user to add ESXi
hosts at two different locations to the same HA cluster.
Stretching a HA failover cluster (such as VMware HA) with VPLEX creates a
“Federated HA” cluster over distance. This blurs the boundaries between
local HA and disaster recovery since the configuration has the automatic
restart capabilities of HA combined with the geographical distance
typically associated with synchronous DR.




Figure 18 VPLEX Metro HA with vSphere HA (a distributed ESXi HA cluster spanning Site A and Site B, a VPLEX cluster with heterogeneous storage behind it at each site connected across the WAN, and VPLEX Witness at a third location reachable over IP)

For detailed technical setup instructions please see the VPLEX Procedure
Generator (Configuring a distributed volume) as well as the "VMware vSphere®
Metro Storage Cluster Case Study" white paper found here:
http://www.vmware.com/files/pdf/techpaper/vSPHR-CS-MTRO-STOR-CLSTR-USLET-102-HI-RES.pdf
for additional information around:
   •   Setting up Persistent Device Loss (PDL) handling
   •   vCenter placement options and considerations
   •   DRS enablement and affinity rules
   •   Controlling restart priorities (High/Medium/Low)


Use Cases for federated HA
A federated HA solution is an ideal fit if a customer has two datacenters
that are no more than 5ms (round trip latency) apart and wants to enable
an active/active datacenter design whilst also significantly enhancing
availability.
This type of solution delivers several key business continuity capabilities,
including downtime and disaster avoidance as well as fully automatic service
restart in the event of a total site outage. Such a configuration also needs
to be deployed with a stretched layer 2 network to ensure seamless operation
regardless of the location in which a VM runs.

Datacenter pooling using DRS with federated HA
A nice feature of the federated HA solution is that VMware DRS (Distributed
Resource Scheduler) can be enabled and functions relatively transparently
within the stretched cluster.
Using DRS effectively means that the vCenter/ESXi server load can be
distributed across two separate locations, driving up utilization. With DRS
enabled, the configuration can be considered as two physical datacenters
acting as a single logical datacenter. This has significant benefits, since
it brings what were once passive assets at a remote location into a fully
active state.
To enable this functionality DRS can simply be switched on within the
stretched cluster and configured by the user to the desired automation
level. Depending on the setting, VMs will then automatically start to
distribute between the datacenters (Please read
http://www.vmware.com/files/pdf/techpaper/vSPHR-CS-MTRO-STOR-
CLSTR-USLET-102-HI-RES.pdf for more details).




Note: If DRS is desired within a solution, a key design consideration is to
ensure that each location has enough compute and network resources to carry
the full load of the business services should either site fail.



Avoiding downtime and disasters using federated HA and vMotion
Another nice feature of a federated HA solution with vSphere is the ability
to avoid planned downtime as well as unplanned downtime. This is
achievable using the vMotion ability of vCenter to move a running VM (or
group of VMs) to any ESXi server in another (physical) datacenter. Since
the vMotion ability is now federated over distance, planned downtime
can be avoided for events that affect an entire datacenter location.
For instance, let's say that we needed to perform a power upgrade at
datacenter A which will result in the power being offline for 2 hours.
Downtime can be avoided since all running VMs at site A can be moved
to site B before the outage. Once the outage has ended, the VMs can be
moved back to site A using vMotion while keeping everything completely
online.
This use case can also be employed for anticipated, yet unplanned, events.
For instance, if a hurricane is in close proximity to one of your
datacenters, this solution provides the ability to move the VMs elsewhere,
avoiding any potential disaster.


Note: During a planned event where power will be taken offline, it is best to
engage EMC support to bring the VPLEX down gracefully. However, where time
does not permit this (perhaps a hurricane), it may not be possible to involve
EMC support. In this case, even if site A were destroyed there would still be
no interruption, assuming the VMs were vMotioned ahead of time, since VPLEX
Witness ensures that the surviving site keeps full access to the distributed
volume once site A has been powered off. Please see "Failure scenarios and
recovery using federated HA" below for more details.




Failure scenarios and recovery using federated HA
This section addresses all of the different types of failures and shows how in
each case VMware HA is able to continue or restart operations ensuring
maximum uptime.
The configuration below is a representation of a typical federated HA
solution:



Figure 19 Typical VPLEX federated HA layout (multi-node stretched vSphere cluster with DRS and HA spanning Site A and Site B, optional cross-connect, VPLEX clusters linked over the WAN, and VPLEX Witness at a third site reachable over IP)

The table below shows the different failure scenarios and the outcome:

Failure                    VMs at site A              VMs at site B           Notes
Storage failure at         Remain online /            Remain online /         Cache read misses at site A now incur
site A                     uninterrupted              uninterrupted           additional link latency; cache read hits
                                                                              and write I/O response times remain the same
Storage failure at         Remain online /            Remain online /         Cache read misses at site B now incur
site B                     uninterrupted              uninterrupted           additional link latency; cache read hits
                                                                              and write I/O response times remain the same
VPLEX Witness failure      Remain online /            Remain online /         Both VPLEX clusters dial home
                           uninterrupted              uninterrupted
All ESXi hosts fail at A   All VMs are restarted      Remain online /         Once the ESXi hosts are recovered, DRS
                           automatically on
Introduction to the EMC VNX Series VNX5100, VNX5300, VNX5500, VNX5700, and VN...Introduction to the EMC VNX Series VNX5100, VNX5300, VNX5500, VNX5700, and VN...
Introduction to the EMC VNX Series VNX5100, VNX5300, VNX5500, VNX5700, and VN...
 
Reference Architecture: EMC Hybrid Cloud with VMware
Reference Architecture: EMC Hybrid Cloud with VMwareReference Architecture: EMC Hybrid Cloud with VMware
Reference Architecture: EMC Hybrid Cloud with VMware
 
Techbook : Using EMC Symmetrix Storage in VMware vSphere Environments
Techbook : Using EMC Symmetrix Storage in VMware vSphere Environments   Techbook : Using EMC Symmetrix Storage in VMware vSphere Environments
Techbook : Using EMC Symmetrix Storage in VMware vSphere Environments
 
Reference Architecture: EMC Infrastructure for VMware View 5.1 EMC VNX Series...
Reference Architecture: EMC Infrastructure for VMware View 5.1 EMC VNX Series...Reference Architecture: EMC Infrastructure for VMware View 5.1 EMC VNX Series...
Reference Architecture: EMC Infrastructure for VMware View 5.1 EMC VNX Series...
 
H13531.1 eehc-federation-sddc-ra
H13531.1 eehc-federation-sddc-raH13531.1 eehc-federation-sddc-ra
H13531.1 eehc-federation-sddc-ra
 
Cloud Foundry Platform as a Service on Vblock System
Cloud Foundry Platform as a Service on Vblock SystemCloud Foundry Platform as a Service on Vblock System
Cloud Foundry Platform as a Service on Vblock System
 
Backup and Recovery Solution for VMware vSphere on EMC Isilon Storage
Backup and Recovery Solution for VMware vSphere on EMC Isilon Storage Backup and Recovery Solution for VMware vSphere on EMC Isilon Storage
Backup and Recovery Solution for VMware vSphere on EMC Isilon Storage
 
Business Continuity and Disaster Recovery for Oracle11g Enabled by EMC Symmet...
Business Continuity and Disaster Recovery for Oracle11g Enabled by EMC Symmet...Business Continuity and Disaster Recovery for Oracle11g Enabled by EMC Symmet...
Business Continuity and Disaster Recovery for Oracle11g Enabled by EMC Symmet...
 
Using EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere EnvironmentsUsing EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere Environments
 
Using EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBookUsing EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBook
 
White Paper: EMC Backup and Recovery for Microsoft Exchange and SharePoint 20...
White Paper: EMC Backup and Recovery for Microsoft Exchange and SharePoint 20...White Paper: EMC Backup and Recovery for Microsoft Exchange and SharePoint 20...
White Paper: EMC Backup and Recovery for Microsoft Exchange and SharePoint 20...
 
White Paper: EMC Compute-as-a-Service
White Paper: EMC Compute-as-a-Service   White Paper: EMC Compute-as-a-Service
White Paper: EMC Compute-as-a-Service
 
TechBook: Using EMC Symmetrix Storage in VMware vSphere Environments
TechBook: Using EMC Symmetrix Storage in VMware vSphere Environments  TechBook: Using EMC Symmetrix Storage in VMware vSphere Environments
TechBook: Using EMC Symmetrix Storage in VMware vSphere Environments
 
Whitepaper : Oracle Real Application Clusters (RAC) on Extended Distance Clus...
Whitepaper : Oracle Real Application Clusters (RAC) on Extended Distance Clus...Whitepaper : Oracle Real Application Clusters (RAC) on Extended Distance Clus...
Whitepaper : Oracle Real Application Clusters (RAC) on Extended Distance Clus...
 
IBM Storwize 7000 Unified, SONAS, and VMware Site Recovery Manager: An overvi...
IBM Storwize 7000 Unified, SONAS, and VMware Site Recovery Manager: An overvi...IBM Storwize 7000 Unified, SONAS, and VMware Site Recovery Manager: An overvi...
IBM Storwize 7000 Unified, SONAS, and VMware Site Recovery Manager: An overvi...
 
EMC Hybrid Cloud Solution with VMware: Hadoop Applications Solution Guide 2.5
EMC Hybrid Cloud Solution with VMware: Hadoop Applications Solution Guide 2.5EMC Hybrid Cloud Solution with VMware: Hadoop Applications Solution Guide 2.5
EMC Hybrid Cloud Solution with VMware: Hadoop Applications Solution Guide 2.5
 
consolidating and protecting virtualized enterprise environments with Dell EM...
consolidating and protecting virtualized enterprise environments with Dell EM...consolidating and protecting virtualized enterprise environments with Dell EM...
consolidating and protecting virtualized enterprise environments with Dell EM...
 
White Paper: EMC Compute-as-a-Service — EMC Ionix IT Orchestrator, VCE Vblock...
White Paper: EMC Compute-as-a-Service — EMC Ionix IT Orchestrator, VCE Vblock...White Paper: EMC Compute-as-a-Service — EMC Ionix IT Orchestrator, VCE Vblock...
White Paper: EMC Compute-as-a-Service — EMC Ionix IT Orchestrator, VCE Vblock...
 
Cs 7.2 distributed
Cs 7.2 distributedCs 7.2 distributed
Cs 7.2 distributed
 
H4160 emc solutions for oracle database
H4160 emc solutions for oracle databaseH4160 emc solutions for oracle database
H4160 emc solutions for oracle database
 

Plus de EMC

INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUDINDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUDEMC
 
Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote EMC
 
EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC
 
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOTransforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOEMC
 
Citrix ready-webinar-xtremio
Citrix ready-webinar-xtremioCitrix ready-webinar-xtremio
Citrix ready-webinar-xtremioEMC
 
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC
 
EMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC
 
Modern infrastructure for business data lake
Modern infrastructure for business data lakeModern infrastructure for business data lake
Modern infrastructure for business data lakeEMC
 
Force Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereForce Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereEMC
 
Pivotal : Moments in Container History
Pivotal : Moments in Container History Pivotal : Moments in Container History
Pivotal : Moments in Container History EMC
 
Data Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewData Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewEMC
 
Mobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeMobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeEMC
 
Virtualization Myths Infographic
Virtualization Myths Infographic Virtualization Myths Infographic
Virtualization Myths Infographic EMC
 
Intelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityIntelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityEMC
 
The Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeThe Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeEMC
 
EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC
 
EMC Academic Summit 2015
EMC Academic Summit 2015EMC Academic Summit 2015
EMC Academic Summit 2015EMC
 
Data Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesData Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesEMC
 
2014 Cybercrime Roundup: The Year of the POS Breach
2014 Cybercrime Roundup: The Year of the POS Breach2014 Cybercrime Roundup: The Year of the POS Breach
2014 Cybercrime Roundup: The Year of the POS BreachEMC
 
EMC Isilon Best Practices for Hadoop Data Storage
EMC Isilon Best Practices for Hadoop Data StorageEMC Isilon Best Practices for Hadoop Data Storage
EMC Isilon Best Practices for Hadoop Data StorageEMC
 

Plus de EMC (20)

INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUDINDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
 
Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote
 
EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX
 
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOTransforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
 
Citrix ready-webinar-xtremio
Citrix ready-webinar-xtremioCitrix ready-webinar-xtremio
Citrix ready-webinar-xtremio
 
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
 
EMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC with Mirantis Openstack
EMC with Mirantis Openstack
 
Modern infrastructure for business data lake
Modern infrastructure for business data lakeModern infrastructure for business data lake
Modern infrastructure for business data lake
 
Force Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereForce Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop Elsewhere
 
Pivotal : Moments in Container History
Pivotal : Moments in Container History Pivotal : Moments in Container History
Pivotal : Moments in Container History
 
Data Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewData Lake Protection - A Technical Review
Data Lake Protection - A Technical Review
 
Mobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeMobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or Foe
 
Virtualization Myths Infographic
Virtualization Myths Infographic Virtualization Myths Infographic
Virtualization Myths Infographic
 
Intelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityIntelligence-Driven GRC for Security
Intelligence-Driven GRC for Security
 
The Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeThe Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure Age
 
EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015
 
EMC Academic Summit 2015
EMC Academic Summit 2015EMC Academic Summit 2015
EMC Academic Summit 2015
 
Data Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesData Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education Services
 
2014 Cybercrime Roundup: The Year of the POS Breach
2014 Cybercrime Roundup: The Year of the POS Breach2014 Cybercrime Roundup: The Year of the POS Breach
2014 Cybercrime Roundup: The Year of the POS Breach
 
EMC Isilon Best Practices for Hadoop Data Storage
EMC Isilon Best Practices for Hadoop Data StorageEMC Isilon Best Practices for Hadoop Data Storage
EMC Isilon Best Practices for Hadoop Data Storage
 

Dernier

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 

Dernier (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 

White Paper: Using VPLEX Metro with VMware High Availability and Fault Tolerance for Ultimate Availability

Executive summary

The EMC® VPLEX™ family removes physical barriers within, across, and between datacenters. VPLEX Local provides simplified management and non-disruptive data mobility for heterogeneous arrays. VPLEX Metro and Geo provide data access and mobility between two VPLEX clusters within synchronous and asynchronous distances respectively. With a unique scale-out architecture, VPLEX's advanced data caching and distributed cache coherency provide workload resiliency, automatic sharing, balancing and failover of storage domains, and enable both local and remote data access with predictable service levels.

VMware vSphere makes it simpler and less expensive to provide higher levels of availability for important applications. With vSphere, organizations can easily increase the baseline level of availability provided for all applications, as well as provide higher levels of availability more easily and cost-effectively. vSphere makes it possible to reduce both planned and unplanned downtime. The revolutionary VMware vMotion™ (vMotion) capabilities in vSphere make it possible to perform planned maintenance with zero application downtime.

VMware High Availability (HA), a feature of vSphere, reduces unplanned downtime by leveraging multiple VMware ESX® and VMware ESXi™ hosts configured as a cluster to provide automatic recovery from outages as well as cost-effective high availability for applications running in virtual machines.

VMware Fault Tolerance (FT) leverages the well-known encapsulation properties of virtualization by building fault tolerance directly into the ESXi hypervisor in order to deliver hardware-style fault tolerance to virtual machines. Guest operating systems and applications do not require modifications or reconfiguration. In fact, they remain unaware of the protection transparently delivered by ESXi and the underlying architecture.

By leveraging distance, VPLEX Metro builds on the strengths of VMware FT and HA to provide solutions that go beyond traditional "Disaster Recovery". These solutions provide a new type of deployment which achieves the absolute highest levels of continuous availability over distance for today's enterprise storage and cloud environments. When using such technologies, it is now possible to provide a solution that has both a zero Recovery Point Objective (RPO) and a zero "storage" Recovery Time Objective (RTO) (and a zero "application" RTO when using VMware FT).

This white paper is designed to give technology decision-makers a deeper understanding of VPLEX Metro in conjunction with VMware Fault Tolerance
and/or High Availability, discussing design, features, functionality and benefits. This paper also highlights the key technical considerations for implementing VMware Fault Tolerance and/or High Availability with VPLEX Metro technology to achieve "Federated Availability" over distance.

Audience

This white paper is intended for technology architects, storage administrators and EMC professional services partners who are responsible for architecting, creating, managing and using IT environments that utilize EMC VPLEX and VMware Fault Tolerance and/or High Availability technologies (FT and HA respectively). The white paper assumes that the reader is familiar with EMC VPLEX and VMware technologies and concepts.

Document scope and limitations

This document applies to EMC VPLEX Metro configured with VPLEX Witness. The details provided in this white paper are based on the following configurations:

• VPLEX GeoSynchrony 5.1 (patch 2) or higher
• VPLEX Metro HA only (VPLEX Local and VPLEX Geo are not supported with FT or HA in a stretched configuration)
• VPLEX clusters are within 5 milliseconds (ms) round trip time (RTT) of each other for VMware HA
• VPLEX clusters are within 1 millisecond (ms) round trip time (RTT) of each other for VMware FT
• Cross-connected configurations can optionally be deployed for VMware HA solutions (not mandatory)
• For VMware FT configurations, VPLEX cross cluster connect is in place (mandatory requirement)
• VPLEX Witness is deployed to a third failure domain (mandatory). The Witness functionality is required for VPLEX Metro to become a true active/active continuously available storage cluster.
• ESXi and vSphere 5.0 Update 1 or later are used
• Any qualified pair of arrays (both EMC and non-EMC) listed on the EMC Simple Support Matrix (ESSM) found here: https://elabnavigator.emc.com/vault/pdf/EMC_VPLEX.pdf
• The configuration is in full compliance with the VPLEX best practices found here: http://powerlink.emc.com/km/live1/en_US/Offering_Technical/Technical_Documentation/h7139-implementation-planning-vplex-tn.pdf

Please consult with your local EMC Support representative if you are uncertain as to the applicability of these requirements.

Note: While out of scope for this document, it should be noted that in addition to all best practices within this paper, all federated FT and HA solutions carry the same best practices and limitations imposed by the VMware HA and FT technologies themselves. For instance, at the time of writing VMware FT is only capable of supporting a single vCPU per VM (VMware HA does not carry the same vCPU limitation), and this limitation prevails when federating a VMware FT cluster. Please review the VMware best practice documentation as well as the limitations and considerations documentation (see the References section) for further information.
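The latency limits in the scope list above lend themselves to a simple pre-deployment sanity check. The following Python sketch is purely illustrative: the threshold values are taken from the scope list, while the measured round trip times are assumed inputs (for example, gathered from inter-site ping statistics); nothing here is queried from VPLEX or vCenter.

# Illustrative sketch: validate measured round-trip times (RTT) against the
# latency limits listed in the scope section. Measured values are assumed inputs.

HA_MAX_RTT_MS = 5.0          # VPLEX Metro inter-cluster RTT limit for federated HA
FT_MAX_RTT_MS = 1.0          # VPLEX Metro inter-cluster RTT limit for federated FT
WITNESS_MAX_RTT_MS = 1000.0  # VPLEX Witness to either cluster (1 second)


def check_federation_latency(inter_cluster_rtt_ms: float,
                             witness_rtt_ms: float,
                             use_ft: bool) -> list[str]:
    """Return a list of warnings; an empty list means the RTTs are within limits."""
    warnings = []
    limit = FT_MAX_RTT_MS if use_ft else HA_MAX_RTT_MS
    if inter_cluster_rtt_ms > limit:
        warnings.append(
            f"Inter-cluster RTT {inter_cluster_rtt_ms} ms exceeds the "
            f"{'FT' if use_ft else 'HA'} limit of {limit} ms")
    if witness_rtt_ms > WITNESS_MAX_RTT_MS:
        warnings.append(
            f"Witness RTT {witness_rtt_ms} ms exceeds the {WITNESS_MAX_RTT_MS} ms limit")
    return warnings


if __name__ == "__main__":
    # Example: 0.8 ms between sites, 40 ms to the Witness site, planning for federated FT
    for warning in check_federation_latency(0.8, 40.0, use_ft=True):
        print("WARNING:", warning)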
Introduction

Increasingly, customers wish to protect their business services from any event imaginable that could lead to downtime. Previously (i.e. prior to VPLEX), solutions to prevent downtime fell into two camps:

1. Highly available and fault tolerant systems within a datacenter
2. Disaster recovery solutions outside of a datacenter

The benefit of FT and HA solutions is that they provide automatic recovery in the event of a failure. However, the geographical protection range is limited to a single datacenter, so they do not protect business services from a datacenter failure. On the other hand, disaster recovery solutions typically protect business services using geographic dispersion, so that if a datacenter fails, recovery is achieved using another datacenter in a separate fault domain from the primary.

Some of the drawbacks of disaster recovery solutions, however, are that they are human-decision based (i.e. not automatic) and typically require a second, disruptive failback once the primary site is repaired. In other words, should a primary datacenter fail, the business would need to make a non-trivial decision to invoke disaster recovery. Since disaster recovery is decision-based (i.e. manually invoked), it can lead to extended outages, since the decision itself takes time and is generally made at the business level involving key stakeholders.

Because most site outages are caused by recoverable events (e.g. an extended power outage), some businesses faced with the "invoke DR" decision choose not to invoke DR and instead ride through the outage. This means that critical business IT services remain offline for the duration of the event. These scenarios are not uncommon in "disaster" situations, and non-invocation can be for various reasons. The two biggest ones are:

1. The primary site that failed can be recovered within 24-48 hours, therefore not warranting the complexity and risk of invoking DR.
2. Invoking DR will require a "failback" at some point in the future, which in turn will bring more disruption.

Other potential concerns with invoking disaster recovery include complexity, lack of testing, lack of resources, lack of skill sets and lengthy recovery time.

To avoid such pitfalls, VPLEX and VMware offer a more comprehensive answer to safeguarding your environments. By combining the benefits of HA and FT, a new category of availability is created. This new type of
category provides the automatic (non-decision based) benefits of FT and HA, but allows them to be leveraged over distance by using VPLEX Metro. This brings the geographical distance benefits normally associated with disaster recovery to the table, enhancing the HA and FT propositions significantly. The new category is known as "Federated Availability" and enables bullet-proof availability, which in turn significantly lessens the chance of downtime for both planned and unplanned events.
EMC VPLEX technology

VPLEX encapsulates traditional physical storage array devices and applies three layers of logical abstraction to them. The logical relationships of each layer are shown in Figure 1.

Extents are the mechanism VPLEX uses to divide storage volumes. Extents may be all or part of the underlying storage volume. EMC VPLEX aggregates extents and applies RAID protection in the device layer. Devices are constructed using one or more extents and can be combined into more complex RAID schemes and device structures as desired.

At the top layer of the VPLEX storage structures are virtual volumes. Virtual volumes are created from devices and inherit the size of the underlying device. Virtual volumes are the elements VPLEX exposes to hosts using its Front End (FE) ports. Access to virtual volumes is controlled using storage views. Storage views are comparable to Auto-provisioning Groups on EMC Symmetrix® or to storage groups on EMC VNX®. They act as logical containers determining host initiator access to VPLEX FE ports and virtual volumes.

Figure 1 EMC VPLEX Logical Storage Structures
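To make the layering just described more concrete, the following minimal Python sketch models the provisioning chain (storage volume, extent, device, virtual volume, storage view). The class and attribute names are illustrative only; they are not the VPLEX CLI or API object model, and the size calculation is a simplification that ignores RAID geometry.

# Minimal illustrative model of the VPLEX logical storage layers described above.
# Names are illustrative; they are not the actual VPLEX CLI/API object model.
from dataclasses import dataclass, field


@dataclass
class StorageVolume:           # LUN claimed from a back-end array
    name: str
    size_gb: int


@dataclass
class Extent:                  # all or part of a storage volume
    source: StorageVolume
    size_gb: int


@dataclass
class Device:                  # one or more extents; RAID protection is applied at this layer
    name: str
    extents: list[Extent]

    @property
    def size_gb(self) -> int:  # simplified: assumes concatenation; actual size depends on RAID geometry
        return sum(e.size_gb for e in self.extents)


@dataclass
class VirtualVolume:           # exposed to hosts via the VPLEX front-end (FE) ports
    device: Device

    @property
    def size_gb(self) -> int:  # a virtual volume inherits the size of its underlying device
        return self.device.size_gb


@dataclass
class StorageView:             # maps host initiators and FE ports to virtual volumes
    initiators: list[str]
    fe_ports: list[str]
    volumes: list[VirtualVolume] = field(default_factory=list)


# Example: one array LUN carved into a single extent, built into a device and exposed to a host
lun = StorageVolume("array_lun_01", 500)
ext = Extent(lun, 500)
dev = Device("dev_esx_datastore_01", [ext])
vvol = VirtualVolume(dev)
view = StorageView(["esx01_hba0"], ["A0-FC00"], [vvol])
print(vvol.size_gb)  # 500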
VPLEX terms and Glossary

VPLEX Virtual Volume: Unit of storage presented by the VPLEX front-end ports to hosts.

VPLEX Distributed Volume: A single unit of storage presented by the VPLEX front-end ports of both VPLEX clusters in a VPLEX Metro configuration separated by distance.

VPLEX Director: The central processing and intelligence of the VPLEX solution. There are redundant (A and B) directors in each VPLEX Engine.

VPLEX Engine: Consists of two directors and is the unit of scale for the VPLEX solution.

VPLEX cluster: A collection of VPLEX engines in one rack.

VPLEX Metro: The cooperation of two VPLEX clusters, each serving their own storage domain, over synchronous distance, forming active/active distributed volume(s).

VPLEX Metro HA: As per VPLEX Metro, but configured with VPLEX Witness to provide fully automatic recovery from the loss of any failure domain. This can also be thought of as an active/active continuously available storage cluster over distance.

Access Anywhere: The term used to describe a distributed volume using VPLEX Metro which has active/active characteristics.

Federation: The cooperation of storage elements at a peer level over distance, enabling mobility, availability and collaboration.

Automatic: No human intervention whatsoever (e.g. HA and FT).

Automated: No human intervention required once a decision has been made (e.g. disaster recovery with VMware's SRM technology).
EMC VPLEX architecture

EMC VPLEX represents the next-generation architecture for data mobility and information access. The new architecture is based on EMC's more than 20 years of expertise in designing, implementing, and perfecting enterprise-class intelligent cache and distributed data protection solutions.

As shown in Figure 2, VPLEX is a solution for virtualizing and federating both EMC and non-EMC storage systems together. VPLEX resides between servers and heterogeneous storage assets (abstracting the storage subsystem from the host) and introduces a new architecture with these unique characteristics:

• Scale-out clustering hardware, which lets customers start small and grow big with predictable service levels
• Advanced data caching, which utilizes large-scale SDRAM cache to improve performance and reduce I/O latency and array contention
• Distributed cache coherence for automatic sharing, balancing, and failover of I/O across the cluster
• A consistent view of one or more LUNs across VPLEX clusters separated either by a few feet within a datacenter or across synchronous distances, enabling new models of high availability and workload relocation

Figure 2 Capability of an EMC VPLEX Local system to abstract heterogeneous storage (physical host layer, virtual storage layer (VPLEX), physical storage layer)
EMC VPLEX Metro overview

VPLEX Metro brings mobility and access across two locations separated by an inter-site round trip time of up to 5 milliseconds (host application permitting). VPLEX Metro uses two VPLEX clusters (one at each location) and includes the unique capability to support synchronous distributed volumes that mirror data between the two clusters using write-through caching.

Since a VPLEX Metro distributed volume is under the control of the VPLEX Metro advanced cache coherency algorithms, active data I/O access to the distributed volume is possible at either VPLEX cluster. VPLEX Metro is therefore a truly active/active solution which goes far beyond traditional active/passive legacy replication solutions.

VPLEX Metro distributes the same block volume to more than one location and ensures standard HA cluster environments (e.g. VMware HA and FT) can simply leverage this capability, and can therefore be easily and transparently deployed over distance too. The key to this is to make the host cluster believe there is no distance between the nodes, so they behave identically to the way they would in a single datacenter. This is known as "dissolving distance" and is a key deliverable of VPLEX Metro.

The other piece to delivering truly active/active FT or HA environments is an active/active network topology whereby Layer 2 of the same network resides in each location, giving truly seamless datacenter pooling. Whilst Layer 2 network stretching is a pre-requisite for any FT or HA solution based on VPLEX Metro, it is outside the scope of this document. Throughout the remainder of this document it is assumed that there is a stretched Layer 2 network between the datacenters where a VPLEX Metro resides.

Note: For technology options for stretching a Layer 2 network over distance, please see further information on Cisco Overlay Transport Virtualization (OTV) found here http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/DCI/whitepaper/DCI_1.html and Brocade Virtual Private LAN Service (VPLS) found here http://www.brocade.com/downloads/documents/white_papers/Offering_Scalable_Layer2_Services_with_VPLS_and_VLL.pdf
Understanding VPLEX Metro active/active distributed volumes

Unlike traditional legacy replication, where access to a replicated volume is either in one location or another (i.e. an active/passive only paradigm), VPLEX distributes a virtual device over distance, which ultimately means host access is now possible in more than one location to the same (distributed) volume.

In engineering terms, the distributed volume that is presented from VPLEX Metro is said to have "single disk semantics", meaning that in every way (including failure) the disk will behave as one object, as any traditional block device would. This therefore means that all the rules associated with a single disk are fully applicable to a VPLEX Metro distributed volume. For instance, the following figure shows a single host accessing a single JBOD type volume:

Figure 3 Single host access to a single disk

Clearly the host in the diagram is the only host initiator accessing the single volume. The next figure shows a local two node cluster.

Figure 4 Multiple host access to a single disk

As shown in the diagram, there are now two hosts contending for the single volume. The dashed orange rectangle shows that each of the nodes is
required to be in a cluster or utilize a cluster file system so they can effectively coordinate locking to ensure the volume remains consistent. The next figure shows the same two node cluster, but now connected to a VPLEX distributed volume using VPLEX cache coherency technology.

Figure 5 Multiple host access to a VPLEX distributed volume

In this example there is no difference to the fundamental dynamics of the two node cluster access pattern to the single volume. Additionally, as far as the hosts are concerned, they cannot see any difference between this and the previous example, since VPLEX is distributing the device between datacenters via AccessAnywhere™ (which is a type of federation). This means that the hosts are still required to coordinate locking to ensure the volume remains consistent. For ESXi this mechanism is controlled by the cluster file system, Virtual Machine File System (VMFS), within each datastore. In this case each distributed volume presented by VPLEX will be formatted with the VMFS file system.

The figure below shows a high-level physical topology of a VPLEX Metro distributed device (Site A and Site B connected by the inter-cluster link, with AccessAnywhere™ spanning both).

Figure 6 Multiple host access to a VPLEX distributed volume

This figure is a physical representation of the logical configuration shown in Figure 5. Effectively, with this topology deployed, the distributed volume
can be treated just like any other volume, the only difference being that it is now distributed and available in two locations at the same time. Another benefit of this type of architecture is its extreme simplicity, since it is no more difficult to configure a cluster across distance than it is in a single datacenter.

Note: VPLEX Metro can use either 8 Gb/s Fibre Channel or native 10 Gigabit Ethernet WAN connectivity (shown as "LINK" in Figure 6). When using FC connectivity, this can be configured either with a dedicated channel (i.e. separate, non-merged fabrics) or ISL based (i.e. where fabrics have been merged across sites). It is assumed that any WAN link will have a second, physically redundant circuit.

Note: It is vital that VPLEX Metro has enough bandwidth between clusters to meet requirements. EMC can assist in the qualification of this through the Business Continuity Solution Designer (BCSD) tool. Please engage your EMC account team to perform a sizing exercise.

For further details on VPLEX Metro architecture, please see the VPLEX HA TechBook found here: http://www.emc.com/collateral/hardware/technical-documentation/h7113-vplex-architecture-deployment.pdf
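Because distributed volumes use write-through caching, every host write must be committed at both clusters before it is acknowledged, so the inter-cluster link must at minimum sustain the peak aggregate write throughput. The sketch below is a simplified back-of-the-envelope illustration of that sizing logic only; it is not a replacement for the BCSD tool, and the workload figure and the 20% protocol overhead factor are assumptions.

# Simplified back-of-the-envelope WAN sizing for VPLEX Metro write-through mirroring.
# Not a substitute for the BCSD sizing tool; the overhead factor and workload are assumptions.

def required_wan_mbps(peak_write_mb_per_s: float, protocol_overhead: float = 0.20) -> float:
    """Return the minimum WAN bandwidth (in Mbit/s) needed to sustain the peak
    aggregate write rate across all distributed volumes."""
    mbytes_per_s = peak_write_mb_per_s * (1 + protocol_overhead)  # add protocol/framing overhead
    return mbytes_per_s * 8                                        # convert MB/s to Mbit/s


# Example: 150 MB/s of peak writes across all distributed volumes
peak_writes = 150.0
print(f"Minimum WAN bandwidth: {required_wan_mbps(peak_writes):.0f} Mbit/s")
# -> 1440 Mbit/s, comfortably inside a single 8 Gb/s FC or 10 GbE circuit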
VPLEX Witness – An introduction

As mentioned previously, VPLEX Metro goes beyond the realms of legacy active/passive replication technologies since it can deliver true active/active storage over distance as well as federated availability. There are three main items that are required to deliver true "Federated Availability":

1. True active/active Fibre Channel block storage over distance
2. Synchronous mirroring to ensure both locations are in lock step with each other from a data perspective
3. External arbitration to ensure that automatic recovery is possible under all failure conditions

The previous sections discussed items 1 and 2; we will now look at external arbitration, which is enabled by VPLEX Witness.

VPLEX Witness is delivered as a zero-cost VMware virtual appliance (vApp) which runs on a customer-supplied ESXi server. The ESXi server resides in a physically separate failure domain from either VPLEX cluster and uses different storage from the VPLEX clusters. Using VPLEX Witness ensures that true Federated Availability can be delivered. This means that regardless of site or link/WAN failure, a copy of the data will automatically remain online in at least one of the locations.

When setting up a single distributed volume or a group of distributed volumes, the user chooses a "preference rule", which is a special property that each individual or group of distributed volumes has. It is the preference rule that determines the outcome after failure conditions such as site failure or link partition. The preference rule can be set to cluster A preferred, cluster B preferred, or no automatic winner. At a high level this has the following effect on a single or group of distributed volumes under the different failure conditions listed below:
• Cluster A preferred:
  - VPLEX cluster partition: Site A ONLINE, Site B SUSPENDED (good)
  - Site A fails: Site A FAILED, Site B SUSPENDED (bad, by design)
  - Site B fails: Site A ONLINE, Site B FAILED (good)
• Cluster B preferred:
  - VPLEX cluster partition: Site A SUSPENDED, Site B ONLINE (good)
  - Site A fails: Site A FAILED, Site B ONLINE (good)
  - Site B fails: Site A SUSPENDED, Site B FAILED (bad, by design)
• No automatic winner:
  - In all scenarios the distributed volume is SUSPENDED at the surviving cluster(s) (by design)

Table 1 Failure scenarios without VPLEX Witness

As we can see in Table 1 above, if we only used the preference rules without VPLEX Witness, then under some scenarios manual intervention would be required to bring the volume online at a given VPLEX cluster (e.g. if site A is the preferred site and site A fails, site B would also suspend). This is where VPLEX Witness assists, since it can better diagnose failures due to the network triangulation, and it ensures that at any time at least one of the VPLEX clusters has an active path to the data, as shown in the table below:

• Cluster A preferred:
  - VPLEX cluster partition: Site A ONLINE, Site B SUSPENDED (good)
  - Site A fails: Site A FAILED, Site B ONLINE (good)
  - Site B fails: Site A ONLINE, Site B FAILED (good)
• Cluster B preferred:
  - VPLEX cluster partition: Site A SUSPENDED, Site B ONLINE (good)
  - Site A fails: Site A FAILED, Site B ONLINE (good)
  - Site B fails: Site A ONLINE, Site B FAILED (good)
• No automatic winner:
  - In all scenarios the distributed volume is SUSPENDED at the surviving cluster(s) (by design)

Table 2 Failure scenarios with VPLEX Witness

As one can see from Table 2, VPLEX Witness converts a VPLEX Metro from an active/active mobility and collaboration solution into an active/active continuously available storage cluster. Furthermore, once VPLEX Witness is deployed, failure scenarios become self-managing (i.e. fully automatic), which makes the solution extremely simple since there is nothing to do regardless of the failure condition.
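The behaviour summarised in Tables 1 and 2 can be expressed as a small decision function. The sketch below is purely illustrative: it models the outcome at each site for a given preference rule, failure scenario and Witness presence, is not VPLEX code, and does not cover compound failures such as Witness loss followed by a partition.

# Illustrative model of the per-site outcome for a VPLEX Metro distributed volume,
# reproducing the behaviour summarised in Tables 1 and 2. Not VPLEX code.

def volume_state(preference: str, scenario: str, witness: bool) -> dict:
    """preference: 'A', 'B' or 'none'; scenario: 'partition', 'site_a_fails', 'site_b_fails'."""
    state = {"site_a": "SUSPENDED", "site_b": "SUSPENDED"}

    if scenario == "site_a_fails":
        state["site_a"] = "FAILED"
    elif scenario == "site_b_fails":
        state["site_b"] = "FAILED"

    if preference == "none":
        return state  # no automatic winner: surviving cluster(s) suspend (by design)

    if witness and scenario in ("site_a_fails", "site_b_fails"):
        # The Witness can tell which site actually failed and brings the survivor
        # online, regardless of the static preference rule.
        survivor = "site_b" if scenario == "site_a_fails" else "site_a"
        state[survivor] = "ONLINE"
    else:
        # Partition (or no Witness): the statically preferred cluster wins,
        # provided it is still running.
        preferred = "site_a" if preference == "A" else "site_b"
        if state[preferred] != "FAILED":
            state[preferred] = "ONLINE"
    return state


# Example from Table 1 versus Table 2: cluster A preferred, site A fails
print(volume_state("A", "site_a_fails", witness=False))  # {'site_a': 'FAILED', 'site_b': 'SUSPENDED'}
print(volume_state("A", "site_a_fails", witness=True))   # {'site_a': 'FAILED', 'site_b': 'ONLINE'}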
Figure 7 below shows the high-level topology of VPLEX Witness.

Figure 7 VPLEX configured for VPLEX Witness

As depicted in Figure 7, the Witness VM is deployed in a separate fault domain (as defined by the customer) and connected to both VPLEX management stations via an IP network.

Note: The fault domain is decided by the customer and can range from different racks in the same datacenter all the way up to VPLEX clusters 5 ms of distance away from each other (5 ms measured round trip time latency, or typical synchronous distance). The distance that VPLEX Witness can be placed from the two VPLEX clusters can be even greater; the current supported maximum round trip latency for this is 1 second.
Figure 8 below shows a more detailed connectivity diagram of VPLEX Witness.

Figure 8 Detailed VPLEX Witness network layout (the Witness must reside in a separate fault domain)

The witness network is physically separate from the VPLEX inter-cluster network, and the Witness also uses storage that is physically separate from either VPLEX cluster. As stated previously, it is critical to deploy VPLEX Witness into a third failure domain. The definition of this domain changes depending on where the VPLEX clusters are deployed. For instance, if the VPLEX Metro clusters are deployed in the same physical building, perhaps in different areas of the datacenter, then the failure domain would be the VPLEX rack itself; VPLEX Witness could therefore also be deployed in the same building but in a separate rack. If, however, each VPLEX cluster is deployed 50 miles apart in different buildings, then the failure domain is the physical building and/or town. In that scenario it makes sense to deploy VPLEX Witness in another town altogether; and since the maximum supported round trip latency is as much as one second, you could effectively pick any city in the world, especially given that the bandwidth requirement is as low as 3 Kb/sec.
For more in-depth VPLEX Witness architecture details please refer to the VPLEX HA TechBook, which can be found here:
http://www.emc.com/collateral/hardware/technical-documentation/h7113-vplex-architecture-deployment.pdf

Note: Always deploy VPLEX Witness in a third failure domain and ensure that all distributed volumes reside in a consistency group with the witness function enabled. Also ensure that the EMC Secure Remote Support (ESRS) Gateway is fully configured and that the Witness is able to alert if it fails for any reason (there is no impact to I/O if the Witness fails).

Protecting VPLEX Witness using VMware FT

Under normal operational conditions VPLEX Witness is not a vital component required to drive active/active I/O (i.e. if the Witness is disconnected or lost, I/O still continues). It does, however, become a crucial component for ensuring availability in the event of site loss at either of the locations where the VPLEX clusters reside. If the VPLEX Witness were lost and soon afterwards a catastrophic site failure occurred at a site containing a VPLEX cluster, the hosts at the remaining site would also lose access to the remaining VPLEX volumes, since the surviving VPLEX cluster would consider itself isolated while the VPLEX Witness is unavailable. To minimize this risk, it is considered best practice to disable the VPLEX Witness function if it has been lost and will remain offline for an extended period.

Another way to ensure availability is to minimize the risk of losing the VPLEX Witness in the first place by increasing the availability of the VPLEX Witness VM running in the third location. A way to significantly boost availability for this individual VM is to use VMware FT to protect VPLEX Witness at the third location. This ensures that the VPLEX Witness remains unaffected should a hardware failure occur on the ESXi server in the third failure domain that is hosting the VPLEX Witness VM. To deploy this functionality, enable HA clustering for the VPLEX Witness VM across two or more ESXi hosts (in the same location), and once this has been configured, right-click the VPLEX Witness VM and enable Fault Tolerance.
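The same FT-enablement step can also be scripted against vCenter. The sketch below uses pyVmomi (the open source Python SDK for the vSphere API) and the CreateSecondaryVM_Task method that the vSphere API exposes for turning on Fault Tolerance. The vCenter address, credentials and VM name are placeholders, and the exact FT method available can vary by vSphere release, so treat this as an illustrative outline rather than a supported procedure.

# Illustrative pyVmomi sketch: turn on VMware FT for the VPLEX Witness VM.
# Hostnames, credentials and the VM name below are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def find_vm_by_name(content, name):
    """Walk the inventory and return the first VM whose name matches."""
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    try:
        return next((vm for vm in view.view if vm.name == name), None)
    finally:
        view.Destroy()

ctx = ssl._create_unverified_context()          # lab only; use valid certificates in production
si = SmartConnect(host="vcenter.example.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    witness_vm = find_vm_by_name(content, "VPlex-Witness")
    if witness_vm is None:
        raise SystemExit("Witness VM not found")

    # CreateSecondaryVM_Task asks vSphere to create the FT secondary;
    # the optional host argument could pin the secondary to a specific ESXi host.
    task = witness_vm.CreateSecondaryVM_Task()
    print("FT enablement task submitted:", task.info.key)
finally:
    Disconnect(si)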
Note: At the time of writing, the FT configuration for VPLEX Witness is supported only within one location, not as a stretched/federated FT configuration. The storage that the VPLEX Witness uses should be physically contained within the boundaries of the third failure domain on local (i.e. not VPLEX Metro distributed) volumes. Additionally, note that HA alone is currently not supported for the Witness VM; only FT-protected or unprotected configurations are supported.
VPLEX Metro HA

As discussed in the two previous sections, VPLEX Metro provides active/active distributed storage; however, we have seen that in some failure cases loss of access to the storage volume could occur if the preferred site fails, causing the non-preferred site to suspend too. Using VPLEX Witness overcomes this scenario and ensures that access to a VPLEX cluster is always maintained regardless of which site fails.

VPLEX Metro HA describes a VPLEX Metro solution that has also been deployed with VPLEX Witness. As the name suggests, VPLEX Metro HA delivers truly available distributed storage volumes over distance and forms a solid foundation for additional layers of VMware technology such as HA and FT.

Note: It is assumed that all topologies discussed within this white paper use VPLEX Metro HA (i.e. VPLEX Metro plus VPLEX Witness). This is mandatory to ensure fully automatic (i.e. decision-less) recovery under all the failure conditions outlined within this document.

VPLEX Metro cross cluster connect

Another important feature of VPLEX Metro that can optionally be deployed within a campus topology (i.e. up to 1 ms round trip time) is cross cluster connect.

Note: At the time of writing, cross-connect is a mandatory requirement for VMware FT implementations.

This feature pushes VPLEX HA to an even greater level of availability, since an entire VPLEX cluster failure at a single location would not cause an interruption to host I/O at either location (using either VMware FT or HA).

Figure 9 below shows the topology of a cross-connected configuration:
Figure 9 VPLEX Metro deployment with cross-connect (optional cross-connect between sites A and B, with VPLEX Witness)

As we can see in the diagram, the cross-connect offers an alternate path or paths from each ESXi server to the remote VPLEX cluster. This ensures that if, for any reason, an entire VPLEX cluster were to fail (which is unlikely since there is no single point of failure), there would be no interruption to I/O, because the remaining VPLEX cluster continues to service I/O across the remote cross link (the alternate path).

When deploying cross-connect, it is recommended that rather than merging fabrics and using an Inter-Switch Link (ISL), additional host bus adapters (HBAs) are used to connect directly to the remote datacenter's switch fabric. This ensures that fabrics do not merge and span failure domains. Another important point to remember is that cross-connect is only supported for campus environments up to 1 ms round trip time.

Note: When setting up cross-connect, each ESXi server will see double the paths to the datastore (50% local and 50% remote). It is best practice to set the pathing policy to Fixed and to mark the remote paths to the other cluster as passive (standby) paths. This ensures that the workload remains balanced and commits to only a single cluster at any one time.
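To illustrate that path layout, the minimal sketch below (hypothetical helper and field names, not an ESXi or VPLEX API) classifies the paths an ESXi host would see in a cross-connected configuration, picks a local path as the Fixed preferred path and keeps the cross-connect paths as standby. The branch with no local paths corresponds to the forced uniform behavior described later in this paper.

# Illustrative model of cross-connect path handling (hypothetical names,
# not an ESXi or VPLEX API): prefer a local VPLEX path under the Fixed
# policy and keep the remote cross-connect paths as standby.

from dataclasses import dataclass

@dataclass
class Path:
    name: str        # e.g. "vmhba1:C0:T0:L10"
    cluster: str     # "local" or "remote" VPLEX cluster relative to this host

def plan_fixed_policy(paths):
    """Return (preferred, local_alternates, remote_standby) per the best practice above."""
    local = [p for p in paths if p.cluster == "local"]
    remote = [p for p in paths if p.cluster == "remote"]
    if not local:
        # No local paths left (e.g. the local VPLEX cluster is lost): the
        # cross-connect standby paths take over ("forced uniform" behavior).
        return remote[0], remote[1:], []
    return local[0], local[1:], remote

paths = [Path("vmhba1:C0:T0:L10", "local"),  Path("vmhba2:C0:T0:L10", "local"),
         Path("vmhba3:C0:T4:L10", "remote"), Path("vmhba4:C0:T4:L10", "remote")]
preferred, alternates, standby = plan_fixed_policy(paths)
print("Preferred (Fixed):", preferred.name)
print("Local alternates :", [p.name for p in alternates])
print("Remote standby   :", [p.name for p in standby])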
Unique VPLEX benefits for availability and I/O response time

VPLEX is built from the ground up to perform block storage distribution over long distances at enterprise scale and performance. One of the unique core principles of VPLEX that enables this is its underlying and extremely efficient cache coherency algorithm, which enables an active/active topology without compromise. Since VPLEX is architecturally unique among virtual storage products, two simple categories are used to distinguish between the architectures.

Uniform and non-uniform I/O access

Essentially these two categories describe the I/O access pattern from the host to the storage system when using a stretched or distributed cluster configuration. VPLEX Metro (under normal conditions) follows what is known technically as a non-uniform access pattern, whereas other products that function differently from VPLEX follow what is known as a uniform I/O access pattern. On the surface both types of topology seem to deliver active/active storage over distance; however, at the simplest level only the non-uniform category delivers true active/active, which carries some significant benefits over uniform-type solutions. The terms are defined as follows:

1. Uniform access
   All I/O is serviced by the same single storage controller; therefore all I/O is sent to or received from the same location, hence the term "uniform". Typically this involves "stretching" dual-controller active/passive architectures.

2. Non-uniform access
   I/O can be serviced by any available storage controller at any given location; therefore I/O can be sent to or received from any storage target location, hence the term "non-uniform". This is derived from "distributing" multiple active controllers/directors in each location.

To understand this in greater detail and to quantify the benefits of non-uniform access, we must first understand uniform access.

Uniform access (non-VPLEX)

Uniform access works in a very similar way to a dual-controller array that uses an active/passive storage controller. With such an array, a host would
generally be connected to both controllers in an HA configuration so that if one failed, the other would continue to process I/O. However, since the secondary storage controller is passive, no write or read I/O can be propagated to or from it under normal operations. The other thing to understand is that these types of architectures typically use cache mirroring, whereby any write I/O to the primary controller/director is synchronously mirrored to the secondary controller for redundancy.

Next, imagine taking a dual-controller active/passive array and physically splitting the nodes/controllers apart, stretching it over distance so that the active controller/node resides in site A and the secondary controller/node resides in site B. The first thing to note is that we now have only a single controller at each location, so we have already compromised the local HA ability of the solution since each location now has a single point of failure.

The next challenge is to maintain host access to both controllers from either location. Let's suppose we have an ESXi server in site A and a second one in site B. If the only active storage controller resides at site A, then we need to ensure that hosts in both site A and site B have access to the storage controller in site A (uniform access). This is important because if we want to run a host workload at site B, we will need an active path to connect it back to the active controller in site A, since the controller at site B is passive. This may be handled by a standard FC ISL which stretches the fabric across sites. Additionally, we also require a physical path from the ESXi hosts in site A to the passive controller at site B, so that if there is a controller failure at site A, the controller at site B can service I/O.

As discussed in the previous section, this type of configuration is known as "uniform access" since all I/O for any given storage volume is serviced uniformly by the exact same controller, passing all I/O to and from the same location. The diagram in Figure 10 below shows a typical example of a uniform architecture.
Figure 10 A typical uniform layout (split active/passive controllers with stretched fabrics between site A and site B)

As we can see in the above diagram, hosts at each site connect to both controllers by way of the stretched fabric; however, the active controller (for any given LUN) is only at one of the sites (in this case site A). While not as efficient (in bandwidth and latency) as VPLEX, under normal operating conditions (i.e. where the active host is at the same location as the active controller) this type of configuration functions satisfactorily. However, this access pattern starts to become sub-optimal if the active host is propagating I/O at the location where the passive controller resides.

Figure 11 shows the numbered sequence of I/O flow for a host connected to a uniform configuration at the local (i.e. active) site.

Figure 11 Uniform write I/O flow example at local site
The steps below correspond to the numbers in the diagram.

1. I/O is generated by the host at site A and sent to the active controller in site A.
2. The I/O is committed to local cache and synchronously mirrored to remote cache over the WAN.
3. The local/active controller's backend now mirrors the I/O to the back-end disks. It does this by committing a copy to the local array as well as sending another copy of the I/O across the WAN to the remote array.
4. The acknowledgement from the back-end disks returns to the owning storage controller.
5. The acknowledgement is received by the host and the I/O is complete.

Now let's look at a write I/O initiated from the ESXi host at location B, where the controller for the LUN receiving the I/O resides at site A. The concern here is that each write at the passive site B has to traverse the link and be acknowledged back from site A. Before the acknowledgement can be given back to the host at site B from the controller at site A, the storage system has to synchronously mirror the I/O back to the controller in site B (both cache and disk), thereby incurring more round trips over the WAN. This ultimately increases the response time (i.e. negatively impacts performance) and bandwidth utilization.

The numbered sequence in Figure 12 shows a typical I/O flow of a host connected to a uniform configuration at the remote (i.e. passive) site.
Figure 12 Uniform write I/O flow example at remote site

The following steps correspond to the numbers in the diagram.

1. I/O is generated by the host at site B and sent across the ISL to the active controller at site A.
2. The I/O is received at the controller at site A from the ISL.
3. The I/O is committed to local cache, mirrored to remote cache over the WAN, and acknowledged back to the active controller in site A.
4. The active controller's back end now mirrors the I/O to the back-end disks at both locations. It does this by committing a copy to the local array as well as sending another copy of the I/O across the WAN to the remote array (this step may sometimes be asynchronous).
5. Both write acknowledgements are sent back to the active controller.
6. The acknowledgement is returned to the host at site B (back across the ISL) and the I/O is complete.

Clearly, when using a uniform access device for a VMware datastore with ESXi hosts at either location, I/O could be propagated at both locations, perhaps simultaneously (e.g. if a VM were vMotioned to the remote location, leaving at least one VM online at the previous location using the same datastore). Therefore, in a uniform deployment, I/O response time at the passive location will always be worse (perhaps significantly) than I/O response time at the active location. Additionally, I/O at the passive site could use up to three times the bandwidth of an I/O
at the active controller site, due to the need to mirror the disk and cache as well as to send the I/O across the ISL in the first place.

Non-Uniform Access (VPLEX I/O access pattern)

While VPLEX can be configured to provide uniform access, the typical VPLEX Metro deployment uses non-uniform access. VPLEX was built from the ground up for extremely efficient non-uniform access. This means it has a different hardware and cache architecture relative to uniform access solutions and, contrary to what you might have already read about non-uniform access clusters, provides significant advantages over uniform access for several reasons:

1. All controllers in a VPLEX distributed cluster are fully active. Therefore, if an I/O is initiated at site A, the write is committed to the director in site A directly and mirrored to site B before the acknowledgement is given. This ensures minimal response time and bandwidth (up to 3x better compared to uniform access) regardless of where the workload is running.
2. A cross-connection, where hosts at site A connect to the storage controllers at site B, is not a mandatory requirement (unless using VMware FT). Additionally, if a cross-connect is deployed with VPLEX, it is only used as a last resort in the unlikely event that a full VPLEX cluster has been lost (this would be deemed a double failure, since a single VPLEX cluster has no single points of failure) or the WAN has failed or been partitioned.
3. Non-uniform access uses less bandwidth and gives better response times compared to uniform access, since under normal conditions all I/O is handled by the local active controller (all controllers are active) and sent across to the remote site only once. It is important to note that both read and write I/O are serviced locally within VPLEX Metro.
4. Interestingly, due to the active/active nature of VPLEX, should a full site outage occur, VPLEX does not need to perform a failover since the remaining copy of the data was already active. This is another key difference compared to uniform access, where the loss of the primary active node requires a failover to the passive node.

The diagram below shows a high-level architecture of VPLEX when distributed over a Metro distance:
Figure 13 VPLEX non-uniform access layout

As we can see in Figure 13, each host is connected only to the local VPLEX cluster, ensuring that I/O from either location is always serviced by the local storage controllers. VPLEX can achieve this because all of the controllers (at both sites) are in an active state and able to service I/O. Some other key differences to observe from the diagram are:

1. Storage devices behind VPLEX are only connected to their respective local VPLEX cluster and are not connected across the WAN, dramatically simplifying fabric design.
2. VPLEX has dedicated redundant WAN ports that can be connected natively to either 10 Gb Ethernet or 8 Gb FC.
3. VPLEX has multiple active controllers in each location, ensuring there are no local single points of failure. With up to eight controllers in each location, VPLEX provides N+1 redundancy.
4. VPLEX uses and maintains single-disk semantics across clusters at two different locations.

I/O flow is also very different and more efficient compared to uniform access, as the diagram below highlights.
Figure 14 High-level VPLEX non-uniform write I/O flow

The steps below correspond to the numbers in Figure 14:

1. Write I/O is generated by the host at either site and sent to one of the local VPLEX controllers (depending on path policy).
2. The write I/O is duplicated and sent to the remote VPLEX cluster.
3. Each VPLEX cluster now has a copy of the write I/O, which is written through to the backend array at each location. The site A VPLEX cluster does this for the array in site A, while the site B VPLEX cluster does this for the array in site B.
4. Once the remote VPLEX cluster has acknowledged back to the local cluster, the acknowledgement is sent to the host and the I/O is complete.

Note: Under some conditions, depending on the access pattern, VPLEX may encounter what is known as a local write miss condition. This does not necessarily add another step, as the remote cache page owner is invalidated as part of the write-through caching activity. In effect, VPLEX is able to accomplish several distinct tasks through a single cache update messaging step.

The table below shows a broad comparison of the expected increase in response time (in milliseconds) of I/O flow for both uniform and non-uniform layouts when using an FC link with a 3 ms response time (and without any form of external WAN acceleration / fast write technology). These
numbers are the additional overhead compared to a local storage system of the same hardware, since I/O now has to be sent across the link.

Additional RT overhead (ms),                  Site A            Site B
based on 3 ms RTT and 2 round trips per I/O   read    write     read    write
Full Uniform (sync mirror)                    0       12        6       18
Full Uniform (async mirror)                   0       6         6       12
Non-Uniform (owner hit)                       0       6*        0       6*

* This is comparable to standard synchronous active/passive replication.
Key (color-coded in the original): Optimal; Acceptable, but not efficient; Sub-optimal.

Table 3 Uniform vs. non-uniform response time increase

Note: Table 3 only shows the expected additional latency of the I/O on the WAN and does not include any other overheads such as data propagation delay or additional machine time at either location for remote copy processing. Your mileage will vary.

As we can see in Table 3, topologies that use a uniform access pattern and a synchronous disk mirror can add significantly more time to each I/O, increasing the response time by as much as 3x compared to non-uniform.

Note: VPLEX Metro environments can also be configured using native IP connectivity between sites. This type of topology carries further response time efficiencies since each I/O across the WAN typically incurs only a single round trip.
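The overhead figures in Table 3 can be reproduced with a small back-of-the-envelope model. The Python sketch below is illustrative only; it simply counts synchronous WAN crossings per I/O under each access pattern using the same assumptions as the table (3 ms RTT and 2 round trips per I/O, so each crossing adds 6 ms).

# Back-of-the-envelope model reproducing the Table 3 overhead figures.
# Assumptions from the text: 3 ms RTT and 2 FC round trips per I/O, so each
# synchronous WAN crossing adds 2 * 3 = 6 ms to the I/O.

RTT_MS = 3.0
ROUND_TRIPS_PER_IO = 2
CROSSING_MS = RTT_MS * ROUND_TRIPS_PER_IO      # cost of one synchronous WAN hop

def uniform_overhead(io, site, sync_disk_mirror=True):
    """Additional latency (ms) for a stretched active/passive (uniform) array."""
    crossings = 0
    if site == "B":                 # host at the passive site goes over the ISL first
        crossings += 1
    if io == "write":
        crossings += 1              # synchronous cache mirror over the WAN
        if sync_disk_mirror:
            crossings += 1          # synchronous back-end disk mirror over the WAN
    return crossings * CROSSING_MS

def non_uniform_overhead(io, site):
    """Additional latency (ms) for VPLEX non-uniform access (owner hit)."""
    return CROSSING_MS if io == "write" else 0.0   # reads are always serviced locally

for label, fn in (("Full Uniform (sync mirror)", lambda io, s: uniform_overhead(io, s, True)),
                  ("Full Uniform (async mirror)", lambda io, s: uniform_overhead(io, s, False)),
                  ("Non-Uniform (owner hit)", non_uniform_overhead)):
    row = [fn(io, s) for s in ("A", "B") for io in ("read", "write")]
    print(f"{label:30s} A-read={row[0]:>4} A-write={row[1]:>4} "
          f"B-read={row[2]:>4} B-write={row[3]:>4}")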
Another factor to consider when comparing the two topologies is the amount of WAN bandwidth used. The table below shows a comparison between a full uniform topology and a VPLEX non-uniform topology for bandwidth utilization. The example I/O size is 128 KB and the results are also shown in KB.

WAN bandwidth used for a 128 KB I/O (KB)   Site A            Site B
                                           read    write     read    write
Full Uniform (sync or async mirror)        0       256       128     384
Non-Uniform                                0       128*      0       128*

* This is comparable to standard synchronous active/passive replication.
Key (color-coded in the original): Optimal; Acceptable, but not efficient; Sub-optimal.

Table 4 Uniform vs. non-uniform bandwidth usage

As one can see from Table 4, non-uniform access always performs local reads and only has to send the data payload across the WAN once for a write I/O, regardless of where the data was written. This is in stark contrast to a uniform topology, especially if the write occurs at the site with the passive controller, since the data has to be sent once across the WAN (ISL) to the active controller, which then mirrors the cache page (synchronously, over the WAN again) as well as mirroring the underlying storage (back over the WAN again), giving an overall 3x increase in WAN traffic compared to non-uniform.

VPLEX with cross-connect and non-uniform mode

VPLEX Metro with a cross cluster connect configuration (up to 1 ms round trip time) is sometimes referred to as "VPLEX in uniform mode", since each ESXi host is now connected to both the local and remote VPLEX clusters. While on the surface this does look similar to uniform mode, it still typically functions in a non-uniform mode, because under the covers all VPLEX directors remain active and able to serve data locally, maintaining the efficiencies of the VPLEX cache-coherent architecture. Additionally, when using cross-connected clusters, it is recommended to configure the ESXi servers so that the cross-connected paths are only standby paths. Therefore, even with a VPLEX cross-connected configuration, I/O flow is still serviced locally from each local VPLEX cluster and does not traverse the link. The diagram below shows an example of this:
Figure 15 High-level VPLEX cross-connect with non-uniform I/O access (cross-connect paths in standby)

In Figure 15, each ESXi host now has an alternate path to the remote VPLEX cluster. Compared to the typical uniform diagram in the previous section, however, we can still see that the underlying VPLEX architecture differs significantly, since it remains identical to the non-uniform layout, servicing I/O locally at either location.

VPLEX with cross-connect and forced uniform mode

Although VPLEX functions primarily in a non-uniform model, there are certain conditions under which VPLEX can sustain a type of uniform access mode. One such condition is when cross-connect is used and certain failures occur that force uniform mode. One scenario where this may occur is when VPLEX and the cross-connect network use physically separate channels and the VPLEX clusters are partitioned while the cross-connect network remains in place. The diagram below shows an example of this:
Figure 16 Forced uniform mode due to WAN partition

As illustrated in Figure 16, VPLEX will invoke the site preference rule, suspending access to a given distributed virtual volume at one of the locations (in this case site B). This ultimately means that I/O at site B has to traverse the link to site A, since the VPLEX controller path in site B is now suspended due to the preference rule.

Another scenario where this might occur is if one of the VPLEX clusters at either location becomes isolated or destroyed. The diagram below shows an example of a localized rack failure at site B which has taken the VPLEX cluster offline at site B.

Figure 17 VPLEX forced uniform mode due to cluster failure

In this scenario the VPLEX cluster remains online at site A (through VPLEX Witness) and any I/O at site B will automatically access the VPLEX cluster at
site A over the cross-connect, thereby turning the standby path into an active path.

In summary, VPLEX can use "forced uniform" mode as a failsafe to ensure that the highest possible level of availability is maintained at all times.

Note: Cross-connected VPLEX clusters are only supported at distances of up to 1 ms round trip time.
Combining VPLEX HA with VMware HA and/or FT

Due to its core design, EMC VPLEX Metro provides the perfect foundation for VMware Fault Tolerance and High Availability clustering over distance, ensuring simple and transparent deployment of stretched clusters without any added complexity.

vSphere HA and VPLEX Metro HA (federated HA)

VPLEX Metro takes a single block storage device in one location and "distributes" it to provide single-disk semantics across two locations. This enables a "distributed" VMFS datastore to be created on that virtual volume. On top of this, if the layer 2 network has also been "stretched", then a single vSphere instance (including a single logical datacenter) can also be "distributed" across more than one location and HA enabled for any given vSphere cluster. This is possible because the storage federation layer of VPLEX is completely transparent to ESXi, which enables the user to add ESXi hosts at two different locations to the same HA cluster.

Stretching an HA failover cluster (such as VMware HA) with VPLEX creates a "federated HA" cluster over distance. This blurs the boundaries between local HA and disaster recovery, since the configuration has the automatic restart capabilities of HA combined with the geographical distance typically associated with synchronous DR.

Figure 18 VPLEX Metro HA with vSphere HA

For detailed technical setup instructions please see the VPLEX Procedure Generator ("Configuring a distributed volume") as well as the "VMware vSphere® Metro Storage Cluster Case Study" white paper found here:
http://www.vmware.com/files/pdf/techpaper/vSPHR-CS-MTRO-STOR-CLSTR-USLET-102-HI-RES.pdf for additional information around:

• Setting up Persistent Device Loss (PDL) handling
• vCenter placement options and considerations
• DRS enablement and affinity rules
• Controlling restart priorities (High/Medium/Low)

Use Cases for federated HA

A federated HA solution is an ideal fit if a customer has two datacenters that are no more than 5 ms (round trip latency) apart and wants to enable an active/active datacenter design whilst also significantly enhancing availability. This type of solution brings several key business continuity capabilities, including downtime and disaster avoidance as well as fully automatic service restart in the event of a total site outage. The configuration also needs to be deployed with a stretched layer 2 network to ensure seamless capability regardless of which location a VM runs in.

Datacenter pooling using DRS with federated HA

A nice feature of the federated HA solution is the ability for VMware DRS (Distributed Resource Scheduler) to be enabled and to function relatively transparently within the stretched cluster. Using DRS effectively means that the vCenter/ESXi server load can be distributed over two separate locations, driving up utilization and using all available, formerly passive, assets. Effectively, with DRS enabled, the configuration can be considered as two physical datacenters acting as a single logical datacenter. This has significant benefits since it brings the ability to move what were once passive assets at a remote location into a fully active state. To enable this functionality, DRS can simply be switched on within the stretched cluster and configured by the user to the desired automation level. Depending on the setting, VMs will then automatically start to distribute between the datacenters (please read http://www.vmware.com/files/pdf/techpaper/vSPHR-CS-MTRO-STOR-CLSTR-USLET-102-HI-RES.pdf for more details).
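For readers who script their vSphere configuration, the sketch below shows one way a soft ("should run") DRS VM-to-host affinity rule could be built with pyVmomi to keep a group of VMs at the site A hosts while DRS and HA remain free to use site B when needed. The class and field names follow the public vSphere API, but the cluster, VM and host objects are placeholders and the exact names should be verified against your SDK release; it is an outline, not a supported procedure.

# Illustrative pyVmomi sketch: a DRS "should run on" rule keeping a group of
# VMs on the ESXi hosts of site A in a stretched (federated HA) cluster.
# Cluster, host and VM objects are placeholders obtained elsewhere.
from pyVmomi import vim

def site_affinity_spec(rule_name, vm_objs, host_objs):
    """Build a cluster reconfiguration spec adding VM/host groups plus a soft affinity rule."""
    vm_group = vim.cluster.VmGroup(name=rule_name + "-vms", vm=vm_objs)
    host_group = vim.cluster.HostGroup(name=rule_name + "-hosts", host=host_objs)
    rule = vim.cluster.VmHostRuleInfo(
        name=rule_name,
        enabled=True,
        mandatory=False,                       # "should", not "must": HA can still restart at site B
        vmGroupName=vm_group.name,
        affineHostGroupName=host_group.name)
    return vim.cluster.ConfigSpecEx(
        groupSpec=[vim.cluster.GroupSpec(operation="add", info=vm_group),
                   vim.cluster.GroupSpec(operation="add", info=host_group)],
        rulesSpec=[vim.cluster.RuleSpec(operation="add", info=rule)])

# Usage, assuming 'cluster', 'site_a_vms' and 'site_a_hosts' were already
# looked up through the vSphere inventory:
# spec = site_affinity_spec("site-a-affinity", site_a_vms, site_a_hosts)
# task = cluster.ReconfigureComputeResource_Task(spec, modify=True)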
Note: A design consideration to take into account if DRS is desired within a solution is to ensure that there are enough compute and network resources at each location to take the full load of the business services should either site fail.

Avoiding downtime and disasters using federated HA and vMotion

Another nice feature of a federated HA solution with vSphere is the ability to avoid planned as well as unplanned downtime. This is achievable using the vMotion capability of vCenter to move a running VM (or group of VMs) to any ESXi server in another (physical) datacenter. Since the vMotion capability is now federated over distance, planned downtime can be avoided for events that affect an entire datacenter location. For instance, suppose we needed to perform a power upgrade at datacenter A which will result in the power being offline for 2 hours. Downtime can be avoided since all running VMs at site A can be moved to site B before the outage (see the illustrative sketch at the end of this subsection). Once the outage has ended, the VMs can be moved back to site A using vMotion while keeping everything completely online. This use case can also be employed for anticipated yet unplanned events. For instance, if a hurricane is in close proximity to your datacenter, this solution brings the ability to move the VMs elsewhere, avoiding any potential disaster.

Note: During a planned event where power will be taken offline it is best to engage EMC support to bring the VPLEX down gracefully. However, in a scenario where time does not permit this (perhaps a hurricane) it may not be possible to involve EMC support. In this case, if site A were destroyed there would still be no interruption, assuming the VMs were vMotioned ahead of time, since VPLEX Witness would ensure that the site that remains online keeps full access to the storage volume once site A has been powered off. Please see the "Failure scenarios and recovery using federated HA" section below for more details.
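As an outline of the planned-avoidance use case above, the following pyVmomi sketch vMotions the running VMs off the site A hosts to a host at site B ahead of a planned outage. Host names and credentials are placeholders and the inventory lookups are simplified; because the datastore is a VPLEX distributed volume, only a compute vMotion is shown (no storage migration).

# Illustrative pyVmomi sketch: vMotion all running VMs off the site A hosts
# ahead of a planned site A outage. Names and credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

SITE_A_HOSTS = {"esxa-01.example.local", "esxa-02.example.local"}
TARGET_HOST = "esxb-01.example.local"          # a host at site B

ctx = ssl._create_unverified_context()          # lab only; use valid certificates in production
si = SmartConnect(host="vcenter.example.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    host_view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    hosts = {h.name: h for h in host_view.view}
    host_view.Destroy()

    target = hosts[TARGET_HOST]
    tasks = []
    for name in SITE_A_HOSTS:
        for vm in hosts[name].vm:
            if vm.runtime.powerState == vim.VirtualMachinePowerState.poweredOn:
                # MigrateVM_Task performs the vMotion; the shared VPLEX distributed
                # datastore means no storage migration is needed.
                tasks.append(vm.MigrateVM_Task(
                    host=target, priority=vim.VirtualMachine.MovePriority.highPriority))
    print(f"Submitted {len(tasks)} vMotion tasks to {TARGET_HOST}")
finally:
    Disconnect(si)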
Failure scenarios and recovery using federated HA

This section addresses the different types of failures and shows how, in each case, VMware HA is able to continue or restart operations, ensuring maximum uptime. The configuration below is a representation of a typical federated HA solution:

Figure 19 Typical VPLEX federated HA layout (multi-node stretched vSphere cluster with DRS + HA, optional cross-connect, and VPLEX Witness)

The table below shows the different failure scenarios and the outcome:

Failure                    VMs at site A                  VMs at site B                  Notes
Storage failure at site A  Remain online / uninterrupted  Remain online / uninterrupted  Cache read misses at site A now incur additional link latency; cache read hits remain the same, as do write I/O response times
Storage failure at site B  Remain online / uninterrupted  Remain online / uninterrupted  Cache read misses at site B now incur additional link latency; cache read hits remain the same, as do write I/O response times
VPLEX Witness failure      Remain online / uninterrupted  Remain online / uninterrupted  Both VPLEX clusters dial home
All ESXi hosts fail at A All VMs are restarted Remain online / Once the ESXi hosts automatically on are recovered, DRS