VMworld - STO7650 - Software Defined Storage @ VMware Primer
1. Software Defined Storage at VMware Primer
Duncan Epping (@DuncanYB)
Lee Dilworth (@LeeDilworth)
An introduction to the world of Software Defined Storage
Twitter: #STO7650
3. The Software Defined Data Center
[Diagram: Compute, Networking, Storage, Management]
• All infrastructure services virtualized: compute, networking, storage
• Underlying hardware abstracted, resources are pooled
• Control of data center automated by software (management, security)
• Virtual Machines are first class citizens of the SDDC
• Today’s session will focus on one aspect of the SDDC - storage
7. The Hypervisor is the Strategic High Ground
[Diagram: VMware vSphere spanning SAN/NAS, x86 HCI, object storage, and cloud storage]
8. Storage Policy-Based Management – App centric automation
• Intelligent placement
• Fine control of services at VM level
• Automation at scale through policy
• Need new services for a VM?
– Change the current policy on-the-fly
– Attach a new policy on-the-fly
Example Virtual Machine Storage Policy:
Reserve Capacity: 10 GB
Availability: 2 failures to tolerate
Limit IOPS: 200 IOPS
Snapshot: every hour
Replication: synchronous
Deduplication: enabled
[Diagram: Storage Policy-Based Management in vSphere, spanning Virtual SAN, Virtual Volumes, and VAIO I/O Filters over local / SAN / NAS devices]
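To make the policy model concrete, here is a minimal PowerCLI sketch of composing such a policy. This is only a sketch, assuming PowerCLI with the SPBM (VMware.VimAutomation.Storage) module loaded; the capability identifiers, server name, and policy name are illustrative, not prescribed by this deck:

```powershell
# Minimal sketch: compose a VM storage policy from capabilities published via VASA.
# Capability identifiers below are illustrative; list the real ones with Get-SpbmCapability.
Connect-VIServer -Server vcenter.example.com

Get-SpbmCapability | Select-Object Name, ValueType   # what does the storage publish?

$rules = @(
    New-SpbmRule -Capability (Get-SpbmCapability -Name "VSAN.hostFailuresToTolerate") -Value 2
    New-SpbmRule -Capability (Get-SpbmCapability -Name "VSAN.iopsLimit") -Value 200
)
$ruleSet = New-SpbmRuleSet -AllOfRules $rules

New-SpbmStoragePolicy -Name "Gold-VM-Policy" -AnyOfRuleSets $ruleSet
```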
10. Virtual SAN, what is it?
• Hyper-Converged Infrastructure
• Distributed, scale-out architecture
• Integrated with the vSphere platform
• Ready for today’s vSphere use cases
• Software-Defined Storage
vSphere & Virtual SAN
11. But what does that really mean?
[Diagram: a vSphere & Virtual SAN cluster - VSAN network over generic x86 hardware, exposing a single shared Virtual SAN datastore]
• Integrated with your hypervisor
• Leveraging local storage resources
• Exposing a single shared datastore
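To see that aggregation from the host side, the esxcli vsan namespace (reachable from PowerCLI via Get-EsxCli) reports cluster membership and the locally claimed devices. A sketch, with an illustrative host name:

```powershell
# Sketch: inspect the Virtual SAN state of a single node (host name is illustrative).
$esxcli = Get-EsxCli -VMHost (Get-VMHost "esxi-01.example.com") -V2

$esxcli.vsan.cluster.get.Invoke()    # this node's cluster membership and role
$esxcli.vsan.storage.list.Invoke()   # local cache and capacity devices claimed by VSAN
$esxcli.vsan.network.list.Invoke()   # VMkernel interfaces carrying VSAN traffic
```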
12. VSAN is the Most Widely Adopted HCI Product
“In my experience VMware solutions are rock solid… we’re ready to nearly double our Virtual SAN deployment.”
“It really did work as advertised… the fact that I have been able to set it and forget it is huge!”
5,000 customers choose VMware HCS
13. VSAN is the Most Widely Adopted HCI Product
14. Virtual SAN Use Cases
VMware vSphere + Virtual SAN use cases:
• End User Computing
• Test/Dev
• ROBO
• Staging
• Management
• DMZ
• Business Critical Apps
• DR / DA
15. Tiered Hybrid vs All-Flash
Hybrid
• 40K IOPS per host
• Caching tier (SSD / PCIe / Ultra DIMM): read and write cache
• Capacity tier (data persistence): SAS / NL-SAS / SATA
All-Flash
• 100K IOPS per host + sub-millisecond latency
• Caching tier (SSD / PCIe / Ultra DIMM): writes are cached first; reads go directly to the capacity tier
• Capacity tier (data persistence): flash devices
17. Provisioning a VM? Define a policy first…
Virtual SAN currently surfaces multiple storage capabilities to vCenter Server
• What-if APIs
• New capabilities in VSAN 6.2
18. Enterprise Availability in a few clicks
Overview
• Set “failures to tolerate” for high availability
• Virtual SAN provides rack awareness
– Allowing for full rack failures through a smart placement mechanism
• Or “simply” add a second site and stretch your Virtual SAN across
– Of course within the defined boundaries
• And if that isn’t sufficient, you can always replicate to a 3rd site!
– With or without the use of Site Recovery Manager
[Diagrams: rack awareness across Racks 1–4 with a witness; a stretched cluster across two sites (5ms RTT, 10GbE) with mirrored vmdks and a witness; replication of a vmdk to a third site via Site Recovery Manager]
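Because availability is expressed as a policy attribute, raising “failures to tolerate” is a reassignment rather than a migration. A hedged PowerCLI sketch, where the VM and policy names are placeholders:

```powershell
# Sketch: attach a stricter availability policy to a VM on the fly; VSAN resyncs in the background.
$vm     = Get-VM -Name "app-01"
$policy = Get-SpbmStoragePolicy -Name "FTT2-Policy"

# Reassign the VM home object and every virtual disk.
Get-SpbmEntityConfiguration -VM $vm | Set-SpbmEntityConfiguration -StoragePolicy $policy
Get-HardDisk -VM $vm | Get-SpbmEntityConfiguration | Set-SpbmEntityConfiguration -StoragePolicy $policy
```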
19. Deduplication and Compression for Space Efficiency
• Nearline deduplication and compression per disk group
– Enabled at the cluster level
– Deduplicated when de-staging from cache tier to capacity tier
– Fixed block length deduplication (4KB blocks)
• Compression after deduplication
– If the compressed block is <= 2KB, the compressed block is stored
– Otherwise the full 4KB block is stored
Beta (All-Flash only)
[Diagram: vmdks distributed across esxi-01, esxi-02 and esxi-03 in a vSphere & Virtual SAN cluster]
20. RAID-5/6 (Inline Erasure Coding)
• When Number of Failures to Tolerate = 1 and Failure Tolerance Method = Capacity: RAID-5
– 3+1 (4 host minimum)
– 1.33x overhead for RAID-5 instead of 2x compared to FTT=1 with RAID-1
• When Number of Failures to Tolerate = 2 and Failure Tolerance Method = Capacity: RAID-6
– 4+2 (6 host minimum)
– 1.5x overhead for RAID-6 instead of 3x compared to FTT=2 with RAID-1
[Diagram: a RAID-5 stripe with data and parity blocks distributed across four ESXi hosts, parity rotating per host]
All-Flash only
24. But what about traditional storage?
Goals
• Make VMs a first class citizen on traditional storage
• Provide customers the option to use per-VM data operations on storage systems
• Build framework to offload ANY per-VM data operations to the storage system
• Minimal disruption to existing processes or infrastructure
“I would like per-VM data services for that as well…”
25. Software-Defined Storage and Availability
[Diagram: Storage Policy-Based Management in vSphere over local / SAN / NAS devices; example storage capabilities: Snapshot every hour, Replication synchronous, Availability RAID-1, IOPS Limit 150]
26. vSphere Virtual Volumes
• VVols
– Virtual machine objects are stored natively on the array
– No filesystem on-disk formatting required
• There are five types of VVols:
– CONFIG – vmx, nvram, log files, etc.
– DATA – VMDKs
– MEM – memory snapshots
– SWAP – swap files
– Other – vendor solution specific
[Screenshot: vSphere Web Client view]
27. Virtual Volumes Primer
The Basics
• Virtualize SAN and NAS devices
• Virtual disks are natively represented on arrays
• Enables VM granular storage operations using
array-based data services
• Policy enables automated consumption at scale
• Supports existing storage I/O protocols
• Included with vSphere Standard and up
[Diagram: VMware vSphere connects over the data path through protocol endpoints to storage containers, exposed as virtual datastores; the VASA Provider publishes capabilities such as snapshots, deduplication and quality of service]
28. Storage Container
Storage Containers
• Logical storage constructs for grouping of virtual volumes
• Typically defined and set up by storage administrators on the array in order to define:
– Storage capacity allocations
– Capabilities for a pool
• Logically partition or isolate VMs with diverse storage needs and requirements (or security)
• Minimum one storage container per array
• Maximum depends on the array
[Diagram: virtual datastores in vSphere mapping through protocol endpoints (data path) to storage containers on the storage system]
29. VASA Provider (VP)
• Software component developed by storage array vendors
• ESXi and vCenter Server connect to the VASA Provider
• Provides storage awareness services
• The VASA Provider can be implemented within the array’s management server or firmware
– Can be deployed in HA mode, when the vendor has implemented this!
• Responsible for creating Virtual Volumes
– Required for powering on VMs!
[Diagram: the VASA Provider on the storage system connects to VMware vSphere over the VASA/SPBM control path]
30. Protocol Endpoint
Protocol Endpoints
• Access points that enable communication between ESXi hosts and storage array systems
– Part of the physical storage fabric
– Created by storage administrators
Scope of Protocol Endpoints
• Compatible with all SAN and NAS protocols:
- iSCSI
- NFS v3
- FC
- FCoE
• A Protocol Endpoint can support any one of the protocols at a given time
Why Protocol Endpoints?
• A single access point to avoid LUN limits
[Diagram: ESXi hosts reach storage containers through protocol endpoints on the data path]
31. VM Provisioning Workflow
vSphere Admin
1. Create Virtual Machines
2. Assign a VM Storage Policy
3. Choose a suitable Datastore
Under the Covers
• Provisioning operations are translated into VASA API calls in order to create the individual virtual volumes.
• Provisioning operations are offloaded to the array for the creation of virtual volumes on the storage container that match the capabilities defined in the VM Storage Policies.
[Diagram: provisioning flows from VMware vSphere through the VASA Provider (VASA/SPBM) and the data path to virtual volumes on storage containers]
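Seen from PowerCLI, the same workflow is only a few calls. A sketch, where the cluster, VM, and policy names are placeholders:

```powershell
# Sketch: let SPBM pick compatible storage, create the VM, then bind the policy.
$policy = Get-SpbmStoragePolicy -Name "Gold-VM-Policy"

# SPBM answers "which datastores satisfy these requirements?"
$ds = Get-SpbmCompatibleStorage -StoragePolicy $policy | Select-Object -First 1

$vm = New-VM -Name "web-01" -ResourcePool (Get-Cluster "Prod") -Datastore $ds
Get-SpbmEntityConfiguration -VM $vm | Set-SpbmEntityConfiguration -StoragePolicy $policy
```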
32. VVol Replication
Goals
• Replicates Virtual Volumes instead of entire LUNs / exports / datastores
• Ability to group VVols into replication groups
• Array-based replication is used to replicate VVols / replication groups
• Leverages VASA 3 APIs to expose storage replication capabilities and match containers to policies
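PowerCLI 6.5 and later expose these constructs directly; a hedged sketch, assuming the SPBM replication cmdlets are available and a VASA 3 capable array is registered:

```powershell
# Sketch: discover array-defined replication groups and their source/target pairings.
Get-SpbmReplicationGroup | Select-Object Name, State

# Pairs map a source replication group to its target group on the recovery site.
Get-SpbmReplicationPair
```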
33. Virtual Volumes – Continued Support from the Storage Ecosystem
“[…] storage operations will be fundamentally simplified.” – Laura Guio, VP and Business Line Executive, Storage Systems, VMware Alliance Executive
“[…] as a key design partner, we’ve worked very closely with VMware…” – Tim Russell, VP of Product Management for Data Center Solutions
“[…] one of the most important storage technology advancements […]” – Ravi Chalaka, VP Solutions Marketing
“This is a huge shift from the LUN-centric model of today.” – Peter Waugh, Marketing Director of Storage and Servers
“[…] will transform the way you consume storage in your VMware environment.” – Craig Nunes, VP of Storage Marketing
“[…] brings a relevant solution … reducing cost and complexity in virtual environments.” – Christopher Ratcliffe, Senior VP of Marketing, Core Technologies
35. vSphere APIs for IO Filtering
• Add new 3rd party software-based data services seamlessly in vSphere
• Virtual (software-based) data services controlled by policy
• Enables secure filtering of a VM’s IO
• Caching and replication are the initial use cases
• Storage agnostic across different architectures
[Diagram: Storage Policy-Based Management in vSphere with the new VAIO I/O Filters alongside Virtual SAN and Virtual Volumes, over local / SAN / NAS devices; example capabilities: Replication synchronous, IOPS Limit 150]
36. Why vSphere APIs for IO Filtering?
• To Enable our Ecosystem
– Empower 3rd parties to add functionality to vSphere
– Use cases that VMware cannot address on our own
• To Provide Customer Choice
– Customers want more choice for their infrastructure
– Enabling scenarios that they cannot get from VMware
– Add (virtual) data services to storage systems that may not offer them in-box
37. IO Path without vSphere APIs for IO Filtering
H2 2015/Q1 2016
[Diagram: today’s IO path - the vSCSI Device (user world) passes IO through the File Device Layer, File System Layer and vSCSI Backend (kernel world) down to the physical device]
38. IO Path with vSphere APIs for IO Filtering
H2 2015/Q1 2016
[Diagram: the same IO path with the VAIO Framework and IO Filter(s) inserted between the vSCSI Device and the File Device Layer; 3rd party software data services and Storage Policy Based Management plug into the framework]
1. The VAIO framework detects a filter policy before the IO is committed
2. The data service executes the filter against the IO; no data copying needed
3. The IO returns directly to be committed to the physical device
39. Strong Ecosystem
Today’s use cases:
• Caching
– Write-through and write-back cache
– Distributed cache management
• Replication
– Synchronous access to the VM IO event queue
– Full IP sockets interface for replication
Tomorrow’s use cases:
• Encryption?
– vSphere VMCrypt
• Quality of Service?
– Storage IO Control
• and....
The Software Defined Data Center
In the SDDC, all three core infrastructure components (compute, storage and networking) are virtualized.
Virtualization software abstracts underlying hardware, while pooling compute, network and storage resources to deliver better utilization, faster provisioning and simpler operations.
The VM becomes the centerpiece of the operational model, providing automation and agility to repurpose infrastructure according to business needs.
Today we will focus on Storage, which has been growing at an extremely rapid pace and is a fast changing aspect of the datacenter!
What we are trying to achieve is to simplify datacenter operations, and our primary focus will be storage and availability. As we all know, storage has traditionally been a pain point in many data centers: high cost, and usually not providing the performance and scalability one would want. By offering our customers choice we aim to change the world of IT, to start a new revolution. But we cannot do this by ourselves; we need the help of you, the consultant / admin / architect.
vSphere is perfectly positioned for this as it abstracts physical resources and can provide them as a shared pooled construct to the administrator.
Because it sits directly in the I/O path, the hypervisor (through the notion of policies associated with virtual machines) has the unique ability to make optimal decisions around matching the demands of virtualized applications with the supply of underlying physical infrastructure.
On top of that, the platform provides you the ability to assign service level agreements to workloads, which reduces operational complexity and as such significantly reduces the chances of making mistakes.
This is where it all starts; without Storage Policy Based Management, many of the products and features we are about to talk about would not be possible! If there is one thing you need to remember when you walk away today, it is Storage Policy Based Management. It is the key enabler for Software Defined Storage and Availability!
Storage Policy Based Management is composed of the following:
Common Policy framework Across Virtual Volumes, Virtual SAN and VMFS-based Storage
Common API Layer for Cloud Management Frameworks (vRealize Automation, OpenStack), Scripting users (PowerShell, JavaScript, Python, etc.) and Orchestration Platforms (vCO)
Represents Application and VM Level Requirements
Consumes Capabilities Published via VASA
SPBM provides the following benefits for customers:
Stable, Robust Automation Platform
Intelligent placement and fine control of services at the VM level
Shields Automation and Orchestration Platforms from infrastructure changes by abstracting the Underlying Storage Implementation
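As a small illustration of that common API layer, the compliance of every VM against its assigned policy can be queried in one pipeline. A PowerCLI sketch; the property names follow my reading of the SPBM module and should be verified against your version:

```powershell
# Sketch: SPBM as a common API layer - report each VM's policy and compliance status.
Get-VM | Get-SpbmEntityConfiguration |
    Select-Object Entity, StoragePolicy, ComplianceStatus
```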
What is VSAN in a nutshell…
So, it follows a hyper-converged architecture for easy, streamlined management and scaling of both compute and storage. Hyper-converged represents a system architecture – one where compute and persistence are co-located. This system architecture is enabled by software.
It is an SDS product: a layer of software that runs on every ESXi host. It aggregates the local storage devices on ESXi hosts (SSDs and magnetic disks) and makes them look like a single pool of shared storage across all the hosts.
VSAN has a distributed architecture with no single point of failure.
VSAN goes a step further than other HCI products – VMware owns the most popular hypervisor in the industry. Strong integration of VSAN in the hypervisor means that we can optimize the data path and we ensure optimal resource scheduling (compute, network, storage) according to the needs of each application. At the end, better resource utilization means better consolidation ratios, more bang for your buck! Resource utilization is one part of the story. The other part is the Operational aspects of the product.
VSAN has been designed as a storage product to be used primarily by vSphere admins. So, we put a lot of effort in packaging the product in a way that is ideal for today’s use cases of virtualized environments. Specifically, the VSAN configuration and management workflows have been designed as extensions of the existing host and cluster management features of vSphere. That means easy, intuitive operational experience for vSphere admins. It also means native integration with key vSphere features unlike any other storage product out there, HCI or not.
Looking at real customer deployment data, VMware has become the leading HCI vendor in less than 2 years. By the end of 2015, Virtual SAN surpassed 3,000 total customers. This puts us in a leadership position with the most customers.
These customers range from large enterprises to over 80 federal government customers to mid-sized organizations looking to standardize on a simple IT infrastructure for all of their virtual machines.
We were very conservative when we initially launched VSAN; after all, this was customers’ data we were talking about.
However, even though we were conservative, our customers were not.
There are plenty of other use cases. The ones listed on the slide are the most commonly used. It is fair to say that Virtual SAN fits in most scenarios:
Of course customers started with the test/dev workloads, just like they did when virtualization was first introduced.
Business Critical Apps – we have customers running Exchange / SQL / SAP and billing systems on Virtual SAN.
Virtual SAN is included in the Horizon Suite Advanced and Enterprise, so VDI/EUC is a natural fit.
As a DR destination VSAN is also commonly used, as you can scale out and the cost is relatively low compared to a traditional storage system.
Isolation workloads are also something that VSAN is often used for; both DMZ and management clusters fit this bill.
Of course there is also ROBO: VSAN can start small and grow when desired, both scale-out and scale-up, and with 6.1 we even made things better by introducing a 2-node configuration, but we will get back to that!
Virtual SAN enables both hybrid and all-flash architectures.
Irrespective of the architecture, there is a flash-based caching tier which can be configured out of flash devices like SSDs, PCIe cards, Ultra DIMMs etc. The flash caching tier acts as the read cache/write buffer that dramatically improves the performance of storage operations.
In the hybrid architecture, server-attached magnetic disks are pooled to create a distributed shared datastore that persists the data. In this type of architecture, you can get up to 40K IOPS per server host.
In the All-Flash architecture, the flash-based caching tier is intelligently used as a write buffer only, while another set of SSDs forms the persistence tier to store data. Since this architecture utilizes only flash devices, it delivers extremely high IOPS of up to 90K per host, with predictable low latencies.
Deployed, configured and managed from vCenter through the vSphere Web Client.
Radically simple
Configure VMkernel interface for Virtual SAN
Enable Virtual SAN by clicking Turn On
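The same two steps can be scripted; a PowerCLI sketch in which the cluster, switch, and port group names are placeholders, and the disk claim mode applies to releases that still offer automatic claiming:

```powershell
# Sketch: tag a VMkernel adapter for VSAN traffic on each host, then enable VSAN.
# (In practice each host needs its own -IP/-SubnetMask; omitted here for brevity.)
foreach ($vmhost in Get-Cluster "Prod" | Get-VMHost) {
    New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch "vSwitch0" `
        -PortGroup "VSAN-VMkernel" -VsanTrafficEnabled $true
}
Get-Cluster "Prod" | Set-Cluster -VsanEnabled $true -VsanDiskClaimMode Automatic -Confirm:$false
```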
The first thing you do before you deploy a VM is define a policy. VSAN has what-if APIs, so it will show what the “result” would be of having such a policy applied to a VM of a certain size. Very useful, as it gives you an idea of what the “cost” is of certain attributes.
Also note that a number of new capabilities were introduced in VSAN 6.2; these will be discussed in more detail later on.
Stretched storage with Virtual SAN will allow you to split the Virtual SAN cluster across 2 sites, so that if a site fails, you would be able to seamlessly fail over to the other site without any loss of data. Virtual SAN in a stretched storage deployment accomplishes this by synchronously mirroring data across the 2 sites. The failover will be initiated by a witness VM that resides in a central place, accessible by both sites.
All Flash Only.
“High level description”
Dedupe and compression happen during destaging from the caching tier to the capacity tier. You enable them at the cluster level, and deduplication/compression happens on a per-disk-group basis. Bigger disk groups will result in a higher deduplication ratio. After the blocks are deduplicated they will be compressed. Compression alone is a significant saving already; combined with deduplication, the results achieved can be up to 7x space reduction, of course fully dependent on the workload and type of VMs.
“Lower level description”
Compression (LZ4) would be performed during destaging from the caching tier to the capacity tier. 4KB is the block size for deduplication. For each unique 4KB block, compression would be performed, and if the output block size is less than or equal to 2KB, the compressed block would be saved in place of the 4KB block. If the output block size is greater than 2KB, the block would be written uncompressed and tracked as such. The reason is to avoid block alignment issues, as well as to reduce the CPU hit of decompression, which is greater than that of compression for data with low compression ratios. All of this data reduction happens after the write acknowledgement.
Deduplication domains are within each disk group. This avoids needing a global lookup table (significant resource overhead) and allows us to put those resources towards tracking a smaller and more meaningful block size. By purposefully avoiding dedupe of “write hot” data in the cache and the compression of incompressible data, significant CPU/memory resources avoid being wasted.
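As a purely illustrative sketch of the decision just described (not VSAN’s actual code), the per-4KB-block destage logic looks roughly like this; Compress-LZ4 is a hypothetical helper:

```powershell
# Illustrative only: nearline dedupe + compression decision per 4KB block at destage time.
function Invoke-DestageBlock {
    param([byte[]]$Block, [hashtable]$DedupeMap)   # one $DedupeMap per disk group

    $sha1 = [System.Security.Cryptography.SHA1]::Create()
    $hash = [System.BitConverter]::ToString($sha1.ComputeHash($Block))

    if ($DedupeMap.ContainsKey($hash)) {
        $DedupeMap[$hash].RefCount++               # duplicate block: no new write
        return
    }

    $compressed = Compress-LZ4 -Data $Block        # hypothetical LZ4 helper
    $stored = if ($compressed.Length -le 2KB) {
        $compressed                                # <= 2KB: keep the compressed form
    } else {
        $Block                                     # > 2KB: store the full 4KB block
    }
    $DedupeMap[$hash] = @{ Data = $stored; RefCount = 1 }
}
```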
Note: the feature is supported with stretched clusters and the ROBO edition.
Sometimes RAID-5 and RAID-6 over the network are also referred to as erasure coding. This is done inline; there is no post-processing required.
Since VMware has a design goal of not relying on data locality, this implementation of erasure coding does not bring any negative results by distributing the RAID-5/6 stripe across multiple hosts.
In this case RAID-5 requires 4 hosts at a minimum as it uses a 3+1 logic. With 4 hosts, 1 can fail without data loss. This results in a significant reduction of required disk capacity. Normally a 20GB disk would require 40GB of disk capacity, but in the case of RAID-5 over the network the requirement is only ~27GB. There is another option if higher availability is desired.
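The underlying arithmetic, as a tiny illustrative helper (not a VMware tool):

```powershell
# Illustrative: raw capacity needed to store a given usable size under each scheme.
function Get-RawCapacityGB {
    param(
        [double]$UsableGB,
        [ValidateSet("RAID1-FTT1","RAID1-FTT2","RAID5","RAID6")][string]$Scheme
    )
    switch ($Scheme) {
        "RAID1-FTT1" { $UsableGB * 2 }       # one full mirror copy
        "RAID1-FTT2" { $UsableGB * 3 }       # two extra replicas
        "RAID5"      { $UsableGB * 4 / 3 }   # 3 data + 1 parity -> 1.33x
        "RAID6"      { $UsableGB * 6 / 4 }   # 4 data + 2 parity -> 1.5x
    }
}

Get-RawCapacityGB -UsableGB 20 -Scheme "RAID1-FTT1"   # 40 GB
Get-RawCapacityGB -UsableGB 20 -Scheme "RAID5"        # ~26.7 GB, the "~27GB" above
```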
Use case Information:
Erasure codes offer “guaranteed” capacity reduction, unlike deduplication and compression. For customers who have “no thin provisioning” policies, have data that is already compressed and deduplicated, or have encrypted data, this offers “known/fixed” capacity gains.
This can be applied on a granular basis (Per VMDK) using the Storage Policy Based Management system.
30% Savings.
Note: All Flash VSAN only.
Note: Not supported with stretched clusters
Note: this does not require the cluster size to be a multiple of 4, just 4 or more.
Point-in-time view of the state of the cluster
Geared to hardware – ensuring that everything is functioning as expected (disks, network, objects, components)
We are about to wrap up this session, but I want to leave you with one more thing. VSAN is being extended to serve as a generic storage platform. In addition to the traditional virtualization use cases of VMs and vSCSI disks, VSAN will also be able to serve storage through new abstractions: lightweight block drivers (perhaps using the NVMe protocol), files, and REST APIs. That is storage that can be made available to individual hosts, or shared according to the protocol semantics across many hosts and application instances in the infrastructure. Besides that, VMware has been prototyping a distributed file system which leverages Virtual SAN as its core storage provider and serves storage capacity in an easy and distributed fashion to thousands of clients. Yes, the future is bright, and this is just the beginning.
Icons: OpenStack, Pivotal Cloud Foundry, NGINX, Mesos, Docker.
VVols – making the VMDK a first class citizen on the storage system.
Think about how we handle storage right now – create a LUN on array, present LUN to vSphere, place file system on LUN, then use it to store files. We’re dealing with LUN/Volumes & files. Let’s change the granularity to work with VMDKs.
Per-VM data services on storage systems
Our goal is to provide customers an option to use per-VM data services on the storage systems
So, we want to build a framework where we can offload ANY per-VM data operations to the storage array. What are per-VM operations? Create, Delete, Clone, Snapshot, Replicate, Zero, Reclaim Space, etc
And we want to do it with minimal disruption to existing processes or infrastructure
The problem with achieving this goal is that storage arrays don’t think in terms of VMDKs; they think in LUNs or volumes. So there’s a granularity mismatch between vSphere and storage.
We want the storage systems to also think in terms of VMDKs
Other VVols are a generic type of VVol for solution-specific objects, e.g. HBR sidecar files, CBRC files, etc.
vSphere Virtual Volumes is a management and integration framework that delivers a more efficient operational model for external storage.
Virtual Volumes virtualizes SAN and NAS devices into logical pools of capacity, called virtual datastores.
Then, Virtual Volumes represents virtual disks natively on the underlying physical storage. This makes the virtual disk the primary unit of data management at the array level.
It becomes possible to execute storage operations with VM granularity and to provision native array-based data services to individual VMs.
To enable efficient storage operations at scale, Virtual Volumes uses vSphere Storage Policy-Based Management
Both Virtual Volumes and SPBM are offered as standard features of the vSphere platform, from a pricing and packaging standpoint.
Container:
• Size based on array capacity
• Max number of SCs depends only on the array’s ability
• Size of an SC can be extended
• Can distinguish heterogeneous capabilities for different VMs (Virtual Volumes) provisioned in that SC
LUN:
• Fixed size mandates a larger number of LUNs
• Needs a filesystem
• Can only apply a homogeneous capability to all VMs (VMDKs) provisioned in that LUN
• Managed by in-band filesystem commands
Storage Container Discovery Process:
1. Storage admin sets up a Storage Container with the desired capacity
2. Desired capabilities are applied to the Storage Container
3. The VASA Provider discovers the Storage Container and reports it to vCenter
4. Any new VMs that are created will subsequently be provisioned in the Storage Container
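Once discovered, a container surfaces in vSphere as a VVol datastore. A hedged PowerCLI sketch; the Type value “VVOL” is my assumption of the reported string and should be checked on your build:

```powershell
# Sketch: list datastores backed by VVol storage containers.
Get-Datastore | Where-Object { $_.Type -eq "VVOL" } |
    Select-Object Name, CapacityGB, FreeSpaceGB
```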
Storage awareness, such as capabilities, status, etc.
Why the concept of a PE?
In today’s LUN-Datastore world, the datastore has two purposes – It serves as the access point for ESXi to send IO to. It also serves as storage container to store many VM files (VMDKs). This dual-purpose nature of this entity poses several challenges – You do not need as many access points as you need the storage itself. Because of the rigid nature of the size of the datastore, and the fewer number of datastores, you have to combine several VMs together to be stored in the same datastore even if the VMs have different requirements.
So, how about we separate the concept of the access point from the storage aspect? This way, a small number of access points can serve a large number of storage entities. Hence the introduction of the PE.
Dell EqualLogic VASA Provider 2.5.1
Fujitsu ETERNUS VASA Provider 2.0
Hitachi Storage Provider for VMware
3PAR VASA Provider 2.1
IBM Storage Provider for VMware VASA 2.0.0
NEC Storage VASA Provider 2.1
NetApp VASA Provider for Clustered Data ONTAP 6.0
SANBlaze VASA Provider 7.3
Add value to VVol/VSAN as part of the larger SDS vision of extending to 3rd party data services
– Extend the SDS vision to the ecosystem… done through VVol, VAIO and VADP
VAIO is the newest
Partners are VAIO Integrated
If you look at the world of IT today, reality is that in most organizations people care more about their infrastructure than their workloads. However, if you look through the eyes of the application owner, they all feel they are the center of the universe. They all feel the world revolves around them and that they are most important.
And in a way that is true. IT exists today because of these applications, because there is a business need. Instead of focusing on the infrastructure we should aim to focus on the workloads. Policy Based Management and Software Defined Storage allow you to do exactly that: focus on workloads and ensure you meet their requirements and wishes. And this is reality today: Virtual SAN, Virtual Volumes and the vSphere APIs for IO Filtering enable exactly that.
With that I would (click) like to thank you and open the floor for questions