2. Today’s session
• Dell’s OpenStack Cloud Solution design tenets
– Co-engineering
– Reference Architectures
– Extensions
• Intel: Why and how performance matters in OpenStack
– The hidden impacts if performance is not managed
– Noisy neighbors
– Service Compute Units, Workload placement
3. OpenStack for the enterprise
• Repeatable, validated end-to-end
• Fully supported software, infrastructure
• Massively scalable, elastic infrastructure
• Solutions to get you agile
• Efficient deployment, rapid time to value
• Performance tuned configurations
• Extended lifecycle support
• Expertise to complement in-house skills
4. Snowflakes and Houses of Cards
Unstable
Not Repeatable
Not Optimized
Hard to Support
Impossible to Modify
One-of-a-Kind!
It WILL collapse!
5. • Address the gaps around OpenStack for
Enterprise use cases via a jointly engineered
solution
• Deliver enterprise-grade, certified, highly scalable,
secure, and fully supported cloud
infrastructure solutions
• Provide a fulfillment, deployment, and support
experience like commercial software
• Support specific enterprise use cases: cloud
applications, software-defined storage, and
development and test
Dell Red Hat Cloud Solutions
Make OpenStack the de facto OPEN Cloud Solution in the Enterprise
6. Solution components
Dell PowerEdge
Dell EqualLogic
Dell Networking
Dell reference
architecture
Red Hat Enterprise Linux
OpenStack Platform
Dell Professional Services
Red Hat Professional Services
Dell ProSupport
Dell Red Hat
Enterprise Cloud
Solution Extensions
7. Co-Engineered Reference Architecture
• Validated and Prescriptive
– Server configurations
– Network Design and configurations
– Storage configurations
– System sizing recommendations
• Robust, Reliable, Repeatable
– Blueprint for success
– Capture and package joint expertise
– Use cases, best practices
– Accelerate your time to value
No more houses of cards
9. • Investing in OpenStack
• Vendors you can count on!
• Validated with Dell Ref. Architecture*
• Deployment guidance*
• Sizing guidance and recommendations*
• Professional services and training
• Deployment services and tools
• Collaborative support
Dell and Intel Service Assurance Administrator
* Under development
11. Software Defined Infrastructure – Service Assurance
Automate On Standard IA Hardware
STATIC, MANUAL
INFRASTRUCTURE
Storage Network Compute
Resource Pool
Orchestration Software
Infrastructure Attributes
Governance, Policy
Cooling
Power
Trust and Compliance
Application
A
Application
B
Application
C
Application
D
Intelligent service monitoring
and operational event alerts
SLA Requests
Monitoring
Utilization, Performance, Location, Thermals, Compliance, Power
Orchestration Software
12. Trust
Performance
Public
Application Workloads & Data
Private
Physical and virtualized machine
management
Challenges in Running Enterprise and Telco Workloads in
the Cloud
CPU
IOPS
Memory
Memory b/w
CPU cache
Instruction set
Capability
Capacity
Consumption
Availability
Resilience
DATA
REGULATIONS
LEGACY
WORKLOADS
Telemetry is hidden
COMPLIANCE
Is my workload running
on trusted infrastructure?
How do I
avoid noisy
neighbors?
Will my
workload get
expected
compute cycles?
13. Intel® Service Assurance Administrator enhances
OpenStack Cloud Services
Nova
Scheduler
Plug-In
Compute Node
Machine
Flavor
Creator
Analysis &
Remediation
Engine
Service
Assurance
Controller
Monitoring
Engine
Capacity
Insight
REST API
Web
Admin
Console
Compute Node
Compute Node Agent
The Challenge
Workloads do not get predictable performance
Limited trust of multi-tenant nodes
Grey machines (unhealthy state) are not avoided
Workload scheduling is less than optimal
www.intel.com/assurance
The Solution
• Intelligent Workload Placement based on ability of nodes
to meet service level objectives
• Compute node capacity and utilization metrics normalized
for use across generations of CPUs
• VM performance monitoring and assurance with cache
contention and ‘Noisy Neighbor’ detection
• Reporting of trust of VMs and nodes – BIOS, Hypervisor
whitelisting and attestation
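The placement idea in the bullets above can be sketched roughly as follows. This is a minimal illustration, not Intel® SAA's scheduler plug-in: the `Node` fields, the `rank_nodes` helper, and the "most headroom first" ordering are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical per-node telemetry record (field names are assumptions)."""
    name: str
    trusted: bool        # result of boot-time attestation
    healthy: bool        # False marks a "grey" machine
    capacity_scu: float  # normalized compute capacity of the node
    used_scu: float      # SCUs already reserved or consumed

    @property
    def free_scu(self) -> float:
        return self.capacity_scu - self.used_scu


def rank_nodes(nodes, required_scu, require_trust=True):
    """Return nodes able to host the workload, best candidate first."""
    eligible = [
        n for n in nodes
        if n.healthy
        and (n.trusted or not require_trust)
        and n.free_scu >= required_scu
    ]
    # Assumed ranking rule: prefer the node with the most headroom left.
    return sorted(eligible, key=lambda n: n.free_scu - required_scu, reverse=True)


nodes = [
    Node("node-a", trusted=True, healthy=True, capacity_scu=32, used_scu=30),
    Node("node-b", trusted=True, healthy=True, capacity_scu=32, used_scu=10),
    Node("node-c", trusted=False, healthy=True, capacity_scu=64, used_scu=0),
    Node("node-d", trusted=True, healthy=False, capacity_scu=64, used_scu=0),
    Node("node-e", trusted=True, healthy=True, capacity_scu=16, used_scu=4),
]
print([n.name for n in rank_nodes(nodes, required_scu=4)])
# → ['node-b', 'node-e']
```

Here node-a lacks capacity, node-c is untrusted, and node-d is a grey machine, so only node-b and node-e remain.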
14. Why Intel® Service Assurance
Administrator?
Match workloads to platforms,
based on capability and
capacity
Find and address software-
defined infrastructure issues
Service assurance for
trust-attestation
• Performance metric for
infrastructure capacity
and utilization
• Cache Contention
• System health
• Resource metering
• Boot time trust
attestation of nodes
• Whitelisting
Intel® Architecture Platform Monitoring and Control
“What size virtual machine must
I use for my app?”
“My app is slow sometimes.
How do I diagnose it?”
“Do I have ‘noisy neighbor’
VMs?”
“Are my VMs running on trusted
platforms?”
15. Intel® Service Assurance Administrator
Automation
Automated software-defined infrastructure
Enhance OpenStack* to provision and
monitor machine flavors with specified
service levels
Efficiency
Efficient service assurance and
administration
Integrate with IT operations tools to
determine probable root cause, report,
and help remediate issues
Agility
Agile business service deployment
Run workloads with confidence on
software-defined infrastructure
Enhanced Machine Flavors
16. Key Challenges in Cloud Performance Assurance
• How big a VM do I need for my workload?
• What resources should be reserved, how much burst capacity?
• What metric do I use to specify and measure performance?
• What is the performance capacity of the node?
• What is the performance capacity of the machine flavor?
• Am I getting specified performance without interference from ‘noisy neighbors’?
Intel® Service Assurance Administrator defines the Service Compute
Unit as a performance metric for use by cloud administrators
[Diagram: a Linux/KVM compute node hosting several VMs alongside host apps and services]
17. Monitoring Reporting
Platform monitoring to detect and
avoid ‘noisy neighbors’
Cache Contention
Memory b/w
Resource Metering
Noisy Neighbor’s impact on VM
performance
[Diagram: shared infrastructure with two groups of eight cores, each group sharing an L3 cache; the hypervisor runs several VMs alongside OS applications and services]
18. What is a Service Compute Unit (SCU)?
Provisioning Orchestration
Match workloads to platforms,
based on capability and capacity
Service Compute Units
Performance metric for cloud
infrastructure capacity and utilization
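vCPU approach vs. Intel® Service Assurance Administrator
• Resource allocation
– vCPU approach: static, processor dependent
– Intel® Service Assurance Administrator: SCU, dynamic and processor independent
• Machine portability
– vCPU approach: not portable across processor families or generations
– Intel® Service Assurance Administrator: SCU is portable across generations of Intel CPUs and across processor families
• Performance service level monitoring
– vCPU approach: none
– Intel® Service Assurance Administrator: service assurance of SCUs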
SCU = function of (frequency, throughput,
instruction set efficiency, cache size)
1 GCEU = 2.5 SCUs (Approximate)
1 EC2 = 2.5 to 3 SCUs (Approximate)
SCUs can be set to define a floor (guaranteed
capacity) and ceiling (burst capacity)
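A minimal sketch of the floor and ceiling idea, assuming a simple rule: a VM always receives up to its floor, and anything above the floor is granted only from spare node capacity, never beyond the ceiling. The function names and the grant rule are illustrative assumptions; only the approximate 2.5 SCUs-per-GCEU ratio comes from the slide.

```python
SCU_PER_GCEU = 2.5  # approximate ratio quoted on the slide


def gceu_to_scu(gceu):
    """Convert GCEUs to SCUs using the approximate slide ratio."""
    return gceu * SCU_PER_GCEU


def grant_scu(floor, ceiling, demand, spare):
    """Return the SCUs granted to one VM (hypothetical rule).

    floor   -- guaranteed capacity, always available to the VM
    ceiling -- burst limit, never exceeded
    demand  -- SCUs the VM currently asks for
    spare   -- unreserved SCUs on the node that can be used for bursting
    """
    if not 0 <= floor <= ceiling:
        raise ValueError("need 0 <= floor <= ceiling")
    wanted = min(demand, ceiling)
    if wanted <= floor:
        return wanted
    # Above the floor, the VM can only use what the node has to spare.
    return floor + min(wanted - floor, max(spare, 0))


print(grant_scu(floor=1, ceiling=2, demand=0.5, spare=4))   # → 0.5
print(grant_scu(floor=1, ceiling=2, demand=3, spare=4))     # → 2
print(grant_scu(floor=1, ceiling=2, demand=3, spare=0.25))  # → 1.25
```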
19. How much performance capacity is available on my
host?
For each compute node, the performance
capacity status of the OS and VMs is
monitored continuously, displaying how
much resource is available, e.g. SCUs and
number of cores available
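The arithmetic behind such a display can be sketched as below, assuming available capacity is simply node capacity minus what the OS and the VMs have reserved. The function name and the input layout are hypothetical.

```python
def available_capacity(node_scu, node_cores, os_scu, vms):
    """Return free SCUs and free cores on one node.

    vms is a list of (reserved_scu, pinned_cores) tuples, one per VM.
    """
    used_scu = os_scu + sum(scu for scu, _ in vms)
    used_cores = sum(cores for _, cores in vms)
    return {
        "scu": max(node_scu - used_scu, 0),
        "cores": max(node_cores - used_cores, 0),
    }


# Example: a 32-SCU, 16-core node with 2 SCUs set aside for the OS.
print(available_capacity(32, 16, os_scu=2, vms=[(1, 0), (4, 2), (2, 1)]))
# → {'scu': 23, 'cores': 13}
```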
20. How much cache contention is there in the compute
nodes?
Using Cache Monitoring Technology, the
console displays last-level cache (LLC) contention
status for the OS and VMs on each compute
node
21. Extending flavors with VM bursting and assurance
capabilities
VM will only be created on
nodes that have been trust-
attested to be safe during
boot
VM will reserve 1 SCU of compute
performance, and burst up to 2 SCU
based on available compute
resources
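One way to picture such an extended flavor is as a plain record plus an admission check. The key names below (`scu_reserved`, `scu_burst`, `trust_required`) and the node fields are hypothetical and are not the actual OpenStack or Intel® SAA extra-spec names.

```python
# Hypothetical flavor description; every key name is illustrative only.
flavor = {
    "name": "assured.small",
    "vcpus": 1,
    "ram_mb": 2048,
    "extra_specs": {
        "scu_reserved": 1,       # guaranteed compute performance (floor)
        "scu_burst": 2,          # burst limit (ceiling)
        "trust_required": True,  # only place on trust-attested nodes
    },
}


def can_host(node, flavor):
    """Check whether a node may host a VM of this flavor."""
    specs = flavor["extra_specs"]
    if specs["trust_required"] and not node["trust_attested"]:
        return False
    # Only the reserved floor must be free; bursting uses spare capacity.
    return node["free_scu"] >= specs["scu_reserved"]


print(can_host({"trust_attested": True, "free_scu": 1.5}, flavor))   # → True
print(can_host({"trust_attested": False, "free_scu": 8.0}, flavor))  # → False
```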
22. Extending flavors with trust and core-pinning capabilities
VM will reserve one
dedicated core from the
CPU compute resource
VM will only be created on
nodes that have been trust-
attested to be safe during
boot
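Core pinning can be sketched as exclusive bookkeeping over a node's free cores. This is an illustrative model only; `pin_core` and its "lowest-numbered free core" rule are assumptions, and a real hypervisor would apply the pinning through its own CPU affinity mechanism.

```python
def pin_core(free_cores, pinned, vm_id):
    """Give vm_id one dedicated core and record the assignment.

    free_cores -- set of core ids not yet dedicated to any VM (mutated)
    pinned     -- dict mapping vm_id to its dedicated core id (mutated)
    """
    if vm_id in pinned:
        return pinned[vm_id]      # already pinned, keep the same core
    if not free_cores:
        raise RuntimeError("no dedicated core available on this node")
    core = min(free_cores)        # assumed rule: lowest-numbered free core
    free_cores.remove(core)
    pinned[vm_id] = core
    return core


free, pinned = {0, 1, 2, 3}, {}
print(pin_core(free, pinned, "vm-a"))  # → 0
print(pin_core(free, pinned, "vm-b"))  # → 1
print(sorted(free))                    # → [2, 3]
```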
23. Finding out the performance consumption level for each
compute node at the OS level
Allocating SCU for OS
level consumption of
compute resource
24. Finding out VM core-pinning status and metrics
Core pinning: the VM received a
dedicated core from the
CPU
VM SLO status includes the
number of cores assigned
and VM uptime
25. Learning more about each VM’s SCU utilization metrics
Real Time VM SCU
utilization and core
pinning metrics
26. Conducting probable cause analysis of “Noisy Neighbor”
problems
A cloud administrator can take remedial action (e.g. evacuate a VM to a different node)
once Intel® SAA has identified noisy neighbors (VMs aggressively using shared
compute resources in the platform, like CPU cache) and the VMs they affect.
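A rough sketch of how aggressor and affected VMs might be separated from per-VM cache telemetry. The thresholds, the metric names, and the classification rule are assumptions made for illustration and do not describe Intel® SAA's actual analysis engine.

```python
def find_noisy_neighbors(vms, llc_bytes, aggressor_share=0.5, victim_miss_rate=0.3):
    """Split VMs into (aggressors, victims) from cache telemetry.

    vms maps a VM name to {"llc_occupancy": bytes, "llc_miss_rate": 0..1}.
    Both thresholds are hypothetical tuning knobs.
    """
    aggressors = sorted(
        name for name, m in vms.items()
        if m["llc_occupancy"] / llc_bytes >= aggressor_share
    )
    if not aggressors:
        return [], []  # nobody is hogging the cache, so report no victims
    victims = sorted(
        name for name, m in vms.items()
        if name not in aggressors and m["llc_miss_rate"] >= victim_miss_rate
    )
    return aggressors, victims


MIB = 1024 * 1024
sample = {
    "vm1": {"llc_occupancy": 14 * MIB, "llc_miss_rate": 0.05},
    "vm2": {"llc_occupancy": 2 * MIB, "llc_miss_rate": 0.45},
    "vm3": {"llc_occupancy": 1 * MIB, "llc_miss_rate": 0.10},
}
print(find_noisy_neighbors(sample, llc_bytes=20 * MIB))
# → (['vm1'], ['vm2'])
```

With this output an administrator could, for example, evacuate vm1 or vm2 to a different node.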
27. Intel® Service Assurance Administrator Key Feature Summary
Feature: Compute throughput metering
Customer pain point: How do I know how many VMs and workloads I can put on a compute node? How do I specify how much compute capability is needed by my VM?
Solution: SCU (Service Compute Unit), a portable compute performance metric
• Characterize node’s compute capacity
• Specify required VM capacity
• Measure utilization
Feature: Host OS and App Services Assurance
Customer pain point: How do I know that applications or OS services such as Ceph or backup are not creating issues for my VMs?
Solution: Enable creation of an Assurance SLA to grant compute quota to host apps and OS services
Feature: Contention Aware Scheduling
Customer pain point: How do I detect ‘noisy neighbor’ VMs and affected VMs? How do I automate the VM scheduling process?
Solution: Intelligent Workload Placement based on ranking of each node’s ability to meet service level objectives, using telemetry
Feature: Grey Machine Avoidance
Customer pain point: How do I avoid systems that might not be healthy, e.g. running hot with fan failure?
Solution: Gather system health info via IPMI and use it as a metric in workload placement
Feature: Trusted Compute Pool, policy enforcement
Customer pain point: How do I find out whether my workloads are running on trusted nodes?
Solution: Use Intel® Trusted Execution Technology for boot-time attestation of each node’s FW/SW
28. Contact the Dell Red Hat OpenStack Solutions team
• openstack@dell.com
Engage a Dell Solution Center (contact your Dell Rep)
• dell.com/solutioncenters
Dell Cloud resources sites
• Dell.com/openstack
• Dell.com/redhat
• youtube.com/Dell
Intel SAA Resources Site
• www.intel.com/assurance
• Contact Intel SAA team by clicking “Contact Us”
Resources & Contacts
32. Intel Confidential — Do Not Forward
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations
that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction
sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any
optimization on microprocessors not manufactured by Intel.
Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors.
Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer
to the applicable product User and Reference Guides for more information regarding the specific instruction
sets covered by this notice.
Notice revision #20110804
Optimization Notice
33. Dell and Red Hat Enterprise Cloud Solutions
Prescriptive, practical, right-sized choices fit for you
PoC System
• Dev/test
• Concept testing
• Prototyping
• Rapid deploy
Pilot System
• Mid-scale production
• Scale testing
• Sizing options, 1-3 racks
• 10G networking
• High Availability
• Optimized Inktank Ceph
Scale out System
• Data center scale
• Production workloads
• PowerEdge R, C Series
• 10G networking
• High Availability
• Optimized Inktank Ceph
• Tailored designs
• Customized integrations
with OpenStack
ecosystem technologies
34. Shift your focus from infrastructure to service
IT silos
Server
Storage
Networking
Software
Management
Management
Open standards-based hardware
Infrastructure control
Enterprise applications & workloads
Multi-vendor, cross-platform
unified management
Orchestration stack