Optimising Service Deployment and Infrastructure Resource Configuration

Data Centre challenges.
• Decrease Operational Costs,
• Maintain Consistent Performance,
• Increase Scale,
• Innovate, and deliver more value.
[Figure: More Capacity, More Complexity. Challenge labels: TCO, Performance, Over-provisioning, Utilization, Energy, Allocation, Management, Availability, More Resources. Trend panels: Application Growth (every 2 years), Data Volume (every 18 months), Operational Costs (every 8 years), Reduction in Compute Costs (every 2 years), Increase in Management & Administration (footnote 1). Values shown: 2x, 50%, 2x, 2x, 8x.]
1 – IDC Directions ‘14 - 2014
Source: Worldwide and Regional Public IT Cloud Services 2013-2017 Forecast. IDC (August 2013) http://www.idc.com/getdoc.jsp?containerId=242464

Why we need to understand infrastructure
• The T-Nova* project demonstrates a 10x performance improvement when a Network Traffic Analyzer is placed on a machine that is SR-IOV enabled.
• However, it is not feasible to place workloads manually at scale.
• How can we automatically match workloads to suitable infrastructure?
[Figure: VNFC Performance, Bytes Per Second. Series: Total Traffic, Standard Deployment, Enhanced Deployment.]
Matching workload types to hardware features can improve performance.
* http://www.t-nova.eu/
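
One possible way to approach the matching question is to describe each host by the hardware features it exposes and each workload by the features it needs, then filter and rank. The sketch below is only an illustration: the host names, the feature labels, and the function `suitable_hosts` are hypothetical and are not taken from T-Nova or from any orchestrator.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    features: set = field(default_factory=set)  # e.g. {"sriov", "aes-ni"}


@dataclass
class Workload:
    name: str
    required: set = field(default_factory=set)   # features the workload must have
    preferred: set = field(default_factory=set)  # nice-to-have features


def suitable_hosts(workload, hosts):
    """Return hosts that offer every required feature, best match first."""
    candidates = [h for h in hosts if workload.required <= h.features]
    # Rank by number of preferred features offered; ties are broken by name.
    return sorted(
        candidates,
        key=lambda h: (-len(workload.preferred & h.features), h.name),
    )


# Hypothetical inventory and workload description.
hosts = [
    Host("node-a", {"sriov", "10gb"}),
    Host("node-b", {"aes-ni"}),
    Host("node-c", {"sriov", "10gb", "aes-ni"}),
]
analyzer = Workload("traffic-analyzer", required={"sriov"}, preferred={"aes-ni"})
matches = [h.name for h in suitable_hosts(analyzer, hosts)]
print(matches)
```

Hosts without SR-IOV are excluded outright, and the remaining hosts are ordered by how many preferred features they also provide.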

Infrastructure Landscape
Goal: Support setup and run-time orchestration for optimised service delivery by defining and maintaining a layered landscape:
• Physical
• Virtual
• Service
Nodes in each layer are enriched by telemetry

Landscaper Overview
Graph representation of the physical, virtual, and service layers of the infrastructure landscape
• Landscape nodes have a category:
  • Compute, Storage, Network
• Landscape history
  • Edges have a ‘from time’ and a ‘to time’
• Landscape state
  • Landscape nodes have state nodes
• Data gathered by collectors
• Data export via RESTful API
  • JSON string (networkx)
[Figure: layered landscape. Physical: Xeon E5, Xeon Phi, AES-NI, Atom, SSD, NVM, 10Gb. Virtual: Virtual Storage, Virtual Network, Virtual Machine. Service: Object store, Video transcode, Wordpress, ERP.]
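
To illustrate how time-stamped edges allow the landscape to be queried as it was at a given moment, here is a minimal sketch using only the standard library. The node names, the timestamps, and the function `landscape_at` are hypothetical. The output dictionary uses a node-link layout (`directed`, `multigraph`, `graph`, `nodes`, `links`) similar to the one networkx uses for JSON exchange; the exact key names can vary between networkx versions, and the slides do not show the actual export format.

```python
import json

# Hypothetical landscape: nodes carry a layer and a category; edges carry a
# validity interval ("from" / "to"), where a "to" of None means still valid.
nodes = [
    {"id": "machine-1", "layer": "physical", "category": "compute"},
    {"id": "vm-1", "layer": "virtual", "category": "compute"},
    {"id": "stack-1", "layer": "service", "category": "compute"},
]
links = [
    {"source": "stack-1", "target": "vm-1", "from": 100, "to": None},
    {"source": "vm-1", "target": "machine-1", "from": 100, "to": 200},
]


def landscape_at(timestamp):
    """Return the sub-landscape whose edges were valid at `timestamp`."""
    live = [
        e for e in links
        if e["from"] <= timestamp and (e["to"] is None or timestamp < e["to"])
    ]
    used = {e["source"] for e in live} | {e["target"] for e in live}
    return {
        "directed": True,
        "multigraph": False,
        "graph": {},
        "nodes": [n for n in nodes if n["id"] in used],
        "links": live,
    }


snapshot = landscape_at(250)
payload = json.dumps(snapshot)  # JSON string, as a REST endpoint might return it
print(payload)
```

Keeping expired edges instead of deleting them is what makes historical queries possible: the same data answers both "what is deployed now" and "what was deployed when the incident happened".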

Landscaper Collectors
• Plugin architecture
• Can detect and update based on events
• Current collectors
  • hwloc (internal components) and cpuinfo (enriches the core/pu attributes)
  • OpenStack Heat
  • OpenStack Nova
  • OpenStack Neutron
  • OpenStack Cinder
  • Docker Swarm
  • OpenDaylight
  • Importer (.csv)
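
A plugin architecture with event-driven updates could be sketched as below. This is an assumption about the general shape, not the Landscaper's real interface: the class names `Collector` and `VmCollector`, the method names, and the event types are all hypothetical.

```python
from abc import ABC, abstractmethod


class Collector(ABC):
    """Hypothetical base class for a landscape collector plugin."""

    @abstractmethod
    def init_graph(self, graph):
        """Populate the graph with everything the source currently knows."""

    @abstractmethod
    def on_event(self, graph, event):
        """Update the graph in response to a single change event."""


class VmCollector(Collector):
    """Toy collector that tracks virtual machines from an in-memory list."""

    def __init__(self, vms):
        self.vms = vms

    def init_graph(self, graph):
        for vm in self.vms:
            graph[vm] = {"layer": "virtual", "category": "compute"}

    def on_event(self, graph, event):
        if event["type"] == "vm.create":
            graph[event["id"]] = {"layer": "virtual", "category": "compute"}
        elif event["type"] == "vm.delete":
            graph.pop(event["id"], None)


# The graph is a plain dict here; each registered plugin fills and updates it.
graph = {}
collectors = [VmCollector(["vm-1"])]
for c in collectors:
    c.init_graph(graph)
events = [{"type": "vm.create", "id": "vm-2"}, {"type": "vm.delete", "id": "vm-1"}]
for event in events:
    for c in collectors:
        c.on_event(graph, event)
print(sorted(graph))
```

Adding a new data source then means adding one more subclass to the `collectors` list, without touching the others.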

Service Stack (10x view)
[Figure: example service stack graph across the Service Layer, Virtual Layer, and Physical Layer. Node types shown: stack, vm, vnic, network, subnet, volume, machine, numanode, socket/package, core, pu, cache (L3 Cache, L2 Cache, L1 Data Cache, L1 Instruction Cache), bridge, pcidev, switch, osdev_storage, osdev_network. Node categories: Compute Category, Network Category, Storage Category. Data sources: Heat, Cinder, Nova, Neutron, OpenDaylight, hwloc + cpuinfo.]

Compute Meta-Data: Physical Layer
• machine: ID, NAME, CATEGORY, LAYER, ARCHITECTURE, OS_NAME, OS_VERSION, OS_RELEASE, OS_INDEX, ALLOCATION, PROCESS_NAME, HW_LOC_VERSION, DMI_BOARD_VENDOR, DMI_BOARD_NAME, DMI_BOARD_SERIAL, DMI_BOARD_VERSION, DMIN_BIOS_DATE, DMI_BIOS_VENDOR, DMI_BIOS_VERSION, DMI_SYS_VENDOR, DMI_CHASSIS_VENDOR, DMI_CHASSIS_TYPE, DMI_CHASSIS_ASSET_TAG, DMI_CHASSIS_SERIAL, DMI_PRODUCT_NAME, DMI_PRODUCT_UUID, DMI_PRODUCT_VERSION, LINUX_GROUP, BACKEND, NODESET, COMPLETE_NODESET, ALLOWED_NODESET, CPUSET, COMPLET_CPUSET, ALLOWED_CPUSET, ONLINE_CPUSET, COSTS
• numanode: ID, NAME, CATEGORY, LAYER, OS_INDEX, ALLOCATION, LOCAL_MEMORY, NODESET, COMPLETE_NODESET, ALLOWED_NODESET, CPUSET, COMPLET_CPUSET, ALLOWED_CPUSET, ONLINE_CPUSET
• bridge: ID, NAME, CATEGORY, LAYER, OS_INDEX, ALLOCATION, BRIDGE_PCI, BRIDGE_TYPE, PCI_LINK_SPEED, PCI_BUS_ID, PCI_TYPE, DEPTH
• pcidev: ID, NAME, CATEGORY, LAYER, OS_INDEX, ALLOCATION, PCI_SLOT, PCI_LINK_SPEED, PCI_BUS_ID, PCI_TYPE

Compute Meta-Data: Physical Layer (continued)
• package: ID, NAME, CATEGORY, LAYER, OS_INDEX, ALLOCATION, CPU_FAMILY_NUMBER, CPU_VENDOR, CPU_MODEL_NUMBER, CPU_MODEL, CPU_STEPPING, NODESET, COMPLETE_NODESET, ALLOWED_NODESET, CPUSET, COMPLET_CPUSET, ALLOWED_CPUSET, ONLINE_CPUSET
• cache: ID, NAME, CATEGORY, LAYER, ALLOCATION, CACHE_SIZE, CACHE_LINESIZE, CACHE_ASSOCIATIVITY, NODESET, COMPLETE_NODESET, ALLOWED_NODESET, CPUSET, COMPLET_CPUSET, ALLOWED_CPUSET, ONLINE_CPUSET
• core: ID, NAME, CATEGORY, LAYER, OS_INDEX, ALLOCATION, NODESET, COMPLETE_NODESET, ALLOWED_NODESET, CPUSET, COMPLET_CPUSET, ALLOWED_CPUSET, ONLINE_CPUSET
• pu: ID, NAME, CATEGORY, LAYER, OS_INDEX, WP, ALLOCATION, CPUID_LEVEL, CPU_CORES, CORE_ID, CPU MHZ, MICROCODE, VENDOR_ID, CPU_FAMILY, APICID, INTIAL_APICID, SIBLINGS, ADDRESS_SIZES, MODEL, MODEL_NAME, STEPPING, CACHE_SIZE, CACHE_ALLIGNMENT, NODESET, COMPLETE_NODESET, ALLOWED_NODESET, CPUSET, COMPLET_CPUSET, ALLOWED_CPUSET, ONLINE_CPUSET, PHYSICAL_ID, FPU, FLAGS, BOGOMIPS, CLF_FLUSHSIZE
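
Many of the pu attributes above resemble fields of Linux `/proc/cpuinfo`. As a rough sketch of how such attributes might be gathered, the parser below turns cpuinfo-style text into one attribute dictionary per processing unit. The sample text is made up, `parse_cpuinfo` is a hypothetical helper, and the key convention (upper case, spaces replaced by underscores) is an assumption rather than the collector's documented behaviour.

```python
# Made-up excerpt in the "key : value" layout used by /proc/cpuinfo.
SAMPLE = """\
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model name  : Example CPU
cpu MHz     : 2400.000

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model name  : Example CPU
cpu MHz     : 2400.000
"""


def parse_cpuinfo(text):
    """Split cpuinfo-style text into one attribute dict per processor."""
    units = []
    for block in text.strip().split("\n\n"):
        attrs = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if not sep:
                continue  # skip lines without a "key : value" shape
            # Assumed naming convention: upper case, spaces become underscores.
            attrs[key.strip().upper().replace(" ", "_")] = value.strip()
        units.append(attrs)
    return units


pus = parse_cpuinfo(SAMPLE)
print(pus[0]["MODEL_NAME"], pus[1]["PROCESSOR"])
```

On a Linux host the same function could be fed the contents of `/proc/cpuinfo` read from disk.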

Enrichment Through Telemetry
Snap: a lightweight, modular, programmable telemetry system
• Unified namespace, Configurable at run time, Dynamically derived metrics
• Integration of diverse data for analysis
• Calculation of generic node metrics across the stack (e.g. Utilization & Saturation)
[Figure: telemetry pipeline stages: Instrumentation, Logs, Capture, Store, Transform & Prepare, Access.]
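
Generic node metrics such as utilization and saturation can be derived from raw samples in the same way for any node type. The sketch below assumes hypothetical sample values and uses simple textbook definitions (busy time over elapsed time, and mean queue length); it does not use Snap's API.

```python
def utilization(busy_seconds, interval_seconds):
    """Fraction of the interval the resource spent doing work (0.0 to 1.0)."""
    return busy_seconds / interval_seconds


def saturation(queue_lengths):
    """Average amount of queued work the resource could not yet service."""
    return sum(queue_lengths) / len(queue_lengths)


# Hypothetical samples for one landscape node over a 10 second window.
node_metrics = {
    "utilization": utilization(busy_seconds=7.5, interval_seconds=10.0),
    "saturation": saturation([0, 2, 4, 2]),
}
print(node_metrics)
```

Because the inputs are just busy time and queue depth, the same two functions apply to a CPU, a disk, or a network interface.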

Snap - architecture
• Full stack: motherboards, CPUs, memory, disks, operating systems, hypervisor, guest operating system, hosted applications
• Performant. Scalable. Dynamically reconfigurable. Secure. Extensible. Manageable.

Snap - telemetry
Workflow: Collect, Process, Publish
$ go get github.com/intelsdi-x/snap
http://snap-telemetry.io/
Plugin Catalogue (github)

Adaptive telemetry – anomaly detection approach
Challenge: Sending all data all the time
• overflows the system with “redundant” information.
Goal: reduce data transfer while preserving the essence of the data
Approach & Findings:
• Pluggable anomaly detection algorithm
• Increased transmission rate around outliers only
• Transmissions typically reduced by >10x
[Figure: % utilization of CPU against elapsed time (seconds) for Machine 1, Machine 2, and Machine 3.]
Machine 3

Contextual Information
• Automatic application of the USE methodology
• Ranking & cost functions
• Supports comparison of service configurations & generation of deployment templates for specific workloads
[Figure: Representation of SDI sub-graph including performance]
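
As an illustration of ranking with a cost function, the sketch below scores candidate nodes by a weighted sum of USE-style metrics (utilization, saturation, errors) and orders them from cheapest to most expensive. The metric values, the weights, and the function names are hypothetical; the cost function used in the original work is not given in the slides.

```python
def cost(node, weights):
    """Weighted sum of a node's USE-style metrics; lower is better."""
    return sum(weights[name] * node[name] for name in weights)


def rank(nodes, weights):
    """Return node names ordered from lowest to highest cost."""
    return sorted(nodes, key=lambda name: cost(nodes[name], weights))


# Hypothetical metrics per candidate node, each normalised to the range 0..1.
candidates = {
    "node-a": {"utilization": 0.80, "saturation": 0.10, "errors": 0.00},
    "node-b": {"utilization": 0.40, "saturation": 0.20, "errors": 0.00},
    "node-c": {"utilization": 0.30, "saturation": 0.05, "errors": 0.20},
}
# Assumed weights: errors are penalised most heavily, then saturation.
weights = {"utilization": 1.0, "saturation": 2.0, "errors": 5.0}
ranking = rank(candidates, weights)
print(ranking)
```

Changing the weights changes which configuration wins, which is one way to compare service configurations for different workload types.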

Application to large scale systems
Using the landscape data it is possible to develop models for:
• Optimization of initial placement
• Re-balancing actuations
• Troubleshooting
• Accounting
• Security
• Capacity planning

Network Model for vCDN deployments
Technical challenges:
• Performance of virtualisation technologies, especially virtualised storage.
• Orchestration of a multi-tenant vCDN service and infrastructure.
• Optimisation of placement and scaling of the vCDN system.
• Monitoring and repair of the vCDN system.
• Detection and mitigation of the impact of “noisy neighbours”.

Optimization approaches
1. Load to Capacity Requirement Mapping
2. Load to Telemetry Mapping
3. Infrastructure Configuration Optimization
[Figure: three panels, one per approach. Each panel relates Workload (labelled BT Workload in panel 1), Infrastructure, Resource A, Resource B, Telemetry for Resource A, Telemetry for Resource B, KPI 1, KPI 2, and Cost.]
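
A load to capacity mapping could, in its simplest form, be a fitted relationship between offered load and resource usage that is then extrapolated to a target load. The sketch below assumes a linear relationship and uses made-up observations; the function `fit_line` and the data are hypothetical, and the slides do not state which model was actually used.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical observations: offered load (requests/s) vs. CPU cores in use.
load = [100, 200, 300, 400]
cores_used = [1.0, 2.0, 3.0, 4.0]

slope, intercept = fit_line(load, cores_used)
target_load = 1000
cores_needed = slope * target_load + intercept
print(round(cores_needed, 2))
```

A linear fit is only reasonable below the saturation point of the resource; telemetry for the same resource can indicate where that assumption stops holding.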

Provider vs Customer
[Figure: roles shown: BT as Infrastructure Provider, Content Operator, End User Requesting Content. Panels: Part A: Provider vs Customer; Part B: Provider vs Customer.]

Landscape Model
[Figure: landscape model spanning UK Exchange Point, Core Sites, Metro Sites, and Multi-Service Access Points. Legend: Network Switch, Physical Servers, Virtual Machines, Service Stacks.]

[Figure: labels shown: Content Provider, consumers, MSAN, Metro Site, Core, Exchange, Costs; numbered markers 1, 2, 3.]

Success Criteria
Create a system to:
• model the performance of VNFs prior to deployment
• learn the configuration of existing networks and predict the impact of topology, application, and infrastructure changes
• improve the placement decisions of orchestration systems, raising infrastructure utilization whilst guaranteeing performance and availability SLAs
• put remediation rules in place a priori, before failures happen, ensuring rapid protection using the minimum of additional resources
• automate the remediation of unexpected/unpredicted failures in a timely fashion (several minutes).
