After launching several thousand EC2 instances in the cloud, we've learned that the key to running an IT fleet successfully on AWS is enforcing operational and economic discipline. As AWS service consumption grows, operational costs and overhead shouldn't grow linearly. Instead, IT should encourage new tenants that migrate from data centers to AWS to slowly shift toward a self-service delivery model and adopt the DevOps operations model. Creating and offering an AWS Cloud operations service catalog enables organizations to efficiently take full advantage of AWS' flexibility and modularity. T-Mobile, whose journey to AWS Cloud management started more than 2 years ago, uses a service catalog to enforce operational discipline in the Cloud. Their catalog is custom crafted for each Cloud-based IT workload. This session provides insight into the AWS Cloud operations strategy and its transformation, the creation of a Cloud operations service catalog, and how this approach supports reliable engineering on AWS. Sponsored by Accenture.
AWS Competency Partner
2. The What?
IT needs to encourage new tenants that migrate
from the data center to AWS to shift gradually toward a
self-service delivery model and embrace the
DevOps way of operations.
The How?
Creating and offering an efficient AWS Cloud
operations service catalog becomes critical in
enabling flexible and modular composition of
AWS services in large organizations.
What to Expect from the Session?
Approximately 40% of Public Cloud adoption can suffer in large IT organizations
due to a lack of operational and economic discipline. Operating discipline becomes a
must, yet operational costs and overhead must not grow linearly with consumption.
Customer Journey
T-Mobile’s AWS journey started more than three years ago and, today,
T-Mobile runs an efficient cloud operations discipline, custom-crafted for each
cloud-based IT workload.
3. Who are we?
Accenture is a leading global professional services
company, providing a broad range of services and solutions
in strategy, consulting, digital, technology and operations
across 19 industry verticals.
We’ve worked on more than 13,000 cloud computing
projects for clients, including three-quarters of the Fortune
Global 100, and are home to more than 18,000
professionals trained in cloud computing.
Accenture was named a leader in the IDC MarketScape:
Worldwide Cloud Professional Services 2016 Vendor
Assessment, demonstrating the most mature strategies
and capabilities among 17 other vendors.
The Accenture/AWS Business Group helps organizations run
their businesses in the cloud and take advantage of the
benefits of an “As-a-Service” model, where IT and business
services are delivered on demand via the AWS Cloud.
4. History of IT Infrastructure Management
• Large enterprises have traditionally structured their IT
infrastructure and operations organization based on IT Service
Management principles.
• ITIL is an implementation of IT Service Management and
comprises a set of processes, procedures, tasks, and checklists.
The original version of ITIL was a manual published in the 1980s
to help government IT departments in the UK to establish best
practices. The current version is ITIL v3, which emphasizes five lifecycle
phases of Service Management.
• Process and procedure-heavy framework for managing IT infrastructure
• Advocates strict governance and control over IT infrastructure assets
• Inflexible in adapting to the service & economic profile of public cloud infrastructure
• Often has led to establishment of domain-bounded infrastructure teams
Why does this not scale in the era of Public Cloud infrastructure?
5. IT Operations model – Revisited
IT organizations must provide a set of foundational cloud capabilities, while allowing tenants to
operate with freedom and agility to realize the power of AWS Cloud. Capabilities must constantly
undergo review to keep pace with the constantly changing technology climate.
• Constitution (vs. Control)
• Self-Service Catalog
• Automated Compliance
• Operating Model & Organization
To avoid a traditional request/response relationship between cloud consumers and Operations, IT
needs to provide an a la carte self-service catalog to unlock the scale and breadth of AWS Cloud.
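The a la carte idea above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `CatalogItem` and `Catalog` names, the item names, and the size options are hypothetical and do not come from the session. The point is only that a tenant request is validated against published catalog entries rather than routed through a request/response ticket queue.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CatalogItem:
    """One a la carte offering in a hypothetical self-service catalog."""
    name: str
    allowed_sizes: tuple
    requires_approval: bool = False


@dataclass
class Catalog:
    """Holds published offerings and validates tenant requests against them."""
    items: dict = field(default_factory=dict)

    def publish(self, item):
        self.items[item.name] = item

    def request(self, tenant, item_name, size):
        # Reject anything that is not a published offering.
        item = self.items.get(item_name)
        if item is None:
            return {"tenant": tenant, "status": "rejected", "reason": "not in catalog"}
        # Reject options the offering does not expose.
        if size not in item.allowed_sizes:
            return {"tenant": tenant, "status": "rejected", "reason": "size not offered"}
        # Only offerings flagged for approval wait on a human decision.
        status = "pending_approval" if item.requires_approval else "fulfilled"
        return {"tenant": tenant, "status": status, "reason": ""}


# Hypothetical catalog entries, for illustration only.
catalog = Catalog()
catalog.publish(CatalogItem("linux-web-stack", ("small", "medium")))
catalog.publish(CatalogItem("managed-database", ("medium", "large"), requires_approval=True))

print(catalog.request("tenant-a", "linux-web-stack", "small")["status"])
print(catalog.request("tenant-b", "managed-database", "large")["status"])
print(catalog.request("tenant-c", "linux-web-stack", "xlarge")["status"])
```

In this sketch the common case is fulfilled without involving Operations, and only the offerings explicitly marked as needing approval involve a human.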
Infrastructure compliance remains top of mind for large IT organizations. With the elasticity and
magnitude of cloud resources, investments in automating periodic compliance become vital.
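As a rough illustration of automating a periodic compliance check, the sketch below evaluates a resource inventory against two rules. The rules themselves (a required tag set and an encryption flag), the record layout, and the function names are all assumptions made for this example. A real check would read the inventory from the cloud provider's APIs instead of an in-memory list.

```python
# Hypothetical policy: every resource must carry these tags.
REQUIRED_TAGS = frozenset({"owner", "cost-center", "environment"})


def check_resource(resource, required_tags=REQUIRED_TAGS):
    """Return a list of human-readable rule failures for one resource record."""
    failures = []
    missing = sorted(required_tags - set(resource.get("tags", {})))
    if missing:
        failures.append("missing tags: " + ", ".join(missing))
    if not resource.get("encrypted", False):
        failures.append("storage not encrypted")
    return failures


def compliance_report(inventory):
    """Map resource id -> failures, listing only non-compliant resources."""
    report = {}
    for resource in inventory:
        failures = check_resource(resource)
        if failures:
            report[resource["id"]] = failures
    return report


# Illustrative inventory; ids and tag values are made up.
inventory = [
    {"id": "vol-001",
     "tags": {"owner": "team-a", "cost-center": "1234", "environment": "prod"},
     "encrypted": True},
    {"id": "vol-002", "tags": {"owner": "team-b"}, "encrypted": False},
]

for resource_id, failures in compliance_report(inventory).items():
    print(resource_id, failures)
```

Run on a schedule, a check of this shape keeps compliance effort roughly constant as the number of resources grows.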
AWS blurs the lines between compute, network, and storage, and DevOps blurs the lines between
Development and IT Operations. The next generation Cloud Operations organization does not scale
with compartmentalized teams connected via processes. Instead, multidisciplinary teams with an
experimentation mindset are vital.
6. Cloud Tenant Journey
[Figure: cloud maturity (ability to self-operate) plotted against time, with regions labeled "Tenant Managed" and "Centrally Managed by IT"]
Stages over time: Cloud Incubation & Onboarding; Cloud Learning & Adaptation; Run: Leveraging Power of Cloud
Tenants shown: Tenant A (High Maturity), Tenant B (Mid Maturity), Tenant C (Low Maturity)
Foundational Enterprise Cloud Services provided centrally by IT
7. Modern Cloud Operating Model
• Functions: How we organize ourselves to deliver services
• Processes: How we execute the work
• Interfaces: How we interact to deliver consistent services
• Governance: How we make, sponsor, and enforce the right decisions
• Roles & Org Structure: Who is accountable for doing the work
• Performance Metrics: How we measure effectiveness
• Tools: What enabling technology we use to deliver productivity and agility to service execution
[Figure: operating model capability map; labels listed in the order extracted]
Technology Stack
Management & Control
Service Providers
Service Delivery
Supplier Relationship Management
Service Operations
Service Assembly
Catalog Mgmt.
Dev. Ops
Service Provisioning
Issue Management
Technology Operations
Service Catalog Management
Service Measurement
Development Lifecycle
Provisioning
Transition Planning
Service Validation
Change Management
Program & Project Mgmt.
Release and Deployment
Incident Management
Request Fulfillment
Event Management
Capacity Management
Asset & Config Management
Metering
Monitoring & Control
Access Management
Security Support
Problem Management
Availability Management
Service Desk
Strategic Supplier Management
Supplier Contract Management
Operational Supplier Management
Business Impact Management
Service Invoice Review
Sales / Relationship Management
Service On-boarding
Service Analysis
Account Management
Business Customers
Service Architecture
Service Strategy
Strategy Generation
Investment Planning
Demand & Supply Management
Architecture and Design
Service Planning
Service Level Management
Service Definition
Service Lifecycle Management
Portfolio Mgmt.
Performance Management
Process Quality Management
Knowledge Management
Finance
Service Financial Management
Finance & Accounting Ops
Workforce Management
Talent Management
HR Operations & Support
Security & Risk Management
Information Security Mgmt.
Physical Asset Security
Risk Mgmt. & Controls
Service Continuity Mgmt.
IT / Tool Management
Application Management
Strategy Management
8. CloudOps – From Full Service to Self-Service
Full Service CloudOps Services Catalog
Integrate: Accept to operational control
Monitor: Measure operational metrics
Operate: Correct errors and issues
Optimize: Change and improve
DevOps: Application Support and Deployment
Operational: Tier 2 Infra and Database support
White Glove Services: Offerings that can be provided if a tenant requires them
DevOps Services: Application Operations, Middleware, Database, and Development
Common Services
Baseline Self-Service
Transitional Automation Fulfillment
Self-service AWS provisioning, self-healing service management, automatic security and OS updates
End-State Services Catalog
9. CloudOps Services Catalog
Integrate: Accept to Operational control
Monitor: Measure & Report
Operate: Correct errors and issues
Optimize: Change and improve
• Service onboarding
• Provisioning
• AMI inventory maintenance
• Architecture & Design
• Cloud environment development
• Stack blueprinting
• Security Strategy
• Testing & Validation
• Quality Assurance
• SLA/OLA Strategy
• Operational Playbooks/Runbooks
• Operational Automation
• Business Continuity Planning
• DevSecOps process
• Decommissioning
• Monitoring & Detection
– Server and Network
– Storage
– User Experience
– Application health
– Security
• SLA monitoring & reporting
• Utilization reporting
• Compliance reporting
• Cost tracking
• Notifications & Integration
• Trends analysis
• Event & Alarm Management
• Asset Management
• Issue/Incident management
• Error correction
• Auto-healing
• Escalations
• Service Requests
• Cloud consumption management
• OS patching
• Security Management
• Backup & Restore
• Problem management
• Issue pattern recognition
• Auditing
• Change requests
• Security Reviews
• Security whitelisting
• SLA/OLA review
• Knowledge Management
• Quality Management (continuous improvement)
• Resource Optimization
• Security process improvement
• Oversight and 3rd-party controls
• Communication Management
10. CloudOps – Expected Outcomes
Strategic Driver: System Performance
  Measurements: Uptime, Availability
  Tactics: Fault Tolerance, Graceful Degradation, Zero Downtime Deployments
Strategic Driver: Knowledge
  Measurements: Resolution Time
  Tactics: Early Detection, Self Healing, Service Monitoring
Strategic Driver: Agility
  Measurements: Release Time, Process Cycle Efficiency
  Tactics: Continuous Integration and Delivery, Experimentation in Production, Efficient Processes
Business Outcomes
• Improved Quality
• Reduced Opex Spend
• Improved Customer Satisfaction
• Improved Employee Experience
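Two of the measurements named on this slide, availability and resolution time, can be computed as in the sketch below. The formulas are the usual textbook definitions and are an assumption here, since the session does not state how it calculates them. The function names and sample figures are illustrative only.

```python
from datetime import datetime, timedelta


def availability(period, downtime):
    """Fraction of the period the service was up (both arguments are timedeltas)."""
    return 1 - downtime / period


def mean_resolution_time(incidents):
    """Average of (resolved - opened) over a list of (opened, resolved) datetime pairs."""
    if not incidents:
        raise ValueError("no incidents to average")
    durations = [resolved - opened for opened, resolved in incidents]
    return sum(durations, timedelta()) / len(durations)


# Illustrative figures: a 30-day period with 43 minutes 12 seconds of downtime.
month = timedelta(days=30)
downtime = timedelta(minutes=43, seconds=12)
print(f"availability: {availability(month, downtime):.4%}")

# Two made-up incidents, lasting 30 and 90 minutes.
incidents = [
    (datetime(2016, 11, 1, 10, 0), datetime(2016, 11, 1, 10, 30)),
    (datetime(2016, 11, 2, 12, 0), datetime(2016, 11, 2, 13, 30)),
]
print("mean resolution time:", mean_resolution_time(incidents))
```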
11. Towards Intelligent Automated CloudOps
[Figure: capabilities and principles of intelligent, automated CloudOps; labels listed in the order extracted]
Experimentation in Production
Lean Processes
System Performance and Hardening
Infrastructure as Code
Toil Elimination
Platform Automation
Service Monitoring and Insights
Service-Level Objectives
Zero-downtime deployments
Shared Incentives and Blameless Culture
Build and Configuration Management
Fault Tolerance
Release Engineering
Auto Scaling
Self Healing
Organization, Talent, and Culture
Continuous Integration and Delivery
Intelligent and Automated Operations
Software Defined Infrastructure
Platforms and Architecture
Embracing Risk
Legend: Capabilities, Principles
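The slide lists "Service-Level Objectives" and "Embracing Risk" without detail. One common way to connect the two, borrowed from site reliability engineering practice and not confirmed by the session, is an error budget: the amount of failure the objective permits over a period. The sketch below is a minimal illustration under that assumption. The function name, the request counts, and the release rule are hypothetical.

```python
def error_budget(slo_target, total_requests, failed_requests):
    """Compare observed failures with the failures an SLO target allows.

    slo_target is a success ratio such as 0.999.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1")
    allowed_failures = (1 - slo_target) * total_requests
    remaining = allowed_failures - failed_requests
    return {
        "allowed_failures": allowed_failures,
        "remaining": remaining,
        # Hypothetical policy: risky changes proceed only while budget remains.
        "can_release": remaining > 0,
    }


# Illustrative numbers: a 99.9% objective over one million requests.
healthy = error_budget(0.999, 1_000_000, 400)
exhausted = error_budget(0.999, 1_000_000, 1_500)
print(healthy["can_release"], exhausted["can_release"])
```

Framed this way, the objective quantifies how much risk a team may take. While budget remains, experimentation in production can proceed. Once the budget is spent, attention shifts to hardening.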