Jenn Boden, Director of Amazon Corporate IT, discusses how Amazon is planning to move internal mission-critical corporate apps to AWS and shares some case studies at the AWS Enterprise Tour - SF - 2010
Amazon.com Migrating Internal IT Apps to AWS - AWS Enterprise Tour - SF - 2010
1. A PRACTICAL APPROACH TO MIGRATING INTERNAL IT APPS TO THE AWS CLOUD Jennifer Boden, Director, Amazon IT
7. OUR LEGACY DEPLOYMENT MODEL (diagram: Internal Employees · Internal Applications Running on Internal Servers · Internal Network)
8. OUR AMAZON VPC DEPLOYMENT MODEL We extend our internal network into the cloud, securely hosting internal applications on EC2 within a VPC. (diagram: Employees · Amazon Internal Network · Amazon VPC)
18. CASE STUDY 2: BMC REMEDY MID-TIER Amazon EC2 instances hosting part of the Remedy mid-tier server fleet, spread across three data centers. (diagram: Employees · Load Balancer · Amazon Internal Network · Amazon VPC · DC1 / DC2 / DC3)
Speaker Notes
My customers are internal Amazon employees. Our internal IT apps are probably a lot like yours…
Financial Systems: Accounting, Shared Services, Financial Planning & Analysis, Tax
HR Systems: Recruiting, on-boarding, training/development, payroll
Developer Tools: Service lifecycle mgmt, shared libraries, source control, build & deploy, change mgmt, issue tracking
Knowledge Management: Intranet, search, communities, blogs, wiki, collaboration
Employee Tools: Laptops, phones, email, calendar, remote access
In total, we have about 200+ applications. All of these systems process and store data that we classify as ‘private’ (and I’ll speak to our data classification policy in just a bit).
What’s motivating us to move to AWS? Clearly – it’s about reducing our Total Cost of Ownership. Dealing with computing hardware infrastructure isn’t the core competency of our IT shop… hardware vendor relationships, negotiations, purchasing, shipping, receiving, racking, cabling, powering, cooling, securing, etc. Handing the “muck” of hardware provisioning over to a trusted provider sounds good. We would prefer to have the ‘easy’ button – provision hardware with the push of a button. That all sounds good – and you’ve been hearing that all day. But we found ourselves asking… Is this really just about cost reduction? What else do we get? And what’s really motivating us to move?
We want to unleash innovation We all know that hiring great engineers is arguably the most important thing you can do. So once you’ve done that…what’s the best thing you can do for them? Empower them to build… Free them to innovate.. We have quickly learned that infrastructure on-demand is a powerful catalyst. When you remove barriers…and make it easy to build… Engineers are more motivated…they’re more inspired With infrastructure on-demand and the freedom to try…they “just do it” Not only do they just do it….they talk about it…and show it to others…then others get excited If it doesn’t work…tear it down – total cost could be less than a pizza
What else do we get? We want to reduce hardware administration overhead. Enterprise IT is a lot more than software innovation… we all know this. Operations plays a big part. Dial-tone-like availability is expected… operational efficiency is paramount. Heck, it’s complex. At Amazon, we run on leased hardware. We’re swapping out hardware continuously as leases turn over. Imagine what we could have produced had we not been spending so much time on hardware management. Another motivating factor for our move that helps run our operations is AWS Auto Scaling. Auto Scaling allows you to automatically scale your Amazon EC2 capacity up or down according to conditions you define. As your defined thresholds are breached, EC2 instances are launched or terminated as needed. You can seamlessly scale up during demand spikes to maintain performance, or scale down automatically during demand lulls to minimize costs. Many of the engineers in our group thought “that’s great, but many of our apps are ‘steady state’… they don’t see significant spikes or troughs.” Regardless, they still need to be available… so we use Auto Scaling to enable automated response to host failure.
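The "fixed-size group as failure recovery" idea above can be sketched as a small simulation — this is illustrative Python, not AWS code: with min = max = desired capacity, the scaler's only job is replacing hosts that drop out. The function and instance names are hypothetical.

```python
# Illustrative sketch: a steady-state Auto Scaling group where
# min = max = desired. Scaling never grows the fleet; it only
# replaces failed hosts until the healthy count matches `desired`.
def reconcile(desired: int, healthy_hosts: list, launch) -> list:
    """Return a fleet topped back up to `desired` healthy hosts."""
    fleet = list(healthy_hosts)
    while len(fleet) < desired:
        fleet.append(launch())  # stand-in for launching an EC2 instance
    return fleet

# Usage: a 3-host fleet where one host has failed its health check.
_count = 0
def launch_stub():
    global _count
    _count += 1
    return f"i-replacement-{_count}"

fleet = reconcile(3, ["i-aaa", "i-bbb"], launch_stub)
```

Running the usage example yields a three-host fleet with one freshly launched replacement alongside the two survivors.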
Visibility into hardware utilization rates: At Amazon, we look very closely at our hardware utilization rates – hardware used divided by hardware held. We work very hard to understand our hardware utilization patterns. The easiest way to increase utilization rates is to release unused hardware. One of the steps we took to reduce our TCO was moving to Xen virtualization – and it is helping us prepare for our migration to AWS. We’re also leveraging AWS Auto Scaling – to hold just the capacity we need. Just like the bill from your utility company, we’re getting reliable, auditable, metered usage data from the AWS platform sent directly to service owners. Giving direct visibility encourages action and a sense of ownership of the data – and ultimately drives improved software efficiency.
Our starting point – which is probably just like everyone else’s. We are a classic IT shop with a secure internal network and fixed capacity in owned datacenters: employees running apps, and engineering teams deploying and supporting apps within our own firewall – a mix of dedicated hardware and virtual machines (on Xen).
The direction we’ve taken is to extend our internal network to the cloud – utilizing a VPC, or Virtual Private Cloud, to maintain our security and privacy standards.
When we started the program last year, we came up with some core tenets or guiding principles for the program. Amazon is an enterprise customer of AWS: we will drive requirements into AWS accordingly. We are a software vendor: wherever possible, we build our tools to benefit all AWS enterprise customers, not Amazon-specific solutions. We will not take any steps backwards on our key metrics: we will meet or exceed our existing availability and latency SLAs when moving to AWS. Customer trust is maintained by our strict adherence to enterprise security requirements: we will drive requirements to ensure compliance with enterprise security and governance standards. Amazon values frugality: we will reduce the cost of managing our capacity.
There are many customer-facing applications across Amazon that have run for a long time on AWS, and more every month. The following are some of the best practices we’re following in my group – and that we would recommend to others – as we go through the ongoing process of moving substantially all of our apps to AWS. Phase 1: Pre-Migration Readiness. Before we started moving over applications, the first thing we did was set up a program infrastructure and hire a Technical Program Manager to run the program and manage dependencies (IT Security, Networking, etc.). We also needed a single voice for priorities and requirements into the AWS organization. We also had a strong champion in our VP who continually reinforced the program strategy to all employees. Then we did a rough system assessment – as stated earlier, we have about 200 applications for consideration, so we did rough cuts to get a sense of what was ahead of us. For our 3rd-party vendors, we immediately started looking at licensing and AWS certifications. Phase 2: Experiment and Get Our Hands Dirty. Start learning and educating yourself. Get an account. Go. We quickly started using S3 for data backups. Then we identified 2 pilot apps to deploy in EC2 (via a VPC) to understand operational procedures, etc. Phase 3: Phased Migration. We’ve set aggressive internal goals around migration for both internal applications and 3rd-party applications to give us momentum on a multi-year phased application approach. In addition, we always look at AWS first for new development. Above all, we’ve looked for places where we can leverage work across the enterprise – making it easy for all Amazonians to deploy to AWS.
Phase 1: Pre-Migration Readiness
Data Classification: Top Secret, Secret, Private, Public
Application Criticality (Availability, SLAs): Mission Critical, Business Critical, Business Operational, Administrative
Dependencies
Compliance Requirements
HW Component Usage (Disk, I/O, Memory)
Current TCO
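A rough cut over these assessment dimensions could be sketched as follows. This is a hypothetical illustration — the wave numbers and bucketing rules are my assumptions, not Amazon's actual criteria — but it shows how data classification and criticality might sort ~200 apps into migration waves.

```python
# Hypothetical rough-cut assessment: bucket apps into migration waves
# by data classification and criticality (illustrative rules only).
from dataclasses import dataclass

@dataclass
class App:
    name: str
    data_class: str   # "Top Secret" | "Secret" | "Private" | "Public"
    criticality: str  # "Mission Critical" | "Business Critical" | "Business Operational" | "Administrative"

def migration_wave(app: App) -> int:
    """Lower wave = earlier candidate (rough cut, not a final plan)."""
    if app.data_class in ("Top Secret", "Secret"):
        return 3  # defer until security controls are fully proven
    if app.criticality in ("Mission Critical", "Business Critical"):
        return 2  # needs SLA validation before moving
    return 1      # low-risk pilot material

apps = [
    App("mailing-list-gen", "Private", "Administrative"),
    App("payroll", "Private", "Mission Critical"),
    App("key-escrow", "Secret", "Business Operational"),
]
waves = {a.name: migration_wave(a) for a in apps}
```

Under these toy rules, the low-risk mailing-list generator lands in wave 1 while anything holding Secret data waits for wave 3.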
We are collaborating with 3rd-party software vendors and AWS business development to:
Adapt license models to the paradigm of elastic capacity
Expand AWS support of 3rd-party vendors’ system requirements (Microsoft support of the Windows OS)
Test AWS-hosted systems against vendors’ system performance benchmarks
Work with your 3rd-party vendors.
Phase 2: Experiment and Get Our Hands Dirty. The first thing we did was get an account and get going. We started using S3 for backups. Then we identified 2 pilot apps to deploy in EC2 (via a VPC) to validate latency, understand operational procedures, etc. The pilot apps were considered low risk – they were simple services classified with private data. One was an HR system that generates mailing lists off of reporting hierarchies, and the other was a metadata service used by our software build systems. Once we had the two pilot apps running, we decided to move more – which brings us to phase 3.
Phase 3: Phased Migration. We continue to refine our application assessments – looking at criticality and compliance requirements. We’ve set aggressive internal goals around migration for both internal applications and 3rd-party applications to give us momentum, based on a multi-year phased application approach. What we found was that as we learned more, we wanted to share more and make it even easier for teams across Amazon to deploy to AWS.
As an example of leveraging synergies across the organization, the first thing we did was make encryption easy. Amazon takes care of the physical security of the data center – it’s our job to encrypt our data. We built a client library that is used to store data according to our own Amazon Security Data Handling Policy. The library is designed to be easy to use, with minimal effort required from developers to get all of the functionality that the Security team requires for data handling while maintaining the scalability that many services need in their day-to-day operations. S3 gives us a really simple file storage solution – such as automated data backups. We have a simple website to enable users to put/get files and create hyperlinks to them.
Internal web application to host internal video for Amazonians – our internal YouTube. Videos include tech talks, presentations, training, and company events. The old solution required manual intervention by the audio/video team to encode and post QuickTime videos. We had 2 software engineers in our KM organization who wanted to fix the problem. They went off and, in 3 weeks’ time, had completely refactored how employees post and download internal videos: self-service publication, automatic encoding, and automatic publication.
Our web front end was launched on existing hardware. Videos stream within a Flash-based embedded player. Encoding technology used in broadcast: FFmpeg (http://www.ffmpeg.org/). Automatic encoding pipeline to re-render legacy and new video hosted within Amazon EC2. Over 900 hours of video re-encoded. Ordinarily, 900 hours of video × 3 hours of encoding each ≈ 2,700 machine-hours ≈ 112 days. With Amazon EC2, we were able to parallelize encoding and finish within one week. Storing and serving “unlimited” video using Amazon S3. Massive productivity increase: 2 software engineers, 3 weeks, 1 application. Engineers empowered to build the solution on their own, no requisition process involved.
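The back-of-the-envelope math above works out as follows, assuming (as the notes imply) 3 machine-hours of FFmpeg encode time per hour of video and a one-week wall-clock target:

```python
# Sequential vs. parallel encoding estimate for the video backlog.
import math

hours_of_video = 900          # legacy video to re-encode
encode_hours_per_hour = 3     # stated encode cost per hour of video

total_machine_hours = hours_of_video * encode_hours_per_hour  # 2700
sequential_days = total_machine_hours / 24                    # ~112.5 days
week_hours = 7 * 24                                           # 168
instances_needed = math.ceil(total_machine_hours / week_hours)  # 17
```

So a single machine would grind for roughly 112 days, while on the order of 17 EC2 instances running in parallel finish the same backlog inside a week.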
Five applications running on Remedy v7.1 Mid-Tier. The mid-tier is Remedy’s out-of-the-box “web tier”. The license model was adapted – not bound to hardware. These 5 web applications all run on the mid-tier and interact with the BMC Remedy AR System application tier.
We were comfortable behind our own firewall. We needed everyone to be comfortable in the cloud. Get security and auditors involved early on. Everyone needs to understand our access control policies and how we handle data security. We changed the question from “is it secure to run in the cloud?” to “what do we need to feel secure in the cloud?” Make it easy. You need to invest time. I already talked about how we made encryption easy. We also looked at integration of our own software deployment system. When we started, Amazon’s software deployment system and infrastructure automation tools weren’t fully integrated with EC2. We want to enable Amazon service owners to easily and securely migrate their applications to EC2 and to take full advantage of AWS’s cloud management offerings, including the new Auto Scaling capability. Service owners wishing to move to the cloud will be able to simply click a “Move to EC2” button, and the new user interface walks them through a few simple steps to set up their EC2 configuration, create an Auto Scaling group, and execute their first cloud deployment.