2. INTRODUCTION
BRAD ADAIR
▸ Director of Infrastructure Services at IQ Innovations, LLC.
▸ Have been working in IT for 12+ years in various areas ranging
from desktop support to system administration to management.
▸ AWS Certified Solutions Architect
▸ Have been working heavily in AWS for about 2.5 years.
▸ Email: brad@adair.tech
▸ Twitter: @bpadair
3. INTRODUCTION
APPLICATION MANAGEMENT IN AWS
▸ Public cloud in general, and AWS in particular are
changing the way that we think about infrastructure and
the way we manage the applications that run on that
infrastructure.
▸ Less permanence, more ephemeral and temporary.
▸ More purpose built and dedicated resources.
▸ Less “make it fit”
5. PERFORMANCE
WHAT DO WE MEAN?
▸ What do we mean when we talk about performance?
▸ Getting as much power as possible?
▸ Getting just enough?
▸ What about growth?
6. PERFORMANCE
GENERAL GUIDANCE
▸ Use Trusted Advisor to find (somewhat) obvious
performance issues.
▸ Things like over-utilized instances, excessive security
group rules, and cache-hit ratio can be found here.
▸ Plan for performance to scale, not grow.
▸ Monitor, monitor, monitor.
7. PERFORMANCE
DATABASES
▸ Need special consideration.
▸ RDS, Dynamo, EC2 instance.
▸ If using EC2, use provisioned IOPS, and RAID-0 volumes.
▸ Do not put databases on EFS instances.
▸ Replication - yes/no - where?
8. PERFORMANCE
CASE-STUDY: IQ INNOVATIONS
▸ Two data centers and a public cloud provider.
▸ All Centos running on ESXi.
▸ MySQL database.
▸ Apache, Tomcat, Grails stack on app servers.
▸ 1 clients configuration: 8 servers dedicated to MySQL, 14 app servers, 1 NFS server, 2 utility
servers.
▸ Performance was terrible.
▸ Average app response time: ~600ms
▸ Average end-user response time: ~4s
▸ Constantly running out of memory and restarting
▸ Nowhere to grow
9. PERFORMANCE
CASE STUDY: IQ INNOVATIONS
▸ Moved to AWS. Eliminated the collocation space and other cloud provider.
▸ Still running MySQL and Centos.
▸ Databases moved to RDS. Application servers moved to EC2.
▸ Same client configuration: 6 RDS instances for databases, 4 app servers, 1 utility server,
EFS to replace SAN.
▸ Performance improved dramatically:
▸ App response time: ~80-100ms
▸ End-user response time: ~1-2s
▸ No more memory issues.
▸ Cost savings of about 50%.
10. SECURITY
HAVEN’T WE BEEN DOING THIS FOREVER?
▸ Yes, and a lot of existing knowledge still applies.
▸ You still need smart policies.
▸ Your application still needs to protect against common attack vectors.
▸ Some things to change with a move to AWS, however.
▸ You are no longer responsible for physical security.
▸ You are no longer responsible for hypervisor security or patching.
▸ Depending on the service you may not even be responsible for OS
security and patching.
11. SECURITY
BEST PRACTICES
▸ Trusted advisor. This is a recurring theme.
▸ Bastion hosts
▸ VPC
▸ Peering
▸ Security groups
▸ NACL
▸ COMMON SENSE!
12. SECURITY
COMMON MISTAKES
▸ Console access for everyone.
▸ Overly permissive policies.
▸ Lack of two factor authentication.
▸ Overly/Publicly exposed access keys.
▸ Access key rotation.
13. RELIABILITY
EASIER AND HARDER SIMULTANEOUSLY
▸ A lot of the work for reliability is done for you.
▸ It is a mistake to put too much trust in this.
▸ The tools are there, but you have to choose to use them.
▸ Architecture matters.
14. RELIABILITY
CRITICAL THINGS TO UNDERSTAND
▸ Availability zones
▸ Regions
▸ Difference between AZs and Regions and how they should
be used together.
▸ Replication of different services.
▸ Availability SLAs.
▸ S3 storage classes/levels
15. RELIABILITY
CASE STUDY: CONFIDENTIAL COMPANY
▸ Pre-AWS:
▸ Only in one data center due to cost.
▸ Had clients nationwide, but all resources were
centralized.
▸ Had to have 4 or more hours of downtime for
deployments
▸ Many SPoF including storage and network. Redundancy
was attempted but not done well.
16. RELIABILITY
CASE STUDY: CONFIDENTIAL COMPANY
▸ AWS Setup:
▸ Multiple VPCs spread across multiple regions to provide redundancy
and be close to customers.
▸ VPC peering to reduce single points of failure.
▸ MAZ RDS instances for databases.
▸ EFS for network based storage.
▸ Replication of databases across regions.
▸ IaC templates for VPCs to allow for rapid reproduction in other regions.
17. SCALABILITY
WHAT IS SCALABILITY
▸ Scalability is about more than simply adding more
resources in response to increased demand.
▸ Scalability needs to include both scaling up and scaling
down.
▸ Goal is to maximize user experience while minimizing cost.
18. SCALABILITY
DIFFERENT APPROACH
▸ Provision with small spikes in mind, but not growth.
▸ Scale to growth.
▸ Schedule scale downs and scale ups.
▸ Auto-scaling is your friend.
▸ Monitor, monitor, monitor. Don’t alert, alert, alert.
19. SCALABILITY
COMMON MISTAKES
▸ Over-provisioning.
▸ Reserving too quickly.
▸ Planning for vertical scaling as opposed to horizontal.
▸ Provisioning for growth instead of planning for it.
▸ Manual intervention.
▸ Under analysis of utilization.