9. Example Data Center: Where Do We Put All of This on AWS?
[Diagram: a traditional data center with a master/slave database pair, backups on tapes, web servers, app servers, a SAN, a NAS file server with file system disks, and an LDAP server]
10. Example Data Center: Where Do We Put All of This on AWS?
[Diagram: the same architecture on AWS: web servers and app servers behind Elastic Load Balancing, Amazon Elastic File System, Amazon Elastic Block Store, Amazon RDS (Master) and Amazon RDS (Standby), backups to Amazon S3 or Glacier, and AWS Directory Service]
36. Choice of storage classes on Amazon S3

                 S3 Standard     S3 Standard – Infrequent Access   Amazon Glacier
  Use case       Active data     Infrequently accessed data        Archive data
  Access time    Milliseconds    Milliseconds                      Minutes to hours
  Price          $0.021/GB/mo    $0.0125/GB/mo                     $0.004/GB/mo
37. S3: Sharing web files
[Diagram: a VPC (172.31.0.0/16) spanning Availability Zones sa-east-1a, sa-east-1b, and sa-east-1c]
[Migration process]
In any migration project to AWS, there are four phases that all customers go through in order to be successful and derive optimal value from their migration to the cloud.
First, you must do migration preparation, including an initial portfolio analysis to build a directional business case for migration. We also suggest starting with a Migration Readiness Assessment to understand your readiness to migrate to the cloud at scale.
Next, a detailed portfolio discovery of all in-scope IT assets needs to be performed to obtain the data necessary to create a migration plan for assets to be migrated. This plan determines the migration strategy and waves of the applications targeted for AWS. The detailed discovery data is also used to further refine the business case.
After migration planning is completed, the actual execution of the migration takes place in which applications are designed, migrated, and validated.
And finally, once the applications are in AWS, you have to operate them in the cloud, including going through an optimization phase to maximize the benefits derived from hosting applications in AWS. Optimizing activities can address cost, performance, security, or resiliency concerns for a specific application stack.
The discussion today focuses on the first three phases of this process - let’s look at some of the common challenges we see experienced by migration customers…
[AWS Application Discovery Service]
Application Discovery Service is a discovery tool that captures details of your on-premises infrastructure, including server details such as OS name, IP addresses, and MAC addresses, and time-series utilization data such as CPU utilization, memory utilization, and IOPS throughput. It also captures network connections and process information. You can analyze the captured information to estimate the cost of migrating to AWS and to plan the migration.
Application Discovery Service supports three ways to collect data:
Agent-based discovery can be performed on any virtual machine or physical server that is part of your on-premises environment (Windows or Linux). The agent is deployed on each virtual machine or physical server separately and runs only in user space. The agent supports all major Linux distributions, such as Red Hat, CentOS, and Ubuntu, and Windows Server versions such as 2003, 2008, and 2012.
The Agentless Discovery Connector is designed for VMware environments: you deploy an OVA file on a VMware host, which then collects data from each VM. Agentless discovery is OS agnostic and can read data for any VM regardless of its OS. It does, however, collect less information than agent-based collection.
Import from existing data sources – You can also import data that you already have (such as data from your CMDB or ServiceNow) or that you have already collected using a partner tool (such as RISC Networks).
Data from all of these sources is visible directly in the AWS Migration Hub console, as well as via an API or an export file.
Right-sizing your compute resources is one dimension of understanding your total cost of ownership (TCO). Use the EC2 instance recommendation feature of Migration Hub when you want to understand your projected EC2 costs. We also offer a more detailed assessment, including optimizations for Microsoft licensing and storage costs, using TSO Logic.
Data collected with agent-based discovery can be exported directly to Amazon Athena, where it can be queried directly or visualized using Amazon QuickSight. Amazon QuickSight gives you the flexibility to analyze the data, view the network dependencies between servers, and identify the processes running on a given server (we will talk more about this later).
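A minimal boto3 sketch of driving that export programmatically through the Application Discovery Service API; the region and polling interval are illustrative assumptions.

```python
import time
import boto3

# Application Discovery Service client (region is an assumption; ADS has
# historically been served out of us-west-2)
discovery = boto3.client("discovery", region_name="us-west-2")

# Kick off an export of the collected data as CSV
export_id = discovery.start_export_task(exportDataFormat=["CSV"])["exportId"]

# Poll until the export finishes, then print the download URL of the result
while True:
    info = discovery.describe_export_tasks(exportIds=[export_id])["exportsInfo"][0]
    if info["exportStatus"] != "IN_PROGRESS":
        break
    time.sleep(30)

print(info["exportStatus"], info.get("configurationsDownloadUrl"))
```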
From the vast amount of portfolio data collected, we extract key pieces of information to form a migration plan.
Connections are a critical piece: they help determine application and server dependencies and help identify access patterns.
Performance metrics are another factor, helping you right-size resources in the cloud.
Another aspect is using service naming conventions, tags, and other host metadata to help identify patterns and group servers and applications for migration.
The portfolio discovery process should aim for completeness and usability. If additional data is required, determine the tools and actions needed to complete discovery.
[AWS Migration Hub]
AWS Migration Hub is designed to be a single location to track the progress of application migrations across AWS and partner solutions. With AWS Migration Hub, you can discover your on-premises portfolio using AWS Application Discovery Service and/or import discovery data from existing sources (such as your CMDB) or from other discovery tools. You can view server information and group servers into applications to help plan your migration. You can track the progress of your migration using the migration tools that best fit your needs (AWS and AWS Partner).
AWS Migration Hub can manage migrations in any AWS Region that has the necessary migration tools available. AWS Migration Hub itself runs in the US West (Oregon) Region.
Migration Hub is available to all AWS customers at no additional charge. You only pay for the cost of the migration tools you use, and any resources being consumed on AWS.
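For illustration, a minimal boto3 sketch of reading migration state back out of Migration Hub; which fields you print is an assumption about what you want to track.

```python
import boto3

# AWS Migration Hub itself runs in the US West (Oregon) Region
mgh = boto3.client("mgh", region_name="us-west-2")

# Progress-update streams are registered by the migration tools you use
for stream in mgh.list_progress_update_streams()["ProgressUpdateStreamSummaryList"]:
    print("stream:", stream["ProgressUpdateStreamName"])

# Each migration task reports its status and percent complete
for task in mgh.list_migration_tasks()["MigrationTaskSummaryList"]:
    print(task["MigrationTaskName"], task.get("Status"), task.get("ProgressPercent"))
```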
A traditional on-premises or data center–based infrastructure might include a setup like this. Here we'll walk you through just one example of how an arrangement like this could be set up and run on AWS instead.
What happens when you turn this data center infrastructure into an AWS infrastructure?
Servers, such as these web servers and app servers, are replaced with Amazon EC2 instances that run all of the same software. Because Amazon EC2 instances can run a variety of Windows Server, Red Hat, SUSE, Ubuntu, or our own Amazon Linux operating systems, virtually all server applications can be run on Amazon EC2 instances.
The LDAP server is replaced with AWS Directory Service, which supports LDAP authentication and allows you to easily set up and run Microsoft Active Directory in the cloud or connect your AWS resources with existing on-premises Microsoft Active Directory.
Software-based load balancers are replaced with Elastic Load Balancing load balancers. Elastic Load Balancing is a fully managed load balancing solution that scales automatically as needed and can perform health checks on attached resources, thus redistributing load away from unhealthy resources as necessary.
SAN solutions can be replaced with Amazon Elastic Block Store (EBS) volumes. These volumes can be attached to the application servers to store data long-term. (Note that a standard EBS volume attaches to one instance at a time; for file storage shared across instances, see Amazon EFS below.)
Amazon Elastic File System (EFS), currently available via preview, could be used to replace your NAS file server. Amazon EFS is a file storage service for Amazon EC2 instances with a simple interface that allows you to create and configure file systems. It also grows and shrinks your storage automatically as you add and remove files, so you are always using exactly the amount of storage you need. Another option is to run a NAS solution on an Amazon EC2 instance; many NAS solutions are available via the AWS Marketplace at https://aws.amazon.com/marketplace/.
Databases can be replaced with Amazon Relational Database Service (RDS), which lets you run Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server on a managed AWS-based platform. Amazon RDS offers master, read replica, and standby instances.
Finally, Amazon RDS instances can be automatically backed up to Amazon S3, thus replacing the need for on-premises database backup hardware.
Notes: In other words, a c5.4xlarge reservation provides 720 hours per month at that size, or 1,440 hours of 2xlarge, or 360 hours of 8xlarge.
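A quick worked version of that arithmetic, using the standard Reserved Instance size-normalization factors (xlarge = 8 units, so 2xlarge = 16, 4xlarge = 32, 8xlarge = 64):

```python
# RI size-normalization factors relative to an xlarge (= 8 units)
FACTORS = {"2xlarge": 16, "4xlarge": 32, "8xlarge": 64}

HOURS_PER_MONTH = 720  # 30 days x 24 hours

# A c5.4xlarge reservation buys this many normalized unit-hours per month
unit_hours = HOURS_PER_MONTH * FACTORS["4xlarge"]  # 23,040 unit-hours

print(unit_hours / FACTORS["4xlarge"])  # 720.0 hours of c5.4xlarge
print(unit_hours / FACTORS["2xlarge"])  # 1440.0 hours of c5.2xlarge
print(unit_hours / FACTORS["8xlarge"])  # 360.0 hours of c5.8xlarge
```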
Each storage option has a unique combination of performance, durability, cost, and interface
What is EBS? Create a volume, attach it to an EC2 instance. That's it.
EBS volumes are bound to an AZ. Restriction number 1.
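A minimal boto3 sketch of that restriction: the volume is created in one AZ and can only be attached to an instance in that same AZ (region, AZ, and instance ID are placeholders).

```python
import boto3

ec2 = boto3.client("ec2", region_name="sa-east-1")

# An EBS volume is created in, and bound to, a single Availability Zone
vol = ec2.create_volume(
    AvailabilityZone="sa-east-1a",  # must match the target instance's AZ
    Size=100,                       # GiB
    VolumeType="gp2",
)
ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])

# Attaching only works for an instance in the same AZ
ec2.attach_volume(
    VolumeId=vol["VolumeId"],
    InstanceId="i-0123456789abcdef0",  # placeholder
    Device="/dev/xvdf",
)
```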
EFS…!
EFS is a shared file storage service based on the NFS protocol.
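A minimal boto3 sketch of standing up a shared EFS file system; the subnet IDs are placeholders, and one mount target per AZ lets instances in each zone mount the same share over NFS.

```python
import boto3

efs = boto3.client("efs", region_name="sa-east-1")

# Create the shared file system (CreationToken makes the call idempotent)
fs = efs.create_file_system(CreationToken="shared-app-storage")

# One mount target per Availability Zone (subnet IDs are placeholders)
for subnet_id in ["subnet-aaaa1111", "subnet-bbbb2222", "subnet-cccc3333"]:
    efs.create_mount_target(FileSystemId=fs["FileSystemId"], SubnetId=subnet_id)

# Instances then mount it over NFSv4, e.g.:
#   mount -t nfs4 <FileSystemId>.efs.sa-east-1.amazonaws.com:/ /mnt/efs
```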
Because the combination of a bucket, key, and version ID uniquely identifies each object, Amazon S3 can be thought of as a basic data map between "bucket + key + version" and the object itself. Every object in Amazon S3 can be uniquely addressed through the combination of the web service endpoint, bucket name, key, and optionally, a version.
For example, in the URL http://doc.s3.amazonaws.com/2006-03-01/AmazonS3.html, "doc" is the name of the bucket and "2006-03-01/AmazonS3.html" is the key.
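For illustration, the same bucket + key addressing through the boto3 API; the bucket and key mirror the example above but should be treated as placeholders.

```python
import boto3

s3 = boto3.client("s3")

# The bucket + key pair is the object's address
s3.put_object(Bucket="doc", Key="2006-03-01/AmazonS3.html",
              Body=b"<html>...</html>")

# Retrieve the object with the same bucket + key
obj = s3.get_object(Bucket="doc", Key="2006-03-01/AmazonS3.html")
print(obj["Body"].read())

# With versioning enabled, a VersionId pins an exact version of the object:
# s3.get_object(Bucket="doc", Key="2006-03-01/AmazonS3.html", VersionId="...")
```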
S3 is great: unlimited storage at a very low cost. But it is mainly accessible via HTTP/HTTPS.
Across the board, S3, S3-IA, and Glacier all offer the same 11 9s of durability: AWS stores data redundantly across multiple facilities and storage devices, and the services automatically perform data integrity checks in the background to guard against potential data corruption.
I work with many customers who archive data by storing two copies on tape, either in the same building or with one copy on-site and one remote. When we discuss durability, which is a big deal for many archive customers, many are accustomed to thinking in numbers of “copies” and find the 11 9s a bit non-intuitive.
To bridge that, we did a thought experiment with a large studio where, at a high level, we walked them through how we derived the 11 9s using a Markov chain model of failures across storage devices, servers, the network, Availability Zones, and so on. We asked them to estimate their two-copy tape durability using the same approach, and they estimated roughly 4 9s for two copies in a single building, or roughly 5 9s for two copies in separate locations. This helped them see that Glacier's 11 9s of durability can be thought of as 6 to 7 orders of magnitude more durable than two copies of tape, and it bridged the conversation.
When you view our object storage as a portfolio of storage classes, we provide three storage options with different performance characteristics and price points.
S3 Standard is our high-performance object storage for very active, hot workloads: data is available in milliseconds, and pricing starts at 2.1 cents/GB/month, depending on the region.
S3 Standard – Infrequent Access shares the same millisecond access times as S3 Standard, but is designed for data you plan to access maybe a few times a year, what we think of as “active archive.” S3-IA costs $0.0125/GB/mo, plus a nominal fee for requests.
Glacier is the cold archival tier: access latency ranges from minutes to hours, depending on the retrieval option you choose, and storage costs $0.004/GB/month.
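A minimal boto3 sketch of choosing among these classes, either per object at write time or with a lifecycle rule that transitions aging data to Glacier; the bucket name and prefixes are placeholders.

```python
import boto3

s3 = boto3.client("s3")

# Pick the storage class per object at write time
s3.put_object(Bucket="my-archive-bucket", Key="hot/data.bin",
              Body=b"...", StorageClass="STANDARD")
s3.put_object(Bucket="my-archive-bucket", Key="warm/data.bin",
              Body=b"...", StorageClass="STANDARD_IA")

# Or let a lifecycle rule move aging objects to Glacier automatically
s3.put_bucket_lifecycle_configuration(
    Bucket="my-archive-bucket",
    LifecycleConfiguration={"Rules": [{
        "ID": "archive-after-90-days",
        "Status": "Enabled",
        "Filter": {"Prefix": "warm/"},
        "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
    }]},
)
```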
EBS disks can be attached to RDS DB servers or EC2 instances. Keep in mind that EC2 Auto Scaling instances will be terminated and their disks lost; any information that must be saved is better placed in a database table or an S3 bucket.
EBS disks attached to a database server are permanent. But even so, a database backup is always needed for production systems.
For those legacy systems that are not Auto Scaling friendly and must write information to local disks without losing it, EFS is the solution. Any generated information can be saved to a database table, an S3 bucket, or an EFS share.
Many of you have applications…
Databases, whatever the engine, want big, fast disks; EBS is the right fit.
Observation: EFS is not a database-friendly solution.
Based on our experience running the databases behind Amazon.com, we introduced Amazon RDS to help customers run their relational databases.
Lower TCO because we manage “the muck”
Get more leverage from your teams
Focus on the things that differentiate you
Built-in high availability and cross region replication across multiple data centers
Available on all engines, including base/standard editions, not just for enterprise editions
Now even a small startup can leverage multiple data centers to design highly available apps with over 99.95% availability.
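As a sketch, that multi-data-center availability is a single flag on the RDS API; the identifier, instance class, and credentials below are placeholders.

```python
import boto3

rds = boto3.client("rds")

# MultiAZ=True provisions a synchronous standby in a second Availability Zone
# with automatic failover (identifiers and credentials are placeholders)
rds.create_db_instance(
    DBInstanceIdentifier="app-db",
    DBInstanceClass="db.m3.medium",
    Engine="mysql",
    AllocatedStorage=100,
    MasterUsername="admin",
    MasterUserPassword="change-me",
    MultiAZ=True,
)
```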
There are lots of different choices for your database engine on RDS. Each of these engines operates differently, offers different functionality, and has different licensing requirements.
Everyone has their favorite engine and they use them for specific purposes.
On the commercial side we have Oracle and Microsoft SQL Server
On the open source side we have MySQL, PostgreSQL, and MariaDB
And in its own category we have Amazon Aurora, which is a MySQL-compatible relational database built to take advantage of many of the properties of modern cloud computing.
Relational databases have existed … but standard relational technology is not the best fit for cloud platforms; hence Aurora. We also kept open source and compatibility in mind, making it easy for customers to migrate; hence MySQL and now Postgres compatibility.
--------
MariaDB - https://en.wikipedia.org/wiki/MariaDB
MariaDB – A fork of the MySQL database, led by the original developers of MySQL after concerns, following the acquisition by Oracle, that the project might become closed. It works to maintain high compatibility with MySQL and also has features to support non-blocking operations and progress reporting.
Talk Track:
Together with the new edition of Amazon Aurora, we are also announcing a new database monitoring feature called Performance Insights. Performance Insights is designed to help customers quickly assess whether there are any performance bottlenecks in their relational database workloads and where to take action. It collects detailed database performance data through lightweight mechanisms and uses the data to drive an intuitive graphical interface that provides a simple and complete view of recent database performance.
Performance Insights is the answer:
Lets you see your overall instance load
Drill down by SQL statement, by time, or by calling host.
For questions:
Roadmap: will be rolling out incrementally to all RDS engines over 2018
First release will be on Postgres compatible edition of Aurora followed by the MySQL compatible just after re:Invent
Q1 will introduce support for MySQL, MariaDB, and PostgreSQL
Q2 will introduce support for SQL Server and Oracle
See the preview in the demo grounds
Feature is free.
Will be supported on all instances except micros
The computation and memory capacity of a DB instance is determined by its DB instance class. You can change the CPU and memory available to a DB instance by changing its DB instance class; to change the DB instance class, you must modify the DB instance.
Here are the DB instance classes available through Amazon RDS:
Micro instances (db.t1.micro): An instance sufficient for testing but should not be used for production applications.
Standard - Current Generation (m3): Second generation instances that provide more computing capacity than the first generation db.m1 instance classes at a lower price.
Memory Optimized - Current Generation (db.r3): Second generation instances that provide memory optimization and more computing capacity than the first generation db.m2 instance classes at a lower price.
Burst Capable - Current Generation (db.t2): Instances that provide baseline performance level with the ability to burst to full CPU usage.
You can change from one database instance type to another. There will be a brief availability event during the changeover.
You can increase the amount of storage available to your database instance on demand for the MySQL, Oracle, and PostgreSQL database engines. This change is performed online, without an availability impact. Amazon Aurora automatically grows the database size on demand.
Compute scaling in Aurora works by creating a new read replica and failing over to it, which can be done in 10-15 seconds, versus a host-replacement type operation, which takes a few minutes.
Aurora storage is allocated in 10 GB increments and scales automatically.
Let’s take some time to dive deeper into a particular use case around scaling the database master instance.
Here is a quick example of what it would take to scale your RDS database instance up or down.
Here we have screenshots of what you do when you want to change the instance size.
Choose the Modify option from the Instance Actions menu of the RDS console.
Then choose what you want your new DB Instance Class to be.
Finally, determine if you want to apply the change immediately or not. If you do not apply the change immediately then the change will be scheduled to occur during the preferred maintenance window that you defined when creating the database.
Keep in mind that when you apply the change immediately, you could incur some downtime while the instance size is changed, so be aware of how much downtime the applications accessing the database can tolerate.
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Overview.DBInstance.Modifying.html
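The API equivalent of that console walkthrough is a single modify_db_instance call; a minimal boto3 sketch with a placeholder identifier and target class.

```python
import boto3

rds = boto3.client("rds")

# Same effect as Instance Actions -> Modify in the RDS console
rds.modify_db_instance(
    DBInstanceIdentifier="app-db",    # placeholder
    DBInstanceClass="db.r3.xlarge",   # the new DB instance class
    # False defers the change to your preferred maintenance window;
    # True applies it now, at the cost of a brief availability event
    ApplyImmediately=False,
)
```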
6 COPIES – 6 copies but actually LESS WRITES
Backup – no WINDOWS or IMPACT
Monitoring
Quorum – 4/6 = COMMITTED transaction
MEMBERSHIP changes (upgrades/repairs)
Storage GROWS automatically
----
(1 minute) The Amazon Aurora storage engine duplicates data 6 times across 3 Availability Zones. By the way, that sounds like a lot of writes, but remember that no data blocks are ever written (only log records), so it ends up being far LESS data written. I'll get into that a little bit more later.
Data is continually backed up to S3. No backup windows. No performance impact during backup.
The system has a separate monitoring and management layer you'll never see as a customer. To you, the database just keeps running.
Aurora has a quorum algorithm for writes – as soon as 4 of the 6 writes are durable, the system successfully returns a committed transaction to the database. This means that lagging disks/nodes/even AZ’s can be ridden through by the system with little or no impact on performance. Another key benefit of a quorum-based system is that nodes can leave and join the system as they are upgraded or repaired and there is absolutely no performance or availability impact.
Also, storage grows automatically in 10 GB increments as DB size grows – up to 64 TB.
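To make the 4-of-6 quorum concrete, here is a toy Python model (not Aurora's actual implementation) of how a write can commit even when up to two copies lag or fail:

```python
import random

COPIES = 6        # two copies in each of three AZs
WRITE_QUORUM = 4  # a write commits once 4 of the 6 copies acknowledge

def quorum_write(ack_probability: float = 0.95) -> bool:
    """Toy model: each copy independently acknowledges with some probability;
    the transaction commits when the write quorum is reached."""
    acks = sum(1 for _ in range(COPIES) if random.random() < ack_probability)
    return acks >= WRITE_QUORUM

trials = 100_000
committed = sum(quorum_write() for _ in range(trials))
print(f"{committed / trials:.3%} of writes committed despite laggards")
```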
Storage management is a tedious task with traditional databases….
Taking and managing database backups is another challenge with traditional databases…
But if you are doing an application release and you want to click a button ….
Walk through from data inputs on the left (properties, agents, and buyer information and activities) to how the data is streamed and processed, and finally the insights.
We designed AWS Database Migration Service (DMS) to be simple - you can get started in less than ten minutes.
We designed it to enable near-zero-downtime migration. And we designed it to be a kind of replication Swiss army knife, able to replicate data between on-premises systems, RDS, and EC2, and across database engine types.
We have completed over 1,000 unique migrations to Redshift using DMS.
AWS Database Migration Service (DMS) easily and securely migrates and/or replicates your databases and data warehouses to AWS.
AWS Schema Conversion Tool (SCT) converts your commercial database and data warehouse schemas to open-source engines or AWS-native services, such as Amazon Aurora and Redshift
The real power of the solution becomes apparent when you realize that you can move between database engines or between data warehouses, enabling you to move away from expensive, commercial databases, to cloud-native, open source solutions. And don’t worry, at AWS we don’t believe in vendor lock-in. You can use DMS just as easily to move data out of the cloud as into it.
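A minimal boto3 sketch of kicking off such a migration with DMS; the ARNs are placeholders, and the source/target endpoints and replication instance are assumed to already exist.

```python
import boto3

dms = boto3.client("dms")

# full-load-and-cdc bulk-copies existing data, then streams ongoing changes,
# which is what enables a near-zero-downtime cutover
dms.create_replication_task(
    ReplicationTaskIdentifier="oracle-to-aurora",       # placeholder
    SourceEndpointArn="arn:aws:dms:...:endpoint:SRC",   # placeholder
    TargetEndpointArn="arn:aws:dms:...:endpoint:TGT",   # placeholder
    ReplicationInstanceArn="arn:aws:dms:...:rep:RI",    # placeholder
    MigrationType="full-load-and-cdc",
    # Select every table in every schema
    TableMappings='{"rules": [{"rule-type": "selection", "rule-id": "1", '
                  '"rule-name": "1", "object-locator": '
                  '{"schema-name": "%", "table-name": "%"}, '
                  '"rule-action": "include"}]}',
)
```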
AWS serves hundreds of thousands of customers in more than 190 countries.
Amazon CloudFront and Amazon Route 53 are offered at AWS edge locations.
Each AZ is placed in a way that ensures inter-AZ latency is as low as 2 ms 99% of the time.