2. What to expect from this session
• Why use managed database services?
• Database options on AWS
• Amazon DynamoDB — a managed non-relational database
• Amazon RDS — a managed relational database
• Amazon ElastiCache — a managed in-memory cache
• Amazon Redshift — a managed data warehouse
• Summary
4. If you host your database on-premises
You manage every layer of the stack:
• Power, HVAC, networking
• Rack and stack
• Server maintenance
• OS installation
• OS patching
• DB installation
• DB patching
• DB backups
• Scaling
• High availability
• App optimization
7. If you host your database on Amazon EC2
You manage:
• OS patching
• DB installation
• DB patching
• DB backups
• Scaling
• High availability
• App optimization
AWS manages:
• Power, HVAC, networking
• Rack and stack
• Server maintenance
• OS installation
8. If you choose a managed database service
You manage:
• App optimization
AWS manages:
• Power, HVAC, networking
• Rack and stack
• Server maintenance
• OS installation
• OS patching
• DB installation
• DB patching
• DB backups
• High availability
• Scaling
9. Quick recap of the available options
• Self-managed — You are responsible for the hardware, OS, security, updates, backups, replication, etc., but you have full control over all of it.
• EC2 instances — You only need to focus on database-level upgrades, patching, replication, backups, etc., and you don't have to worry about the hardware or OS installation.
• Fully managed — You get features such as backups, replication, etc. as part of the service, and you don't have to bother with patching and upgrades.
11. A managed service for each data type
• Amazon DynamoDB — document and key-value store
• Amazon RDS — SQL database engines
• Amazon ElastiCache — in-memory key-value store
• Amazon Redshift — data warehouse
19. NoSQL vs. SQL for a new app: how do you choose?
NoSQL
• Schema-less, simple reads and writes, simple data models
• Easy to scale
• Focused on performance and availability at any scale
SQL
• Strong schemas, complex relationships, transactions, and JOINs
• Scaling is hard
• Focused on consistency over availability and scalability
22. Common use cases
• Ad tech — ad serving, retargeting, ID lookup, user profile management, session tracking, real-time bidding (RTB)
• IoT — tracking state, reads and metadata for millions of devices, real-time notifications
• Gaming — recording game details, leaderboards, session information, usage history, and logs
• Mobile & web — storing user profiles, session details, personalization settings, metadata
23. Predictable performance with low latency
Consistent single-digit millisecond latency, even at massive scale
24. Automatic replication for strong durability and availability
Writes
• Continuously replicated across 3 AZs
• Persisted to disk (custom SSD)
Reads
• Strongly or eventually consistent
• No latency trade-off
25. Amazon DynamoDB is a schema-less database
Attributes are schema-less; the schema is defined per item.
(Diagram: a table of items, each identified by a key and carrying its own set of attributes.)
26. Define your desired performance with provisioned throughput
• Read capacity units (RCUs)
• Write capacity units (WCUs)
• 1 request/second sustained is more than 2.5M requests in a month
27. Pay only for the resources you use
Monthly bill = storage used (GB) + write capacity units (WCUs) + read capacity units (RCUs)
Prices vary by region. More details at http://aws.amazon.com/dynamodb/pricing/
Free tier:
• A generous free tier of 25 GB, 25 WCUs, and 25 RCUs
• You get more than 60M write requests and 60M read requests free per month
• The free tier does not expire; you benefit every month
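The "1 request/second is over 2.5M requests a month" and "over 60M free requests" claims above are easy to verify; a minimal sketch of that arithmetic, assuming a 30-day month:

```python
# Back-of-the-envelope arithmetic for DynamoDB provisioned throughput,
# assuming a 30-day month (real months vary slightly).

SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000

def requests_per_month(capacity_units: int) -> int:
    """Requests served in a month if the provisioned rate is fully used."""
    return capacity_units * SECONDS_PER_MONTH

# 1 request/second sustained is already more than 2.5M requests a month:
print(requests_per_month(1))   # 2592000

# The 25-unit free tier covers well over 60M requests a month:
print(requests_per_month(25))  # 64800000
```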
31. Use cases
Applies wherever you need a relational database:
• eCommerce
• Gaming
• Websites
• IT solutions
• Apps
• Reporting
32. RDS feature matrix (Aurora, MySQL, PostgreSQL, Oracle, SQL Server)
• VPC, high availability, and instance scaling: all engines
• Encryption: available (coming soon for some engines at the time)
• Read replicas: cross-region; for Oracle, via Oracle GoldenGate
• Max storage: Aurora 64 TB; MySQL, PostgreSQL, Oracle 6 TB; SQL Server 4 TB
• Storage scaling: automatic for Aurora
• Provisioned IOPS: N/A for Aurora; 30,000 for MySQL, PostgreSQL, Oracle; 20,000 for SQL Server
• Largest instance: R3.8XL for all engines
33. Amazon Aurora: fast, available, and MySQL-compatible
(Diagram: SQL, transactions, and caching layers spanning AZ 1, AZ 2, and AZ 3, backed by Amazon S3.)
• 5x faster than MySQL on the same hardware
• Sysbench: 100K writes/sec and 500K reads/sec
• Designed for 99.99% availability
• Storage is replicated 6 times across 3 AZs
• Scales up to 64 TB and 15 read replicas
34. Amazon RDS is simple and easy to scale
• DB instance types offer a range of CPU and memory options
• Scale instance resources up or down on demand
• DB storage is scalable on demand
35. Amazon RDS offers fast, predictable storage
• General Purpose (SSD) for most workloads
• Provisioned IOPS (SSD) for OLTP-style workloads, up to 30,000 IOPS
• Magnetic for small, infrequently accessed workloads
36. Multi-AZ deployments for high availability
An enterprise-grade fault-tolerance solution for production databases
37. Choose cross-region replication for easier migrations and data locality
• Easier disaster recovery
• Bring data closer to your customers
• Promote to master for a simple migration
38. How do Amazon RDS backups work?
Automated backups
• Restore your database to a point in time
• Enabled by default
• Choose a retention period of up to 35 days
Manual snapshots
• Build a new DB instance from a snapshot whenever you need it
• Initiated by you
• Kept until you delete them
• Stored in Amazon S3
39. Payment
Pay for the resources you use:
Monthly bill = DB instance hours used (price depends on the DB instance type) + GB of storage used (price depends on the storage type)
More details at http://aws.amazon.com/rds/pricing/
Free tier (for the first 12 months):
• 750 hours of a micro DB instance
• 20 GB of database storage
• 20 GB for backups
• 10 million I/O operations
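The billing formula above can be sketched in a few lines; the rates used here are hypothetical placeholders, not real prices (those vary by region, engine, and storage type — see the pricing page):

```python
# Sketch of the RDS billing formula: instance hours plus storage.
# Both rates below are hypothetical; real prices vary by region and engine.

PRICE_PER_DB_HOUR = 0.017        # hypothetical micro-instance rate, USD/hour
PRICE_PER_GB_MONTH = 0.115       # hypothetical SSD storage rate, USD/GB-month

def rds_monthly_bill(instance_hours: float, storage_gb: float) -> float:
    """Monthly bill = instance hours x hourly rate + GB stored x GB rate."""
    return instance_hours * PRICE_PER_DB_HOUR + storage_gb * PRICE_PER_GB_MONTH

# A micro instance running all month (~730 h) with 20 GB of storage:
print(round(rds_monthly_bill(730, 20), 2))  # 14.71
```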
42. Amazon Redshift
Much faster, more economical, very simple:
• Relational data warehouse
• Massively parallel; scales to petabytes
• Fully managed
• HDD and SSD platforms
• $1,000/TB/year; starts at $0.25/hr
43. Common use cases
Traditional enterprises
• 10x more economical
• Easy to provision
• Higher DBA productivity
Big data companies
• 10x faster
• No programming required
• Easily reuse BI, Hadoop, machine learning, and streaming tools
SaaS companies
• Online analytics with stream processing
• Pay per use, grow as needed
• Managed availability and disaster recovery
44. Amazon Redshift architecture
Leader node
• Simple SQL endpoint (JDBC/ODBC)
• Stores metadata
• Optimizes the query plan
• Coordinates query execution
Compute nodes
• Local columnar storage
• Parallel/distributed execution of all queries, loads, backups, restores, and resizes
• Interconnected over 10 GigE (HPC); ingestion, backup, and restore flow through the cluster
Starts at just $0.25/hr and grows to 2 PB (compressed)
• DC1: SSD; scales from 160 GB to 326 TB
• DS2: HDD; scales from 2 TB to 2 PB
45. Amazon Redshift is fast
Dramatically less I/O, thanks to:
• Column storage
• Data compression
• Zone maps
• Direct-attached storage
• Large data block sizes
(Diagram: a table with columns ID, Age, State, and Amount stored column by column; zone maps record each block's minimum and maximum values, e.g. blocks spanning 10–324, 375–623, and 637–959.)
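The zone-map idea sketched in the diagram — skip any block whose min/max range cannot contain the value being searched — is easy to illustrate. This is a toy stand-in, not Redshift's actual implementation:

```python
# Minimal sketch of zone maps: track each block's min and max, then
# skip blocks that cannot contain the value being searched.
# Illustrative only; Redshift's real storage engine works differently.

blocks = [
    [10, 13, 14, 26, 100, 245, 324],
    [375, 393, 417, 512, 549, 623],
    [637, 712, 809, 834, 921, 959],
]

# Zone map: (min, max) per block, computed once at load time.
zone_map = [(min(b), max(b)) for b in blocks]

def scan(value):
    """Read only the blocks whose [min, max] range could hold `value`."""
    hits, blocks_read = [], 0
    for (lo, hi), block in zip(zone_map, blocks):
        if lo <= value <= hi:
            blocks_read += 1
            hits.extend(v for v in block if v == value)
    return hits, blocks_read

# Searching for 549 touches only 1 of the 3 blocks:
print(scan(549))  # ([549], 1)
```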
46. Fully managed continuous/incremental backups
• Multiple copies within the cluster
• Continuous, incremental backups to Amazon S3
• Continuous, incremental backups across regions
• Streaming restore
47. Amazon Redshift offers solid fault tolerance
It handles:
• Disk failures
• Node failures
• Network failures
• AZ/region-level disasters (via cross-region backups to Amazon S3)
48. Pay for what you use
Monthly bill = number of nodes x duration the nodes were used (price depends on the node type)
More details at https://aws.amazon.com/redshift/pricing/
• 2-month free trial
• The leader node is free
• No upfront costs; pay for what you use
• The price includes three copies of your data
• Backup storage is free up to 100% of provisioned storage
• 3x data compression on average
49. Redshift has a complete ecosystem
Partners across data integration, business intelligence, and systems integration
52. The challenge
• Deploy an infrastructure environment for an insurance company, meeting the following requirements:
  • Fast deployment
  • Cost flexibility
  • Security
  • Pay per use
  • Oracle Enterprise support
53. Why AWS?
• The insurance company needed a platform with a high infrastructure service level (99.99% availability) and a high security standard.
• Certified Oracle support.
• The cost assessment came in 30% lower than on-premises platforms.
• Deployment was 4x faster with RDS.
54. Simplified AWS architecture
• 12 EC2 instances
• 2 RDS Oracle instances
• 4 TB of storage
• 1 TB in S3
• Site-to-site VPN
• Trend Micro security solutions
• In Motion professional services
55. Benefits
• 99.99% SLA
• Time to market: speed of infrastructure creation
• Security: shared with AWS and complemented with Trend Micro Deep Security
57. Who are we?
• In Motion, a company with a regional presence.
• More than 20 years of experience in integration projects and cloud solutions.
• A leader in solutions for the insurance industry.
• A team of 250 professionals located across several LATAM countries.
61. A caching layer to increase performance or optimize database costs
Common use cases:
• Ephemeral key-value data storage
• High-performance application patterns such as leaderboards (game users), session management, event counters, and in-memory lists
62. Key features of ElastiCache
Memcached
• Fully managed
• Cache node auto-discovery
• Multi-AZ node placement
Redis
• Fully managed
• Multi-AZ with auto-failover
• Persistence
• Read replicas
63. How is ElastiCache billed?
Monthly bill = number of nodes x duration the nodes were used (price depends on the node type)
More details at http://aws.amazon.com/elasticache/pricing/
Free tier (for the first 12 months): 750 hours of a micro cache node
All the time that’s freed up by offloading undifferentiated labor to AWS can be used to do the app optimizations you always wanted to have time to do.
Performance and Availability at Scale
Ad-tech
Gaming
Apps for connected devices
The latency characteristics of DynamoDB are under 10 msec and highly consistent.
Most importantly, the data is durable in DynamoDB, constantly replicated across multiple data centers and persisted to SSD storage.
Has-Offers (Tune) – Near real time, mobile and web marketing platform with tons of data providing ad-hoc analysis
Over 200 billion items and covering 87TB of data
Redfin - 5+ Billion items stored (processed daily)
SmugMug – Photo hosting and sharing platform
Millions of users, Billions of Photos and PBs of storage
DropCam – Video streaming solution – reduced delivery times for video events from 5-10 secs to few milliseconds
MLB – statcast
Duolingo – language learning tool, over 4B items stored
Myriad – Over 170M users (Genetics)
AdBrain – cross device advertising
Tokyo Hands – DB Streams for IoT
DoApp – High performance ads
TigerSpike – millisecond latency
Nextdoor – Low latency social network
VidRoll – Millions of ads per month
JustGiving – Website click stream events
A fully managed relational database service that offers a simple, fast, and low-cost solution for multiple SQL engines …
SQL Server 2008 R2, 2012 – multiple editions
MySQL – 5.6, 5.5 and 5.1
PostgreSQL – 9.4.4, 9.4.1, 9.3.*
Oracle 12c
FedRamp for RDS
Reason for Launch: Requirement for US Federal agencies
RDS BAA Inclusion
Reason for Launch: US regulatory Compliance
NBC, Expedia, GE Oil and Gas, Washington Post
The computation and memory capacity of a DB instance is determined by its DB instance class. You can change the CPU and memory available to a DB instance by changing its DB instance class; to change the DB instance class, you must modify the DB instance.
Here are the DB instance classes available through Amazon RDS:
Micro instances (db.t1.micro): An instance sufficient for testing but should not be used for production applications.
Standard - Current Generation (m3): Second generation instances that provide more computing capacity than the first generation db.m1 instance classes at a lower price.
Memory Optimized - Current Generation (db.r3): Second generation instances that provide memory optimization and more computing capacity than the first generation db.m2 instance classes at a lower price.
Burst Capable - Current Generation (db.t2): Instances that provide baseline performance level with the ability to burst to full CPU usage.
You can change from one database instance type to another. There will be a brief availability event during the changeover.
You can increase the amount of storage available to your database instance on demand for the MySQL, Oracle, and PostgreSQL database engines. This change is performed online, without an availability impact. Amazon Aurora automatically grows the database size on demand.
Amazon RDS General Purpose (SSD) Storage is suitable for a broad range of database workloads that have moderate I/O requirements. With the baseline of 3 IOPS/GB and ability to burst up to 3,000 IOPS, this storage option provides predictable performance to meet the needs of most applications.
Amazon RDS Provisioned IOPS (SSD) Storage is an SSD-backed storage option designed to deliver fast, predictable, and consistent I/O performance. With Amazon RDS Provisioned IOPS (SSD) Storage, you specify an IOPS rate when creating a DB Instance, and Amazon RDS provisions that IOPS rate for the lifetime of the DB Instance. Amazon RDS Provisioned IOPS (SSD) Storage is optimized for I/O-intensive, transactional (OLTP) database workloads.
Formerly known as Standard storage, Amazon RDS Magnetic Storage is useful for small database workloads where data is accessed less frequently.
For a workload with 50% writes and 50% reads running on an m2.4xlarge instance, you can realize up to 25,000 IOPS for Oracle. For a similar workload running on cr1.8xlarge you can realize up to 20,000 IOPS for MySQL or PostgreSQL. If you are using SQL Server, the maximum storage you can provision is 1TB and maximum IOPS you can provision is 10,000 IOPS. For SQL Server, the ratio of IOPS to storage (in GB) should be 10 and scaling storage or IOPS of a running DB Instance is not currently supported.
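The SQL Server constraints above (storage capped at 1 TB, IOPS capped at 10,000, and a required IOPS-to-storage ratio of 10) can be sketched as a simple validation; this helper and its limits come straight from the paragraph, but the function itself is just an illustration:

```python
# Validates a hypothetical Provisioned IOPS request against the SQL Server
# limits stated above: max 1 TB of storage, max 10,000 IOPS, and an
# IOPS-to-storage (GB) ratio of exactly 10.

def valid_sqlserver_piops(storage_gb: int, iops: int) -> bool:
    if storage_gb > 1024:      # max 1 TB of storage
        return False
    if iops > 10_000:          # max provisionable IOPS
        return False
    return iops == storage_gb * 10  # required IOPS:GB ratio of 10

print(valid_sqlserver_piops(1000, 10_000))  # True
print(valid_sqlserver_piops(500, 10_000))   # False: ratio would be 20
```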
Choose the storage type most suited for your workload.
High-performance OLTP workloads: Amazon RDS Provisioned IOPS (SSD) Storage
Database workloads with moderate I/O requirements: Amazon RDS General Purpose (SSD) Storage
Small database workloads with infrequent I/O: Amazon RDS Magnetic Storage
Cross-region read replicas are available for Amazon RDS for MySQL.
When automated backups are turned on for your DB Instance, Amazon RDS automatically performs a full daily snapshot of your data (during your preferred backup window) and captures transaction logs (as updates to your DB Instance are made). When you initiate a point-in-time recovery, transaction logs are applied to the most appropriate daily backup in order to restore your DB Instance to the specific time you requested. Amazon RDS retains backups of a DB Instance for a limited, user-specified period of time called the retention period, which by default is one day but can be set to up to thirty-five days.
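The point-in-time recovery described above — pick the most recent daily snapshot at or before the target time, then replay transaction logs up to the target — can be sketched like this (an illustration of the idea, not RDS internals; timestamps are plain integers):

```python
# Illustrative sketch of point-in-time recovery: restore from the most
# recent snapshot at or before the target time, then replay the
# transaction-log records up to that target.

def point_in_time_restore(snapshots, log_records, target):
    """snapshots: snapshot times; log_records: (time, change) pairs."""
    base_candidates = [t for t in snapshots if t <= target]
    if not base_candidates:
        raise ValueError("no snapshot at or before the target time")
    base = max(base_candidates)
    replayed = [c for (t, c) in log_records if base < t <= target]
    return base, replayed

snapshots = [100, 200, 300]                            # daily snapshot times
logs = [(150, "a"), (250, "b"), (260, "c"), (310, "d")]

# Restore to t=270: start from snapshot 200, replay changes "b" and "c".
print(point_in_time_restore(snapshots, logs, 270))  # (200, ['b', 'c'])
```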
Manual database snapshots are user-initiated and enable you to back up your DB Instance in a known state as frequently as you wish, and then restore to that specific state at any time. DB Snapshots can be created with the AWS Management Console or CreateDBSnapshot API and are kept until you explicitly delete them with the Console or DeleteDBSnapshot API.
Manual database snapshots are kept in Amazon Simple Storage Service (Amazon S3). Amazon S3 is designed for 99.999999999% durability.
Flipboard is an online magazine with millions of users and billions of “flips” per month. Uses Amazon RDS and its Multi-AZ capabilities to store mission critical user data. Went from concept to delivered product in six months with just a handful of engineers.
Samsung – Delivers app and content via RDS, saved 85% over their on-prem solution
Enterprises - Reduce costs by extending DW rather than adding HW, Migrate completely from existing DW systems, Respond faster to business; provision in minutes
Big Data - Improve performance by an order of magnitude, Make more data available for analysis, Access business data via standard reporting tools
SaaS - Add analytics functionality to applications, Scale DW capacity as demand grows, Reduce HW and SW costs by an order of magnitude
Big data customers – Adtech, gaming etc.
SAAS – Imfhealth, Microstrategy,
People are trying presto + redshift (Nasdaq).
With column storage, you only read the data you need
Data Compression - COPY compresses automatically, You can analyze and override, More performance, less cost
Zone Maps – Track the minimum and maximum value for each block, Skip over blocks that don’t contain relevant data
Direct Attached Storage - Use local storage for performance, Maximize scan rates, Automatic replication and continuous backup, HDD and SSD platforms
Large data block sizes - Typical database block sizes range from 2 KB to 32 KB. Amazon Redshift uses a block size of 1 MB, which is more efficient and further reduces the number of I/O requests needed to perform any database loading or other operations that are part of query execution
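The block-size point above is pure arithmetic: scanning the same amount of data with 1 MB blocks takes a small fraction of the I/O requests needed with typical 2 KB–32 KB blocks. A quick sketch:

```python
# Why a 1 MB block helps: scanning the same data takes far fewer I/O
# requests than with typical 2 KB-32 KB database blocks.

def io_requests(data_bytes: int, block_bytes: int) -> int:
    """Number of block reads needed to scan `data_bytes` of data."""
    return -(-data_bytes // block_bytes)  # ceiling division

GIB = 1024 ** 3  # 1 GiB in bytes

print(io_requests(GIB, 32 * 1024))    # 32768 reads with 32 KB blocks
print(io_requests(GIB, 1024 * 1024))  # 1024 reads with 1 MB blocks
```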
Amazon Redshift’s streaming restore feature enables you to resume querying as soon as the new cluster is created and basic metadata is restored. The data itself will be pulled down from S3 in the background, or brought in on demand as needed by individual queries.
If Amazon Redshift detects a drive failure, it automatically begins using the other in-cluster copy of the data on that drive to serve queries while also creating another copy of the data on healthy drives within the cluster. If all of the copies within the cluster are unavailable, it will bring the data down from S3
If Amazon Redshift detects a failure that requires a node to be replaced, it automatically provisions and configures a new node and adds it to your cluster so you can resume operations.
Snapshots can be restored to different availability zones within a region.
Multi-region backup capability to handle global disaster recoverability.
Get started with as low as $0.25 per hour
Leader node is free
Dw2.large has 0.16TB, so 4 of them are 0.64 TB
100% backup storage is free
No data transfer charges unless you’re running in VPC
Redshift works with the customer's BI tool of choice through Postgres drivers and a JDBC/ODBC connection. A number of partners shown here have certified integration with Redshift, meaning they have done testing to validate and build Redshift integration and make using Redshift easy from a UI perspective. If a customer uses tools that aren't shown, we can work with the Redshift team on getting them integrated.
A number of partners who have certified their solutions to work with Redshift
Data Loading Options
- Parallel upload to Amazon S3
- AWS Direct Connect
- AWS Import/Export
- Amazon Kinesis
- Systems integrators
Nasdaq loads billions of securities records per trading day to track client activities.
HasOffers loads 60M rows per day in 2-minute intervals. Desk runs a high-concurrency user-facing portal (read/write cluster). Amazon.com/NTT operate at PB scale. Pinterest saw 50-100x speedups when it moved 300 TB from Hadoop to Redshift.
Nokia saw 50% reduction in costs.
Key Competition – Vertica, Green plum, Matisa, Oracle, SQL Server (Legacy), Big data customers – Hadoop/Hive, Impala, Presto (Adtech, gaming etc), SAAS – Imfhealth, Microstrategy,
1/10th cost at 10x speed – $1000/TB/year – cheaper than other products, no benchmarks – columnar data, zone maps for filtering, better compression
Main customer queries –
1) Security – Everything that AWS offers, end-to-end encryption, HSM Keys, VPC – compute nodes are in a separate VPC, tiered key system – block level, cluster, key-auth cluster. Automated backups and replica, incremental backups, cross region backups.
2) ETL – How to get data in, what tools to use, how frequently can I load – Many use S3 and then do a copy command to parallel upload, but should not use inserts over JDBC/ODBC and should prefer copy command, a bunch of Redshift partners to get data in, frequency – 5 mins to hourly and daily batches, batching is better for performance than microbatching
3) BI – visualization tools – supports ODBC/JDBC interface.
Concurrency – 50 concurrent query limit – 51st query is queued, customers can set up different queues. There are tools for workload management and suggest caching to reduce impact.
500 concurrent connections are supported
4) Performance – tuning for performance – follow the best practices – setting up tables with sort keys and distribution keys, getting the tables analyzed in time, vacuum (defrag) for faster scans.
Recently announced – user defined functions in Python.
ElastiCache works great in front of a database hosted in Amazon RDS, or
A self-managed database running in Amazon EC2
So common design patterns in use cases that we see…
Emphasize the caching layer on DBs
The most common one we see is as a cache. Historically, Memcached has been the most popular caching option and continues to be popular because of its simplicity, high performance, and scale-out capabilities. However, the caching space has been evolving over the past few years, and we now see Redis as a good caching option as well. This makes your solution much more scalable and responsive on the front end by providing a low-latency path to application data without having to go to the backend database. The main benefit is that you can independently scale your front end to cater to spiking loads without having to grow the costlier backend database. I would like to emphasize that dealing with spiky loads is not only about ensuring lower latency; it is often about handling a very high request rate, for which a cache is perfect, especially in ad-tech/gaming use cases.
Ephemeral key-value data
Web session management
Leaderboards come up often. Users have rapid interactions with a site, for example in gaming use cases, and rather than paying a round-trip latency to a database every time, operating out of the cache is more convenient. One thing to remember is that you typically do want to provide persistence for this user state and manage that at the application level, but you can do so asynchronously. Here you use ElastiCache Redis to get fast response times and high request rates, and use the backend database for persistence if you are not relying on ElastiCache Redis for it. In addition, you can also track key statistics of the entire user base in Redis and rank users.
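The leaderboard pattern above maps onto Redis sorted sets (ZADD to score, ZREVRANGE to rank). A pure-Python stand-in that mimics the data shape, for illustration only:

```python
# Toy leaderboard mimicking the Redis sorted-set pattern: a real
# deployment would issue ZADD/ZREVRANGE against ElastiCache Redis;
# this pure-Python stand-in just illustrates the semantics.

scores = {}  # member -> score, like a Redis sorted set

def zadd(member: str, score: int) -> None:
    """Add or update a member's score (Redis ZADD)."""
    scores[member] = score

def top(n: int):
    """Highest-scoring members first, like ZREVRANGE 0 n-1."""
    return sorted(scores, key=scores.get, reverse=True)[:n]

zadd("alice", 3100)
zadd("bob", 2800)
zadd("carol", 4200)
print(top(2))  # ['carol', 'alice']
```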
High speed sorting is the other one we see. Here you have a set of key/values and you want to order them across various dimensions that can change on demand. Typically the value here is a pointer to the actual object so this behaves as an index
Distributed Counters is also a popular design pattern. You may want to keep track of an event occurrence throughout a day. Redis bitmaps are ideal for this and are also space efficient.
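The bitmap-counter pattern above sets one bit per user per day and counts set bits to get, say, daily active users. Real code would use Redis SETBIT/BITCOUNT; this sketch uses a Python int as the bitmap, purely for illustration:

```python
# Sketch of the Redis-bitmap counter pattern: one bit per user per day
# marks activity; counting set bits gives daily active users.
# A real deployment would use Redis SETBIT/BITCOUNT on a per-day key.

daily_bitmap = 0  # bit i set means the user with id i was active today

def mark_active(user_id: int) -> None:
    global daily_bitmap
    daily_bitmap |= 1 << user_id          # like SETBIT <day-key> user_id 1

def active_users() -> int:
    return bin(daily_bitmap).count("1")   # like BITCOUNT <day-key>

for uid in (3, 17, 3, 42):  # user 3 appears twice, counted once
    mark_active(uid)
print(active_users())  # 3
```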
For queuing support Redis allows push and pop from a list. A related design pattern is to use Redis PubSub which allows you to implement messaging between applications. Publish allows you to push a message to a channel and subscribe listens on a channel.
Lastly, a common pattern is tracking a stream of events or an activity stream that a user may want to monitor. Redis lists are good for this. You can cap list sizes using LTRIM and fetch the most recent items using LRANGE.
New Redis features - https://aws.amazon.com/about-aws/whats-new/2015/09/amazon-elasticache-now-with-enhanced-redis-capabilities/
The latest engine version of Amazon ElastiCache for Redis now comes with several enhancements:
More usable memory: You can now safely allocate more memory for your application without the risk of increased swap usage during syncs and snapshots.
Improved synchronization: Improved output buffer management provides more robust synchronization under heavy load and when recovering from network disconnections. Additionally, syncs are faster as both the primary and replicas no longer use the disk for this operation.
Smoother failovers: In the event of a failover, your cluster now recovers faster as replicas will avoid flushing their data to do a full re-sync with the primary.
Expedia uses a 400-node ElastiCache cluster.
Dealflicks reduced their latency by over 80% after adopting ElastiCache.
Riot Games uses ElastiCache Redis for their “League of Legends” game, one of the most played PC games in the world. They use Redis sorted sets within ElastiCache to keep track of player leaderboards, using the Multi-AZ feature with automatic failover for high availability.
Adobe uses MemcacheD for their shared cloud service – managing counters that are not persistent