STG301_Deep Dive on Amazon S3 and Glacier Architecture
1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Deep Dive on Amazon S3
and Glacier Architecture
Craig Cotton, Director Product Management – Amazon S3
Henry Zhang, Senior Product Manager – Glacier
Jamal Mazhar, Head of Infrastructure and DevOps – Sprinklr
2. AGENDA
• Deep dive on Amazon S3 architecture
• Deep dive on Glacier architecture
• Guest Speaker: Jamal Mazhar, Head of Infrastructure and DevOps
@ Sprinklr
3. The AWS Storage Portfolio
Block: Amazon EBS (persistent), Amazon EC2 Instance Store (ephemeral)
File: Amazon EFS
Object: Amazon S3, Amazon Glacier
Data Transfer: S3 Transfer Acceleration, 3rd Party Connectors, AWS Snow Family, AWS Storage Gateway, AWS Direct Connect, Amazon Kinesis, EFS File Sync
4. Benefits of Amazon S3 & Glacier
Durable, Available, & Scalable Security & Compliance Query In Place
Flexible Management Ecosystem
5. Architecture Deep Dive
Amazon S3
6. Amazon S3 By The Numbers
• 44 Availability Zones (16 more coming in 2018)
• 16 Regions (5 more coming in 2018)
• Trillions of objects
• Millions of requests per second
• One of the first three AWS services (2006)
• 99.999999999% durability
7. Amazon S3 Architecture
End User → Internet → Load Balancers → API Servers → Metadata Storage + Blob Storage
(PUT / GET / DELETE requests)
8. Amazon S3 Availability Zones
• S3 stores data in at least 3 Availability Zones (AZs)
• Each AZ can comprise up to 8 physical data centers
• Unavailability of a data center or an AZ does not impact overall S3 availability
• A low-latency private network connects data centers and AZs
• AZs are physically separate – even extremely uncommon disasters would affect only a single AZ
• Data is automatically distributed across a minimum of 3 geographically separated AZs within an AWS Region
9. Amazon S3 Storage Classes & Transitions
S3 Standard: active data, synchronous access, milliseconds retrieval, 2.1¢/GB-month
S3 Standard – Infrequent Access: infrequently accessed data, synchronous access, milliseconds retrieval, 1.25¢/GB-month
Amazon Glacier: archive data, asynchronous access, minutes-to-hours retrieval, 0.4¢/GB-month
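The per-GB prices above can be turned into a quick cost comparison. A minimal sketch, using only the prices quoted on this slide (the helper name and the 10 TB example figure are our own):

```python
# Per-GB-month prices from the slide above (USD)
PRICE_PER_GB_MONTH = {
    "STANDARD": 0.021,      # S3 Standard: 2.1¢/GB-month
    "STANDARD_IA": 0.0125,  # S3 Standard - Infrequent Access: 1.25¢/GB-month
    "GLACIER": 0.004,       # Amazon Glacier: 0.4¢/GB-month
}

def monthly_storage_cost(gb: float, storage_class: str) -> float:
    """Return the monthly storage cost in USD for `gb` gigabytes."""
    return round(gb * PRICE_PER_GB_MONTH[storage_class], 2)

# 10 TB stored for a month in each class
for cls in PRICE_PER_GB_MONTH:
    print(cls, monthly_storage_cost(10_000, cls))
```

Note this covers storage only; transition, request, and retrieval charges (shown later for Glacier) come on top.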
10. Amazon S3 Security, Encryption & Compliance
The broadest set of tools in the industry
Security
• IAM and bucket policies
• Access Control Lists
• Audit logging with CloudTrail & alerts with CloudWatch
• Secure CloudFormation templates
• Amazon Macie
• S3 Console permission checks
Encryption
• Encryption in transit with TLS
• SSE-S3 – Amazon S3 manages data & keys
• SSE-C – customer-managed keys
• SSE-KMS – master keys in KMS
• CSE – 100% customer managed
• Default bucket encryption
• Encryption status in S3 Inventory
Compliance
• PCI-DSS
• HIPAA/HITECH
• FedRAMP
• FISMA
• EU Data Protection Directive
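To make the SSE modes concrete, here is a sketch of the extra `put_object` parameters each server-side encryption mode adds. The parameter names follow boto3's S3 client; treat this as an illustration, not the authoritative API reference:

```python
def sse_params(mode, kms_key_id=None):
    """Return put_object kwargs for a given server-side encryption mode."""
    if mode == "SSE-S3":
        # Amazon S3 manages both the data and the keys
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        # Master keys live in KMS; the key ID is optional (default key if omitted)
        params = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id:
            params["SSEKMSKeyId"] = kms_key_id
        return params
    raise ValueError("unsupported mode: %s" % mode)
```

Usage would look like `s3.put_object(Bucket="my-bucket", Key="doc.pdf", Body=data, **sse_params("SSE-KMS"))`, where the bucket and key names are hypothetical. SSE-C and client-side encryption take a different path (the customer supplies key material with every request), so they are omitted here.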
11. Cross-region Replication
Automatically replicate data to any other AWS Region
• Replicate by object, bucket, or prefix
• Support for SSE-KMS encrypted objects
• Ownership overwrite: change the object owner in the destination region
Region A → Region B over cross-region connectivity
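A replication rule is just a small configuration document attached to the source bucket. A minimal sketch in the shape boto3's `put_bucket_replication` expects; the role ARN and bucket names are hypothetical:

```python
def replication_config(role_arn, dest_bucket, prefix=""):
    """Build a cross-region replication configuration for one rule."""
    return {
        "Role": role_arn,  # IAM role S3 assumes to replicate on your behalf
        "Rules": [{
            "ID": "replicate-" + (prefix or "all"),
            "Prefix": prefix,  # replicate by prefix ("" = whole bucket)
            "Status": "Enabled",
            "Destination": {"Bucket": "arn:aws:s3:::" + dest_bucket},
        }],
    }
```

This would be applied with `s3.put_bucket_replication(Bucket="source-bucket", ReplicationConfiguration=cfg)` (versioning must be enabled on both buckets).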
12. Cross-region Replication Examples
• S3 Standard → S3 Standard
• S3 Standard → S3 Standard-IA
• S3 Standard → Glacier (zero-day lifecycle policy to Glacier)
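The zero-day lifecycle policy in the last example is a rule that transitions objects to Glacier immediately on arrival in the destination bucket. A sketch in the shape boto3's `put_bucket_lifecycle_configuration` expects (the rule ID is our own; `Prefix` reflects the 2017-era API shown here):

```python
# Transition every object to Glacier as soon as it lands ("zero-day")
ZERO_DAY_TO_GLACIER = {
    "Rules": [{
        "ID": "zero-day-to-glacier",
        "Prefix": "",   # empty prefix = apply to the whole bucket
        "Status": "Enabled",
        "Transitions": [{"Days": 0, "StorageClass": "GLACIER"}],
    }],
}
```

Applied with `s3.put_bucket_lifecycle_configuration(Bucket="dest-bucket", LifecycleConfiguration=ZERO_DAY_TO_GLACIER)`, where the bucket name is hypothetical.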
13. Do More With Your In-place Data
Data Lake Storage
• Athena
• Redshift Spectrum
• QuickSight
• EMR
IoT Storage
• AWS IoT
• Greengrass
• Other IoT sensors
Machine Learning & AI Storage
• Rekognition
• Lex
• Polly
• MXNet & TensorFlow
14. Maximize Throughput with Amazon S3
Amazon S3 automatically scales to thousands of requests per second per prefix based on your steady-state traffic
• Amazon S3 automatically partitions your prefixes within hours, adjusting to increases in request rates
• Consider using a three- or four-character hash (see next slide for details)
15. Using a Three- or Four-Character Hash
examplebucket/232a-2017-26-05-15-00-00/cust1234234/photo1.jpg
examplebucket/7b54-2017-26-05-15-00-00/cust3857422/photo2.jpg
examplebucket/921c-2017-26-05-15-00-00/cust1248473/photo2.jpg
examplebucket/animations/232a-2017-26-05-15-00-00/cust1234234/animation1.obj
examplebucket/videos/ba65-2017-26-05-15-00-00/cust8474937/video2.mpg
examplebucket/photos/8761-2017-26-05-15-00-00/cust1248473/photo3.jpg
A bit more LIST-friendly: the random hash should come before patterns such as dates and sequential IDs – but always first ensure that your application can accommodate this layout.
Due to recent Amazon S3 performance enhancements, most customers no longer need to worry about introducing entropy in key names.
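The hashed keys above can be generated mechanically. A minimal sketch: hash a stable attribute of the object (here the customer ID, a choice of ours – any stable attribute works) and prepend the first four hex characters, so keys spread evenly across partitions:

```python
import hashlib

def hashed_key(customer_id, date, filename):
    """Build an S3 key with a 4-character hash prefix, as on the slide above."""
    # md5 here is for key distribution only, not security
    prefix = hashlib.md5(customer_id.encode()).hexdigest()[:4]
    return "{}-{}/{}/{}".format(prefix, date, customer_id, filename)

# e.g. hashed_key("cust1234234", "2017-26-05-15-00-00", "photo1.jpg")
# yields a key like "<4 hex chars>-2017-26-05-15-00-00/cust1234234/photo1.jpg"
```

Because the prefix is derived rather than random, the same object always maps to the same key, which keeps lookups and overwrite semantics simple.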
16. Architecture Deep Dive
Amazon Glacier
17. Benefits of Amazon S3 & Glacier
Durable, Available, & Scalable Security & Compliance
Flexible Management Ecosystem
Low-Cost
18. Just Say No
X No capital investment
X No commitment
X No capacity planning
X No idle capacity
X No onerous media handling
X No complex technology refreshes
X No undifferentiated heavy lifting
19. 99.999999999% Durability
Durability for long-term preservation
Built-in Fixity Checking
Automatic recovery
20. Flexible Data Retrieval Options
All of your Glacier data is accessible with any of three retrieval options:
• Standard Retrieval – current model; 3-5 hours; $0.01/GB
• Bulk Retrieval – batch/bulk access; 5-12 hours; $0.0025/GB
• Expedited Retrieval – rare urgent access; 1-5 minutes; $0.03/GB
On-site tape replacement / off-site tape replacement
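The three options trade retrieval time against price. A small sketch using only the per-GB prices and time windows quoted on this slide (the helper name is our own; request fees are omitted):

```python
# Retrieval options from the slide above: price per GB and time window
RETRIEVAL_OPTIONS = {
    "Expedited": {"price_per_gb": 0.03,   "window": "1-5 minutes"},
    "Standard":  {"price_per_gb": 0.01,   "window": "3-5 hours"},
    "Bulk":      {"price_per_gb": 0.0025, "window": "5-12 hours"},
}

def retrieval_cost(gb, option):
    """Return the retrieval cost in USD for `gb` gigabytes."""
    return round(gb * RETRIEVAL_OPTIONS[option]["price_per_gb"], 2)

# Retrieving 1 TB: Bulk is 12x cheaper than Expedited
for name in ("Bulk", "Standard", "Expedited"):
    print(name, retrieval_cost(1000, name), RETRIEVAL_OPTIONS[name]["window"])
```

The useful rule of thumb: the less urgent the access pattern, the cheaper the retrieval, which is why Bulk fits off-site-tape replacement workflows.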
21. Multiple Ways to Access S3 and Glacier
1. Use S3 and Glacier via S3 Lifecycle Management
2. Direct Amazon Glacier API/SDK
3. AWS Storage Gateway
4. 3rd party tools and gateways
22. Amazon Glacier – Direct access/APIs
Data upload: Create Vault → Configure Access → Upload Archives → Register Archive ID
Data retrieval: Initiate Retrieval → Async Retrieval Completion → Completion Notification → Download Data
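The "Register Archive ID" step matters because Glacier returns only an opaque archive ID on upload – the caller must keep its own mapping from meaningful names to IDs to retrieve anything later. A toy sketch of that bookkeeping (all names are hypothetical; a real system would persist this index in a database):

```python
import uuid

class ArchiveIndex:
    """Local name-to-archive-ID index a Glacier client must maintain."""

    def __init__(self):
        self._ids = {}

    def register(self, name):
        """Record the archive ID returned by an upload and hand it back."""
        archive_id = uuid.uuid4().hex  # stand-in for Glacier's opaque ID
        self._ids[name] = archive_id
        return archive_id

    def lookup(self, name):
        """Find the archive ID needed to initiate a retrieval job."""
        return self._ids[name]
```

With boto3 the real ID would come from `glacier.upload_archive(...)`, and `lookup` would feed `initiate_job` for the asynchronous retrieval shown above.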
23. Third-party tools and gateways
• Consumer grade: less than $50 per license
  – Examples: CloudBerry, FastGlacier, Arq (Haystack Software)
• Small/medium business: $500 – $1,000 per license
  – Examples: Synology, Veeam, QNAP
• Enterprise gateway and data management software
  – Examples: NetApp AltaVault, Commvault, StorNext, StoreReduce, Vidispine
24. Which option should I choose?
• Use S3 lifecycle-managed Amazon Glacier if the S3 object keys are sufficient for index/search capability
• Use Amazon Glacier directly if you already plan to store more metadata/indices in a database
• Use 3rd-party tools to minimize coding
25. Compliance storage with Vault Lock
Amazon Glacier Vault Lock allows you to easily set compliance controls on individual vaults and enforce them via a lockable policy.
• Time-based retention
• MFA authentication
• Controls govern all records in a vault
• Immutable policy
• Two-step locking
26. Vault Lock for compliance storage
• Non-overwrite, non-erasable records
• Time-based retention with “ArchiveAgeInDays” control
• Policy lockdown (strong governance)
• Legal hold with vault-level tags
• Configure optional designated third-party access and grant temporary access
27. How does Vault Lock work?
• Do you use WORM drives/media?
• How do you achieve WORM?
• What happens to data under retention if I close my account?
• Does AWS provide Designated 3rd party service?
28. Amazon Glacier received a third-party assessment from Cohasset Associates on how Amazon Glacier with Vault Lock can be used to meet the requirements of SEC Rule 17a-4(f) and CFTC 1.31(b)-(c).
29. Example control: 1-year record retention
• Deny the delete-archive operation
  – from anybody (root, administrators, users, business partners)
  – when ArchiveAgeInDays is <= 365 days
Archive age is computed from the time an archive lands in a vault.
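The 1-year control above can be expressed as a vault lock policy document. A sketch under stated assumptions – the region, account ID, and vault name are hypothetical, and the statement shape follows the IAM-style policy grammar Glacier uses:

```python
import json

def one_year_retention_policy(account_id, vault):
    """Build a vault lock policy denying deletes for archives under 365 days old."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "deny-delete-under-365-days",
            "Principal": "*",  # applies to everyone, including root
            "Effect": "Deny",
            "Action": "glacier:DeleteArchive",
            "Resource": "arn:aws:glacier:us-east-1:%s:vaults/%s" % (account_id, vault),
            "Condition": {
                # Deny while the archive is 365 days old or younger
                "NumericLessThanEquals": {"glacier:ArchiveAgeInDays": "365"}
            },
        }],
    }
    return json.dumps(policy)
```

This would be attached with `initiate_vault_lock`, then confirmed with `complete_vault_lock` – the two-step locking from slide 25 – after which the policy is immutable.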
32. Large Scale Disaster Recovery
Jamal Mazhar, Head of Infrastructure and DevOps @ Sprinklr
© 2017 Sprinklr, Inc. All rights reserved.
33. MOST COMPLETE SOCIAL MEDIA MANAGEMENT PLATFORM
Reach + Engage + Listen
advertising, marketing, commerce, care, research + insights
CUSTOMER EXPERIENCE MANAGEMENT PLATFORM
Integrate legacy systems, collaborate across silos, unified platform
experience cloud
Social is about managing the disruption of connected & empowered customers.
Digital transformation is about managing new expectations.
35. Sprinklr Platform - Key Technologies
Applications, DBs, Ops & Automation + custom code
AWS: S3, EC2, CloudFront, EBS + CloudWatch, Elastic Transcoder, ElastiCache, IAM, Route 53, SES, SNS, SQS, VPC, ELB, KMS
36. What is Disaster Recovery?
• Difference between High Availability and Disaster Recovery
• S3 is already highly available within the same region
• Different approaches to Disaster Recovery and their pros/cons and challenges:
  – Hot/Cold aka Active/Passive
  – Hot/Warm aka Active/Standby
  – Hot/Hot aka Active/Active
37. Sprinklr Disaster Recovery Approach
• Disaster Recovery SLAs:
  – Recovery Point Objective (RPO)
  – Recovery Time Objective (RTO)
• Use of two AWS regions
• Independent 3rd-party validation of our DR process
38. Scale and Scope of Sprinklr Disaster Recovery
• Large data size
• Thousands of EBS volumes for Mongo, Solr, Cassandra
• 1400+ big SSD i3 servers for 100+ Elasticsearch clusters
• Thousands of servers running close to 100 different services
• Each service has unique configuration and code
39. Three Major Challenges
1. Copying the data and configuration information quickly within the same region
2. Transferring the data to a different region and keeping it in sync daily
3. Automation and processes to restore the entire platform quickly
40. Challenge 1 – Copying Data and Configuration
• Traditional backup approaches didn’t work for Mongo and Solr
• EBS snapshots
• Backup status dashboard and process
• Limits we ran into due to scale:
  – concurrent snapshot limits
  – S3 I/O limits for Elasticsearch backup
41. Challenge 2 – Transferring and Syncing Data
• Hit limits in keeping petabytes of data across Virginia and Oregon in sync:
  – concurrent incremental snapshot copy limit
  – bandwidth limits
• What worked well from day one without tweaking:
  – S3 cross-region sync for Elasticsearch
  – S3 is eventually consistent; no issues in our use case
42. Challenge 3 – Restoring the Entire Platform
• Custom code to automate the entire platform sequence and dependencies:
  – launching servers
  – creating/mounting volumes from snapshots, code deployment
  – creating ELBs, updating DNS, application configuration
• Restoring over 1 PB of data for Elasticsearch clusters from S3
• Workarounds for API limits and throttling
• Workarounds for capacity limits
• Built a custom dashboard to provide restoration status
44. Key Results and Takeaways
• Keeping more than 4 petabytes of data in sync across different geo regions
• More than 50 TB of daily incremental data transfer using S3 and EBS volumes
• Bandwidth increases and concurrent-snapshot optimization reduced the daily data sync time from 36 hours to 8 hours, helping meet the 24-hour RPO
• Configuration metadata and code sync across regions is critical for DR
• One-click automation of the restore process allows us to bring the entire platform up, with 1000s of servers, in a different region within hours
45. Learn more…
STG201 – Storage State of the Union – Wed, 11:30 AM
STG304 – Deep Dive on Data Archiving with Amazon S3 & Amazon Glacier – Wed, 1:45 PM
STG313 – Big Data Breakthroughs – Wed, 12:15 PM or 7:00 PM
STG303 – Deep Dive on Amazon Glacier – Thurs, 1:45 PM
STG312 – Best Practices for Building a Data Lake in Amazon S3 & Amazon Glacier – Thurs, 3:15 PM
46. New Storage Training
For Enterprise Storage Engineers
• Learn how to architect and manage highly available solutions on AWS storage services
• Advance toward AWS certifications
• Help your organization migrate to the cloud faster
Online at www.aws.training
• Access 100+ new digital training courses, including advanced training on storage
• Deep dives on S3, EFS, and EBS
• Migrating and Tiering Storage to AWS (Hybrid Solutions)
At re:Invent
• Visit Hands-on Labs at the Venetian
• Attend a proctored “Introduction to EFS” Spotlight Lab on Thursday at 3 PM at the Venetian
• Meet storage experts at Ask the Experts in the Hands-on Labs room at the Venetian
47. Q&A
48. THANK YOU!