
Active Archiving with Amazon S3 and Tiering to Amazon Glacier - March 2017 AWS Online Tech Talks

Most organizations have data that they need to retain but that is accessed infrequently, if ever. When this data must be accessible at a moment’s notice, it’s hard to save money by moving it to archival storage, because access times on those platforms are slower. Now, customers are using Amazon S3 and Amazon Glacier for “Active Archiving” to reduce storage costs while maintaining the flexibility of instant access. In this tech talk, we’ll show you how to implement Active Archiving with AWS Object Storage services, and we’ll provide some real-world examples of how AWS customers are saving money with these capabilities today.

Learning Outcomes:
• Define Active Archiving, and understand how it is different from traditional cold archiving
• Review the cost modeling tools available to determine if Active Archiving is a good fit for your organization
• Learn about best practices for using AWS Object Storage features & functionality to enable Active Archiving

Published in: Technology

Active Archiving with Amazon S3 and Tiering to Amazon Glacier - March 2017 AWS Online Tech Talks

  1. 1. Active Archiving with Amazon S3 … and Tiering to Glacier • Marc Trimuschat, AWS Storage Services
  2. 2. Data has gravity … easier to move processing to the data • 4K/8K • Genomics • Seismic • Financial • Logs • IoT
  3. 3. The AWS Storage Portfolio: AWS Storage Platform and Solutions • Object: Amazon S3, Amazon Glacier • Block: Amazon EBS (persistent), Amazon EC2 Instance Store (ephemeral) • File: Amazon EFS • Cloud Data Migration: Direct Connect, Snow* data transport family, 3rd Party Connectors, Transfer Acceleration, Storage Gateway, Kinesis Firehose
  4. 4. Audio Archives – SoundCloud • World’s leading social sound platform • Audio files transcoded and stored in multiple formats • Stores PBs of data • Transcoded files served from Amazon S3 • Originals moved to Amazon Glacier for long-term retention
  5. 5. Satellite Image Archive • DigitalGlobe takes Satellite imagery of the Earth • 100PB image library = 6 billion square kilometers • 1PB new image every year • Images to be archived and retained for decades
  6. 6. Patient Data–Philips Healthcare • HealthSuite digital platform powered by AWS • 15 petabytes of patient data • Archived for decades (beyond the lifetime of patients) • Uses AWS HIPAA-eligible services in the BAA
  7. 7. Archive: Data retained for the long term, for compliance or potential future reference Data archiving needs are growing everywhere • Media assets, 4K, 8K • Health care/life sciences • Financial services • Regulated industries • Oil and gas/geospatial • Digital preservation • Long-term backups • Logs
  8. 8. AWS Storage Review
  9. 9. Choice of storage classes • Standard: active data • Standard - Infrequent Access: infrequently accessed data • Amazon Glacier: archive data
  10. 10. Data Lifecycle Management • Transition Standard to Standard-IA • Transition Standard-IA to Amazon Glacier • Expiration lifecycle policy • Versioning support • Prefix support [Chart: data access frequency over time, from T to T + 365 days]
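The transitions listed on this slide can be modeled as a simple age-based lookup. The sketch below is illustrative only: the 30-day and 90-day thresholds and the 365-day expiration are assumed values rather than figures from the talk, and `storage_class_at` is a hypothetical helper, not an AWS API.

```python
from typing import Optional

# Assumed example schedule: (minimum age in days, storage class).
# The thresholds are illustrative; the talk does not prescribe specific values.
TRANSITIONS = [(0, "STANDARD"), (30, "STANDARD_IA"), (90, "GLACIER")]
EXPIRATION_DAYS = 365  # assumed expiration age


def storage_class_at(age_days: int) -> Optional[str]:
    """Return the storage class an object would be in at a given age,
    or None once the expiration rule would have deleted it."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days >= EXPIRATION_DAYS:
        return None
    current = TRANSITIONS[0][1]
    # Keep the class of the last threshold the object has already passed.
    for threshold, storage_class in TRANSITIONS:
        if age_days >= threshold:
            current = storage_class
    return current
```

Under these assumed thresholds, an object that is 5 days old is still in Standard, one that is 150 days old is in Glacier, and one that is 365 days old has expired.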
  11. 11. Amazon S3: What’s New • Cross-Region Replication • Lifecycle Policy • Data Classification & Management • Event Notifications • CloudWatch Metrics • S3 Inventory • Audit with CloudTrail Data Events • Storage Analytics • Storage classes: Standard, Standard - Infrequent Access, Amazon Glacier
  12. 12. Data-driven storage management for S3 • Analyze storage usage to transition the right data to the right storage class • Understand how storage usage changes as your S3 objects get older • Discover how much of your storage is retrieved over time
  13. 13. Data Classification and Management: Manage your data based on what it is as opposed to where it’s located • Easy data management • Classify your data • Tag your objects with key-value pairs • Write policies once based on the type of data • Classification, Lifecycle Policy, Access Control
  14. 14. Amazon Glacier • Extremely low-cost archive storage service, starting at $0.004/GB/mo • 3 retrieval options: Expedited (1-5 min), Standard (3-5 hrs), Bulk (5-12 hrs) • 99.999999999% durability (5-6 orders of magnitude higher than 2 copies of tape) • All data is encrypted at rest • Features: compliance, data management, cost management, audit logging
  15. 15. Glacier: Key Concepts • Vaults – Container for archives, up to 1,000 vaults per account • Archives – basic unit, write-once, 40TB max, unlimited archives • Inventory – Cold index of archives refreshed every 24 hours • Access – Three ways to access Glacier • Uploads – Multi-part, lifecycle, cost optimizations, Snowball • Data management – Vault Lock, tagging, audit logs • Retrievals – Retrieval policies, range retrievals, new feature announcement
  16. 16. Archive Consideration 1 – Total Archive Cost
  17. 17. Traditional archiving approaches • Tape libraries, robots, drives, media • Onsite (online and offline) • Offsite tape out/vaulting • Specialized software and personnel • Tape refresh every 3-5 years
  18. 18. How can AWS help with your archival? Metered usage: Pay as you go No capital investment No commitment No risky capacity planning Avoid risks of physical media handling Control your geographic locality for performance and compliance
  19. 19. Consideration 2 – Durability
  20. 20. Amazon S3 and Glacier Durability 4 9s durability 5 9s durability S3 - IA Glacier 11 9s durability
  21. 21. 99.999999999% Durability Durability for long-term preservation Built-in Fixity Checking Automatic recovery
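To make the 11-nines figure concrete, the sketch below converts it into an expected number of objects lost per year. It assumes that durability can be read as an independent per-object annual survival probability, which is a simplification made for illustration; `expected_annual_loss` is a hypothetical helper.

```python
from decimal import Decimal

DURABILITY = Decimal("0.99999999999")  # 11 nines, as quoted on the slide


def expected_annual_loss(object_count: int) -> Decimal:
    """Expected number of objects lost per year, treating durability as an
    independent per-object annual survival probability (an assumption)."""
    return Decimal(object_count) * (Decimal(1) - DURABILITY)
```

Under that assumption, storing 10,000,000 objects gives an expected loss of 0.0001 objects per year, or about one object every 10,000 years on average.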
  22. 22. Consideration 3 – Accessibility
  23. 23. Amazon Glacier – Data Retrieval Tiers • Expedited Retrieval ($0.03/GB): emergency access, 1-5 minutes, last-minute play-out schedule swap • Standard Retrieval ($0.01/GB): current model, 3-5 hours, disaster recovery • Bulk Retrieval ($0.0025/GB): batch/bulk access, 5-12 hours, PB-scale re-transcoding or video/image analysis • On-site tape replacement • Off-site tape replacement
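The per-GB retrieval prices on this slide lend themselves to a quick cost comparison. The sketch below assumes the three prices correspond to Expedited ($0.03), Standard ($0.01), and Bulk ($0.0025), and it covers the per-GB charge only; any other fees are ignored. `retrieval_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Per-GB retrieval prices from the slide (2017 figures). The mapping of
# price to tier is an assumption based on the order shown.
RETRIEVAL_PRICE_PER_GB = {
    "expedited": Decimal("0.03"),
    "standard": Decimal("0.01"),
    "bulk": Decimal("0.0025"),
}


def retrieval_cost(size_gb, tier: str) -> Decimal:
    """Return the per-GB retrieval charge in USD for the given tier."""
    try:
        price = RETRIEVAL_PRICE_PER_GB[tier.lower()]
    except KeyError:
        raise ValueError(f"unknown retrieval tier: {tier!r}") from None
    return Decimal(str(size_gb)) * price
```

For example, retrieving 1,000 GB would cost $30 with Expedited, $10 with Standard, and $2.50 with Bulk under these prices.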
  24. 24. Consideration 4 - Application & Data Management
  25. 25. Accessing Glacier 1. S3 lifecycle integration 2. Direct Glacier API/SDK 3. Third party tools and gateways FastGlacier
  26. 26. Use Glacier via S3 Lifecycle • S3 Standard: active data, synchronous access, $0.023/GB/mo. • S3 - Infrequent Access: infrequently accessed data, synchronous access, $0.0125/GB/mo. • Amazon Glacier: archive data, async access, $0.004/GB/mo.
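A rough monthly storage bill can be estimated from the three prices on this slide. The sketch below assumes $0.023 applies to S3 Standard, $0.0125 to Standard-IA, and $0.004 to Glacier. It ignores request, retrieval, and minimum-duration charges, and `monthly_storage_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Storage prices per GB-month from the slide (2017 figures). The mapping of
# price to storage class is an assumption.
STORAGE_PRICE_PER_GB_MONTH = {
    "STANDARD": Decimal("0.023"),
    "STANDARD_IA": Decimal("0.0125"),
    "GLACIER": Decimal("0.004"),
}


def monthly_storage_cost(gb_by_class: dict) -> Decimal:
    """Sum the monthly storage charge in USD across storage classes.

    gb_by_class maps a storage class name to the number of GB held in it.
    """
    return sum(
        (STORAGE_PRICE_PER_GB_MONTH[cls] * Decimal(str(gb))
         for cls, gb in gb_by_class.items()),
        Decimal(0),
    )
```

With these prices, 100,000 GB held entirely in Standard costs $2,300 per month, while the same data split into 10,000 GB Standard, 20,000 GB Standard-IA, and 70,000 GB Glacier costs $760 per month.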
  27. 27. Data lifecycle management • Transition Standard to Standard-IA • Transition Standard-IA to Amazon Glacier • Transition based on object tags • Expiration and versioning [Chart: data access frequency over time, from T to T + 365 days]
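A tag-based lifecycle rule of the kind this slide describes might look like the following. This is a sketch, not the configuration used in the talk: the tag key and value, the rule ID format, and the day counts are all assumed. The dictionary follows the general shape of an S3 lifecycle configuration, as accepted for example by boto3's `put_bucket_lifecycle_configuration`. S3 enforces its own minimum ages for transitions, and those are not checked here.

```python
import json


def tag_based_lifecycle_rule(tag_key, tag_value,
                             ia_days=30, glacier_days=90, expire_days=365):
    """Build one lifecycle rule that applies only to objects carrying the
    given tag. All day counts are illustrative defaults."""
    if not (0 < ia_days < glacier_days < expire_days):
        raise ValueError("expected ia_days < glacier_days < expire_days")
    return {
        "ID": f"archive-{tag_key}-{tag_value}",  # hypothetical naming scheme
        "Filter": {"Tag": {"Key": tag_key, "Value": tag_value}},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_days},
    }


# Hypothetical tag: objects tagged type=video follow this schedule.
lifecycle_configuration = {"Rules": [tag_based_lifecycle_rule("type", "video")]}

if __name__ == "__main__":
    print(json.dumps(lifecycle_configuration, indent=2))
```

Filtering on a tag rather than a prefix lets the same policy follow the data wherever it sits in the bucket, which matches the "manage data based on what it is" idea from the earlier classification slide.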
  28. 28. Transition older videos to Standard-IA
  29. 29. Glacier Direct Upload – The Basics 1. Create vault 2. Configure access policies (ArchiveApp user policy: Effect: Allow; Resource: arn:aws:glacier:<accountId>:vaults/Films; Action: glacier:UploadArchive) 3. Upload archives: UploadArchive(data) -> Archive ID
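The access policy in step 2 can be written out as a full IAM policy document. The sketch below is one possible rendering: the slide abbreviates the ARN, whereas a complete Glacier vault ARN also carries a region, so the region and account ID used here are placeholder assumptions. `upload_only_policy` is a hypothetical helper.

```python
import json


def upload_only_policy(region: str, account_id: str, vault_name: str) -> dict:
    """Build an IAM policy that allows only glacier:UploadArchive on one vault."""
    vault_arn = f"arn:aws:glacier:{region}:{account_id}:vaults/{vault_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["glacier:UploadArchive"],
                "Resource": vault_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder region and account ID; "Films" is the vault name on the slide.
    print(json.dumps(upload_only_policy("us-west-2", "123456789012", "Films"), indent=2))
```

Granting only the upload action keeps the archiving application from deleting or retrieving archives it has written.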
  30. 30. Uploading Data: Inter- or Sneaker- net AWS Direct Connect Dedicated bandwidth between your site and AWS Internet Transfer data in a secure SSL tunnel over the public Internet AWS Import/Export Snowball Physical transfer of media into and out of AWS
  31. 31. AWS Snowball Edge Petabyte-scale hybrid device with onboard compute and storage • 100 TB local storage • Local compute equivalent to an Amazon EC2 m4.4xlarge instance • 10GBase-T, 10/25Gb SFP28, and 40Gb QSFP+ copper, and optical networking • Ruggedized and rack-mountable RE:INVENT 2016 LAUNCH
  32. 32. Use cases: AWS Import/Export Snowball Cloud Migration Disaster Recovery Data Center Decommission Content Distribution
  33. 33. AWS storage migration expansion: AWS Snowmobile
  34. 34. Storage Gateway Enables Hybrid Storage Solutions Use standard storage protocols to access AWS storage services Customer Premises File Volume Tape Amazon EBS snapshots Amazon S3 Amazon Glacier AWS IAM AWS KMS AWS CloudTrail Amazon CloudWatch Internet Direct Connect Amazon VPC NFS Enterprise storage Backup servers Application servers iSCSI VTL
  35. 35. Which option should I choose? • Use S3 lifecycle managed Amazon Glacier if the S3 object keys are sufficient for index/search capability • Use Amazon Glacier directly if you already plan to store more metadata/indices in a database • Use 3rd party tools or AWS Storage Gateway to minimize coding
  36. 36. Media Archive Use Case
  37. 37. corporate data center Media Archive and Metadata (cloud transition) Onsite Archive Offsite Tape Archive Hierarchical Storage Manager Metadata (Asset Manager) Processing Tasks On-Premise Tape
  38. 38. Onsite Archive Hierarchical Storage Manager Metadata (Asset Manager) Processing Tasks corporate data center AWS Region Amazon Glacier Cloud DAM (Syncing Metadata from on-prem) Amazon Direct Connect Offsite Tape Archive On-Premise Tape Media Archive (transition to the cloud)
  39. 39. Onsite Archive Hierarchical Storage Manager Metadata (Asset Manager) Processing Tasks corporate data center AWS Region Amazon Glacier Cloud DAM (Syncing Metadata from on-prem) Amazon S3 Cloud Based Processing Tasks Amazon Direct Connect On-Premise Tape Offsite Tape Archive Media Archive (transition to the cloud)
  40. 40. Onsite Archive Hierarchical Storage Manager Metadata (Asset Manager) Processing Tasks corporate data center AWS Region Amazon Glacier Cloud DAM (Syncing Metadata from on-prem) Amazon S3 Cloud Based Processing Tasks Amazon Direct Connect Onsite Cache Offsite Tape Archive On-Premise Tape Media Archive (transition to the cloud)
  41. 41. Media Solution: Sony DADC Problem Statement: • Challenged by on-prem legacy infrastructure. • Provide a performant, secure, economical media distribution solution. • Decrease time to market for their customer’s finished content. Use of AWS: • EC2 content processing and SWF, SQS, SNS for media workflow automation • S3 for storage, Glacier for content archive • CloudFront for OTT. Business Benefits: • Workflow pipelines can be run in a highly parallelized fashion through AWS elastic scalability. • Significantly shorten content delivery SLA with a new AWS enabled target of 1-hr. • Fully migrating away from on-prem infrastructure. On-demand cloud-based media supply chain and delivery solution
  42. 42. • Media distribution backbone (Ve.nue platform) • Over-The-Top (OTT) broadcast service • 20PBs of media assets, 1MM+ hours of high-res content • Assets to be archived and retained for decades Video archives
  43. 43. Comprehensive media lifecycle @SonyDADCNMS
  44. 44. “If physical deliveries can happen within one hour based on unpredictable requests, surely we are able to exceed such expectations digitally” @SonyDADCNMS
  45. 45. Sony Migration The Challenge • Seamlessly migrate a platform that enables content delivery across all devices and more than 1,200 distribution points worldwide • Store 20 petabytes of motion picture and television content • Equating to 1,000,000+ hours of content • At a growth curve of ~1 petabyte every quarter Desired Goals: • One-hour delivery turnaround time • Agile, scalable, predictable cost model & infrastructure • Investing in innovation vs. hardware @SonyDADCNMS
  46. 46. On-Premise Asset Storage Workflow @SonyDADCNMS
  47. 47. AWS Cloud-based Asset Storage Workflow @SonyDADCNMS
  48. 48. Glacier vs. On-Prem Cost Comparison @SonyDADCNMS
  49. 49. Consideration 5 - Compliance and Retention
  50. 50. Amazon Glacier Vault Lock allows you to easily set compliance controls on individual vaults and enforce them via a lockable policy Time-based retention MFA authentication Controls govern all records in a vault Immutable policy Two-step locking Compliance storage with Vault Lock
  51. 51. Glacier Vault Lock • Non-overwrite, non-erasable records • Time-based retention with “ArchiveAgeInDays” control • Policy lockdown (strong governance) • Legal hold with vault-level tags • Configure designated third-party access and grant temporary access Amazon Glacier received a third-party assessment from Cohasset Associates on how Amazon Glacier with Vault Lock can be used to meet the requirements of SEC 17a-4(f) and CFTC 1.31(b)-(c).
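The time-based retention control named on this slide is expressed as a condition on the `glacier:ArchiveAgeInDays` key in a Vault Lock policy. The sketch below builds such a policy as a plain dictionary. The statement ID, the example ARN, and the retention period are assumptions, and `retention_lock_policy` is a hypothetical helper.

```python
import json


def retention_lock_policy(vault_arn: str, retention_days: int) -> dict:
    """Build a Vault Lock policy that denies archive deletion until each
    archive is at least retention_days old."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-delete-before-retention-ends",  # assumed name
                "Principal": "*",
                "Effect": "Deny",
                "Action": "glacier:DeleteArchive",
                "Resource": [vault_arn],
                "Condition": {
                    "NumericLessThan": {
                        "glacier:ArchiveAgeInDays": str(retention_days)
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN and a one-year retention period, both for illustration.
    arn = "arn:aws:glacier:us-west-2:123456789012:vaults/examplevault"
    print(json.dumps(retention_lock_policy(arn, 365), indent=2))
```

Because the policy becomes immutable once the lock is completed, a retention period like this is worth testing carefully before the second locking step.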
  52. 52. Proofpoint • Cloud-based security and compliance for the enterprise: threat research, email, mobile, social, digital risk • Founded 2002, public in 2012 • $350M annual revenue, $3B market cap
  53. 53. Proofpoint SocialPatrol • Policy controls and enforcement for social • Combats fraudulent brand impersonation • Moderates content at scale • Ensures compliance in publishing • Integrates with social APIs • 150+ classifiers using NLP and ML • Text, links, images, meta data • Ingesting >1M social posts per day • Built in AWS
  54. 54. Proofpoint SocialPatrol Archive with Glacier • SEC Rule 17a-4(f)-compliant archive, purpose-built for social, enabled by Amazon Glacier and Vault Lock PFPT in AWS Policy engine MySQL/C*/SolrSocial Amazon Glacier & Vault Lock
  55. 55. Proofpoint SocialPatrol Archive • The customer specifies the retention period in Proofpoint Social:
  56. 56. Proofpoint SocialPatrol Archive • Via AWS API we create a vault for that customer:
  57. 57. Proofpoint SocialPatrol Archive • Via AWS API, we lock the vault, and specify policy to observe a legal hold via a tag.
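A legal hold driven by a vault-level tag can be expressed as a Deny statement conditioned on the vault's tags. The sketch below shows one way to write such a statement. It is not Proofpoint's actual policy: the tag key `LegalHold`, the statement ID, and the example ARN are assumptions, and `legal_hold_statement` is a hypothetical helper.

```python
import json


def legal_hold_statement(vault_arn: str, tag_key: str = "LegalHold") -> dict:
    """Build a policy statement that denies archive deletion while the vault
    carries the legal-hold tag with the value "true" or an empty value."""
    return {
        "Sid": "deny-delete-while-on-legal-hold",  # assumed name
        "Principal": "*",
        "Effect": "Deny",
        "Action": ["glacier:DeleteArchive"],
        "Resource": [vault_arn],
        "Condition": {
            "StringLike": {f"glacier:ResourceTag/{tag_key}": ["true", ""]}
        },
    }


if __name__ == "__main__":
    arn = "arn:aws:glacier:us-west-2:123456789012:vaults/examplevault"
    policy = {"Version": "2012-10-17", "Statement": [legal_hold_statement(arn)]}
    print(json.dumps(policy, indent=2))
```

With a statement like this in the locked policy, placing or lifting a hold becomes a matter of changing the vault tag, while the policy itself stays untouched.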
  58. 58. Active-Archive Resources • Amazon S3: https://aws.amazon.com/s3/ • Amazon S3 Deep Dive (re:Invent 2016): https://www.youtube.com/watch?v=bMhWWkhydFQ&t=249s • Amazon Glacier: https://aws.amazon.com/glacier/ • Amazon Glacier Deep Dive (re:Invent 2016): https://www.youtube.com/watch?v=dfr9mBcDJ-U • WORM Compliance Assessment: https://aws.amazon.com/blogs/aws/glacier-cohasset-assessment/ • Sony Case Study: https://aws.amazon.com/solutions/case-studies/sony-dadc/ • Backup & Archive TCO Calculator: http://www.backuparchive.awstcocalculator.com/
  59. 59. Thank You!