
Data_Analytics_and_AI_ML


  1. Big Data Analytics and Machine Learning on AWS
     © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  2. WHAT IS BIG DATA?
  3. Big Data and the 3Vs
     • Variety
     • Velocity
     • Volume
  4. The Cloud Advantage
     • Elastic and highly scalable
     • No upfront capital expense
     • Only pay for what you use
     • Available on-demand
  5. BIG DATA ANALYTICS
  6. Examples of Business Outcomes and Insights
     • Security threat detection
     • User Behavior Analysis
     • Enhanced customer experience
     • Business Intelligence
     • Spending optimization
     • Real-time alerting
  7. Examples of Big Data Sources
     • Relational Databases
     • NoSQL Databases
     • Web servers
     • Mobile phones/Tablets
     • 3rd party feeds
     • IoT
     • Clickstream
  8. Examples of AWS Services for Big Data Analytics
     • EMR, EC2, Glacier, S3, Import Export, Kinesis, Direct Connect, Machine Learning, Redshift, DynamoDB
     • AWS Database Migration Service, AWS Lambda, AWS IoT, AWS Data Pipeline, Amazon Kinesis Analytics, Amazon SNS, AWS Snowball, Amazon SWF, Amazon Athena, Amazon QuickSight, Amazon Aurora, AWS Glue
  9. Amazon S3: Object Storage
     • Security and Compliance: three different forms of encryption; encrypts data in transit when replicating across regions; log and monitor with CloudTrail; use ML to discover and protect sensitive data with Macie
     • Flexible Management: classify, report, and visualize data usage trends; objects can be tagged to see storage consumption, cost, and security; build lifecycle policies to automate tiering and retention
     • Durability, Availability & Scalability: built for eleven nines of durability; data distributed across 3 physical facilities in an AWS region; can be automatically replicated to any other AWS region
     • Query in Place: run analytics & ML on the data lake without data movement; S3 Select can retrieve a subset of the data, improving analytics performance by up to 400%
  10. Amazon Redshift: Data Warehousing
     • Fast at scale: columnar storage technology to improve I/O efficiency and scale query performance
     • Secure: audit everything; encrypt data end-to-end; extensive certification and compliance
     • Open file formats: analyze optimized data formats on the latest SSD, and all open data formats in Amazon S3
     • Inexpensive: as low as $1,000 per terabyte per year, 1/10th the cost of traditional data warehouse solutions; start at $0.25 per hour
  11. Amazon Redshift Spectrum: extend the data warehouse to exabytes of data in the S3 data lake
     • Diagram: Redshift data, Redshift Spectrum query engine, S3 data lake
     • Exabyte-scale Redshift SQL queries against S3
     • Join data across Redshift and S3
     • Scale compute and storage separately
     • Stable query performance and unlimited concurrency
     • CSV, ORC, Grok, Avro, & Parquet data formats
     • Pay only for the amount of data scanned
  12. Amazon EMR: Big Data Processing
     • Low cost: flexible billing with per-second billing, EC2 spot, reserved instances, and auto-scaling to reduce costs 50–80%
     • Easy: launch fully managed Hadoop & Spark in minutes; no cluster setup, node provisioning, or cluster tuning
     • Latest versions: updated with the latest open source frameworks within 30 days of release
     • Use S3 storage: process data directly in the S3 data lake securely with high performance using the EMRFS connector
  13. Amazon Elasticsearch Service
     • Easy to Use: fully managed; deploy production-ready clusters in minutes
     • Secure: secure access with VPC to keep all traffic within the AWS network
     • Open: direct access to Elasticsearch open-source APIs; supports Logstash and Kibana
     • Available: zone awareness replicates data between two AZs; automatically monitors & replaces failed nodes
  14. Amazon Kinesis: Real Time
     • Kinesis Data Firehose: load data streams into AWS data stores
     • Kinesis Data Streams: build custom applications that analyze data streams
     • Kinesis Video Streams (new): capture, process, and store video streams for analytics
     • Kinesis Data Analytics: analyze data streams with SQL
  15. Amazon Athena: Interactive Analysis
     • Interactive query service to analyze data in Amazon S3 using standard SQL
     • No infrastructure to set up or manage and no data to load
     • Ability to run SQL queries on data archived in Amazon Glacier (coming soon)
     • Query Instantly: zero setup cost; just point to S3 and start querying
     • Open: ANSI SQL interface, JDBC/ODBC drivers, multiple formats, compression types, and complex joins and data types
     • Easy: serverless, with zero infrastructure and zero administration; integrated with QuickSight
     • Pay per query: pay only for queries run; save 30–90% on per-query costs through compression
  16. Amazon QuickSight
     • Easy
     • Empower everyone
     • Seamless connectivity
     • Fast analysis
     • Serverless
  17. Modern data architecture
     • Flow from ORIGIN to DESTINATION: Data sources, Ingest, Scale (Batch) and Speed (Real-time), Serving, Insight consumers
  18. Modern data architecture, adding the insight consumers
     • Data analysts; Data scientists; Business users; Engagement platforms; Automation / events
  19. Modern data architecture, adding the data sources
     • Transactions; Web logs / cookies; ERP; Connected devices; Social media
  20. Modern data architecture, adding the serving layer
     • Outcome: insights to enhance business applications, new digital services
     • Data Warehouse: Amazon Redshift
     • Legacy Apps: Amazon RDS
     • Schemaless: Amazon Elasticsearch
     • Direct Query: Amazon Athena
     • Near-Zero Latency: Amazon DynamoDB
     • Semi/Unstructured: Amazon EMR
  21. Modern data architecture, adding ingest and the data lake
     • Ingest: AWS Database Migration; AWS Direct Connect; Internet Interfaces; Amazon Kinesis
     • Staged Data (Data Lake): Amazon S3
  22. Modern data architecture, adding the batch path and security services
     • Raw Data: Amazon S3
     • ETL: Amazon EMR
     • Advanced Analytics: MLlib
     • AWS CloudTrail; AWS IAM; Amazon CloudWatch; AWS KMS
  23. Modern data architecture, adding the real-time path
     • Event Capture: Amazon Kinesis
     • Stream Analysis: Amazon EMR
  24. Modern data architecture, completed with event handling
     • Event Scoring: Amazon AI
     • Event Handler: AWS Lambda
     • Response Handler: AWS Lambda
  25. A Sample Batch Analytics Pipeline
     • Ad-hoc access to data using Athena
     • Athena can query aggregated datasets as well
  26. Smart Applications | Machine Learning
  27. Clickstream Analysis
  28. Customer Success. Powered by AWS.
  29. Sysco is the leader in selling, marketing, and distributing food.
     • Challenge: large volumes of data in multiple systems, plus high costs from maintaining an on-premises EDW deployment
     • Solution: migrated their on-premises solution to the cloud with Redshift, S3, EMR, and Athena
  30. Analytics on the Data Lake
     • Sysco is the leader in selling, marketing, & distributing food
     • Challenge: large volumes of data in multiple systems
     • Consolidated data into a single S3 data lake
     • Data scientists use EMR notebooks; business users use Athena & Amazon Redshift Spectrum for reporting
     • Diagram: Marketing data source and Other source systems; Ingest raw data from multiple sources (S3); Data preparation and ETL process (EMR); Transformed data (S3); Redshift, Redshift Spectrum, Athena
  31. FINRA oversees > 3,000 securities firms doing business in the United States.
     • Challenge: FINRA’s legacy system did not scale well
     • Up to 75 billion events per day
     • Run complex surveillance queries over 20+ PB of data
     • Solution: migrated their big data appliance to an S3 Data Lake and used EMR for ingestion and processing
     • Migrated to RDS and testing Aurora
  32. FINRA uses S3 to Build Data Lake with EMR
     • Required fast access across trillions of trade records (20PB+)
     • Migrated from on-premises system
     • Use Apache HBase on Amazon EMR to store and serve this data
     • Use EMR engines (Spark, Presto, and Hive) to process data
     • Lower costs by 60% over on-premises system
     • Diagram: Spark on EMR; Presto on EMR; Hive on EMR; HBase on EMR; S3; Herd Metastore
  33. Nasdaq operates financial exchanges around the world, and processes large volumes of data.
     • Challenge: Nasdaq wanted to make their large historical data footprint available to analyze as a single dataset
     • Solution: use Amazon Redshift for interactive querying
     • Solution: use Amazon S3 as a Data Lake, and Presto on EMR to process historical data
  34. Nasdaq Uses AWS to Build a Data Lake
     • Migrated legacy on-premises warehouse to Amazon Redshift
     • 4.8B rows inserted per trading day (orders, trades, quotes)
     • Ingests data from multiple sources, validates it, and stages it in S3
     • Redshift reads data out of S3 for fast queries
     • Presto on EMR and S3 used for analysis of the massive historical data set
     • Diagram: data from all 7 exchanges operated by Nasdaq (orders, quotes, trade executions); Flat files; Operational Databases; S3; EMR; Redshift; SQL Clients
  35. Data Lake Overview
  36. What is a Data Lake?
     • A centralized repository for both structured and unstructured data
     • Store data as-is in open-source file formats to enable direct analytics
  37. Why a Data Lake?
     • Decouple storage from compute, allowing you to scale
     • Enable advanced analytics across all of your data sources
     • Reduce complexity in ETL and operational overhead
     • Future extensibility as new database and analytics technologies are invented
  38. Traditionally, Analytics Looked Like This
     • Diagram: OLTP, ERP, CRM, LOB feed a Data Warehouse, which feeds Business Intelligence
     • Relational Data
     • TBs-PBs Scale
     • Schema Defined Prior to Data Load
     • Operational and Ad Hoc Reporting
     • Large Initial Capex + $$K / TB / Year
  39. Data Lakes Extend the Traditional Approach
     • Diagram: OLTP, ERP, CRM, LOB, Web, Sensors, Social, Devices feed a Data Lake with a Catalog; consumers are DW Queries, Big Data Processing, Interactive, Real-Time, Business Intelligence, Machine Learning
     • Relational and Non-Relational Data
     • TB-EBs Scale
     • Schema on Read
     • Diverse Analytical Engines
     • Decouples (low cost) Storage and Compute
     • All Data in one place, a Single Source of Truth
  40. Benefits of a Data Lake – All Data in One Place
     • “Why is the data distributed in many locations? Where is the single source of truth?”
     • Store and analyze all of your data, from all of your sources, in one centralized location.
  41. Benefits of a Data Lake – Quick Ingest
     • “How can I collect data quickly from various sources and store it efficiently?”
     • Quickly ingest data without needing to force it into a pre-defined schema.
  42. Benefits of a Data Lake – Storage vs Compute
     • “How can I scale up with the volume of data being generated?”
     • Separating your storage and compute allows you to scale each component as required.
  43. Benefits of a Data Lake – Schema on Read
     • “Is there a way I can apply multiple analytics and processing frameworks to the same data?”
     • A Data Lake enables ad-hoc analysis by applying schemas on read, not write.
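To make "schema on read" concrete, here is a minimal sketch in plain Python. It is not tied to any AWS service: the raw records, field names, and both schemas are made up for illustration. The point is only that the stored data stays as-is while each consumer applies its own projection and types when reading.

```python
import json

# Raw events stored as-is, one JSON object per line (hypothetical fields).
RAW_LINES = [
    '{"user": "u1", "page": "/home", "ms": 120, "ts": "2019-07-16T14:00:00"}',
    '{"user": "u2", "page": "/cart", "ms": 340, "ts": "2019-07-16T14:00:05"}',
    '{"user": "u1", "page": "/cart", "ms": 95, "ts": "2019-07-16T14:01:10"}',
]

# Two hypothetical read-time schemas over the same raw data.
# Each maps an output column to (source field, type converter).
CLICK_SCHEMA = {"user": ("user", str), "page": ("page", str)}
LATENCY_SCHEMA = {"page": ("page", str), "latency_ms": ("ms", int)}


def read_with_schema(lines, schema):
    """Parse raw lines and project them through a schema at read time."""
    rows = []
    for line in lines:
        record = json.loads(line)
        rows.append({col: conv(record[src]) for col, (src, conv) in schema.items()})
    return rows


clicks = read_with_schema(RAW_LINES, CLICK_SCHEMA)
latencies = read_with_schema(RAW_LINES, LATENCY_SCHEMA)
print(clicks[0])
print(latencies[1])
```

Nothing was rewritten or reloaded to serve the second consumer; only the schema passed at read time changed.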
  44. Building a Data lake on AWS
  45. Why AWS?
     • Implementing a Data Lake architecture requires a broad set of tools and technologies to serve an increasingly diverse set of applications and use cases.
  46. Data Lake on AWS
     • Central Storage (scalable, secure, cost-effective): S3
     • Catalog & Search: AWS Glue; Amazon DynamoDB; Amazon Elasticsearch Service
     • Access & User Interfaces: AWS AppSync; Amazon API Gateway; Amazon Cognito
     • Manage & Secure: AWS KMS; AWS CloudTrail; AWS IAM; Amazon CloudWatch
     • Data Ingestion: AWS Snowball; AWS Storage Gateway; Amazon Kinesis Data Firehose; AWS Direct Connect; AWS Database Migration Service
     • Analytics & Serving: Amazon Athena; Amazon EMR; AWS Glue; Amazon Redshift; Amazon DynamoDB; Amazon QuickSight; Amazon Kinesis; Amazon Elasticsearch Service; Amazon Neptune; Amazon RDS
  47. Why Amazon S3 for a Data Lake?
     • Durable: designed for 11 9s of durability
     • Available: designed for 99.99% availability
     • High performance: Multiple upload; Range GET
     • Scalable: store as much as you need; scale storage and compute independently; no minimum usage commitments
     • Integrated: Amazon EMR; Amazon Redshift; Amazon DynamoDB
     • Easy to use: simple REST API; AWS SDKs; read-after-create consistency; event notification; lifecycle policies
  48. What can you do with a Data Lake?
  49. Query Directly with Amazon Athena
  50. Analyze with Hadoop on Amazon EMR
  51. Create Visualizations with Amazon QuickSight
  52. Train ML Models with Amazon SageMaker
  53. Create a Central Data Catalog with AWS Glue
  54. Load into Downstream Services
     • Amazon Redshift: run complex analytic queries against petabytes of structured data
     • Amazon DynamoDB: a NoSQL database service that delivers consistent, single-digit millisecond latency at any scale
     • Amazon Aurora: a MySQL and PostgreSQL compatible relational database built for the cloud
     • Amazon Elasticsearch: delivers Elasticsearch’s real-time analytics capabilities alongside the availability, scalability, and security that production workloads require
  55. Data Movement into the Data Lake
  56. Data Sources
     • Files
     • Logs
     • Streams
     • Databases
  57. Data Sources - Databases
     • Diagram: Databases to Amazon S3
  58. Change Data Capture
     • Techniques to Capture Changes: Timestamp; Diff Comparison; Triggers; Transaction Log
  59. Change Data Capture – Timestamp
     • First table (row timestamps): 4/18/18 300; 3/12/18 800; 9/25/17 230; 2/04/18 100
     • Second table (row timestamps): 4/18/18 300; 7/16/19 1600; 9/25/17 230; 2/04/18 100
     • Last Run: 7/16/19 1400
     • Diagram: Kinesis Data Firehose to Amazon S3
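A minimal sketch of timestamp-based change capture, using the values on the slide: one row carries a timestamp (7/16/19 1600) later than the "Last Run" watermark (7/16/19 1400), so it is the only row picked up. Reading "300" as 03:00 and "1600" as 16:00 is an assumption, and the row ids and function names are hypothetical.

```python
from datetime import datetime

# Rows as (id, last_modified). Dates follow the slide; the ids are made up and
# the time-of-day interpretation ("300" as 03:00) is an assumption.
ROWS = [
    (1, datetime(2018, 4, 18, 3, 0)),
    (2, datetime(2019, 7, 16, 16, 0)),
    (3, datetime(2017, 9, 25, 2, 30)),
    (4, datetime(2018, 2, 4, 1, 0)),
]


def changed_since(rows, last_run):
    """Return rows whose last_modified is strictly later than last_run."""
    return [row for row in rows if row[1] > last_run]


def run_capture(rows, last_run, now):
    """One capture pass: collect changed rows and advance the watermark."""
    return changed_since(rows, last_run), now


last_run = datetime(2019, 7, 16, 14, 0)
changes, new_watermark = run_capture(ROWS, last_run, datetime(2019, 7, 16, 17, 0))
print([row_id for row_id, _ in changes])  # [2]
```

A general limitation of this technique is that deleted rows leave no timestamp behind, so deletions are not visible to a timestamp scan.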
  60. Change Data Capture – Diff Compare
     • Snapshots: 6/15/18 0300 (20180615T0300) and 6/16/18 0300 (20180616T0300)
     • Diagram: Diff Compare of the two snapshots, then Kinesis Data Firehose to Amazon S3
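The diff-compare technique can be sketched as a comparison of two full snapshots keyed by primary key. The snapshot contents and function name below are hypothetical; only the idea of comparing a 20180615T0300 snapshot with a 20180616T0300 snapshot comes from the slide.

```python
def diff_snapshots(old, new):
    """Compare two snapshots (dicts keyed by primary key) and classify changes."""
    inserts = {key: new[key] for key in new.keys() - old.keys()}
    deletes = {key: old[key] for key in old.keys() - new.keys()}
    updates = {
        key: new[key] for key in old.keys() & new.keys() if old[key] != new[key]
    }
    return {"insert": inserts, "update": updates, "delete": deletes}


# Hypothetical snapshot contents for the two snapshot times on the slide.
snapshot_20180615T0300 = {1: "alice", 2: "bob", 3: "carol"}
snapshot_20180616T0300 = {1: "alice", 2: "robert", 4: "dave"}

changes = diff_snapshots(snapshot_20180615T0300, snapshot_20180616T0300)
print(changes)
```

Unlike the timestamp approach, a snapshot diff sees deletions, at the cost of reading both snapshots in full.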
  61. Change Data Capture – Triggers
     • Roster (source table) row: Id: 20982358; Name: Jean-Luc Picard; Rank: Captain; State: Agitated
     • ChangeData record: Table: Roster; Id: 20982358; Operation: Update; Job: ag8afh8
     • ChangeDataBatch (SELECT): Table: Roster; Id: 20982358; Operation: Update
     • Write operations to Firehose: Kinesis Data Firehose to Amazon S3
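A minimal sketch of trigger-based capture, using Python's built-in sqlite3 module as a stand-in for the source database. The table and field names (Roster, ChangeData, Job) follow the slide; the trigger name, the `Seq` column, the initial `State` value, and the `claim_batch` helper are assumptions made for illustration, and the final hand-off to Firehose is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Roster (Id INTEGER PRIMARY KEY, Name TEXT, "Rank" TEXT, State TEXT);
    CREATE TABLE ChangeData (
        Seq INTEGER PRIMARY KEY AUTOINCREMENT,
        TableName TEXT, Id INTEGER, Operation TEXT, Job TEXT
    );
    -- Every update to Roster leaves a row in ChangeData.
    CREATE TRIGGER roster_update AFTER UPDATE ON Roster
    BEGIN
        INSERT INTO ChangeData (TableName, Id, Operation)
        VALUES ('Roster', NEW.Id, 'Update');
    END;
    """
)
conn.execute("INSERT INTO Roster VALUES (20982358, 'Jean-Luc Picard', 'Captain', 'Calm')")
conn.execute("UPDATE Roster SET State = 'Agitated' WHERE Id = 20982358")


def claim_batch(connection, job_id):
    """Tag unclaimed change rows with a job id, then return that batch."""
    connection.execute("UPDATE ChangeData SET Job = ? WHERE Job IS NULL", (job_id,))
    return connection.execute(
        "SELECT TableName, Id, Operation FROM ChangeData WHERE Job = ? ORDER BY Seq",
        (job_id,),
    ).fetchall()


batch = claim_batch(conn, "ag8afh8")
print(batch)  # [('Roster', 20982358, 'Update')]
```

Tagging rows with a job id before selecting them means a second job started later only sees changes recorded after the first batch was claimed.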
  62. Change Data Capture – Database Logs
     • Log file (Tx001.log) layout labels: LOG_FILE_HDR_SIZE; OS_FILE_LOG_BLOCK_SIZE; FORMAT; CHECKSUM; LOG_CHECKPOINT_1; LOG_CHECKPOINT_2; Checkpoint_lsn; Checkpoint_no; Log.buf_size; LOG BLOCK; LOG_BLOCK_HDR_SIZE; Hdr_no; […] ???
  63. Database Migration Service
     • AWS Database Migration Service (AWS DMS): easily and securely migrate and/or replicate your databases and data warehouses to AWS
     • AWS Schema Conversion Tool (AWS SCT): convert your commercial database and data warehouse schemas to open-source engines or AWS-native services, such as Amazon Aurora and Redshift
  64. When to use DMS and SCT?
     • Modernize your database tier: commercial to open-source; commercial to Amazon Aurora
     • Modernize your data warehouse: commercial to Redshift
     • Migrate: migrate business-critical applications; migrate from Classic to VPC; migrate data warehouse to Redshift; upgrade to a minor version
     • Replicate: create cross-region Read Replicas; run your analytics in the cloud; keep your dev/test and production environments in sync
  65. Data Sources - Files
     • Diagram: Files to Amazon S3
  66. Files: Optimizing Transfers, Available Services
     • S3 Multi-Part Upload
     • S3 Transfer Acceleration
     • AWS Direct Connect
     • AWS DataSync
     • AWS Transfer - SFTP
     • AWS Snowball/Snowmobile
  67. Uploading to Amazon S3
     • Amazon S3 supports both a single-part upload and a multi-part upload API
     • The single-part upload supports objects up to 5 GB in size
     • The multi-part upload supports objects up to 5 TB in size
     • The multi-part upload also enables you to maximize your throughput by using parallel threads
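The upload limits above can be turned into a small planning sketch. The 5 GB and 5 TB limits come from the slide; the 5 MiB minimum part size and the 10,000-part maximum are the documented S3 multipart limits. Treating GB/TB as binary units, the 64 MiB default part size, and the `fake_upload_part` stand-in (which sends nothing) are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

SINGLE_PUT_LIMIT = 5 * GIB   # slide: single-part upload up to 5 GB
MAX_OBJECT_SIZE = 5 * TIB    # slide: multi-part upload up to 5 TB
MIN_PART_SIZE = 5 * MIB      # S3 minimum part size (the last part may be smaller)
MAX_PARTS = 10_000           # S3 maximum number of parts per upload


def plan_upload(size_bytes, part_size=64 * MIB):
    """Choose single or multi-part upload and a part size within the limits."""
    if size_bytes > MAX_OBJECT_SIZE:
        raise ValueError("object exceeds the 5 TB multi-part limit")
    part_size = min(part_size, SINGLE_PUT_LIMIT)
    if size_bytes <= part_size:
        return {"mode": "single", "part_size": size_bytes, "parts": 1}
    smallest_allowed = -(-size_bytes // MAX_PARTS)  # ceiling division
    part_size = max(part_size, MIN_PART_SIZE, smallest_allowed)
    return {"mode": "multipart", "part_size": part_size,
            "parts": -(-size_bytes // part_size)}


def part_ranges(size_bytes, part_size):
    """Yield (part_number, start, end_exclusive) for each part."""
    for number, start in enumerate(range(0, size_bytes, part_size), start=1):
        yield number, start, min(start + part_size, size_bytes)


def fake_upload_part(part):
    """Stand-in for a real part upload; returns the byte count it would send."""
    number, start, end = part
    return number, end - start


plan = plan_upload(200 * MIB)
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_upload_part, part_ranges(200 * MIB, plan["part_size"])))
total_sent = sum(sent for _, sent in results)
print(plan["parts"], total_sent == 200 * MIB)
```

The thread pool is where the "parallel threads" point shows up: parts are independent byte ranges, so they can be sent concurrently and assembled by part number.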
  68. S3 Transfer Acceleration
     • PUT requests go through the nearest AWS Edge Location
     • Data transits over the AWS private network rather than the Internet
     • AWS private network optimizes throughput and latency to the AWS Region
     • Data is not stored in the edge cache
     • Diagram: Uploader, AWS edge location, S3 bucket
  69. AWS Direct Connect
     • Diagram, Corporate Data Center: Customer Gateway
     • Diagram, Direct Connect Location: Customer/Partner Cage with Customer/Partner Router; AWS Cage with AWS Direct Connect Endpoint
     • Diagram, AWS Region: Virtual Private Cloud with EC2 and VPC Endpoint; Amazon S3
     • Diagram, connections: Private Virtual Interface; Public Virtual Interface
  70. AWS DataSync
     • Online transfer service that simplifies, automates, and accelerates moving data between on-premises storage and AWS
     • Combines the speed and reliability of network acceleration software with the cost-effectiveness of open source tools
     • Fast data transfer; Cost-effective; Easy to use; Secure and reliable; Cloud integrated
  71. AWS Transfer for SFTP
     • Fully managed SFTP service for Amazon S3
     • Native integration with AWS services; Simple to use; Cost-effective; Fully managed in AWS; Secure and Compliant; Seamless migration of existing workflows
  72. AWS Snowball/Snowmobile
     • Use case: Cloud Migration, Disaster Recovery. AWS solution: AWS Snowball
     • Use case: Internet of Things (IoT), Remote Locations. AWS solution: AWS Snowball Edge
     • Use case: Migrating Exabytes of Data. AWS solution: AWS Snowmobile
  73. Data Sources - Streams
     • Diagram: Streams to Amazon S3
  74. Streams: Collecting and Analyzing
     • Amazon Kinesis
     • Amazon Managed Streaming for Kafka (MSK)
     • Example: Clickstream Analytics
  75. Amazon Kinesis - Stream Processing on AWS
     • Streams: capture streaming data for downstream processing; allow multiple processors to read streams at their own rate
     • Firehose: buffer records in a stream into a single output for more efficient storage; automatic flushing of the buffer to S3, Elasticsearch, Redshift, or Splunk
     • Analytics: create time windows over streams and perform aggregate operations using SQL; join together multiple streams and output to new streams
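Two of the ideas above, buffering records into one output and aggregating over time windows, can be sketched in plain Python. This is a toy model and not the Kinesis API: the class name, thresholds, and sample events are hypothetical, and the buffer only checks its age when a new record arrives (there is no background timer).

```python
from collections import Counter


class BufferedDelivery:
    """Toy buffer: collect records and flush them as one batch on a size or age limit."""

    def __init__(self, sink, max_records=3, max_age_seconds=60):
        self.sink = sink                      # callable that receives a list of records
        self.max_records = max_records
        self.max_age_seconds = max_age_seconds
        self.buffer = []
        self.first_arrival = None

    def put(self, record, now):
        if not self.buffer:
            self.first_arrival = now
        self.buffer.append(record)
        too_many = len(self.buffer) >= self.max_records
        too_old = now - self.first_arrival >= self.max_age_seconds
        if too_many or too_old:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink(list(self.buffer))
            self.buffer.clear()
            self.first_arrival = None


def tumbling_counts(events, window_seconds):
    """Count events per key in fixed, non-overlapping time windows."""
    counts = Counter()
    for timestamp, key in events:
        window_start = timestamp - (timestamp % window_seconds)
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical clickstream: (seconds since start, page).
events = [(0, "/home"), (10, "/cart"), (59, "/home"), (61, "/home"), (130, "/cart")]

batches = []
delivery = BufferedDelivery(batches.append, max_records=3, max_age_seconds=60)
for ts, page in events:
    delivery.put({"ts": ts, "page": page}, now=ts)
delivery.flush()

print([len(batch) for batch in batches])
print(tumbling_counts(events, window_seconds=60))
```

The first batch is flushed by the record-count limit and the second by the age limit, which is the trade-off a buffering delivery stream makes between fewer, larger objects and delivery delay.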
  76. Summary - Ingestion
     • Sources: Files; Streams; Logs; Databases
     • Ingestion components: AWS DataSync; API Gateway; Kinesis Agent; DMS; Kinesis Data Firehose
     • Destination: Amazon S3, laid out as:
       s3://datalake/
         /vendorfeeds
           /vendorA
           /vendorB
         /clickstream
           /orders
           /vendors
           /customers
         /app_logs
           /instance1
           /instance2
         /syslogs
           /instance1
           /instance2
         /databases
           /customers
           /orders
           /vendors
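The prefix layout on this slide can be captured in a small helper that builds object keys consistently. The top-level prefixes and the `datalake` bucket name come from the slide; the mapping from source type to prefix, the function names, and the sample file names are assumptions for illustration.

```python
# Top-level prefixes taken from the slide; the source-type keys are hypothetical.
PREFIXES = {
    "files": "vendorfeeds",
    "streams": "clickstream",
    "app_logs": "app_logs",
    "syslogs": "syslogs",
    "databases": "databases",
}


def object_key(source_type, dataset, filename):
    """Build an object key of the form <prefix>/<dataset>/<filename>."""
    if source_type not in PREFIXES:
        raise ValueError(f"unknown source type: {source_type}")
    return f"{PREFIXES[source_type]}/{dataset}/{filename}"


def s3_uri(bucket, key):
    """Format a bucket and key as an s3:// URI."""
    return f"s3://{bucket}/{key}"


print(s3_uri("datalake", object_key("databases", "orders", "part-0001.csv")))
print(s3_uri("datalake", object_key("files", "vendorA", "feed.json")))
```

Keeping key construction in one place makes it harder for different ingestion paths to drift into incompatible layouts.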
  77. Consuming Data from the Data Lake
  78. Anti-Pattern
     • Diagram labels: Everything; Query
  79. Also an Anti-Pattern
     • Diagram labels: Everything; Query
  80. One tool to rule them all
  81. Where do I start?
     • Understand your data: Data Structure, Access patterns & characteristics, Temperature, Cost, Size
     • Know your audience: Business Users, Data Scientists, Developers
     • Select the right service
  82. Understand your Data
     • Store types: In-memory; NoSQL; Warehouse; Search; Object; Archival
     • Temperature: Hot data; Warm data; Cold data
     • Axis labels: Data Structure (Low, High); Latency; Data volume; Request rate; Cost / GB (High, Low)
  83. Understand your Data, with services
     • In-Memory: Amazon ElastiCache
     • NoSQL: Amazon DynamoDB
     • Warehouse: Amazon Redshift
     • Search: Amazon ES
     • Object: Amazon S3
     • Archival: Amazon Glacier
     • Temperature: Hot data; Warm data; Cold data
     • Axis labels: Data Structure (Low, High); Latency; Data volume; Request rate; Cost / GB (High, Low)
  84. Roles, priorities, and needs
     • Data Visualizer. Priorities: creating engaging visual and narrative journeys for analytical solutions. Needs: Visualization; Dashboards
     • Data Product Manager. Priorities: manages data as a product; ensures freshness and consistency of data; understands lineage and compliance needs; treats DS as customers. Needs: Reporting; Reports (data quality, errors)
     • DevOps Engineer. Priorities: monitoring for reliability; quickly diagnosing deployment or availability issues. Needs: Ad hoc querying; Dashboards
     • Data Scientist. Priorities: makes sense of data; generates and communicates insights to improve or create business processes; creates predictive ML models to support them. Needs: Ad hoc querying; Robust ML tools
     • Data Engineer. Priorities: builds scalable pipelines; transforms and loads data into structures complete with metadata that can be readily consumed by DS. Needs: Ad hoc querying; Quick visualization
     • Business Sponsor. Priorities: vetting the prioritization and ROI; funding projects; providing ongoing feedback. Needs: Reporting; Dashboards
  85. 85. Overview of AI/ML
  86. 86. AI vs Machine Learning vs Deep Learning • Artificial Intelligence: machines or programs exhibiting intelligence • Machine Learning: learning without being explicitly programmed • Deep Learning: learning based on deep neural networks
  87. 87. A closer look at machine learning and when to use it
  89. 89. 43,252,003,274,489,856,000: 43 quintillion unique combinations
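Assuming this figure refers to the standard 3×3×3 Rubik's Cube, as the move notation on the following slides suggests, it can be reproduced from the usual counting argument. The function name `cube_states` is just an illustrative choice.

```python
from math import factorial


def cube_states() -> int:
    """Count the reachable positions of a 3x3x3 Rubik's Cube."""
    # 8 corners can be permuted freely; 7 twists are free, the 8th is forced.
    corners = factorial(8) * 3**7
    # 12 edges can be permuted freely; 11 flips are free, the 12th is forced.
    edges = factorial(12) * 2**11
    # Corner and edge permutations must share the same parity, halving the total.
    return corners * edges // 2


print(f"{cube_states():,}")  # 43,252,003,274,489,856,000
```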
  90. 90. F2 U' R' L F2 R L' U' Learning function
  91. 91. Learning function: F2 U' R' L F2 R L' U' (1% accuracy); R U r U R U2 r U (2% accuracy)
  92. 92. Learning function: 2% accuracy, 20% accuracy, 40% accuracy, 60% accuracy, 80% accuracy, 95% accuracy
  93. 93. Learning function (95% accuracy): F2 R F R′ B′ D F D′ B D F ?
  95. 95. Don’t code the patterns; let the system learn through data
  96. 96. Train a model (positive/negative reinforcement); infer from a model to obtain a prediction. Diagram labels: Data, Feedback, Model
  97. 97. Supervised Learning: "It is a cat." "No, it’s a dog."
  98. 98. Supervised Learning: How Machines Learn. Human intervention and validation required (e.g. photo classification and tagging). Labeled training data (input plus label, such as Dog or Cat) feeds the machine learning algorithm; its prediction for an input is compared with the true label (Dog), and the model is adjusted
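The predict, compare, adjust loop on this slide can be sketched with a perceptron, one of the simplest supervised learners. This is a minimal illustration rather than the method behind any AWS service: the two numeric features per photo and the toy data are made up, and real image classification uses far richer models.

```python
def train_perceptron(samples, labels, epochs=100):
    """Train on (x1, x2) samples with labels +1 (cat) or -1 (dog)."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), y in zip(samples, labels):
            predicted = 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1
            if predicted != y:
                # Wrong prediction: nudge the model toward the true label.
                w[0] += y * x1
                w[1] += y * x2
                b += y
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return w, b


def predict(model, sample):
    """Return "cat" or "dog" for one (x1, x2) sample."""
    w, b = model
    score = w[0] * sample[0] + w[1] * sample[1] + b
    return "cat" if score > 0 else "dog"


# Hypothetical, already-normalized features for six labeled photos.
samples = [(1, 1), (2, 1), (1, 2), (-1, -1), (-2, -1), (-1, -2)]
labels = [1, 1, 1, -1, -1, -1]
model = train_perceptron(samples, labels)
print(predict(model, (3, 3)))    # cat
print(predict(model, (-3, -2)))  # dog
```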
  99. 99. Unsupervised Learning: no human intervention required (e.g. customer segmentation). Input, then machine learning algorithm, then prediction
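Customer segmentation without labels can be illustrated with k-means clustering. The sketch below is a toy one-dimensional version under stated assumptions: the monthly-spend figures are invented, the evenly spaced starting centroids are a simplification chosen for determinism, and `kmeans_1d` is a hypothetical helper rather than a library call.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Cluster numbers into k groups (k >= 2, iterations >= 1); returns (centroids, clusters)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centroids evenly spaced between the min and max value.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its members (keep it if the cluster is empty).
        updated = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


monthly_spend = [10, 12, 11, 95, 100, 105]  # made-up customer data
centroids, clusters = kmeans_1d(monthly_spend, k=2)
print(centroids)  # [11.0, 100.0]
print(clusters)   # [[10, 12, 11], [95, 100, 105]]
```

No labels are supplied; the two segments emerge from the data alone.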
  100. 100. Machine Learning at Amazon.com • Retail: demand forecasting, vendor lead time prediction, pricing, packaging, substitute prediction • Customers: recommendation, product search, product ads, shopping advice, customer problem detection • Catalogue: browse-node classification, meta-data validation, review analysis, product matching • Text: in-book search, named-entity extraction, summarisation/Xray, plagiarism detection • Seller: fraud detection, predictive help, seller search & crawling • Images: visual search, product image enhancement, brand tracking
  101. 101. Personalized recommendation
  102. 102. Alexa, Hello!
  107. 107. AmazonFresh
  108. 108. Our Mission at AWS: put AI and ML in the hands of every developer and data scientist
  109. 109. • AI Services: Vision (Rekognition Image, Rekognition Video), Speech (Polly, Transcribe), Language (Translate, Comprehend), Chatbots (Lex), Forecasting (Forecast), Recommendations (Personalize), Textract • ML Services: Amazon SageMaker. Build: pre-built algorithms & notebooks, data labeling (Ground Truth). Train: one-click model training & tuning, optimization (Neo). Deploy: one-click deployment & hosting. Also reinforcement learning and algorithms & models (AWS Marketplace for Machine Learning) • ML Frameworks & Infrastructure: frameworks; interfaces; infrastructure (EC2 P3 & P3N, EC2 C5, FPGAs, Greengrass, Elastic Inference)
  110. 110. AWS AI Services: AI without worrying about ML
  111. 111. Vision: Amazon Rekognition. Key features: object & scene detection, image moderation, facial analysis, facial comparison, facial recognition, celebrity recognition
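Object and scene detection can be tried from Python with boto3's Rekognition client and its `detect_labels` call. The sketch below is a minimal illustration, not taken from the deck: the wrapper name `detect_labels_in_s3`, the default thresholds, and the bucket and key in the usage note are placeholders, and running it for real requires AWS credentials and an image already stored in S3.

```python
def make_rekognition_client(region_name="us-east-1"):
    """Create a Rekognition client (needs boto3 installed and AWS credentials)."""
    import boto3  # imported lazily so the helper below can be used without it
    return boto3.client("rekognition", region_name=region_name)


def detect_labels_in_s3(client, bucket, key, max_labels=10, min_confidence=80.0):
    """Return (label name, confidence) pairs for an image stored in S3."""
    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]
```

Usage, with placeholder names: `detect_labels_in_s3(make_rekognition_client(), "my-bucket", "selfie.jpg")`.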
  112. 112. Rekognition Demo: Selfie Analyzer
  113. 113. Vision: Amazon Rekognition Video. Video analysis: object and activity detection, person tracking, face recognition, real-time live stream, content moderation, celebrity recognition
  114. 114. Speech: Amazon Polly. Key features • 50 voices • 24 languages • Lip-syncing & text highlighting • Fine-grained voice control • Custom vocabularies
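Text-to-speech with Polly is available from Python through boto3's `synthesize_speech` call. The following is a hedged sketch rather than anything shown in the deck: the wrapper name `synthesize_to_file`, the default voice "Joanna", and the output path are illustrative choices, and a real run needs AWS credentials.

```python
def make_polly_client(region_name="us-east-1"):
    """Create a Polly client (needs boto3 installed and AWS credentials)."""
    import boto3  # imported lazily so the helper below can be used without it
    return boto3.client("polly", region_name=region_name)


def synthesize_to_file(client, text, path, voice_id="Joanna"):
    """Synthesize text to an MP3 file and return the number of bytes written."""
    response = client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    audio = response["AudioStream"].read()  # the audio arrives as a streaming body
    with open(path, "wb") as out:
        out.write(audio)
    return len(audio)
```

Usage, with a placeholder path: `synthesize_to_file(make_polly_client(), "Hello from Polly", "hello.mp3")`.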
  115. 115. Language: Amazon Lex. Conversational interfaces for your applications, powered by the same Natural Language Understanding (NLU) and Automatic Speech Recognition (ASR) models as Alexa
  116. 116. Amazon Connect Contact Center Can Use Amazon Lex for Natural Conversations
  117. 117. AWS ML Services: Democratizing Machine Learning
  118. 118. ML can be very complicated
  119. 119. Amazon SageMaker: build, train, and deploy ML at scale
  125. 125. How do you make it easier to obtain high quality labeled data?
  126. 126. Amazon SageMaker: Build, train, and deploy ML
  127. 127. Successful models require high-quality data
  128. 128. Build highly accurate training datasets and reduce data labeling costs by up to 70% using machine learning
  129. 129. Amazon SageMaker Ground Truth: label machine learning training data easily and accurately
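One general way machine learning can cut labeling cost is to let a model auto-label the items it is confident about and send only the uncertain ones to human labelers. The sketch below illustrates that idea only. It is not the Ground Truth API or its actual algorithm, and the function name, data shape, and 0.9 threshold are all hypothetical.

```python
def route_for_labeling(predictions, threshold=0.9):
    """Split items into auto-labeled ones and ones needing human review.

    predictions maps item id -> (predicted label, confidence between 0 and 1).
    """
    auto_labeled = {}
    needs_human = []
    for item_id, (label, confidence) in predictions.items():
        if confidence >= threshold:
            auto_labeled[item_id] = label  # trust the model's label
        else:
            needs_human.append(item_id)    # too uncertain: ask a person
    return auto_labeled, needs_human


predictions = {"img1": ("cat", 0.98), "img2": ("dog", 0.55), "img3": ("dog", 0.93)}
auto_labeled, needs_human = route_for_labeling(predictions)
print(auto_labeled)  # {'img1': 'cat', 'img3': 'dog'}
print(needs_human)   # ['img2']
```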
  130. 130. Thank You