Big Data on AWS is a deep dive into cloud-based big data solutions using Amazon Elastic MapReduce (EMR) and Amazon Redshift. In this session, you will learn how to create big data environments and apply best practices for designing them for security and cost-effectiveness. Demonstrations include processing log data with Amazon EMR and showing how easily a Redshift data warehouse can be provisioned.
Big Data on AWS - AWS Washington D.C. Symposium 2014
1. AWS Government, Education, and Nonprofits Symposium
Washington, DC | June 24, 2014 - June 26, 2014
AWS Big Data
Jon Einkauf
jeinkauf@amazon.com
2. Agenda
• Brief overview of AWS Big Data services
• Demo (Query logs in S3 using Amazon EMR)
• Q&A
3. Big Data
Technologies and techniques for working productively with data, at any scale.
4. Big data and AWS
Big data: Potentially massive datasets
Cloud computing: Virtually unlimited capacity
5. Big data and AWS
Big data: Iterative, experimental style of data manipulation and analysis
Cloud computing: Iterative, experimental style of infrastructure deployment/usage
6. Big data and AWS
Big data: Frequently not a steady-state workload; peaks and valleys
Cloud computing: At its most efficient with highly variable workloads
7. Big data and AWS
Big data: “Time to results” is critical; shared resources are a bottleneck
Cloud computing: Parallel compute projects give each workgroup more autonomy and faster results
8. Lower costs | Ease of use
9. Lower costs
Pay as you go
Only pay for what you use
No capital investment
10. Ease of use
Programmable
Integrate with existing tools
Low admin
Easy to configure
11. Use the right tools
Amazon S3, Amazon Kinesis, Amazon DynamoDB, Amazon Redshift, Amazon Elastic MapReduce, AWS Data Pipeline
12. Amazon S3
• Highly scalable object store
• 99.999999999% durability
• Encryption
• Data lifecycle management
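For instance, a minimal sketch of encrypted storage and lifecycle management using the Python boto3 SDK (which post-dates this talk; the 2014-era boto library exposed equivalent calls). The bucket name, key, and lifecycle rule here are hypothetical:

```python
import boto3

s3 = boto3.client("s3")

# Store an object with server-side encryption (SSE-S3).
s3.put_object(
    Bucket="my-log-bucket",                   # hypothetical bucket
    Key="logs/2014/06/24/access.log",
    Body=open("access.log", "rb"),
    ServerSideEncryption="AES256",
)

# Lifecycle management: archive old logs to Glacier, expire after a year.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-log-bucket",
    LifecycleConfiguration={"Rules": [{
        "ID": "archive-old-logs",
        "Filter": {"Prefix": "logs/"},
        "Status": "Enabled",
        "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
        "Expiration": {"Days": 365},
    }]},
)
```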
13. Amazon Kinesis
• Real-time processing
• High throughput
• Elastic
• Integrates with EMR, S3, Redshift, DynamoDB
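A minimal producer sketch with boto3; the stream name and event shape are hypothetical:

```python
import json

import boto3

kinesis = boto3.client("kinesis")

# Push one log event into the stream; records with the same partition
# key land on the same shard, preserving per-user ordering.
event = {"user": "u123", "action": "page_view", "ts": "2014-06-24T12:00:00Z"}
kinesis.put_record(
    StreamName="clickstream",      # hypothetical stream
    Data=json.dumps(event),
    PartitionKey=event["user"],
)
```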
14. Amazon DynamoDB
• NoSQL database
• Seamless scalability
• Low admin
• Single-digit millisecond latency
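A sketch of the key-value access pattern with boto3; the table name and item attributes are hypothetical:

```python
import boto3

table = boto3.resource("dynamodb").Table("GameScores")  # hypothetical table

# Key-value reads and writes return in single-digit milliseconds.
table.put_item(Item={"PlayerId": "u123", "Game": "alpha", "Score": 9001})
resp = table.get_item(Key={"PlayerId": "u123", "Game": "alpha"})
print(resp["Item"]["Score"])
```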
15. Amazon Redshift
• Relational data warehouse
• Massively parallel
• Petabyte scale
• Fully managed
• Low cost ($1K/TB/year with a 3-year reservation)
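Provisioning a warehouse is one API call; a sketch with boto3, where the identifiers and credentials are hypothetical (dw2.large was an SSD-based node type of this era):

```python
import boto3

redshift = boto3.client("redshift")

# Provision a small two-node data warehouse cluster.
redshift.create_cluster(
    ClusterIdentifier="demo-dw",
    NodeType="dw2.large",
    NumberOfNodes=2,
    DBName="analytics",
    MasterUsername="admin",
    MasterUserPassword="Str0ngPassw0rd1",
)
```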
16. Amazon Elastic MapReduce (EMR)
• Managed Hadoop clusters
• MapReduce, Hive, Pig, Impala, HBase, Spark, Accumulo, etc.
• Integrates with S3, DynamoDB, Redshift, Data Pipeline, Kinesis
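A hedged sketch of launching a transient managed Hadoop cluster with boto3; the names, sizes, roles, and log bucket are hypothetical:

```python
import boto3

emr = boto3.client("emr")

# Launch a small cluster with Hive installed; it shuts down when idle.
resp = emr.run_job_flow(
    Name="demo-cluster",
    ReleaseLabel="emr-5.36.0",
    Applications=[{"Name": "Hadoop"}, {"Name": "Hive"}],
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,
        "KeepJobFlowAliveWhenNoSteps": False,
    },
    LogUri="s3://my-log-bucket/emr-logs/",
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print(resp["JobFlowId"])
```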
17. AWS Data Pipeline
• Data-driven workflows
• Integrates with EMR, EC2, S3, Redshift, DynamoDB, SNS
• Process and move data between AWS and your own data center
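A sketch of the basic workflow lifecycle with boto3; the pipeline name is hypothetical, and a real definition would add data nodes (e.g. an S3 input) and activities (e.g. an EMR job):

```python
import boto3

dp = boto3.client("datapipeline")

# Create a pipeline shell, attach a (minimal) definition, activate it.
created = dp.create_pipeline(name="nightly-etl", uniqueId="nightly-etl-1")
pid = created["pipelineId"]

dp.put_pipeline_definition(
    pipelineId=pid,
    pipelineObjects=[{
        "id": "Default", "name": "Default",
        "fields": [{"key": "scheduleType", "stringValue": "ondemand"}],
    }],
)
dp.activate_pipeline(pipelineId=pid)
```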
18. Log Analysis Example
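The demo queries raw logs sitting in S3 from a Hive step on EMR. A hedged sketch of what that can look like; the cluster ID, bucket, script path, and table schema are all hypothetical:

```python
import boto3

emr = boto3.client("emr")

# The Hive script (stored in S3) might define an external table over the
# raw logs and aggregate them, e.g.:
#
#   CREATE EXTERNAL TABLE access_log (
#       ip STRING, ts STRING, request STRING, status INT, bytes INT)
#   ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
#   LOCATION 's3://my-log-bucket/logs/';
#
#   SELECT status, COUNT(*) FROM access_log GROUP BY status;

# Submit the script as a step to a running cluster ("j-..." is a placeholder).
emr.add_job_flow_steps(
    JobFlowId="j-XXXXXXXXXXXXX",
    Steps=[{
        "Name": "query-access-logs",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["hive-script", "--run-hive-script", "--args",
                     "-f", "s3://my-log-bucket/scripts/status_counts.q"],
        },
    }],
)
```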
20. Big Data on AWS
Brand new course on Big Data
aws.amazon.com/training/course-descriptions/bigdata
21. AWS Big Data Test Drives
APN Partner-provided labs
aws.amazon.com/testdrive/bigdata
22. AWS Training & Events
Webinars, Bootcamps, and Self-Paced Labs
https://aws.amazon.com/training
aws.amazon.com/events
23. Thank you!
jeinkauf@amazon.com
Editor's notes
From TBs to PBs, we have the capacity and scale to handle your largest big data workloads
When we think of big data, we think both of the proliferation of digital information and of the innovations that exploit or extract information from that data to drive sales, efficiency, better health, analysis, predictions, recommendations, and innovation
More specifically, we think cloud computing is a fundamental component of any big data strategy due to its inherent benefits
You can start and stop on demand, run big data workloads in parallel as you test out new ideas, allowing you to explore without commitments
With services such as Auto Scaling and elastic load balancing, you can dial up and down the amount of infrastructure you need for your variable or experimental workloads
The total time also includes waiting to get access to those IT resources; with the cloud you can be up and running in minutes, and in parallel
In a sense, the AWS cloud democratizes big data for everyone to use. It rests on two foundational benefits, lower costs and ease of use, and focusing on these key tenets directs how we innovate
Lack of constraints leads to new usage models
Gives control back to individual development teams
Fail-fast (and fail-cheap) opens up exploratory style
Many customers create 100s of Amazon EMR clusters per day
Classic burst-y workload perfect for the cloud
Big data / HPC clusters themselves are parallelized resources
Can you build a faster on-premises cluster? Yes, but…
Usually a shared/contended resource; in the cloud, each user/workgroup gets their own cluster
Cloud is often the fastest platform based on “MTTJC” (Mean Time To Job Completion)
We provide all of our services with a self-service API. We also provide managed services so you don't have to do the back-end administration, and you can configure your infrastructure with code, scripts, or point-and-click from our console, all while maintaining compatibility with your current tools.
While I won't be able to go over all of our big data services, I would like to spend some time introducing several key big data services that are designed for high availability and durability, offered as managed services where we provision the infrastructure on your behalf, so you can get significant big data storage and analytics with a few clicks or API calls.
Fundamental storage at internet scale: it can store any number of objects from 1 byte to 5 TB in size.
It is engineered for 11 9's of durability, replicating your data at least three times across three distinct physical data centers we call Availability Zones.
We have customers such as Dropbox, Spotify, and Pinterest storing billions of objects: photos, videos, songs, or any other type of file.
Amazon Kinesis is a fully managed service for real-time processing of streaming data at massive scale. Amazon Kinesis can collect and process hundreds of terabytes of data per hour from hundreds of thousands of sources.
For instance, instead of having to process log files in batch, you can stream log events into Kinesis and then have workers built with the Kinesis Client Library read from the stream, process the information, and drive a real-time dashboard.
Later today, Adi Krishnan, the product manager for Amazon Kinesis, will give a deep dive into the service
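A simplified polling consumer sketched with the raw boto3 API rather than the Kinesis Client Library (which adds checkpointing and load balancing across shards); the stream name is hypothetical:

```python
import time

import boto3

kinesis = boto3.client("kinesis")

# Read from the first shard of the stream and feed a dashboard.
shard = kinesis.describe_stream(StreamName="clickstream")[
    "StreamDescription"]["Shards"][0]["ShardId"]
it = kinesis.get_shard_iterator(
    StreamName="clickstream", ShardId=shard,
    ShardIteratorType="LATEST")["ShardIterator"]

while True:
    out = kinesis.get_records(ShardIterator=it, Limit=100)
    for record in out["Records"]:
        print(record["Data"])   # update counters, push to a dashboard, etc.
    it = out["NextShardIterator"]
    time.sleep(1)               # stay under per-shard read limits
```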
DynamoDB is a fast, fully managed NoSQL database service that makes it simple and cost-effective to store and retrieve any amount of data, and serve any level of request traffic. Its guaranteed throughput and single-digit millisecond latency make it a great fit for gaming, ad tech, mobile and many other applications.
Runs on solid-state drives for high-speed performance at scale. You can provision reads and writes to a table without having to worry about the administration of scaling or sharding; it is all done behind the scenes for you.
For instance, in real-time bidding, three rounds of bidding over which ad to place on a website happen in less than 200 milliseconds while the page loads, so single-digit millisecond latency is needed to determine what ad to place and what price to bid for that impression.
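Provisioning reads and writes is a single API call; a sketch with boto3, where the table name and capacities are hypothetical:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Dial throughput up for a traffic spike; partitioning and resharding
# happen behind the scenes.
dynamodb.update_table(
    TableName="GameScores",
    ProvisionedThroughput={
        "ReadCapacityUnits": 500,
        "WriteCapacityUnits": 200,
    },
)
```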
Provision a petabyte-scale cluster to handle complex SQL queries in just a few minutes.
You can get either an HDD-based cluster or the recently introduced SSD-based cluster, which is smaller in total size but offers higher performance per GB.
This data warehouse solution costs about a tenth of what traditional solutions of comparable size cost.
Redshift can drive business intelligence tools such as Jaspersoft or MicroStrategy because it supports standard SQL and connects using ODBC or JDBC drivers.
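Because Redshift speaks the PostgreSQL wire protocol, any standard driver works; a sketch using the psycopg2 driver, with a hypothetical endpoint, credentials, and table:

```python
import psycopg2

conn = psycopg2.connect(
    host="demo-dw.abc123.us-east-1.redshift.amazonaws.com",  # hypothetical
    port=5439,
    dbname="analytics",
    user="admin",
    password="Str0ngPassw0rd1",
)
cur = conn.cursor()
cur.execute("SELECT status, COUNT(*) FROM access_log GROUP BY status;")
for status, n in cur.fetchall():
    print(status, n)
conn.close()
```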
When you think of big data these days, Hadoop is always an integral part. When you take the benefits of the cloud along with the computational paradigm of MapReduce, you get Elastic MapReduce. Customers have launched millions of clusters to run big data workloads on Amazon Elastic MapReduce.
A key tool in the toolbox for 'Big Data' challenges. Makes possible analytics processes that were previously not feasible. Cost-effective when leveraged with the EC2 Spot market.
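For the Spot-market point, a hedged sketch of launching core nodes as Spot instances with boto3; instance types, counts, and the bid price are hypothetical:

```python
import boto3

emr = boto3.client("emr")

# Master on demand for stability; core nodes bid on the Spot market.
emr.run_job_flow(
    Name="spot-batch",
    ReleaseLabel="emr-5.36.0",
    Applications=[{"Name": "Hadoop"}],
    Instances={
        "InstanceGroups": [
            {"Name": "master", "InstanceRole": "MASTER",
             "Market": "ON_DEMAND", "InstanceType": "m5.xlarge",
             "InstanceCount": 1},
            {"Name": "core", "InstanceRole": "CORE", "Market": "SPOT",
             "BidPrice": "0.10", "InstanceType": "m5.xlarge",
             "InstanceCount": 4},
        ],
        "KeepJobFlowAliveWhenNoSteps": False,
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
```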
Speaker Notes:
We have just released "Big Data on AWS", a new technical training course for individuals who are responsible for implementing big data environments, namely Data Scientists, Data Analysts, and Enterprise Big Data Solution Architects. This course is designed to teach technical end users how to use Amazon EMR to process data using the broad ecosystem of Hadoop tools like Pig and Hive. We also cover how to create big data environments, work with Amazon DynamoDB and Amazon Redshift, understand the benefits of Amazon Kinesis, and leverage best practices to design big data environments for security and cost-effectiveness.
Upcoming classes include:
April 22 – Redwood City, CA
May 6 – Sao Paulo, Brazil
May 20 – Luxembourg
May 21 – Rio de Janeiro, Brazil
June 3 – New York, NY; Redwood City, CA; and Columbia, MD
June 4 – Porto Alegre, Brazil
Audience
Individuals responsible for implementing big data environments: Data Scientists, Data Analysts, and Enterprise Big Data Solution Architects
Objectives
Understand the architecture of an Amazon EMR cluster
Choose appropriate AWS data storage options for use with Amazon EMR
Know your options for ingesting, transferring, and compressing data for use with Amazon EMR
Use common programming frameworks for Amazon EMR including Hive, Pig, and Streaming
Work with Amazon Redshift and Spark/Shark to implement big data solutions
Leverage big data visualization software
Choose appropriate security and cost management options for Amazon EMR
Understand the benefits of using Amazon Kinesis for big data
Prerequisites
Basic familiarity with big data technologies, including Apache Hadoop and HDFS
Knowledge of big data technologies such as Pig, Hive, and MapReduce helpful, but not required
Working knowledge of core AWS services and public cloud implementation
AWS Essentials course completion or equivalent experience
Basic understanding of data warehousing, relational database systems, and database design
Format
Instructor-Led & Hands-on Labs
Duration
3 days
Details
aws.amazon.com/training/course-descriptions/bigdata/
Microstrategy
Splunk
QlikView
EMR
Pig
MongoDB
Oracle BI, OBIEE 11g
SAP Hana
Yellowfin BI
AWS is here to help
Thank you very much for your time today; that concludes this presentation.