An introduction to cloud
computing with
Amazon Web Services
and
MongoDB
Samuel Demharter
DTC, 10 March 2016
Cloud Computing
“Everybody's in it and nobody's in it. It's like
a cloud that everybody has given a little puff
of mist to, and then the cloud does all the
heavy thinking for everybody. I don't mean
there's really a cloud. I just mean it's
something like that.”
The Sirens of Titan, Kurt Vonnegut, 1959
Definition
• Gartner Group: “A style of computing in
which massively scalable and
elastic IT-enabled capabilities are
delivered as a service using Internet
technologies.”
Cloud Computing Service Models
• Software as a Service (SaaS)
• Platform as a Service (PaaS)
• Infrastructure as a Service (IaaS)
Amazon Web Services
• Development started in 2002
• In 2006, Amazon launched its Elastic
Compute Cloud (EC2) and S3 storage
service
• Amazon EC2/S3 was the first widely
accessible cloud computing infrastructure
service
Amazon Web Services (AWS)
• Computing
– EC2
– MapReduce
• Storage
– S3
– EBS
• Databases
– SimpleDB
– DynamoDB
• Others
AWS Computing
• Elastic Compute Cloud (EC2)
– Access to individual instances as you would
with any other machine
– Customisable configuration
– Auto Scaling
• Amazon Elastic MapReduce
– Process vast amounts of data
– Utilise Hadoop framework
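The MapReduce idea behind Elastic MapReduce can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle and reduce steps on one machine, not Hadoop or any AWS API; the function names and the sample input are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence; a real cluster spreads them over nodes."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input, chosen only for illustration.
counts = word_count(["the cloud does the thinking", "the cloud"])
```

In a real cluster the map and reduce calls would run in parallel on different instances, which is where the scalability comes from.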
AWS Storage
• Simple Storage Service (S3)
– Scalable cloud storage
– HTTP access
– Object store not a file system
– Cheap
• Elastic Block Store (EBS)
– Local storage
– For use with EC2 instances
– Take snapshot backups
– Fast
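To make "object store, not a file system" concrete, here is a toy in-memory model. It is a sketch under stated assumptions: `ToyObjectStore` and the key names are invented for illustration and are not the S3 API. It shows a flat key namespace in which whole objects are written and read by key, and "folders" are only a prefix convention.

```python
class ToyObjectStore:
    """In-memory stand-in for a bucket: a flat map from key to bytes."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        # An object is written or replaced as a whole; there are no partial edits.
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]

    def list(self, prefix=""):
        # There are no directories: listing just filters keys by prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


# Hypothetical keys, chosen only for illustration.
bucket = ToyObjectStore()
bucket.put("runs/2016-03-10/reads.fastq", b"ACGT")
bucket.put("runs/2016-03-10/summary.txt", b"ok")
bucket.put("notes.txt", b"hello")
listing = bucket.list("runs/")
```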
AWS Databases
• Amazon SimpleDB (noSQL)
– Ease of administration
• Amazon DynamoDB (noSQL)
– Scalability & durability
• Amazon Relational Database Service
(SQL)
– Efficient indexing & querying
• Amazon ElastiCache
– Fast data access
huMONGOus – scalable
– natural
What is a database?
A database is a collection of information that
is organized so that it can easily be
accessed, managed, and updated.
Why use a database?
• Reusability: You need a single, public
interface for your data storage that all parts of
your application can use.
• Availability: You need to be sure that your
application will always be able to read and
write data.
• Durability: You need to be sure that your
data will stick around.
• Scalability: You need your data storage to
be able to grow with your application.
Typical SQL and NoSQL databases
• SQL (Structured Query Language)
– Oracle
– MySQL
– Microsoft SQL
• NoSQL (Not Only SQL)
– Types: key-value, column, document, graph-based
– Examples: MongoDB, CouchDB, Riak
SQL vs MongoDB
http://sql-vs-nosql.blogspot.co.uk
MongoDB
• Distributed
• Document-oriented
• Schema-less storage solution
• Uses JSON-style documents
• Supports Python, PHP, Java, Ruby, C++, etc.
• Replica sets for failovers and speeding up
reads
• Sharding for high performance
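What "schema-less" and "JSON-style documents" mean can be sketched with plain Python dictionaries. This is an illustration only and does not use MongoDB or its drivers; the collection, field names and values are invented for the example.

```python
import json

# One "collection" holding documents with different shapes.
# All names and values here are hypothetical.
samples = [
    {"_id": 1, "name": "run-a", "reads": 1200},
    {"_id": 2, "name": "run-b", "reads": 800, "tags": ["nanopore"]},
    {"_id": 3, "name": "run-c", "operator": {"first": "Ada", "last": "L."}},
]


def fields_used(collection):
    """Return the set of top-level field names seen across all documents."""
    return {name for doc in collection for name in doc}


# Documents serialise to JSON text and back without declaring a schema first.
round_tripped = [json.loads(json.dumps(doc)) for doc in samples]
all_fields = fields_used(samples)
```

No table definition had to change when the second and third documents introduced new fields, which is the flexibility the slide refers to.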
SQL vs MongoDB (NoSQL)
SQL | MongoDB (NoSQL)
Requires structured data / well-designed schema | Semi-structured, unstructured & polymorphic data
Table based | Document based
Database atomicity | Document atomicity / eventual consistency
Rules enforced by database | Rules enforced by user
Scale-up | Scale-out (suitable for distributed computing)
Flexible & fast
Table - Who is the account holder
for account ID 3?
Document - Who is the account
holder for account ID 3?
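The same question can be asked of a table and of a document. The sketch below uses Python's built-in `sqlite3` for the relational side and a plain list of dictionaries for the document side. The table layout, names and data are assumptions made for illustration, since the original slide images are not available, and `find_one` is a hypothetical helper that only mimics the shape of a document lookup.

```python
import sqlite3

# Relational version: holders and accounts live in separate tables and are joined.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE holders  (holder_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE accounts (account_id INTEGER PRIMARY KEY,
                           holder_id INTEGER REFERENCES holders(holder_id));
    INSERT INTO holders  VALUES (10, 'Alice'), (11, 'Bob');
    INSERT INTO accounts VALUES (1, 10), (2, 11), (3, 10);
""")
row = conn.execute(
    "SELECT h.name FROM accounts a "
    "JOIN holders h ON h.holder_id = a.holder_id "
    "WHERE a.account_id = ?",
    (3,),
).fetchone()
sql_holder = row[0]
conn.close()

# Document version: each account document embeds its holder, so no join is needed.
account_docs = [
    {"_id": 1, "holder": {"name": "Alice"}},
    {"_id": 2, "holder": {"name": "Bob"}},
    {"_id": 3, "holder": {"name": "Alice"}},
]


def find_one(collection, query):
    """Return the first document whose top-level fields equal the query's values."""
    for doc in collection:
        if all(doc.get(k) == v for k, v in query.items()):
            return doc
    return None


doc_holder = find_one(account_docs, {"_id": 3})["holder"]["name"]
```

The relational query needs a join across two tables, while the document lookup reads one self-contained record. The trade-off is that the embedded holder name is duplicated across documents.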
Redundancy and Data Availability -
Replication
Scaling out - Sharding
• A means for partitioning data across
servers for high performance
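Partitioning data across servers can be sketched as a function from a shard key to a shard index. This is a minimal illustration of hash-based partitioning, assuming a fixed number of shards; it is not how a MongoDB cluster routes queries internally, and the names `shard_for` and `partition` are hypothetical.

```python
import hashlib


def shard_for(key, n_shards):
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % n_shards


def partition(documents, shard_key, n_shards):
    """Place every document on the shard chosen by its shard key."""
    shards = [[] for _ in range(n_shards)]
    for doc in documents:
        shards[shard_for(doc[shard_key], n_shards)].append(doc)
    return shards


# Hypothetical data: 100 account documents spread over 3 shards.
docs = [{"account_id": i} for i in range(1, 101)]
shards = partition(docs, "account_id", 3)
```

Because the hash is stable, a lookup for one account only has to contact the shard that `shard_for` returns, so each server handles a fraction of the load.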
Real-time Analytics
Usage Example 1: DNA Sequencing
• Real-time DNA sequencing
• Workflow: raw data (PC) → basecalling (AWS) → basecalled data (PC)
Usage Example 1: DNA Sequencing
• Use AWS EC2 computing and S3 storage
• Spot market – auction of unused EC2
instances
• Pay-per-use is an important economic
factor for Nanopore
• Use a combination of MongoDB and SQL
Usage Example 2: Genome Analysis
Genetic Variant Calling
Peter White et al., Ohio State University in collaboration with Genome Next
https://youtu.be/upAtK_SOtsY
Resources
• AWS Tutorials - https://qwiklabs.com
• MapReduce -
http://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html
• AWS for Research -
https://aws.amazon.com/grants/
• MongoDB - http://university.mongodb.com/
Definitions
• Instance: A copy of an Amazon Machine
Image running as a virtual server in the
AWS cloud
• Instance type: A specification that defines
the memory, CPU, storage capacity, and
hourly cost for an instance.
• Amazon Machine Image: AMIs are like a
template of a computer's root drive.
• Pixar accidentally wipes out nearly every
file of "Toy Story 2" about 10 months into
production. Fortunately, supervising
technical director Galyn Susman had just
become a new mom and had an entire
copy of the movie on her home computer
so that she could work from home. Woody
and Buzz live to see another day, and
movie.

Editor's notes

  1. In 2006, Amazon launched its Elastic Compute Cloud (EC2) as a commercial web service that allows small companies and individuals to rent computers on which to run their own computer applications. Other key factors that have enabled cloud computing to evolve include the maturing of virtualisation technology, the development of universal high-speed bandwidth, and universal software interoperability standards.
  2. AWS is a collection of cloud computing services. Amazon markets AWS as a service that provides large computing capacity more quickly and more cheaply than a client company building an actual physical server farm.
  3. Hadoop is a framework for distributing data and processing across a resizable cluster of EC2 instances.
  4. EMR: A web service that makes it easy to process large amounts of data efficiently. Amazon EMR uses Hadoop processing combined with several AWS products to do such tasks as web indexing, data mining, log file analysis, machine learning, scientific simulation, and data warehousing.
  5. Open source. Popular with start-ups.
  6. A simple application stores data in a file and wants to read it later. What if another programme wants to read the data? What if it is not written in the same language? What if multiple programmes use the data at the same time? The store becomes overloaded. Scale up or scale out? Scaling up means improving the hardware, which eventually runs out. Scaling out means distributing the data and managing it across multiple hosts.
  7. The term NoSQL came into wide use in 2009.
  8. MongoDB uses JSON-style documents to represent, query and modify data. It is similar to Couchbase and CouchDB. MongoDB's success is largely due to having easy-to-use, familiar tools.
  9. MongoDB uses memory-mapped files for its storage engine (data is structured per record).
  10. A shard is a replica set that contains a subset of the data for the sharded cluster. Together, the cluster’s shards hold the entire data set for the cluster.
  11. A virtual machine is a software computer that, like a physical computer, runs an operating system and applications. The virtual machine is comprised of a set of specification and configuration files and is backed by the physical resources of a host. Some instance types are designed for standard applications, whereas others are designed for CPU-intensive, memory-intensive applications, and so on. AMI contains the operating system and can also include software and layers of your application, such as database servers, middleware, web servers, and so on.