1. There is no magic, there is only awesome
Scientific computing with Amazon Web Services
Deepak Singh, Business Development Manager - Amazon Compute Services
Discovery 2015 Workshop, July 23, 2010
15. The “Living and Evolving” Cloud
AWS services and basic terminology
Most Applications Need:
  1. Compute
  2. Storage
  3. Messaging
  4. Payment
  5. Distribution
  6. Scale
  7. Analytics
[Diagram, top to bottom: Your Application; Amazon RDS, Amazon Elastic MapReduce JobFlows, Payment: Amazon FPS/DevPay; Amazon SimpleDB Domains, Amazon SQS Queues, Amazon SNS Topics; Auto-Scaling, Elastic LB, Amazon CloudWatch, Amazon CloudFront; Amazon S3 Objects and Buckets, Amazon EC2 Instances (On-Demand, Reserved, Spot), EBS Volumes and Snapshots, Amazon Virtual Private Cloud; Amazon Worldwide Physical Infrastructure (Geographical Regions, Availability Zones, Edge Locations)]
47. • Guest operating system doesn’t have elevated privilege level.
• Instances are completely isolated.
• Intrinsic network firewall.
• No access to raw devices.
• Virtualized disks, logically isolated, wiped clean after use.
[Diagram labels: Customer A, Customer B, … Customer Z; Hypervisor; Firewall; Physical Interface]
64. Amazon S3
• Cost-effective blob or large object storage
• Minimal relationships between objects
Amazon EC2 + EBS
• Multiple flavors of database engine
• Complete control
Amazon SimpleDB
• Zero administrative overhead (automatic handling of geo-redundant replication, index creation, database tuning)
• Automatic and elastic scaling of resources to meet request load
• High availability (multiple copies of data for reliability and failover)
• Flexibility (schema-less data store)
Amazon RDS
• Native access to database engine
• Easy migration path (existing code, tools, and applications are compatible)
• Key features of a relational database, such as joins or complex transactions
• Managed experience (offload common DBA tasks, lower total cost of ownership)
77. Access credentials
def access_key
  options.services['access-key']
end

def secret_key
  options.services['secret-key']
end
78.
class EC2
  attr_accessor :ec2, :instance_index, :image_index, :elastic_ip_index,
                :volume_index

  def initialize(access_key, secret_key)
    @ec2 = RightAws::Ec2.new(access_key, secret_key)
    @instance_index = {}
    @image_index = {}
    @elastic_ip_index = {}
    @volume_index = {}
  end
end
79.
class Instance
  attr_accessor :aws_hash, :elastic_ip

  def initialize(hash, elastic_ip = nil)
    @aws_hash = hash
    @elastic_ip = elastic_ip
  end

  def public_dns
    @aws_hash[:dns_name] || ""
  end

  def friendly_name
    public_dns.empty? ? status.capitalize : public_dns.split(".")[0]
  end

  def id
    @aws_hash[:aws_instance_id]
  end
end
80. Custom index
class EC2
  attr_accessor :ec2, :instance_index, :image_index, :elastic_ip_index,
                :volume_index

  def initialize(access_key, secret_key)
    @ec2 = RightAws::Ec2.new(access_key, secret_key)
    @instance_index = {}
    @image_index = {}
    @elastic_ip_index = {}
    @volume_index = {}
  end

  def instance_index
    if @instance_index.empty?
      @ec2.describe_instances.each do |i|
        # create an Instance object & add it to the index, keyed by instance id
        @instance_index[i[:aws_instance_id]] = Instance.new(i,
          get_elastic_ip_for_instance_id(i[:aws_instance_id]))
      end
    end
    return @instance_index
  end
end
81. Helper
class Instance
  attr_accessor :aws_hash, :elastic_ip

  def initialize(hash, elastic_ip = nil)
    @aws_hash = hash
    @elastic_ip = elastic_ip
  end

  def public_dns
    @aws_hash[:dns_name] || ""
  end

  def friendly_name
    public_dns.empty? ? status.capitalize : public_dns.split(".")[0]
  end

  def id
    @aws_hash[:aws_instance_id]
  end

  def running?
    status == "running"
  end
end
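The two classes above depend on RightAws and on a `status` method that the slides do not show. The following is a minimal, self-contained sketch of the same lazy-index idea that runs without AWS access. `FakeEc2Client`, `InstanceIndex`, the sample hashes, and the `:aws_state` key used by `status` are assumptions made for illustration, not code from the talk.

```ruby
# Hypothetical stand-in for RightAws::Ec2 so the sketch runs offline.
class FakeEc2Client
  def describe_instances
    [
      { aws_instance_id: "i-0001", dns_name: "ec2-1-2-3-4.compute-1.amazonaws.com", aws_state: "running" },
      { aws_instance_id: "i-0002", dns_name: nil, aws_state: "pending" }
    ]
  end
end

class Instance
  attr_reader :aws_hash, :elastic_ip

  def initialize(hash, elastic_ip = nil)
    @aws_hash = hash
    @elastic_ip = elastic_ip
  end

  # Assumption: the state string is stored under :aws_state.
  def status
    @aws_hash[:aws_state].to_s
  end

  def public_dns
    @aws_hash[:dns_name] || ""
  end

  # First DNS label when a public name exists, otherwise the capitalized state.
  def friendly_name
    public_dns.empty? ? status.capitalize : public_dns.split(".")[0]
  end

  def id
    @aws_hash[:aws_instance_id]
  end

  def running?
    status == "running"
  end
end

# Hypothetical wrapper: builds the id => Instance hash on first access, then reuses it.
class InstanceIndex
  def initialize(client)
    @client = client
    @index = {}
  end

  def instances
    if @index.empty?
      @client.describe_instances.each do |h|
        @index[h[:aws_instance_id]] = Instance.new(h)
      end
    end
    @index
  end
end

index = InstanceIndex.new(FakeEc2Client.new)
index.instances.each_value do |inst|
  puts "#{inst.id}: #{inst.friendly_name} (running? #{inst.running?})"
end
```

Keeping the index as a hash keyed by instance id means a later lookup by id needs no second API call, which appears to be the point of the "Custom index" callout.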
86-91.
# Modular
include_recipe "packages"
include_recipe "ruby"
include_recipe "apache2"

# OS aware
if platform?("centos", "redhat")
  if dist_only?
    # just the gem, we'll install the apache module within apache2
    package "rubygem-passenger"
    return
  else
    package "httpd-devel"
  end
else
  # Ruby syntax
  %w{ apache2-prefork-dev libapr1-dev }.each do |pkg|
    package pkg do
      action :upgrade
    end
  end
end

# Package aware
gem_package "passenger" do
  version node[:passenger][:version]
end

# Execute
execute "passenger_module" do
  command 'echo -en "\n\n\n\n" | passenger-install-apache2-module'
  creates node[:passenger][:module_path]
end
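The `creates` attribute on the `execute` resource is what keeps the module build from re-running on every Chef run: the command is skipped when the named file already exists. Below is a minimal plain-Ruby sketch of that guard. `run_once` and the temporary marker path are hypothetical names for illustration and are not part of Chef.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper (not a Chef API): run the block only when the file
# named by `creates` does not exist yet.
def run_once(creates)
  return :skipped if File.exist?(creates)
  yield
  :ran
end

Dir.mktmpdir do |dir|
  # Stand-in for node[:passenger][:module_path].
  module_path = File.join(dir, "mod_passenger.so")
  # Stand-in for the real installer command; it just creates the file.
  build = -> { FileUtils.touch(module_path) }

  first  = run_once(module_path, &build)
  second = run_once(module_path, &build)
  puts "first run: #{first}, second run: #{second}"
end
```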
104. 2-4% of servers
will die annually
Source: Jeff Dean, LADIS 2009
105. 1-5% of disk drives
will die every year
Source: Jeff Dean, LADIS 2009
107. 2.3% AFR in population of 13,250
3.3% AFR in population of 22,400
4.2% AFR in population of 246,000
Source: James Hamilton (http://perspectives.mvdirona.com)
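To make the annualized failure rates (AFR) concrete, the sketch below multiplies each quoted population by its AFR to get an expected number of failures per year and per day. The populations and rates are the ones on the slide. The `expected_failures` helper and the assumption that failures are spread evenly over 365 days are illustrative only.

```ruby
# Populations and AFR values as quoted on the slide.
FLEETS = [
  { population: 13_250,  afr_percent: 2.3 },
  { population: 22_400,  afr_percent: 3.3 },
  { population: 246_000, afr_percent: 4.2 }
].freeze

# Expected failures per year = population * AFR.
def expected_failures(population, afr_percent)
  population * afr_percent / 100.0
end

FLEETS.each do |f|
  per_year = expected_failures(f[:population], f[:afr_percent])
  puts format("%7d units at %.1f%% AFR: about %d failures/year, %.1f per day",
              f[:population], f[:afr_percent], per_year.round, per_year / 365.0)
end
```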
147. Amazon Elastic MapReduce
[Diagram: input data is placed in an input bucket in Amazon S3; the application is deployed through the web console or command line tools; Elastic MapReduce runs Hadoop on Amazon EC2 instances, reading the input dataset from the S3 input bucket and writing output results to the S3 output bucket; at the end the user is notified and gets the results.]
148. PREANNOUNCE - EXPAND/SHRINK CLUSTERS
Use Case: Increase speed of running job flows
Speed up job flow execution in response to changing requirements
Dynamically balance cost versus performance without restarting a job
[Diagram: Job Flow: Allocate 4 instances (Time remaining: 14 Hours); Expand to 9 instances (Time remaining: 7 Hours); Expand to 25 instances (Time remaining: 3 Hours)]
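The 14, 7, and 3 hour figures are consistent with a fixed amount of remaining work divided across more instances. The sketch below takes the 4 instances * 14 hours from the first stage as the remaining work and rounds up to whole hours. Ideal linear scaling is an assumption of this sketch, not a claim from the slide, and `hours_remaining` is a hypothetical helper.

```ruby
# Remaining work implied by the first stage: 4 instances * 14 hours.
REMAINING_INSTANCE_HOURS = 4 * 14

# Whole hours needed at a given cluster size, assuming ideal linear scaling
# (real job flows rarely scale perfectly).
def hours_remaining(instances, work = REMAINING_INSTANCE_HOURS)
  (work + instances - 1) / instances # integer ceiling division
end

[4, 9, 25].each do |n|
  puts "#{n} instances: #{hours_remaining(n)} hours remaining"
end
```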
149. Use Case: Agile Data Warehouse Cluster
Customize cluster size to support varying resource needs
Leverage flexibility to reduce costs and increase cluster utilization
[Diagram: Data Warehouse (Steady State): Allocate 9 instances; Data Warehouse (Batch Processing): Expand to 25 instances; Data Warehouse (Steady State): Shrink to 9 instances]
150. PREANNOUNCE - Integration with Spot Instances
Cost without Spot:
4 instances * 14 hrs * $0.50 = $28
Cost with Spot:
4 instances * 7 hrs * $0.50 = $14 +
5 instances * 7 hrs * $0.25 = $8.75
Total = $22.75
Savings: ~19%
[Diagram: Job Flow: Allocate 4 instances (Time remaining: 14 Hours); Expand to 9 instances (Time remaining: 7 Hours)]
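The comparison on this slide is instance count times hours times hourly rate, summed per pricing tier. Below is a minimal sketch of that arithmetic using the rates and instance counts from the slide. The `cost` helper and the constant names are introduced here for illustration.

```ruby
ON_DEMAND_RATE = 0.50 # dollars per instance-hour, from the slide
SPOT_RATE      = 0.25 # dollars per instance-hour, from the slide

# Cost of running `instances` machines for `hours` at `rate` dollars per hour.
def cost(instances, hours, rate)
  instances * hours * rate
end

# 4 on-demand instances for the full 14 hours.
without_spot = cost(4, 14, ON_DEMAND_RATE)

# 4 on-demand instances plus 5 Spot instances, finishing in 7 hours.
with_spot = cost(4, 7, ON_DEMAND_RATE) + cost(5, 7, SPOT_RATE)

savings_pct = (without_spot - with_spot) / without_spot * 100

puts format("without Spot: $%.2f", without_spot)
puts format("with Spot:    $%.2f", with_spot)
puts format("savings:      %.2f%%", savings_pct)
```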
169. Protein interactions @ U. Washington
Simple Python scripts automate the
management of 1000s of simultaneous
experiments using the EC2 API
http://faculty.washington.edu/danielt/
Source: Ed Lazowska
171. HEAVY-ION COLLISIONS
Problem: Quark matter physics conference
imminent but no compute resources handy
Solution: NIMBUS context broker allowed
researchers to provision 300 nodes and get the
simulations done
184. BLAT @ U. Penn
Map 100 million 100-base paired-end reads
A quad-core machine with 5 GB of RAM would take 16 days
30 high-memory instances; 32 hours; $195
Source: Angel Pizarro/John Hogenesch
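The figures on this slide imply a wall-clock speedup and an effective price per instance-hour. The sketch below works both out from the slide's numbers. The variable names are illustrative, and the per-hour figure is simply the quoted total divided by instance-hours, not a published price.

```ruby
INSTANCES        = 30
CLOUD_HOURS      = 32
TOTAL_COST_USD   = 195.0
SINGLE_BOX_HOURS = 16 * 24 # 16 days on the quad-core machine

instance_hours = INSTANCES * CLOUD_HOURS               # total compute consumed
speedup        = SINGLE_BOX_HOURS.to_f / CLOUD_HOURS   # wall-clock improvement
cost_per_hour  = TOTAL_COST_USD / instance_hours       # effective rate paid

puts "instance-hours used: #{instance_hours}"
puts format("wall-clock speedup: %.0fx", speedup)
puts format("effective cost: $%.3f per instance-hour", cost_per_hour)
```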
186. MapReduce for Genomics
Ben Langmead
http://bowtie-bio.sourceforge.net/crossbow/index.shtml
http://contrail-bio.sourceforge.net
http://bowtie-bio.sourceforge.net/myrna/index.shtml
196. deesingh@amazon.com
Twitter: @mndoci
Slides at http://slideshare.net/mndoci
Inspiration and material from Matt Wood, James Hamilton & Larry Lessig
By Oberazzi under a CC-BY-NC-SA license