Grid Overview - Ian Stokes-Rees ijstokes@hkl.hms.harvard.edu
3. Slides and Contact
ijstokes@hkl.hms.harvard.edu
http://linkedin.com/in/ijstokes
http://slidesha.re/ijstokes-grid2011
http://www.sbgrid.org
http://portal.sbgrid.org
http://www.opensciencegrid.org
4. Scientific Research Today
• International collaborations
• IT becomes embedded in the research process: data, results, analysis, visualization
• Crossing institutional and national boundaries
• Computational techniques increasingly important
• ... and computationally intensive techniques as well
• requires use of high performance computing systems
• Data volumes are growing fast
• hard to share
• hard to manage
• Scientific software often difficult to use
• or to use properly
• Web based tools increasingly important
• but often disconnected from persisted and shared results
5. SBGrid Consortium
[Map of SBGrid Consortium member institutions and investigators]
Institutions: Cornell U., Washington U. School of Med., NE-CAT, Rosalind Franklin, NIH, UMass Medical, U. Washington, U. Maryland, Brandeis U., UC Davis, Tufts U., UCSF, Columbia U., Rockefeller U., Stanford, Yale U., CalTech, Harvard and Affiliates, Rice University, Vanderbilt Center for Structural Biology, WesternU, UCSD, Thomas Jefferson
Investigators: R. Cerione, T. Ellenberger, B. Crane, R. Oswald, D. Fremont, S. Ealick, C. Parrish, M. Jin, H. Sondermann, D. Harrison, M. Mayer, A. Ke, T. Gonen, W. Royer, E. Toth, N. Grigorieff, H. Stahlberg, K. Heldwein, JJ Miranda, Q. Fan, Y. Cheng, R. MacKinnon, A. Brunger, K. Garcia, T. Boggon, K. Reinisch, T. Jardetzky, D. Braddock, J. Schlessinger, Y. Ha, F. Sigworth, E. Lolis, F. Zhou, P. Bjorkman, W. Clemons, N. Beglova, A. Leschziner, G. Jensen, S. Blacklow, K. Miller, D. Rees, E. Nikonowicz, B. Chen, A. Rao, Y. Shamoo, J. Chou, T. Rapoport, Y.J. Tao, J. Clardy, M. Samso, W. Chazin, C. Sanders, M. Eck, P. Sliz, M. Swairjo, B. Eichman, B. Spiller, B. Furie, T. Springer, M. Egli, M. Stone, R. Gaudet, G. Verdine, B. Lacy, M. Waterman, M. Grant, G. Wagner, T. Nakagawa, M. Ohi, S.C. Harrison, L. Walensky, H. Viadiu, J. Hogle, S. Walker, J. Williams, D. Jeruzalmi, T. Walz, D. Kahne, J. Wang, T. Kirchhausen, S. Wong
Not Pictured:
University of Toronto: L. Howell, E. Pai, F. Sicheri; NHRI (Taiwan): G. Liou; Trinity College, Dublin: Amir Khan
6. Boston Life Sciences Hub
• Biomedical researchers
• Government agencies
• Life sciences
• Universities
• Hospitals
[Map label: Tufts University School of Medicine]
13. Study of Protein Structure and Function
[Image sequence with scale labels: 400 m, 1 mm, 10 nm]
• Shared scientific data collection facility
• Data intensive (10‐100 GB/day)
14. Cryo Electron Microscopy
• Previously, 110,000 images, managed by hand
• Now, robotic systems collect millions of images
• estimated 250,000 CPU-hours to reconstruct a model
17. Molecular Dynamics Simulations
1 fs time step
1 ns snapshot
1 µs simulation
1e6 steps
1000 frames
10 MB / frame
10 GB / sim
20 CPU-years
3 months (wall clock)
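The figures on this slide fit together, and a few lines of arithmetic make the relationships explicit. The sketch below rests on two assumptions that the slide does not state: that "1e6 steps" means time steps between snapshots, and that 1 GB = 1000 MB. The variable names are illustrative only.

```python
# Back-of-envelope check of the molecular dynamics figures on the slide.
# Times are held as integer femtoseconds to avoid floating point error.
FS_PER_NS = 10**6
FS_PER_US = 10**9

time_step_fs = 1                # 1 fs time step
snapshot_fs = 1 * FS_PER_NS     # one frame saved every 1 ns
simulation_fs = 1 * FS_PER_US   # 1 microsecond of simulated time
frame_mb = 10                   # 10 MB per frame

steps_per_snapshot = snapshot_fs // time_step_fs   # assumed meaning of "1e6 steps"
total_steps = simulation_fs // time_step_fs
frames = simulation_fs // snapshot_fs
storage_gb = frames * frame_mb / 1000              # assumes 1 GB = 1000 MB

# 20 CPU-years delivered in 3 months of wall clock time implies
# this many cores busy on average.
average_cores = 20 / (3 / 12)

print(steps_per_snapshot, total_steps, frames, storage_gb, average_cores)
```

Under these assumptions a single simulation keeps on the order of 80 cores busy for a full quarter, which is the scale at which shared grid resources start to matter.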
21. High Energy Physics
40 MHz bunch crossing rate
10 million data channels
1 kHz level 1 event recording rate
110 MB per event
14 hours per day, 7+ months / year
4 detectors
6 PB of data / year
globally distribute data for analysis (x2)
22. Open Science Grid
http://opensciencegrid.org
• US National Cyberinfrastructure
• Primarily used for high energy physics computing
• 80 sites
• ~100,000 job slots
• ~1,500,000 hours per day
• PB scale aggregate storage
• ~1 PB transferred each day
[Usage chart label: 5,073,293 hours (~570 years)]
24. [Screenshot of the Thai National Grid Center website]
• 6 May 2011: Academic seminar on grid and cloud technology. The Thai National Grid Center, Software Industry Promotion Agency (Public Organization), together with King Mongkut's University of Technology Thonburi (KMUTT), is hosting...
• 25 March 2011: Announcement of the results of the Grid Technology Innovat... project competition, held on 12-13 February 2011 (B.E. 2554)
• 28 February 2011: The R program for statistical analysis and research. R is software for statistical analysis and research, popular among people whose research work involves statistical computation.
• The government has a policy of reforming its operations and raising the capability of every part of the state so as to strengthen the private sector, by driving a strategy of competitiveness and sustainable national development, and it wants to develop the country into a Knowledge Based Society.
http://www.thaigrid.net/
26. Grid Opportunities
• New compute intensive workflows
• think big: tens or hundreds of thousands of hours finished in 1‐2 days
• sharing resources for efficient and large scale utilization
• Data intensive problems
• we mirror 20 GB of data to 30 computing centers
• Data movement, management, and archive
• Federated identity and user management
• labs, collaborations or ad‐hoc groups
• role‐based access control (RBAC) and IdM
• Collaborative environment
• Web‐based access to applications
29. The Browser as the Universal Interface
• If it isn’t already obvious to you
• Any interactive application developed today should be web‐based with a RESTful interface (if at all possible)
• A rich set of tools and techniques
• AJAX, HTML4/5, CSS, and JavaScript
• Dynamic content negotiation
• HTTP headers, caching, security, sessions/cookies
• Scalable, replicable, centralized, multi‐threaded, multi‐user
• Alternatives
• Command Line (CLI): great for scriptable jobs
• GUI toolkits: necessary for applications with high graphics or I/O demands
30. What is a web portal?
• A web‐based gateway to resources and data
• simplified access
• centralized access
• unified access (CGI, Perl, Python, PHP, static HTML, static files, etc.)
• Attempt to provide uniform access to a range of services and resources
• Data access via HTTP
• Leverage brilliance of Apache HTTPD and associated modules
31. SBGrid Science Portal Objectives
A. Extensible infrastructure to facilitate development and deployment of novel computational workflows
B. Web‐accessible environment for collaborative, compute and data intensive science
40. Experimental Data Access
• Collaboration
• Access Control
• Identity Management
• Data Management
• High Performance Data Movement
• Multi‐modal Access
42. Globus Online: High Performance Reliable 3rd Party File Transfer
[Diagram: Globus Online transfers among portal, cluster, data collection facility, lab file server, desktop, and laptop]
• GUMS: DN to user mapping
• VOMS: VO membership
48. Access Control
• Need a strong Identity Management environment
• individuals: identity tokens and identifiers
• groups: membership lists
• Active Directory/CIFS (Windows), Open Directory (Apple), FreeIPA (Unix) all LDAP‐based
• Need to manage and communicate Access Control policies
• institutionally driven
• user driven
• Need Authorization System
• Policy Enforcement Point (shell login, data access, web access, start application)
• Policy Decision Point (store policies and understand relationship of identity token and policy)
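The split between a Policy Enforcement Point and a Policy Decision Point can be sketched in a few lines. This is a minimal illustration, not the portal's implementation: the class names, the group name `example-lab`, the user `alice`, and the resource path are all hypothetical, and a real deployment would back the group lists with LDAP rather than a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyDecisionPoint:
    """Stores policies and decides whether an identity may perform an action."""
    # group name -> set of member identity tokens (usernames, DNs, ...)
    groups: dict = field(default_factory=dict)
    # (resource, action) -> set of group names allowed to perform it
    policies: dict = field(default_factory=dict)

    def is_allowed(self, identity: str, resource: str, action: str) -> bool:
        allowed_groups = self.policies.get((resource, action), set())
        return any(identity in self.groups.get(g, set()) for g in allowed_groups)


class PolicyEnforcementPoint:
    """Sits in front of a service and asks the PDP before doing anything."""

    def __init__(self, pdp: PolicyDecisionPoint):
        self.pdp = pdp

    def read(self, identity: str, resource: str) -> str:
        if not self.pdp.is_allowed(identity, resource, "READ"):
            raise PermissionError(f"{identity} may not READ {resource}")
        return f"contents of {resource}"


# Hypothetical group membership and policy, for illustration only.
pdp = PolicyDecisionPoint(
    groups={"example-lab": {"alice"}},
    policies={("/data/run42", "READ"): {"example-lab"}},
)
pep = PolicyEnforcementPoint(pdp)
print(pep.read("alice", "/data/run42"))
```

Keeping the decision logic in one place means each enforcement point (shell login, data access, web access) only needs to know how to ask the question, not how policies are stored.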
49. Access Control
• What is a user?
• .htaccess and .htpasswd
• local system user (NIS or /etc/passwd)
• portal framework user (proprietary DB schema)
• grid user (X.509 DN)
• What are we securing access to?
• Web pages?
• URLs?
• Data?
• Specific operations?
• Meta Data?
• What kind of policies do we enable?
• Simplify to READ WRITE EXECUTE LIST ADMIN
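One way to model the five simplified policies is as combinable bit flags. The sketch below is an assumption-laden illustration: the `acl` table, the user names, the resource path, and the rule that ADMIN implies every other permission are all hypothetical choices, not something the slide specifies.

```python
from enum import Flag, auto


class Permission(Flag):
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()
    LIST = auto()
    ADMIN = auto()


# Hypothetical access control list: resource -> {user identifier -> permissions}.
# The user identifier could be any of the identity types listed on the slide.
acl = {
    "/projects/demo": {
        "alice": Permission.READ | Permission.WRITE | Permission.LIST,
        "bob": Permission.READ | Permission.LIST,
    }
}


def has_permission(user: str, resource: str, needed: Permission) -> bool:
    """True if the user holds every permission in `needed`.

    Treating ADMIN as implying all other permissions is an assumption.
    """
    granted = acl.get(resource, {}).get(user, Permission(0))
    if Permission.ADMIN in granted:
        return True
    return (granted & needed) == needed


print(has_permission("alice", "/projects/demo", Permission.WRITE))
print(has_permission("bob", "/projects/demo", Permission.WRITE))
```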
50. Unified Account Management
Hierarchical LDAP database (standard schemas)
• user basics
• passwords
Relational DB (custom schemas)
• user custom profiles
• institutions
• lab groups
52. Harvard Catalyst brings together
the intellectual force, technologies,
and clinical expertise of Harvard
University and its affiliates and
partners to reduce the burden of
human illness.
54. Service Architecture
[Architecture diagram: SBGrid Science Portal @ Harvard Medical School and the external services it uses; flows shown for user, data, and computations]
• Portal components, by function
• monitoring: Ganglia, Nagios, RSV, pacct
• interfaces: Apache, GridSite, Django, WebDAV, Sage Math, R-Studio, shell CLI
• data: scp, GridFTP, SRM, file server, SQL DB
• computation: Condor, Cycle Server, VDT, Globus, glideinWMS, cluster
• ID mgmt: FreeIPA, LDAP, VOMS, GUMS, GACL
• External services
• GlobusOnline @Argonne
• GUMS, GridFTP + Hadoop, glideinWMS factory @UC San Diego
• MyProxy @NCSA, UIUC
• DOEGrids CA @Lawrence Berkeley Labs
• Gratia Acct'ing @FermiLab
• Monitoring @Indiana
• Open Science Grid
57. Acknowledgements & Questions
• Piotr Sliz
• Principal Investigator, head of SBGrid
• SBGrid Science Portal
• Daniel O’Donovan, Meghan Porter‐Mahoney
• SBGrid System Administrators
• Ian Levesque, Peter Doherty, Steve Jahl
• Globus Online Team
• Steve Tuecke, Ian Foster, Rachana Ananthakrishnan, Raj Kettimuthu
• Ruth Pordes
• Director of OSG, for championing SBGrid
Please contact me with any questions:
• Ian Stokes‐Rees
• ijstokes@hkl.hms.harvard.edu
• ijstokes@spmetric.com
Look at our work
• portal.sbgrid.org
• www.sbgrid.org
• www.opensciencegrid.org
59. Grid Architectural Details
• Resources
• Uniform compute clusters
• Managed via batch queues
• Local scratch disk
• Sometimes high perf. network (e.g. InfiniBand)
• Behind NAT and firewall
• No shell access
• Data
• Tape‐backed mass storage
• Disk arrays (100s TB to PB)
• High bandwidth (multi‐stream) transfer protocols
• File catalogs
• Meta‐data
• Replica management
• Information
• LDAP based most common (not optimized for writes)
• Domain specific layer
• Open problem!
• Fabric
• In most cases, assume functioning Internet
• Some sites part of experimental private networks
• Security
• Typically underpinned by X.509 Public Key Infrastructure
• Same standards as SSL/TLS and “server certs” for “https”
60. [Diagram: SBGrid User Community and National Federated Cyberinfrastructure]
• Facilitate interface between community and cyberinfrastructure
• Resources shown: TeraGrid, NERSC, Open Science Grid, Odyssey, Orchestra, EC2
61. Existing Security Infrastructure
• X.509 certificates
• Department of Energy CA
• Regional/Institutional RAs (SBGrid is an RA)
• X.509 proxy certificate system
• Users self‐sign a short‐lived passwordless proxy certificate used as a “portable” and “automated” grid processing identity token
• Similarities to Kerberos tokens
• Virtual Organizations (VO) for definitions of roles, groups, attrs
• Attribute Certificates
• Users can (attempt to) fetch ACs from the VO to be attached to proxy certs
• POSIX‐like file access control (Grid ACL)
63. Data Model
• Data Tiers
• VO‐wide: all sites, admin managed, very stable
• User project: all sites, user managed, 1‐10 weeks, 1‐3 GB
• User static: all sites, user managed, indefinite, 10 MB
• Job set: all sites, infrastructure managed, 1‐10 days, 0.1‐1 GB
• Job: direct to worker node, infrastructure managed, 1 day, <10 MB
• Job indirect: to worker node via UCSD, infrastructure managed, 1 day, <10 GB
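The data tiers read naturally as a small lookup table. The sketch below encodes them that way; the values are copied from the slide, while the class name, field names, and helper function are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataTier:
    name: str
    reach: str                   # where the data is placed
    managed_by: str              # who is responsible for it
    lifetime: str                # how long it is expected to live
    size: Optional[str] = None   # typical size; None where the slide gives none


DATA_TIERS = [
    DataTier("VO-wide", "all sites", "admin", "very stable"),
    DataTier("User project", "all sites", "user", "1-10 weeks", "1-3 GB"),
    DataTier("User static", "all sites", "user", "indefinite", "10 MB"),
    DataTier("Job set", "all sites", "infrastructure", "1-10 days", "0.1-1 GB"),
    DataTier("Job", "direct to worker node", "infrastructure", "1 day", "<10 MB"),
    DataTier("Job indirect", "to worker node via UCSD", "infrastructure", "1 day", "<10 GB"),
]


def tiers_managed_by(manager: str):
    """Return the names of all tiers managed by the given party."""
    return [tier.name for tier in DATA_TIERS if tier.managed_by == manager]


print(tiers_managed_by("user"))
```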
64. Data Management
quota
du scan
tmpwatch
conventions
workflow integration
Data Movement
scp (users)
rsync (VO‐wide)
grid‐ftp (UCSD)
curl (WNs)
cp (NFS)
htcp (secure web)
73. [Data flow diagram; red: push files, green: pull files]
1. user file upload
2. replicate gold standard
3. Autoreplicate
4. pull files from UCSD to WNs
5. pull files from local NFS to WNs
6. pull files from SBGrid to WNs
7. job results copied back to SBGrid
8a. large job results copied to UCSD
8b. later pulled to SBGrid
81. [Plot for MHC‐TCR: 2VLJ; axes: Log Likelihood Gain, Translation Z score; labeled points: “weak” solution 2nx5q2, “strong” solution 1im3a2]
82. • NEBioGrid Django Portal
Interactive dynamic web portal for workflow definition, submission, monitoring, and access control
• NEBioGrid Web Portal
GridSite based web portal for file‐system level access (raw job output), meta‐data tagging, X.509 access control/sharing, CGI
• PyCCP4
Python wrappers around CCP4 structural biology applications
• PyCondor
Python wrappers around common Condor operations
enhanced Condor log analysis
• PyOSG
Python wrappers around common OSG operations
• PyGACL
Python representation of GACL model and API to work with GACL files
• osg_wrap
Swiss army knife OSG wrapper script to handle file staging, parameter sweep, DAG, results aggregation, monitoring
• sbanalysis
data analysis and graphing tools for structural biology data sets
• osg.monitoring
tools to enhance monitoring of job set and remote OSG site status
• shex
Write bash scripts in Python: replicate commands, syntax, behavior
• xconfig
Universal configuration
83. Example Job Set
• 10k grid jobs
• approx 30k CPU hours
• 99.7% success rate
• 24 wall clock hours
[Figure: 10,000 jobs over 24 hours, by site (MIT, UWisc, Cornell, Buffalo, ND, Caltech, FNAL, UNL, HMS, Purdue, UCR, RENCI, SPRACE); job states: evicted (red), completed (green), held (orange); traces: local queue, remote queue, running]
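The headline numbers for this job set imply a few derived quantities that are useful when planning similar runs. The arithmetic below is a rough sketch that assumes each job occupies a single core and treats the slide's rounded figures as exact.

```python
# Figures taken from the slide (rounded there, treated as exact here).
jobs = 10_000
cpu_hours = 30_000          # "approx 30k CPU hours"
success_rate = 0.997
wall_clock_hours = 24

# Derived quantities; one core per job is an assumption.
cpu_hours_per_job = cpu_hours / jobs                  # average job length
average_running_jobs = cpu_hours / wall_clock_hours   # mean concurrency over the run
unsuccessful_jobs = round(jobs * (1 - success_rate))  # jobs that did not succeed

print(cpu_hours_per_job, average_running_jobs, unsuccessful_jobs)
```

In other words, the run averaged about three CPU hours per job and kept on the order of a thousand job slots busy for a day, with only a few dozen jobs failing.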
85. Typical Layered Environment
• Command line application (e.g. Fortran)
• Friendly application API wrapper
• Batch execution wrapper for N‐iterations
• Results extraction and aggregation
• Grid job management wrapper
• Web interface
• forms, views, static HTML results
• GOAL: eliminate shell scripts
• often found as “glue” language between layers
[Diagram layers: Fortran bin; Python API; Multi-exec wrapper and Result aggregator (Map-Reduce); Grid management; Web interface]
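The lower three layers can be sketched in Python without any shell glue. In the sketch below a one-line Python program stands in for the compiled command line application, and the function names `run_app`, `run_many`, and `aggregate` are hypothetical; grid job management and the web interface are left out.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a compiled command line application (e.g. a Fortran binary):
# a Python one-liner that squares its argument and prints the result.
APP = [sys.executable, "-c", "import sys; print(int(sys.argv[1]) ** 2)"]


def run_app(x: int) -> int:
    """API wrapper: hide argument passing and output parsing behind a function."""
    result = subprocess.run(APP + [str(x)], capture_output=True, text=True, check=True)
    return int(result.stdout.strip())


def run_many(inputs):
    """Multi-exec wrapper (the "map" step): run the application once per input."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(run_app, inputs))


def aggregate(results):
    """Result aggregator (the "reduce" step): combine the per-run outputs."""
    return {"runs": len(results), "total": sum(results), "max": max(results)}


summary = aggregate(run_many(range(1, 6)))
print(summary)
```

Because every layer is an ordinary Python function, a grid management layer can later swap the thread pool for remote job submission without touching the wrapper or the aggregator.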
86. REST
• Don’t try to read too much into the name
• REpresentational State Transfer: coined by Roy Fielding, co‐author of the HTTP protocol and contributor to the original Apache httpd server
• Idea
• The web is the world’s largest asynchronous, distributed, parallel computational system
• Resources are “hidden” but representations are accessible via URLs
• Representations can be manipulated via HTTP operations GET PUT POST HEAD DELETE and associated state
• State transitions are initiated by software or by humans
• Implication
• Clean URLs (e.g. Flickr)
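The mapping from HTTP operations to manipulations of a resource's representation can be shown without a web server. The sketch below is a toy dispatcher over an in-memory store; the `handle` function, the `/jobs/...` paths, and the choice of status codes are illustrative assumptions, and a real service would sit behind a server such as Apache httpd.

```python
import itertools

# In-memory store: URL path -> representation (a plain string here).
resources = {}
_ids = itertools.count(1)  # source of identifiers for POST-created resources


def handle(method: str, path: str, body: str = ""):
    """Dispatch one HTTP-style request and return (status_code, response_body)."""
    if method in ("GET", "HEAD"):
        if path not in resources:
            return 404, ""
        # HEAD reports the same status as GET but carries no body.
        return 200, (resources[path] if method == "GET" else "")
    if method == "PUT":
        # PUT stores the representation at exactly the given URL.
        created = path not in resources
        resources[path] = body
        return (201 if created else 200), ""
    if method == "POST":
        # POST to a collection creates a new member and returns its URL.
        new_path = f"{path.rstrip('/')}/{next(_ids)}"
        resources[new_path] = body
        return 201, new_path
    if method == "DELETE":
        if path not in resources:
            return 404, ""
        del resources[path]
        return 204, ""
    return 405, ""  # method not supported


print(handle("PUT", "/jobs/42", "queued"))
print(handle("GET", "/jobs/42"))
```

Each job has one clean URL, and its whole life cycle is expressed with the standard verbs rather than with action names embedded in the path.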
88. Cloud Computing: Industry solution to the Grid
• Virtualization has taken off in the past 5 years
• VMWare, Xen, VirtualPC, VirtualBox, QEMU, etc.
• Builds on ideas from VMS (i.e. old)
• (Good) System administrators are hard to come by
• And operating a large data center is costly
• Internet boom means there are companies that have figured out how to do this really well
• Google, Amazon, Yahoo, Microsoft, etc.
• Outsource IT infrastructure! Outsource software hosting!
• Amazon EC2, Microsoft Azure, RightScale, Force.com, Google Apps
• Oversimplified:
• You can’t install a cloud
• You can’t buy a grid
89. Is “Cloud” the new “Grid”?
• Grid is about mechanisms for federated, distributed, heterogeneous shared compute and storage resources
• standards and software
• Cloud is about on‐demand provisioning of compute and storage resources
• services
No one buys a grid. No one installs a cloud.
90. The interesting thing about Cloud Computing is that we’ve redefined Cloud Computing to include everything that we already do. . . . I don’t understand what we would do differently in the light of Cloud Computing other than change the wording of some of our ads.
Larry Ellison, Oracle CEO, quoted in the Wall Street Journal, September 26, 2008*
*http://blogs.wsj.com/biztech/2008/09/25/larry-ellisons-brilliant-anti-cloud-computing-rant/
91. When is cloud computing interesting?
• My definition of “cloud computing”
• Dynamic compute and storage infrastructure provisioning in a scalable manner providing uniform interfaces to virtualized resources
• The underlying resources could be
• “in‐house” using licensed/purchased software/hardware
• “external” hosted by a service/infrastructure provider
• Consider using cloud computing if
• You have operational problems/constraints in your current data center
• You need to dynamically scale (up or down) access to services and data
• You want fast provisioning, lots of bandwidth, and low latency
• Organizationally you can live with outsourcing responsibility for (some of) your data and applications
• Consider providing cloud computing services if
• You have an ace team efficiently running your existing data center
• You have lots of experience with virtualization
• You have a specific application/domain that could benefit from being tied to a large compute farm or disk array with great Internet connectivity