Introduction to Cloud Computing

            Marin Dimitrov
         (technology watch #3)


               Apr 2010
Contents

• Introduction
• Cloud Computing platforms
• Programming for the Cloud
• Semantic Web on the Cloud




                        Cloud Computing   Apr 2010   #2
Contents



       Part I

Introduction




Cloud Computing - NIST definition

• “Cloud computing is a model for enabling ubiquitous, convenient, on-
  demand network access to a shared pool of configurable computing
  resources (e.g., networks, servers, storage, applications, and services)
  that can be rapidly provisioned and released with minimal management
  effort or service provider interaction.”
• Delivery models
    – IaaS (Infrastructure as a Service) - the consumer uses "fundamental
      resources" such as processing power, storage, networking components or
      middleware. The consumer can control the operating system, storage,
      applications and possibly networking
    – PaaS (Platform as a Service) - the consumer uses a hosting environment for
      their applications and has control over the applications (and some control
      over the hosting environment), but does not control the infrastructure on
      which they are running
    – SaaS (Software as a Service) - the consumer uses an application, but does not
      control the infrastructure on which it's running (OS, hardware)

XaaS spectrum – Google, Amazon, Microsoft


        Amazon                            Google                   Microsoft

SaaS    • Elastic Map Reduce              • Gmail
                                          • Google apps

PaaS    • SimpleDB                        • App Engine             • SQL Azure
        • Relational Database Service     • BigTable / MegaStore
        • Flexible Payment Service

IaaS    • EC2                             • Google Storage         • Blob storage
        • Simple Queue Service                                     • Azure Computing
        • Simple Notification Service                              • Queues
        • Elastic Block Storage                                    • Load Balancer
        • S3 / RRS
        • CloudWatch / Auto Scaling
        • Elastic Load Balancer



Cloud Computing - Essential characteristics
                        (NIST)
• Rapid elasticity – the ability to scale resources both up and down as
  needed. To the consumer, the Cloud appears to be infinite, and the
  consumer can purchase as much / little computing power as they need
• Measured service – aspects of the Cloud service are controlled and
  monitored by the Cloud provider. This is crucial for billing, access control,
  resource optimization & capacity planning
• On-demand self service – a consumer can use cloud services as needed
  without any human interaction with the cloud provider
• Ubiquitous network access – the Cloud provider’s capabilities are
  available over the network and can be accessed through standard
  mechanisms
• Resource Pooling – allows a Cloud provider to serve its consumers via a
  multi-tenant model - resources are (re)assigned according to consumer
  demand.
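The rapid-elasticity point can be made concrete with a small sizing rule. This is only a sketch: the function name, the per-instance capacity and the min/max bounds are assumptions for illustration, not part of the NIST text or of any provider's API.

```python
import math


def desired_instances(requests_per_sec: float,
                      capacity_per_instance: float,
                      min_instances: int = 1,
                      max_instances: int = 20) -> int:
    """Return how many instances to run for the current load.

    Scales up and down with demand, but never below `min_instances`
    or above `max_instances` (both hypothetical limits).
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


# Load rises, then falls again; the fleet follows it in both directions.
for load in (50, 950, 5000, 120):
    print(load, "req/s ->", desired_instances(load, capacity_per_instance=100))
```

The upper bound matters in practice: to the consumer the Cloud "appears to be infinite", but the bill is not.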

Cloud Computing - deployment models (NIST)

• Public cloud
   – Infrastructure owned by one organisation and sold as a service to 3rd parties
   – E.g. Amazon Web Services, Google AppEngine, Windows Azure
• Private cloud
   – Internal infrastructure for a single organisation (on or off-premise)
   – E.g. VMware vCloud, IBM Cloudburst, Microsoft Hyper-V
• Community cloud
   – Infrastructure shared by several organisations, targeting a specific
     community
   – E.g. OpenCirrus (HP, Intel, Yahoo, KIT, CMU, …)
• Hybrid cloud
   – Composition of the above
   – E.g. AWS Virtual Private Cloud

Cloud computing – business drivers

1. Business agility
   –   Faster time to market
       •   No major upfront commitment & investment in infrastructure
   –   Scalability & elasticity
       •   Instant on-demand provisioning
       •   Shifting the risk of over-/under-provisioning to the cloud provider

2. Focus
   –   Outsource non-core tasks to the cloud provider
3. Pay-as-you-go
   –   Speed up new project launching & rollout (start small, add resources
       when needed)
   –   No need for complex planning ahead
   –   Turn fixed costs (CapEx) into variable costs (OpEx)

Some cloud use cases

• Overflow buffer
   – Provision for the average load instead of over-provisioning for peak loads
• Seasonal business
   – E.g. Walmart has a 4:1 peak-to-average ratio (source?)
• Time-to-market for small startups
   – Less upfront investment, more focus on core competencies
• Experimental playground
   – Rollout experimental projects without major equipment purchases
• Speedup of large scale batch operations
   – 1000 servers for 1 hour cost the same as 1 server for 1000 hours
   – More cost-efficient computing (off-peak tariffs & time zones)
• Unforeseeable events
   – E.g. sudden traffic spikes to web sites (volcanoes, anyone?)
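
To put rough numbers on the overflow-buffer and seasonal cases, the sketch below compares the server-hours used by a fixed fleet sized for the peak with those of an elastic fleet that follows demand. The hourly profile is made up, chosen only so that it reproduces the 4:1 peak-to-average ratio quoted (without a source) above.

```python
from typing import Sequence, Tuple


def server_hours(hourly_demand: Sequence[int]) -> Tuple[int, int]:
    """Return (fixed, elastic) server-hours for one demand profile.

    fixed   - a fleet sized for the peak hour and kept running all the time
    elastic - a fleet resized every hour to match demand exactly
    """
    fixed = max(hourly_demand) * len(hourly_demand)
    elastic = sum(hourly_demand)
    return fixed, elastic


# Hypothetical day: 20 quiet hours needing 4 servers, 4 busy hours needing 40.
demand = [4] * 20 + [40] * 4
fixed, elastic = server_hours(demand)
average = sum(demand) / len(demand)

print(f"peak-to-average ratio: {max(demand) / average:.1f}")
print(f"fixed fleet:   {fixed} server-hours")
print(f"elastic fleet: {elastic} server-hours")
print(f"fixed-fleet utilisation: {elastic / fixed:.0%}")
```

With this profile the fixed fleet sits idle three quarters of the time, which is the over-provisioning the use cases above try to avoid.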
Cloud-able applications

• Typical characteristics
   –   Non mission critical
   –   Need >99% uptime
   –   Low bandwidth / higher latency tolerance
   –   Relaxed security requirements
   –   Few integration points
   –   E.g.
        • Batch operations (speedup at the same price!)
        • One-time large scale processing

• Barriers to cloud migration
   – Security & trust
   – Lack of SLA
   – Lack of standardization (vendor lock-in)

Cloud Computing – pros & cons




                            (C) Dion Hinchcliffe



Contents



                Part II

     Cloud Computing
         Platforms
AWS, Google AppEngine, Windows Azure



XaaS spectrum – Google, Amazon, Microsoft
                 (again)

        Amazon                            Google                   Microsoft

SaaS    • Elastic Map Reduce              • Gmail
                                          • Google apps

PaaS    • SimpleDB                        • App Engine             • SQL Azure
        • Relational Database Service     • BigTable / MegaStore
        • Flexible Payment Service

IaaS    • EC2                             • Google Storage         • Blob storage
        • Simple Queue Service                                     • Azure Computing
        • Simple Notification Service                              • Queues
        • Elastic Block Storage                                    • Load Balancer
        • S3 / RRS
        • CloudWatch / Auto Scaling
        • Elastic Load Balancer
        • Virtual Private Cloud


Amazon Web Services

• http://aws.amazon.com/
• Xen VMs, 1 ECU = 1.2GHz AMD Opteron, US/EU prices
EC2 instance    RAM (GB)   CU* (cores)   HDD (GB)   Bits   $/h on-demand   $/h spot   $/h reserved
S               1.7        1 (1)         160        32     0.085           0.03       0.03
L               7.5        4 (2)         850        64     0.34            0.13       0.12
XL              15         8 (4)         1690       64     0.68            0.24       0.24
High-mem XL     17.1       6.5 (2)       420        64     0.50            0.18       0.17
High-mem 2XL    34.2       13 (4)        850        64     1.20            0.43       0.42
High-mem 4XL    68.4       26 (8)        1690       64     2.40            0.82       0.84
High-CPU M      1.7        5 (2)         350        32     0.17            0.06       0.06
High-CPU XL     7          20 (8)        1690       64     0.68            0.24       0.24
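
Because on-demand instances are billed per instance-hour, cost depends only on the product of machines and hours. The sketch below works this out with the on-demand prices from the table (Apr 2010 figures); the function and dictionary names are illustrative, and billing granularity, data transfer and storage are ignored.

```python
# On-demand prices in USD per instance-hour, copied from the table above.
ON_DEMAND_USD_PER_HOUR = {
    "S": 0.085,
    "L": 0.34,
    "XL": 0.68,
    "High-mem XL": 0.50,
    "High-mem 2XL": 1.20,
    "High-mem 4XL": 2.40,
    "High-CPU M": 0.17,
    "High-CPU XL": 0.68,
}


def on_demand_cost(instance_type: str, instances: int, hours: int) -> float:
    """Cost in USD of running `instances` machines for `hours` hours each."""
    return ON_DEMAND_USD_PER_HOUR[instance_type] * instances * hours


wide = on_demand_cost("S", instances=1000, hours=1)     # finish in one hour
narrow = on_demand_cost("S", instances=1, hours=1000)   # finish in ~6 weeks

print(f"1000 x S for 1 h:    ${wide:.2f}")
print(f"1 x S for 1000 h:    ${narrow:.2f}")
```

Both runs cost the same, so a batch job that parallelises well gets its speedup at no extra compute cost.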


Amazon Web Services (2)

• Simple Storage Service (S3)
   – Eventually consistent blob storage (SLA available)
   – Max 5GB per object, REST+SOAP API
   – Storage $0.15/GB/mo, transfer $0.15/GB, $0.10 per 100K API calls
• Elastic Compute Cloud (EC2)
   – Xen VM, Amazon Machine Image (AMI), no SLA
• Elastic Block Storage (EBS)
   –   Up to 1TB storage to be used by EC2 instances (attached devices)
   –   Raw/unformatted block devices (create your own filesystem on top)
   –   Replicated
   –   $0.10/GB/mo, $0.10 per 1 million I/O ops (iostat)



Amazon Web Services (3)

• Simple Queue Service
   –   Persistent, reliable, secure, distributed queue (no SLA)
   –   Max message size 8KB, messages auto-deleted after 4 days
   –   Duplicate and out-of-order delivery may occur
   –   Price: $0.15/GB transfer, $0.10 per 100K API calls
• Simple Notification Service
   – Reliable, secure & scalable pub/sub service (no SLA)
   – Protocols: HTTP, e-mail, SQS
   – Price: $0.15/GB transfer, $0.06 per 100K API calls, price per 100K
     notifications: $0.06 (HTTP), $2.00 (e-mail), free (SQS)
• SimpleDB
   – Distributed column store (built on Erlang)
   – Consistent or eventually consistent reads, flexible schema
   – $0.14/hour consumed, $0.15/GB transfer, $0.25/GB/mo storage
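
Since Simple Queue Service may deliver a message twice or out of order, consumers are usually written to be idempotent. The sketch below shows one common way to do that, by remembering message IDs already handled. It does not use the real SQS API: the `(message_id, body)` tuple shape, the function name and the in-memory `seen` set are assumptions for illustration.

```python
from typing import Callable, Iterable, List, Set, Tuple

# Hypothetical message shape: (message_id, body). Not the SQS wire format.
Message = Tuple[str, str]


def process_once(messages: Iterable[Message],
                 handler: Callable[[str], None],
                 seen: Set[str]) -> int:
    """Run `handler` on each message body at most once per message ID.

    `seen` holds the IDs already handled; duplicates are skipped.
    Returns the number of messages actually handled.
    """
    handled = 0
    for message_id, body in messages:
        if message_id in seen:
            continue  # duplicate delivery, already processed
        handler(body)
        seen.add(message_id)  # record only after the handler succeeded
        handled += 1
    return handled


# "m2" arrives before "m1" and is then delivered a second time.
deliveries = [("m2", "resize b.png"), ("m1", "resize a.png"), ("m2", "resize b.png")]
done: List[str] = []
count = process_once(deliveries, done.append, seen=set())
print(count, done)
```

In a real deployment the set of seen IDs would need to live in shared, persistent storage, and the handler itself should not rely on arrival order.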
Amazon Web Services (4)

• Relational Database Service
   – MySQL (no SLA)
   – Automated backup and scaling
   – $0.11 to $3.10 per hour (instance type), $0.10/GB/mo storage, $0.10
     per million I/O ops, $0.15/GB transfer
• Elastic MapReduce
   – Based on Hadoop
   – Price: EC2 instance price + premium ($0.01 - $0.42/hour)
• CloudWatch, Auto Scaling, Elastic Load Balancer
   – Monitoring, auto scaling & load balancing for EC2
• Virtual Private Cloud


Google AppEngine

• http://code.google.com/appengine/
• Features
   –   custom JVM (lots of limitations)
   –   servlet container, JSP
   –   Datastore based on BigTable (column store, consistent; C+P in CAP terms)
   –   JDO/JPA
   –   Google infrastructure services: URL fetch, mail
   –   Memcache (in-memory distributed key/value cache)
   –   Task queues & scheduler
   –   Development: local dev server, Eclipse plugins, administration
• Pricing
   – traffic/GB $0.10 ($0.12); CPU/h $0.10; storage/GB/mo $0.15; e-mail
     $1 per 10K

Google AppEngine (2)




                        (C) Dan Sanderson / O’Reilly




Google AppEngine (3)

• Restrictions
   – Applications run in a restricted JVM sandbox
       • No threads, no System calls, limited reflection
   – No sub-process forking
   – Connections
       • Outbound – only URL fetch & mail
       • Inbound – only HTTP(S)
   – No filesystem writes (limited read access), use datastore instead
   – Limits
       •   Request duration – 30 sec
       •   Request/response size – 10 MB (datastore request/response – 1MB)
       •   file size – 10 MB, number of files – 3,000
       •   Datastore: entity size – 1 MB, property values – 1000, entities per batch – 500
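
One practical consequence of the batch limit is that bulk writes have to be split into chunks. The helper below is a language-neutral sketch written in Python: `batches` and `MAX_ENTITIES_PER_BATCH` are illustrative names, not part of the AppEngine SDK, and the actual datastore call is left out.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# "Entities per batch" limit quoted on the slide above.
MAX_ENTITIES_PER_BATCH = 500


def batches(items: Sequence[T], size: int = MAX_ENTITIES_PER_BATCH) -> Iterator[List[T]]:
    """Yield consecutive chunks of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# 1234 hypothetical entities end up in three batch calls.
entities = [f"entity-{i}" for i in range(1234)]
sizes = [len(chunk) for chunk in batches(entities)]
print(sizes)
```

Each chunk would then be passed to a single batch put, which keeps every call within the limit.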


Google AppEngine (4)

• Datastore
   – Based on BigTable, distributed column-store
       • Entities and multi-valued properties
       • Entities have unique key & a type (kind)
       • Flexible schema
   – Transactional, consistent
   – JDO/JPA interface
   – Example query:
         Select from Person
         where lastName = … && height < …
         order by height desc

• Queries
   – JDOQL: entity kind + property value restrictions + sort order
   – Cursors can be specified (query range)
   – query resultset is materialised in a predefined index
       • query execution only fetches data from the existing index
       • queries with same kind + property restriction operator (but different
         value filler) + same sort order share the same index
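
The index-sharing rule can be modelled in a few lines. This is an illustrative model only, not the real AppEngine index format: the `Query` type and `index_signature` function are hypothetical, and the filter values are made up. Two queries share an index when their kind, their (property, operator) pairs and their sort order match, whatever the filter values are.

```python
from typing import NamedTuple, Tuple


class Query(NamedTuple):
    kind: str
    filters: Tuple[Tuple[str, str, object], ...]  # (property, operator, value)
    order: Tuple[Tuple[str, str], ...]            # (property, direction)


def index_signature(query: Query) -> tuple:
    """Everything that determines the index: kind, filter shape, sort order.

    Filter values are deliberately dropped from the signature.
    """
    filter_shape = tuple((prop, op) for prop, op, _value in query.filters)
    return (query.kind, filter_shape, query.order)


# Same shape as the Person query on the slide, with made-up values.
q1 = Query("Person", (("lastName", "=", "Smith"), ("height", "<", 180)), (("height", "desc"),))
q2 = Query("Person", (("lastName", "=", "Jones"), ("height", "<", 165)), (("height", "desc"),))
q3 = q1._replace(order=(("height", "asc"),))  # different sort order

print(index_signature(q1) == index_signature(q2))  # same index
print(index_signature(q1) == index_signature(q3))  # needs another index
```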

Windows Azure

• http://www.microsoft.com/windowsazure/
• Components
   – Windows Azure
      • Fabric – management & monitoring of cloud services (Hyper-V)
      • Compute – hosted applications (.net, c++, java, …)
      • Storage – blob storage, tables, queues (REST interface)
   – SQL Azure
      • Cloud based MS SQL Server
   – AppFabric
      • Infrastructure services, Service registry
      • Access control

• Pricing
   – CPU/h $0.12; storage $0.15/GB/mo, transfer $0.10 ($0.15), storage
     transactions – $1 per 1 million

Windows Azure (2)




                      (C) David Chappell



Contents



          Part III

Programming for the
       Cloud
      Tools & APIs



Programming for the Cloud

• Amazon
   –   REST API
   –   AWS Java SDK (http://aws.amazon.com/sdkforjava/)
   –   AWS Toolkit for Eclipse (http://aws.amazon.com/eclipse)
   –   Typica (http://code.google.com/p/typica/)
   –   JetS3t, S3 only (http://jets3t.s3.amazonaws.com/index.html)
• Google AppEngine
   –   AppEngine SDK (dev server, admin tools, Eclipse plugins)
   –   Datastore: JDO, JPA, low-level Java API
   –   Memcache: JCache + low level Java API
   –   URL fetch: java.net + low level Java API
   –   Mail: javax.mail (JavaMail) + low level Java API
   –   Task queue, blob store, accounts: low level APIs

Programming for the Cloud (2)

• jClouds
   – http://code.google.com/p/jclouds/
   – Cloud interoperability framework (AWS, Google AppEngine*,
     Windows Azure, GoGrid)
   – Mostly storage oriented functionality
• Eucalyptus
   –   http://www.eucalyptus.com/
   –   Open source private cloud infrastructure
   –   AWS compatible (EC2, EBS, S3)
   –   Cross-hypervisor support
   (figure (C) Eucalyptus Inc.)




Don’t forget…

• Deploying on EC2 requires minimal to no modifications of
  existing software
• EC2 has some big machines: 70GB RAM / 8 CPU cores
• 1,000 servers for 1hr cost the same as 1 server for 1,000hrs
• Data traffic (in/out) of the Cloud can be expensive
• Storage relatively cheap
• Internal cloud traffic is free (AWS), e.g. accessing other
  applications/datasets on the Cloud
• CPU price: uptime (EC2) vs. computing cycles (AppEngine)
• EC2 spot instances (off-peak hours) are very, very cheap!
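
The uptime-versus-cycles point can be sketched with the two list prices quoted earlier ($0.085 per hour for an on-demand EC2 "S" instance, $0.10 per CPU-hour on AppEngine). The comparison assumes, purely for illustration, that a CPU-hour means the same thing on both platforms, which is not strictly true; traffic and storage are ignored and the function names are made up.

```python
EC2_SMALL_USD_PER_INSTANCE_HOUR = 0.085  # on-demand "S" price from the EC2 table
APP_ENGINE_USD_PER_CPU_HOUR = 0.10       # CPU/h price from the AppEngine slide


def uptime_cost(wall_clock_hours: float, instances: int = 1) -> float:
    """EC2 model: pay for every hour the instance is up, busy or idle."""
    return EC2_SMALL_USD_PER_INSTANCE_HOUR * wall_clock_hours * instances


def cycles_cost(cpu_hours: float) -> float:
    """AppEngine model: pay only for CPU time actually consumed."""
    return APP_ENGINE_USD_PER_CPU_HOUR * cpu_hours


def break_even_utilisation() -> float:
    """CPU utilisation above which the uptime model becomes the cheaper one."""
    return EC2_SMALL_USD_PER_INSTANCE_HOUR / APP_ENGINE_USD_PER_CPU_HOUR


# A hypothetical app that is up for a 720-hour month but busy 10% of the time.
month_hours = 720
print(f"uptime model: ${uptime_cost(month_hours):.2f}")
print(f"cycles model: ${cycles_cost(month_hours * 0.10):.2f}")
print(f"break-even utilisation: {break_even_utilisation():.0%}")
```

Under these assumptions a mostly idle application is cheaper when billed per cycle, while a machine that is busy most of the time is cheaper when billed per hour of uptime.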

Contents



         Part IV

Semantic Web on the
       Cloud


Semantic Web on the Cloud

• Public Data Sets on AWS
   – A lot of datasets hosted for free by Amazon
       • Freebase, UniGene, US Census, …
   – New data sets can be submitted too (after approval)
   – Full LOD cloud still not available (due to licensing issues)
• SaaS
   – Virtuoso (AWS hosted), OpenCalais, …
• “Semantic Cloud” initiatives (cloud interoperability & data
  integration)
   – E.g. fluidOps - Management & provisioning of semantic applications
     (SaaS) and datasources (DaaS) on the Cloud
       • Semantic Web apps as virtual appliances on the Cloud
       • LOD data sources as virtual resources on the Cloud (“Self-service”
         paradigm)

Unified Cloud Computing

• http://code.google.com/p/unifiedcloud/
• Uses RDF for cloud data interoperability




Useful and useless links

• http://groups.google.com/group/cloud-computing
• “An Essential Guide to Possibilities and Risks of Cloud
  Computing”
• “Talking To Your CFO About Cloud Computing”
• Nick Carr @ Atmosphere’2009
• Introducing the Windows Azure platform




Q&A




Questions?





Introduction to Cloud Computing

  • 1. Introduction to Cloud Computing Marin Dimitrov (technology watch #3) Apr 2010
  • 2. Contents • Introduction • Cloud Computing platforms • Programming for the Cloud • Semantic Web on the Cloud Cloud Computing Apr 2010 #2
  • 3. Contents Part I Introduction Cloud Computing Apr 2010 #3
  • 4. Cloud Computing - NIST definition • “Cloud computing is a model for enabling ubiquitous, convenient, on- demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.” • Delivery models – IaaS (Infrastructure as a Service) - the consumer uses "fundamental resources" such as processing power, storage, networking components or middleware. The consumer can control the operating system, storage, applications and possibly networking – PaaS (Platform as a Service) - the consumer uses a hosting environment for their applications and has control over the applications (and some control over the hosting environment), but does not control the infrastructure on which they are running – SaaS (Software as a Service) - the consumer uses an application, but does not control the infrastructure on which it's running (OS, hardware) Cloud Computing Apr 2010 #4
  • 5. XaaS spectrum – Google, Amazon, Microsoft • Elastic Map Reduce • Gmail SaaS • Google apps • SimpleDB • App Engine • SQL Azure PaaS • Relational DataStore • BigTable / MegaStore • Flexible Payment Service • EC2 • Google Storage • Blob storage IaaS • Simple Queue Service • Azure Computing • Simple Notification Service • Queues • Elastic Block Storage • Load Balancer • S3 / RRS • CloudWatch / Auto Scaling • Elastic Load Balancer Cloud Computing Apr 2010 #5
  • 6. Cloud Computing - Essential characteristics (NIST) • Rapid elasticity – the ability to scale resources both up and down as needed. To the consumer, the Cloud appears to be infinite, and the consumer can purchase as much / little computing power as they need • Measured service – aspects of the Cloud service are controlled and monitored by the Cloud provider. This is crucial for billing, access control, resource optimization & capacity planning • On-demand self service – a consumer can use cloud services as needed without any human interaction with the cloud provider • Ubiquitous network access – the Cloud provider’s capabilities are available over the network and can be accessed through standard mechanisms • Resource Pooling – allows a Cloud provider to serve its consumers via a multi-tenant model - resources are (re)assigned according to consumer demand. Cloud Computing Apr 2010 #6
  • 7. Cloud Computing - deployment models (NIST) • Public cloud – Infrastructure owned by some organisation but sold to 3rd parties – E.g. Amazon Web Services, Google AppEngine, Windows Azure • Private cloud – Internal infrastructure for a single organisation (on or off-premise) – E.g. VMware vCloud, IBM Cloudburst, Microsoft Hyper-V • Community cloud – Infrastructure shared by several organisations, targeting a specific community – E.g. OpenCirrus (HP, Intel, Yahoo, KIT, CMU, …) • Hybrid cloud – Composition of the above – E.g. AWS Virtual Private Cloud Cloud Computing Apr 2010 #7
  • 8. Cloud computing – business drivers 1. Business agility – Faster time to market • No major upfront commitment & investment in infrastructure – Scalability & elasticity • Instant on-demand provisioning • Shifting the risk of over-/under-provisioning to the cloud provider 2. Focus – Outsource non-core tasks to the cloud provider 3. Pay-as-you-go – Speed up new project launching & rollout (start small, add resources when needed) – No need for complex planning ahead – Turn fixed costs (CapEx) into variable costs (OpEx) Cloud Computing Apr 2010 #8
  • 9. Some cloud use cases • Overflow buffer – Avoid over-provisioning for peak loads, but just for the average load • Seasonal business – E.g. Wallmart has 4:1 peak-to-average ratio (source?) • Small startups time-to-market – Less upfront investment, more focus on core competencies • Experimental playground – Rollout experimental projects without major equipment purchases • Speedup of large scale batch operations – 1000 servers for 1 hour cost the same as 1 server for 1000 hours – More cost-efficient computing (off-peak tariffs & time zones) • Unforeseeable events – E.g. sudden traffic spikes to web sites (volcanoes, anyone?) 2010 Cloud Computing Apr #9
  • 10. Cloud-able applications
      • Typical characteristics
          – Non-mission-critical
          – Need >99% uptime
          – Low bandwidth / higher latency tolerance
          – Relaxed security requirements
          – Few integration points
          – E.g.
              • Batch operations (speedup at the same price!)
              • One-time large scale processing
      • Barriers to cloud migration
          – Security & trust
          – Lack of SLA
          – Lack of standardization (vendor lock-in)
  • 11. Cloud Computing – pros & cons (C) Dion Hinchcliffe
  • 12. Contents – Part II: Cloud Computing Platforms (AWS, Google AppEngine, Windows Azure)
  • 13. XaaS spectrum – Google, Amazon, Microsoft (again)
      • SaaS
          – Amazon: Elastic Map Reduce
          – Google: Gmail, Google apps
      • PaaS
          – Amazon: SimpleDB, Relational Database Service, Flexible Payment Service
          – Google: App Engine, BigTable / MegaStore
          – Microsoft: SQL Azure
      • IaaS
          – Amazon: EC2, Simple Queue Service, Simple Notification Service, Elastic Block Storage, S3 / RRS, CloudWatch / Auto Scaling, Elastic Load Balancer, Virtual Private Cloud
          – Google: Google Storage
          – Microsoft: Blob storage, Azure Computing, Queues, Load Balancer
  • 14. Amazon Web Services
      • http://aws.amazon.com/
      • Xen VMs, 1 ECU = 1.2GHz AMD Opteron, US/EU prices

      EC2 instance    RAM (GB)   CU* (cores)   HDD (GB)   Bit   On demand ($/h)   Spot ($/h)   Reserved ($/h)
      S               1.7        1 (1)         160        32    0.085             0.03         0.03
      L               7.5        4 (2)         850        64    0.34              0.13         0.12
      XL              15         8 (4)         1690       64    0.68              0.24         0.24
      High-mem XL     17.1       6.5 (2)       420        64    0.50              0.18         0.17
      High-mem 2XL    34.2       13 (4)        850        64    1.20              0.43         0.42
      High-mem 4XL    68.4       26 (8)        1690       64    2.40              0.82         0.84
      High-CPU M      1.7        5 (2)         350        32    0.17              0.06         0.06
      High-CPU XL     7          20 (8)        1690       64    0.68              0.24         0.24
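The price list makes the "1000 servers for 1 hour" claim easy to check. A minimal sketch, using the Small instance on-demand price from the table above; the function name is illustrative, and the calculation assumes plain per-instance-hour billing with no data transfer or storage charges.

```python
from decimal import Decimal

# On-demand price of the Small (S) instance from the table above, in USD/hour.
SMALL_ON_DEMAND = Decimal("0.085")


def batch_cost(servers, hours, price_per_hour):
    """Cost of running `servers` instances for `hours` hours each."""
    return servers * hours * price_per_hour


wide = batch_cost(1000, 1, SMALL_ON_DEMAND)    # 1000 servers for 1 hour
narrow = batch_cost(1, 1000, SMALL_ON_DEMAND)  # 1 server for 1000 hours

print(wide, narrow)
```

Both runs consume the same number of instance-hours, so they cost the same; the wide run just finishes about 1000 times sooner if the job parallelises well.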
  • 15. Amazon Web Services (2)
      • Simple Storage Service (S3)
          – Eventually consistent blob storage (SLA available)
          – Max 5GB per object, REST+SOAP API
          – Storage $0.15/GB/mo, transfer $0.15/GB, $0.10 per 100K API calls
      • Elastic Compute Cloud (EC2)
          – Xen VM, Amazon Machine Image (AMI), no SLA
      • Elastic Block Storage (EBS)
          – Up to 1TB storage to be used by EC2 instances (attached devices)
          – Raw/unformatted block devices (create your own filesystem on top)
          – Replicated
          – $0.10/GB/mo, $0.10 per 1 million I/O ops (iostat)
  • 16. Amazon Web Services (3)
      • Simple Queue Service
          – Persistent, reliable, secure, distributed queue (no SLA)
          – Message size 8KB, messages auto-deleted after 4 days
          – Duplicate and out-of-order delivery may occur
          – Price: $0.15/GB transfer, $0.10 per 100K API calls
      • Simple Notification Service
          – Reliable, secure & scalable pub/sub service (no SLA)
          – Protocols: HTTP, e-mail, SQS
          – Price: $0.15/GB transfer, $0.06 per 100K API calls, price per 100K notifications: $0.06 (HTTP), $2.00 (e-mail), free (SQS)
      • SimpleDB
          – Distributed column store (built on Erlang)
          – Consistent or eventually consistent reads, flexible schema
          – $0.14/hour consumed, $0.15/GB transfer, $0.25/GB/mo storage
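Because the queue may deliver a message more than once and out of order, consumers are usually written so that processing the same message twice is harmless. A minimal sketch of that idea in plain Python, with no AWS calls; the class name, the message IDs and the in-memory `seen_ids` set are all illustrative assumptions.

```python
class IdempotentConsumer:
    """Applies each message at most once, even if the queue redelivers it."""

    def __init__(self):
        self.seen_ids = set()  # IDs of messages already applied
        self.processed = []    # message bodies in the order they were applied

    def handle(self, message_id, body):
        """Return True if the message was applied, False if it was a duplicate."""
        if message_id in self.seen_ids:
            return False
        self.seen_ids.add(message_id)
        self.processed.append(body)
        return True


consumer = IdempotentConsumer()
# Simulated deliveries: "m2" arrives before "m1" and is later delivered again.
deliveries = [("m2", "b"), ("m1", "a"), ("m2", "b"), ("m3", "c")]
results = [consumer.handle(message_id, body) for message_id, body in deliveries]

print(results, consumer.processed)
```

This only handles duplicates. If ordering matters, the messages would also need something like a sequence number so the consumer can reorder or reject stale ones, and a real consumer would keep the seen IDs in durable storage rather than in memory.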
  • 17. Amazon Web Services (4)
      • Relational Database Service
          – MySQL (no SLA)
          – Automated backup and scaling
          – $0.11 to $3.10 per hour (instance type), $0.10/GB/mo storage, $0.10 per million I/O ops, $0.15/GB transfer
      • Elastic MapReduce
          – Based on Hadoop
          – Price: EC2 instance price + premium ($0.01 - $0.42/hour)
      • CloudWatch, Auto Scaling, Elastic Load Balancer
          – Monitoring, auto scaling & load balancing for EC2
      • Virtual Private Cloud
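The slide names monitoring and auto scaling only as features. The idea behind a monitoring-driven scaling rule can be sketched without any AWS API. Everything below (the function name, the 30%/70% thresholds, the instance bounds) is a hypothetical illustration, not how the AWS service is configured.

```python
def desired_instances(current, avg_cpu_percent, low=30, high=70, minimum=1, maximum=20):
    """Threshold rule: add one instance when busy, remove one when idle.

    The result is clamped to the [minimum, maximum] range.
    """
    if avg_cpu_percent > high:
        target = current + 1
    elif avg_cpu_percent < low:
        target = current - 1
    else:
        target = current
    return max(minimum, min(maximum, target))


print(desired_instances(4, 85))   # busy fleet
print(desired_instances(4, 10))   # idle fleet
print(desired_instances(4, 50))   # within the comfortable band
```

The lower and upper bounds matter as much as the thresholds: the minimum keeps the service reachable, and the maximum caps the bill during an unexpected spike.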
  • 18. Google AppEngine
      • http://code.google.com/appengine/
      • Features
          – Custom JVM (lots of limitations)
          – Servlet container, JSP
          – Datastore based on BigTable (column store, consistent, C+P)
          – JDO/JPA
          – Google infrastructure services: URL fetch, mail
          – Memcache (in-memory distributed key/value cache)
          – Task queues & scheduler
          – Development: local dev server, Eclipse plugins, administration
      • Pricing
          – traffic/GB $0.10 ($0.12); CPU/h $0.10; storage/GB/mo $0.15; e-mail $1 per 10K
  • 19. Google AppEngine (2) (C) Dan Sanderson / O’Reilly
  • 20. Google AppEngine (3)
      • Restrictions
          – Applications run in a restricted JVM sandbox
              • No threads, no System calls, limited reflection
          – No sub-process forking
          – Connections
              • Outbound: only URL fetch & mail
              • Inbound: only HTTP(S)
          – No filesystem writes (limited read access), use datastore instead
          – Limits
              • Request duration: 30 sec
              • Request/response size: 10 MB (datastore request/response: 1 MB)
              • File size: 10 MB, number of files: 3,000
              • Datastore: entity size 1 MB, property values 1000, entities per batch 500
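A limit such as 500 entities per batch means that bulk writes have to be split on the client side. A minimal sketch of the chunking step only; the integers stand in for datastore entities, and no AppEngine API is called (the function name is illustrative).

```python
MAX_ENTITIES_PER_BATCH = 500  # the per-batch limit quoted on the slide


def batches(items, size=MAX_ENTITIES_PER_BATCH):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


entities = list(range(1234))  # stand-ins for datastore entities
chunks = list(batches(entities))

print([len(chunk) for chunk in chunks])
```

Each chunk would then be written in its own datastore call, keeping in mind that the 30-second request limit also bounds how many chunks one request can handle.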
  • 21. Google AppEngine (4)
      • Datastore
          – Based on BigTable, distributed column-store
              • Entities and multi-valued properties
              • Entities have unique key & a type (kind)
              • Flexible schema
          – Transactional, consistent
          – JDO/JPA interface
      • Queries
          – JDOQL: entity kind + property value restrictions + sort order
          – Example: Select from Person where lastName = … && height < … order by height desc
          – Cursors can be specified (query range)
          – Query resultset is materialised in a predefined index
              • Query execution only fetches data from the existing index
              • Queries with same kind + property restriction operator (but different value filler) + same sort order share the same index
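The index-sharing rule can be made concrete by reducing a query to the parts that, according to the slide, decide which index serves it: kind, property + operator of each restriction, and sort order. A minimal sketch under that reading; the tuple representation, the function name and the filter values are hypothetical and unrelated to the actual AppEngine API.

```python
def index_signature(kind, filters, sort_order):
    """Reduce a query to the parts that determine which index serves it.

    filters    -- list of (property, operator, value); the value is ignored
    sort_order -- list of (property, direction)
    """
    filter_shape = tuple((prop, op) for prop, op, _value in filters)
    return (kind, filter_shape, tuple(sort_order))


order = [("height", "desc")]
# Same kind, same operators, same sort order; only the values differ.
q1 = index_signature("Person", [("lastName", "=", "Smith"), ("height", "<", 180)], order)
q2 = index_signature("Person", [("lastName", "=", "Jones"), ("height", "<", 150)], order)
# Different operator on height.
q3 = index_signature("Person", [("lastName", "=", "Smith"), ("height", ">", 180)], order)

print(q1 == q2, q1 == q3)
```

Under this reading, q1 and q2 would be answered from one index, while q3 would need another.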
  • 22. Windows Azure
      • http://www.microsoft.com/windowsazure/
      • Components
          – Windows Azure
              • Fabric: management & monitoring of cloud services (Hyper-V)
              • Compute: hosted applications (.NET, C++, Java, …)
              • Storage: blob storage, tables, queues (REST interface)
          – SQL Azure
              • Cloud based MS SQL Server
          – AppFabric
              • Infrastructure services, Service registry
              • Access control
      • Pricing
          – CPU/h $0.12; storage $0.15/GB/mo; transfer $0.10 ($0.15); storage transactions $1 per 1 million
  • 23. Windows Azure (2) (C) David Chappell
  • 24. Contents – Part III: Programming for the Cloud (Tools & APIs)
  • 25. Programming for the Cloud
      • Amazon
          – REST API
          – AWS Java SDK (http://aws.amazon.com/sdkforjava/)
          – AWS Toolkit for Eclipse (http://aws.amazon.com/eclipse)
          – Typica (http://code.google.com/p/typica/)
          – JetS3t (S3 only) http://jets3t.s3.amazonaws.com/index.html
      • Google AppEngine
          – AppEngine SDK (dev server, admin tools, Eclipse plugins)
          – Datastore: JDO, JPA, low-level Java API
          – Memcache: JCache + low level Java API
          – URL fetch: java.net + low level Java API
          – Mail: javax.mail (JavaMail) + low level Java API
          – Task queue, blob store, accounts: low level APIs
  • 26. Programming for the Cloud (2)
      • jClouds
          – http://code.google.com/p/jclouds/
          – Cloud interoperability framework (AWS, Google AppEngine*, Windows Azure, GoGrid)
          – Mostly storage oriented functionality
      • Eucalyptus
          – http://www.eucalyptus.com/
          – Open source private cloud infrastructure
          – AWS compatible (EC2, EBS, S3)
          – Cross-hypervisor support
      • Figure (C) Eucalyptus Inc.
  • 27. Don’t forget…
      • Deploying on EC2 requires minimal to no modifications of existing software
      • EC2 has some big machines: 70GB RAM / 8 CPU cores
      • 1,000 servers for 1hr cost the same as 1 server for 1,000hrs
      • Data traffic (in/out) of the Cloud can be expensive
      • Storage relatively cheap
      • Internal cloud traffic is free (AWS), e.g. accessing other applications/datasets on the Cloud
      • CPU price: uptime (EC2) vs. computing cycles (AppEngine)
      • EC2 spot instances (off-peak hours) are very, very cheap!
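The uptime vs. computing cycles distinction can be illustrated with the prices quoted on the earlier slides ($0.085/h for a Small EC2 instance on demand, $0.10 per CPU hour on AppEngine). A rough sketch only: it assumes that one EC2 instance-hour and one AppEngine CPU hour represent comparable compute, which the slides do not claim, and it ignores traffic, storage and any free quotas. The function names and the 720-hour month are illustrative.

```python
from decimal import Decimal

EC2_SMALL_PER_HOUR = Decimal("0.085")    # billed for every hour the instance is up
APPENGINE_CPU_PER_HOUR = Decimal("0.10")  # billed only for CPU time actually consumed


def ec2_cost(uptime_hours):
    """EC2 charges for uptime, regardless of how busy the instance is."""
    return uptime_hours * EC2_SMALL_PER_HOUR


def appengine_cost(uptime_hours, busy_fraction):
    """AppEngine charges for consumed CPU hours only."""
    return uptime_hours * busy_fraction * APPENGINE_CPU_PER_HOUR


month_hours = 720  # a 30-day month
always_on = ec2_cost(month_hours)
mostly_idle = appengine_cost(month_hours, Decimal("0.10"))  # app busy 10% of the time
# Utilization above which the uptime-based price becomes the cheaper one.
break_even = EC2_SMALL_PER_HOUR / APPENGINE_CPU_PER_HOUR

print(always_on, mostly_idle, break_even)
```

Under these assumptions a mostly idle application is far cheaper when billed by cycles, while a constantly busy one favours paying for uptime.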
  • 28. Contents – Part IV: Semantic Web on the Cloud
  • 29. Semantic Web on the Cloud
      • Public Data Sets on AWS
          – A lot of datasets hosted for free by Amazon
              • Freebase, UniGene, US Census, …
          – New data sets can be submitted too (after approval)
          – Full LOD cloud still not available (due to licensing issues)
      • SaaS
          – Virtuoso (AWS hosted), OpenCalais, …
      • “Semantic Cloud” initiatives (cloud interoperability & data integration)
          – E.g. fluidOps: management & provisioning of semantic applications (SaaS) and datasources (DaaS) on the Cloud
              • Semantic Web apps as virtual appliances on the Cloud
              • LOD data sources as virtual resources on the Cloud (“Self-service” paradigm)
  • 30. Unified Cloud Computing
      • http://code.google.com/p/unifiedcloud/
      • Uses RDF for cloud data interoperability
  • 31. Useful and useless links
      • http://groups.google.com/group/cloud-computing
      • “An Essential Guide to Possibilities and Risks of Cloud Computing”
      • “Talking To Your CFO About Cloud Computing”
      • Nick Carr @ Atmosphere’2009
      • Introducing the Windows Azure platform
  • 32. Q&A – Questions?