SlideShare une entreprise Scribd logo
1  sur  114
Télécharger pour lire hors ligne
Grid Computing and the
     Globus Toolkit
                Ben Clifford
 University of Chicago Computation Institute


                  ISSGC09
Who?
   I am Ben Clifford
    – University of Chicago Computation Institute
       > Swift workflow system / grid programming language

    – Formerly ISI, Los Angeles
       > MDS component of Globus Toolkit
   Over there, Ravi Madduri
    – Argonne National Laboratory, near Chicago




                                                             2
What is a Grid?
   Resource sharing
    – Computers, storage, sensors, networks, …
    – Sharing always conditional: issues of trust,
      policy, negotiation, payment, …
   Coordinated problem solving
    – Beyond client-server: distributed data
      analysis, computation, collaboration, …
   Dynamic, multi-institutional virtual
    organizations
    – Community overlays on classic org
      structures
    – Large or small, static or dynamic              3
An Old Idea …

   “The time-sharing computer system can unite
    a group of investigators …. one can conceive
    of such a facility as an … intellectual public
    utility.”
    – Fernando Corbato and Robert Fano, 1966
   “We will perhaps see the spread of ‘computer
    utilities’, which, like present electric and
    telephone utilities, will service individual
    homes and offices across the country.”
    – Len Kleinrock, 1967
                                                 4
Why Is this Hard or Different?
   Lack of central control
    – Where things run
    – When they run
    – Who can run
   Shared resources
    – Contention, variability
   Communication and coordination
    – Different sites implies different sys admins,
      users, institutional goals, and often socio-
      political constraints

                                                      5
So Why Do It?
   Computations that need to be done with a
    time limit
   Data that can’t fit on one site
   Data owned by multiple sites

   Applications that need to be run bigger,
    faster, more




                                               6
For Example:
              Digital Astronomy
   Digital observatories provide online
    archives of data at different wavelengths




   Ask questions such as: what objects are
    visible in infrared but not visible spectrum?



                                                    7
For Example:
 Cancer Biology
(Ravi, this afternoon)




                         8
What Kinds of Applications?
   Computation intensive
    – Interactive simulation (climate modeling)
    – Large-scale simulation and analysis (galaxy
      formation, gravity waves, event simulation)
    – Engineering (parameter studies, linked models)
   Data intensive
    – Experimental data analysis (e.g., physics)
    – Image & sensor analysis (astronomy, climate)
   Distributed collaboration
    – Online instrumentation (microscopes, x-ray)
      Remote visualization (climate studies, biology)
    – Engineering (large-scale structural testing)      9
Key Common Features
   The size and/or complexity of the problem
   Collaboration between people in several
    organizations
   Sharing computing resources, data,
    instruments




                                                10
Underlying Problem:
The Application-Infrastructure Gap


              Dynamic
               and/or
             Distributed
             Applications
   Shared Distributed Infrastructure
                             A           B


                         1           1



                                 9           9
                                             11
Grid Infrastructure
   Distributed use and management
    – Of physical resources
    – Of software services
    – Of communities and their policies




                                          12
Globus is…

   A collection of solutions to problems that
    come up frequently when building
    collaborative distributed applications
   Software for Grid infrastructure
   Tools to build applications that exploit Grid
    infrastructure
   Open source & open standards
   Enabler of a rich tool & service ecosystem




                                                    13
Globus is an Hour Glass

   Local sites have their own            Higher-Level Services
                                               and Users
    policies, installs – heterogeneity!
    – Queuing systems, monitors,
      network protocols, etc    Standard
   Globus unifies – standards! Interfaces
    – Common management
      abstractions & interfaces
    – eg built on Web services + WSRF
                                            Local heterogeneity

                                                             14
Globus is a Building Block
   Basic components for Grid functionality
    – Not turnkey solutions, but building blocks &
      tools for application developers & system
      integrators
   Highest-level services are often application
    specific, we let apps concentrate there
   Easier to reuse than to reinvent
    – Compatibility with other Grid systems
      comes for free
   We provide basic infrastructure to get you
    one step closer
                                                     15
Globus Philosophy
   Globus was first established as an open
    source project in 1996
   The Globus Toolkit is open source to:
    – Allow for inspection
       > for consideration in standardization processes

    – Encourage adoption
       > in pursuit of ubiquity and interoperability

    – Encourage contributions
       > harness the expertise of the community
   The Globus Toolkit is distributed under the
    (BSD-style) Apache License version 2
                                                          16
Globus Technology Areas
   Core runtime
    – Infrastructure for building new services
   Security
    – Apply uniform policy across distinct systems
   Execution management
    – Provision, deploy, & manage services
   Data management
    – Discover, transfer, & access large data
   Monitoring
    – Discover & monitor dynamic services
                                                     17
Globus Software: dev.globus.org
     Globus Projects
                                                         OGSA-DAI      GT4
     MPICH-
       G2               Java                               Data        Replica
                                 Delegation   MyProxy
                       Runtime                             Rep        Location

     GridWay             C                      GSI-
                                    CAS                  GridFTP       MDS4
                       Runtime                OpenSSH

    Incubator                                            Reliable
      Mgmt             Python
                                   C Sec       GRAM        File      GT4 Docs
                       Runtime
                                                         Transfer


 Incubator
 Projects              GAARDS MEDICUS Cog WF Virt WkSp              Others...
 GDTE       GridShib    OGRO     UGP       Dyn Acct Gavia JSC   DDM      Metrics
Introduce   PURSE      HOC-SA    LRMA       WEEP    Gavia MS    SGGC     ServMark

Common                     Execution                       Info
Runtime
               Security
                             Mgmt
                                           Data Mgmt
                                                         Services
                                                                         Other  18
Globus User Community

   Large & diverse
    – 10s of national Grids, 100s of applications,
      1000s of users; probably much more
    – Every continent except Antarctica
    – Applications ranging across many sciences
    – Dozens (at least) of commercial deployments
   Successful
    – Many production systems doing real work
    – Many applications producing real results
   Smart, energetic, demanding
    – Constant stream of new use cases & tools       19
Global
Community




            20
Examples of
        Production Scientific Grids
   APAC (Australia)
   China Grid
   China National Grid
   DGrid (Germany)
   EGEE
   NAREGI (Japan)
   Open Science Grid
   Taiwan Grid
   TeraGrid
   ThaiGrid
   UK Nat’l Grid Service

                                      21
More Specifically, I May Want To …
   Manage who is allowed to access my
    service or my experimental data or …
   Ensure reliable & secure distribution of
    data from my lab to my partners
   Run 10,000 jobs on whatever computers I
    can get hold of
   Monitor the status of the different
    resources to which I have access
   Create a service for use by my colleagues


                                                22
More Specifically, I May Want To …
   Manage who is allowed to access my
    service or my experimental data or …
   Ensure reliable & secure distribution of
    data from my lab to my partners
   Run 10,000 jobs on whatever computers I
    can get hold of
   Monitor the status of the different
    resources to which I have access
   Create a service for use by my colleagues


                                                23
Grid Security Concerns
   Control access to shared services
    – Address autonomous management, e.g.,
      different policy in different work groups
   Support multi-user collaborations
    – Federate through mutually trusted services
    – Local policy authorities rule
   Allow users and application communities to
    set up dynamic trust domains
    – Personal/VO collection of resources working
      together based on trust of user/VO

                                                    24
Virtual Organizations (VO)

                                                        Virtual Comm unity C


                                                                                                            Person E
                                                 Person B                         File server F1
                                                                                                          (Researcher)
                           Compute Server C1' (Adm inistrator)                        (disk A)
                 Person A
                                                                          Person D
         (Principal Investigator)
                                                                        (Researcher)




                                                  Person B
                                                                                                           Person E
                                                   (Staff)                Person D File server F1          (Faculty)
Compute Server C2           Compute Server C1                              (Staff) (disks A and B)
                    Person A                                                                         Person F
                    (Faculty)                                                                        (Faculty)
                                            Person C
                                            (Student)        Compute Server C3
                     Organization A                                              Organization B


            VO for each application or workload
            Carve out and configure resources for a
             particular use and set of users
                                                                                                                         25
Security Basics
   Privacy
    – Only the sender and receiver should be able
      to understand the conversation
   Integrity
    – Receiving end must know that the received
      message was the one from the sender
   Authentication
    – Users are who they say they are (authentic)
   Authorization
    – Is user allowed to perform the action

                                                    26
Globus Security

   Globus security is based on the Grid
    Security Infrastructure (GSI)
    – IETF standards
   Public-key-based authentication
    using X.509 certificates
    – X509 standard




                                           27
Authentication
         Private Key - known only by owner
         Public Key- known to everyone
         What one key encrypts, the other decrypts




Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch09s03.html
                                                                                 28
Authentication using
                           Digital Certificates

         Digital document that certifies a public key is
          owned by a particular user
         Signed by 3rd party – the Certificate Authority (CA)
         X509 standard
                                                                To know if you
                                                                should trust the
                                                                certificate, you
                                                                have to trust the
                                                                CA



Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch09s04.html
                                                                                 29
Requesting a Certificate
         To request a
          certificate a user
          starts by generating
          a key pair




                                 Private Key   Public Key

Rachana Ananthakrishnan
                                                            30
Certificate Request
         The user signs their
          own public key to                 Public Key
          form what is called a
          Certificate Request
         Email/Web upload
         Note private key is            Sign
          never sent anywhere

                                     Certificate
                                      Request

                                     Public Key


Rachana Ananthakrishnan
                                                         31
Certificate Issuance
         The CA then creates,
                                                      Certificate
          signs and issues a                           Request
          certificate for the
                                                  Public Key
          user, combining the
          public key and the
                                  Name
          identity




                                         Name
                                         Issuer
                                         Validity
                                         Public Key
                                         Signature
Rachana Ananthakrishnan
                                                                    32
Examples of CAs
   GILDA CA, Globus GCS
    – An online service that issues low-quality
      certificates. You have used GILDA CA.
   SimpleCA
    – part of Globus Toolkit. Simple to use.
   DOEgrids CA
    – production CA often used by users of the
      Open Science Grid
   MyProxy CA
    – part of Globus Toolkit

                                                  33
Authorization – the GridMap File
   Maps distinguished names (found in
    certificates) to local names (such as login
    accounts)
   Also an access control list for GSI enabled
    services




                                                  34
Delegation
         Allows another service on the grid to act
          on your behalf.
         For example, you ask Reliable File Transfer
          service to transfer files using GridFTP.




Rachana Ananthakrishnan                                 35
Proxy Certificate
         Proxy Certificate allows another user to act
          upon their behalf
           – Credential delegation




Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch10s05.html
                                                                                 36
Proxy Certificate
   Proxy empowers 3rd party to act upon your
    behalf
   Proxy certificate is signed by the end user,
    not a CA
   Proxy cert’s public key is a new one from
    the private-public key pair generated
    specifically for the proxy certificate
   Proxy also allows you to do single sign-on
    – Setup a proxy for a time period and you
      don’t need to sign in again


                                                   37
Proxy Certificate Chain




Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch10s05.html
                                                                                 38
Single Sign-on
   Don’t need to remember (or even know)
    ID/passwords for each resource.
   Automatically get a Grid proxy certificate
    for use with other Grid tools
   More secure
    – No ID/password is sent over the wire: not
      even in encrypted form
    – Proxy certificate expires in a few hours and
      then is useless to anyone else
    – Don’t need to write down 10 passwords
   It’s fast and it’s easy!
                                                     39
Globus and Delegation:
                    MyProxy
   Remote service that
    stores user credentials
    – Users request proxies
      for local use
    – Web Portals request
      user proxies for use with
      back-end Grid services
   Grid administrators
    can pre-load
    credentials in the
    server for users to
    retrieve when needed

                                      40
VOMS
     A community-level group
      membership system
     Database of user roles
       – Administrative tools
       – Client interface
     voms-proxy-init
       – Uses client interface to produce
         an attribute certificate (instead
         of proxy) that includes roles &
         capabilities signed by VOMS
         server
       – Works with non-VOMS services,
         but gives more info to VOMS-
         aware services
     Allows VOs to centrally manage
      user roles




                                             41
Globus Authorization Framework


   VOMS          Shibboleth         LDAP      …             PERMIS




                                                                Authorization
          Attributes                                              Decision




                              PIP     PIP         PIP        PDP
Globus Client
                                            Globus Server

                                                                       42
Globus Security: How It Works

                       Services




Compute
 Center

                                  Users


             VO

                                      43
Globus Security: How It Works

                       Services




Compute
 Center

                                  Users
      Rights


               VO
                     Rights
                                      44
Globus Security: How It Works

                                  Services




  Compute
   Center                  CAS


                                             Users
             Rights
Local policy
on VO identity        VO
or attribute
authority                        Rights
                                                 45
Globus Security: How It Works

                                      Services (running
                                      on user’s behalf)

                      Access

  Compute                                                 Rights
   Center                      CAS


                                                     Users
             Rights
Local policy
on VO identity          VO
or attribute
authority                            Rights
                                                             46
Globus Security: How It Works

Authz Callout                with Proxy
                             Certificates
                                             Services (running
                                             on user’s behalf)

                         Access

   Compute                                                       Rights
    Center                         CAS


                                                            Users
                Rights
Local policy
on VO identity             VO
or attribute
authority                                   Rights
                                                                    47
A Cautionary Note
   Grid security mechanisms are tedious to set up
    – If exposed to users, hand-holding is usually required
    – These mechanisms can be hidden from end users,
      but still used behind the scenes
   These mechanisms exist for good reasons.
    – Many useful things can’t be done without Grid
      security
    – It is unlikely that an ambitious project could go into
      production operation without security like this
    – Most successful projects end up using Grid security,
      but using it in ways that end users don’t see much




                                                               48
More Specifically, I May Want To …
   Create a service for use by my colleagues
   Manage who is allowed to access my
    service (or my experimental data or …)
   Ensure reliable & secure distribution of
    data from my lab to my partners
   Run 10,000 jobs on whatever computers I
    can get hold of
   Monitor the status of the different
    resources to which I have access


                                                49
File Management

   Stage/move large data to/from nodes
    – GridFTP for basic file movement
    – Reliable File Transfer (RFT)
   Locate data of interest
    – Replica Location Service (RLS)




                                          50
GridFTP: The Protocol
   A high-performance, secure, reliable data transfer
    protocol optimized for high-bandwidth wide-area
    networks
     – Based on standard FTP with well-defined
       extensions
     – Uses GSI security
     – Multiple data channels for parallel transfers
     – Partial file transfers
     – Third-party transfers
     – Reusable data channels
     – Command pipelining
    – Striping  multi-Gb/sec wide area transport
   GGF recommendation GFD.20
                                                         51
GridFTP the Service
   IPv6 Support
   Globus XIO for different transports
   Pluggable
    – Front-end: e.g., future WS control channel
    – Back-end: e.g., HPSS, cluster file systems
    – Transfer: e.g., UDP, NetBLT transport




                                                   52
Striped GridFTP Service

   Multiple nodes work together as a single logical
    GridFTP server
   Every node of the cluster is used to transfer
    data into/out of the cluster
    – Each node reads/writes
      only pieces they’re




                                                                                               Parallel Filesystem
                                        Parallel Filesystem
      responsible for
    – Head node coordinates transfers
   Multiple levels of parallelism
    – CPU, bus, NIC, disk etc.
    – Maximizes use of Gbit+ WANs                                  Striped Transfer
                                                                Fully utilizes bandwidth of
                                                              Gb+ WAN using multiple nodes.

                                                                                              53
GridFTP
   Integrated instrumentation: Developers can use client API and
    plug-in mechanism to leverage different instrumentation
    – Performance markers
    – Restart markers
    – Throughput performance
    – Netlogger style performance tracking
   TCP buffer size control
     – Tune buffers to latency of network
     – Regular FTP optimized for low latency networks, not
       tunable
   Dramatic improvements for high latency WAN transfers
     – 90% of network utilization possible
     – 27 GB/s achieved with commodity hardware

                                                              54
Working with Different Data Transports
   XIO: eXtensible Input/Output
   POSIX-like interfaces to support multiple
    protocols
   Protocol implementations encapsulated in
    a stack of drivers
    – transport drivers: TCP, UDP, UDT, disk
    – transform drivers: compression, security,
      logging




                                                  55
Globus XIO

                                             Network
                                           Network
                                             Protocol
                                          Network
                                           Protocol
                                          Protocol
              Globus XIO

Application                      Driver
                                                   Disk


                                 Driver

                                          Special Device




                                                           56
RFT – Reliable File Transfer
   A WSRF service for queuing and reliably
    performing file transfer requests
    – Server-to-server transfers
    – Checkpointing for restarts
    – Database back-end for failovers
   Allows clients to requests transfers and
    then “disappear”
    – No need to manage the transfer
    – Status monitoring available if desired


                                               57
Reliable File Transfer (RFT)

                          RFT Client
         Web Service
           invocation                        Optional
      (SOAP via https)                       notifications
                                                                Database
                         RFT Service                         preserves state
                                               Client API
                                               speaks GridFTP
                                               protocol


                         Multiple parallel
   GridFTP Control                           GridFTP Control
                          data channels
                            move files
    GridFTP Data                               GridFTP Data


Has transferred >900,000 files.                                            58
Globus Replica Location Service

   Why replicate files?
     – Fault tolerance: avoid single points of failure
     – Reduce latency: use “nearest” copy
   Logical File Name (LFN)
     – Location-independent identifier (name)
     – Example: foo
   Physical File Name (PFN)
     – Specific file identifier such as a URL
     – E.g.: gsiftp://myserver.mycompany.com/foo
   RLS maps between LFNs and PFNs
     – foo  gsiftp://myserver.mycompany.com/foo



                                                         59
LFNs and PFNs


    LFN to PFN mappings are often many-to-one
    Multiple PFNs may indicate different access to
     a file

access via GridFTP server

access via one NFS mount     foo  gsiftp://dataserver.mycompany.com/foo
                             foo  file://nodeA.mycompany.com/foo
access via 2nd NFS mount     foo  file://nodeB.mycompany.com/foo
                             foo  https://www.mycompany.com/foo
access via web server




                                                                       60
RLS services
   Local replica catalog (LRC): Catalog of LFN to PFN
    mappings
   Replica Location Index (RLI): Aggregate information
    about one or more LRCs
   Only the LFN content for LRC is aggregated
     – Each configured LRC sends list of LFNs to LRCs
     – PFNs and mappings not aggregated


                     RLI                RLI




             LRC     LRC     LRC     LRC      LRC

                                                          61
OGSA-DAI
   Grid Interfaces to Databases
    – Data access
       > Relational & XML Databases, semi-structured files
    – Data integration
       > Multiple data delivery mechanisms, data translation
   Sessions 40-41 next Tuesday




                                                               62
More Specifically, I May Want To …
   Manage who is allowed to access my
    service (or my experimental data or …)
   Ensure reliable & secure distribution of
    data from my lab to my partners
   Run 10,000 jobs on whatever computers I
    can get hold of
   Monitor the status of the different
    resources to which I have access
   Create a service for use by my colleagues


                                                63
Execution Management (GRAM)
   Common webservices interface to job
    schedulers / local resource managers
    – Unix, Condor, LSF, PBS, SGE, …
   More generally: interface for process
    execution management
    – Lay down execution environment
    – Stage data
    – Monitor & manage lifecycle
    – Kill it, clean up


                                            64
GRAM - Basic Job
    Submission and Control Service
   A uniform service interface
    for remote job submission
    and control
    – Includes file staging and I/O
      management
    – Includes reliability features
   GRAM is not a scheduler.
    – No scheduling
    – No metascheduling, brokering,
      workflow
    – Often used as a front-end to
      schedulers, and often used by
      metaschedulers, brokers,
      workflow engines.
                                      65
Using GRAM vs Building a Service
   GRAM is intended for jobs that
     – are arbitrary programs
     – need stateful monitoring or credential
       management
     – Where file staging is important
   If the application is lightweight, with
    modest input/output, may be a better
    candidate for hosting directly as a WSRF
    service


                                                66
GRAM4 Architecture
                          GRAM on a Compute Element



                                                Local
                                          resource manager
          GRAM                             eg Condor, PBS
Client




         services   Transfer
                    request                                  User
                                                              job
               RFT File
                                   GridFTP
               Transfer




          GridFTP
                                                               67
Resource Specification Language (RSL)
   For complicated jobs, use RSL to specify the job
<job>
<executable>/bin/echo</executable>
<argument>this is an example_string </argument>
<argument>Globus was here</argument>
<stdout>${GLOBUS_USER_HOME}/stdout</stdout>
<stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>




                                                       68
At Most Once Submission
   You may specify a UUID with your job
    submission
   If you’re not sure the submission worked,
    you may submit the job again with the
    same UUID
   If the job has already been submitted, the
    new submission will have no effect
   If you do not specify a UUID, one will be
    generated for you


                                                 69
Staging Data
   GRAM’s RSL allows many
    fileStageIn/fileStageOut directives to bring
    input files in before your job and take
    output files away after your job.
   The transfers will be executed by RFT and
    GridFTP




                                                   70
RSL Substitutions
   GRAM will perform some variable
    substitutions for you
    –   GLOBUS_USER_HOME
    –   GLOBUS_USER_NAME
    –   GLOBUS_SCRATCH_DIR
    –   GLOBUS_LOCATION
   SCRATCH_DIR will be a compute-node
    local high-speed storage if defined, or
    GLOBUS_USER_HOME if not



                                              71
Choosing User Accounts
   You may be authorized to use more than
    one account at the remote site
   By default, the first listed in the grid-
    mapfile will be used
   You may request a specific user account
    using the <localUserId> element




                                                72
Multijobs
   You may specify more than one <job>
    element in a <multijob>
   Used by MPICH-G to support MPI jobs




                                          73
Workspace Service



                                   Policy
         Negotiate access
          Initiate activity
Client    Monitor activity         Activity
          Control activity
                                Environment


             Interface        Resource provider
                                                  74
Nimbus – Science Clouds




                            Virtual
Client
                           Machine




                       Resource provider
                                           75
More Specifically, I May Want To …
   Create a service for use by my colleagues
   Manage who is allowed to access my
    service (or my experimental data or …)
   Ensure reliable & secure distribution of
    data from my lab to my partners
   Run 10,000 jobs on whatever computers I
    can get hold of
   Monitor the status of the different
    resources to which I have access


                                                76
Monitoring and Discovery System
                  (MDS4)
   Grid-level monitoring system
    – Aid user/agent to identify host(s) on which
      to run an application
    – Warn on errors
   Uses standard interfaces to provide
    publishing of data, discovery, and data
    access, including subscription/notification
    – WS-ResourceProperties, WS-
      BaseNotification, WS-ServiceGroup
   Functions as an hourglass to provide a
    common interface to lower-level
    monitoring tools                                77
Information Users :
      Schedulers, Portals, Warning Systems, etc.




                                          WS standard
                                          interfaces for
Standard Schemas
                                          subscription,
(GLUE schema, eg)
                                          registration,
                                          notification


Cluster monitors
(Ganglia, Hawkeye,
Clumon, and                              Queuing systems
Nagios)                  Services       (PBS, LSF, Torque)
                     (GRAM, RFT, RLS)                 78
MDS4 Components
   Information providers
    – Monitoring is a part of every WSRF service
    – Non-WS services are also be used
   Collective services
    – Index Service – a way to aggregate data
    – Trigger Service – a way to be notified of changes
   Clients
    – WebMDS
    – wsrf-query command line tool

   All of the tool are schema-agnostic, but
    interoperability needs a well-understood
    common language
                                                          79
Information Providers:
              Globus Services
   GRAM
    – Queue and cluster information
   Reliable File Transfer Service (RFT)
    – Service status data, number of active
      transfers, transfer status, information about
      the resource running the service
   Replica Location Service (RLS)
    – (not a web service)
    – Location and status of replica catalogs
   Others: CAS, your own providers

                                                      80
Information Providers:
          Cluster and Queue Data
   Interfaces to Hawkeye, Ganglia, CluMon, Nagios
    – Basic host data (name, ID), processor information,
      memory size, OS name and version, file system
      data, processor load data
    – Some condor/cluster specific data
    – This can also be done for sub-clusters, not just at
      the host level
   Interfaces to PBS, Torque, LSF
    – Queue information, number of CPUs available and
      free, job count information, some memory statistics
      and host info for head node of cluster




                                                            81
MDS4 Index Service
   Index Service is both registry and cache
    – Datatype and data provider info, like a
      registry (UDDI)
    – Last value of data, like a cache
   In memory default approach
    – DB backing store currently being developed
      to allow for very large indexes
   Can be set up for a site or set of sites, a
    specific set of project data, or for user-
    specific data only
   Can be a multi-rooted hierarchy
    – No *global* index

                                                   82
MDS4 Trigger Service
   Subscribe to a set of resource properties
   Evaluate that data against a set of pre-
    configured conditions (triggers)
   When a condition matches, action occurs
    – Email is sent to pre-defined address
    – Website updated


   Similar functionality in Hawkeye



                                                83
WebMDS User Interface
   Web-based interface to WSRF resource
    property information
   User-friendly front-end to Index Service
   Uses standard resource property requests
    to query resource property data
   XSLT transforms to format and display
    them
   Customized pages are simply done by
    using HTML form options and creating your
    own XSLT transforms
   Sample page:
    – http://mds.globus.org:8080/webmds/webm
      ds?info=indexinfo&xsl=servicegroupxsl
                                                84
85
Working with TeraGrid
   Large US project across 9 different sites
   Explore metascheduling approaches
   -> Need a common source of data with a
    standard interface for basic scheduling info
   Collects data from Ganglia, Hawkeye,
    CluMon, and Nagios




                                                   86
87
DOE Earth System Grid

Goal: Enable
sharing &
analysis of
high-volume
data from
advanced
earth system
models



www.earthsystemgrid.org
                                            88
Monitoring Overall System Status

     Monitored data are collected in
      MDS4 Index service
     Information providers check
      resource status at a configured
      frequency
      – Currently, every 10 minutes
     Report status to Index Service
     Information in Index Service is
      queried by ESG Web portal
     Used to generate overall picture of
      state of ESG resources
     Displayed on ESG Web portal page

Ann Chervenak, USC/ISI
                                            89
Web Service Basics
         Web Services are basic distributed
          computing technology that let us construct
          client-server interactions




Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s02.html   90
Web Service Basics 2
   Web services are platform independent and
    language independent
    – Client and servers: different languages,
      different environments
   Web service is *not* a website
    – Web service is accessed by software, not
      humans
   Web services are ideal for loosely coupled
    systems


                                                 91
WSDL: Web Services
                   Description Language



                                     Define expected messages for a service,
                                     and their (input or output parameters)


                                     An interface groups together a number of
                                     messages (operations)




Bind an Interface via a definition
to a specific transport (e.g.            The network location where the service is
HTTP) and messaging (e.g.                implemented , e.g. http://localhost:8080
SOAP) protocol
                                                                                92
Let’s talk about state
         Plain Web services are stateless




Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html   93
However, Many Grid
                     Applications Require State




Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html   94
Keep the Web Service
                     and the State Separate

      Instead of putting state in a Web
       service, we keep it in a resource
      Each resource has a unique key




Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html   95
Resources Can Be Anything Stored


                   Web Service
                       +
                    Resource
                       =
                   WS-Resource


                   Address of a WS-
                   resource is called
                   an end-point
                   reference
                                    96
Standard Interfaces

                                    Service information
                                    State representation
            GetRP                    – Resource
                                     – Resource Property
          GetMultRPs
                                    State identification
            SetRP                    – Endpoint Reference
Client                  Web
           QueryRPs
                       Service      State Interfaces
           Subscribe                 – GetRP, QueryRPs,
           SetTerm
                                       GetMultipleRPs, SetRP
            Time                    Lifetime Interfaces
            Destroy                  – SetTerminationTime
                                     – ImmediateDestruction
                                    Notification Interfaces
                                     – Subscribe
                                     – Notify
                                    ServiceGroups             97
Globus and Web Services

                                        User Applications



                                      Custom         Globus
     (e.g., Apache Axis)
      Globus Container




                                                                    and Admin
                                       WSRF        WSRF Web




                                                                     Registry
                           Custom
                            Web       Services      Services
                           Services

                                      WS-A, WSRF, WS-Notification


                                      WSDL, SOAP, WS-Security



Globus Core: Java , C (fast, small footprint), Python                           98
The GT4 Distribution
Globus Software: dev.globus.org
     Globus Projects
                                                         OGSA-DAI      GT4
     MPICH-
       G2               Java                               Data      Replica
                                 Delegation   MyProxy
                       Runtime                             Rep      Location

     GridWay             C                      GSI-
                                    CAS                  GridFTP       MDS4
                       Runtime                OpenSSH

    Incubator                                            Reliable
      Mgmt             Python
                                   C Sec       GRAM        File     GT4 Docs
                       Runtime
                                                         Transfer


 Incubator
 Projects              GAARDS MEDICUS Cog WF Virt WkSp
 GDTE       GridShib    OGRO     UGP       Dyn Acct Gavia JSC   DDM     Metrics
Introduce   PURSE      HOC-SA    LRMA       WEEP    Gavia MS    SGGC   ServMark

Common                     Execution                       Info
Runtime
               Security
                             Mgmt
                                           Data Mgmt
                                                         Services          100
                                                                        Other
dev.globus

   Governance model based on Apache Jakarta
    – Consensus based decision making
   Globus software is organized as several
    dozen “Globus Projects”
    – Each project has its own “Committers”
      responsible for their products
    – Cross-project coordination through shared
      interactions and committers meetings
   A “Globus Management Committee”
    – Overall guidance and conflict resolution

                                                  101
http://dev.globus.org

  Guidelines
   (Apache
   Jakarta)

Infrastructure
 (CVS, email,
bugzilla, Wiki)




   Projects
   Include
      …



                                          102
Tested Platforms
   Debian                   SGI Altix (IA64
   Fedora Core               running Red Hat)
   FreeBSD                  SuSE Linux
   HP/UX                    Tru64 Unix
   IBM AIX                  Apple MacOS X (no
   Red Hat                   binaries)
   Sun Solaris
                             Windows – Java
                              components only

List of binaries and known platform-specific
   install bugs at
http://www.globus.org/toolkit/docs/4.0/admin/
  docbook/ ch03.html
                                                  103
Installation in a nutshell
   Quickstart guide is very useful
    http://www.globus.org/toolkit/docs/4.0/
    admin/docbook/quickstart.html

   Verify your prereqs!
   Security – check spellings and permissions
   Globus is system software – plan
    accordingly



                                                 104
General Globus Help and Support
   Globus toolkit help lists list
    – gt-user@globus.org
    – gt-dev@globus.org
    – http://dev.globus.org/wiki/
       Mailing_Lists
   Each project has specific lists
   Bugzilla
    – bugzilla.globus.org

                                      105
Incubator Process in dev.globus
   Entry point for new Globus projects
   Incubator Management Project (IMP)
    – Oversees incubator process form first
      contact to becoming a Globus project
    – Quarterly reviews of current projects
    – Process being debugged by “Incubator
      Pioneers”
http://dev.globus.org/wiki/Incubator/
  Incubator_Process


                                              106
Current Incubator Projects
            dev.globus.org
   Distributed Data        GridShib               Portal-based
    Management (DDM)        Grid Toolkit Handle     User Registration
                             System (gt-hs)          Service (PURSe)
   Dynamic Accounts
                            Higher Order           ServMark
   Gavia-Meta
    Scheduler                Component Service      SJTU GridFTP
                             Architecture (HOC-      GUI Client
   Gavia- Job               SA)                     (SGGC)
    Submission Client       Introduce              UCLA Grid Portal
   Grid Authentication     Local Resource          Software (UGP)
    and Authorization        Manager Adaptors       WEEP
    with Reliably            (LRMA)                 Cog Workflow
    Distributed Services    Metrics                Virtual
    (GAARDS)                MEDICUS                 Workspaces
   Grid Development        Open GRid OCSP
    Tools for Eclipse        (Online Certificate
    (GDTE)                   Status Protocol)

                                                                    107
GridWay Meta-Scheduler
         Scheduler virtualization layer on top of Globus
          services
          – A LRM-like environment for submitting, monitoring,
            and controlling jobs
          – Submit jobs to the Grid, without having to worry
            about the details of exactly which local resource will
            run the job
          – A policy-driven job scheduler
          – Accounting
          – Fault detection & recovery
          – Arrays of jobs, DAGs




GridWay: http://www.gridway.org                                      108
How Can You Contribute?
            Create a New Project
   Do you have a project you’d like to
    contribute?
   Does your software solve a problem you
    think the Globus community would be
    interested in?
   Contact incubator-committers@globus.org




                                              109
Contribute to an Existing Project
   Contribute code, documentation, design
    ideas, and feature requests
   Joining the mailing lists
    – *-dev, *-user, *-announce for each project
    – See the project wiki page at dev.globus.org
   Chime in at any time
   Regular contributors can become
    committers, with a role in defining project
    directions
http://dev.globus.org/wiki/How_to_contribute

                                                    110
Summary: Grids are About …


Enabling “coordinated resource sharing
& problem solving in dynamic, multi-
institutional virtual organizations.”
     (Source: “The Anatomy of the Grid”)

    Access to shared resources
       Virtualization, allocation, management
    With predictable behaviors
       Provisioning, quality of service
    In dynamic, heterogeneous environments
       Standards-based interfaces and protocols
                                                   111
… By Providing
            Open Infrastructure

   Web services standards
    – State, notification, security, …
   Services that enable access to resources
    – Service-enable new & existing resources
    – E.g., GRAM on computer, GridFTP on
      storage system, custom application services
    – Uniform abstractions & mechanisms
   Tools to build applications that exploit this
    infrastructure
    – Registries, security, data management, …
   A rich tool & service ecosystem                 112
More Specifically,
          Making it Possible to …
   Create a service for use by my colleagues
   Manage who is allowed to access my
    service (or my experimental data or …)
   Ensure reliable & secure distribution of
    data from my lab to my partners
   Run 10,000 jobs on whatever computers I
    can get hold of
   Monitor the status of the different
    resources to which I have access
   And so on …
                                                113
For More Information
   Globus Alliance
    – http://www.globus.org
   Dev.globus
    – http://dev.globus.org
   Upcoming Events
    – http://dev.globus.org/wiki/Outreach
   Globus Solutions
    – http://www.globus.org/solutions/



                                            114

Contenu connexe

Tendances

Grid Computing - Collection of computer resources from multiple locations
Grid Computing - Collection of computer resources from multiple locationsGrid Computing - Collection of computer resources from multiple locations
Grid Computing - Collection of computer resources from multiple locationsDibyadip Das
 
Applications of SOA and Web Services in Grid Computing
Applications of SOA and Web Services in Grid ComputingApplications of SOA and Web Services in Grid Computing
Applications of SOA and Web Services in Grid Computingyht4ever
 
Grid Computing
Grid ComputingGrid Computing
Grid Computingabhiritva
 
Inroduction to grid computing by gargi shankar verma
Inroduction to grid computing by gargi shankar vermaInroduction to grid computing by gargi shankar verma
Inroduction to grid computing by gargi shankar vermagargishankar1981
 
Grid computing 2007
Grid computing 2007Grid computing 2007
Grid computing 2007Tank Bhavin
 
Computing Outside The Box
Computing Outside The BoxComputing Outside The Box
Computing Outside The BoxIan Foster
 
Grid Computing (An Up-Coming Technology)
Grid Computing (An Up-Coming Technology)Grid Computing (An Up-Coming Technology)
Grid Computing (An Up-Coming Technology)LJ PROJECTS
 
Grid computing notes
Grid computing notesGrid computing notes
Grid computing notesSyed Mustafa
 
Cs6703 grid and cloud computing book
Cs6703 grid and cloud computing bookCs6703 grid and cloud computing book
Cs6703 grid and cloud computing bookkaleeswaranme
 
Grid computing [2005]
Grid computing [2005]Grid computing [2005]
Grid computing [2005]Raul Soto
 
Introduction to Grid Computing
Introduction to Grid ComputingIntroduction to Grid Computing
Introduction to Grid Computingabhijeetnawal
 
Grid computing the grid
Grid computing the gridGrid computing the grid
Grid computing the gridJivan Nepali
 

Tendances (20)

Grid Computing - Collection of computer resources from multiple locations
Grid Computing - Collection of computer resources from multiple locationsGrid Computing - Collection of computer resources from multiple locations
Grid Computing - Collection of computer resources from multiple locations
 
Grid computing
Grid computingGrid computing
Grid computing
 
Grid computing
Grid computingGrid computing
Grid computing
 
Applications of SOA and Web Services in Grid Computing
Applications of SOA and Web Services in Grid ComputingApplications of SOA and Web Services in Grid Computing
Applications of SOA and Web Services in Grid Computing
 
Grid computing
Grid computingGrid computing
Grid computing
 
Grid Computing
Grid ComputingGrid Computing
Grid Computing
 
Inroduction to grid computing by gargi shankar verma
Inroduction to grid computing by gargi shankar vermaInroduction to grid computing by gargi shankar verma
Inroduction to grid computing by gargi shankar verma
 
Grid computing 2007
Grid computing 2007Grid computing 2007
Grid computing 2007
 
Computing Outside The Box
Computing Outside The BoxComputing Outside The Box
Computing Outside The Box
 
Grid Computing (An Up-Coming Technology)
Grid Computing (An Up-Coming Technology)Grid Computing (An Up-Coming Technology)
Grid Computing (An Up-Coming Technology)
 
Grid computing notes
Grid computing notesGrid computing notes
Grid computing notes
 
Grid computing
Grid computingGrid computing
Grid computing
 
grid computing
grid computinggrid computing
grid computing
 
Grid Computing
Grid ComputingGrid Computing
Grid Computing
 
4. the grid evolution
4. the grid evolution4. the grid evolution
4. the grid evolution
 
Cs6703 grid and cloud computing book
Cs6703 grid and cloud computing bookCs6703 grid and cloud computing book
Cs6703 grid and cloud computing book
 
Grid computing
Grid computingGrid computing
Grid computing
 
Grid computing [2005]
Grid computing [2005]Grid computing [2005]
Grid computing [2005]
 
Introduction to Grid Computing
Introduction to Grid ComputingIntroduction to Grid Computing
Introduction to Grid Computing
 
Grid computing the grid
Grid computing the gridGrid computing the grid
Grid computing the grid
 

En vedette

En vedette (12)

Grid Computing Certification
Grid Computing CertificationGrid Computing Certification
Grid Computing Certification
 
Grid Presentation
Grid PresentationGrid Presentation
Grid Presentation
 
Deploying Grid Services Using Apache Hadoop
Deploying Grid Services Using Apache HadoopDeploying Grid Services Using Apache Hadoop
Deploying Grid Services Using Apache Hadoop
 
Grid Computing
Grid ComputingGrid Computing
Grid Computing
 
Cs6703 grid and cloud computing unit 1
Cs6703 grid and cloud computing unit 1Cs6703 grid and cloud computing unit 1
Cs6703 grid and cloud computing unit 1
 
Grid computing
Grid computingGrid computing
Grid computing
 
Grid computing ppt 2003(done)
Grid computing ppt 2003(done)Grid computing ppt 2003(done)
Grid computing ppt 2003(done)
 
Grid computing by vaishali sahare [katkar]
Grid computing by vaishali sahare [katkar]Grid computing by vaishali sahare [katkar]
Grid computing by vaishali sahare [katkar]
 
Grid Computing
Grid ComputingGrid Computing
Grid Computing
 
Grid computing ppt
Grid computing pptGrid computing ppt
Grid computing ppt
 
Grid computing Seminar PPT
Grid computing Seminar PPTGrid computing Seminar PPT
Grid computing Seminar PPT
 
1. GRID COMPUTING
1. GRID COMPUTING1. GRID COMPUTING
1. GRID COMPUTING
 

Similaire à Session19 Globus

stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...
stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...
stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...NETWAYS
 
stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...
stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...
stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...NETWAYS
 
GridComputing-an introduction.ppt
GridComputing-an introduction.pptGridComputing-an introduction.ppt
GridComputing-an introduction.pptNileshkuGiri
 
grid mining
grid mininggrid mining
grid miningARNOLD
 
OCC Overview OMG Clouds Meeting 07-13-09 v3
OCC Overview OMG Clouds Meeting 07-13-09 v3OCC Overview OMG Clouds Meeting 07-13-09 v3
OCC Overview OMG Clouds Meeting 07-13-09 v3Robert Grossman
 
MPLS/SDN 2013 Intercloud Standardization and Testbeds - Sill
MPLS/SDN 2013 Intercloud Standardization and Testbeds - SillMPLS/SDN 2013 Intercloud Standardization and Testbeds - Sill
MPLS/SDN 2013 Intercloud Standardization and Testbeds - SillAlan Sill
 
Azure Brain: 4th paradigm, scientific discovery & (really) big data
Azure Brain: 4th paradigm, scientific discovery & (really) big dataAzure Brain: 4th paradigm, scientific discovery & (really) big data
Azure Brain: 4th paradigm, scientific discovery & (really) big dataMicrosoft Technet France
 
Cloud computing vs grid computing
Cloud computing vs grid computingCloud computing vs grid computing
Cloud computing vs grid computing8neutron8
 
Scientific Cloud Computing: Present & Future
Scientific Cloud Computing: Present & FutureScientific Cloud Computing: Present & Future
Scientific Cloud Computing: Present & Futurestratuslab
 
Grid and Cloud Computing Lecture-2a.pptx
Grid and Cloud Computing Lecture-2a.pptxGrid and Cloud Computing Lecture-2a.pptx
Grid and Cloud Computing Lecture-2a.pptxDrAdeelAkram2
 
Monitoring applications on cloud - Indicthreads cloud computing conference 2011
Monitoring applications on cloud - Indicthreads cloud computing conference 2011Monitoring applications on cloud - Indicthreads cloud computing conference 2011
Monitoring applications on cloud - Indicthreads cloud computing conference 2011IndicThreads
 
g-Social - Enhancing e-Science Tools with Social Networking Functionality
g-Social - Enhancing e-Science Tools with Social Networking Functionalityg-Social - Enhancing e-Science Tools with Social Networking Functionality
g-Social - Enhancing e-Science Tools with Social Networking FunctionalityNicholas Loulloudes
 
Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...
Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...
Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...IOSR Journals
 
Unit i introduction to grid computing
Unit i   introduction to grid computingUnit i   introduction to grid computing
Unit i introduction to grid computingsudha kar
 
Privacy preserving public auditing for secured cloud storage
Privacy preserving public auditing for secured cloud storagePrivacy preserving public auditing for secured cloud storage
Privacy preserving public auditing for secured cloud storagedbpublications
 
IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...
IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...
IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...Christian Esteve Rothenberg
 

Similaire à Session19 Globus (20)

stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...
stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...
stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...
 
stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...
stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...
stackconf 2023 | SCS: Buildig Open Source Cloud and Container Infrastructure ...
 
GridComputing-an introduction.ppt
GridComputing-an introduction.pptGridComputing-an introduction.ppt
GridComputing-an introduction.ppt
 
grid mining
grid mininggrid mining
grid mining
 
GRID COMPUTING.ppt
GRID COMPUTING.pptGRID COMPUTING.ppt
GRID COMPUTING.ppt
 
OCC Overview OMG Clouds Meeting 07-13-09 v3
OCC Overview OMG Clouds Meeting 07-13-09 v3OCC Overview OMG Clouds Meeting 07-13-09 v3
OCC Overview OMG Clouds Meeting 07-13-09 v3
 
MPLS/SDN 2013 Intercloud Standardization and Testbeds - Sill
MPLS/SDN 2013 Intercloud Standardization and Testbeds - SillMPLS/SDN 2013 Intercloud Standardization and Testbeds - Sill
MPLS/SDN 2013 Intercloud Standardization and Testbeds - Sill
 
Azure Brain: 4th paradigm, scientific discovery & (really) big data
Azure Brain: 4th paradigm, scientific discovery & (really) big dataAzure Brain: 4th paradigm, scientific discovery & (really) big data
Azure Brain: 4th paradigm, scientific discovery & (really) big data
 
Grid computing
Grid computingGrid computing
Grid computing
 
Cloud computing vs grid computing
Cloud computing vs grid computingCloud computing vs grid computing
Cloud computing vs grid computing
 
Scientific Cloud Computing: Present & Future
Scientific Cloud Computing: Present & FutureScientific Cloud Computing: Present & Future
Scientific Cloud Computing: Present & Future
 
Grid and Cloud Computing Lecture-2a.pptx
Grid and Cloud Computing Lecture-2a.pptxGrid and Cloud Computing Lecture-2a.pptx
Grid and Cloud Computing Lecture-2a.pptx
 
Monitoring applications on cloud - Indicthreads cloud computing conference 2011
Monitoring applications on cloud - Indicthreads cloud computing conference 2011Monitoring applications on cloud - Indicthreads cloud computing conference 2011
Monitoring applications on cloud - Indicthreads cloud computing conference 2011
 
Cloud vs grid
Cloud vs gridCloud vs grid
Cloud vs grid
 
g-Social - Enhancing e-Science Tools with Social Networking Functionality
g-Social - Enhancing e-Science Tools with Social Networking Functionalityg-Social - Enhancing e-Science Tools with Social Networking Functionality
g-Social - Enhancing e-Science Tools with Social Networking Functionality
 
H017144148
H017144148H017144148
H017144148
 
Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...
Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...
Comparative Analysis, Security Aspects & Optimization of Workload in Gfs Base...
 
Unit i introduction to grid computing
Unit i   introduction to grid computingUnit i   introduction to grid computing
Unit i introduction to grid computing
 
Privacy preserving public auditing for secured cloud storage
Privacy preserving public auditing for secured cloud storagePrivacy preserving public auditing for secured cloud storage
Privacy preserving public auditing for secured cloud storage
 
IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...
IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...
IEEE HPSR 2017 Keynote: Softwarized Dataplanes and the P^3 trade-offs: Progra...
 

Plus de ISSGC Summer School

Session 58 - Cloud computing, virtualisation and the future
Session 58 - Cloud computing, virtualisation and the future Session 58 - Cloud computing, virtualisation and the future
Session 58 - Cloud computing, virtualisation and the future ISSGC Summer School
 
Session 58 :: Cloud computing, virtualisation and the future Speaker: Ake Edlund
Session 58 :: Cloud computing, virtualisation and the future Speaker: Ake EdlundSession 58 :: Cloud computing, virtualisation and the future Speaker: Ake Edlund
Session 58 :: Cloud computing, virtualisation and the future Speaker: Ake EdlundISSGC Summer School
 
Session 50 - High Performance Computing Ecosystem in Europe
Session 50 - High Performance Computing Ecosystem in EuropeSession 50 - High Performance Computing Ecosystem in Europe
Session 50 - High Performance Computing Ecosystem in EuropeISSGC Summer School
 
Session 49 Practical Semantic Sticky Note
Session 49 Practical Semantic Sticky NoteSession 49 Practical Semantic Sticky Note
Session 49 Practical Semantic Sticky NoteISSGC Summer School
 
Session 48 - Principles of Semantic metadata management
Session 48 - Principles of Semantic metadata management Session 48 - Principles of Semantic metadata management
Session 48 - Principles of Semantic metadata management ISSGC Summer School
 
Session 49 - Semantic metadata management practical
Session 49 - Semantic metadata management practical Session 49 - Semantic metadata management practical
Session 49 - Semantic metadata management practical ISSGC Summer School
 
Session 46 - Principles of workflow management and execution
Session 46 - Principles of workflow management and execution Session 46 - Principles of workflow management and execution
Session 46 - Principles of workflow management and execution ISSGC Summer School
 
Session 37 - Intro to Workflows, API's and semantics
Session 37 - Intro to Workflows, API's and semantics Session 37 - Intro to Workflows, API's and semantics
Session 37 - Intro to Workflows, API's and semantics ISSGC Summer School
 
Session 43 :: Accessing data using a common interface: OGSA-DAI as an example
Session 43 :: Accessing data using a common interface: OGSA-DAI as an exampleSession 43 :: Accessing data using a common interface: OGSA-DAI as an example
Session 43 :: Accessing data using a common interface: OGSA-DAI as an exampleISSGC Summer School
 
Session 40 : SAGA Overview and Introduction
Session 40 : SAGA Overview and Introduction Session 40 : SAGA Overview and Introduction
Session 40 : SAGA Overview and Introduction ISSGC Summer School
 
Session 24 - Distribute Data and Metadata Management with gLite
Session 24 - Distribute Data and Metadata Management with gLiteSession 24 - Distribute Data and Metadata Management with gLite
Session 24 - Distribute Data and Metadata Management with gLiteISSGC Summer School
 

Plus de ISSGC Summer School (20)

Session 58 - Cloud computing, virtualisation and the future
Session 58 - Cloud computing, virtualisation and the future Session 58 - Cloud computing, virtualisation and the future
Session 58 - Cloud computing, virtualisation and the future
 
Session 58 :: Cloud computing, virtualisation and the future Speaker: Ake Edlund
Session 58 :: Cloud computing, virtualisation and the future Speaker: Ake EdlundSession 58 :: Cloud computing, virtualisation and the future Speaker: Ake Edlund
Session 58 :: Cloud computing, virtualisation and the future Speaker: Ake Edlund
 
Session 50 - High Performance Computing Ecosystem in Europe
Session 50 - High Performance Computing Ecosystem in EuropeSession 50 - High Performance Computing Ecosystem in Europe
Session 50 - High Performance Computing Ecosystem in Europe
 
Integrating Practical2009
Integrating Practical2009Integrating Practical2009
Integrating Practical2009
 
Session 49 Practical Semantic Sticky Note
Session 49 Practical Semantic Sticky NoteSession 49 Practical Semantic Sticky Note
Session 49 Practical Semantic Sticky Note
 
Departure
DepartureDeparture
Departure
 
Session 48 - Principles of Semantic metadata management
Session 48 - Principles of Semantic metadata management Session 48 - Principles of Semantic metadata management
Session 48 - Principles of Semantic metadata management
 
Session 49 - Semantic metadata management practical
Session 49 - Semantic metadata management practical Session 49 - Semantic metadata management practical
Session 49 - Semantic metadata management practical
 
Session 46 - Principles of workflow management and execution
Session 46 - Principles of workflow management and execution Session 46 - Principles of workflow management and execution
Session 46 - Principles of workflow management and execution
 
Session 42 - GridSAM
Session 42 - GridSAMSession 42 - GridSAM
Session 42 - GridSAM
 
Session 37 - Intro to Workflows, API's and semantics
Session 37 - Intro to Workflows, API's and semantics Session 37 - Intro to Workflows, API's and semantics
Session 37 - Intro to Workflows, API's and semantics
 
Session 43 :: Accessing data using a common interface: OGSA-DAI as an example
Session 43 :: Accessing data using a common interface: OGSA-DAI as an exampleSession 43 :: Accessing data using a common interface: OGSA-DAI as an example
Session 43 :: Accessing data using a common interface: OGSA-DAI as an example
 
Session 40 : SAGA Overview and Introduction
Session 40 : SAGA Overview and Introduction Session 40 : SAGA Overview and Introduction
Session 40 : SAGA Overview and Introduction
 
Session 36 - Engage Results
Session 36 - Engage ResultsSession 36 - Engage Results
Session 36 - Engage Results
 
Session 23 - Intro to EGEE-III
Session 23 - Intro to EGEE-IIISession 23 - Intro to EGEE-III
Session 23 - Intro to EGEE-III
 
Session 33 - Production Grids
Session 33 - Production GridsSession 33 - Production Grids
Session 33 - Production Grids
 
Social Program
Social ProgramSocial Program
Social Program
 
Session29 Arc
Session29 ArcSession29 Arc
Session29 Arc
 
Session 24 - Distribute Data and Metadata Management with gLite
Session 24 - Distribute Data and Metadata Management with gLiteSession 24 - Distribute Data and Metadata Management with gLite
Session 24 - Distribute Data and Metadata Management with gLite
 
Session 23 - gLite Overview
Session 23 - gLite OverviewSession 23 - gLite Overview
Session 23 - gLite Overview
 

Dernier

Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...Sapna Thakur
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 

Dernier (20)

Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 

Session19 Globus

  • 1. Grid Computing and the Globus Toolkit Ben Clifford University of Chicago Computation Institute ISSGC09
  • 2. Who?  I am Ben Clifford – University of Chicago Computation Institute > Swift workflow system / grid programming language – Formerly ISI, Los Angeles > MDS component of Globus Toolkit  Over there, Ravi Madduri – Argonne National Laboratory, near Chicago 2
  • 3. What is a Grid?  Resource sharing – Computers, storage, sensors, networks, … – Sharing always conditional: issues of trust, policy, negotiation, payment, …  Coordinated problem solving – Beyond client-server: distributed data analysis, computation, collaboration, …  Dynamic, multi-institutional virtual organizations – Community overlays on classic org structures – Large or small, static or dynamic 3
  • 4. An Old Idea …  “The time-sharing computer system can unite a group of investigators …. one can conceive of such a facility as an … intellectual public utility.” – Fernando Corbato and Robert Fano, 1966  “We will perhaps see the spread of ‘computer utilities’, which, like present electric and telephone utilities, will service individual homes and offices across the country.” – Len Kleinrock, 1967 4
  • 5. Why Is this Hard or Different?  Lack of central control – Where things run – When they run – Who can run  Shared resources – Contention, variability  Communication and coordination – Different sites implies different sys admins, users, institutional goals, and often socio- political constraints 5
  • 6. So Why Do It?  Computations that need to be done with a time limit  Data that can’t fit on one site  Data owned by multiple sites  Applications that need to be run bigger, faster, more 6
  • 7. For Example: Digital Astronomy  Digital observatories provide online archives of data at different wavelengths  Ask questions such as: what objects are visible in infrared but not visible spectrum? 7
  • 8. For Example: Cancer Biology (Ravi, this afternoon) 8
  • 9. What Kinds of Applications?  Computation intensive – Interactive simulation (climate modeling) – Large-scale simulation and analysis (galaxy formation, gravity waves, event simulation) – Engineering (parameter studies, linked models)  Data intensive – Experimental data analysis (e.g., physics) – Image & sensor analysis (astronomy, climate)  Distributed collaboration – Online instrumentation (microscopes, x-ray) Remote visualization (climate studies, biology) – Engineering (large-scale structural testing) 9
  • 10. Key Common Features  The size and/or complexity of the problem  Collaboration between people in several organizations  Sharing computing resources, data, instruments 10
  • 11. Underlying Problem: The Application-Infrastructure Gap Dynamic and/or Distributed Applications Shared Distributed Infrastructure A B 1 1 9 9 11
  • 12. Grid Infrastructure  Distributed use and management – Of physical resources – Of software services – Of communities and their policies 12
  • 13. Globus is…  A collection of solutions to problems that come up frequently when building collaborative distributed applications  Software for Grid infrastructure  Tools to build applications that exploit Grid infrastructure  Open source & open standards  Enabler of a rich tool & service ecosystem 13
  • 14. Globus is an Hour Glass  Local sites have their own Higher-Level Services and Users policies, installs – heterogeneity! – Queuing systems, monitors, network protocols, etc Standard  Globus unifies – standards! Interfaces – Common management abstractions & interfaces – eg built on Web services + WSRF Local heterogeneity 14
  • 15. Globus is a Building Block  Basic components for Grid functionality – Not turnkey solutions, but building blocks & tools for application developers & system integrators  Highest-level services are often application specific, we let apps concentrate there  Easier to reuse than to reinvent – Compatibility with other Grid systems comes for free  We provide basic infrastructure to get you one step closer 15
  • 16. Globus Philosophy  Globus was first established as an open source project in 1996  The Globus Toolkit is open source to: – Allow for inspection > for consideration in standardization processes – Encourage adoption > in pursuit of ubiquity and interoperability – Encourage contributions > harness the expertise of the community  The Globus Toolkit is distributed under the (BSD-style) Apache License version 2 16
  • 17. Globus Technology Areas  Core runtime – Infrastructure for building new services  Security – Apply uniform policy across distinct systems  Execution management – Provision, deploy, & manage services  Data management – Discover, transfer, & access large data  Monitoring – Discover & monitor dynamic services 17
  • 18. Globus Software: dev.globus.org Globus Projects OGSA-DAI GT4 MPICH- G2 Java Data Replica Delegation MyProxy Runtime Rep Location GridWay C GSI- CAS GridFTP MDS4 Runtime OpenSSH Incubator Reliable Mgmt Python C Sec GRAM File GT4 Docs Runtime Transfer Incubator Projects GAARDS MEDICUS Cog WF Virt WkSp Others... GDTE GridShib OGRO UGP Dyn Acct Gavia JSC DDM Metrics Introduce PURSE HOC-SA LRMA WEEP Gavia MS SGGC ServMark Common Execution Info Runtime Security Mgmt Data Mgmt Services Other 18
  • 19. Globus User Community  Large & diverse – 10s of national Grids, 100s of applications, 1000s of users; probably much more – Every continent except Antarctica – Applications ranging across many sciences – Dozens (at least) of commercial deployments  Successful – Many production systems doing real work – Many applications producing real results  Smart, energetic, demanding – Constant stream of new use cases & tools 19
  • 21. Examples of Production Scientific Grids  APAC (Australia)  China Grid  China National Grid  DGrid (Germany)  EGEE  NAREGI (Japan)  Open Science Grid  Taiwan Grid  TeraGrid  ThaiGrid  UK Nat’l Grid Service 21
  • 22. More Specifically, I May Want To …  Manage who is allowed to access my service or my experimental data or …  Ensure reliable & secure distribution of data from my lab to my partners  Run 10,000 jobs on whatever computers I can get hold of  Monitor the status of the different resources to which I have access  Create a service for use by my colleagues 22
  • 23. More Specifically, I May Want To …  Manage who is allowed to access my service or my experimental data or …  Ensure reliable & secure distribution of data from my lab to my partners  Run 10,000 jobs on whatever computers I can get hold of  Monitor the status of the different resources to which I have access  Create a service for use by my colleagues 23
  • 24. Grid Security Concerns  Control access to shared services – Address autonomous management, e.g., different policy in different work groups  Support multi-user collaborations – Federate through mutually trusted services – Local policy authorities rule  Allow users and application communities to set up dynamic trust domains – Personal/VO collection of resources working together based on trust of user/VO 24
  • 25. Virtual Organizations (VO) Virtual Comm unity C Person E Person B File server F1 (Researcher) Compute Server C1' (Adm inistrator) (disk A) Person A Person D (Principal Investigator) (Researcher) Person B Person E (Staff) Person D File server F1 (Faculty) Compute Server C2 Compute Server C1 (Staff) (disks A and B) Person A Person F (Faculty) (Faculty) Person C (Student) Compute Server C3 Organization A Organization B  VO for each application or workload  Carve out and configure resources for a particular use and set of users 25
  • 26. Security Basics  Privacy – Only the sender and receiver should be able to understand the conversation  Integrity – Receiving end must know that the received message was the one from the sender  Authentication – Users are who they say they are (authentic)  Authorization – Is user allowed to perform the action 26
  • 27. Globus Security  Globus security is based on the Grid Security Infrastructure (GSI) – IETF standards  Public-key-based authentication using X.509 certificates – X509 standard 27
  • 28. Authentication  Private Key - known only by owner  Public Key- known to everyone  What one key encrypts, the other decrypts Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch09s03.html 28
  • 29. Authentication using Digital Certificates  Digital document that certifies a public key is owned by a particular user  Signed by 3rd party – the Certificate Authority (CA)  X509 standard To know if you should trust the certificate, you have to trust the CA Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch09s04.html 29
  • 30. Requesting a Certificate  To request a certificate a user starts by generating a key pair Private Key Public Key Rachana Ananthakrishnan 30
  • 31. Certificate Request  The user signs their own public key to Public Key form what is called a Certificate Request  Email/Web upload  Note private key is Sign never sent anywhere Certificate Request Public Key Rachana Ananthakrishnan 31
  • 32. Certificate Issuance  The CA then creates, Certificate signs and issues a Request certificate for the Public Key user, combining the public key and the Name identity Name Issuer Validity Public Key Signature Rachana Ananthakrishnan 32
  • 33. Examples of CAs  GILDA CA, Globus GCS – An online service that issues low-quality certificates. You have used GILDA CA.  SimpleCA – part of Globus Toolkit. Simple to use.  DOEgrids CA – production CA often used by users of the Open Science Grid  MyProxy CA – part of Globus Toolkit 33
  • 34. Authorization – the GridMap File  Maps distinguished names (found in certificates) to local names (such as login accounts)  Also an access control list for GSI enabled services 34
  • 35. Delegation  Allows another service on the grid to act on your behalf.  For example, you ask Reliable File Transfer service to transfer files using GridFTP. Rachana Ananthakrishnan 35
  • 36. Proxy Certificate  Proxy Certificate allows another user to act upon their behalf – Credential delegation Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch10s05.html 36
  • 37. Proxy Certificate  Proxy empowers 3rd party to act upon your behalf  Proxy certificate is signed by the end user, not a CA  Proxy cert’s public key is a new one from the private-public key pair generated specifically for the proxy certificate  Proxy also allows you to do single sign-on – Setup a proxy for a time period and you don’t need to sign in again 37
  • 38. Proxy Certificate Chain Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch10s05.html 38
  • 39. Single Sign-on  Don’t need to remember (or even know) ID/passwords for each resource.  Automatically get a Grid proxy certificate for use with other Grid tools  More secure – No ID/password is sent over the wire: not even in encrypted form – Proxy certificate expires in a few hours and then is useless to anyone else – Don’t need to write down 10 passwords  It’s fast and it’s easy! 39
  • 40. Globus and Delegation: MyProxy  Remote service that stores user credentials – Users request proxies for local use – Web Portals request user proxies for use with back-end Grid services  Grid administrators can pre-load credentials in the server for users to retrieve when needed 40
  • 41. VOMS  A community-level group membership system  Database of user roles – Administrative tools – Client interface  voms-proxy-init – Uses client interface to produce an attribute certificate (instead of proxy) that includes roles & capabilities signed by VOMS server – Works with non-VOMS services, but gives more info to VOMS- aware services  Allows VOs to centrally manage user roles 41
  • 42. Globus Authorization Framework VOMS Shibboleth LDAP … PERMIS Authorization Attributes Decision PIP PIP PIP PDP Globus Client Globus Server 42
  • 43. Globus Security: How It Works Services Compute Center Users VO 43
  • 44. Globus Security: How It Works Services Compute Center Users Rights VO Rights 44
  • 45. Globus Security: How It Works Services Compute Center CAS Users Rights Local policy on VO identity VO or attribute authority Rights 45
  • 46. Globus Security: How It Works Services (running on user’s behalf) Access Compute Rights Center CAS Users Rights Local policy on VO identity VO or attribute authority Rights 46
  • 47. Globus Security: How It Works Authz Callout with Proxy Certificates Services (running on user’s behalf) Access Compute Rights Center CAS Users Rights Local policy on VO identity VO or attribute authority Rights 47
  • 48. A Cautionary Note  Grid security mechanisms are tedious to set up – If exposed to users, hand-holding is usually required – These mechanisms can be hidden from end users, but still used behind the scenes  These mechanisms exist for good reasons. – Many useful things can’t be done without Grid security – It is unlikely that an ambitious project could go into production operation without security like this – Most successful projects end up using Grid security, but using it in ways that end users don’t see much 48
  • 49. More Specifically, I May Want To …  Create a service for use by my colleagues  Manage who is allowed to access my service (or my experimental data or …)  Ensure reliable & secure distribution of data from my lab to my partners  Run 10,000 jobs on whatever computers I can get hold of  Monitor the status of the different resources to which I have access 49
  • 50. File Management  Stage/move large data to/from nodes – GridFTP for basic file movement – Reliable File Transfer (RFT)  Locate data of interest – Replica Location Service (RLS) 50
  • 51. GridFTP: The Protocol  A high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks – Based on standard FTP with well-defined extensions – Uses GSI security – Multiple data channels for parallel transfers – Partial file transfers – Third-party transfers – Reusable data channels – Command pipelining – Striping  multi-Gb/sec wide area transport  GGF recommendation GFD.20 51
  • 52. GridFTP the Service  IPv6 Support  Globus XIO for different transports  Pluggable – Front-end: e.g., future WS control channel – Back-end: e.g., HPSS, cluster file systems – Transfer: e.g., UDP, NetBLT transport 52
  • 53. Striped GridFTP Service  Multiple nodes work together as a single logical GridFTP server  Every node of the cluster is used to transfer data into/out of the cluster – Each node reads/writes only pieces they’re Parallel Filesystem Parallel Filesystem responsible for – Head node coordinates transfers  Multiple levels of parallelism – CPU, bus, NIC, disk etc. – Maximizes use of Gbit+ WANs Striped Transfer Fully utilizes bandwidth of Gb+ WAN using multiple nodes. 53
  • 54. GridFTP  Integrated instrumentation: Developers can use client API and plug-in mechanism to leverage different instrumentation – Performance markers – Restart markers – Throughput performance – Netlogger style performance tracking  TCP buffer size control – Tune buffers to latency of network – Regular FTP optimized for low latency networks, not tunable  Dramatic improvements for high latency WAN transfers – 90% of network utilization possible – 27 GB/s achieved with commodity hardware 54
  • 55. Working with Different Data Transports  XIO: eXtensible Input/Output  POSIX-like interfaces to support multiple protocols  Protocol implementations encapsulated in a stack of drivers – transport drivers: TCP, UDP, UDT, disk – transform drivers: compression, security, logging 55
  • 56. Globus XIO Network Network Protocol Network Protocol Protocol Globus XIO Application Driver Disk Driver Special Device 56
  • 57. RFT – Reliable File Transfer  A WSRF service for queuing and reliably performing file transfer requests – Server-to-server transfers – Checkpointing for restarts – Database back-end for failovers  Allows clients to requests transfers and then “disappear” – No need to manage the transfer – Status monitoring available if desired 57
  • 58. Reliable File Transfer (RFT) RFT Client Web Service invocation Optional (SOAP via https) notifications Database RFT Service preserves state Client API speaks GridFTP protocol Multiple parallel GridFTP Control GridFTP Control data channels move files GridFTP Data GridFTP Data Has transferred >900,000 files. 58
  • 59. Globus Replica Location Service  Why replicate files? – Fault tolerance: avoid single points of failure – Reduce latency: use “nearest” copy  Logical File Name (LFN) – Location-independent identifier (name) – Example: foo  Physical File Name (PFN) – Specific file identifier such as a URL – E.g.: gsiftp://myserver.mycompany.com/foo  RLS maps between LFNs and PFNs – foo  gsiftp://myserver.mycompany.com/foo 59
  • 60. LFNs and PFNs  LFN to PFN mappings are often many-to-one  Multiple PFNs may indicate different access to a file access via GridFTP server access via one NFS mount foo  gsiftp://dataserver.mycompany.com/foo foo  file://nodeA.mycompany.com/foo access via 2nd NFS mount foo  file://nodeB.mycompany.com/foo foo  https://www.mycompany.com/foo access via web server 60
  • 61. RLS services  Local replica catalog (LRC): Catalog of LFN to PFN mappings  Replica Location Index (RLI): Aggregate information about one or more LRCs  Only the LFN content for LRC is aggregated – Each configured LRC sends list of LFNs to LRCs – PFNs and mappings not aggregated RLI RLI LRC LRC LRC LRC LRC 61
  • 62. OGSA-DAI  Grid Interfaces to Databases – Data access > Relational & XML Databases, semi-structured files – Data integration > Multiple data delivery mechanisms, data translation  Sessions 40-41 next Tuesday 62
  • 63. More Specifically, I May Want To …  Manage who is allowed to access my service (or my experimental data or …)  Ensure reliable & secure distribution of data from my lab to my partners  Run 10,000 jobs on whatever computers I can get hold of  Monitor the status of the different resources to which I have access  Create a service for use by my colleagues 63
  • 64. Execution Management (GRAM)  Common webservices interface to job schedulers / local resource managers – Unix, Condor, LSF, PBS, SGE, …  More generally: interface for process execution management – Lay down execution environment – Stage data – Monitor & manage lifecycle – Kill it, clean up 64
  • 65. GRAM - Basic Job Submission and Control Service  A uniform service interface for remote job submission and control – Includes file staging and I/O management – Includes reliability features  GRAM is not a scheduler. – No scheduling – No metascheduling, brokering, workflow – Often used as a front-end to schedulers, and often used by metaschedulers, brokers, workflow engines. 65
  • 66. Using GRAM vs Building a Service  GRAM is intended for jobs that – are arbitrary programs – need stateful monitoring or credential management – Where file staging is important  If the application is lightweight, with modest input/output, may be a better candidate for hosting directly as a WSRF service 66
  • 67. GRAM4 Architecture GRAM on a Compute Element Local resource manager GRAM eg Condor, PBS Client services Transfer request User job RFT File GridFTP Transfer GridFTP 67
  • 68. Resource Specification Language (RSL)  For complicated jobs, use RSL to specify the job <job> <executable>/bin/echo</executable> <argument>this is an example_string </argument> <argument>Globus was here</argument> <stdout>${GLOBUS_USER_HOME}/stdout</stdout> <stderr>${GLOBUS_USER_HOME}/stderr</stderr> </job> 68
  • 69. At Most Once Submission  You may specify a UUID with your job submission  If you’re not sure the submission worked, you may submit the job again with the same UUID  If the job has already been submitted, the new submission will have no effect  If you do not specify a UUID, one will be generated for you 69
  • 70. Staging Data  GRAM’s RSL allows many fileStageIn/fileStageOut directives to bring input files in before your job and take output files away after your job.  The transfers will be executed by RFT and GridFTP 70
  • 71. RSL Substitutions  GRAM will perform some variable substitutions for you – GLOBUS_USER_HOME – GLOBUS_USER_NAME – GLOBUS_SCRATCH_DIR – GLOBUS_LOCATION  SCRATCH_DIR will be a compute-node local high-speed storage if defined, or GLOBUS_USER_HOME if not 71
  • 72. Choosing User Accounts  You may be authorized to use more than one account at the remote site  By default, the first listed in the grid- mapfile will be used  You may request a specific user account using the <localUserId> element 72
  • 73. Multijobs  You may specify more than one <job> element in a <multijob>  Used by MPICH-G to support MPI jobs 73
  • 74. Workspace Service Policy Negotiate access Initiate activity Client Monitor activity Activity Control activity Environment Interface Resource provider 74
  • 75. Nimbus – Science Clouds Virtual Client Machine Resource provider 75
  • 76. More Specifically, I May Want To …  Create a service for use by my colleagues  Manage who is allowed to access my service (or my experimental data or …)  Ensure reliable & secure distribution of data from my lab to my partners  Run 10,000 jobs on whatever computers I can get hold of  Monitor the status of the different resources to which I have access 76
  • 77. Monitoring and Discovery System (MDS4)  Grid-level monitoring system – Aid user/agent to identify host(s) on which to run an application – Warn on errors  Uses standard interfaces to provide publishing of data, discovery, and data access, including subscription/notification – WS-ResourceProperties, WS- BaseNotification, WS-ServiceGroup  Functions as an hourglass to provide a common interface to lower-level monitoring tools 77
  • 78. Information Users : Schedulers, Portals, Warning Systems, etc. WS standard interfaces for Standard Schemas subscription, (GLUE schema, eg) registration, notification Cluster monitors (Ganglia, Hawkeye, Clumon, and Queuing systems Nagios) Services (PBS, LSF, Torque) (GRAM, RFT, RLS) 78
  • 79. MDS4 Components  Information providers – Monitoring is a part of every WSRF service – Non-WS services are also be used  Collective services – Index Service – a way to aggregate data – Trigger Service – a way to be notified of changes  Clients – WebMDS – wsrf-query command line tool  All of the tool are schema-agnostic, but interoperability needs a well-understood common language 79
  • 80. Information Providers: Globus Services  GRAM – Queue and cluster information  Reliable File Transfer Service (RFT) – Service status data, number of active transfers, transfer status, information about the resource running the service  Replica Location Service (RLS) – (not a web service) – Location and status of replica catalogs  Others: CAS, your own providers 80
  • 81. Information Providers: Cluster and Queue Data  Interfaces to Hawkeye, Ganglia, CluMon, Nagios – Basic host data (name, ID), processor information, memory size, OS name and version, file system data, processor load data – Some condor/cluster specific data – This can also be done for sub-clusters, not just at the host level  Interfaces to PBS, Torque, LSF – Queue information, number of CPUs available and free, job count information, some memory statistics and host info for head node of cluster 81
  • 82. MDS4 Index Service  Index Service is both registry and cache – Datatype and data provider info, like a registry (UDDI) – Last value of data, like a cache  In memory default approach – DB backing store currently being developed to allow for very large indexes  Can be set up for a site or set of sites, a specific set of project data, or for user- specific data only  Can be a multi-rooted hierarchy – No *global* index 82
  • 83. MDS4 Trigger Service  Subscribe to a set of resource properties  Evaluate that data against a set of pre- configured conditions (triggers)  When a condition matches, action occurs – Email is sent to pre-defined address – Website updated  Similar functionality in Hawkeye 83
  • 84. WebMDS User Interface  Web-based interface to WSRF resource property information  User-friendly front-end to Index Service  Uses standard resource property requests to query resource property data  XSLT transforms to format and display them  Customized pages are simply done by using HTML form options and creating your own XSLT transforms  Sample page: – http://mds.globus.org:8080/webmds/webm ds?info=indexinfo&xsl=servicegroupxsl 84
  • 85. 85
  • 86. Working with TeraGrid  Large US project across 9 different sites  Explore metascheduling approaches  -> Need a common source of data with a standard interface for basic scheduling info  Collects data from Ganglia, Hawkeye, CluMon, and Nagios 86
  • 87. 87
  • 88. DOE Earth System Grid Goal: Enable sharing & analysis of high-volume data from advanced earth system models www.earthsystemgrid.org 88
  • 89. Monitoring Overall System Status  Monitored data are collected in MDS4 Index service  Information providers check resource status at a configured frequency – Currently, every 10 minutes  Report status to Index Service  Information in Index Service is queried by ESG Web portal  Used to generate overall picture of state of ESG resources  Displayed on ESG Web portal page Ann Chervenak, USC/ISI 89
  • 90. Web Service Basics  Web Services are basic distributed computing technology that let us construct client-server interactions Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s02.html 90
  • 91. Web Service Basics 2  Web services are platform independent and language independent – Client and servers: different languages, different environments  Web service is *not* a website – Web service is accessed by software, not humans  Web services are ideal for loosely coupled systems 91
  • 92. WSDL: Web Services Description Language Define expected messages for a service, and their (input or output parameters) An interface groups together a number of messages (operations) Bind an Interface via a definition to a specific transport (e.g. The network location where the service is HTTP) and messaging (e.g. implemented , e.g. http://localhost:8080 SOAP) protocol 92
  • 93. Let’s talk about state  Plain Web services are stateless Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html 93
  • 94. However, Many Grid Applications Require State Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html 94
  • 95. Keep the Web Service and the State Separate  Instead of putting state in a Web service, we keep it in a resource  Each resource has a unique key Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html 95
  • 96. Resources Can Be Anything Stored Web Service + Resource = WS-Resource Address of a WS- resource is called an end-point reference 96
  • 97. Standard Interfaces  Service information  State representation GetRP – Resource – Resource Property GetMultRPs  State identification SetRP – Endpoint Reference Client Web QueryRPs Service  State Interfaces Subscribe – GetRP, QueryRPs, SetTerm GetMultipleRPs, SetRP Time  Lifetime Interfaces Destroy – SetTerminationTime – ImmediateDestruction  Notification Interfaces – Subscribe – Notify  ServiceGroups 97
  • 98. Globus and Web Services User Applications Custom Globus (e.g., Apache Axis) Globus Container and Admin WSRF WSRF Web Registry Custom Web Services Services Services WS-A, WSRF, WS-Notification WSDL, SOAP, WS-Security Globus Core: Java , C (fast, small footprint), Python 98
  • 100. Globus Software: dev.globus.org Globus Projects OGSA-DAI GT4 MPICH- G2 Java Data Replica Delegation MyProxy Runtime Rep Location GridWay C GSI- CAS GridFTP MDS4 Runtime OpenSSH Incubator Reliable Mgmt Python C Sec GRAM File GT4 Docs Runtime Transfer Incubator Projects GAARDS MEDICUS Cog WF Virt WkSp GDTE GridShib OGRO UGP Dyn Acct Gavia JSC DDM Metrics Introduce PURSE HOC-SA LRMA WEEP Gavia MS SGGC ServMark Common Execution Info Runtime Security Mgmt Data Mgmt Services 100 Other
  • 101. dev.globus  Governance model based on Apache Jakarta – Consensus based decision making  Globus software is organized as several dozen “Globus Projects” – Each project has its own “Committers” responsible for their products – Cross-project coordination through shared interactions and committers meetings  A “Globus Management Committee” – Overall guidance and conflict resolution 101
  • 102. http://dev.globus.org Guidelines (Apache Jakarta) Infrastructure (CVS, email, bugzilla, Wiki) Projects Include … 102
  • 103. Tested Platforms  Debian  SGI Altix (IA64  Fedora Core running Red Hat)  FreeBSD  SuSE Linux  HP/UX  Tru64 Unix  IBM AIX  Apple MacOS X (no  Red Hat binaries)  Sun Solaris  Windows – Java components only List of binaries and known platform-specific install bugs at http://www.globus.org/toolkit/docs/4.0/admin/ docbook/ ch03.html 103
  • 104. Installation in a nutshell  Quickstart guide is very useful http://www.globus.org/toolkit/docs/4.0/ admin/docbook/quickstart.html  Verify your prereqs!  Security – check spellings and permissions  Globus is system software – plan accordingly 104
  • 105. General Globus Help and Support  Globus toolkit help lists list – gt-user@globus.org – gt-dev@globus.org – http://dev.globus.org/wiki/ Mailing_Lists  Each project has specific lists  Bugzilla – bugzilla.globus.org 105
  • 106. Incubator Process in dev.globus  Entry point for new Globus projects  Incubator Management Project (IMP) – Oversees incubator process form first contact to becoming a Globus project – Quarterly reviews of current projects – Process being debugged by “Incubator Pioneers” http://dev.globus.org/wiki/Incubator/ Incubator_Process 106
  • 107. Current Incubator Projects dev.globus.org  Distributed Data  GridShib  Portal-based Management (DDM)  Grid Toolkit Handle User Registration System (gt-hs) Service (PURSe)  Dynamic Accounts  Higher Order  ServMark  Gavia-Meta Scheduler Component Service  SJTU GridFTP Architecture (HOC- GUI Client  Gavia- Job SA) (SGGC) Submission Client  Introduce  UCLA Grid Portal  Grid Authentication  Local Resource Software (UGP) and Authorization Manager Adaptors  WEEP with Reliably (LRMA)  Cog Workflow Distributed Services  Metrics  Virtual (GAARDS)  MEDICUS Workspaces  Grid Development  Open GRid OCSP Tools for Eclipse (Online Certificate (GDTE) Status Protocol) 107
  • 108. GridWay Meta-Scheduler  Scheduler virtualization layer on top of Globus services – A LRM-like environment for submitting, monitoring, and controlling jobs – Submit jobs to the Grid, without having to worry about the details of exactly which local resource will run the job – A policy-driven job scheduler – Accounting – Fault detection & recovery – Arrays of jobs, DAGs GridWay: http://www.gridway.org 108
  • 109. How Can You Contribute? Create a New Project  Do you have a project you’d like to contribute?  Does your software solve a problem you think the Globus community would be interested in?  Contact incubator-committers@globus.org 109
  • 110. Contribute to an Existing Project  Contribute code, documentation, design ideas, and feature requests  Joining the mailing lists – *-dev, *-user, *-announce for each project – See the project wiki page at dev.globus.org  Chime in at any time  Regular contributors can become committers, with a role in defining project directions http://dev.globus.org/wiki/How_to_contribute 110
  • 111. Summary: Grids are About … Enabling “coordinated resource sharing & problem solving in dynamic, multi- institutional virtual organizations.” (Source: “The Anatomy of the Grid”)  Access to shared resources  Virtualization, allocation, management  With predictable behaviors  Provisioning, quality of service  In dynamic, heterogeneous environments  Standards-based interfaces and protocols 111
  • 112. … By Providing Open Infrastructure  Web services standards – State, notification, security, …  Services that enable access to resources – Service-enable new & existing resources – E.g., GRAM on computer, GridFTP on storage system, custom application services – Uniform abstractions & mechanisms  Tools to build applications that exploit this infrastructure – Registries, security, data management, …  A rich tool & service ecosystem 112
  • 113. More Specifically, Making it Possible to …  Create a service for use by my colleagues  Manage who is allowed to access my service (or my experimental data or …)  Ensure reliable & secure distribution of data from my lab to my partners  Run 10,000 jobs on whatever computers I can get hold of  Monitor the status of the different resources to which I have access  And so on … 113
  • 114. For More Information  Globus Alliance – http://www.globus.org  Dev.globus – http://dev.globus.org  Upcoming Events – http://dev.globus.org/wiki/Outreach  Globus Solutions – http://www.globus.org/solutions/ 114