The gLite WMS and the
Data Management System
Giuseppe LA ROCCA
INFN Catania
giuseppe.larocca@ct.infn.it

Master Class for Life Science,
4-6 May 2010
Singapore
Outline
• An introduction to the gLite WMS
   • Job Submission via WMS
   • Command line interface
   • Job status
• The Job Description Language overview
   • JDL attributes
• The gLite DMS
     – The Storage Resource Manager (SRM)
•   Grid file referencing schemes
•   LFC File Catalogue
     – Architecture
     – LFC commands
•   File & Replica Management Client Tools
•   Run bioinformatics applications via Grid portal
The gLite stack: overview
Overview of the WMS
• The Workload Management System (WMS) is the gLite 3
  component that allows users to submit jobs, and performs all
  tasks required to execute them, without exposing the user to the
  complexity of the Grid.

• The WMS comprises a set of Grid middleware components
  responsible for distributing and managing tasks across
  Grid resources.
  – The Workload Manager (WM) accepts and satisfies
    requests for job management coming from its clients.
     • The WM passes the job to an appropriate CE for execution,
       taking into account the requirements and preferences
       expressed in the job description.
     • The decision of which resource should be used is the
       outcome of a matchmaking process.
  – The Logging and Bookkeeping service tracks jobs managed by
    the WMS. It collects events from many WMS components and
    records the status and history of the job.
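Job Submission via WMS
[Figure: job submission workflow. Components: GILDA User Interface, VO Management Service (DB of VO users), Workload Management System, Information System, Logging and Bookkeeping, Grid Site (Computing Element, Storage Element).]
  1. User Interface: create proxy (VO Management Service).
  2. User Interface to WMS: write JDL, submit job (executable) + small inputs.
  3. Grid Site to Information System: publish state. WMS to Information System: query.
  4. WMS to Computing Element: submit job. The job is processed at the Grid Site.
  5. Logging: events are recorded by Logging and Bookkeeping, which provides the job status.
  6. WMS: retrieve output. User Interface: retrieve status & (small) output files.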
The Command Line Interface

• The gLite WMS implements two different services to manage
  jobs: the Network Server and the WMProxy.
   – The recommended way to manage jobs is through
     WMProxy, because it gives the best performance and
     provides access to the most advanced features.
• The WMProxy offers several improvements, among which:
   –   submission of job collections;
   –   faster authentication;
   –   faster match-making;
   –   faster response time for users;
   –   higher job throughput.
Proxy Delegation
To explicitly delegate a user proxy to WMProxy, the
command to use is:
glite-wms-job-delegate-proxy -d <delegID>


Example:

$ glite-wms-job-delegate-proxy -d mydelegID
Connecting to the service
https://rb102.cern.ch:7443/glite_wms_wmproxy_server

======= glite-wms-job-delegate-proxy Success ========
Your proxy has been successfully delegated to the
  WMProxy:
https://rb102.cern.ch:7443/glite_wms_wmproxy_server
with the delegation identifier: mydelegID
=====================================================
Job Submission

Starting from a simple JDL file, we can submit it via
  WMProxy by doing:


$ glite-wms-job-submit -d mydelegID test.jdl
Connecting to the service
https://rb102.cern.ch:7443/glite_wms_wmproxy_server

======== glite-wms-job-submit Success ========
The job has been successfully submitted to the WMProxy
Your job identifier is:
https://rb102.cern.ch:9000/vZKKk3gdBla6RySximq_vQ
==============================================
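A script that submits jobs usually needs the job identifier for the later status and output commands. The following is a minimal Python sketch of how the identifier could be pulled out of the text printed above. The function name `extract_job_id` is hypothetical, and the parsing assumes the output layout shown on this slide.

```python
import re
from typing import Optional

# Sample output copied from the slide above.
SUBMIT_OUTPUT = """\
Connecting to the service
https://rb102.cern.ch:7443/glite_wms_wmproxy_server

======== glite-wms-job-submit Success ========
The job has been successfully submitted to the WMProxy
Your job identifier is:
https://rb102.cern.ch:9000/vZKKk3gdBla6RySximq_vQ
==============================================
"""


def extract_job_id(output: str) -> Optional[str]:
    """Return the URL that follows 'Your job identifier is:', or None."""
    match = re.search(r"Your job identifier is:\s*(https://\S+)", output)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_job_id(SUBMIT_OUTPUT))
```

Returning `None` rather than raising lets the caller decide how to handle a failed submission.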
Listing the CE(s) that match a job

It is possible to see which CEs are eligible to run a job
   described by a given JDL using:

$ glite-wms-job-list-match -d mydelegID test.jdl

Connecting to the service
https://rb102.cern.ch:7443/glite_wms_wmproxy_server
====================================================
COMPUTING ELEMENT IDs LIST
The following CE(s) matching your job requirements have
  been found:

*CEId*
- CE.pakgrid.org.pk:2119/jobmanager-lcgpbs-cms
- grid-ce0.desy.de:2119/jobmanager-lcgpbs-cms
- gw-2.ccc.ucl.ac.uk:2119/jobmanager-sge-default
- grid-ce2.desy.de:2119/jobmanager-lcgpbs-cms
Retrieving the status of a job
$ glite-wms-job-status
   https://rb102.cern.ch:9000/fNdD4FW_Xxkt2s2aZJeoeg
=====================================================
BOOKKEEPING INFORMATION:
Status info for the Job :
   https://rb102.cern.ch:9000/fNdD4FW_Xxkt2s2aZJeoeg
Current Status: Done (Success)
Exit code: 0
Status Reason: Job terminated successfully
Destination: ce1.inrne.bas.bg:2119/jobmanager-lcgpbs-cms
Submitted: Mon Dec 4 15:05:43 2006 CET
=====================================================

The verbosity level controls the amount of information provided.
  The value of the -v option ranges from 0 to 3.

The commands to get the job status can take several jobIDs as
  arguments, i.e.: glite-wms-job-status <jobID1> ... or,
  more conveniently, the -i <file path> option can be used to
  specify a file containing the list of jobIDs.
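When job status is polled from a script, the bookkeeping text has to be turned into fields. The sketch below does this for the sample output shown on this slide. The function `parse_status` and the `KNOWN_KEYS` tuple are hypothetical, and the field labels are taken only from the slide, so other verbosity levels may print more.

```python
# Sample output copied from the slide above.
STATUS_OUTPUT = """\
BOOKKEEPING INFORMATION:
Status info for the Job :
   https://rb102.cern.ch:9000/fNdD4FW_Xxkt2s2aZJeoeg
Current Status: Done (Success)
Exit code: 0
Status Reason: Job terminated successfully
Destination: ce1.inrne.bas.bg:2119/jobmanager-lcgpbs-cms
Submitted: Mon Dec 4 15:05:43 2006 CET
"""

# Labels seen in the sample output; anything else is ignored.
KNOWN_KEYS = ("Current Status", "Exit code", "Status Reason",
              "Destination", "Submitted")


def parse_status(output):
    """Map each known 'Label: value' line to a dictionary entry."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep and key.strip() in KNOWN_KEYS:
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    for name, value in parse_status(STATUS_OUTPUT).items():
        print(f"{name}: {value}")
```

Splitting at the first colon only keeps values that contain colons themselves, such as the destination CE and the timestamp, intact.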
Retrieving the output(s)

$ glite-wms-job-output
https://rb102.cern.ch:9000/yabp72aERhofLA6W2-LrJw
Connecting to the service
https://128.142.160.93:7443/glite_wms_wmproxy_server
=====================================================
JOB GET OUTPUT OUTCOME
Output sandbox files for the job:
https://rb102.cern.ch:9000/yabp72aERhofLA6W2-LrJw
have been successfully retrieved and stored in the
  directory:
/tmp/doe_yabp72aERhofLA6W2-LrJw
=====================================================

The default location for storing the outputs (normally
/tmp) is defined in the UI configuration, but it is possible
to specify in which directory to save the output using the
--dir <path name> option.
Cancelling a job

$ glite-wms-job-cancel
  https://rb102.cern.ch:9000/P1c60RFsrIZ9mnBALa7yZA
Are you sure you want to remove specified job(s)
  [y/n]y : y
Connecting to the service
https://128.142.160.93:7443/glite_wms_wmproxy_server
========== glite-wms-job-cancel Success ============
The cancellation request has been successfully
  submitted for the following job(s):
- https://rb102.cern.ch:9000/P1c60RFsrIZ9mnBALa7yZA
====================================================

If the cancellation is successful, the job will terminate in
   status CANCELLED
Job Submission with CLI
[Figure: commands issued from the GILDA User Interface at each step of the job submission workflow]
  voms-proxy-init --voms gilda                        (proxy creation, VO Management Service)
  glite-wms-job-delegate-proxy -d delegID
  glite-wms-job-list-match -d delegID hostname.jdl
  glite-wms-job-submit -d delegID hostname.jdl        (returns the JobID)
  glite-wms-job-status JobID
  glite-wms-job-output JobID
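The command sequence on this slide can be assembled in a script before it is run. The sketch below builds the argument lists in Python, using only the commands and options that appear in this deck (`-d`, `--dir`). The helper names are hypothetical. Nothing is executed here: on a real User Interface each list could be passed to `subprocess.run`.

```python
import shlex


def delegate_cmd(deleg_id):
    """Delegate the user proxy to WMProxy under the given identifier."""
    return ["glite-wms-job-delegate-proxy", "-d", deleg_id]


def list_match_cmd(deleg_id, jdl_path):
    """List the CEs that match the job described by the JDL file."""
    return ["glite-wms-job-list-match", "-d", deleg_id, jdl_path]


def submit_cmd(deleg_id, jdl_path):
    """Submit the job described by the JDL file."""
    return ["glite-wms-job-submit", "-d", deleg_id, jdl_path]


def status_cmd(job_id):
    """Query the status of a submitted job."""
    return ["glite-wms-job-status", job_id]


def output_cmd(job_id, out_dir=None):
    """Retrieve the output sandbox, optionally into a chosen directory."""
    cmd = ["glite-wms-job-output"]
    if out_dir:
        cmd += ["--dir", out_dir]
    cmd.append(job_id)
    return cmd


if __name__ == "__main__":
    for cmd in (delegate_cmd("delegID"),
                list_match_cmd("delegID", "hostname.jdl"),
                submit_cmd("delegID", "hostname.jdl")):
        print(shlex.join(cmd))  # shell-quoted form, for display only
```

Argument lists avoid shell quoting problems with job identifiers and paths.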
Possible Job states
Job Description Language

• The Job Description Language (JDL) is a high-level
  language based on the Classified Advertisement
  (ClassAd) language, used to describe jobs and
  aggregates of jobs with arbitrary dependency relations.
   – The JDL is used in WLCG/EGEE to specify the desired
     job characteristics and constraints, which are taken
     into account by the WMS to select the best resource
     to execute the job.

   – A job description is a file (called JDL file) consisting
     of lines having the format: attribute = expression;
   – Expressions can span several lines, but only the last
     one must be terminated by a semicolon.
Job Description Language

• The character “ ‘ ” cannot be used in the JDL.

• Comments must be preceded by a sharp character
  (#) or a double slash (//) at the beginning of each
  line.

• Multi-line comments must be enclosed between “/*”
  and “*/”.

Attention!   The JDL is sensitive to blank characters and
  tabs. No blank characters or tabs should follow the
  semicolon at the end of a line.
Simple JDL example

          Executable = "/bin/hostname";
          StdOutput = "std.out";
          StdError = "std.err";

The Executable attribute specifies the command to be
run by the job. If the command is already present on
the WN, it must be expressed as an absolute path; if it
has to be copied from the UI, only the file name must
be specified, and the path of the command on the UI
should be given in the InputSandbox attribute.

          Executable = "test.sh";
          InputSandbox = {"/home/larocca/test.sh"};
          StdOutput = "std.out";
          StdError = "std.err";
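JDL files of this shape can also be generated from a script. The following Python sketch renders a dictionary into "attribute = expression;" lines. The function names `jdl_value` and `render_jdl` are hypothetical, and only strings, integers and lists are handled. It rejects the single-quote character and writes nothing after the semicolon, following the rules stated on the earlier slides.

```python
def jdl_value(value):
    """Render a Python value as a JDL expression (string, int or list)."""
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(jdl_value(item) for item in value) + "}"
    if isinstance(value, int):
        return str(value)
    text = str(value)
    if "'" in text:
        # The slides state that this character cannot be used in the JDL.
        raise ValueError("single quotes are not allowed in JDL: %r" % text)
    return '"' + text + '"'


def render_jdl(attributes):
    """Return JDL text with one 'attribute = expression;' line per entry."""
    lines = [f"{name} = {jdl_value(value)};" for name, value in attributes.items()]
    return "\n".join(lines) + "\n"  # no blanks or tabs after the semicolon


if __name__ == "__main__":
    print(render_jdl({
        "Executable": "test.sh",
        "InputSandbox": ["/home/larocca/test.sh"],
        "StdOutput": "std.out",
        "StdError": "std.err",
    }), end="")
```

Generating the file this way keeps the quoting and the trailing semicolons consistent across many jobs.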
• The Arguments attribute can contain a string value,
  which is taken as argument list for the executable:
  Arguments = "fileA 10";

• In the Executable and in the Arguments attributes it
  may be necessary to use special characters, such as
  &, \, |, >, <. These characters should be preceded by
  triple \ in the JDL, or specified inside quoted strings
  e.g.: Arguments = "-f file1&file2";


• The shell environment of the job can be modified using
  the Environment attribute.
  Environment = {"CMS_PATH=$HOME/cms"};
• If files have to be copied from the UI to the execution
  node, they must be listed in the InputSandbox
  attribute: InputSandbox = {"test.sh", ... ,"fileN"};

• The files to be transferred back to the UI after the job
  is finished can be specified using the OutputSandbox
  attribute: OutputSandbox = {"std.out","std.err"};

• Wildcards are allowed only in the InputSandbox
  attribute.
• Absolute paths cannot be specified in the
  OutputSandbox attribute.
• The InputSandbox cannot contain two files with the
  same name, even if they have a different absolute
  path, as when transferred they would overwrite each
  other.
• The Requirements attribute can be used to express
  constraints on the resources where the job should run.
   – Its value is a Boolean expression that must
     evaluate to true for a job to run on that specific CE.

• Note: Only one Requirements attribute can be specified
  (if there are more than one, only the last one is
  considered). If several conditions must be applied to
  the job, then they all must be combined in a single
  Requirements attribute.

• For example, suppose that the user wants to run
  on a CE using PBS as batch system, and whose WNs
  have at least two CPUs. The job description file
  would then contain:

Requirements = other.GlueCEInfoLRMSType ==
  "PBS" && other.GlueCEInfoTotalCPUs > 1;
• The WMS can also be asked to send a job to a particular queue
  in a CE with the following expression:
  Requirements = other.GlueCEUniqueID ==
  "lxshare0286.cern.ch:2119/jobmanager-pbs-short";

• It is also possible to use regular expressions when expressing
  a requirement.
   – Suppose, for example, that the user wants all
     jobs to run on any CE in the domain cern.ch. This can be
     achieved by putting the following expression in the
     JDL file:
      Requirements =
        RegExp("cern.ch", other.GlueCEUniqueID);

   – The opposite can be required by using:
       Requirements =
       (!RegExp("cern.ch", other.GlueCEUniqueID));
• If the job must run on a CE where a particular
  experiment software is installed and this information is
  published by the CE, something like the following must
  be written:

Requirements = Member("BLAST-1.0.3",
other.GlueHostApplicationSoftwareRunTimeEnvironment);


Note: The Member operator is used to test if its first argument
  (a scalar value) is a member of its second argument (a list).
  In fact, the GlueHostApplicationSoftwareRunTimeEnvironment
  attribute is a list of strings and is used to publish any VO-
  specific information relative to the CE (typically, information
  on the VO software available on that CE).
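Because only the last Requirements attribute is considered, several conditions have to end up in one expression. The Python sketch below joins conditions with && into a single attribute. The helpers `member`, `regexp` and `requirements` are hypothetical names, and the expressions they emit simply mirror the forms shown on these slides.

```python
def member(value, attribute):
    """Condition: the string value is in the list-valued attribute."""
    return f'Member("{value}", {attribute})'


def regexp(pattern, attribute, negate=False):
    """Condition: the attribute matches the pattern (or does not, if negated)."""
    expression = f'RegExp("{pattern}", {attribute})'
    return f"(!{expression})" if negate else expression


def requirements(*conditions):
    """Combine all conditions into one Requirements attribute."""
    if not conditions:
        raise ValueError("at least one condition is required")
    # Parenthesise each condition so that && binds as intended.
    return "Requirements = " + " && ".join(f"({c})" for c in conditions) + ";"


if __name__ == "__main__":
    print(requirements(
        'other.GlueCEInfoLRMSType == "PBS"',
        "other.GlueCEInfoTotalCPUs > 1",
        member("BLAST-1.0.3",
               "other.GlueHostApplicationSoftwareRunTimeEnvironment"),
    ))
```

Building the expression in one place makes it harder to add a second Requirements line by accident.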
Advanced job types
• Job Collection: a set of independent jobs that the user can
  submit and monitor as if it were a single job
[
     Type = "Collection";

     nodes = { [
                     Executable = "/bin/hostname";
                     Arguments = "-f";
                     StdOutput = "hostname.out";
                     StdError = "hostname.err";
                     OutputSandbox = {"hostname.err","hostname.out"};
     ],[
                     Executable = "/bin/sh";
                     Arguments = "start_povray_valve.sh";
                     StdOutput = "povray.out";
                     StdError = "povray.err";
                     InputSandbox = {"start_povray_valve.sh"};
                     OutputSandbox = {"povray.err","povray.out"};

                     Requirements = Member("POVRAY-3,5",
       other.GlueHostApplicationSoftwareRunTimeEnvironment);
     ] };
]
Advanced job types
• Parametric Job: a job collection where the jobs are identical
  but for the value of a running parameter

                JobType = "Parametric";
                Executable = "/bin/echo";
                Arguments = "_PARAM_";
                StdOutput = "myoutput_PARAM_.txt";
                StdError = "myerror_PARAM_.txt";
                Parameters = 3;
                ParameterStep = 1;
                ParameterStart = 1;
                OutputSandbox = {"myoutput_PARAM_.txt"};
Advanced job types
• DAG is a set of jobs where the input, output, or execution of
     one or more jobs depends on one or more other ones
              • The jobs are nodes (vertices) in the graph
              • The edges (arcs) identify the dependencies
[Figure: dependency graph with nodeA at the top, nodeB, nodeC and nodeF below it, and nodeD at the bottom]
Type = "dag";
max_nodes_running = 5;
InputSandbox = {"/tmp/foo/*.exe", "/home/larocca/bar", "gsiftp://neo.datamat.it:5678/tmp/cms_sim.exe", "file:///tmp/myconf"};
InputSandboxBaseURI = "gsiftp://matrix.datamat.it:5432/tmp";
nodes = [
    nodeA = [
        description = [
            JobType = "Normal";
            Executable = "a.exe";
            InputSandbox = { "/home/larocca/myfile.txt", root.InputSandbox };
        ];
    ];
    nodeF = [
        description = [
            JobType = "Normal";
            Executable = "b.exe";
            Arguments = "1 2 3";
            OutputSandbox = { "myoutput.txt", "myerror.txt" };
        ];
    ];
    nodeD = [
        description = [
            JobType = "Checkpointable";
            Executable = "b.exe";
            Arguments = "1 2 3";
            InputSandbox = { "file:///home/larocca/data.txt",
                             root.nodes.nodeF.description.OutputSandbox[0] };
        ];
    ];
    nodeC = [ file = "/home/larocca/nodec.jdl"; ];
    nodeB = [ file = "foo.jdl"; ];
];
dependencies = { { nodeA, nodeB }, { nodeA, nodeC }, { nodeA, nodeF }, { { nodeB, nodeC, nodeF }, nodeD } };
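To see what the dependencies attribute implies for the execution order, the pairs can be fed to a topological sort. The Python sketch below does this with the standard-library `graphlib` module (Python 3.9 or later). It assumes that each pair reads as "the second element runs after the first has finished". The `DEPENDENCIES` list is a manual transcription of the attribute above, and the function names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (parents, children) pairs transcribed from the dependencies attribute.
DEPENDENCIES = [
    (["nodeA"], ["nodeB"]),
    (["nodeA"], ["nodeC"]),
    (["nodeA"], ["nodeF"]),
    (["nodeB", "nodeC", "nodeF"], ["nodeD"]),
]


def build_graph(dependencies):
    """Return a mapping node -> set of nodes it must wait for."""
    graph = {}
    for parents, children in dependencies:
        for parent in parents:
            graph.setdefault(parent, set())
        for child in children:
            graph.setdefault(child, set()).update(parents)
    return graph


def execution_waves(dependencies):
    """Group nodes into waves; each wave can start once the previous is done."""
    sorter = TopologicalSorter(build_graph(dependencies))
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(DEPENDENCIES), start=1):
        print(number, ", ".join(wave))
```

For this DAG the waves are nodeA, then nodeB, nodeC and nodeF together, then nodeD. How many nodes of a wave actually run at once is still limited by max_nodes_running.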
References
WMProxy User’s guide
  https://edms.cern.ch/file/674643/1/EGEE-JRA1-TEC-674643-WMPROXY-guide

JDL Attributes Specification
  https://edms.cern.ch/file/555796/1/EGEE-JRA1-TEC-555796-JDL-Attributes-v0-8.pd
  https://edms.cern.ch/file/590869/1/EGEE-JRA1-TEC-590869-JDL-Attributes-v0-9.pd

gLite 3.1 user’s guide
  https://edms.cern.ch/file/722398/1.2/gLite-3-UserGuide.pdf

Complex jobs
  https://grid.ct.infn.it/twiki/bin/view/GILDA/WmProxyUse

WMProxy API usage
  http://www.euasiagrid.org/wiki/index.php/WMProxy_Java_API
  https://grid.ct.infn.it/twiki/bin/view/GILDA/WMProxyCPPAPI
The gLite stack: overview
Storage Elements

• The Storage Element is the service which allows a user or an
  application to store data for future retrieval.

• The DMS provides services to locate, access and transfer files
   – The user does not need to know the physical location of a file, just its
     logical file name;
   – Files can be replicated or transferred to several locations (SEs) as
     needed;
   – Files are shared with all the members of the given VO.

• Files stored in an SE are write-once, read-many
   – Files cannot be changed, only removed or replaced.
Protocols




– The GSIFTP protocol offers the functionalities of FTP, but
  with support for GSI. It is responsible for secure, fast and
  efficient file transfers to/from Storage Elements.
– RFIO was developed to access tape archiving systems, such
  as CASTOR (CERN Advanced STORage manager) and it
  comes in a secure and an insecure version.
– The gsidcap protocol is the GSI enabled version of the
  dCache native access protocol, dcap.
Types of Storage Elements /1

• In WLCG/EGEE, different types of Storage Elements are
  available:

• CASTOR. It consists of a disk buffer frontend to a tape
  mass storage system. A virtual file system (namespace)
  shields the user from the complexities of the underlying
  disk and tape setup. File migration between disk and
  tape is managed by a process called “stager”. The
  native storage protocol, the insecure RFIO, allows
  access to files in the SE. Since the protocol is not
  GSI-enabled, RFIO access is allowed only from a location
  in the same LAN as the SE. With the proper modifications,
  the CASTOR disk buffer can also be used as a disk-only
  storage system.
Types of Storage Elements /2

• StoRM. It has been designed to support space
  reservation and direct access (native POSIX I/O calls),
  as well as other standard libraries (like RFIO).

• StoRM takes advantage of high-performance parallel
  file systems like GPFS (from IBM).
   – In addition, standard POSIX file systems are supported
      (XFS from SGI and ext3).

• StoRM takes advantage of the ACL support provided by the
  underlying file systems to implement the security
  models.
Types of Storage Elements /3

• dCache. It consists of a server and one or more pool
  nodes. The server represents the single point of access
  to the SE and presents files in the pool disks under a
  single virtual file system tree. Nodes can be
  dynamically added to the pool. The native gsidcap
  protocol allows POSIX-like data access. dCache is
  widely employed as a disk buffer frontend to many mass
  storage systems, like HPSS and Enstore, as well as a
  disk-only storage system.

• LCG Disk pool manager. It is a lightweight disk pool
  manager, suitable for relatively small sites (max 10 TB
  of total space). Disks can be added dynamically to the
  pool at any time. As in dCache and CASTOR, a virtual
  file system hides the complexity of the disk pool
  architecture. The secure RFIO protocol allows file
  access from the WAN.
The Storage Resource Manager

The Storage Resource Manager (SRM) has been
  designed to be the single interface for the
  management of disk and tape storage resources.

Any type of Storage Element in WLCG/EGEE offers
  an SRM interface except for the Classic SE, which
  is being phased out.

SRM hides the complexity of the resource setup
  behind it and allows the user to request files,
  keep them on a disk buffer for a specified lifetime,
  reserve space for new entries, and so on.
   – In gLite, interactions with the SRM are hidden
     by high-level services (DM tools and APIs)
The gLite Storage Element
Grid file referencing schemes

     LFN                GUID              SURL              TURL

•   Logical File Name (LFN)
     – lfn:/grid/gilda/tutorials/input-file
•   Grid Unique IDentifier (GUID)
     – guid:4d57edef-fa5c-4512-a345-1c838916b357
•   Storage URL (for a specific replica, on a specific Storage
    Element)
     – srm://aliserv6.ct.infn.it/gilda/generated/2007-11-13/fileb366f371-b2c0-485d-b12c-c114edaf4db4
     – sfn://se01.athena.hellasgrid.gr/data/dteam/doe/file1
•   Transport URL (for a specific replica, on an SE, with a specific
    protocol)
     – gsiftp://aliserv6.ct.infn.it/gilda/generated/2007-11-13/fileb366f371-b2c0-485d-b12c-c114edaf4db4
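The four schemes can be told apart by the prefix before the first colon. The Python sketch below classifies a Grid file name that way. The `SCHEMES` table and the function `classify` are hypothetical, and treating gsidcap and rfio as TURL prefixes is an assumption based on the protocols named elsewhere in this deck.

```python
# Prefix -> kind of Grid file reference. gsidcap and rfio are assumed to be
# used as TURL prefixes, like the gsiftp example on the slide.
SCHEMES = {
    "lfn": "LFN",
    "guid": "GUID",
    "srm": "SURL",
    "sfn": "SURL",
    "gsiftp": "TURL",
    "gsidcap": "TURL",
    "rfio": "TURL",
}


def classify(name):
    """Return 'LFN', 'GUID', 'SURL', 'TURL' or 'unknown' for a file name."""
    scheme, separator, _rest = name.partition(":")
    if not separator:
        raise ValueError("no scheme prefix in %r" % name)
    return SCHEMES.get(scheme.lower(), "unknown")


if __name__ == "__main__":
    for example in (
        "lfn:/grid/gilda/tutorials/input-file",
        "guid:4d57edef-fa5c-4512-a345-1c838916b357",
        "sfn://se01.athena.hellasgrid.gr/data/dteam/doe/file1",
    ):
        print(classify(example), example)
```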
LCG File Catalog
[Figure: several symlinks point to one LFN. The replica catalog maps the LFN to a GUID and the GUID to one or more SURLs. The SRM interface turns a SURL into a TURL for one of various protocols: gsiftp, gsidcap, rfio.]
Needles in a haystack
• How do I keep track of all the files I have on the Grid?

• How does the Grid keep track of the mapping
  between LFN(s), GUID and SURL(s)?


      LFC File Catalogue
      LFC = LCG File Catalogue
      LCG = LHC Computing Grid
      LHC = Large Hadron Collider

• The LCG File Catalogue is the service which
  maintains mappings between LFN(s), GUID
  and SURL(s).
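The mappings can be pictured as two lookup tables: one from each LFN to a GUID, and one from each GUID to the SURLs of its replicas. The Python class below is a toy in-memory illustration of that idea only. It is not the LFC API, the class and method names are hypothetical, and the SURL `srm://se.example.org/gilda/file1` is made up for the example.

```python
class ToyCatalogue:
    """In-memory stand-in for the LFN -> GUID -> SURL mappings."""

    def __init__(self):
        self.guid_of = {}   # LFN -> GUID (several LFNs may share one GUID)
        self.surls_of = {}  # GUID -> list of SURLs, one per replica

    def register(self, lfn, guid, surl):
        """Record a new file with its first LFN and first replica."""
        self.guid_of[lfn] = guid
        self.surls_of.setdefault(guid, []).append(surl)

    def add_alias(self, guid, lfn):
        """Attach a further LFN to an already registered GUID."""
        if guid not in self.surls_of:
            raise KeyError(guid)
        self.guid_of[lfn] = guid

    def add_replica(self, guid, surl):
        """Attach a further SURL (replica) to an already registered GUID."""
        self.surls_of[guid].append(surl)

    def replicas(self, lfn):
        """Resolve an LFN to the SURLs of all its replicas."""
        return list(self.surls_of[self.guid_of[lfn]])


if __name__ == "__main__":
    catalogue = ToyCatalogue()
    guid = "guid:db7ddbc5-613e-423f-9501-3c0c00a0ae24"
    catalogue.register("lfn:/grid/gilda/myalias1", guid,
                       "sfn://aliserv6.ct.infn.it/data/gilda/larocca/file1")
    catalogue.add_replica(guid, "srm://se.example.org/gilda/file1")  # made up
    print(catalogue.replicas("lfn:/grid/gilda/myalias1"))
```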
LFC File Catalogue

• It consists of a unique catalogue, where the LFN is the
  main key. Further LFNs can be added as symlinks to the
  main LFN.
   – Looks like a “top-level” directory in the Grid
   – For each supported VO, a separate subdirectory
     exists under the “/grid” directory
   – All the members of the VO have read/write
     permissions
   – System metadata are supported, while for user
     metadata only a single string entry is available

• The catalogue publishes its endpoint in the Information
  Service so that it can be discovered by Data
  Management tools and other services (the WMS for
  example).
Architecture of the LFC Catalogue
• LFN acts as main key in the database.
  It has:
   – Symbolic links to it (additional LFNs)
   – System metadata
   – Information on replicas
   – One field of user metadata
   – Access Control Lists
   – Integration with VOMS
     (VirtualID and VirtualGID)
   – C language API
Before starting

• Users can interact with the file catalogue through CLIs
  and APIs.
   – The environment variable LFC_HOST
     (e.g.: LFC_HOST=lfc-gilda.ct.infn.it)
     must contain the host name of the LFC server
     to be used.

• The directory structure of the LFC namespace has the
  form: /grid/<VO>/<subpaths>
   – Users of a given VO will have read and write
     permissions only under the corresponding
     <VO> subdirectory.
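Since write access is limited to the VO's own subtree, a script can check a path before it tries to create or register anything there. The Python sketch below builds paths of the form /grid/<VO>/<subpaths> and tests whether a path stays inside the VO directory. The function names are hypothetical, and the check is purely lexical: it does not contact the LFC.

```python
import posixpath


def vo_root(vo):
    """Top directory of a VO in the LFC namespace."""
    return f"/grid/{vo}"


def lfc_path(vo, *subpaths):
    """Build /grid/<VO>/<subpaths> from its components."""
    return posixpath.join(vo_root(vo), *subpaths)


def is_under_vo(path, vo):
    """True if the normalised path is the VO directory or lies below it."""
    root = vo_root(vo)
    normalised = posixpath.normpath(path)  # resolves '..' and duplicate slashes
    return normalised == root or normalised.startswith(root + "/")


if __name__ == "__main__":
    print(lfc_path("gilda", "tutorials", "taipei02"))
    print(is_under_vo("/grid/gilda/../dteam/file1", "gilda"))
```

Normalising first keeps paths containing ".." from appearing to be inside the VO directory when they are not.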
LFC Commands
lfc-chmod        Change access mode of the LFC file/directory

lfc-chown        Change owner and group of the LFC file/directory

lfc-delcomment   Delete the comment associated with the
                 file/directory
lfc-getacl       Get file/directory access control lists

lfc-ln           Make a symbolic link to a file/directory

lfc-ls           List file/directory entries in a directory

lfc-mkdir        Create a directory

lfc-rename       Rename a file/directory

lfc-rm           Remove a file/directory

lfc-setacl       Set file/directory access control lists

lfc-setcomment   Add/replace a comment
lfc-ls

• Listing the entries of a LFC directory
   – lfc-ls [-cdiLlRTu] [--class] [--comment] [--deleted] [--display_side] [--ds] path…
    – where path specifies the LFN pathname (mandatory)
    – Remember that LFC has a directory tree structure
    – /grid/<VO_name>/<you create it>
      (/grid/<VO_name>/ is the LFC namespace; the rest is defined by the user)


    – All members of a VO have read-write permissions under
      their directory
    – You can set LFC_HOME to use relative paths

            lfc-ls     /grid/gilda/tutorials/taipei02
            export     LFC_HOME=/grid/gilda/tutorials
            lfc-ls     -l taipei02
            lfc-ls     -l -R /grid
lfc-mkdir

• Creating directories in the LFC
   – lfc-mkdir [-m mode] [-p] path...

• Where path specifies the LFC pathname
• Remember that while registering a new file (using lcg-cr,
  for example) the corresponding destination
  directory must be created in the catalog beforehand.

• Examples:
     lfc-mkdir /grid/gilda/<YOUR_DIRECTORY>
     (<YOUR_DIRECTORY> is created by the user)
lfc-ln
• Creating a symbolic link
   – lfc-ln -s file linkname
   – lfc-ln -s directory linkname

   – Create a link to the specified file or directory with
     linkname

   Examples:
   – lfc-ln -s   /grid/gilda/test              /grid/gilda/aLink
                      Original File                 Symbolic Link



   Let’s check the link using lfc-ls with long listing
   – lfc-ls -l aLink
     lrwxrwxrwx   1 19122       1077       0 Jun 14 11:58 aLink -> /grid/gilda/test
Access Control List (ACL)

• LFC allows an access control list (ACL) to be attached to a file
  or directory: a list of permissions which specify who is allowed to
  access or modify it. The permissions are very much like those of
  a UNIX file system: read (r), write (w) and execute (x).

• In LFC, users and groups are internally identified as numerical
  virtual uids and virtual gids, which are virtual in the sense that
  they exist only in the LFC namespace.
   – A user can be specified as a name, as a virtual uid or as a
      DN.
   – A group can be specified as a name, as a virtual gid or as a
      VOMS FQAN.

• A directory in LFC also has a default ACL (which is the ACL
  associated with any file or directory being created under that
  directory). After creation, the ACLs can be freely changed.
   – When creating a sub-directory, its default ACL is inherited
     from the parent directory
Print the ACL of a directory
$ lfc-getacl /grid/gilda/tutorials/test-acl

  # file: /grid/gilda/tutorials/test-acl
  # owner: /C=IT/O=INFN/OU=Personal Certificate/L=Catania/CN=Giuseppe La Rocca/Email=giuseppe.larocca@ct.infn.it
  # group: gilda
  user::rwx
  group::rwx #effective:rwx
  other::r-x
  default:user::rwx
  default:group::rwx
  default:other::r-x


In this example, the owner and all users in the gilda group
  have full privileges to the directory, while other users cannot
  write into it.
Modify the ACL

   lfc-setacl [-d] [-m] [-s] acl_entries path

The -m option means that we are modifying the existing
  ACL. Other options of lfc-setacl are -d to remove ACL
  entries, and -s to replace the complete set of ACL
  entries.
acl_entries is a comma-separated list of entries. Each entry
  has colon-separated fields: ACL type, id (uid or gid),
  permission. Only directories can have default ACL
  entries!

The entries look like: user::perm       default:user::perm
                       user:uid:perm    default:user:uid:perm
                       group::perm      default:group::perm
                       group:gid:perm   default:group:gid:perm
                       mask:perm        default:mask:perm
                       other:perm       default:other:perm
Modify the ACL of a directory


Let's change the default ACL, with read/write
  permission for user and group, and no privileges
  for others.
   – The syntax we apply here is modify (-m)
     default (d:) for user (u:), and the same of
     course for group and others.

  $ lfc-setacl -m d:u::6,d:g::6,d:o:0 \
                   $LFC_HOME/test-acl/
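The lfc-getacl listing prints permissions as rwx strings, while the command above gives them as digits. The Python sketch below converts one into the other, assuming the digits follow the usual UNIX octal convention (4 = read, 2 = write, 1 = execute). That matches the slide's use of 6 for read/write and 0 for no privileges. The function name `perm_to_rwx` is hypothetical.

```python
def perm_to_rwx(perm):
    """Convert a permission digit (0 to 7) to its 'rwx' string form."""
    if not 0 <= perm <= 7:
        raise ValueError("permission must be between 0 and 7")
    bits = (("r", 4), ("w", 2), ("x", 1))
    return "".join(flag if perm & bit else "-" for flag, bit in bits)


if __name__ == "__main__":
    for digit in (6, 0):
        print(digit, perm_to_rwx(digit))
```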
Adding metadata information

The lfc-setcomment and lfc-delcomment commands allow the
  user to associate a comment with a catalogue entry and delete
  such comment. This is the only user-defined metadata that
  can be associated with catalogue entries.

The comments for the files may be listed using the --comment
  option of the lfc-ls command. This is shown in the following
  example:

$ lfc-setcomment /grid/gilda/file1 "My metadata"

$ lfc-ls --comment /grid/gilda/file1
  /grid/gilda/file1 My metadata
LCG Data Management Client Tools
•    The LCG Data Management tools allow users to copy files between
     UI, WN and a SE, to register entries in the file catalogue and
     replicate files between SEs.
    lcg-cp    Copies a Grid file to a local destination
    lcg-cr    Copies a file to a SE and registers it in the catalogue
    lcg-del   Deletes one file (either one replica or all the replicas)
    lcg-rep   Copies a file from one SE to another SE and registers it
              in the catalogue
    lcg-gt    Gets the TURL for a given SURL and transfer protocol
    lcg-aa    Adds an alias in the catalogue for a given GUID
    lcg-ra    Removes an alias in the catalogue for a given GUID
    lcg-rf    Registers in the catalogue a file residing on a SE
    lcg-uf    Unregisters in the catalogue a file residing on a SE
    lcg-la    Lists the aliases for a given LFN, GUID or SURL
    lcg-lg    Gets the GUID for a given LFN or SURL
    lcg-lr    Lists the replicas for a given LFN, GUID or SURL
Environment variables /1
• The --vo <vo name> option, to specify the virtual
  organisation of the user, is present in all commands,
  except for lcg-gt. Its usage is mandatory unless the
  variable LCG_GFAL_VO is set (e.g.: export
  LCG_GFAL_VO=gilda)

Timeouts
  The commands lcg-cr, lcg-del, lcg-gt, lcg-rf, lcg-sd and
  lcg-rep all implement timeouts.
  The -t option lets the user specify the timeout in
  seconds.
  The default is 0 seconds, that is, no timeout.
  If an operation times out, all actions performed up to
  that moment are rolled back, so that no broken files are
  left on an SE and the catalogues stay consistent with
  the files that actually exist.
Environment variables /2


• For all lcg-* commands to work, the environment
  variable LCG_GFAL_INFOSYS must be set to point to a
  top BDII in the format <hostname>:<port>, so that
  the commands can retrieve the necessary information

  export LCG_GFAL_INFOSYS=gilda-bdii.ct.infn.it:2170

• The VO_<VO>_DEFAULT_SE variable specifies the
  default SE for the VO.

  export VO_GILDA_DEFAULT_SE=aliserv6.ct.infn.it
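
The settings described on the two "Environment variables" slides can be checked before any lcg-* command is run. The following is a minimal sketch, not part of gLite: the function names is_host_port and check_lcg_env are hypothetical, and the sketch assumes a VO name made only of letters and digits (as in "gilda"), which is upper-cased to build the VO_<VO>_DEFAULT_SE variable name.

```shell
# Illustrative pre-flight check; these function names are hypothetical, not gLite tools.

# Succeeds if the argument looks like <hostname>:<port>.
is_host_port() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9.-]+:[0-9]+$'
}

# check_lcg_env [vo]: prints one line per problem and returns non-zero if a
# mandatory setting is missing. A missing default SE is only reported as a note.
check_lcg_env() {
    local vo status var default_se
    vo="${1:-${LCG_GFAL_VO:-}}"
    status=0
    if ! is_host_port "${LCG_GFAL_INFOSYS:-}"; then
        echo "error: LCG_GFAL_INFOSYS must be set to <hostname>:<port>"
        status=1
    fi
    if [ -z "$vo" ]; then
        echo "error: set LCG_GFAL_VO or pass --vo <vo name>"
        return 1
    fi
    # Assumption: the VO name is plain letters and digits, as in "gilda".
    var="VO_$(printf '%s' "$vo" | tr '[:lower:]' '[:upper:]')_DEFAULT_SE"
    if printf '%s\n' "$var" | grep -Eq '^[A-Z0-9_]+$'; then
        eval "default_se=\${$var:-}"
        if [ -z "$default_se" ]; then
            echo "note: $var is not set, so pass -d <SE> explicitly"
        fi
    fi
    return "$status"
}
```

Calling check_lcg_env with no argument uses LCG_GFAL_VO; calling check_lcg_env gilda checks the settings for a VO that will be passed with --vo.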
Uploading a file to the Grid /1

$ lcg-cr --vo gilda -d aliserv6.ct.infn.it \
  file:/home/larocca/file1

  guid:6ac491ea-684c-11d8-8f12-9c97cebf582a

  where the only argument is the local file to be
  uploaded and the -d <destination> option
  indicates the SE used as the destination for the
  file. The command returns the file GUID.

  If no destination is given, the SE specified by the
  VO_<VO>_DEFAULT_SE environment variable is used.

  The -P option allows the user to specify a relative path
  name for the file in the SE. If no -P option is given, the
  relative path is automatically generated.
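
Because lcg-cr returns the GUID on its output, a script usually wants to capture it for later commands. The sketch below is one possible way to do that; extract_guid is a hypothetical helper name, and the sketch assumes the GUID appears as a "guid:" token followed by the 36-character identifier, as in the example above.

```shell
# Reads command output on standard input and prints the first GUID found.
# extract_guid is a hypothetical helper, not an lcg-* tool.
extract_guid() {
    grep -Eo 'guid:[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}' | head -n 1
}

# Example using the output shown above instead of a live lcg-cr call:
printf '%s\n' 'guid:6ac491ea-684c-11d8-8f12-9c97cebf582a' | extract_guid
# prints guid:6ac491ea-684c-11d8-8f12-9c97cebf582a
```

On a real UI the helper would sit at the end of a pipeline, for example guid=$(lcg-cr --vo gilda -d aliserv6.ct.infn.it file:/home/larocca/file1 | extract_guid).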
Uploading a file to the Grid /2
The following are examples of the different ways to
  specify a destination:

  -d aliserv6.ct.infn.it
  -d srm://aliserv6.ct.infn.it/data/gilda/my_file
  -d aliserv6.ct.infn.it -P my_dir/my_file

The -l <lfn> option can be used to specify an LFN:

$ lcg-cr --vo gilda -d aliserv6.ct.infn.it \
             -l lfn:/grid/gilda/myalias1 \
             file:/home/larocca/file1

  guid:db7ddbc5-613e-423f-9501-3c0c00a0ae24
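
The optional parts of an upload (-d, -P and -l) can be combined freely, which is easy to get wrong when typing long command lines. The following sketch only prints the resulting lcg-cr command so that it can be reviewed first; build_lcg_cr and its argument order are assumptions made for this example, and the printed line is meant for paths without spaces.

```shell
# build_lcg_cr <vo> <local file> [SE] [relative path] [LFN path]
# Prints an lcg-cr command line; empty optional arguments are skipped.
# The function name and argument order are hypothetical.
build_lcg_cr() {
    local vo local_file se relpath lfn cmd
    vo="$1"
    local_file="$2"
    se="${3:-}"
    relpath="${4:-}"
    lfn="${5:-}"
    cmd="lcg-cr --vo $vo"
    if [ -n "$se" ]; then cmd="$cmd -d $se"; fi            # destination SE
    if [ -n "$relpath" ]; then cmd="$cmd -P $relpath"; fi  # relative path on the SE
    if [ -n "$lfn" ]; then cmd="$cmd -l lfn:$lfn"; fi      # logical file name
    printf '%s file:%s\n' "$cmd" "$local_file"
}
```

For instance, build_lcg_cr gilda /home/larocca/file1 aliserv6.ct.infn.it "" /grid/gilda/myalias1 prints the same command as the -l example above, on a single line.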
Replicating a file

$ lcg-rep -v --vo gilda -d <SECOND_SE> \
  guid:db7ddbc5-613e-423f-9501-3c0c00a0ae24

Source URL:
sfn://aliserv6.ct.infn.it/data/gilda/larocca/file1
File size: 30
Destination specified: <SECOND_SE>
Source URL for copy:
gsiftp://aliserv6.ct.infn.it/data/gilda/larocca/file1
Destination URL for copy:
gsiftp://<SECOND_SE>/data/gilda/generated/2004-07-09/file50c0752c-f61f-4bc3-b48e-af3f22924b57

# streams: 1
Transfer took 2040 ms
Destination URL registered in LRC:
  srm://<SECOND_SE>/data/gilda/generated/2004-07-09/file50c0752c-f61f-4bc3-b48e-af3f22924b57
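
When a file has to be replicated to more than one SE, the lcg-rep call above is simply repeated with a different -d value. A minimal sketch of that loop follows; plan_replication is a hypothetical helper that only prints the commands, and the host names se1.example.org and se2.example.org used below are placeholders, not real SEs.

```shell
# plan_replication <vo> <file identifier> <SE> [<SE> ...]
# Prints one lcg-rep command per destination SE (hypothetical helper).
plan_replication() {
    local vo file se
    vo="$1"
    file="$2"
    shift 2
    for se in "$@"; do
        printf 'lcg-rep -v --vo %s -d %s %s\n' "$vo" "$se" "$file"
    done
}

# Placeholder SE names; prints two lcg-rep command lines.
plan_replication gilda guid:db7ddbc5-613e-423f-9501-3c0c00a0ae24 \
    se1.example.org se2.example.org
```

Printing the commands first makes it possible to check the destinations before running anything against the Grid.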
Listing replicas

$ lcg-lr --vo gilda \
  lfn:/grid/gilda/tutorials/larocca/my_alias1

  srm://aliserv6.ct.infn.it/data/gilda/generated/2004-07-09/file79aee616-6cd7-4b75-8848-f091

  srm://<SECOND_SE>/data/gilda/generated/2004-07-08/file0dcabb46-2214-4db8-9ee8-2930

Again, an LFN, the GUID or a SURL can be used to specify
  the file.
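
The replica list is often used to find out which SEs already hold a copy. The sketch below filters SURLs such as those printed by lcg-lr and keeps only the host part; surl_hosts and has_replica_on are hypothetical helper names, and the sketch assumes one srm:// or sfn:// URL per line.

```shell
# Reads SURLs (one per line) and prints the host name of each one.
surl_hosts() {
    sed -nE 's#^[[:space:]]*(srm|sfn)://([^/:]+).*$#\2#p'
}

# has_replica_on <SE host>: succeeds if the SURL list on standard input
# contains a replica on that host.
has_replica_on() {
    surl_hosts | grep -Fxq "$1"
}
```

A typical use, assuming the lcg-lr command from the slide above, would be lcg-lr --vo gilda lfn:/grid/gilda/tutorials/larocca/my_alias1 | has_replica_on aliserv6.ct.infn.it.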
Copying files out of the Grid

$ lcg-cp --vo gilda -t 100 -v \
  lfn:/grid/gilda/tutorials/mytext.txt \
  file:/tmp/mytext.txt


Source URL: lfn:/grid/gilda/mytext.txt
File size: 104857600
Source URL for copy:
gsiftp://aliserv6.ct.infn.it:/storage/gilda/2007-07-06/input2.dat.10.0
Destination URL: file:///tmp/myfile
# streams: 1
# set timeout to 100 (seconds)
  85983232 bytes 8396.77 KB/sec avg 9216.11
Transfer took 12040 ms
Deleting replicas /1

A file stored on an SE and registered in the LFC can be
   deleted using the lcg-del command.

• If a SURL is provided as argument, then that
  particular replica will be deleted.

• If an LFN or GUID is given instead, then the -s <SE>
  option must be used to indicate which one of the
  replicas must be erased.

$ lcg-del --vo gilda -s aliserv6.ct.infn.it \
  guid:91b89dfe-ff95-4614-bad2-c538bfa28fac
Deleting replicas /2
• If the -a option is used, all the replicas of the given file
  will be deleted and unregistered from the catalogue.

$ lcg-del --vo gilda -a \
  guid:91b89dfe-ff95-4614-bad2-c538bfa28fac
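
The rules from the two "Deleting replicas" slides (a SURL needs no extra option, an LFN or GUID needs either -s <SE> or -a) can be written down as a small decision function. This is only a sketch: build_lcg_del is a hypothetical name, the keyword "all" is a convention invented here to request -a, and the function prints the command instead of running it.

```shell
# build_lcg_del <vo> <identifier> [<SE host> | all]
# Prints the lcg-del command that matches the kind of identifier given.
build_lcg_del() {
    local vo id target
    vo="$1"
    id="$2"
    target="${3:-}"
    case "$id" in
        srm://*|sfn://*)
            # A SURL already names one particular replica.
            echo "lcg-del --vo $vo $id" ;;
        lfn:*|guid:*)
            if [ "$target" = "all" ]; then
                echo "lcg-del --vo $vo -a $id"
            elif [ -n "$target" ]; then
                echo "lcg-del --vo $vo -s $target $id"
            else
                echo "error: an LFN or GUID needs -s <SE> or -a" >&2
                return 1
            fi ;;
        *)
            echo "error: unrecognised identifier: $id" >&2
            return 1 ;;
    esac
}
```

Refusing an LFN or GUID without a target avoids issuing a delete request whose scope is ambiguous.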
Registering Grid files

The lcg-rf command registers a file that is physically
  present on an SE, creating a GUID-SURL mapping in the
  catalogue.

The -g <GUID> option specifies the GUID to use (otherwise
  one is created automatically).

$ lcg-rf --vo gilda \
  -l lfn:/grid/gilda/newfile \
  srm://aliserv6.ct.infn.it/data/gilda/generated/2004-07-08/file0dcabb46-2214-4db8-9ee8-2930de1

  guid:baddb707-0cb5-4d9a-8141-a046659d243b
Unregistering Grid files

lcg-uf deletes a GUID-SURL mapping from the catalogue
   (the GUID and the SURL are, respectively, the first and
   second argument of the command):

$ lcg-uf --vo gilda \
  guid:baddb707-0cb5-4d9a-8141-a046659d243b \
  srm://aliserv6.ct.infn.it/data/gilda/generated/2004-07-08/file0dcabb46-2214-4db8-9ee8-2930de1

If the last replica of a file is unregistered, the
   corresponding GUID-LFN mapping is also removed.

  Attention!
  lcg-uf just removes entries from the catalogue; it does
  not delete the physical file from the SE.
Working with large datasets
• The InputSandbox and OutputSandbox attributes are the
  basic way to move files to and from the User Interface
  (UI) and the Worker Node (WN).

• However, there are other ways to move files to and from
  the WN especially when large files (> 10 MB) are involved
[Figure: data flow between the “User interface”, the WMS, the LCG File Catalogue (LFC), the Computing Element and Storage Element 2. Arrow labels: Input “sandbox”, Output “sandbox”, DataSets info, Input “sandbox” + Broker Info, Output “sandbox”.]
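
One common pattern for large files, assumed here rather than taken from the slide, is to keep the sandboxes small and let the job script itself fetch its input from an SE and store its output there with the client tools described earlier. The sketch below shows such a wrapper; run, stage_in and stage_out are hypothetical names, DRY_RUN is a variable invented for this example, and only lcg-cp and lcg-cr options that appear on the previous slides are used.

```shell
# Runs a command, or only prints it when DRY_RUN=1 (useful for testing on the UI).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# stage_in <vo> <LFN path> <local path>: copy a Grid file to the worker node.
stage_in() {
    run lcg-cp --vo "$1" "lfn:$2" "file:$3"
}

# stage_out <vo> <SE> <LFN path> <local path>: upload a result file to an SE
# and register it in the catalogue under the given LFN.
stage_out() {
    run lcg-cr --vo "$1" -d "$2" -l "lfn:$3" "file:$4"
}

# Typical use inside the job script (my_analysis is a hypothetical executable):
#   stage_in  gilda /grid/gilda/tutorials/mytext.txt "$PWD/mytext.txt"
#   ./my_analysis mytext.txt > result.out
#   stage_out gilda aliserv6.ct.infn.it /grid/gilda/tutorials/result.out "$PWD/result.out"
```

With this layout the InputSandbox only needs to carry the wrapper script, and the OutputSandbox only the standard output and error files.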
Running more realistic jobs
 with the GENIUS Grid portal:

Porting “BLAST” & “MrBayes”
      applications to Grid
        Case study from
          CNR - ITB
The GENIUS Grid Portal architecture

[Figure: GENIUS Grid portal architecture. Sites shown: www.enginframe.com, www.nice-italy.com, www.infn.it]

• The GENIUS Grid portal (the version 4.2 license is free for
educational use) is built on top of the EnginFrame Java/XML framework;
• It is a gateway to the European EGEE Project middleware (and is
easily customizable for other middleware);
• It exposes gLite-enabled applications via web browsers
as well as Web Services.
What is EnginFrame?


• It is a web-based technology able to expose Grid
  services running on Grid infrastructures

• It allows organizations to provide application-oriented
  computing and data services to both users (via Web
  browsers) and applications (via SOAP/WSDL and/or
  RSS)

• It’s a Grid gateway!!

• It greatly simplifies the development of Web Portals
  exposing computing services that can run on a broad
  range of different computational Grid systems
About MrBayes

• MrBayes is a program for the Bayesian estimation of
  phylogeny.
• Bayesian inference of phylogeny is based on the posterior
  probability distribution of trees, which is the probability of
  a tree conditioned on the observations.
   – To approximate the posterior probability distribution of
      trees MrBayes uses a simulation technique called
      Markov Chain Monte Carlo (or MCMC).
• The program takes as input a character matrix in a NEXUS
  file format.
• The output is several files with the parameters that were
  sampled by the MCMC algorithm.
• The application is CPU demanding, especially if the MPI
  version of the software is used.
EnginFrame & MrBayes
The Users Tracking System (UTS) /1
The Users Tracking System (UTS) /2
About BLAST
BLAST (Basic Local Alignment Search Tool) provides a
  method for rapid searching of nucleotide and protein
  databases.
The program compares nucleotide or protein sequences to
  sequence databases and calculates the statistical
  significance of matches.
Thank you for your attention!

Larocca

  • 1. The gLite WMS and the Data Management System Giuseppe LA ROCCA INFN Catania giuseppe.larocca@ct.infn.it Master Class for Life Science, 4-6 May 2010 Singapore
  • 2. Outline • An introduction to the gLite WMS • Job Submission via WMS • Command line interface • Job status • The Job Description Language overview • JDL attributes • The gLite DMS – The Storage Resource Manager (SRM) • Grid file referencing schemes • LFC File Catalogue – Architecture – LFC commands • File & Replica Management Client Tools • Run bioinformatics applications via Grid portal
  • 3. The gLite stack: overview
  • 4. Overview of the WMS • The Workload Management System (WMS) is the gLite 3 component that allows users to submit jobs, and performs all tasks required to execute them, without exposing the user to the complexity of the Grid. • Workload Management System (WMS) comprises a set of Grid middleware components responsible for distribution and management of tasks across Grid resources. – The Workload Manager (WM) aims to accept and satisfy requests for job management coming from its clients. • WM will pass the job to an appropriate CE for execution taking into account requirements and the preferences expressed in the job description. • The decision of which resource should be used is the outcome of a matchmaking process. – The Logging and Bookkeeping service tracks jobs managed by the WMS. It collects events from many WMS components and records the status and history of the job.
  • 5. Job Submission via WMS GILDA User Interface create proxy Grid Site Computing Element Storage Element VO Management Service (DB of VO users)
  • 6. Job Submission via WMS GILDA User Interface Workload Information System Write JDL, Submit job Management (executable) + small inputs System query create proxy publish state Grid Site Computing Element Storage Element VO Management Service (DB of VO users)
  • 7. Job Submission via WMS GILDA User Interface Workload Information System Write JDL, Submit job Management (executable) + small inputs System query create proxy publish Submit job state Logging Grid Site Computing Element Storage Element VO Management process Service (DB of VO users) Logging and bookkeeping
  • 8. Job Submission via WMS GILDA User Interface Workload Information System Write JDL, Submit job Management (executable) + small inputs System query Retrieve status create & proxy (small) output files publish Submit job Retrieve state output Job Logging status Grid Site Computing Element Storage Element VO Management process Service (DB of VO users) Logging and bookkeeping
  • 9. The Command Line Interface • The gLite WMS implements two different services to manage jobs: the Network Server and the WMProxy. – The recommended method to manage jobs is through the gLite WMS via WMProxy, because it gives the best performance and allows to use the most advanced functionalities • The WMProxy implements several functionalities, among which: – submission of job collections; – faster authentication; – faster match-making; – faster response time for users; – higher job throughput.
  • 10. Proxy Delegation To explicitly delegate a user proxy to WMProxy, the command to use is: glite-wms-job-delegate-proxy -d <delegID> Example: $ glite-wms-job-delegate-proxy -d mydelegID Connecting to the service https://rb102.cern.ch:7443/glite_wms_wmproxy_server ======= glite-wms-job-delegate-proxy Success ======== Your proxy has been successfully delegated to the WMProxy: https://rb102.cern.ch:7443/glite_wms_wmproxy_server with the delegation identifier: mydelegID =====================================================
  • 11. Job Submission Starting from a simple JDL file, we can submit it via WMProxy by doing: $ glite-wms-job-submit –d mydelegID test.jdl Connecting to the service https://rb102.cern.ch:7443/glite_wms_wmproxy_server ======== glite-wms-job-submit Success ======== The job has been successfully submitted to the WMProxy Your job identifier is: https://rb102.cern.ch:9000/vZKKk3gdBla6RySximq_vQ ==============================================
  • 12. Listing CE(s) that matching a job It is possible to see which CEs are eligible to run a job described by a given JDL using: $ glite-wms-job-list-match –d mydelegID test.jdl Connecting to the service https://rb102.cern.ch:7443/glite_wms_wmproxy_server ==================================================== COMPUTING ELEMENT IDs LIST The following CE(s) matching your job requirements have been found: *CEId* - CE.pakgrid.org.pk:2119/jobmanager-lcgpbs-cms - grid-ce0.desy.de:2119/jobmanager-lcgpbs-cms - gw-2.ccc.ucl.ac.uk:2119/jobmanager-sge-default - grid-ce2.desy.de:2119/jobmanager-lcgpbs-cms
  • 13. Retrieving the status of a job $ glite-wms-job-status https://rb102.cern.ch:9000/fNdD4FW_Xxkt2s2aZJeoeg ===================================================== BOOKKEEPING INFORMATION: Status info for the Job : https://rb102.cern.ch:9000/fNdD4FW_Xxkt2s2aZJeoeg Current Status: Done (Success) Exit code: 0 Status Reason: Job terminated successfully Destination: ce1.inrne.bas.bg:2119/jobmanager-lcgpbs-cms Submitted: Mon Dec 4 15:05:43 2006 CET ===================================================== The verbosity level controls the amount of information provided. The value of the -v option ranges from 0 to 3. The commands to get the job status can have several jobIDs as arguments, i.e.: glite-wms-job-status <jobID1> ... or, more conveniently, the -i <file path> option can be used to
  • 14. Retrieving the output(s) $ glite-wms-job-output https://rb102.cern.ch:9000/yabp72aERhofLA6W2-LrJw Connecting to the service https://128.142.160.93:7443/glite_wms_wmproxy_server ===================================================== JOB GET OUTPUT OUTCOME Output sandbox files for the job: https://rb102.cern.ch:9000/yabp72aERhofLA6W2-LrJw have been successfully retrieved and stored in the directory: /tmp/doe_yabp72aERhofLA6W2-LrJw ===================================================== The default location for storing the outputs (normally /tmp) is defined in the UI configuration, but it is possible to specify in which directory to save the output using the --dir <path name> option.
  • 15. Cancelling a job $ glite-wms-job-cancel https://rb102.cern.ch:9000/P1c60RFsrIZ9mnBALa7yZA Are you sure you want to remove specified job(s) [y/n]y : y Connecting to the service https://128.142.160.93:7443/glite_wms_wmproxy_server ========== glite-wms-job-cancel Success ============ The cancellation request has been successfully submitted for the following job(s): - https://rb102.cern.ch:9000/P1c60RFsrIZ9mnBALa7yZA ==================================================== If the cancellation is successful, the job will terminate in status CANCELLED
  • 16. Job Submission with CLI GILDA User Interface glite-wms-job-delegate-proxy -d delegID glite-wms-job-list-match –d delegID hostname.jdl delegID glite-wms-job-submit -d delegID hostname.jdl  JobID glite-wms-job-status JobID glite-wms-job-output JobID Manage job voms-proxy-init --voms gilda Grid Site Computing Element Storage Element VO Management process Service (DB of VO users)
  • 18. Job Description Language • The Job Description Language (JDL) is a high-level language based on the Classified Advertisement (ClassAd) language, used to describe jobs and aggregates of jobs with arbitrary dependency relations. – The JDL is used in WLCG/EGEE to specify the desired job characteristics and constraints, which are taken into account by the WMS to select the best resource to execute the job. – A job description is a file (called JDL file) consisting of lines having the format: attribute = expression; – Expressions can span several lines, but only the last one must be terminated by a semicolon.
  • 19. Job Description Language • The character “ ‘ ” cannot be used in the JDL. • Comments must be preceded by a sharp character (#) or a double slash (//) at the beginning if each line. • Multi-line comments must be enclosed between “/ *” and “*/” . Attention! The JDL is sensitive to blank characters and tabs. No blank characters or tabs should follow the semicolon at the end of a line.
  • 20. Simple JDL example
      Executable = "/bin/hostname";
      StdOutput = "std.out";
      StdError = "std.err";
    The Executable attribute specifies the command to be run by the job. If the command is already present on the WN, it must be expressed as an absolute path; if it has to be copied from the UI, only the file name must be specified, and the path of the command on the UI should be given in the InputSandbox attribute.
      Executable = "test.sh";
      InputSandbox = {"/home/larocca/test.sh"};
      StdOutput = "std.out";
      StdError = "std.err";
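The attribute = expression; format lends itself to being generated from a script. The following is a minimal Python sketch, not part of gLite: the helper name render_jdl is hypothetical, and it only handles quoted strings and lists of quoted strings, which is all the simple example above needs.

```python
def render_jdl(attributes):
    """Render a dict as JDL text, one 'attribute = expression;' per line."""
    lines = []
    for name, value in attributes.items():
        if isinstance(value, (list, tuple)):
            # JDL lists are written between braces, items separated by commas.
            items = ", ".join('"%s"' % item for item in value)
            expression = "{" + items + "}"
        else:
            expression = '"%s"' % value
        # No blank or tab may follow the closing semicolon.
        lines.append("%s = %s;" % (name, expression))
    return "\n".join(lines)


job = {
    "Executable": "test.sh",
    "InputSandbox": ["/home/larocca/test.sh"],
    "StdOutput": "std.out",
    "StdError": "std.err",
}
print(render_jdl(job))
```

The printed text can be saved as a .jdl file and passed to glite-wms-job-submit on a User Interface.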
  • 21. • The Arguments attribute can contain a string value, which is taken as the argument list for the executable: Arguments = "fileA 10"; • In the Executable and in the Arguments attributes it may be necessary to use special characters, such as &, \, |, >, <. These characters should be preceded by a triple \ in the JDL, or specified inside quoted strings, e.g.: Arguments = "-f file1\\\&file2"; • The shell environment of the job can be modified using the Environment attribute. Environment = {"CMS_PATH=$HOME/cms"};
  • 22. • If files have to be copied from the UI to the execution node, they must be listed in the InputSandbox attribute: InputSandbox = {"test.sh", ... ,"fileN"}; • The files to be transferred back to the UI after the job is finished can be specified using the OutputSandbox attribute: OutputSandbox = {"std.out","std.err"}; • Wildcards are allowed only in the InputSandbox attribute. • Absolute paths cannot be specified in the OutputSandbox attribute. • The InputSandbox cannot contain two files with the same name, even if they have a different absolute path, as when transferred they would overwrite each other.
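The rule that two InputSandbox files may not share a name can be checked before submission. This is a small Python sketch under the assumption that sandbox entries are plain paths; the function name duplicate_sandbox_names and the second path in the sample list are made up for illustration.

```python
import posixpath
from collections import Counter


def duplicate_sandbox_names(input_sandbox):
    """Return the file names that occur more than once in an InputSandbox."""
    # Only the last path component survives the transfer to the WN,
    # so two entries with the same base name would overwrite each other.
    names = Counter(posixpath.basename(path) for path in input_sandbox)
    return sorted(name for name, count in names.items() if count > 1)


sandbox = ["/home/larocca/test.sh", "/tmp/other/test.sh", "fileN"]
print(duplicate_sandbox_names(sandbox))
```

An empty result means the list is safe with respect to this particular rule; the sketch does not check wildcards or URIs.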
  • 23. • The Requirements attribute can be used to express constraints on the resources where the job should run. – Its value is a Boolean expression that must evaluate to true for a job to run on that specific CE. • Note: Only one Requirements attribute can be specified (if there is more than one, only the last one is considered). If several conditions must be applied to the job, they must all be combined in a single Requirements attribute. • For example, suppose that the user wants to run on a CE that uses PBS as its batch system and whose WNs have at least two CPUs. The user would then write in the job description file: Requirements = other.GlueCEInfoLRMSType == "PBS" && other.GlueCEInfoTotalCPUs > 1;
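Because only the last Requirements attribute is considered, conditions collected in a script have to be joined into one expression. The Python sketch below shows one way to do that; combine_requirements is a hypothetical helper, and wrapping each condition in parentheses is a choice made here to keep operator precedence unambiguous.

```python
def combine_requirements(conditions):
    """Join several JDL conditions into a single Requirements attribute."""
    if not conditions:
        raise ValueError("at least one condition is required")
    # Every condition must hold, so the conditions are joined with &&.
    joined = " && ".join("(%s)" % condition for condition in conditions)
    return "Requirements = %s;" % joined


line = combine_requirements([
    'other.GlueCEInfoLRMSType == "PBS"',
    "other.GlueCEInfoTotalCPUs > 1",
])
print(line)
```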
  • 24. • The WMS can also be asked to send a job to a particular queue in a CE with the following expression: Requirements = other.GlueCEUniqueID == "lxshare0286.cern.ch:2119/jobmanager-pbs-short"; • It is also possible to use regular expressions when expressing a requirement. – Suppose, for example, that the user wants all jobs to run on any CE in the domain cern.ch. This can be achieved by putting the following expression in the JDL file: Requirements = RegExp("cern.ch",other.GlueCEUniqueID); – The opposite can be required by using: Requirements = (!RegExp("cern.ch", other.GlueCEUniqueID));
  • 25. • If the job must run on a CE where a particular experiment software is installed and this information is published by the CE, something like the following must be written: Requirements = Member("BLAST-1.0.3", other.GlueHostApplicationSoftwareRunTimeEnvironment); Note: The Member operator is used to test if its first argument (a scalar value) is a member of its second argument (a list). In fact, the GlueHostApplicationSoftwareRunTimeEnvironment attribute is a list of strings and is used to publish any VO-specific information relative to the CE (typically, information on the VO software available on that CE).
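To make the matchmaking idea concrete, the Python sketch below evaluates a requirement against a list of CE descriptions. Everything in it is illustrative: the CE records and host names are invented, member and reg_exp are Python stand-ins for the JDL Member and RegExp operators, and Python's re module is only assumed to behave like the ClassAd regular expressions for this simple pattern.

```python
import re


def member(value, values):
    """Stand-in for Member: true if value is an element of the list."""
    return value in values


def reg_exp(pattern, text):
    """Stand-in for RegExp: true if the pattern occurs anywhere in text."""
    return re.search(pattern, text) is not None


# Hypothetical CE records, keyed by the Glue attribute names used above.
ces = [
    {"GlueCEUniqueID": "ce01.example.org:2119/jobmanager-pbs-short",
     "GlueHostApplicationSoftwareRunTimeEnvironment": ["BLAST-1.0.3"]},
    {"GlueCEUniqueID": "ce02.example.org:2119/jobmanager-pbs-long",
     "GlueHostApplicationSoftwareRunTimeEnvironment": ["POVRAY-3,5"]},
]


def requirements(other):
    """Boolean expression that must be true for a CE to be eligible."""
    software = other["GlueHostApplicationSoftwareRunTimeEnvironment"]
    return (member("BLAST-1.0.3", software)
            and reg_exp(r"example\.org", other["GlueCEUniqueID"]))


matching = [ce["GlueCEUniqueID"] for ce in ces if requirements(ce)]
print(matching)
```

Note that the dot is escaped in the Python pattern; an unescaped dot in a regular expression matches any character, not only a literal dot.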
  • 26. Advanced job types • Job Collection: a set of independent jobs that the user can submit and monitor as if it were a single job
      [
        Type = "Collection";
        nodes = {
          [
            Executable = "/bin/hostname";
            Arguments = "-f";
            StdOutput = "hostname.out";
            StdError = "hostname.err";
            OutputSandbox = {"hostname.err","hostname.out"};
          ],
          [
            Executable = "/bin/sh";
            Arguments = "start_povray_valve.sh";
            StdOutput = "povray.out";
            StdError = "povray.err";
            InputSandbox = {"start_povray_valve.sh"};
            OutputSandbox = {"povray.err","povray.out"};
            Requirements = Member("POVRAY-3,5", other.GlueHostApplicationSoftwareRunTimeEnvironment);
          ]
        };
      ]
  • 27. Advanced job types • Parametric Job: a job collection where the jobs are identical except for the value of a running parameter
      JobType = "Parametric";
      Executable = "/bin/echo";
      Arguments = "_PARAM_";
      StdOutput = "myoutput_PARAM_.txt";
      StdError = "myerror_PARAM_.txt";
      Parameters = 3;
      ParameterStep = 1;
      ParameterStart = 1;
      OutputSandbox = {"myoutput_PARAM_.txt"};
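The effect of the _PARAM_ placeholder can be illustrated by expanding the template by hand. The Python sketch below is not the WMS implementation. It assumes that the running parameter goes from ParameterStart up to, but not including, Parameters in steps of ParameterStep; that reading should be checked against the JDL Attributes Specification before relying on it.

```python
def expand_parametric(template, parameters, start=0, step=1):
    """Return one attribute dict per value of the running parameter."""
    jobs = []
    # Assumption: the upper bound 'parameters' is exclusive.
    for value in range(start, parameters, step):
        job = {}
        for name, text in template.items():
            if isinstance(text, list):
                job[name] = [item.replace("_PARAM_", str(value)) for item in text]
            else:
                job[name] = text.replace("_PARAM_", str(value))
        jobs.append(job)
    return jobs


template = {
    "Executable": "/bin/echo",
    "Arguments": "_PARAM_",
    "StdOutput": "myoutput_PARAM_.txt",
    "StdError": "myerror_PARAM_.txt",
    "OutputSandbox": ["myoutput_PARAM_.txt"],
}
for job in expand_parametric(template, parameters=3, start=1, step=1):
    print(job["Arguments"], job["StdOutput"])
```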
  • 28. Advanced job types • DAG: a set of jobs where the input, output, or execution of one or more jobs depends on one or more other ones • The jobs are nodes (vertices) in the graph • The edges (arcs) identify the dependencies (Diagram: nodeA at the top; nodeB, nodeC and nodeF below it; nodeD at the bottom.)
      Type = "dag";
      max_nodes_running = 5;
      InputSandbox = {"/tmp/foo/*.exe", "/home/larocca/bar", "gsiftp://neo.datamat.it:5678/tmp/cms_sim.exe", "file:///tmp/myconf"};
      InputSandboxBaseURI = "gsiftp://matrix.datamat.it:5432/tmp";
      nodes = [
        nodeA = [ description = [ JobType = "Normal"; Executable = "a.exe"; InputSandbox = { "/home/larocca/myfile.txt", root.InputSandbox }; ]; ];
        nodeF = [ description = [ JobType = "Normal"; Executable = "b.exe"; Arguments = "1 2 3"; OutputSandbox = { "myoutput.txt", "myerror.txt" }; ]; ];
        nodeD = [ description = [ JobType = "Checkpointable"; Executable = "b.exe"; Arguments = "1 2 3"; InputSandbox = { "file:///home/larocca/data.txt", root.nodes.nodeF.description.OutputSandbox[0] }; ]; ];
        nodeC = [ file = "/home/larocca/nodec.jdl"; ];
        nodeB = [ file = "foo.jdl"; ];
      ];
      dependencies = { { nodeA, nodeB }, { nodeA, nodeC }, { nodeA, nodeF }, { { nodeB, nodeC, nodeF }, nodeD } };
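The dependencies attribute defines which nodes must finish before others can start. A valid execution order can be derived with a topological sort, sketched here in Python with Kahn's algorithm. This is only an illustration of the ordering constraint, not how the WMS schedules DAG nodes; the function name execution_order is hypothetical, and the nested dependency { { nodeB, nodeC, nodeF }, nodeD } has been flattened by hand into three (parent, child) pairs.

```python
from collections import deque


def execution_order(nodes, dependencies):
    """Kahn's algorithm; dependencies is a list of (parent, child) pairs."""
    children = {node: [] for node in nodes}
    pending = {node: 0 for node in nodes}  # number of unfinished parents
    for parent, child in dependencies:
        children[parent].append(child)
        pending[child] += 1
    ready = deque(node for node in nodes if pending[node] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in children[node]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)
    if len(order) != len(nodes):
        raise ValueError("dependency cycle detected")
    return order


nodes = ["nodeA", "nodeB", "nodeC", "nodeD", "nodeF"]
dependencies = [
    ("nodeA", "nodeB"), ("nodeA", "nodeC"), ("nodeA", "nodeF"),
    ("nodeB", "nodeD"), ("nodeC", "nodeD"), ("nodeF", "nodeD"),
]
order = execution_order(nodes, dependencies)
print(order)
```

nodeB, nodeC and nodeF have no dependencies among themselves, so they may run in parallel once nodeA has finished, subject to max_nodes_running.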
  • 29. References
      WMProxy User’s guide: https://edms.cern.ch/file/674643/1/EGEE-JRA1-TEC-674643-WMPROXY-guide
      JDL Attributes Specification: https://edms.cern.ch/file/555796/1/EGEE-JRA1-TEC-555796-JDL-Attributes-v0-8.pd and https://edms.cern.ch/file/590869/1/EGEE-JRA1-TEC-590869-JDL-Attributes-v0-9.pd
      gLite 3.1 user’s guide: https://edms.cern.ch/file/722398/1.2/gLite-3-UserGuide.pdf
      Complex jobs: https://grid.ct.infn.it/twiki/bin/view/GILDA/WmProxyUse
      WMProxy API usage: http://www.euasiagrid.org/wiki/index.php/WMProxy_Java_API and https://grid.ct.infn.it/twiki/bin/view/GILDA/WMProxyCPPAPI
  • 30. The gLite stack: overview
  • 31. Storage Elements • The Storage Element is the service which allows a user or an application to store data for later retrieval. • The DMS provides services to locate, access and transfer files – The user does not need to know the physical location of a file, just its logical file name; – Files can be replicated or transferred to several locations (SEs) as needed; – Files are shared with all the members of the given VO. • Files stored in an SE are write-once, read-many – Files cannot be changed unless they are removed or replaced.
  • 32. Protocols – The GSIFTP protocol offers the functionalities of FTP, but with support for GSI. It is responsible for secure, fast and efficient file transfers to/from Storage Elements. – RFIO was developed to access tape archiving systems, such as CASTOR (CERN Advanced STORage manager) and it comes in a secure and an insecure version. – The gsidcap protocol is the GSI enabled version of the dCache native access protocol, dcap.
  • 33. Types of Storage Elements /1 • In WLCG/EGEE, different types of Storage Elements are available: • CASTOR. It consists of a disk buffer front end to a tape mass storage system. A virtual file system (namespace) shields the user from the complexities of the underlying disk and tape setup. File migration between disk and tape is managed by a process called “stager”. The native storage protocol, the insecure RFIO, allows access to files in the SE. Since the protocol is not GSI-enabled, RFIO access is allowed only from a location in the same LAN as the SE. With the proper modifications, the CASTOR disk buffer can also be used as a disk-only storage system.
  • 34. Types of Storage Elements /2 • StoRM. It has been designed to support space reservation and direct access (native POSIX I/O calls), as well as other standard libraries (like RFIO). • StoRM takes advantage of high-performance parallel file systems like GPFS (from IBM). – In addition, standard POSIX file systems are supported (XFS from SGI and ext3). • StoRM takes advantage of the ACL support provided by the underlying file systems to implement the security models.
  • 35. Types of Storage Elements /3 • dCache. It consists of a server and one or more pool nodes. The server represents the single point of access to the SE and presents files in the pool disks under a single virtual file system tree. Nodes can be dynamically added to the pool. The native gsidcap protocol allows POSIX-like data access. dCache is widely employed as disk buffer frontend to many mass storage systems, like HPSS and Enstore, as well as a disk-only storage system. • LCG Disk pool manager. It’s a lightweight disk pool manager, suitable for relatively small sites (max 10 TB of total space). Disks can be added dynamically to the pool at any time. Like in dCache and CASTOR, a virtual file system hides the complexity of the disk pool architecture. The secure RFIO protocol allows file access from the WAN.
  • 36. The Storage Resource Manager SRM
  • 37. The Storage Resource Manager The Storage Resource Manager (SRM) has been designed to be the single interface for the management of disk and tape storage resources. Any type of Storage Element in WLCG/EGEE offers an SRM interface except for the Classic SE, which is being phased out. SRM hides the complexity of the resource setup behind it and allows the user to request files, keep them on a disk buffer for a specified lifetime, reserve space for new entries, and so on. – In gLite, interactions with the SRM are hidden by high-level services (DM tools and APIs)
  • 38. The gLite Storage Element
  • 39. Grid file referencing schemes (LFN, GUID, SURL, TURL) • Logical File Name (LFN) – lfn:/grid/gilda/tutorials/input-file • Grid Unique IDentifier (GUID) – guid:4d57edef-fa5c-4512-a345-1c838916b357 • Storage URL (for a specific replica, on a specific Storage Element) – srm://aliserv6.ct.infn.it/gilda/generated/2007-11-13/fileb366f371-b2c0-485d-b12c-c114edaf4db4 – sfn://se01.athena.hellasgrid.gr/data/dteam/doe/file1 • Transport URL (for a specific replica, on an SE, with a specific protocol) – gsiftp://aliserv6.ct.infn.it/gilda/generated/2007-11-13/fileb366f371-b2c0-485d-b12c-c114edaf4db4
  • 40. (Diagram: in the LCG File Catalog, the LFN and its symlinks point to a GUID; the Replica Catalog maps the GUID to one or more SURLs; the SRM interface turns a SURL into a TURL for one of various protocols: gsiftp, gsidcap, rfio.)
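The four referencing schemes can be told apart by their URL scheme alone. The Python sketch below encodes that mapping; the table and the function name classify_reference are illustrative, and treating gsidcap and rfio as TURL schemes is an assumption based on the protocols named elsewhere in these slides.

```python
# Scheme prefix -> kind of Grid file reference.
SCHEMES = {
    "lfn": "LFN",
    "guid": "GUID",
    "srm": "SURL",
    "sfn": "SURL",
    "gsiftp": "TURL",
    "gsidcap": "TURL",  # assumption: access protocols yield TURLs
    "rfio": "TURL",     # assumption: access protocols yield TURLs
}


def classify_reference(reference):
    """Return 'LFN', 'GUID', 'SURL' or 'TURL' for a Grid file reference."""
    scheme, separator, rest = reference.partition(":")
    if not separator or not rest:
        raise ValueError("not a scheme-prefixed reference: %r" % reference)
    try:
        return SCHEMES[scheme.lower()]
    except KeyError:
        raise ValueError("unknown scheme: %r" % scheme)


for ref in ("lfn:/grid/gilda/tutorials/input-file",
            "guid:4d57edef-fa5c-4512-a345-1c838916b357",
            "sfn://se01.athena.hellasgrid.gr/data/dteam/doe/file1"):
    print(classify_reference(ref), ref)
```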
  • 41. Needles in a haystack • How do I keep track of all the files I have on the Grid? • How does the Grid keep track of the mapping between LFN(s), GUID and SURL(s)? LFC File Catalogue: LFC = LCG File Catalogue; LCG = LHC Computing Grid; LHC = Large Hadron Collider • The LCG File Catalogue is the service which maintains mappings between LFN(s), GUID and SURL(s).
  • 42. LFC File Catalogue • It consists of a unique catalogue, where the LFN is the main key. Further LFNs can be added as symlinks to the main LFN. – Looks like a “top-level” directory in the Grid – For each supported VO, a separate subdirectory exists under the “/grid” directory – All the members of the VO have read/write permissions – System metadata are supported, while for user metadata only a single string entry is available • The catalogue publishes its endpoint in the Information Service so that it can be discovered by Data Management tools and other services (the WMS for example).
  • 43. Architecture of the LFC Catalogue • The LFN acts as the main key in the database. It has: – Symbolic links to it (additional LFNs) – System metadata – Information on replicas – One field of user metadata – Access Control Lists – Integration with VOMS (VirtualID and VirtualGID) – A C language API
  • 44. Before starting • The user can interact with the file catalogue through CLIs and APIs. – The environment variable LFC_HOST (e.g.: LFC_HOST=lfc-gilda.ct.infn.it) must contain the host name of the LFC server to be used. • The directory structure of the LFC namespace has the form: /grid/<VO>/<subpaths> – Users of a given VO will have read and write permissions only under the corresponding <VO> subdirectory.
  • 45. LFC Commands
      lfc-chmod        Change access mode of the LFC file/directory
      lfc-chown        Change owner and group of the LFC file/directory
      lfc-delcomment   Delete the comment associated with the file/directory
      lfc-getacl       Get file/directory access control lists
      lfc-ln           Make a symbolic link to a file/directory
      lfc-ls           List file/directory entries in a directory
      lfc-mkdir        Create a directory
      lfc-rename       Rename a file/directory
      lfc-rm           Remove a file/directory
      lfc-setacl       Set file/directory access control lists
      lfc-setcomment   Add/replace a comment
  • 46. lfc-ls • Listing the entries of an LFC directory
      lfc-ls [-cdiLlRTu] [--class] [--comment] [--deleted] [--display_side] [--ds] path…
    – where path specifies the LFN pathname (mandatory) – Remember that LFC has a directory tree structure: /grid/<VO_name>/<you create it>, where /grid/<VO_name> is the LFC namespace and the rest is defined by the user – All members of a VO have read-write permissions under their directory – You can set LFC_HOME to use relative paths
      lfc-ls /grid/gilda/tutorials/taipei02
      export LFC_HOME=/grid/gilda/tutorials
      lfc-ls -l taipei02
      lfc-ls -l -R /grid
  • 47. lfc-mkdir • Creating directories in the LFC – lfc-mkdir [-m mode] [-p] path... • Where path specifies the LFC pathname • Remember that while registering a new file (using lcg-cr, for example) the corresponding destination directory must be created in the catalog beforehand. • Example (the directory name is chosen by the user): lfc-mkdir /grid/gilda/<YOUR_DIRECTORY>
  • 48. lfc-ln • Creating a symbolic link – lfc-ln -s file linkname – lfc-ln -s directory linkname – Create a link to the specified file or directory with linkname. Example (first argument: original file; second argument: symbolic link): – lfc-ln -s /grid/gilda/test /grid/gilda/aLink. Let’s check the link using lfc-ls with long listing:
      lfc-ls -l aLink
      lrwxrwxrwx 1 19122 1077 0 Jun 14 11:58 aLink -> /grid/gilda/test
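How LFC_HOME turns a relative path into a full LFN path can be mimicked in a few lines. The Python sketch below is an illustration of the rule only, not the LFC client code; resolve_lfc_path is a hypothetical name.

```python
import posixpath


def resolve_lfc_path(path, lfc_home=None):
    """Resolve an LFC path the way a relative path is read with LFC_HOME."""
    if path.startswith("/"):
        # Absolute paths are used as given.
        return posixpath.normpath(path)
    if not lfc_home:
        raise ValueError("relative path given but LFC_HOME is not set")
    return posixpath.normpath(posixpath.join(lfc_home, path))


print(resolve_lfc_path("taipei02", "/grid/gilda/tutorials"))
print(resolve_lfc_path("/grid/gilda/tutorials/taipei02"))
```

Both calls print the same path, which is why lfc-ls -l taipei02 works once LFC_HOME has been exported.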
  • 49. Access Control List (ACL) • LFC allows an access control list (ACL) to be attached to a file or directory: a list of permissions which specify who is allowed to access or modify it. The permissions are very much like those of a UNIX file system: read (r), write (w) and execute (x). • In LFC, users and groups are internally identified by numerical virtual uids and virtual gids, which are virtual in the sense that they exist only in the LFC namespace. – A user can be specified as a name, as a virtual uid or as a DN. – A group can be specified as a name, as a virtual gid or as a VOMS FQAN. • A directory in LFC also has a default ACL (the ACL given to any file or directory created under that directory). After creation, the ACLs can be freely changed. – When creating a sub-directory, its default ACL is inherited from the parent directory
  • 50. Print the ACL of a directory
      $ lfc-getacl /grid/gilda/tutorials/test-acl
      # file: /grid/gilda/tutorials/test-acl
      # owner: /C=IT/O=INFN/OU=Personal Certificate/L=Catania/CN=Giuseppe La Rocca/Email=giuseppe.larocca@ct.infn.it
      # group: gilda
      user::rwx
      group::rwx #effective:rwx
      other::r-x
      default:user::rwx
      default:group::rwx
      default:other::r-x
    In this example, the owner and all users in the gilda group have full privileges to the directory, while other users cannot write into it.
  • 51. Modify the ACL
      lfc-setacl [-d] [-m] [-s] acl_entries path
    The -m option modifies the existing ACL. The other options of lfc-setacl are -d, which removes ACL entries, and -s, which replaces the complete set of ACL entries. acl_entries is a comma-separated list of entries. Each entry has colon-separated fields: ACL type, id (uid or gid), permission. Only directories can have default ACL entries! The entries look like:
      user::perm         default:user::perm
      user:uid:perm      default:user:uid:perm
      group::perm        default:group::perm
      group:gid:perm     default:group:gid:perm
      mask:perm          default:mask:perm
      other:perm         default:other:perm
  • 52. Modify the ACL of a directory Let's change the default ACL, giving read/write permission to user and group, and no privileges to others. – The syntax we apply here is modify (-m) default (d:) for user (u:), and the same of course for group and others. $ lfc-setacl -m d:u::6,d:g::6,d:o:0 $LFC_HOME/test-acl/
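The digits 6 and 0 in the command above are UNIX-style octal permissions. The short Python sketch below converts such a digit into the rwx notation printed by lfc-getacl; perm_to_rwx is a hypothetical helper written for this illustration.

```python
def perm_to_rwx(digit):
    """Convert an octal permission digit (0-7) to its 'rwx' string."""
    if not 0 <= digit <= 7:
        raise ValueError("permission digit must be between 0 and 7")
    # read = 4, write = 2, execute = 1
    return "".join(flag if digit & bit else "-"
                   for flag, bit in (("r", 4), ("w", 2), ("x", 1)))


for digit in (6, 0, 7, 5):
    print(digit, perm_to_rwx(digit))
```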
  • 53. Adding metadata information The lfc-setcomment and lfc-delcomment commands allow the user to associate a comment with a catalogue entry and to delete such a comment. This is the only user-defined metadata that can be associated with catalogue entries. The comments for the files may be listed using the --comment option of the lfc-ls command. This is shown in the following example:
      $ lfc-setcomment /grid/gilda/file1 "My metadata"
      $ lfc-ls --comment /grid/gilda/file1
      /grid/gilda/file1 My metadata
  • 54. LCG Data Management Client Tools • The LCG Data Management tools allow users to copy files between the UI, a WN and an SE, to register entries in the file catalogue and to replicate files between SEs.
      lcg-cp    Copies a Grid file to a local destination
      lcg-cr    Copies a file to an SE and registers it in the catalogue
      lcg-del   Deletes one file (either one replica or all the replicas)
      lcg-rep   Copies a file from one SE to another SE and registers it in the catalogue
      lcg-gt    Gets the TURL for a given SURL and transfer protocol
      lcg-aa    Adds an alias in the catalogue for a given GUID
      lcg-ra    Removes an alias in the catalogue for a given GUID
      lcg-rf    Registers in the catalogue a file residing on an SE
      lcg-uf    Unregisters in the catalogue a file residing on an SE
      lcg-la    Lists the aliases for a given LFN, GUID or SURL
      lcg-lg    Gets the GUID for a given LFN or SURL
      lcg-lr    Lists the replicas for a given LFN, GUID or SURL
  • 55. Environment variables /1 • The --vo <vo name> option, which specifies the virtual organisation of the user, is present in all commands except lcg-gt. It is mandatory unless the variable LCG_GFAL_VO is set (e.g.: export LCG_GFAL_VO=gilda). Timeouts: the commands lcg-cr, lcg-del, lcg-gt, lcg-rf, lcg-sd and lcg-rep all implement timeouts. With the -t option, the user can specify the timeout in seconds. The default is 0 seconds, that is, no timeout. If an operation times out, all actions performed up to that moment are rolled back, so no broken files are left on an SE and no files are left on an SE without being registered in the catalogues.
  • 56. Environment variables /2 • For all lcg-* commands to work, the environment variable LCG_GFAL_INFOSYS must be set to point to a top BDII in the format <hostname>:<port>, so that the commands can retrieve the necessary information export LCG_GFAL_INFOSYS=gilda-bdii.ct.infn.it:2170 • The VO_<VO>_DEFAULT_SE variable specifies the default SE for the VO. export VO_GILDA_DEFAULT_SE=aliserv6.ct.infn.it
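A script that wraps the lcg-* commands can check these variables before calling them. The Python sketch below validates the <hostname>:<port> format of LCG_GFAL_INFOSYS and reports missing variables; both function names are hypothetical, and the check does not contact the BDII.

```python
def parse_infosys(value):
    """Split an LCG_GFAL_INFOSYS value into (hostname, port)."""
    host, separator, port = value.rpartition(":")
    if not separator or not host or not port.isdigit():
        raise ValueError("expected <hostname>:<port>, got %r" % value)
    return host, int(port)


def missing_variables(environment, vo):
    """List the variables described above that are not set for this VO."""
    wanted = ["LCG_GFAL_INFOSYS", "VO_%s_DEFAULT_SE" % vo.upper()]
    return [name for name in wanted if not environment.get(name)]


environment = {"LCG_GFAL_INFOSYS": "gilda-bdii.ct.infn.it:2170"}
print(parse_infosys(environment["LCG_GFAL_INFOSYS"]))
print(missing_variables(environment, "gilda"))
```

In a real script the dictionary would be os.environ rather than a hand-written sample.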
  • 57. Uploading a file to the Grid /1
      $ lcg-cr --vo gilda -d aliserv6.ct.infn.it file:/home/larocca/file1
      guid:6ac491ea-684c-11d8-8f12-9c97cebf582a
    where the only argument is the local file to be uploaded and the -d <destination> option indicates the SE used as the destination for the file. The command returns the file GUID. If no destination is given, the SE specified by the VO_<VO>_DEFAULT_SE environment variable is taken. The -P option allows the user to specify a relative path name for the file in the SE. If no -P option is given, the relative path is automatically generated.
  • 58. Uploading a file to the Grid /2 The following are examples of the different ways to specify a destination:
      -d aliserv6.ct.infn.it
      -d srm://aliserv6.ct.infn.it/data/gilda/my_file
      -d aliserv6.ct.infn.it -P my_dir/my_file
    The -l <lfn> option can be used to specify an LFN:
      $ lcg-cr --vo gilda -d aliserv6.ct.infn.it -l lfn:/grid/gilda/myalias1 file:/home/larocca/file1
      guid:db7ddbc5-613e-423f-9501-3c0c00a0ae24
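When uploads are driven from a script, the lcg-cr command line can be assembled from its parts. The Python sketch below builds the argument list using only the options described above (--vo, -d, -P, -l); the function lcg_cr_command is hypothetical and nothing is executed here.

```python
def lcg_cr_command(local_file, vo, destination=None, relative_path=None, lfn=None):
    """Build the argument list for an lcg-cr upload (not executed here)."""
    argv = ["lcg-cr", "--vo", vo]
    if destination:
        argv += ["-d", destination]      # SE host name or full SURL
    if relative_path:
        argv += ["-P", relative_path]    # relative path on the SE
    if lfn:
        argv += ["-l", lfn]              # logical file name to register
    argv.append("file:" + local_file)    # the only positional argument
    return argv


argv = lcg_cr_command("/home/larocca/file1", "gilda",
                      destination="aliserv6.ct.infn.it",
                      lfn="lfn:/grid/gilda/myalias1")
print(" ".join(argv))
```

On a User Interface with a valid proxy, the list could be handed to subprocess.run; when destination is omitted, the command falls back to the default SE of the VO as described above.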
  • 59. Replicating a file
      $ lcg-rep -v --vo gilda -d <SECOND_SE> guid:db7ddbc5-613e-423f-9501-3c0c00a0ae24
      Source URL: sfn://aliserv6.ct.infn.it/data/gilda/larocca/file1
      File size: 30
      Destination specified: <SECOND_SE>
      Source URL for copy: gsiftp://aliserv6.ct.infn.it/data/gilda/larocca/file1
      Destination URL for copy: gsiftp://<SECOND_SE>/data/gilda/generated/2004-07-09/file50c0752c-f61f-4bc3-b48e-af3f22924b57
      # streams: 1
      Transfer took 2040 ms
      Destination URL registered in LRC: srm://<SECOND_SE>/data/gilda/generated/2004-07-09/file50c0752c-f61f-4bc3-b48e-af3f22924b57
  • 60. Listing replicas
      $ lcg-lr --vo gilda lfn:/grid/gilda/tutorials/larocca/my_alias1
      srm://aliserv6.ct.infn.it/data/gilda/generated/2004-07-09/file79aee616-6cd7-4b75-8848-f091
      srm://<SECOND_SE>/data/gilda/generated/2004-07-08/file0dcabb46-2214-4db8-9ee8-2930
    Again, an LFN, the GUID or a SURL can be used to specify the file.
  • 61. Copying files out of the Grid
      $ lcg-cp --vo gilda -t 100 -v lfn:/grid/gilda/tutorials/mytext.txt file:/tmp/mytext.txt
      Source URL: lfn:/grid/gilda/mytext.txt
      File size: 104857600
      Source URL for copy: gsiftp://aliserv6.ct.infn.it:/storage/gilda/2007-07-06/input2.dat.10.0
      Destination URL: file:///tmp/myfile
      # streams: 1
      # set timeout to 100 (seconds)
      85983232 bytes 8396.77 KB/sec avg 9216.11
      Transfer took 12040 ms
  • 62. Deleting replicas /1 A file stored on an SE and registered in LFC can be deleted using the lcg-del command. • If a SURL is provided as argument, then that particular replica will be deleted. • If an LFN or GUID is given instead, then the -s <SE> option must be used to indicate which one of the replicas must be erased. $ lcg-del --vo gilda -s aliserv6.ct.infn.it guid:91b89dfe-ff95-4614-bad2-c538bfa28fac
  • 63. Deleting replicas /2 • If the -a option is used, all the replicas of the given file will be deleted and unregistered from the catalog. $ lcg-del --vo gilda -a guid:91b89dfe-ff95-4614-bad2-c538bfa28fac
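The rules for choosing between a SURL, -s <SE> and -a can be captured in a small helper that refuses ambiguous requests. The Python sketch below only builds the lcg-del argument list and does not delete anything; lcg_del_command is a hypothetical name, and recognising a SURL by its srm:// or sfn:// prefix is an assumption taken from the referencing schemes shown earlier.

```python
def lcg_del_command(target, vo, se=None, all_replicas=False):
    """Build the argument list for lcg-del following the rules above."""
    argv = ["lcg-del", "--vo", vo]
    is_surl = target.startswith(("srm://", "sfn://"))
    if all_replicas:
        argv.append("-a")            # delete and unregister every replica
    elif not is_surl:
        # An LFN or GUID may have several replicas: one SE must be named.
        if se is None:
            raise ValueError("an LFN or GUID needs -s <SE> or -a")
        argv += ["-s", se]
    argv.append(target)
    return argv


guid = "guid:91b89dfe-ff95-4614-bad2-c538bfa28fac"
print(" ".join(lcg_del_command(guid, "gilda", se="aliserv6.ct.infn.it")))
print(" ".join(lcg_del_command(guid, "gilda", all_replicas=True)))
```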
  • 64. Registering Grid files The lcg-rf command registers a file physically present in an SE, creating a GUID-SURL mapping in the catalogue. The -g <GUID> option specifies the GUID (otherwise one is created automatically).
      $ lcg-rf --vo gilda -l lfn:/grid/gilda/newfile srm://aliserv6.ct.infn.it/data/gilda/generated/2004-07-08/file0dcabb46-2214-4db8-9ee8-2930de1
      guid:baddb707-0cb5-4d9a-8141-a046659d243b
  • 65. Unregistering Grid files lcg-uf deletes a GUID-SURL mapping (respectively the first and second argument of the command) from the catalogue:
      $ lcg-uf --vo gilda guid:baddb707-0cb5-4d9a-8141-a046659d243b srm://aliserv6.ct.infn.it/data/gilda/generated/2004-07-08/file0dcabb46-2214-4db8-9ee8-2930de1
    If the last replica of a file is unregistered, the corresponding GUID-LFN mapping is also removed. Attention! lcg-uf just removes entries from the catalogue; it does not delete the file from the SE.
  • 66. Working with large datasets • The InputSandbox and OutputSandbox attributes are the basic way to move files between the User Interface (UI) and the Worker Node (WN). • However, there are other ways to move files to and from the WN, especially when large files (> 10 MB) are involved.
  • 67. (Diagram: data flow between the “User interface”, the WMS, the LCG File Catalogue (LFC), the Computing Element and the Storage Element. Arrow labels: Input “sandbox”, Output “sandbox”, DataSets info, Input “sandbox” + Broker Info.)
  • 68. References
      gLite 3 User Guide, Manual Series: https://edms.cern.ch/file/722398/1.3/gLite-3-UserGuide.pdf
      gLite Documentation homepage: http://glite.web.cern.ch/glite/documentation/default.as
      DM subsystem documentation: http://egee-jra1-dm.web.cern.ch/egee-jra1-dm/doc.htm
      LFC and DPM documentation: https://uimon.cern.ch/twiki/bin/view/LCG/DataManage
      DM API: http://www.euasiagrid.org/wiki/index.php/Data_Management_Java_API
  • 69. Running more realistic jobs with the GENIUS Grid portal: Porting “BLAST” & “MrBayes” applications to Grid Case study from CNR - ITB
  • 70. The GENIUS Grid Portal architecture • The GENIUS Grid portal (the license for version 4.2 is free for educational use) is built on top of the EnginFrame Java/XML framework; • It is a gateway to the European EGEE Project middleware (and is easily customizable for other middleware); • It exposes gLite-enabled applications through a web browser as well as through Web Services. (www.enginframe.com, www.nice-italy.com, www.infn.it)
  • 71. What is EnginFrame ? • It is a web-based technology able to expose Grid services running on Grid infrastructures • It allows organizations to provide application-oriented computing and data services to both users (via Web browsers) and applications (via SOAP/WSDL and/or RSS) • It’s a Grid gateway!! • It greatly simplifies the development of Web Portals exposing computing services that can run on a broad range of different computational Grid systems
  • 72. About MrBayes • MrBayes is a program for the Bayesian estimation of phylogeny. • Bayesian inference of phylogeny is based on the posterior probability distribution of trees, which is the probability of a tree conditioned on the observations. – To approximate the posterior probability distribution of trees MrBayes uses a simulation technique called Markov Chain Monte Carlo (or MCMC). • The program takes as input a character matrix in a NEXUS file format. • The output is several files with the parameters that were sampled by the MCMC algorithm. • The application is CPU demanding, especially if the MPI version of the software is used.
  • 74. The Users Tracking System (UTS) /1
  • 75. The Users Tracking System (UTS) /2
  • 76. About BLAST BLAST (Basic Local Alignment Search Tool) provides a method for rapid searching of nucleotide and protein databases. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches.
  • 77. Thank you for your attention!

Editor's notes

  1. Service-oriented grid middleware
  2. When a user submits a job to the Grid, its status changes according to the following state machine
  3. Slide inherited from EDG – European Data Grid