EGEE 3 Project
1. Enabling Grids for E-sciencE
EGEE-III
A presentation for EU officials
Status: May 2008
www.eu-egee.org
EGEE-III INFSO-RI-222667
2. EGEE
Flagship Grid infrastructure project co-funded by the European Commission
Main Objectives
• Operate a large-scale, production quality Grid infrastructure for e-Science
• Attract new resources and users from sciences as well as business
Activity shares (pie chart):
– Service & Networking support: 50%
– Application support: 20%
– Training: 8%
– Integration and testing, and Middleware: 14% together (wedges of 5% and 9%)
– Dissemination & International Cooperation: 6%
– Management: 2%
3. EGEE-III
• EGEE-III
– Third phase of the EGEE programme:
EGEE: April 2004 – March 2006
EGEE-II: April 2006 – April 2008
– Co-funded by the European Commission under call INFRA-2007-1.2.3
– 9010 person months/375 FTEs
– 2 year period: 1 May 2008 to 30 April 2010
– EC requested contribution: €32M, which represents less than 1/3 of total project costs
• Key objectives
– Expand/optimise existing EGEE infrastructure, include more resources and user
communities
– Prepare migration from a project-based model to a sustainable federated
infrastructure based on National Grid Initiatives
• Consortium
– Structured on a national basis (National Grid Initiatives/Joint Research Units)
– 42 beneficiaries (+ 100 JRU members)
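As a rough cross-check of the staffing and funding figures on this slide, the sketch below converts person months into average FTEs over the 24-month project and derives the lower bound on total cost implied by the EC share. The variable names are illustrative, and the "less than 1/3" statement only yields a bound, not the actual budget.

```python
# Figures quoted on the slide.
PERSON_MONTHS = 9010
PROJECT_MONTHS = 24          # 1 May 2008 to 30 April 2010
EC_CONTRIBUTION_MEUR = 32    # EC requested contribution, in million euro

# Average full-time equivalents: total effort spread over the project duration.
ftes = PERSON_MONTHS / PROJECT_MONTHS

# If 32 MEUR is less than 1/3 of total costs, total costs exceed 3 x 32 MEUR.
min_total_cost_meur = 3 * EC_CONTRIBUTION_MEUR

print(f"Average FTEs: {ftes:.0f}")
print(f"Total project cost: more than {min_total_cost_meur} MEUR")
```

The FTE result agrees with the 375 FTEs quoted on the slide once rounded.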
4. EGEE-III activities and leaders
NA1: Management of the project (Bob Jones, CERN)
NA2: Dissemination, Communication and Outreach (Catherine Gater, CERN)
NA3: User Training and support (Robin McConnell, UEDIN)
NA4: User Community Support and Expansion (Cal Loomis, CNRS)
NA5: International Cooperation & Policy (Panos Louridas, GRNET)
SA1: Grid Operations (Maite Barroso Lopez, CERN)
SA2: Networking Support (Xavier Jeannin, CNRS)
SA3: Integration, testing & certification (Oliver Keeble, CERN)
JRA1: Middleware engineering (Francesco Giacomini, INFN)
5. EGEE – What do we deliver?
• Infrastructure operation
– Sites distributed across many countries
– Large quantity of CPUs and storage
– Continuous monitoring of Grid services & automated site configuration/management
– Support multiple Virtual Organisations from diverse research disciplines
• Middleware
– Production quality middleware distributed under business-friendly open source licence
– Implements a service-oriented architecture that virtualises resources
– Adheres to recommendations on web service interoperability and is evolving towards emerging standards
• User Support - Managed process from first contact
through to production usage
– Training
– Expertise in Grid-enabling applications
– Online helpdesk
– Networking events (User Forum, Conferences etc.)
6. EGEE – Infrastructure
Application areas include:
Archeology
Astronomy
Astrophysics
Civil Protection
Comp. Chemistry
Earth Sciences
Finance
Fusion
Geophysics
High Energy Physics
Life Sciences
Multimedia
Material Sciences
…
Infrastructure figures:
>250 sites
48 countries
>50,000 CPUs
>20 PetaBytes
>10,000 users
>150 VOs
>150,000 jobs/day
7. Users and resources distribution
February 2008 figures
8. gLite Grid Middleware Services
Access: CLI, API
Security: Authorization, Auditing, Authentication
Information & Monitoring: Information & Monitoring, Application Monitoring
Data Management: Metadata Catalog, File & Replica Catalog, Storage Element, Data Movement
Workload Management: Accounting, Job Provenance, Package Manager, Site Proxy, Computing Element, Workload Management
Overview paper http://doc.cern.ch//archive/electronic/egee/tr/egee-tr-2006-001.pdf
9. Disciplines and user communities
Disciplines: Astrophysics and astroparticle physics; Biomedical and bioinformatics information; Computational chemistry; Earth sciences; Finance; Fusion; Geophysics; High Energy Physics; Infrastructure; Others (digital libraries, disaster recovery, computational sciences, etc.)
~9000 users listed in registered VOs
All user communities are required to contribute resources to the infrastructure
Registered VOs named on the slide: argo, inaf, pamela, astro.vo.eu-egee.org, planck, virgo, magic, auger, libi, bio, biomed, embrace, enmr.eu, trgrida, compchem, gaussian, trgridc, esr, egeode, egrid, fusion, calice, hone, ific, ildg, minos.vo.gridpp.ac.uk, pheno, supernemo.vo.eu-egee.org, vo.lal.in2p3.fr, vo.llr.in2p3.fr, vo.lpnhe.in2p3.fr, vo.sbg.in2p3.fr, hermes, vo.dapnia.cea.fr, alice, atlas, babar, belle, cdf, cms, dzero, ilc, lhcb, na48, zeus, ghep, desy, edteam, euindia, ops, pvier, rdteam, rgstest, swetest, vo.deploymenttest.cea.fr, vo.e-ca.es, vo.grif.fr, infngrid, eela, eumed, dteam, vo.plgrid.pl, balticgrid, dech, see, seegrid, gridpp, twgrid, trgrida/b/c/d/e, voce, aegis, apesci, astron, cesga, enea, grid-it, gridmosi.ici.ro, lights.infn.it, ncf, vo.agata.org, vo.ipno.in2p3.fr, vo.northgrid.ac.uk, webcom, geant4, imath.cesga.es, proactive, cosmo, crypto.swing-grid.ch, diligent, cyclops, geclipse, gridcc
http://cic.gridops.org/index.php?section=home&page=volist
10. Why users choose the EGEE Grid
• Share more than information: data, computing power, applications
• Efficient use of resources at many institutes
• Leverage other sources of funding
• Join local communities
Challenges:
• share data between thousands of scientists with multiple
interests
• link major and minor computer centres
• ensure all data accessible anywhere, anytime
• grow rapidly, yet remain reliable for more than a decade
• cope with different management policies of different centres
• ensure data security
• continuous, production service
11. Why do particle physicists need the Grid? 1/2
CERN Large Hadron Collider
The world’s most powerful particle accelerator
4 Large Experiments
ATLAS
12. Why do particle physicists need the Grid? 2/2
Example from LHC: starting from this event
We are looking for this “signature”
Selectivity: 1 in 10^13
Like looking for 1 person in a thousand world populations; or for a needle in 20 million haystacks!
One year’s data from LHC would fill a stack of CDs 20 km high
Height references on the picture: Concorde (15 km), Mt. Blanc (4.8 km)
• ~100,000,000 electronic channels
• 0.0002 Higgs per second
• 15 PBytes of data a year
• (10 Million GBytes = 14 Million CDs)
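The comparisons on this slide can be sanity-checked with a few lines of arithmetic. The sketch below is only an order-of-magnitude check: the world population value is an assumption that does not appear on the slide, and the variable names are illustrative.

```python
# Figures quoted on the slide.
SELECTIVITY = 1e13            # looking for 1 event in 10^13
DATA_GBYTES = 10e6            # "10 Million GBytes"
CDS = 14e6                    # "14 Million CDs"

# Assumption (not on the slide): world population of roughly 6.7 billion in 2008.
WORLD_POPULATION = 6.7e9

# How many world populations add up to 10^13 people?
populations = SELECTIVITY / WORLD_POPULATION

# Capacity per CD implied by the slide's GBytes-to-CDs conversion.
gb_per_cd = DATA_GBYTES / CDS

print(f"1 in 10^13 is about 1 person in {populations:.0f} world populations")
print(f"Implied CD capacity: {gb_per_cd * 1000:.0f} MB")
```

Under that assumption the result lands in the low thousands of world populations, the same order of magnitude as the slide's "thousand world populations", and the implied CD capacity is close to a standard 700 MB disc.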
13. A question of scale
14. Recent Grid activity
In 2007, the Worldwide LHC Computing Grid ran ~44 M jobs on different infrastructures (EGEE, NDGF, OSG), with the large majority of them served by EGEE.
Workload has continued to increase: 29M jobs in the 1st quarter of 2008, now at more than 300k jobs/day.
Chart markers: 230k/day, 300k/day
Distribution of work across Tier0/Tier1/Tier2 really illustrates the importance of the Grid system.
Tier 2 contribution is around 50%; > 85% is external to CERN.
These workloads (reported across all WLCG centres) are at the level anticipated for 2008 data taking.
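To relate the yearly and quarterly job totals to the per-day rates quoted on this slide, the totals can be divided by the number of calendar days in each period. This is a minimal sketch with illustrative variable names; it gives period averages, not the daily peaks shown on the chart.

```python
from datetime import date

# Figures quoted on the slide.
JOBS_2007 = 44e6        # ~44 M jobs in 2007
JOBS_Q1_2008 = 29e6     # 29 M jobs in the first quarter of 2008

# Calendar lengths of the two reporting periods.
days_2007 = (date(2008, 1, 1) - date(2007, 1, 1)).days
days_q1_2008 = (date(2008, 4, 1) - date(2008, 1, 1)).days   # 2008 is a leap year

# Average daily job rates over each period.
rate_2007 = JOBS_2007 / days_2007
rate_q1_2008 = JOBS_Q1_2008 / days_q1_2008

print(f"2007 average: {rate_2007 / 1e3:.0f}k jobs/day")
print(f"Q1 2008 average: {rate_q1_2008 / 1e3:.0f}k jobs/day")
print(f"Growth factor: {rate_q1_2008 / rate_2007:.1f}x")
```

The first-quarter average comes out above 300k jobs/day, in line with the rate stated on the slide.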
15. In silico drug discovery
• Diseases such as HIV/AIDS, SARS, Bird Flu etc. are a threat to public health due to worldwide exchanges and circulation of persons
• Grids open new perspectives to in silico drug discovery
– Reduced cost, adding an accelerating factor in the search for new drugs
International collaboration is required for:
• Early detection
• Epidemiological watch
• Prevention
• Search for new drugs
• Search for vaccines
Image caption: Avian influenza: bird casualties
16. WISDOM
http://wisdom.healthgrid.org/
17. Computational Chemistry
• Researchers from more than 30 universities across
Europe use EGEE for their work
• Chemical software ported includes commercial packages (Gaussian03, Turbomole, Wien2k) and several freely available packages (GAMESS, DL_POLY, CPMD, DALTON, Columbus etc.)
• Virtual Organisations:
– CompChem (http://compchem.unipg.it)
– Gaussian (http://egee.grid.cyfronet.pl/gaussian)
– Turbomole (http://egee.grid.cyfronet.pl/turbomole-vo)
• ~ 3 million jobs executed during year 2007
• 90+ users actively using EGEE infrastructure
18. Computational chemistry example
• Cytochrome c Oxidase (CcO) consists of approximately 10,000 atoms, and the dynamics calculations are unfeasible on ordinary clusters (2.4 years needed for a simulation of 5.2 ns).
• Grid computations
– Three structures studied
– Total time - 93 days
– Nearly 6000 jobs
– 3043 days of CPU time
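The Grid computation figures imply a degree of parallelism that can be worked out directly. The sketch below derives the average number of CPUs in use and the average CPU time per job; the variable names are illustrative, and "nearly 6000 jobs" is taken as 6000 for the purpose of the estimate.

```python
# Figures quoted on the slide.
CPU_DAYS = 3043     # total CPU time consumed on the Grid
WALL_DAYS = 93      # total elapsed time of the study
JOBS = 6000         # "nearly 6000 jobs", rounded up for this estimate

# Average number of CPUs busy at any moment during the 93 days.
avg_concurrency = CPU_DAYS / WALL_DAYS

# Average CPU time consumed by a single job, in hours.
cpu_hours_per_job = CPU_DAYS * 24 / JOBS

print(f"Average CPUs in use: {avg_concurrency:.1f}")
print(f"Average CPU hours per job: {cpu_hours_per_job:.1f}")
```

On these numbers the study kept a few dozen CPUs busy on average, with each job running for roughly half a day of CPU time.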
19. Grid added value
• Grid can help satisfy computational chemistry
demands:
– both CPU power and intermediate data storage for future
restarts
– easy management for large numbers of jobs (e.g. GANGA)
– automation of common tasks during job execution via workflows
– possibility of direct cooperation between computational chemistry and other scientific disciplines: some ligand properties such as geometry, charges etc. can be stored on the Grid, and these data can be accessed by others to study, for example, the interaction between ligand and protein
– possibility to execute many parallel jobs at the same time
– for some commercial software packages, the Grid is the only way to allow users access to these programs
20. Expanding Geosciences-On-Demand (EGEODE) services to SMEs
• Modern seismic data processing and geophysical simulations require greater amounts of computing power, data storage and sophisticated software
• Difficult for oil & gas small & medium size enterprises (SMEs) to exploit innovative algorithms
• SME Market: small O&G structures
– 1035 O&G companies in EU
– 93% are SMEs; 63% < 10 employees
– Research labs
– Very small projects of large firms
Figure labels: CGGVeritas market; SMEs market; High Tech.; Conventional
21. EGEE workload in 2007
Data:
25 PB stored
11 PB transferred
CPU: 114 Million hours
Chart legend: CPU, Xfer, Storage
Estimated cost if performed with Amazon’s EC2 and S3: $58,690,000 = €37M
http://gridview.cern.ch/GRIDVIEW/same_index.php http://calculator.s3.amazonaws.com/calc5.html? 17/05/08
Paper on Clouds and Grids, May 2008: https://edms.cern.ch/file/925013/4/EGEE-Grid-Cloud-v1_2.pdf
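Two quantities can be derived from the cost estimate on this slide: the exchange rate implied by the dollar-to-euro conversion, and an upper bound on the price per CPU hour. The sketch below is a rough illustration with made-up variable names; the per-hour figure attributes the whole estimate to CPU, although part of it covers storage and transfer, so it overstates the real CPU price.

```python
# Figures quoted on the slide for the 2007 EGEE workload.
COST_USD = 58_690_000     # estimated Amazon EC2 + S3 cost
COST_EUR = 37_000_000     # the same estimate in euro, as rounded on the slide
CPU_HOURS = 114e6         # CPU time delivered in 2007

# Exchange rate implied by the slide's dollar-to-euro conversion.
usd_per_eur = COST_USD / COST_EUR

# Upper bound on the price per CPU hour (whole estimate attributed to CPU).
max_usd_per_cpu_hour = COST_USD / CPU_HOURS

print(f"Implied rate: {usd_per_eur:.2f} USD per EUR")
print(f"At most {max_usd_per_cpu_hour:.2f} USD per CPU hour")
```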
22. gLite Business Use Cases
• Adopted gLite on own infrastructure
– BEinGRID: Earth Sciences; Finance
– EU-IndiaGrid: Financial Stock Analysis application using gLite
– Health-e-Child: Biomedical information platform for Paediatrics
– Imense Ltd: gLite-based Grid computing for large scale image indexing and retrieval
– Philips Research: Using gLite for medical imaging, bio-informatics and simulation
• Proof of Concept
– GridVideo: gLite-based multimedia application
– TOTAL, UK: Application to assess the usefulness of External Grids using GILDA testbed
• Application and Development
– CERN Openlab: CERN and industrial partners to develop data-intensive Grid solutions
– WISDOM: Using EGEE infrastructure for drug discovery
23. Business and EGEE-III
• Technology Transfer and potential commercial exploitation
– Further develop the Business Forum as a means of dialog with business
actors
– More attention to SMEs, start-ups (innovative applications and portals) and collaborative projects (partner grids)
– Develop a network of companies to prepare the future commercial exploitation of EGEE technology: EGEE Business Associates; ISVs; Software integrators and IT Services providers
• Provide solutions to challenges for Business adoption
– MoUs signed with related projects and interested partners to develop
identified higher-level services and solutions (e.g. SLA; Windows porting, ...)
– Further develop EGEE technology to simplify the interaction between grids
and commercial cloud services
– Explain the advantages and limitations of grids & cloud computing to
businesses
25. Evolution
Diagram labels: National, European, Global (e-Infrastructure); Testbeds, Routine Usage, Utility Service
26. European Grid Initiative
• Need to prepare a permanent, common Grid infrastructure
• Ensure the long-term sustainability of the European e-Infrastructure
independent of short project funding cycles
• Coordinate the integration and interaction between National Grid
Infrastructures (NGIs)
• Operate the production Grid infrastructure on a European level for a
wide range of user communities
Must be no gap in the support of the production Grid
27. Summary
• EGEE operates the world’s largest multi-disciplinary Grid
infrastructure for scientific research
– In constant and significant production use
– Constantly growing in scale of resources and breadth of user communities
supported
• A third phase of EGEE has now started
– EGEE-III 2008-2010
• Need to prepare for the long term
– EGEE, collaborating projects, National Grid Initiatives and user
communities are working to define a model for a sustainable Grid
infrastructure that is independent of short project cycles
www.eu-egee.org