SlideShare une entreprise Scribd logo
1  sur  58
Télécharger pour lire hors ligne
Structural biology in the clouds:
A success story of 10 years
Alexandre Bonvin
Utrecht University
The Netherlands
a.m.j.j.bonvin@uu.nl
Solution NMR: 950, 900-cryo, 750, 600-cryo, 600US, 2x500 MHz
Solid-state NMR: 800WB-DNP, 400WB-DNP, 700US, 500WB MHz
e-infrastructure: >1900 CPU cores + EGI grid (>110’000 CPU cores)
2019?: 1.2 GHz
National and European infrastructure
The molecular machines of life
NMR and structural biology
A	SpronkNMR production
The protein-protein interaction Cosmos
Understanding molecular recognition at atomic details
Solving molecular puzzles
by computational docking
haddock.science.uu.nl
>10000 users worldwide
Used by major pharma companies
Haddock
web portal
• > 10500 registered users
• > 188000 served runs
since June 2008
• > 35% on the GRID
Visit bonvinlab.org/software
De Vries et al. Nature Prot. 2010
Van Zundert et al. J.Mol.Biol. 2016
HADDOCK’s user base
10 years of serving the research
community
Made possible via HTC resources
provided by H2020 e-Infrastructure
projects over the years
European Open Science Cloud
CC
Under EGI-Engage
The eInfrastructure landscape over the years
European Open Science Cloud
CC
Under EGI-Engage
The eInfrastructure landscape over the years
Virtual Research Community
# Number of dimensions 2
# INAME 1 1H
# INAME 2 1H
12 2.137 2.387 1 T 0.000e+00 0.00e+00 - 0 2756 2760 0
14 2.387 4.140 1 T 0.000e+00 0.00e+00 - 0 2760 2752 0
32 1.849 4.432 1 T 0.000e+00 0.00e+00 - 0 2259 2257 0
36 1.849 3.143 1 T 0.000e+00 0.00e+00 - 0 2259 2587 0
39 1.760 4.432 1 T 0.000e+00 0.00e+00 - 0 2260 2257 0
40 1.760 1.849 1 T 0.000e+00 0.00e+00 - 0 2260 2259 0
43 1.760 3.143 1 T 0.000e+00 0.00e+00 - 0 2260 2587 0
46 1.649 4.432 1 T 1.035e+05 0.00e+00 r 0 2583 2257 0
47 1.649 1.849 1 T 0.000e+00 0.00e+00 - 0 2583 2259 0
assign ( resid 501 and name OO )
( resid 501 and name Z )
( resid 501 and name X )
( resid 501 and name Y )
( resid 2 and name CA ) -0.1400 0.15000
assign ( resid 501 and name OO )
( resid 501 and name Z )
( resid 501 and name X )
( resid 501 and name Y )
( resid 3 and name CA ) -0.0100 0.15000
Data
interpretation
Structure, dynamics & interactions
è impact on research and health:
- origin of disease
- design of new experiments
- drug design…
Exploiting GRID resources in structural biology…
Computations
NMR data collection and processing SAXS data analysis
eScience hub for NMR and structural biology
Infrastructure
Science
Com
m
unity
Knowledge
The WeNMR VRC
www.wenmr.eu
WeNMR VRC (February 2018)
• enmr.eu: One of the largest (#users) VO in life sciences
• >830 users have registered so far(36% outside EU)
• Support from >40 sites for >200’000 CPU cores via EGI infrastructure
• User-friendly access to Grid via web portals
• Supported by an SLA (2016, updated in 2017) with EGI and NGIs
www.wenmr.eu
NMR
SAXSA worldwide
e-Infrastructure for NMR and
structural biology
Sustained growth of the WeNMR VRC
0
250
500
750
1000
1250
1500
1750
2000
2250
2500
2750
3000
3250
3500
2011-01
2011-04
2011-07
2011-10
2012-01
2012-04
2012-07
2012-10
2013-01
2013-04
2013-07
2013-10
2014-01
2014-04
2014-07
2014-10
2015-01
2015-04
2015-07
2015-10
2016-01
2016-04
2016-07
2016-10
2017-01
2017-04
2017-07
2017-10
2018-01
0
50000
100000
150000
200000
250000
300000
350000
400000
450000
500000
2009/1
2009/4
2009/7
2009/10
2010/1
2010/4
2010/7
2010/10
2011/1
2011/4
2011/7
2011/10
2012/1
2012/4
2012/7
2012/10
2013/1
2013/4
2013/7
2013/10
2014/1
2014/4
2014/7
2014/10
2015/1
2015/4
2015/7
2015/10
2016/1
2016/4
2016/7
2016/10
2017/1
2017/4
2017/7
2017/10
enmr.eu VO grid jobs
Operational since 10 years
End of
WeNMR
funding
Start of EGI-Engage
Start of West-Life
End of
eNMR
~2400 normalized CPU years over 2017
Challenges & e-Solutions
§ Attract users!
§ Offer them top of the line eScience solutions for
their research ... which means top of the line
software
The WeNMR VRC
Knowledge
Help Center
Tutorials, Wiki
Consultancy
Services
Portals
VRC
Third-party aggregation
Grid
Exposure
Marketplace
Blogs, news,
events..
User
SSO
Facebook
• 39 web portals (31 NMR, 7 SAXS)
• of which 29 by partners
• Uniform access through the new
Single Sign On functionality
• RPC access available for some
portals
The WeNMR services portfolio
Challenges & e-Solutions
§ Attract users!
§ Offer them top of the line eScience solutions for
their research ... which means top of the line
software
§ Provide them training, tutorials and support
The WeNMR VRC
Knowledge
Help Center
Tutorials, Wiki
Consultancy
Services
Portals
VRC
Third-party aggregation
Grid
Exposure
Marketplace
Blogs, news,
events..
User
SSO
Facebook
• Help center
• Consultancy remote or on
location
• Tutorials, wiki documents,
movies
• YouTube channel
• Many workshops …
Challenges & e-Solutions
§ Attract users!
§ Offer them top of the line eScience solutions for
their research ... which means top of the line
softwares)
§ Provide them training, tutorials and support
§ Make their life easier
The WeNMR VRC
Knowledge
Help Center
Tutorials, Wiki
Consultancy
Services
Portals
VRC
Third-party aggregation
Grid
Exposure
Marketplace
Blogs, news,
events..
User
SSO
Facebook
Challenges & e-Solutions
§ Attract users!
§ Access to e-infrastructure
Distribution of resources
World-wide: Dec 2015 ~ 140’000 CPU cores from 42 sites
(EGI & OSG)
• Support for
WeNMR/MoBrain/West-Life
activities
• 75M CPU hours, 50 TB storage from
7 sites
• Renewed in 2017
– CESNET-MetaCloud (Czech Republic)
– INFN-PADOVA (Italy)
– IFCA-LCG2 (Spain)
– NCG-INGRID-PT (Portugal)
– NIKHEF (The Netherlands)
– RAL-LCG2 (UK)
– SURFsara (The Netherlands)
– TW-NCHC (Taiwan)
SLA agreement
Challenges & e-Solutions
§ Attract users!
§ Access to e-infrastructure
§ Job management / submission
Job management
§ Need to handle millions of job submission
§ Initially based on gLite
§ Mostly migrated to DIRAC4EGI
§ From a user perspective DIRAC is in principle
grid/cloud agnostic:
§ Can automatically launch VMs
§ Software distributed via CVMFS
But where is the CLOUD?
European Open Science Cloud
CC
Under EGI-Engage
The eInfrastructure landscape over the years
§ With activities toward:
§ Integrating the communities
§ Making best use of cloud resources
§ Bringing data to the cloud (cryo-EM)
§ Exploiting GPGPU resources
§ While maintaining the quality of our
current services!
The MoBrain CC under EGI Engage
The West-Life VRE
west-life.eu
The West-Life-VRE portal
• Virtual machines
Already used for several workshops
(INSTRUCT Utrecht Apr. 2016; NeCEN, Leiden, ISCG2017, Taipei)
SCIPION cloud server now in production
Cryo-EM in the clouds
Virtualising portals using EC3
EGI Fedcloud usage
0
20
40
60
80
100
120
140
160
2016/12016/22016/32016/42016/52016/62016/72016/82016/92016/102016/112016/122017/12017/22017/32017/42017/52017/62017/72017/82017/92017/102017/112017/12
enmr.eu VO number of VMs
Average of 21 VMs full time per month
Harvesting GPGPU resources
Exploring GPGPU resources: PowerFit
• Python package to
automatically fit high-
resolution biomolecular
structures into cryo-EM
densities
• Simple command-line
program, able to run using
single/multiple CPUs or GPU
van Zundert and Bonvin. AIMS Biophysics 2, 73-87 (2015)
www.github.com/haddocking/powerfit
Exploring GPGPU resources: DisVis
• Python package to Python
package to visualize and
quantify the accessible
interaction space of distance
restrained binary biomolecular
complexes.
• Simple command-line
program, able to run using
single/multiple CPUs or GPU
van Zundert and Bonvin. Bioinformatics. 31, 3222-3224 (2015)
www.github.com/haddocking/disvis
Accessible interaction
space as a function of
the number of
distance restraints
DISVIS example
Baremetal vs grid vs cloud
ID Type GPU #Cores CPU type Mem (GB)
B-K20 Baremetal Tesla K20 24 HT (12 real) Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz 32
B-K40 Baremetal Tesla K40 48 HT (24 real) Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz 512
D-K20 Docker on K20 Tesla K20 24 Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz 32
K-K40 KVM on K40 Tesla K40 24 Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz 32
Case Machine
TimeGPU
(sec)
TimeCPU 1
core CPU1/GPU
B-K40 Baremetal 674 7928 11.8
K-K40 KVM 671 7996 11.9
B-K20 Baremetal 830 11839 14.3
D- K20 Docker 837 11926 14.3
No loss of performance
CourtesyofMarioDavid
INDIGO
<= Grid
<= Cloud
GPGPU, GRID-enabled web portals
http://milou.science.uu.nl/enmr/services/DISVIS/ http://milou.science.uu.nl/enmr/services/POWERFIT/
Pre-processing
+
Input files
packaging
Architecture behind the portals
User DB
User not found
Input error
WEB CLIENT WEB SERVER MASTER NODE WORKING NODE
GPU-
calculation
Validation
Submission
to local
nodes
Submission
to grid
node
CPU-
calculation
Chimera
image
generation
Post-processing
+
Results formatting
Output files
packaging
+
submission of
image generation
OR
Software Provisioning
indigodatacloudapps/disvis
indigodatacloudapps/powerfit
Because of complex
software dependencies
we use docker containers
• Python2.7
• NumPy 1.8+
• SciPy
• FFTW3
• pyFFTW 0.10+
• OpenCL1.1+
• pyopencl
• clFFT
• gpyfft
And to avoid security
issues on the grid side,
udocker from INDIGO
Some usage statsOperational since Aug. 2016
Published Dec. 2016 Top pulls in INDIGO applications docker hub
https://hub.docker.com/r/indigodatacloudapps/
Where are we going?
Which solution?
A bit of everything…
Thematic services under EOSC-Hub
https://www.egi.eu/use-cases/scientific-applications-tools/
Thematic services under EOSC-Hub
§ Harvest both
§ DIRAC4EGI can handle both without the
additional burden of managing the cloud
VMs
§ We still have much more grid than cloud
resources
§ HADDOCK portal as use case in Helix Nebula
Science Cloud
The exascale challenge
Ø ~20’000 human proteins
Ø Hundreds of thousands of interactions
Ø Billions CPU hours and exabytes of data
Ø Need to make our software ready for it!
bioexcel.eu
Acknowledgments:
the CSB group@UU
VICI
TOP-PUNT
WeNMR
West-Life
EGI-Engage
INDIGO-
Datacloud
BioExcel CoE
EOSC-Hub
SURFSara
€€
Thank you for your attention!
http://bonvinlab.org

Contenu connexe

Similaire à Structural biology in the clouds: A success story of 10 years

Cloud Standards in the Real World: Cloud Standards Testing for Developers
Cloud Standards in the Real World: Cloud Standards Testing for DevelopersCloud Standards in the Real World: Cloud Standards Testing for Developers
Cloud Standards in the Real World: Cloud Standards Testing for DevelopersAlan Sill
 
Use r 2013 tutorial - r and cloud computing for higher education and research
Use r 2013   tutorial - r and cloud computing for higher education and researchUse r 2013   tutorial - r and cloud computing for higher education and research
Use r 2013 tutorial - r and cloud computing for higher education and researchkchine3
 
The OptIPuter and Its Applications
The OptIPuter and Its ApplicationsThe OptIPuter and Its Applications
The OptIPuter and Its ApplicationsLarry Smarr
 
OptIPuter Overview
OptIPuter OverviewOptIPuter Overview
OptIPuter OverviewLarry Smarr
 
Scientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & SociologyScientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & SociologyNeil Chue Hong
 
Future Internet: Managing Innovation and Testbed
Future Internet: Managing Innovation and TestbedFuture Internet: Managing Innovation and Testbed
Future Internet: Managing Innovation and TestbedShinji Shimojo
 
How Global-Scale Personal Lighwaves are Transforming Scientific Research
How Global-Scale Personal Lighwaves are Transforming Scientific ResearchHow Global-Scale Personal Lighwaves are Transforming Scientific Research
How Global-Scale Personal Lighwaves are Transforming Scientific ResearchLarry Smarr
 
OpenStack Toronto Q3 MeetUp - September 28th 2017
OpenStack Toronto Q3 MeetUp - September 28th 2017OpenStack Toronto Q3 MeetUp - September 28th 2017
OpenStack Toronto Q3 MeetUp - September 28th 2017Stacy Véronneau
 
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC,  Service QA and DataverseIntegration of WORSICA’s thematic service in EOSC,  Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataversevty
 
Louise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx SystemsLouise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx SystemsDataconomy Media
 
CloudStack news
CloudStack newsCloudStack news
CloudStack newsShapeBlue
 
The EGI Federated Cloud, 7 months of production
The EGI Federated Cloud, 7 months of productionThe EGI Federated Cloud, 7 months of production
The EGI Federated Cloud, 7 months of productionDavid Wallom
 
Cal-(IT)2 Projects with Sun Microsystems
Cal-(IT)2 Projects with Sun MicrosystemsCal-(IT)2 Projects with Sun Microsystems
Cal-(IT)2 Projects with Sun MicrosystemsLarry Smarr
 
Design phase kick-off event and Ceremony
Design phase kick-off event and CeremonyDesign phase kick-off event and Ceremony
Design phase kick-off event and CeremonyArchiver
 
What's New in Cytoscape
What's New in CytoscapeWhat's New in Cytoscape
What's New in CytoscapeKeiichiro Ono
 
Using a Widely Distributed Federated Cloud System to Support Multiple Dispara...
Using a Widely Distributed Federated Cloud System to Support Multiple Dispara...Using a Widely Distributed Federated Cloud System to Support Multiple Dispara...
Using a Widely Distributed Federated Cloud System to Support Multiple Dispara...David Wallom
 
Academic Modular Seminar
Academic Modular SeminarAcademic Modular Seminar
Academic Modular SeminarJason Reid
 

Similaire à Structural biology in the clouds: A success story of 10 years (20)

Cloud Standards in the Real World: Cloud Standards Testing for Developers
Cloud Standards in the Real World: Cloud Standards Testing for DevelopersCloud Standards in the Real World: Cloud Standards Testing for Developers
Cloud Standards in the Real World: Cloud Standards Testing for Developers
 
Use r 2013 tutorial - r and cloud computing for higher education and research
Use r 2013   tutorial - r and cloud computing for higher education and researchUse r 2013   tutorial - r and cloud computing for higher education and research
Use r 2013 tutorial - r and cloud computing for higher education and research
 
The OptIPuter and Its Applications
The OptIPuter and Its ApplicationsThe OptIPuter and Its Applications
The OptIPuter and Its Applications
 
OptIPuter Overview
OptIPuter OverviewOptIPuter Overview
OptIPuter Overview
 
Sinnott Paper
Sinnott PaperSinnott Paper
Sinnott Paper
 
Dice presents-feb2014
Dice presents-feb2014Dice presents-feb2014
Dice presents-feb2014
 
Scientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & SociologyScientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & Sociology
 
Future Internet: Managing Innovation and Testbed
Future Internet: Managing Innovation and TestbedFuture Internet: Managing Innovation and Testbed
Future Internet: Managing Innovation and Testbed
 
How Global-Scale Personal Lighwaves are Transforming Scientific Research
How Global-Scale Personal Lighwaves are Transforming Scientific ResearchHow Global-Scale Personal Lighwaves are Transforming Scientific Research
How Global-Scale Personal Lighwaves are Transforming Scientific Research
 
OpenStack Toronto Q3 MeetUp - September 28th 2017
OpenStack Toronto Q3 MeetUp - September 28th 2017OpenStack Toronto Q3 MeetUp - September 28th 2017
OpenStack Toronto Q3 MeetUp - September 28th 2017
 
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC,  Service QA and DataverseIntegration of WORSICA’s thematic service in EOSC,  Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataverse
 
Louise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx SystemsLouise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx Systems
 
CloudStack news
CloudStack newsCloudStack news
CloudStack news
 
The EGI Federated Cloud, 7 months of production
The EGI Federated Cloud, 7 months of productionThe EGI Federated Cloud, 7 months of production
The EGI Federated Cloud, 7 months of production
 
Mundi
MundiMundi
Mundi
 
Cal-(IT)2 Projects with Sun Microsystems
Cal-(IT)2 Projects with Sun MicrosystemsCal-(IT)2 Projects with Sun Microsystems
Cal-(IT)2 Projects with Sun Microsystems
 
Design phase kick-off event and Ceremony
Design phase kick-off event and CeremonyDesign phase kick-off event and Ceremony
Design phase kick-off event and Ceremony
 
What's New in Cytoscape
What's New in CytoscapeWhat's New in Cytoscape
What's New in Cytoscape
 
Using a Widely Distributed Federated Cloud System to Support Multiple Dispara...
Using a Widely Distributed Federated Cloud System to Support Multiple Dispara...Using a Widely Distributed Federated Cloud System to Support Multiple Dispara...
Using a Widely Distributed Federated Cloud System to Support Multiple Dispara...
 
Academic Modular Seminar
Academic Modular SeminarAcademic Modular Seminar
Academic Modular Seminar
 

Dernier

Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPirithiRaju
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...ssifa0344
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSSLeenakshiTyagi
 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxjana861314
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsSumit Kumar yadav
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 

Dernier (20)

Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSS
 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 

Structural biology in the clouds: A success story of 10 years

  • 1. Structural biology in the clouds: A success story of 10 years Alexandre Bonvin Utrecht University The Netherlands a.m.j.j.bonvin@uu.nl
  • 2. Solution NMR: 950, 900-cryo, 750, 600-cryo, 600US, 2x500 MHz Solid-state NMR: 800WB-DNP, 400WB-DNP, 700US, 500WB MHz e-infrastructure: >1900 CPU cores + EGI grid (>110’000 CPU cores) 2019?: 1.2 GHz National and European infrastructure
  • 4. NMR and structural biology A SpronkNMR production
  • 7. Solving molecular puzzles by computational docking haddock.science.uu.nl >10000 users worldwide Used by major pharma companies
  • 8. Haddock web portal • > 10500 registered users • > 188000 served runs since June 2008 • > 35% on the GRID Visit bonvinlab.org/software De Vries et al. Nature Prot. 2010 Van Zundert et al. J.Mol.Biol. 2016
  • 10.
  • 11.
  • 12. 10 years of serving the research community Made possible via HTC resources provided by H2020 e-Infrastructure projects over the years
  • 13. European Open Science Cloud CC Under EGI-Engage The eInfrastructure landscape over the years
  • 14. European Open Science Cloud CC Under EGI-Engage The eInfrastructure landscape over the years
  • 16. # Number of dimensions 2 # INAME 1 1H # INAME 2 1H 12 2.137 2.387 1 T 0.000e+00 0.00e+00 - 0 2756 2760 0 14 2.387 4.140 1 T 0.000e+00 0.00e+00 - 0 2760 2752 0 32 1.849 4.432 1 T 0.000e+00 0.00e+00 - 0 2259 2257 0 36 1.849 3.143 1 T 0.000e+00 0.00e+00 - 0 2259 2587 0 39 1.760 4.432 1 T 0.000e+00 0.00e+00 - 0 2260 2257 0 40 1.760 1.849 1 T 0.000e+00 0.00e+00 - 0 2260 2259 0 43 1.760 3.143 1 T 0.000e+00 0.00e+00 - 0 2260 2587 0 46 1.649 4.432 1 T 1.035e+05 0.00e+00 r 0 2583 2257 0 47 1.649 1.849 1 T 0.000e+00 0.00e+00 - 0 2583 2259 0 assign ( resid 501 and name OO ) ( resid 501 and name Z ) ( resid 501 and name X ) ( resid 501 and name Y ) ( resid 2 and name CA ) -0.1400 0.15000 assign ( resid 501 and name OO ) ( resid 501 and name Z ) ( resid 501 and name X ) ( resid 501 and name Y ) ( resid 3 and name CA ) -0.0100 0.15000 Data interpretation Structure, dynamics & interactions è impact on research and health: - origin of disease - design of new experiments - drug design… Exploiting GRID resources in structural biology… Computations NMR data collection and processing SAXS data analysis
  • 17. eScience hub for NMR and structural biology Infrastructure Science Com m unity Knowledge The WeNMR VRC
  • 19. WeNMR VRC (February 2018) • enmr.eu: One of the largest (#users) VO in life sciences • >830 users have registered so far(36% outside EU) • Support from >40 sites for >200’000 CPU cores via EGI infrastructure • User-friendly access to Grid via web portals • Supported by an SLA (2016, updated in 2017) with EGI and NGIs www.wenmr.eu NMR SAXSA worldwide e-Infrastructure for NMR and structural biology
  • 20. Sustained growth of the WeNMR VRC 0 250 500 750 1000 1250 1500 1750 2000 2250 2500 2750 3000 3250 3500 2011-01 2011-04 2011-07 2011-10 2012-01 2012-04 2012-07 2012-10 2013-01 2013-04 2013-07 2013-10 2014-01 2014-04 2014-07 2014-10 2015-01 2015-04 2015-07 2015-10 2016-01 2016-04 2016-07 2016-10 2017-01 2017-04 2017-07 2017-10 2018-01
  • 22. Challenges & e-Solutions § Attract users! § Offer them top of the line eScience solutions for their research ... which means top of the line software
  • 23. The WeNMR VRC Knowledge Help Center Tutorials, Wiki Consultancy Services Portals VRC Third-party aggregation Grid Exposure Marketplace Blogs, news, events.. User SSO Facebook • 39 web portals (31 NMR, 7 SAXS) • of which 29 by partners • Uniform access through the new Single Sign On functionality • RPC access available for some portals
  • 24. The WeNMR services portfolio
  • 25. Challenges & e-Solutions § Attract users! § Offer them top of the line eScience solutions for their research ... which means top of the line software § Provide them training, tutorials and support
  • 26. The WeNMR VRC Knowledge Help Center Tutorials, Wiki Consultancy Services Portals VRC Third-party aggregation Grid Exposure Marketplace Blogs, news, events.. User SSO Facebook • Help center • Consultancy remote or on location • Tutorials, wiki documents, movies • YouTube channel • Many workshops …
  • 27. Challenges & e-Solutions § Attract users! § Offer them top of the line eScience solutions for their research ... which means top of the line softwares) § Provide them training, tutorials and support § Make their life easier
  • 28. The WeNMR VRC Knowledge Help Center Tutorials, Wiki Consultancy Services Portals VRC Third-party aggregation Grid Exposure Marketplace Blogs, news, events.. User SSO Facebook
  • 29. Challenges & e-Solutions § Attract users! § Access to e-infrastructure
  • 30. Distribution of resources World-wide: Dec 2015 ~ 140’000 CPU cores from 42 sites (EGI & OSG)
  • 31. • Support for WeNMR/MoBrain/West-Life activities • 75M CPU hours, 50 TB storage from 7 sites • Renewed in 2017 – CESNET-MetaCloud (Czech Republic) – INFN-PADOVA (Italy) – IFCA-LCG2 (Spain) – NCG-INGRID-PT (Portugal) – NIKHEF (The Netherlands) – RAL-LCG2 (UK) – SURFsara (The Netherlands) – TW-NCHC (Taiwan) SLA agreement
  • 32. Challenges & e-Solutions § Attract users! § Access to e-infrastructure § Job management / submission
  • 33. Job management § Need to handle millions of job submission § Initially based on gLite § Mostly migrated to DIRAC4EGI § From a user perspective DIRAC is in principle grid/cloud agnostic: § Can automatically launch VMs § Software distributed via CVMFS
  • 34. But where is the CLOUD?
  • 35. European Open Science Cloud CC Under EGI-Engage The eInfrastructure landscape over the years
  • 36. § With activities toward: § Integrating the communities § Making best use of cloud resources § Bringing data to the cloud (cryo-EM) § Exploiting GPGPU resources § While maintaining the quality of our current services! The MoBrain CC under EGI Engage
  • 38. The West-Life-VRE portal • Virtual machines Already used for several workshops (INSTRUCT Utrecht Apr. 2016; NeCEN, Leiden, ISCG2017, Taipei)
  • 39. SCIPION cloud server now in production Cryo-EM in the clouds
  • 43. Exploring GPGPU resources: PowerFit • Python package to automatically fit high- resolution biomolecular structures into cryo-EM densities • Simple command-line program, able to run using single/multiple CPUs or GPU van Zundert and Bonvin. AIMS Biophysics 2, 73-87 (2015) www.github.com/haddocking/powerfit
  • 44. Exploring GPGPU resources: DisVis • Python package to Python package to visualize and quantify the accessible interaction space of distance restrained binary biomolecular complexes. • Simple command-line program, able to run using single/multiple CPUs or GPU van Zundert and Bonvin. Bioinformatics. 31, 3222-3224 (2015) www.github.com/haddocking/disvis
  • 45. Accessible interaction space as a function of the number of distance restraints DISVIS example
  • 46. Baremetal vs grid vs cloud ID Type GPU #Cores CPU type Mem (GB) B-K20 Baremetal Tesla K20 24 HT (12 real) Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz 32 B-K40 Baremetal Tesla K40 48 HT (24 real) Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz 512 D-K20 Docker on K20 Tesla K20 24 Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz 32 K-K40 KVM on K40 Tesla K40 24 Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz 32 Case Machine TimeGPU (sec) TimeCPU 1 core CPU1/GPU B-K40 Baremetal 674 7928 11.8 K-K40 KVM 671 7996 11.9 B-K20 Baremetal 830 11839 14.3 D- K20 Docker 837 11926 14.3 No loss of performance CourtesyofMarioDavid INDIGO <= Grid <= Cloud
  • 47. GPGPU, GRID-enabled web portals http://milou.science.uu.nl/enmr/services/DISVIS/ http://milou.science.uu.nl/enmr/services/POWERFIT/
  • 48. Pre-processing + Input files packaging Architecture behind the portals User DB User not found Input error WEB CLIENT WEB SERVER MASTER NODE WORKING NODE GPU- calculation Validation Submission to local nodes Submission to grid node CPU- calculation Chimera image generation Post-processing + Results formatting Output files packaging + submission of image generation OR
  • 49. Software Provisioning indigodatacloudapps/disvis indigodatacloudapps/powerfit Because of complex software dependencies we use docker containers • Python2.7 • NumPy 1.8+ • SciPy • FFTW3 • pyFFTW 0.10+ • OpenCL1.1+ • pyopencl • clFFT • gpyfft And to avoid security issues on the grid side, udocker from INDIGO
  • 50. Some usage statsOperational since Aug. 2016 Published Dec. 2016 Top pulls in INDIGO applications docker hub https://hub.docker.com/r/indigodatacloudapps/
  • 51. Where are we going?
  • 53. A bit of everything…
  • 54. Thematic services under EOSC-Hub https://www.egi.eu/use-cases/scientific-applications-tools/
  • 55. Thematic services under EOSC-Hub § Harvest both § DIRAC4EGI can handle both without the additional burden of managing the cloud VMs § We still have much more grid than cloud resources § HADDOCK portal as use case in Helix Nebula Science Cloud
  • 56. The exascale challenge Ø ~20’000 human proteins Ø Hundreds of thousands of interactions Ø Billions CPU hours and exabytes of data Ø Need to make our software ready for it! bioexcel.eu
  • 58. Thank you for your attention! http://bonvinlab.org