Dynamically Creating Big Data
Centers for the LHC
Frank Würthwein
Professor of Physics
University of California San Diego
September 25th, 2013
Outline
•  The Science
•  Software & Computing Challenges
•  Present Solutions
•  Future Solutions
September 25th 2013 Frank Wurthwein - ISC Big Data 2
The Science
~67% of energy is dark energy
~29% of matter is dark matter
All of what we know makes up
only about 4% of the universe.
We have some ideas but no
proof of what this is!
We have no clue what this is.
The Universe is a strange place!
September 25th 2013 Frank Wurthwein - ISC Big Data 4
To study Dark
Matter we need to
create it in the
laboratory
September 25th 2013 Frank Wurthwein - ISC Big Data 5
Mont Blanc
Lake Geneva
ALICE
ATLAS
LHCb
CMS
Big bang in the laboratory
•  We gain insight by colliding particles at the highest
energies possible to measure:
–  Production rates
–  Masses & lifetimes
–  Decay rates
•  From this we derive the spectroscopy as well as the
dynamics of elementary particles.
•  Progress is made by going to higher energies and
brighter beams.
September 25th 2013 Frank Wurthwein - ISC Big Data 7
Explore Nature over 15 Orders of magnitude
Perfect agreement between Theory & Experiment
[Figure: CMS Preliminary inclusive jet double-differential cross section d^2σ/(dpT dy) [pb/(GeV/c)] vs. jet pT [GeV/c] for pp collisions at √s = 8 TeV, spanning roughly 15 orders of magnitude; rapidity bins from 0.0 < |y| < 0.5 to 3.2 < |y| < 4.7, scaled by powers of 10; open points: Lint = 5.8 pb^-1 (low-PU runs), filled points: Lint = 10.71 fb^-1 (high-PU runs); theory: NNPDF 2.1 NLO ⊗ NP.]
Dark Matter expected
somewhere below this line.
September 25th 2013 Frank Wurthwein - ISC Big Data 8
And for the Sci-Fi Buffs …
Imagine our 3D world to be
confined to a 3D surface in
a 4D universe.
Imagine this surface to be
curved such that the 4th D
distance is short for locations
light years away in 3D.
Imagine space travel by
tunneling through the 4th D.
The LHC is searching for evidence of a 4th dimension of space.
September 25th 2013 Frank Wurthwein - ISC Big Data 9
Recap so far …
•  The beams cross in the ATLAS and CMS
detectors at a rate of 20MHz
•  Each crossing contains ~10 collisions
•  We are looking for rare events that are
expected to occur in roughly
1/10000000000000 collisions, or less.
September 25th 2013 Frank Wurthwein - ISC Big Data 10
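To put that rarity in context, here is a quick back-of-the-envelope estimate using only the numbers quoted above (the 10^-13 signal fraction is the slide's own figure):

```python
# Rough rate estimate for a process occurring in ~1 of every 10^13 collisions,
# using the crossing rate and pile-up quoted on the slide above.
crossing_rate_hz = 20e6        # beam crossings per second (20 MHz)
collisions_per_crossing = 10   # ~10 pp collisions per crossing
signal_fraction = 1e-13        # "roughly 1/10000000000000 collisions, or less"

collision_rate_hz = crossing_rate_hz * collisions_per_crossing   # ~2e8 collisions/s
signal_rate_hz = collision_rate_hz * signal_fraction              # ~2e-5 events/s

print(f"collision rate: {collision_rate_hz:.1e} /s")
print(f"signal rate:    {signal_rate_hz:.1e} /s  (~{signal_rate_hz * 86400:.1f} per day of running)")
```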
Software & Computing
Challenges
The CMS Experiment
The CMS Experiment
•  80 Million electronic channels
x 4 bytes
x 40MHz
-----------------------
~ 10 Petabytes/sec of information
x 1/1000 zero-suppression
x 1/100,000 online event filtering
------------------------
~ 100-1000 Megabytes/sec raw data to tape
1 to 10 Petabytes of raw data per year
written to tape, not counting simulations.
•  2000 Scientists (1200 Ph.D. in physics)
–  ~ 180 Institutions
–  ~ 40 countries
•  12,500 tons, 21m long, 16m diameter
September 25th 2013 Frank Wurthwein - ISC Big Data 13
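The data-reduction arithmetic on the slide above can be checked directly; the only added number below is an assumed ~10^7 seconds of data taking per year, which reproduces the quoted 1-10 PB/year range:

```python
# Reproduce the data-reduction chain quoted on the slide above.
channels = 80e6            # electronic channels
bytes_per_channel = 4
crossing_rate_hz = 40e6    # 40 MHz readout

raw_bytes_per_s = channels * bytes_per_channel * crossing_rate_hz   # ~1.3e16 B/s (order 10 PB/s)
after_zero_suppression = raw_bytes_per_s / 1_000                    # keep ~1/1000
to_tape_bytes_per_s = after_zero_suppression / 100_000              # keep ~1/100,000 of events

live_seconds_per_year = 1e7   # assumption: ~10^7 s of data taking per year

print(f"off the detector : {raw_bytes_per_s / 1e15:.1f} PB/s")
print(f"raw data to tape : {to_tape_bytes_per_s / 1e6:.0f} MB/s")
print(f"raw data per year: {to_tape_bytes_per_s * live_seconds_per_year / 1e15:.1f} PB")
```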
Active Scientists in CMS
September 25th 2013 Frank Wurthwein - ISC Big Data 14
agreement; the Tier-2 CPU pledge is explicitly shown (orange line). It should be noted
that the utilization curve (in red) does not include the CERN contribution (used as
analysis facility since LS1). The slight deficit in pledge utilization at Tier-2 centers
in the first half of 2013 is mainly due to the lack of simulation requests, yet the level of
Tier-2 usage for data analysis stayed high after the end of LHC Run 1.
The Tier-2 sites continue to be very successfully used for analysis and have been the
primary analysis resource. The number of individual submitters per week submitting
jobs to the Tier-2 sites with the CMS CRAB tool is shown in Figure 10. The main dips
are explained by the CERN Christmas breaks, while the main peaks appear during
preparation periods for summer and winter conferences.
Figure 10: Individual analysis submitters per week to the grid from Sep. 2009 to Aug. 2013.
The average total number of individual submitters per month since the beginning of 2013
reaches 540, which means that in a typical 30-day period around 18% of the
collaboration has submitted a grid job. This is only a 10% decrease in the number of active
users compared to the LHC running period, showing that the user activity on the distributed
GRID facilities is largely decoupled from the actual data taking.
The average number of job slots used at Tier-2 sites since the beginning of 2013 was
37 K, as shown on the left-hand side of Figure 11. This is a 12% increase compared to
2012; however, given the 23% pledge increase at the same time, the overall pledge
utilization so far in 2013 has been smaller than in 2012, as confirmed by Figure 8. The
right hand side of Figure 11 shows the completed analysis jobs at Tier-2 in the first half
of 2013, with a measured average of ~1.4 M jobs per week (200 K jobs per day).
5-40% of the scientific
members are actively doing
large scale data analysis in
any given week.
~1/4 of the collaboration,
scientists and engineers,
contributed to the common
source code of ~3.6M C++ SLOC.
Evolution of LHC Science Program
Event rate written to tape: 150 Hz → 1000 Hz → 10000 Hz
September 25th 2013 Frank Wurthwein - ISC Big Data 15
LHC Roadmap (September 3, 2013)
[Roadmap figure annotations: Lint ~ 75-100 fb^-1; physics case; upgrade detector design.]
The Challenge
How do we organize the processing of 10s to 1000s of
Petabytes of data by a globally distributed community
of scientists, and do so with manageable change costs
for the next 20 years?
Guiding Principles for Solutions
Choose technical solutions that allow
computing resources to be as distributed as human resources.
Support distributed ownership and control,
within a global single sign-on security context.
Design for heterogeneity and adaptability.
September 25th 2013 Frank Wurthwein - ISC Big Data 16
Present Solutions
September 25th 2013 Frank Wurthwein - ISC Big Data 18
Federation of National Infrastructures. In the U.S.A.: Open Science Grid
September 25th 2013 Frank Wurthwein - ISC Big Data 19
Among the top 500 supercomputers
there are only two that are bigger when
measured by power consumption.
Tier-3 Centers
•  Locally controlled resources not pledged to any of
the 4 collaborations.
–  Large clusters at major research universities that are time-shared.
–  Small clusters inside departments and individual research
groups.
•  Requires the global sign-on system to be open to
dynamically adding resources.
–  Easy to support APIs
–  Easy to work around unsupported APIs
September 25th 2013 Frank Wurthwein - ISC Big Data 20
Me -- My friends -- The grid/cloud
O(10^4) Users
O(10^2-3) Sites
O(10^1-2) VOs
Thin client
Thin Grid API
Thick VO
Middleware
& Support
Me
My friends
The anonymous
Grid or Cloud
Domain science specific Common to all sciences
and industry
September 25th 2013 Frank Wurthwein - ISC Big Data 21
“My Friends” Services
•  Dynamic Resource provisioning
•  Workload management
– schedule resource, establish runtime
environment, execute workload, handle
results, clean up
•  Data distribution and access
– Input, output, and relevant metadata
•  File catalogue
September 25th 2013 Frank Wurthwein - ISC Big Data 22
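A minimal sketch of the workload-management cycle listed on the slide above (stage inputs, establish a runtime environment, execute the workload, handle results, clean up). The function is hypothetical and stands in for the real CMS services, not a reimplementation of them:

```python
import shutil
import subprocess
import tempfile

def run_one_workload(job_cmd, input_files, output_dir):
    """Hypothetical single pass through the lifecycle on the slide above."""
    workdir = tempfile.mkdtemp(prefix="wms-")              # establish runtime environment
    try:
        for f in input_files:                              # stage in declared inputs
            shutil.copy(f, workdir)
        proc = subprocess.run(job_cmd, cwd=workdir,        # execute the workload
                              capture_output=True, text=True)
        shutil.copytree(workdir, output_dir,               # handle results (stage out)
                        dirs_exist_ok=True)
        return proc.returncode
    finally:
        shutil.rmtree(workdir, ignore_errors=True)         # clean up
```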
!"#$%&'!&()"&*+*,-.&/&-)&0/&
!"#$%&"'#()*#+)",#-.//&-0."*#.1#.23&-'*#45'(#+)",#-.+6."&"'*7#
!"#$%&$)11*$2),3 &4&!"#$%/5&!"#$%65&7&!"#$%8-&9 &#&(*:&/;;.&'*"&*+*,-&
<*-&$)11*$2),3 & &4&<*-/5&<*-65&7&<*-8=&9& & &#&(*:&/;.&'*"&*+*,-&
01*$-"),&$)11*$2),3&4&01*$-"),/5&7&9& & & &#&$)>'1*5&?(&#-&#11&
!"#$%&@5&*+A&/&-)&0/&!"#$%&'B5&*+A&/&-)&0/&
<*-&0C#D"),?$5&*+A&/&-)&0/& 01*$-"),&00E5&*+A&/&-)&0/&
7&
7&7&
$1>.-*"/&
!"#$%&'!&()"&*+*,-.&0/&-)&06& !"#$%&@5&*+A&0/&-)&06&!"#$%&'B5&*+A&0/&-)&06&
<*-&0C#D"),?$5&*+A&0/&-)&06& 01*$-"),&00E5&*+A&0/&-)&06&
7&
7&7&
$1>.-*"6&
•  0#$C&FG#.%*-H&$)I'"*..*D&.*'#"#-*1J&KL&)'2I?B*D&-)&*M$?*,-1J&"*#D3&
–  '#"2#1&*+*,-5&*ANA5&),1J&!"#$%.O&
–  '#"2#1&)G=*$-5&*ANA5&),1J&01*$-"),&00E&-)&D*$?D*&?(&-C*&*+*,-&?.&?,-*"*.2,N&#-&#11&&
•  P)"&QER5&$1>.-*"&.?B*&),&D?.%&?.&S&T&6;&EU&)"&/;&T&V;&*+*,-.&
•  !)-#1&W1*&.?B*&(")I&/;;&EU&-)&/;&XU&
FU#.%*-H&
Q1>.-*".&)(&0+*,-.&
Optimize Data Structure for
Partial Reads
September 25th 2013 Frank Wurthwein - ISC Big Data 23
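The slide above describes the ROOT-style columnar layout CMS relies on: each attribute of each object type is stored in its own separately compressed "basket" covering a cluster of events, so a job decompresses only the branches (and clusters) it actually needs. A toy illustration of that access pattern with numpy and zlib, not the real ROOT file format:

```python
import zlib
import numpy as np

def write_baskets(columns, events_per_cluster=40):
    """Store each column ("branch") as separately compressed baskets,
    one basket per cluster of events, so a reader can decompress
    only the branches and clusters it needs."""
    baskets = {}
    for name, values in columns.items():
        arr = np.asarray(values)
        baskets[name] = [
            zlib.compress(arr[i:i + events_per_cluster].tobytes())
            for i in range(0, len(arr), events_per_cluster)
        ]
    return baskets

def read_branch(baskets, name, dtype=np.float64):
    """Partial read: touch only the baskets of one branch."""
    raw = b"".join(zlib.decompress(b) for b in baskets[name])
    return np.frombuffer(raw, dtype=dtype)

# Example: 1000 events, read back only the electron energy column.
rng = np.random.default_rng(0)
cols = {"track_pt": rng.exponential(10.0, 1000),
        "electron_e_em": rng.exponential(30.0, 1000)}
b = write_baskets(cols)
e_em = read_branch(b, "electron_e_em")   # decompresses only this branch, not the whole "file"
```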
Fraction of a file that is read
[Figure: histogram of the number of files vs. the fraction of the file that is read (0 to 1), with the number of files on a log scale from 10^4 to 10^7; the rightmost bin is an overflow bin.]
September 25th 2013 Frank Wurthwein - ISC Big Data 24
For the vast majority of files, less than 20% of the file is read.
Average: 20-35%; median: 3-7% (depending on the type of file).
Future Solutions
From present to future
•  Initially, we operated a largely static system.
–  Data was placed quasi-statically before it could be analyzed.
–  Analysis centers have contractual agreements with the collaboration.
–  All reconstruction is done at centers with custodial archives.
•  Increasingly, we have too much data to afford this.
–  Dynamic data placement
•  Data is placed at T2s based on job backlog in global queues.
–  WAN access: "Any Data, Anytime, Anywhere"
•  Jobs are started on the same continent as the data instead of the same
cluster attached to the data.
–  Dynamic creation of data processing centers
•  Tier-1 hardware bought to satisfy steady state needs instead of peak needs.
•  Primary processing as data comes off the detector => steady state
•  Annual Reprocessing of accumulated data => peak needs
September 25th 2013 Frank Wurthwein - ISC Big Data 26
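As an illustration of the dynamic data placement bullet above, here is a hedged sketch of the decision logic: replicate the datasets with the deepest job backlog to the Tier-2 sites with the most free space. The dataset names and numbers are made up, and this is illustrative logic, not the actual CMS placement service:

```python
def plan_replications(backlog_by_dataset, free_tb_by_site, dataset_size_tb, max_moves=5):
    """Choose (dataset, site) replications: datasets with the largest
    queued-job backlog are sent to the sites with the most free space."""
    plan = []
    hot = sorted(backlog_by_dataset, key=backlog_by_dataset.get, reverse=True)
    free = dict(free_tb_by_site)
    for dataset in hot[:max_moves]:
        site = max(free, key=free.get)                 # site with the most headroom
        if free[site] >= dataset_size_tb[dataset]:
            plan.append((dataset, site))
            free[site] -= dataset_size_tb[dataset]
    return plan

# Example call with made-up numbers:
backlog = {"/JetHT/Run2012A": 1200, "/MuEG/Run2012B": 300}     # queued jobs per dataset
free = {"T2_US_UCSD": 80.0, "T2_DE_DESY": 20.0}                # free TB per Tier-2 site
sizes = {"/JetHT/Run2012A": 30.0, "/MuEG/Run2012B": 15.0}      # dataset sizes in TB
print(plan_replications(backlog, free, sizes))                 # -> both land at T2_US_UCSD
```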
Any Data, Anytime, Anywhere
September 25th 2013 Frank Wurthwein - ISC Big Data 27
Site A Site B Site C
Global Xrootd
Redirector
Xrootd Xrootd Xrootd
Lustre Storage Hadoop Storage dCache Storage
User
Application Q: Open /store/foo
A: Check Site A
Q: Open /store/foo
A: Success!
Cmsd Cmsd Cmsd
Xrootd Cmsd
Global redirection system to unify all CMS data
into one globally accessible namespace.
This is made possible by paying careful attention to the IO layer
to avoid inefficiencies due to IO-related latencies.
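A minimal sketch of the redirection flow shown above: the client asks a global redirector, which queries the per-site cmsd/xrootd services and redirects the open() to a site that actually hosts the file. This is illustrative Python, not the real XRootD protocol, and the site names are placeholders:

```python
class Site:
    """Stand-in for a site's xrootd/cmsd pair; in reality the site's cmsd answers the query."""
    def __init__(self, name, files):
        self.name, self.files = name, set(files)
    def has(self, path):
        return path in self.files

class GlobalRedirector:
    """Toy version of the redirection flow on the slide above."""
    def __init__(self, sites):
        self.sites = sites
    def open(self, path):
        for site in self.sites:                  # "Q: Open /store/foo  A: Check Site A"
            if site.has(path):                   # ask each federated site in turn
                return f"root://{site.name}//{path.lstrip('/')}"
        raise FileNotFoundError(path)

redirector = GlobalRedirector([
    Site("siteA.example.org", {"/store/foo"}),   # e.g. Lustre-backed storage
    Site("siteB.example.org", {"/store/bar"}),   # e.g. Hadoop-backed storage
])
print(redirector.open("/store/foo"))             # -> root://siteA.example.org//store/foo
```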
Tape Archive!
@ FNAL!
Tier-2 Centers!
@ OSG!
Steady State!
Processing!
@ FNAL!
Peak!
Processing!
@ SDSC!
Cloud and/or OSG!
Resources!
Simulated!
Data!
Vision going forward
Implemented vision for 1st time in Spring 2013
using Gordon Supercomputer at SDSC.
September 25th 2013 Frank Wurthwein - ISC Big Data 28
September 25th 2013 Frank Wurthwein - ISC Big Data 29
SAN DIEGO SUPERCOMPUTER CENTER
Gordon Overview
[Slide: Gordon hardware overview: Intel-based compute nodes, flash (SSD) I/O nodes, and the "Data Oasis" Lustre parallel file system (~100 GB/s aggregate, 4 PB).]
SAN DIEGO SUPERCOMPUTER CENTER
Accelerate LHC Science
Rick Wagner, San Diego Supercomputer Center
Brian Bockelman, University of Nebraska-Lincoln
XSEDE 13, July 22-25, 2013, San Diego, CA
CMS “My Friends” Stack
•  CMSSW release environment
–  NFS exported from Gordon IO nodes
–  Future: CernVM-FS via Squid caches
•  J. Blomer et al.; 2012 J. Phys.: Conf. Ser. 396 052013
•  Security Context (CA certs, CRLs) via OSG worker node client
•  CMS calibration data access via FroNTier
•  B. Blumenfeld et al.; 2008 J. Phys.: Conf. Ser. 119 072007
–  Squid caches installed on Gordon IO nodes
•  glideinWMS
•  I. Sfiligoi et al.; doi:10.1109/CSIE.2009.950
–  Implements “late binding” provisioning of CPU and job scheduling
–  Submits pilots to Gordon via BOSCO (GSI-SSH)
•  WMAgent to manage CMS workloads
•  PhEDEx data transfer management
–  Uses SRM and gridftp
September 25th 2013 Frank Wurthwein - ISC Big Data 30
Job environment
Data and Job handling
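The key idea behind the glideinWMS entry above is "late binding": the pilot that lands on a worker node is generic, and only after it has started and validated its environment does it pull an actual user job from the central queue. A hypothetical sketch of that control flow (not glideinWMS code):

```python
import queue

central_queue = queue.Queue()          # stand-in for the VO's global job queue

def pilot(node_ok, run_job):
    """Late binding: the resource is provisioned first (this pilot),
    and the actual workload is bound to it only at runtime."""
    if not node_ok():                  # validate the worker node environment first
        return "pilot exits; no user job is wasted on a bad node"
    try:
        job = central_queue.get_nowait()   # bind a real user job only now
    except queue.Empty:
        return "no work available; pilot exits cleanly"
    return run_job(job)                # execute inside the already-validated environment

central_queue.put({"task": "reconstruct run 2012C, block 42"})
print(pilot(node_ok=lambda: True, run_job=lambda j: f"ran {j['task']}"))
```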
CMS “My Friends” Stack
•  CMSSW release environment
–  NFS exported from Gordon IO nodes
–  Future: CernVM-FS via Squid caches
•  J. Blomer et al.; 2012 J. Phys.: Conf. Ser. 396 052013
•  Security Context (CA certs, CRLs) via OSG worker node client
•  CMS calibration data access via FroNTier
•  B. Blumenfeld et al.; 2008 J. Phys.: Conf. Ser. 119 072007
–  Squid caches installed on Gordon IO nodes
•  glideinWMS
•  I. Sfiligoi et al.; doi:10.1109/CSIE.2009.950
–  Implements “late binding” provisioning of CPU and job scheduling
–  Submits pilots to Gordon via BOSCO (GSI-SSH)
•  WMAgent to manage CMS workloads
•  PhEDEx data transfer management
–  Uses SRM and gridftp
September 25th 2013 Frank Wurthwein - ISC Big Data 31
Job environment
Data and Job handling
This is clearly mighty complex !!!
So let’s focus only on the parts
that are specific to incorporating
Gordon as a dynamic data
processing center.
September 25th 2013 Frank Wurthwein - ISC Big Data 32
SAN DIEGO SUPERCOMPUTER CENTER
Items in red were deployed/modified to incorporate Gordon
Minor mod of
PhEDEx config file
Deploy Squid
Export CMSSW
& WN client
Gordon Results
•  Work completed in February/March 2013 as a result of
a “lunch conversation” between SDSC & US-CMS
management
–  Dynamically responding to an opportunity
•  400 Million RAW events processed
–  125 TB in and ~150 TB out
–  ~2 Million core hours of processing
•  Extremely useful for both science results as well as
proof of principle in software & computing.
September 25th 2013 Frank Wurthwein - ISC Big Data 33
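For scale, the Gordon numbers above translate into the following rough per-event averages (straightforward arithmetic on the quoted figures; no new inputs):

```python
events = 400e6            # RAW events processed
tb_in, tb_out = 125, 150  # data moved in / out
core_hours = 2e6          # processing time used

print(f"events per core-hour : {events / core_hours:,.0f}")            # ~200
print(f"CPU time per event   : {core_hours * 3600 / events:.0f} s")    # ~18 s
print(f"input per event      : {tb_in  * 1e12 / events / 1e6:.2f} MB") # ~0.31 MB
print(f"output per event     : {tb_out * 1e12 / events / 1e6:.2f} MB") # ~0.38 MB
```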
Summary & Conclusions
•  Guided by the principles:
– Support distributed ownership and control in a
global single sign-on security context.
– Design for heterogeneity and adaptability
•  The LHC experiments very successfully
developed and implemented a set of new
concepts to deal with BigData.
September 25th 2013 Frank Wurthwein - ISC Big Data 34
Outlook
•  The LHC experiments had to largely invent an
island of BigData technologies with limited
interactions with industry and other domain
sciences.
•  Is it worth building bridges to other islands?
– IO stack and HDF5 ?
– MapReduce ?
– What else ?
•  Is there a mainland emerging that is not just
another island?
September 25th 2013 Frank Wurthwein - ISC Big Data 35
