SlideShare une entreprise Scribd logo
1  sur  24
Jakub Wozniak, CERN
Next CERN Accelerator
Logging Service
A road to Big Data
#SparkSummit
2
CERN?Physics Lab
since 1954
CERN
Complex
On Map
3
LHC
SPS
Future
Circular
Collider
(FCC)
4
Plans for
80-100 km tunnel
under Geneva Lake
and Alps
FCC
5
Experiments
6
1 PB / second of events
(raw physics data)
Total stored beam energy
116e09 p x 2800 bunches
50MeV
6.5TeV
Proton
Energy
LHCBeam
Accelerator Timing
• Beam time shared per experiment in cycles
– of 1.2 sec, 2.4 sec, 3.6 sec, ...
7
Exp1 Exp2 Exp1 Exp3 Exp3 Exp4 Exp1
Accelerator Hardware
8
• Magnets
– Dipoles - bending
– Quadruples - focusing
– Kickers – inject/extract
• Radio-Freq. cavities - to accelerate
• Beam diagnostics (intensity, position)
• Timing & synchronization systems
• Orbit feedback & steering systems
• Radiation monitors
• Security systems
• Vacuum, cooling & ventilation
• High voltage electrical systems
• Specialized controls electronics
RF
Bending
Quads
Kicker
Diagnostics
Hardware
Controls
9
In practice
• Thousands of different devices
• Hundreds of properties each
LEIR
10
Low Energy
Ion Ring
@CERN
Controls Data Logging
• Equipment readings (acquisitions or settings)
– Not physics data from experiments!
– Needed to operate the accelerators
• Online monitoring & ad-hoc queries
– Alarms, device/system failures
– Beam degradation or dumps
• Offline analytics & studies
– Beam quality improvements
– New beam types
– Future experiments or machines
11
CERN Accelerator Logging Service
• Old system (CALS) based on Oracle (2 DBs)
– ~20,000 devices (from ~120,000 devices)
– 1,500,000 signals
– 5,000,000 extractions per day
– 71,000,000,000 records per day
• 2 TB / day (unfiltered data, 2 DBs)
– 1 PB of total data storage (heavily filtered in 95%)
12
Current Issues With CALS
• Performance / scalability problems
– Difficult to scale horizontally
– “… to extract 24h of data takes 12h”
• CALS & tools not ready for Big Data!
– Have to extract subsets of data to do analysis!
13
Upcoming Challenges
• Future big scientific projects
– High Luminosity LHC (HL-LHC)
– Future Circular Collider (FCC)
• Important increase of data loads
– Data frequency increase (from 1Hz to 100Hz)
– Much bigger vectors
– Limited filtering
14
HL-LHC Luminosity Forecast
15
Current Controls Data Storage
16
Run 1 Run 2LS 1
900 GB/day
Big Data
For Controls?
17
CALS
Impala
Kudu
Next CALS
(NXCALS)
Query Performance Results
18
0
50
100
150
200
250
23000 57747 230000 1350000 2650000
Time(s)
Number of Records To Process
Oracle Impala Spark
Lower is better
Disclaimer: Specific join on controls data
Python on Jupyter Platform
Data Science & Scientific Computing
19
CERN On-Premise Cloud
20
250,000 cores available
NXCALS Architecture
Spark
Lo
g.
Pr
oc.
Datasources
21
Jupyter
Old API
NXCALS
API
ETL
Kafka
HBase
HDFS
Avro
Parquet
Hadoop
API
Meta-data service DB
Scientists
Programmers
Applications
Clients
Future Directions
• Visualization for Analytics
• Service for Web based Analysis (SWAN)
– With NXCALS for Accelerator Controls data…
– …and Spark as first class citizen
• Unified Software Platform
• Interactive data analysis in the cloud
22
Conclusions
• Technology has to answer scientific needs
• Big Data is present in Accelerator Controls
• CERN started a sector-wide joint effort
– NXCALS (Controls Big Data Storage)
– SWAN (Analysis & Visualization)
• To promote and support science
• Spark is our choice for data extraction & analytics!
23
Links
• NXCALS - https://gitlab.cern.ch/acc-logging-team/nxcals
• Please come to the technical talk today afternoon!
• SWAN - https://swan.web.cern.ch/
• Science, accelerators, experiments
– https://greybook.cern.ch
– https://home.cern/about/accelerators
– PS virtual visit: https://goo.gl/1zqSTA
• Lectures from accelerator schools, summer courses
• http://cas.web.cern.ch/previous-schools
• https://indico.cern.ch/category/345/
• Beams Department, Controls Group - https://be-dep-co.web.cern.ch/
• Accelerators state - https://op-webtools.web.cern.ch/vistar/vistars.php?usr=SPS1
• Take Part: https://jobs.web.cern.ch/
24

Contenu connexe

Tendances

Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)
Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)
Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)Spark Summit
 
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)Spark Summit
 
xPatterns on Spark, Shark, Mesos, Tachyon
xPatterns on Spark, Shark, Mesos, TachyonxPatterns on Spark, Shark, Mesos, Tachyon
xPatterns on Spark, Shark, Mesos, TachyonClaudiu Barbura
 
Spark Summit EU talk by Luca Canali
Spark Summit EU talk by Luca CanaliSpark Summit EU talk by Luca Canali
Spark Summit EU talk by Luca CanaliSpark Summit
 
Managing Apache Spark Workload and Automatic Optimizing
Managing Apache Spark Workload and Automatic OptimizingManaging Apache Spark Workload and Automatic Optimizing
Managing Apache Spark Workload and Automatic OptimizingDatabricks
 
Processing 70Tb Of Genomics Data With ADAM And Toil
Processing 70Tb Of Genomics Data With ADAM And ToilProcessing 70Tb Of Genomics Data With ADAM And Toil
Processing 70Tb Of Genomics Data With ADAM And ToilSpark Summit
 
Monitorama 2015 Netflix Instance Analysis
Monitorama 2015 Netflix Instance AnalysisMonitorama 2015 Netflix Instance Analysis
Monitorama 2015 Netflix Instance AnalysisBrendan Gregg
 
OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...
OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...
OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...NETWAYS
 
Continuous Application with FAIR Scheduler with Robert Xue
Continuous Application with FAIR Scheduler with Robert XueContinuous Application with FAIR Scheduler with Robert Xue
Continuous Application with FAIR Scheduler with Robert XueDatabricks
 
Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...
Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...
Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...Spark Summit
 
Suneel Marthi - Deep Learning with Apache Flink and DL4J
Suneel Marthi - Deep Learning with Apache Flink and DL4JSuneel Marthi - Deep Learning with Apache Flink and DL4J
Suneel Marthi - Deep Learning with Apache Flink and DL4JFlink Forward
 
Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...
Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...
Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...Spark Summit
 
Temporal Operators For Spark Streaming And Its Application For Office365 Serv...
Temporal Operators For Spark Streaming And Its Application For Office365 Serv...Temporal Operators For Spark Streaming And Its Application For Office365 Serv...
Temporal Operators For Spark Streaming And Its Application For Office365 Serv...Jen Aman
 
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui MengChallenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui MengDatabricks
 
Apache Storm and Oracle Event Processing for Real-time Analytics
Apache Storm and Oracle Event Processing for Real-time AnalyticsApache Storm and Oracle Event Processing for Real-time Analytics
Apache Storm and Oracle Event Processing for Real-time AnalyticsPrabhu Thukkaram
 
Spark Summit EU talk by Sital Kedia
Spark Summit EU talk by Sital KediaSpark Summit EU talk by Sital Kedia
Spark Summit EU talk by Sital KediaSpark Summit
 
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...Spark Summit
 
Scaling Machine Learning To Billions Of Parameters
Scaling Machine Learning To Billions Of ParametersScaling Machine Learning To Billions Of Parameters
Scaling Machine Learning To Billions Of ParametersJen Aman
 
Scaling Data Analytics Workloads on Databricks
Scaling Data Analytics Workloads on DatabricksScaling Data Analytics Workloads on Databricks
Scaling Data Analytics Workloads on DatabricksDatabricks
 
Low Latency Execution For Apache Spark
Low Latency Execution For Apache SparkLow Latency Execution For Apache Spark
Low Latency Execution For Apache SparkJen Aman
 

Tendances (20)

Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)
Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)
Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)
 
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
 
xPatterns on Spark, Shark, Mesos, Tachyon
xPatterns on Spark, Shark, Mesos, TachyonxPatterns on Spark, Shark, Mesos, Tachyon
xPatterns on Spark, Shark, Mesos, Tachyon
 
Spark Summit EU talk by Luca Canali
Spark Summit EU talk by Luca CanaliSpark Summit EU talk by Luca Canali
Spark Summit EU talk by Luca Canali
 
Managing Apache Spark Workload and Automatic Optimizing
Managing Apache Spark Workload and Automatic OptimizingManaging Apache Spark Workload and Automatic Optimizing
Managing Apache Spark Workload and Automatic Optimizing
 
Processing 70Tb Of Genomics Data With ADAM And Toil
Processing 70Tb Of Genomics Data With ADAM And ToilProcessing 70Tb Of Genomics Data With ADAM And Toil
Processing 70Tb Of Genomics Data With ADAM And Toil
 
Monitorama 2015 Netflix Instance Analysis
Monitorama 2015 Netflix Instance AnalysisMonitorama 2015 Netflix Instance Analysis
Monitorama 2015 Netflix Instance Analysis
 
OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...
OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...
OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...
 
Continuous Application with FAIR Scheduler with Robert Xue
Continuous Application with FAIR Scheduler with Robert XueContinuous Application with FAIR Scheduler with Robert Xue
Continuous Application with FAIR Scheduler with Robert Xue
 
Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...
Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...
Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...
 
Suneel Marthi - Deep Learning with Apache Flink and DL4J
Suneel Marthi - Deep Learning with Apache Flink and DL4JSuneel Marthi - Deep Learning with Apache Flink and DL4J
Suneel Marthi - Deep Learning with Apache Flink and DL4J
 
Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...
Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...
Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...
 
Temporal Operators For Spark Streaming And Its Application For Office365 Serv...
Temporal Operators For Spark Streaming And Its Application For Office365 Serv...Temporal Operators For Spark Streaming And Its Application For Office365 Serv...
Temporal Operators For Spark Streaming And Its Application For Office365 Serv...
 
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui MengChallenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
 
Apache Storm and Oracle Event Processing for Real-time Analytics
Apache Storm and Oracle Event Processing for Real-time AnalyticsApache Storm and Oracle Event Processing for Real-time Analytics
Apache Storm and Oracle Event Processing for Real-time Analytics
 
Spark Summit EU talk by Sital Kedia
Spark Summit EU talk by Sital KediaSpark Summit EU talk by Sital Kedia
Spark Summit EU talk by Sital Kedia
 
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
 
Scaling Machine Learning To Billions Of Parameters
Scaling Machine Learning To Billions Of ParametersScaling Machine Learning To Billions Of Parameters
Scaling Machine Learning To Billions Of Parameters
 
Scaling Data Analytics Workloads on Databricks
Scaling Data Analytics Workloads on DatabricksScaling Data Analytics Workloads on Databricks
Scaling Data Analytics Workloads on Databricks
 
Low Latency Execution For Apache Spark
Low Latency Execution For Apache SparkLow Latency Execution For Apache Spark
Low Latency Execution For Apache Spark
 

Similaire à The Next CERN Accelerator Logging Service—A Road to Big Data with Jakub Wozniak (CERN)

How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceinside-BigData.com
 
The OpenStack Cloud at CERN - OpenStack Nordic
The OpenStack Cloud at CERN - OpenStack NordicThe OpenStack Cloud at CERN - OpenStack Nordic
The OpenStack Cloud at CERN - OpenStack NordicTim Bell
 
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Databricks
 
CERN’s Next Generation Data Analysis Platform with Apache Spark with Enric Te...
CERN’s Next Generation Data Analysis Platform with Apache Spark with Enric Te...CERN’s Next Generation Data Analysis Platform with Apache Spark with Enric Te...
CERN’s Next Generation Data Analysis Platform with Apache Spark with Enric Te...Databricks
 
Big Data for Big Discoveries
Big Data for Big DiscoveriesBig Data for Big Discoveries
Big Data for Big DiscoveriesGovnet Events
 
ServiceNow Event 15.11.2012 / ITIL for the Enterprise @CERN
ServiceNow Event 15.11.2012 / ITIL for the Enterprise @CERNServiceNow Event 15.11.2012 / ITIL for the Enterprise @CERN
ServiceNow Event 15.11.2012 / ITIL for the Enterprise @CERNRené Haeberlin
 
Seismic sensor
Seismic sensorSeismic sensor
Seismic sensorajsatienza
 
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...Databricks
 
Nuc chemsummerschool 072106-v2
Nuc chemsummerschool 072106-v2Nuc chemsummerschool 072106-v2
Nuc chemsummerschool 072106-v2Dean Quezon
 
The Netherlands China Low Frequency Explorer (NCLE) Change'-4
The Netherlands China Low Frequency Explorer (NCLE) Change'-4The Netherlands China Low Frequency Explorer (NCLE) Change'-4
The Netherlands China Low Frequency Explorer (NCLE) Change'-4ILOAHawaii
 
NASA Advanced Computing Environment for Science & Engineering
NASA Advanced Computing Environment for Science & EngineeringNASA Advanced Computing Environment for Science & Engineering
NASA Advanced Computing Environment for Science & Engineeringinside-BigData.com
 
OSMC 2012 | Monitoring at CERN by Christophe Haen
OSMC 2012 | Monitoring at CERN by Christophe HaenOSMC 2012 | Monitoring at CERN by Christophe Haen
OSMC 2012 | Monitoring at CERN by Christophe HaenNETWAYS
 
Big Fast Data in High-Energy Particle Physics
Big Fast Data in High-Energy Particle PhysicsBig Fast Data in High-Energy Particle Physics
Big Fast Data in High-Energy Particle PhysicsAndrew Lowe
 
The Square Kilometre Array: Overview and Engineering Update
The Square Kilometre Array: Overview and Engineering UpdateThe Square Kilometre Array: Overview and Engineering Update
The Square Kilometre Array: Overview and Engineering UpdateJoint ALMA Observatory
 
The CLIC project accelerator overview
The CLIC project accelerator overviewThe CLIC project accelerator overview
The CLIC project accelerator overviewasafrona
 
Enabling Real Time Analysis & Decision Making - A Paradigm Shift for Experime...
Enabling Real Time Analysis & Decision Making - A Paradigm Shift for Experime...Enabling Real Time Analysis & Decision Making - A Paradigm Shift for Experime...
Enabling Real Time Analysis & Decision Making - A Paradigm Shift for Experime...PyData
 
Science and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated EraScience and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated EraLarry Smarr
 
HNSciCloud represented at HUAWEI CONNECT 2017 in Shanghai
HNSciCloud represented at HUAWEI CONNECT 2017 in ShanghaiHNSciCloud represented at HUAWEI CONNECT 2017 in Shanghai
HNSciCloud represented at HUAWEI CONNECT 2017 in ShanghaiHelix Nebula The Science Cloud
 

Similaire à The Next CERN Accelerator Logging Service—A Road to Big Data with Jakub Wozniak (CERN) (20)

How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental science
 
The OpenStack Cloud at CERN - OpenStack Nordic
The OpenStack Cloud at CERN - OpenStack NordicThe OpenStack Cloud at CERN - OpenStack Nordic
The OpenStack Cloud at CERN - OpenStack Nordic
 
DA-JPL-final
DA-JPL-finalDA-JPL-final
DA-JPL-final
 
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
 
CERN’s Next Generation Data Analysis Platform with Apache Spark with Enric Te...
CERN’s Next Generation Data Analysis Platform with Apache Spark with Enric Te...CERN’s Next Generation Data Analysis Platform with Apache Spark with Enric Te...
CERN’s Next Generation Data Analysis Platform with Apache Spark with Enric Te...
 
Big Data for Big Discoveries
Big Data for Big DiscoveriesBig Data for Big Discoveries
Big Data for Big Discoveries
 
ServiceNow Event 15.11.2012 / ITIL for the Enterprise @CERN
ServiceNow Event 15.11.2012 / ITIL for the Enterprise @CERNServiceNow Event 15.11.2012 / ITIL for the Enterprise @CERN
ServiceNow Event 15.11.2012 / ITIL for the Enterprise @CERN
 
Seismic sensor
Seismic sensorSeismic sensor
Seismic sensor
 
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
 
Nuc chemsummerschool 072106-v2
Nuc chemsummerschool 072106-v2Nuc chemsummerschool 072106-v2
Nuc chemsummerschool 072106-v2
 
The Netherlands China Low Frequency Explorer (NCLE) Change'-4
The Netherlands China Low Frequency Explorer (NCLE) Change'-4The Netherlands China Low Frequency Explorer (NCLE) Change'-4
The Netherlands China Low Frequency Explorer (NCLE) Change'-4
 
NASA Advanced Computing Environment for Science & Engineering
NASA Advanced Computing Environment for Science & EngineeringNASA Advanced Computing Environment for Science & Engineering
NASA Advanced Computing Environment for Science & Engineering
 
OSMC 2012 | Monitoring at CERN by Christophe Haen
OSMC 2012 | Monitoring at CERN by Christophe HaenOSMC 2012 | Monitoring at CERN by Christophe Haen
OSMC 2012 | Monitoring at CERN by Christophe Haen
 
Jarp big data_sydney_v7
Jarp big data_sydney_v7Jarp big data_sydney_v7
Jarp big data_sydney_v7
 
Big Fast Data in High-Energy Particle Physics
Big Fast Data in High-Energy Particle PhysicsBig Fast Data in High-Energy Particle Physics
Big Fast Data in High-Energy Particle Physics
 
The Square Kilometre Array: Overview and Engineering Update
The Square Kilometre Array: Overview and Engineering UpdateThe Square Kilometre Array: Overview and Engineering Update
The Square Kilometre Array: Overview and Engineering Update
 
The CLIC project accelerator overview
The CLIC project accelerator overviewThe CLIC project accelerator overview
The CLIC project accelerator overview
 
Enabling Real Time Analysis & Decision Making - A Paradigm Shift for Experime...
Enabling Real Time Analysis & Decision Making - A Paradigm Shift for Experime...Enabling Real Time Analysis & Decision Making - A Paradigm Shift for Experime...
Enabling Real Time Analysis & Decision Making - A Paradigm Shift for Experime...
 
Science and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated EraScience and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated Era
 
HNSciCloud represented at HUAWEI CONNECT 2017 in Shanghai
HNSciCloud represented at HUAWEI CONNECT 2017 in ShanghaiHNSciCloud represented at HUAWEI CONNECT 2017 in Shanghai
HNSciCloud represented at HUAWEI CONNECT 2017 in Shanghai
 

Plus de Spark Summit

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang Spark Summit
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...Spark Summit
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang WuSpark Summit
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya RaghavendraSpark Summit
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...Spark Summit
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...Spark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingSpark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingSpark Summit
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...Spark Summit
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakSpark Summit
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimSpark Summit
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraSpark Summit
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Spark Summit
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...Spark Summit
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spark Summit
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovSpark Summit
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Spark Summit
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkSpark Summit
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Spark Summit
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...Spark Summit
 

Plus de Spark Summit (20)

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim Simeonov
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir Volk
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
 

Dernier

办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxUnduhUnggah1
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 

Dernier (20)

办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 

The Next CERN Accelerator Logging Service—A Road to Big Data with Jakub Wozniak (CERN)

  • 1. Jakub Wozniak, CERN Next CERN Accelerator Logging Service A road to Big Data #SparkSummit
  • 4. Future Circular Collider (FCC) 4 Plans for 80-100 km tunnel under Geneva Lake and Alps FCC
  • 6. 6 1 PB / second of events (raw physics data) Total stored beam energy 116e09 p x 2800 bunches 50MeV 6.5TeV Proton Energy LHCBeam
  • 7. Accelerator Timing • Beam time shared per experiment in cycles – of 1.2 sec, 2.4 sec, 3.6 sec, ... 7 Exp1 Exp2 Exp1 Exp3 Exp3 Exp4 Exp1
  • 8. Accelerator Hardware 8 • Magnets – Dipoles - bending – Quadruples - focusing – Kickers – inject/extract • Radio-Freq. cavities - to accelerate • Beam diagnostics (intensity, position) • Timing & synchronization systems • Orbit feedback & steering systems • Radiation monitors • Security systems • Vacuum, cooling & ventilation • High voltage electrical systems • Specialized controls electronics RF Bending Quads Kicker Diagnostics
  • 9. Hardware Controls 9 In practice • Thousands of different devices • Hundreds of properties each
  • 11. Controls Data Logging • Equipment readings (acquisitions or settings) – Not physics data from experiments! – Needed to operate the accelerators • Online monitoring & ad-hoc queries – Alarms, device/system failures – Beam degradation or dumps • Offline analytics & studies – Beam quality improvements – New beam types – Future experiments or machines 11
  • 12. CERN Accelerator Logging Service • Old system (CALS) based on Oracle (2 DBs) – ~20,000 devices (from ~120,000 devices) – 1,500,000 signals – 5,000,000 extractions per day – 71,000,000,000 records per day • 2 TB / day (unfiltered data, 2 DBs) – 1 PB of total data storage (heavily filtered in 95%) 12
  • 13. Current Issues With CALS • Performance / scalability problems – Difficult to scale horizontally – “… to extract 24h of data takes 12h” • CALS & tools not ready for Big Data! – Have to extract subsets of data to do analysis! 13
  • 14. Upcoming Challenges • Future big scientific projects – High Luminosity LHC (HL-LHC) – Future Circular Collider (FCC) • Important increase of data loads – Data frequency increase (from 1Hz to 100Hz) – Much bigger vectors – Limited filtering 14
  • 16. Current Controls Data Storage 16 Run 1 Run 2LS 1 900 GB/day
  • 18. Query Performance Results 18 0 50 100 150 200 250 23000 57747 230000 1350000 2650000 Time(s) Number of Records To Process Oracle Impala Spark Lower is better Disclaimer: Specific join on controls data
  • 19. Python on Jupyter Platform Data Science & Scientific Computing 19
  • 22. Future Directions • Visualization for Analytics • Service for Web based Analysis (SWAN) – With NXCALS for Accelerator Controls data… – …and Spark as first class citizen • Unified Software Platform • Interactive data analysis in the cloud 22
  • 23. Conclusions • Technology has to answer scientific needs • Big Data is present in Accelerator Controls • CERN started a sector-wide joint effort – NXCALS (Controls Big Data Storage) – SWAN (Analysis & Visualization) • To promote and support science • Spark is our choice for data extraction & analytics! 23
  • 24. Links • NXCALS - https://gitlab.cern.ch/acc-logging-team/nxcals • Please come to the technical talk today afternoon! • SWAN - https://swan.web.cern.ch/ • Science, accelerators, experiments – https://greybook.cern.ch – https://home.cern/about/accelerators – PS virtual visit: https://goo.gl/1zqSTA • Lectures from accelerator schools, summer courses • http://cas.web.cern.ch/previous-schools • https://indico.cern.ch/category/345/ • Beams Department, Controls Group - https://be-dep-co.web.cern.ch/ • Accelerators state - https://op-webtools.web.cern.ch/vistar/vistars.php?usr=SPS1 • Take Part: https://jobs.web.cern.ch/ 24

Notes de l'éditeur

  1. Speak about CERN, accelerators, the beam delivery process and how it relates to big data in Controls.
  2. Focused on Research, Technology & Education in the domain of Fundamental Physics, Standard Model & basic building blocks of matter for which accelerators serve as giant microscopes. Located at Franco/Swiss boarder near Geneva Funded by 22 member states with 2.5 k employees and 10k users from all over the world forming unique international community and very interesting environment to work in. It is home for LHC which is the biggest and the most complex accelerator in the world where in it giant detectors we could prove the existence of Higgs boson.
  3. This is how it is laid out on map, big ring is the LHC, smaller is the SPS, others could barely be seen. It spreads from the Jura mointains and Geneva Lake on the other side. You can see also the 2 main CERN sites.
  4. LHC is not the end of it, might be just the beginning as there are plans to build even greater FCC with a tunnel of 100km long that will go under the lake and spread from Jura mountains to the Alps. This is all still in plans, let’s hope this accelerator will be build before we all retire.
  5. Coming back to what is CERN now, this is the pictogram of the whole complex with all accelerators and all experiments. The biggest LHC related ones are here but it is not all about LHC, there is number of other accelerators & experimental sites. The core business of CERN is to deliver a quality beam to those site in order for the scientists to perform their experimental physics there.
  6. It all starts with the hydrogen bottle from which the protons are extracted and accelerated towards the booster. The booster is the acc with 4 rings. This is where the protons are grouped together into so called bunches. In PS those bunches are shaped into their final form in terms of bunch spacing, bunch length or bunch number. The beam gets extracted towards the SPS, accelerated further and finally injected into the LHC. The LHC has 2 beam pipes and we accelerate the beam into the opposite directions. Later on we collide it at the experimental sites. During this process the proton energy goes from 50MeV to 6.5TeV which is very small energy comparable with flying mosquito. But when you sum up all the protons and all the bunches the total stored beam energy is comparable with Aircraft carrier cruising at 5 knots. It takes around 20 minutes to fill up the LHC with all needed bunches so the process is repeated many times. At the same time the beam also gets delivered to other experiments. The interesting fact is the number of events that get generated from collisions where we enter the LHC physics state.
  7. How do we deliver the beam to different experiments at the same time? Beam time shared between experiments in so called cycles that are time intervals multiplication of 1.2 seconds. Each cycle a new beam is created and delivered to a different experimental site.
  8. Now what are the building blocks for a typical accelerator? What equipment is needed? CERN designs & produces that equipment
  9. Those are the device readings needed to operate the machines, this is not the experiments data, it is way before we get into the physics there. This is needed to do the online monitoring & controls – respond to the events that happen every second in the accelerator chain. And to do various scheduled studies (offline analytics)
  10. We are throwing away 95% of the data.
  11. How to extract specific subset of data – say an example.
  12. When let’s take a look into the Luminosity Forecast (luminosity is a measurement – a figure of merit for accelerators). We have this chart going up to 2038 but we are only here at this red bar, end of 2017. Here we really have still at nothing but a relatively flat bottom and the increase is ahead of us.
  13. But what is actually means for the Controls Data taking process? Dramatic data growth now in Run2 compared to Run1. We take 900 GB per day compared with less than 200 in Run1. We are confronted with a situation which our systems where not really prepared for.
  14. We had to react and apply some modern big data tools to go away from Oracle and build a Next Logging Service (NXCALS).
  15. This is an example taken from the prove of concept result where Spark showed itself as one of the best tools on the market to do this kind of and extraction for Controls Data.
  16. We’ve observed the emerging adoption of Python and Jupyter notebooks that is happening at CERN and other institute for data science and scientific computing. That is setting the scene and showing directions for how we should present out controls data.
  17. We also wanted to use more the existing on-premise cloud available with a large number of cores that was not so much used up to now for beam Controls systems.
  18. This is the new system architecture where we use tools like Kafka and pump the data to Hadoop into Hbase and Parquet files. We also use Spark and Jupyter to extract and present this data to our users. Please do come to the technical presentation where I go into the details for this system.
  19. A lot more coming, the visualization is shaping its way through a new project at CERN that is called SWAN based on Jupyter notebooks & Python. We hope that together with NXCALS and Spark it will form a Unified Software Platform for interactive data analysis in the cloud. It has a lot of potential and we have high hopes for it to materialize. I do look forward to coming back and talking about how this get together. This would be a great thing to have this unification.