BDE SC6 workshop - introduction 2016

Presentation given as the introduction to the BDE SC6 Pilot Workshop, held at GESIS in Cologne on 5 December 2016.

Published in: Data & Analytics
  1. THE BIG DATA EUROPE PROJECT: STATUS & NEXT STEPS. SC6 Workshop, Cologne, 05 December 2016
  2. Supporting the Societal Domains with Big Data Technology. BigDataEurope Project. 16 Dec 2016, www.big-data-europe.eu
  3. Stakeholder Engagement Cycle
     • Present action, showcase deployments
     • Raise awareness about BDE results and what they mean for stakeholders
     • Collect requirements to drive further development
     Timeline milestones: M6, M12, M18, M24, M30
  4. Data Value Chain Evolution [figure]
     • Value-chain stages: Extraction, Curation, Quality, Linking, Integration, Publication, Visualization, Analysis
     • Domains shown: Health, Food, Energy, Transport, Climate, Societies, Security
     • Other labels: Data Repositories, Linked Open Data
     • Time axis: from proprietary, ‘locked-in’ solutions to OS solutions and Big Data stacks
  5. Big Data Integrator: a flexible, generic platform for (Big) Data Value Chain deployment
  6. Big Data Integrator
     • Prototype developed by BDE
       o Incorporates existing BD technology
       o Facilitates integration and deployment
     • Main points of the architecture
       o Dockerization
       o Support layer, including integrated UI
       o Semantification layer
  7. Generic Architecture
     • Plug-and-play BD platform
     • Cloud-deployment ready
     • Domain independent, customisable
     • Stacks Open Source solutions
     BDI Prototype Releases: 1. [July 2016]; 2. December 2016; 3. ….
  8. BigDataEurope Pilots: demonstrating the societal value through 7 pilot ‘real-world’ use-cases
  9. 7 Pilots
     • BDI Platform Instantiations
       o Allow end-users to easily deploy functionality in their own system environment
       o Modularized Docker approach: easier to replace components
       o Reduces the effort to keep 3rd-party software updated and integrated
     • 7 Societal Challenge Pilots
       o Aligned with the 7 European Commission H2020 Societal Challenges
       o Real-world use-cases (Data, Objectives, Solutions)
       o Some pilots have different data & objectives but a similar …
  10. SC1: Pharmacology research (Life Sciences & Health)
     Objective: Large-scale heterogeneous pharma-research data linking & integration
     • Query a large number of datasets, some of them large
     • Existing elaborate ingestion and homogenization by Open PHACTS
     • Extensive toolset developed by OPF and others
  11. SC1: Architecture & Components
     • Replicate Open PHACTS functionality on the BDE infrastructure using OS solutions
     • Based on Virtuoso, proprietary distributed database
     • Apply to other domains (e.g. Agriculture)
     • Porting to BDI gives flexibility and enables new functionalities
     • Logging & system health monitoring
  12. SC2: Viticulture resources (Food and Agriculture)
     Objective: Automate publication ingestion and thematic classification
     • AgInfra is a major infrastructure for agriculture researchers, serving cross-linked bibliography, data, and processing services
  13. SC2: Architecture & Components
     • BDI deployed as an external infrastructure for processing text (viticulture publications)
     • Storing and processing text at a larger scale than AgInfra can currently manage
  14. SC3: Predictive maintenance (Energy)
     Objective: Real-time turbine monitoring, stream processing and analytics
     • Wind turbine monitoring applies computational models to sensor data streams
     • Models are re-parameterized weekly using the week’s data from multiple turbines
  15. SC3: Architecture & Components
     • Existing in-house, non-scalable solution for model parameterization
       o Reliable Fortran software for data analysis
       o Efficient, but not scalable to data volume
     • Developing a BDI orchestrator
       o Re-uses existing software unmodified
       o Makes it easy to apply in parallel to many datasets and manage the outputs
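The orchestrator idea on the SC3 slide (run existing software unmodified, in parallel, over many datasets, and collect the outputs) can be sketched in a few lines. This is only an illustration under stated assumptions, not the pilot's implementation: `LEGACY_COMMAND`, `run_legacy` and `orchestrate` are hypothetical names, and a small Python one-liner stands in for the Fortran executable so the sketch runs anywhere.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the unmodified legacy analysis program.
# In a real deployment this would be the path to the existing executable;
# here a Python one-liner that sums its arguments plays that role.
LEGACY_COMMAND = [
    sys.executable,
    "-c",
    "import sys; print(sum(float(x) for x in sys.argv[1:]))",
]


def run_legacy(dataset_name, values):
    """Run the legacy program on one dataset and capture its printed result."""
    args = LEGACY_COMMAND + [str(v) for v in values]
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return dataset_name, float(result.stdout.strip())


def orchestrate(datasets, max_workers=4):
    """Apply the legacy program to every dataset in parallel and collect outputs."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_legacy, name, values)
            for name, values in datasets.items()
        ]
        # Each future yields a (dataset_name, output) pair.
        return dict(future.result() for future in futures)


if __name__ == "__main__":
    readings = {"turbine_a": [1.0, 2.0], "turbine_b": [3.5, 0.5]}
    print(orchestrate(readings))
```

Threads suffice here because each worker only waits on an external process; the legacy program itself is never modified, only invoked once per dataset.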
  16. SC4: Traffic conditions estimation (Transport)
     Objective: Estimation of real-time traffic conditions in Thessaloniki
     • Combines:
       o Traffic modelling from historical data
       o Current measurements from a taxi fleet of 1200 vehicles
  17. SC4: Architecture & Components
     • New Flink implementations of map matching and traffic prediction algorithms
     • BDI provides access to varied data sources
       o PostGIS database with city map
       o ElasticSearch database of historical data
       o Kafka stream of real-time data
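Map matching, mentioned on the SC4 slide, assigns each GPS measurement to a road segment. The slide does not say which algorithm the pilot's Flink implementation uses, so the sketch below shows only the simplest geometric variant (snap each point to the nearest segment) in planar coordinates. The function names and the toy road network are assumptions made for illustration.

```python
import math


def project_onto_segment(point, start, end):
    """Return the point on the segment start-end that is closest to `point`."""
    px, py = point
    ax, ay = start
    bx, by = end
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return start
    # Parameter of the orthogonal projection, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def match_point(point, segments):
    """Return (segment_id, distance) of the road segment nearest to `point`."""
    best_id, best_dist = None, math.inf
    for seg_id, (start, end) in segments.items():
        qx, qy = project_onto_segment(point, start, end)
        dist = math.hypot(point[0] - qx, point[1] - qy)
        if dist < best_dist:
            best_id, best_dist = seg_id, dist
    return best_id, best_dist


if __name__ == "__main__":
    # Toy road network in planar coordinates (hypothetical data).
    roads = {"main": ((0.0, 0.0), (10.0, 0.0)), "side": ((0.0, 5.0), (0.0, 15.0))}
    print(match_point((4.0, 1.0), roads))
```

Production map matchers usually also take heading, speed and the sequence of previous points into account; nearest-segment snapping is only the starting point.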
  18. SC5: Climate modelling (Climate)
     Objective: Supporting data-intensive climate research
     • Preparing modelling experiments
       o Slicing, transforming, combining datasets
       o Submission and retrieval from modelling infrastructure
     • Discovering and re-using previously computed derivatives
       o Lineage annotation: computed derivatives from datasets and model parameters
       o Finding appropriate past runs avoids …
  19. SC5: Architecture & Components
     • BDI offers:
       o Hive for managing data in a way that can be retrieved and manipulated, rather than file blocks
       o Cassandra stores structured and textual metadata for searching headers and lineage
     • Existing infrastructure: stable, reliable software for parallel computation of models
     • BDI is deployed as an external infrastructure for preparing and managing datasets
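The SC5 slides describe lineage annotation (recording which datasets and model parameters a derivative was computed from) so that previously computed derivatives can be discovered and re-used. A minimal sketch of that lookup follows, assuming a deterministic hash over the inputs as the lineage key. `LineageStore` is a hypothetical in-memory stand-in for the metadata store (the slides name Cassandra for this role), and the dataset ids and paths are invented for illustration.

```python
import hashlib
import json


def lineage_key(dataset_ids, parameters):
    """Build a deterministic key from input dataset ids and model parameters."""
    payload = json.dumps(
        {"datasets": sorted(dataset_ids), "parameters": parameters},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class LineageStore:
    """In-memory stand-in for a metadata store holding lineage records."""

    def __init__(self):
        self._runs = {}

    def record(self, dataset_ids, parameters, output_location):
        """Annotate a finished run with the inputs it was computed from."""
        self._runs[lineage_key(dataset_ids, parameters)] = output_location

    def find(self, dataset_ids, parameters):
        """Return the output of a matching past run, or None if there is none."""
        return self._runs.get(lineage_key(dataset_ids, parameters))


if __name__ == "__main__":
    store = LineageStore()
    store.record(["obs_2015", "grid_eu"], {"resolution_km": 9}, "/runs/0001")
    # Same inputs in a different order resolve to the same past run.
    print(store.find(["grid_eu", "obs_2015"], {"resolution_km": 9}))
```

Sorting the dataset ids and serialising with `sort_keys=True` makes the key independent of the order in which inputs and parameters are listed.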
  20. SC6: Municipality budgets (Social Sciences)
     Objective: Homogenized budgetary data made available for analysis and comparison
     • Ingestion of budget and budget execution data
     • Multiple municipalities in varied formats and data models
  21. SC6: Architecture & Components
     • BDI deployed as ingestion and storage infrastructure for external tools
       o Homogenizes variety of data (JSON, CSV, XML, etc.)
       o Exposes data as SPARQL endpoint serving homogenized data
     • Existing analytics and visualization tools
       o Use SPARQL queries to retrieve only the relevant slices of the overall data
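The homogenization step on the SC6 slide (budget data arriving as JSON, CSV, XML and so on, mapped to one model that tools can then query in slices) can be illustrated with a small sketch. Everything specific here is an assumption: the common record layout, the source field names (`fiscalYear`, `label`, `value`, `budget_lines`) and the helper functions are hypothetical. The pilot itself exposes the homogenized data through a SPARQL endpoint, whereas this sketch filters plain Python records.

```python
import csv
import io
import json

# Hypothetical common record layout: municipality, year, category, amount.


def from_csv(text, municipality):
    """Map a CSV export with columns year,category,amount to common records."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {
            "municipality": municipality,
            "year": int(row["year"]),
            "category": row["category"],
            "amount": float(row["amount"]),
        }
        for row in rows
    ]


def from_json(text, municipality):
    """Map a JSON export that uses different field names to common records."""
    items = json.loads(text)["budget_lines"]
    return [
        {
            "municipality": municipality,
            "year": int(item["fiscalYear"]),
            "category": item["label"],
            "amount": float(item["value"]),
        }
        for item in items
    ]


def total_by_category(records, year):
    """Retrieve only the slice for one year and sum amounts per category."""
    totals = {}
    for rec in records:
        if rec["year"] == year:
            totals[rec["category"]] = totals.get(rec["category"], 0.0) + rec["amount"]
    return totals


if __name__ == "__main__":
    csv_text = "year,category,amount\n2016,Education,100.5\n2015,Education,80\n"
    json_text = '{"budget_lines": [{"fiscalYear": 2016, "label": "Health", "value": 20}]}'
    records = from_csv(csv_text, "Town A") + from_json(json_text, "Town B")
    print(total_by_category(records, 2016))
```

Once every source is mapped to the same record layout, a comparison across municipalities becomes a single filter-and-aggregate step, which is the role the SPARQL queries play on the slide.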
  22. SC7: Change detection & verification (Secure Societies)
     Objective: Detect and verify events based on satellite imagery, news and social media
     • Events are extracted from text published by news agencies and on social networking sites
     • Events are geo-located and relevant changes are detected by comparing current and previous satellite images
  23. SC7: Architecture & Components [diagram labels: Event Detection, Change Detection]
     • Re-implementation of change detection algorithms for Spark
     • Parallel orchestrator for text analytics
       o Re-uses existing software
       o Scales to many input streams
     • BDI provides:
       o Cassandra for text content and metadata
       o Strabon GIS store for detected change location
       o Homogeneous access to both for analysis and visualization
  24. BigDataEurope Activities: Free Workshops, Hangouts & Webinars
  25. 2nd round of Societal Workshops
     • Transport: 22 September 2016, Brussels. Collocated with Big Data for Transport, Tisa workshop
     • Food & Agri: 30 September 2016, Brussels. Collocated with DG AGRI WP2018-20 stakeholder consultation
     • Energy: 4 October 2016, Brussels. Collocated with EC H2020 Info Day on “Smart Grids and Storage”
     • Climate: 11 October 2016, Brussels. Collocated with Melodies Project Event: Exploiting Open Data
     • Security: 18 October 2016, Brussels. Standalone workshop
     • Societies: 5 December 2016, Cologne. Collocated with EDDI16 - 8th …
  26. Other Activities
     • Fresh set (7) of Societal Workshops in 2017
     • Various SC-focussed and general hangouts
       o General (technical): 2 this year, more to follow!
       o SC6: 2 so far, the next one in the coming weeks
       o Recordings & presentations available online!
       o Keep track on the BDE website (Events)
  27. Questions & Contacts
     • WEB: www.big-data-europe.eu
     • EMAIL: info@big-data-europe.eu
     • BIG DATA INTEGRATOR: www.github.com/big-data-europe
     • PROJECT COORDINATION (Fraunhofer IAIS), EIS Department/Group
       o Prof. Sören Auer, auer © cs.uni-bonn · de
       o Dr. Simon Scerri, scerri © cs.uni-bonn · de
     • #BigDataEurope
     • … leads the Fraunhofer Big Data Alliance