Big Data Europe Transport Pilot case, Luigi Selmi


  1. Pilot SC4. L. Selmi - BDE - SC4 Webinar, BDE SC4, 02.12.2016
  2. Objective of the Pilot SC4: a scalable, fault-tolerant and flexible platform based on open source frameworks that can process unbounded data sets and graphs.
  3. Microservice Architecture
  4. Message Broker: Apache Kafka is a high-throughput, distributed, durable messaging system.
  5. Kafka Cluster (Apache Kafka)
  6. Stream and Batch Processor: Apache Flink is an open source platform for distributed stream and batch data processing.
  7. Flink Cluster (Apache Flink)
  8. Storage and Indexing: PostGIS is a spatial database that stores the road network data. Elasticsearch is a distributed open source document database built on top of Apache Lucene; it stores the results of the workflow.
  9. Elasticsearch Cluster
  10. Pilot Architecture
  11. BDE Components
  12. The FCD Pipeline
  13. Visualization: Pilot SC4 can map-match real-time Floating Car Data (FCD) and classify each road segment according to its traffic level.
  14. Distributed computing: the theoretical minimum. Minimum requirements for fault tolerance and scalability:
      ● Cluster of 3 nodes (Docker swarm)
      ● 4 CPU cores per node
      ● 1 (Flink) worker per node
      ● 1 (Flink) slot per CPU core
      Max parallelism = 3 nodes × 4 slots per node = 12
  15. Parallelization: map-match subtasks
      1. source()
      2. mapMatch()
      3. keyBy()/window()/apply()
      4. sink()
      The subtasks can be distributed in slots with different parallelism (e.g. from 1 to 12).
  16. Parallelization: Flink dataflow. A slot can process all the subtasks in a pipeline.
  17. Parallelization: input and output data
      Input fields: device_id, timestamp, lat, lon, speed, orientation, transit
      The mapMatch subtask preserves the time order of the records, so that the next task, keyBy(road_seg)/window(15’)/apply(), returns the correct average speed and number of vehicles for each road segment within each 15-minute window.
      Output fields: road_seg_id, start_date, num_vehicles, avg_speed
  18. Pilot Cycle 2 Targets
      ● Extend the functionalities
      ● Improve the technology
      ● Lower the boundaries
  19. Cycle 2 - Extend the functionalities: short-term traffic forecasts
      1. Map-match 44 Gb of historical Floating Car Data from CERTH (Thessaloniki)
      2. Train a model (using an artificial neural network, ANN)
      3. Make predictions using the model and the near real-time data
  20. Cycle 2 - Improve the technology
      ● Improve the map-matching algorithm
      ● Parallelize the processing of the historical data
      ● Finalize the “dockerization” of the components
  21. Cycle 2 - Lower the boundaries
      ● Set up different visualizations for traffic monitoring and forecasting
      ● Visualize the traffic pattern in a road segment
      ● Visualize the location of a vehicle and the matched road segment (for tests)
  22. Thanks
      BDE project website: https://www.big-data-europe.eu/
      Code repository: https://github.com/big-data-europe
      Contact: luigi.selmi@iais.fraunhofer.de
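
The aggregation step described in slide 17, keyBy(road_seg)/window(15’)/apply(), can be illustrated with a minimal sketch in plain Python. This is not the pilot's Flink code and does not use the Flink API; it only mimics a 15-minute tumbling window grouped by road segment. The record layout (dictionaries keyed by the slide's field names), the helper names `window_start` and `aggregate`, and the choice to count distinct `device_id` values as `num_vehicles` are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Assumed window size, taken from window(15') in the slides.
WINDOW = timedelta(minutes=15)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(ts):
    """Floor a timezone-aware timestamp to the start of its tumbling window."""
    elapsed = ts - EPOCH
    return EPOCH + (elapsed // WINDOW) * WINDOW


def aggregate(records):
    """Group map-matched records by (road_seg_id, window) and summarize them.

    Each record is a dict with at least device_id, timestamp, speed and
    road_seg_id (hypothetical layout, field names borrowed from the slides).
    """
    groups = defaultdict(list)
    for rec in records:
        key = (rec["road_seg_id"], window_start(rec["timestamp"]))
        groups[key].append(rec)

    rows = []
    for (seg, start), recs in sorted(groups.items(), key=lambda kv: kv[0]):
        rows.append({
            "road_seg_id": seg,
            "start_date": start,
            # Assumption: a vehicle is counted once per window and segment.
            "num_vehicles": len({r["device_id"] for r in recs}),
            "avg_speed": sum(r["speed"] for r in recs) / len(recs),
        })
    return rows
```

In a streaming engine the windows would be closed incrementally as records arrive in time order, which is why the slides stress that mapMatch must preserve that order; the batch-style grouping above sidesteps that concern and is only meant to show what each output row contains.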
