2. Objective of the Pilot SC4
L. Selmi - BDE - SC4 Webinar
A scalable, fault-tolerant and flexible platform, based on open-source frameworks, that can process unbounded data sets and graphs.
6. Stream and Batch Processor
Apache Flink is an open-source platform for distributed stream and batch data processing.
8. Storage and Indexing
PostGIS is a spatial database that stores the road network data.
Elasticsearch is a distributed, open-source document database built on top of Apache Lucene. It stores the results of the workflow.
13. Visualization
The SC4 pilot processes real-time Floating Car Data (FCD), map-matches it, and classifies each road segment according to its traffic level.
14. Distributed computing: the theoretical minimum
Minimum requirements for fault tolerance and scalability
● Cluster of 3 nodes (Docker Swarm)
● 4 CPU cores per node
● 1 Flink worker per node
● 1 Flink slot per CPU core
Max parallelism = 3 nodes x 4 slots per node = 12
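The arithmetic behind this figure can be sketched in a few lines of Python. The function and parameter names are illustrative and not taken from the pilot; the sketch assumes the CPU cores of a node are shared evenly among its Flink workers, with one slot per core, as listed on the slide.

```python
def max_parallelism(nodes: int, cores_per_node: int,
                    workers_per_node: int = 1) -> int:
    """Total task slots in the cluster, assuming one slot per CPU core."""
    # Assumption: the cores of a node are split evenly among its workers.
    slots_per_worker = cores_per_node // workers_per_node
    return nodes * workers_per_node * slots_per_worker


# The cluster described on the slide: 3 nodes, 4 cores each, 1 worker per node.
print(max_parallelism(nodes=3, cores_per_node=4))  # 12
```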
15. Parallelization: map-match subtasks
1. source()
2. mapMatch()
3. keyBy()/window()/apply()
4. sink()
The subtasks can be distributed across slots, each with its own parallelism (e.g. from 1 to 12).
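To illustrate how a keyed step can run with a parallelism greater than 1, the sketch below routes records to parallel subtasks by hashing the key, so that all records for one road segment reach the same subtask. It is a simplified stand-in and not Flink's actual key assignment; `subtask_for`, `partition` and the sample segment ids are hypothetical.

```python
import zlib
from collections import defaultdict


def subtask_for(key: str, parallelism: int) -> int:
    """Map a key to a subtask index in [0, parallelism) using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % parallelism


def partition(records, parallelism):
    """Group (road_seg_id, speed) records by the subtask that would receive them."""
    buckets = defaultdict(list)
    for road_seg_id, speed in records:
        index = subtask_for(road_seg_id, parallelism)
        buckets[index].append((road_seg_id, speed))
    return dict(buckets)


# Hypothetical sample records; both "seg-1" records land in the same bucket.
records = [("seg-1", 42.0), ("seg-2", 30.5), ("seg-1", 38.0)]
buckets = partition(records, parallelism=12)
```

With `parallelism=1` every record goes to subtask 0, which corresponds to the lower end of the range mentioned on the slide.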
17. Parallelization: input and output data
Input fields: device_id, timestamp, lat, lon, speed, orientation, transit
The mapMatch subtask preserves the time order of the records, so that the next task, keyBy(road_seg)/window(15 min)/apply(), returns the correct average speed and number of vehicles for each road segment within each time window.
Output fields: road_seg_id, start_date, num_vehicles, avg_speed
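The keyed, windowed aggregation can be sketched in plain Python as follows. This is a batch-style illustration and not the pilot's Flink job: it groups already map-matched records by road segment and 15-minute tumbling window using their timestamps, so it does not model streaming order. It also assumes that num_vehicles counts distinct device ids and that timestamps are timezone-aware; the function names and sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=15)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(ts: datetime) -> datetime:
    """Floor a timezone-aware timestamp to the start of its 15-minute window."""
    return EPOCH + ((ts - EPOCH) // WINDOW) * WINDOW


def aggregate(matched):
    """matched: iterable of (road_seg_id, device_id, timestamp, speed).

    Returns rows of (road_seg_id, start_date, num_vehicles, avg_speed).
    """
    groups = defaultdict(list)
    for road_seg_id, device_id, ts, speed in matched:
        groups[(road_seg_id, window_start(ts))].append((device_id, speed))

    rows = []
    for (road_seg_id, start), items in sorted(groups.items()):
        # Assumption: a vehicle is counted once per window, by device id.
        num_vehicles = len({device for device, _ in items})
        avg_speed = sum(speed for _, speed in items) / len(items)
        rows.append((road_seg_id, start, num_vehicles, avg_speed))
    return rows


utc = timezone.utc
rows = aggregate([
    ("seg-1", "car-a", datetime(2017, 1, 1, 8, 2, tzinfo=utc), 40.0),
    ("seg-1", "car-b", datetime(2017, 1, 1, 8, 10, tzinfo=utc), 50.0),
    ("seg-1", "car-a", datetime(2017, 1, 1, 8, 20, tzinfo=utc), 30.0),
])
```

In this example the first two records fall into the window starting at 08:00 and the third into the window starting at 08:15, so the sketch returns two output rows for the same road segment.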
18. Pilot Cycle 2 Targets
● Extend the functionalities
● Improve the technology
● Lower the boundaries
19. Cycle 2 - Extend the functionalities
Short-term traffic forecasts
1. Map-match 44 GB of historical Floating Car Data from CERTH (Thessaloniki)
2. Train a model (using an artificial neural network, ANN)
3. Make predictions using the model and the near real-time data
20. Cycle 2 - Improve the technology
● Improve the map-matching algorithm
● Parallelize the processing of the historical data
● Finalize the “dockerization” of the components
21. Cycle 2 - Lower the boundaries
● Set up different visualizations for traffic monitoring and forecasting
● Visualize the traffic pattern in a road segment
● Visualize the location of a vehicle and the matched road segment (for testing)