WIFI SSID:Spark+AISummit | Password: UnifiedDataAnalytics
Sergio Ballesteros, TomTom
Kia Eisinga, TomTom
Driver Location Intelligence at
Scale using Apache Spark, Delta
Lake and MLflow on Databricks
#UnifiedDataAnalytics #SparkAISummit
Our vision
A safe, connected, autonomous world that is free of
congestion and emissions.
Big data drives our business, but data privacy always comes first
Data
• Anonymous location (GPS) traces
742,000,000 km every day
18,000 x
Data
• Anonymous location (GPS) Traces
• Community inputs
• User events
• Journalistic data
• Car sensor data
Dataflow
~150 trillion data points
~80 billion data points per day
Use case 1: IQ Maps analytics
In-dash systems are outperformed by smartphones
The embedded system is expected to be up to date, with no user interaction. Its most visible component is the map.
Drivers do not update their maps
Today's solutions provide manual updates, often with a necessity to drive to the dealer.
This is way too complex and inefficient.
OEMs require data-efficient solutions
While drivers expect an up-to-date system, carmakers are usually concerned about the data cost required for map management.
98% of trips are driven within a 150 km radius
99.8% of trips are driven within a 1000 km radius
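A minimal sketch of how such a share could be computed, assuming the radius is measured as great-circle distance from a home point to each trip destination (the slides do not say how it was measured). The function names and the sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def share_within_radius(home, destinations, radius_km):
    """Fraction of destinations that lie within radius_km of the home point."""
    if not destinations:
        return 0.0
    inside = sum(
        1 for lat, lon in destinations
        if haversine_km(home[0], home[1], lat, lon) <= radius_km
    )
    return inside / len(destinations)


# Hypothetical data: a home point in Amsterdam and four trip destinations
# (Utrecht, Rotterdam, Brussels, Paris).
home = (52.3676, 4.9041)
trip_ends = [(52.0907, 5.1214), (51.9244, 4.4777), (50.8503, 4.3517), (48.8566, 2.3522)]

for radius in (10, 150, 1000):
    print(radius, share_within_radius(home, trip_ends, radius))
```

Running the same function over a range of radii gives the kind of curve from which a default radius could be chosen.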
When radius is 0 km
• User drives within 2 regions every weekday
• Radius of 0 km.
• Download and install just home regions
• Cellular data usage kept to a minimum
When radius is 150 km
• User drives within 2 update regions every weekday
• Radius of 150 km.
• Home region: 6 update regions.
• Cellular data usage increased
IQ Maps demo with MLflow
Real results using 0.5M trips
“This insight has led me to the conclusion
that a default radius of 150 km is
unnecessary, and a small radius of ~10 km
would already satisfy most drivers while
keeping cellular data usage low for OEMs.”
- Rolf Dorland, PM at TomTom
Going on holidays
• User goes on holiday (to a less frequently updated region)
• Once the user starts driving, updates for all update regions the route goes through are downloaded and installed.
Destination prediction
Opportunity
Past: Rule-based solution
Delta Lake pipelines
Present: Machine Learning
Data
• Original trace data from 1 source: 227K device serials
• Filtering out invalid trips: 143K device serials
• Users with at least 50 trips: 3.6K device serials
• Devices feasible for modelling: 2.5K device serials
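The "at least 50 trips" step of this funnel can be sketched as follows. This is only an illustration: the trip record layout (a dict with hypothetical `device` and `valid` keys) is an assumption, and the slides do not say how invalid trips were detected.

```python
from collections import Counter

MIN_TRIPS = 50  # threshold named on the slide


def devices_with_enough_trips(trips, min_trips=MIN_TRIPS):
    """Return the device serials that have at least min_trips valid trips."""
    # Count only trips that passed the (unspecified) validity filter.
    counts = Counter(trip["device"] for trip in trips if trip["valid"])
    return {device for device, n in counts.items() if n >= min_trips}


# Hypothetical data: device A has 60 valid trips, device B has 49 valid and 5 invalid.
trips = (
    [{"device": "A", "valid": True}] * 60
    + [{"device": "B", "valid": True}] * 49
    + [{"device": "B", "valid": False}] * 5
)
print(devices_with_enough_trips(trips))
```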
Features
What do we use in the end?
For each trip, we have the following information:
• Where did the trip start?
• At what speed were you driving when the trip started?
• What was the time of day (morning/afternoon/evening) when the trip started?
• Was it rush hour when the trip started?
• What day of the week was it?
• Was it a weekend day?
• What was the season?
• Which driver profile do you belong to?
Historical information:
• Which destination did you go to on your last trip? And the one before that? And the one before that?
• If it is, say, a Monday, where did you go the last Monday you made a trip? (Do this for every weekday.)
To predict: To which destination are you going?
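A minimal sketch of how the time-based and historical features above might be derived from a trip's start time and the driver's earlier trips. The hour boundaries for time of day and rush hour, and the northern-hemisphere season mapping, are assumptions for illustration; the slides do not define them. Start location, speed and driver profile are left out.

```python
from datetime import datetime

WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")
SEASONS = ("winter", "spring", "summer", "autumn")  # assumed: northern hemisphere


def time_features(start):
    """Features that depend only on when the trip started."""
    hour = start.hour
    if 5 <= hour < 12:          # assumed boundaries
        part_of_day = "morning"
    elif 12 <= hour < 18:
        part_of_day = "afternoon"
    else:
        part_of_day = "evening"
    is_weekend = start.weekday() >= 5
    # Assumed rush-hour windows on working days: 07-09 and 16-19.
    is_rush_hour = (not is_weekend) and (7 <= hour < 9 or 16 <= hour < 19)
    return {
        "part_of_day": part_of_day,
        "is_rush_hour": is_rush_hour,
        "weekday": WEEKDAYS[start.weekday()],
        "is_weekend": is_weekend,
        "season": SEASONS[(start.month % 12) // 3],
    }


def history_features(past_trips, n_lags=3):
    """past_trips: list of (start_datetime, destination), oldest first."""
    # The last n_lags destinations, most recent first, padded with None.
    recent = [dest for _, dest in reversed(past_trips)][:n_lags]
    recent += [None] * (n_lags - len(recent))
    # The most recent destination seen on each weekday.
    last_by_weekday = {}
    for when, dest in past_trips:  # later trips overwrite earlier ones
        last_by_weekday[WEEKDAYS[when.weekday()]] = dest
    return {"previous_destinations": recent, "last_by_weekday": last_by_weekday}


# Hypothetical trips for one driver.
past = [
    (datetime(2019, 10, 7, 8, 0), "A"),
    (datetime(2019, 10, 9, 8, 0), "B"),
    (datetime(2019, 10, 14, 18, 0), "C"),
]
print(time_features(datetime(2019, 10, 16, 8, 30)))
print(history_features(past))
```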
Labels
How do we define where you are going?
• We are given the latitude and longitude of the destination of a trip.
• In order to find out which latitudes and longitudes belong to the same destination, we apply a clustering algorithm called DBSCAN.
• DBSCAN clusters together destinations that are within 500 meters of each other. We should have at least 5 trips to a destination in order to call it a cluster.
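The clustering step can be illustrated with a small pure-Python DBSCAN using the two parameters from the slide (500 meters, at least 5 trips). This is a sketch, not the speakers' implementation: it uses a quadratic neighbour search and great-circle distance, and it counts a point as its own neighbour. In practice a library implementation would normally be used. All names and sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in meters


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(p[0]), math.radians(q[0])
    dphi = phi2 - phi1
    dlmb = math.radians(q[1] - p[1])
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def dbscan(points, eps_m=500.0, min_samples=5):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    # Neighbour lists include the point itself; O(n^2), fine for a sketch.
    neighbours = [
        [j for j in range(n) if haversine_m(points[i], points[j]) <= eps_m]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # j is a core point: keep expanding
    return labels


# Hypothetical destinations: five trip ends around each of two places, plus one stray.
offsets = [(0, 0), (0.0005, 0), (0, 0.0005), (-0.0005, 0), (0, -0.0005)]
destinations = (
    [(52.3700 + dy, 4.9000 + dx) for dy, dx in offsets]
    + [(52.0900 + dy, 5.1200 + dx) for dy, dx in offsets]
    + [(51.0000, 4.0000)]
)
print(dbscan(destinations))
```

Destinations labelled -1 do not belong to any cluster and would have no destination label under this scheme.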
Train, validation and test split
Trip ID   Date          Destination
Trip 1    January 1     Cluster 1
Trip 2    January 22    Cluster 2
Trip 3    February 3    Cluster 1
Trip 4    February 15   Cluster 2
Trip 5    March 2       Cluster 1
Trip 6    March 14      Cluster 1
Trip 7    March 27      Cluster 2
Trip 8    April 4       Cluster 1
Trip 9    April 16      Cluster 2
Trip 10   May 8         Cluster 1
The trips are split into a train & validation dataset and a test dataset.
Time-series cross-validation
Iterative evaluation of the trips to avoid overfitting
Data for 1 driver:
Trip ID   Date          Destination
Trip 1    January 1     Cluster 1
Trip 2    January 22    Cluster 2
Trip 3    February 3    ?

Trip ID   Date          Destination
Trip 1    January 1     Cluster 1
Trip 2    January 22    Cluster 2
Trip 3    February 3    Cluster 1
Trip 4    February 15   ?

Trip ID   Date          Destination
Trip 1    January 1     Cluster 1
Trip 2    January 22    Cluster 2
Trip 3    February 3    Cluster 1
Trip 4    February 15   Cluster 2
…         …             …
Trip 10   May 8         ?
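The iterative evaluation shown above can be sketched as an expanding window: each fold uses all earlier trips as history and the next trip as the one to predict. The function name and the minimum history length of 2 are assumptions taken from the example tables, not a stated setting.

```python
def expanding_window_folds(trips, min_history=2):
    """Yield (history, target) pairs; trips must be sorted by date, oldest first."""
    for i in range(min_history, len(trips)):
        # Everything before trip i is known; trip i is the one to predict.
        yield trips[:i], trips[i]


# The ten trips of the example driver, in date order.
trips = [
    ("Trip 1", "January 1", "Cluster1"),
    ("Trip 2", "January 22", "Cluster2"),
    ("Trip 3", "February 3", "Cluster1"),
    ("Trip 4", "February 15", "Cluster2"),
    ("Trip 5", "March 2", "Cluster1"),
    ("Trip 6", "March 14", "Cluster1"),
    ("Trip 7", "March 27", "Cluster2"),
    ("Trip 8", "April 4", "Cluster1"),
    ("Trip 9", "April 16", "Cluster2"),
    ("Trip 10", "May 8", "Cluster1"),
]

for history, target in expanding_window_folds(trips):
    print(f"use {len(history)} earlier trips to predict {target[0]}")
```

Because a fold never sees trips later than its target, no future information leaks into the evaluation.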
Rapid experimentation
Majority baseline
Distribution of precision on the test set with a majority baseline classifier
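A majority baseline always predicts the destination a driver visited most often in the training data. The sketch below shows the idea for a single driver. The split (first 8 trips for training, last 2 for testing) is hypothetical, and the score is the plain share of correct predictions, since the slides do not state how precision was computed or averaged.

```python
from collections import Counter


def majority_destination(train_destinations):
    """Most frequent destination in the training trips."""
    return Counter(train_destinations).most_common(1)[0][0]


def share_correct(prediction, test_destinations):
    """Fraction of test trips whose destination equals the constant prediction."""
    hits = sum(1 for actual in test_destinations if actual == prediction)
    return hits / len(test_destinations)


# Destinations of the example driver's ten trips, in date order.
destinations = [
    "Cluster1", "Cluster2", "Cluster1", "Cluster2", "Cluster1",
    "Cluster1", "Cluster2", "Cluster1", "Cluster2", "Cluster1",
]
train, test = destinations[:8], destinations[8:]  # hypothetical split point

prediction = majority_destination(train)
print(prediction, share_correct(prediction, test))
```

Computing this score per driver gives a distribution against which a tuned classifier can be compared.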
Results
Distribution of precision on the test set with a tuned classifier
Accelerating the Future of Mobility
By embracing Apache Spark, Databricks and the Azure cloud

Contenu connexe

Tendances

Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl
Building a Just in Time Data Warehouse by Dan Morris and Jason PohlBuilding a Just in Time Data Warehouse by Dan Morris and Jason Pohl
Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl
Spark Summit
 
Zipline—Airbnb’s Declarative Feature Engineering Framework
Zipline—Airbnb’s Declarative Feature Engineering FrameworkZipline—Airbnb’s Declarative Feature Engineering Framework
Zipline—Airbnb’s Declarative Feature Engineering Framework
Databricks
 
Digital Transformation Mindset - More Than Just Technology
Digital Transformation Mindset - More Than Just TechnologyDigital Transformation Mindset - More Than Just Technology
Digital Transformation Mindset - More Than Just Technology
confluent
 
Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
 Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr... Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
Databricks
 
How to Quantify the Value of Kafka in Your Organization
How to Quantify the Value of Kafka in Your Organization How to Quantify the Value of Kafka in Your Organization
How to Quantify the Value of Kafka in Your Organization
confluent
 

Tendances (20)

Spark Summit presentation by Ken Tsai
Spark Summit presentation by Ken TsaiSpark Summit presentation by Ken Tsai
Spark Summit presentation by Ken Tsai
 
JUG Tirana - Introduction to data streaming
JUG Tirana - Introduction to data streamingJUG Tirana - Introduction to data streaming
JUG Tirana - Introduction to data streaming
 
Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl
Building a Just in Time Data Warehouse by Dan Morris and Jason PohlBuilding a Just in Time Data Warehouse by Dan Morris and Jason Pohl
Building a Just in Time Data Warehouse by Dan Morris and Jason Pohl
 
Spark and the Enterprise by Tony Baer
Spark and the Enterprise by Tony BaerSpark and the Enterprise by Tony Baer
Spark and the Enterprise by Tony Baer
 
Zipline—Airbnb’s Declarative Feature Engineering Framework
Zipline—Airbnb’s Declarative Feature Engineering FrameworkZipline—Airbnb’s Declarative Feature Engineering Framework
Zipline—Airbnb’s Declarative Feature Engineering Framework
 
Building an ML Tool to predict Article Quality Scores using Delta & MLFlow
Building an ML Tool to predict Article Quality Scores using Delta & MLFlowBuilding an ML Tool to predict Article Quality Scores using Delta & MLFlow
Building an ML Tool to predict Article Quality Scores using Delta & MLFlow
 
Snowplow, Metail and Cascalog
Snowplow, Metail and CascalogSnowplow, Metail and Cascalog
Snowplow, Metail and Cascalog
 
Spark Summit East Keynote by Anjul Bhambhri
Spark Summit East Keynote by Anjul BhambhriSpark Summit East Keynote by Anjul Bhambhri
Spark Summit East Keynote by Anjul Bhambhri
 
The Impact of Always-on Connectivity for Geospatial Applications and Analysis
The Impact of Always-on Connectivity for Geospatial Applications and AnalysisThe Impact of Always-on Connectivity for Geospatial Applications and Analysis
The Impact of Always-on Connectivity for Geospatial Applications and Analysis
 
Digital Transformation Mindset - More Than Just Technology
Digital Transformation Mindset - More Than Just TechnologyDigital Transformation Mindset - More Than Just Technology
Digital Transformation Mindset - More Than Just Technology
 
How to evolve your analytics stack with your business using Snowplow
How to evolve your analytics stack with your business using SnowplowHow to evolve your analytics stack with your business using Snowplow
How to evolve your analytics stack with your business using Snowplow
 
Tapjoy: Building a Real-Time Data Science Service for Mobile Advertising
Tapjoy: Building a Real-Time Data Science Service for Mobile AdvertisingTapjoy: Building a Real-Time Data Science Service for Mobile Advertising
Tapjoy: Building a Real-Time Data Science Service for Mobile Advertising
 
Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
 Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr... Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
Using Spark-Solr at Scale: Productionizing Spark for Search with Apache Solr...
 
A taste of Snowplow Analytics data
A taste of Snowplow Analytics dataA taste of Snowplow Analytics data
A taste of Snowplow Analytics data
 
TripleLift: Preparing for a New Programmatic Ad-Tech World
TripleLift: Preparing for a New Programmatic Ad-Tech WorldTripleLift: Preparing for a New Programmatic Ad-Tech World
TripleLift: Preparing for a New Programmatic Ad-Tech World
 
How to Quantify the Value of Kafka in Your Organization
How to Quantify the Value of Kafka in Your Organization How to Quantify the Value of Kafka in Your Organization
How to Quantify the Value of Kafka in Your Organization
 
Flink Forward Berlin 2017: Bas Geerdink, Martijn Visser - Fast Data at ING - ...
Flink Forward Berlin 2017: Bas Geerdink, Martijn Visser - Fast Data at ING - ...Flink Forward Berlin 2017: Bas Geerdink, Martijn Visser - Fast Data at ING - ...
Flink Forward Berlin 2017: Bas Geerdink, Martijn Visser - Fast Data at ING - ...
 
How to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top ContendersHow to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top Contenders
 
Snowplow Analytics: from NoSQL to SQL and back again
Snowplow Analytics: from NoSQL to SQL and back againSnowplow Analytics: from NoSQL to SQL and back again
Snowplow Analytics: from NoSQL to SQL and back again
 
Apply MLOps at Scale
Apply MLOps at ScaleApply MLOps at Scale
Apply MLOps at Scale
 

Similaire à Driver Location Intelligence at Scale using Apache Spark, Delta Lake, and MLflow on Databricks

Similaire à Driver Location Intelligence at Scale using Apache Spark, Delta Lake, and MLflow on Databricks (20)

Automobile Route Matching with Dynamic Time Warping Using PySpark with Cather...
Automobile Route Matching with Dynamic Time Warping Using PySpark with Cather...Automobile Route Matching with Dynamic Time Warping Using PySpark with Cather...
Automobile Route Matching with Dynamic Time Warping Using PySpark with Cather...
 
Truck planning: how to certify the right route
Truck planning: how to certify the right routeTruck planning: how to certify the right route
Truck planning: how to certify the right route
 
Clickstream data with spark
Clickstream data with sparkClickstream data with spark
Clickstream data with spark
 
Towards characterizing international routing detours
Towards characterizing international routing detoursTowards characterizing international routing detours
Towards characterizing international routing detours
 
Nmc ussls charter 2012
Nmc ussls charter 2012Nmc ussls charter 2012
Nmc ussls charter 2012
 
Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...
Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...
Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...
 
Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...
Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...
Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...
 
2014 CUTC Summer Meeting: David Plazak
2014 CUTC Summer Meeting: David Plazak2014 CUTC Summer Meeting: David Plazak
2014 CUTC Summer Meeting: David Plazak
 
Can valuable ITS data be delivered using camera technologies?
Can valuable ITS data be delivered using camera technologies?Can valuable ITS data be delivered using camera technologies?
Can valuable ITS data be delivered using camera technologies?
 
Big data Europe the transport pilot in Thessaloniki - Josep Maria Salanova
Big data Europe the transport pilot in Thessaloniki - Josep Maria SalanovaBig data Europe the transport pilot in Thessaloniki - Josep Maria Salanova
Big data Europe the transport pilot in Thessaloniki - Josep Maria Salanova
 
IoT beneath your feet - building smart roads and networks
IoT beneath your feet - building smart roads and networksIoT beneath your feet - building smart roads and networks
IoT beneath your feet - building smart roads and networks
 
The data streaming processing paradigm and its use in modern fog architectures
The data streaming processing paradigm and its use in modern fog architecturesThe data streaming processing paradigm and its use in modern fog architectures
The data streaming processing paradigm and its use in modern fog architectures
 
Spark Summit EU talk by Javier Aguedes
Spark Summit EU talk by Javier AguedesSpark Summit EU talk by Javier Aguedes
Spark Summit EU talk by Javier Aguedes
 
Kano vaccine direct delivery
Kano vaccine direct deliveryKano vaccine direct delivery
Kano vaccine direct delivery
 
DataStax and Esri: Geotemporal IoT Search and Analytics
DataStax and Esri: Geotemporal IoT Search and AnalyticsDataStax and Esri: Geotemporal IoT Search and Analytics
DataStax and Esri: Geotemporal IoT Search and Analytics
 
Integrating Technology into Water Trail Managemetnt Practices - Walter Opusz...
Integrating Technology into Water Trail  Managemetnt Practices - Walter Opusz...Integrating Technology into Water Trail  Managemetnt Practices - Walter Opusz...
Integrating Technology into Water Trail Managemetnt Practices - Walter Opusz...
 
FME World Tour: The difficulties of a simple trail network
FME World Tour: The difficulties of a simple trail networkFME World Tour: The difficulties of a simple trail network
FME World Tour: The difficulties of a simple trail network
 
Transport routing optimization
Transport routing optimizationTransport routing optimization
Transport routing optimization
 
Traffic Congestion using IOT
Traffic Congestion using IOTTraffic Congestion using IOT
Traffic Congestion using IOT
 
High Performance Computing on NYC Yellow Taxi Data Set
High Performance Computing on NYC Yellow Taxi Data SetHigh Performance Computing on NYC Yellow Taxi Data Set
High Performance Computing on NYC Yellow Taxi Data Set
 

Plus de Databricks

Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 

Plus de Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Dernier

Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
amitlee9823
 
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night StandCall Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
amitlee9823
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
amitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
amitlee9823
 
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
gajnagarg
 
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
gajnagarg
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
amitlee9823
 

Dernier (20)

Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night StandCall Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
 
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 

Driver Location Intelligence at Scale using Apache Spark, Delta Lake, and MLflow on Databricks

  • 1. WIFI SSID:Spark+AISummit | Password: UnifiedDataAnalytics
  • 2. Sergio Ballesteros, TomTom; Kia Eisinga, TomTom. Driver Location Intelligence at Scale using Apache Spark, Delta Lake and MLflow on Databricks. #UnifiedDataAnalytics #SparkAISummit
  • 3. Our vision: A safe, connected, autonomous world that is free of congestion and emissions.
  • 8. Data • Anonymous location (GPS) traces • Community inputs • User events • Journalistic data • Car sensor data
  • 9. Data flow • ~150 trillion data points • ~80 billion data points per day
  • 12. Use case 1: IQ Maps analytics. In-dash systems are outperformed by smartphones. The embedded system is expected to be up-to-date, with no user interaction. And the most visible component of it is a map.
  • 13. Drivers do not update their maps. Today’s solutions provide manual updates, often with a necessity to drive to the dealer. This is way too complex and inefficient.
  • 15. OEMs require data-efficient solutions. While drivers expect an up-to-date system, carmakers are usually concerned about the data cost required for map management.
  • 16. 98% of trips are driven within a 150 km radius. 99.8% of trips are driven within a 1000 km radius.
  • 17. When radius is 0 km • User drives within 2 regions every weekday • Radius of 0 km • Download and install just home regions • Cellular data usage kept to a minimum
  • 18. When radius is 150 km • User drives within 2 update regions every weekday • Radius of 150 km • Home region: 6 update regions • Cellular data usage increased
  • 21. Real results using 0.5M trips. “This insight has led me to the conclusion that a default radius of 150 km is unnecessary, and a small radius of ~10 km would already satisfy most drivers while keeping cellular data usage low for OEMs.” - Rolf Dorland, PM at TomTom
  • 22. Going on holidays • User goes on holiday (less frequently updated region) • Once the user starts driving, updates for all update regions the route goes through are downloaded and installed.
  • 25. Opportunity • Past: rule-based solution • Delta Lake pipelines • Present: machine learning
  • 26. Data • Original trace data from 1 source: 227K device serials • Filtering out invalid trips: 143K device serials • Users with at least 50 trips: 3.6K device serials • Devices feasible for modelling: 2.5K device serials
  • 27. Features: What do we use in the end? For each trip, we have the following information: • Where did the trip start? • At what speed were you driving when the trip started? • What was the time of day (morning/afternoon/evening) when the trip started? • Was it rush hour when the trip started? • What day of the week was it? • Was it a weekend day? • What was the season? • Which driver profile do you belong to? Historical information: • Which destination did you go to on your last trip? And the one before that? And the one before that? • If it is, say, a Monday, where did you go the last Monday you made a trip? (Do this for every weekday.) To predict: To which destination are you going?
  • 28. Labels: How do we define where you are going? • We are given the latitude and longitude of the destination of a trip. • To find out which latitudes and longitudes belong to the same destination, we apply a clustering algorithm called DBSCAN. • DBSCAN clusters together destinations that are within 500 meters of each other. We need at least 5 trips to a destination in order to call it a cluster.
  • 30. Train, validation and test split. Data for 1 driver (Trip ID, Date, Destination): Trip 1, January 1, Cluster1; Trip 2, January 22, Cluster2; Trip 3, February 3, Cluster1; Trip 4, February 15, Cluster2; Trip 5, March 2, Cluster1; Trip 6, March 14, Cluster1; Trip 7, March 27, Cluster2; Trip 8, April 4, Cluster1; Trip 9, April 16, Cluster2; Trip 10, May 8, Cluster1. The trips are divided into a train & validation dataset and a test dataset. Time-series cross-validation: iterative evaluation of the trips to avoid overfitting. Iteration 1: Trip 1, January 1, Cluster1; Trip 2, January 22, Cluster2; Trip 3, February 3, ? Iteration 2: Trip 1, January 1, Cluster1; Trip 2, January 22, Cluster2; Trip 3, February 3, Cluster1; Trip 4, February 15, ? Last iteration: Trip 1, January 1, Cluster1; Trip 2, January 22, Cluster2; Trip 3, February 3, Cluster1; Trip 4, February 15, Cluster1; …; Trip 10, May 8, ?
  • 33. Majority baseline: distribution of precision on the test set with a majority baseline classifier
  • 34. Results: distribution of precision on the test set with a tuned classifier
  • 35. Accelerating the future of mobility by embracing Apache Spark, Databricks and the Azure cloud
  • 36. DON’T FORGET TO RATE AND REVIEW THE SESSIONS SEARCH SPARK + AI SUMMIT
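
The labelling step on slide 28 (DBSCAN over trip destinations, 500 meters, at least 5 trips) could look roughly like the sketch below. This is not the presenters' code: it is a minimal pure-Python DBSCAN written for illustration, and it assumes great-circle (haversine) distance, a neighbourhood count that includes the point itself, and a label of -1 for trips that belong to no cluster. The coordinates are made up. At the scale described in the talk one would more likely rely on a library or a distributed implementation.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def dbscan(points, eps_m=500.0, min_pts=5):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    NOISE, UNVISITED = -1, None
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Brute-force scan over all points; the result includes i itself.
        return [j for j, q in enumerate(points)
                if haversine_m(points[i], q) <= eps_m]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but does not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical trip destinations: two frequently visited places and one stray trip.
home = [(52.3760 + k * 0.0001, 4.9000) for k in range(5)]
work = [(52.4200 + k * 0.0001, 4.9500) for k in range(5)]
stray = [(52.3000, 4.7000)]
labels = dbscan(home + work + stray)
print(labels)
```

Each non-negative label then serves as the destination class for the trips that end there, while trips labelled -1 have no recurring destination to predict.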
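
The iterative evaluation on slide 30 and the majority baseline on slide 33 can be sketched together. The following is an illustrative reading of those slides, not the presenters' implementation: every trip is predicted using only the trips before it, and the baseline always predicts the destination seen most often so far. The function names, the `min_history` parameter, and the tie-breaking rule are assumptions; the trip sequence is the ten-trip example for one driver from slide 30. The score here is the plain fraction of correct predictions, which is simpler than the per-driver precision reported in the talk.

```python
from collections import Counter

# Destinations of the ten example trips for one driver, in date order (slide 30).
trips = ["Cluster1", "Cluster2", "Cluster1", "Cluster2", "Cluster1",
         "Cluster1", "Cluster2", "Cluster1", "Cluster2", "Cluster1"]


def expanding_window_splits(n_trips, min_history=2):
    """Yield (history_indices, target_index); history always precedes the target."""
    for target in range(min_history, n_trips):
        yield range(target), target


def majority_predict(history):
    """Predict the most frequent past destination (ties: first seen wins)."""
    return Counter(history).most_common(1)[0][0]


hits = 0
total = 0
for history_idx, target in expanding_window_splits(len(trips)):
    history = [trips[i] for i in history_idx]
    prediction = majority_predict(history)
    hits += prediction == trips[target]
    total += 1

print(f"majority baseline: {hits}/{total} correct ({hits / total:.3f})")
```

Because the history window only ever grows forward in time, no prediction can see a later trip, which is the property that an ordinary shuffled split would lose.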