SlideShare une entreprise Scribd logo
1  sur  41
Télécharger pour lire hors ligne
TOYOTA Customer 360◦
on
Apache SparkTM
DATA DRIVEN
Brian Kursar, Sr Data Scientist
Toyota Motor Sales IT Research and Development
Final
TOYOTA Big Data History
2015
2014
2013
2012
2011
2010
C360 - Next Gen Insights Platform
Over 6B Records
C360 - Customer Experience Analytics
Over 700M Records
C360 - Toyota Social Media Intelligence Center
Over 500M Records
Product Quality Analytics v2
Over 120M Records
Marketing and Incentives Analytics
70M Records
Product Quality Analytics
Over 60M Records
SPARK SUMMIT 2014
TEAM TOYOTA
R&D
Data Engineering
Infrastructure
Enterprise Architecture
Genchi Genbutsu
“Go Look, Go See”
•  Compute
•  Streaming
•  Machine Learning
ACTIONABLE INSIGHTS
Customer Experience original Batch Job
160 hours (6.6 days)
Same job re-written using Apache Spark …
4hours
Data Engineering
Existing Tools
Existing Tools
Jan – Feb Feb - Mar
+1% +1%
+1% +2%
Toyota Social Opinion 2013
40% Retailers Selling Toyota Vehicles
11% Opinions on Marketing Campaigns
10% Feedback on Dealer Sales and Service Experiences
9% Opinions on Product Styling and Features
8% People In Market for a Toyota
8% Incident Reports Involving a Toyota Vehicle
7% Feedback on Product Quality
5% Customers Advocating for the Brand
2% Completely Irrelevant
Toyota Online Conversations by the Numbers
2014
Study
Toyota Online Conversations by the Numbers
40% Retailers Selling Toyota Vehicles
11% Opinions on Marketing Campaigns
10% Feedback on Dealer Sales and Service Experiences
9% Opinions on Product Styling and Features
8% People In Market for a Toyota
8% Incident Reports Involving a Toyota Vehicle
7% Feedback on Product Quality
5% Customers Advocating for the Brand
2% Completely Irrelevant
2014
Study
50%
Noise
Categorize and Prioritize incoming Social Media
interactions in Real-Time using Spark MLLib
Campaign
Opinions
Customer
Feedback
Product
Feedback Noise
First Spark MLlib Experiment
•  Seat Cover Wrinkles/Cracking
•  Brake Noise
•  Shift Quality
•  Oil Leaks
•  HVAC Odor
•  Dead Battery
•  Rodent Wire Harness Damage
•  Paint Chips
Time-box project to 12 Weeks
Classify at min 80% accuracy
Describe this
issue…
NoiseBrakes
When I’m backing up in my 2012 Prius, it sounds like
something hanging up or scraping as it rotates and only
happens in the morning..
I hear a squeak coming from the back wheel of
my Prius as I pull out from my driveway in the
morning.
Extract features from the
Verbatim Text
Identify Training Data for Brake Noise Model
Extract Category (Label)
matching Model Objective
Social ML Pipeline
Natural	
  Language	
  Processing	
  Options
Feature	
  Engineering
Options
Statistical	
  Selection
(Chi	
  Square)
All Top
TopRandomRandom
Popular
Vectorization	
  
Options
Machine	
  Learning	
  
Model
TF*IDF
TF	
  Options
Natural
Boolean
Log
Augmented
IDF	
  Options
Unary
Inverse
Inverse	
  Smooth
ProbDF
Support	
  Vector	
  Machine	
  (MLlib)
Validation	
  Set	
  
(1	
  Fold)
Cross	
  Validation:	
  Repeat	
  10	
  times
Training	
  Set
(9	
  Fold)
Multiple	
  Iteration:	
  Optimal	
  SVM	
  
Parameter
Training	
  selection	
  filters
StopwordsStemming
N-­‐grams	
  
extraction
Cleaning	
  
(#,’,	
  /,	
  @)
Positive	
  &	
  Negative	
  
training	
  Sets
Labeled	
  Data
(Surveys,	
  Hand	
  
Scored	
  Social)
Extracted	
  n-­‐grams
Extract Text Features
Train Predictive Model
Ver 1
56%
Accuracy
Ver 3
36%
Accuracy
Ver 8
35%
Accuracy
False Positives
I just had my friend at the toyota dealer rotate my tires and he
said … that the brake pads are getting thin really fast.
So what should I do when they get too thin in the future and start
to squeak?
False Positives
i cut the iac hose as shown in figure 20 in the manual but when i
start the car, it started gasping for air... choking...
sounds like it's about to die out.
i bought the power brake check valve (80190 part for
kragen)... but either i'm not installing it right or it's the wrong
size... i have no idea.
Solution
Explicit Semantic Analysis
NoiseBrakes
Noise
caliper
pads
rotor
wheel
squeak grind
groan
squeal
Brakes
drum
Noise
squeak grind
groan
squeal
Distance Similarity Between Concepts
pads
caliper
rotor
wheel
drum
Brakes
Distance is calculated between
concepts based on the
Minkowski distance formula.
Ver 9
82%
Accuracy
Kaizen
= Continuous Improvement
Kaizen
Train
TestEvaluate
Refine
Social ML Pipeline
Natural	
  Language	
  Processing	
  Options
Feature	
  Engineering
Options
Statistical	
  Selection
(Chi	
  Square)
All Top
TopRandomRandom
Popular
Vectorization	
  
Options
Machine	
  Learning	
  
Model
TF*IDF
TF	
  Options
Natural
Boolean
Log
Augmented
IDF	
  Options
Unary
Inverse
Inverse	
  Smooth
ProbDF
Support	
  Vector	
  Machine	
  (MLlib)
Validation	
  Set	
  
(1	
  Fold)
Cross	
  Validation:	
  Repeat	
  10	
  times
Training	
  Set
(9	
  Fold)
Multiple	
  Iteration:	
  Optimal	
  SVM	
  
Parameter
Training	
  selection	
  filters
StopwordsStemming
N-­‐grams	
  
extraction
Cleaning	
  
(#,’,	
  /,	
  @)
Positive	
  &	
  Negative	
  
training	
  Sets
Labeled	
  Data
(Surveys,	
  Hand	
  
Scored	
  Social)
Extracted	
  n-­‐grams
Synonyms POS
Feature	
  Manager
Negation	
  Evaluator
Distance	
  Similarity
Manhattan	
  
Distance	
  (L1)
-­‐
|x|+	
  |y|
Euclidean	
  
Distance	
  (L2)
-­‐
√(|x|^	
  2	
  +	
  |y|^	
  2)
Concept	
  
Manager
Features
Concept	
  Interpreter
TODAY
FUTURE
Manufacturing Data
Connected Vehicle Data
Consumer Data
TEAM TOYOTA Spark Tips
•  Education and Inclusion
•  Pace Yourself
•  Design and Plan a Transitional Architecture to Incrementally
Introduce Spark elements into your Applications
•  Use Joda Time for Date Comparisons
TEAM TOYOTA Lessons Learned
•  Be mindful of AKKA versions when trying to Build a new Spark release to a
packaged Hadoop Distribution
•  Use SparkSQL versus DSLs for Joins
•  Remember to configure Memory Fraction based on the size of your data.
Brian Kursar – Sr Data Scientist
Toyota Motor Sales IT Research and Development
@briankursar
Visit us at the TOYOTA
Booth here at the Spark
Summit Today.

Contenu connexe

Tendances

Tendances (6)

Correlation, causation and incrementally recommendation problems at netflix ...
Correlation, causation and incrementally  recommendation problems at netflix ...Correlation, causation and incrementally  recommendation problems at netflix ...
Correlation, causation and incrementally recommendation problems at netflix ...
 
Social media analysis project
Social media analysis projectSocial media analysis project
Social media analysis project
 
RpA introduction
RpA introductionRpA introduction
RpA introduction
 
The Playbook to Blending Product-Led Growth with Sales-Led Growth: Teams, Too...
The Playbook to Blending Product-Led Growth with Sales-Led Growth: Teams, Too...The Playbook to Blending Product-Led Growth with Sales-Led Growth: Teams, Too...
The Playbook to Blending Product-Led Growth with Sales-Led Growth: Teams, Too...
 
Is Customer Success Worth It?
Is Customer Success Worth It? Is Customer Success Worth It?
Is Customer Success Worth It?
 
Marketing Quarterly Review
Marketing Quarterly ReviewMarketing Quarterly Review
Marketing Quarterly Review
 

En vedette

Connected Banking Framework
Connected Banking FrameworkConnected Banking Framework
Connected Banking Framework
Kashif Akram
 

En vedette (20)

Apache® Spark™ MLlib: From Quick Start to Scikit-Learn
Apache® Spark™ MLlib: From Quick Start to Scikit-LearnApache® Spark™ MLlib: From Quick Start to Scikit-Learn
Apache® Spark™ MLlib: From Quick Start to Scikit-Learn
 
Spark streaming + kafka 0.10
Spark streaming + kafka 0.10Spark streaming + kafka 0.10
Spark streaming + kafka 0.10
 
KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafka
 
The Connected Consumer – Real-time Customer 360
The Connected Consumer – Real-time Customer 360The Connected Consumer – Real-time Customer 360
The Connected Consumer – Real-time Customer 360
 
FinQLOUD platform for digital banking
FinQLOUD platform for digital bankingFinQLOUD platform for digital banking
FinQLOUD platform for digital banking
 
How to build an effective omni-channel CRM & Marketing Strategy & 360 custome...
How to build an effective omni-channel CRM & Marketing Strategy & 360 custome...How to build an effective omni-channel CRM & Marketing Strategy & 360 custome...
How to build an effective omni-channel CRM & Marketing Strategy & 360 custome...
 
Connected Banking Framework
Connected Banking FrameworkConnected Banking Framework
Connected Banking Framework
 
Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE)
Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE) Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE)
Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE)
 
The role of Big Data and Modern Data Management in Driving a Customer 360 fro...
The role of Big Data and Modern Data Management in Driving a Customer 360 fro...The role of Big Data and Modern Data Management in Driving a Customer 360 fro...
The role of Big Data and Modern Data Management in Driving a Customer 360 fro...
 
Gartner Customer 360 Summit 2012
Gartner Customer 360 Summit 2012Gartner Customer 360 Summit 2012
Gartner Customer 360 Summit 2012
 
Extended 360 degree view of customer
Extended 360 degree view of customerExtended 360 degree view of customer
Extended 360 degree view of customer
 
Apache Kafka Scalable Message Processing and more!
Apache Kafka Scalable Message Processing and more! Apache Kafka Scalable Message Processing and more!
Apache Kafka Scalable Message Processing and more!
 
Big_data for marketing and sales
Big_data for marketing and salesBig_data for marketing and sales
Big_data for marketing and sales
 
A Customer-Centric Banking Platform Powered by MongoDB
A Customer-Centric Banking Platform Powered by MongoDB A Customer-Centric Banking Platform Powered by MongoDB
A Customer-Centric Banking Platform Powered by MongoDB
 
ANTS - 360 view of your customer - bigdata innovation summit 2016
ANTS - 360 view of your customer - bigdata innovation summit 2016ANTS - 360 view of your customer - bigdata innovation summit 2016
ANTS - 360 view of your customer - bigdata innovation summit 2016
 
Solution Blueprint - Customer 360
Solution Blueprint - Customer 360Solution Blueprint - Customer 360
Solution Blueprint - Customer 360
 
B2B CMO forum summary 2014 03 06
B2B CMO forum summary 2014 03 06B2B CMO forum summary 2014 03 06
B2B CMO forum summary 2014 03 06
 
GDPR: The Catalyst for Customer 360
GDPR: The Catalyst for Customer 360GDPR: The Catalyst for Customer 360
GDPR: The Catalyst for Customer 360
 
360° View of Your Customers
360° View of Your Customers360° View of Your Customers
360° View of Your Customers
 
Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE)
Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE)Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE)
Customer Event Hub – a modern Customer 360° view with DataStax Enterprise (DSE)
 

Similaire à Data Driven-Toyota Customer 360 Insights on Apache Spark and MLlib-(Brian Kursar, Toyota)

Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at Netflix
Justin Basilico
 

Similaire à Data Driven-Toyota Customer 360 Insights on Apache Spark and MLlib-(Brian Kursar, Toyota) (20)

Rahul Bhuman, Tech Mahindra - Truck roll prediction using Driverless AI - H2O...
Rahul Bhuman, Tech Mahindra - Truck roll prediction using Driverless AI - H2O...Rahul Bhuman, Tech Mahindra - Truck roll prediction using Driverless AI - H2O...
Rahul Bhuman, Tech Mahindra - Truck roll prediction using Driverless AI - H2O...
 
2019 12 14 Global AI Bootcamp - Auto ML with Machine Learning.Net
2019 12 14 Global AI Bootcamp   - Auto ML with Machine Learning.Net2019 12 14 Global AI Bootcamp   - Auto ML with Machine Learning.Net
2019 12 14 Global AI Bootcamp - Auto ML with Machine Learning.Net
 
2020 09 24 - CONDG ML.Net
2020 09 24 - CONDG ML.Net2020 09 24 - CONDG ML.Net
2020 09 24 - CONDG ML.Net
 
2019 12 19 Mississauga .Net User Group - Machine Learning.Net and Auto ML
2019 12 19 Mississauga .Net User Group - Machine Learning.Net and Auto ML2019 12 19 Mississauga .Net User Group - Machine Learning.Net and Auto ML
2019 12 19 Mississauga .Net User Group - Machine Learning.Net and Auto ML
 
Certification Study Group - NLP & Recommendation Systems on GCP Session 5
Certification Study Group - NLP & Recommendation Systems on GCP Session 5Certification Study Group - NLP & Recommendation Systems on GCP Session 5
Certification Study Group - NLP & Recommendation Systems on GCP Session 5
 
Big Data Day LA 2015 - Feature Engineering by Brian Kursar of Toyota
Big Data Day LA 2015 - Feature Engineering by Brian Kursar of ToyotaBig Data Day LA 2015 - Feature Engineering by Brian Kursar of Toyota
Big Data Day LA 2015 - Feature Engineering by Brian Kursar of Toyota
 
2020 01 21 Data Platform Geeks - Machine Learning.Net
2020 01 21 Data Platform Geeks - Machine Learning.Net2020 01 21 Data Platform Geeks - Machine Learning.Net
2020 01 21 Data Platform Geeks - Machine Learning.Net
 
2019 09 05 Global AI Night Toronto - Machine Learning.Net
2019 09 05 Global AI Night Toronto - Machine Learning.Net2019 09 05 Global AI Night Toronto - Machine Learning.Net
2019 09 05 Global AI Night Toronto - Machine Learning.Net
 
2020 02 29 TechDay Conf - Getting started with Machine Learning.Net
2020 02 29 TechDay Conf - Getting started with Machine Learning.Net2020 02 29 TechDay Conf - Getting started with Machine Learning.Net
2020 02 29 TechDay Conf - Getting started with Machine Learning.Net
 
Data Con LA 2022 - AutoDC + AutoML = your AI development superpower
Data Con LA 2022 - AutoDC + AutoML = your AI development superpowerData Con LA 2022 - AutoDC + AutoML = your AI development superpower
Data Con LA 2022 - AutoDC + AutoML = your AI development superpower
 
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...
 
Microsoft Introduction to Automated Machine Learning
Microsoft Introduction to Automated Machine LearningMicrosoft Introduction to Automated Machine Learning
Microsoft Introduction to Automated Machine Learning
 
Balancing Automation and Explanation in Machine Learning
Balancing Automation and Explanation in Machine LearningBalancing Automation and Explanation in Machine Learning
Balancing Automation and Explanation in Machine Learning
 
B4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningB4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearning
 
2020 04 10 Catch IT - Getting started with ML.Net
2020 04 10 Catch IT - Getting started with ML.Net2020 04 10 Catch IT - Getting started with ML.Net
2020 04 10 Catch IT - Getting started with ML.Net
 
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
 
Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)
 
2020 04 04 NetCoreConf - Machine Learning.Net
2020 04 04 NetCoreConf - Machine Learning.Net2020 04 04 NetCoreConf - Machine Learning.Net
2020 04 04 NetCoreConf - Machine Learning.Net
 
Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at Netflix
 
Building an ML model with zero code
Building an ML model with zero codeBuilding an ML model with zero code
Building an ML model with zero code
 

Plus de Spark Summit

Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Spark Summit
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Spark Summit
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
Spark Summit
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Spark Summit
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Spark Summit
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
Spark Summit
 

Plus de Spark Summit (20)

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim Simeonov
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir Volk
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
 

Dernier

Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
AroojKhan71
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
amitlee9823
 

Dernier (20)

Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 

Data Driven-Toyota Customer 360 Insights on Apache Spark and MLlib-(Brian Kursar, Toyota)

  • 1. TOYOTA Customer 360◦ on Apache SparkTM DATA DRIVEN Brian Kursar, Sr Data Scientist Toyota Motor Sales IT Research and Development Final
  • 2. TOYOTA Big Data History 2015 2014 2013 2012 2011 2010 C360 - Next Gen Insights Platform Over 6B Records C360 - Customer Experience Analytics Over 700M Records C360 - Toyota Social Media Intelligence Center Over 500M Records Product Quality Analytics v2 Over 120M Records Marketing and Incentives Analytics 70M Records Product Quality Analytics Over 60M Records
  • 3. SPARK SUMMIT 2014 TEAM TOYOTA R&D Data Engineering Infrastructure Enterprise Architecture
  • 5. •  Compute •  Streaming •  Machine Learning ACTIONABLE INSIGHTS
  • 6. Customer Experience original Batch Job 160 hours (6.6 days) Same job re-written using Apache Spark … 4hours Data Engineering
  • 8. Existing Tools Jan – Feb Feb - Mar +1% +1% +1% +2% Toyota Social Opinion 2013
  • 9. 40% Retailers Selling Toyota Vehicles 11% Opinions on Marketing Campaigns 10% Feedback on Dealer Sales and Service Experiences 9% Opinions on Product Styling and Features 8% People In Market for a Toyota 8% Incident Reports Involving a Toyota Vehicle 7% Feedback on Product Quality 5% Customers Advocating for the Brand 2% Completely Irrelevant Toyota Online Conversations by the Numbers 2014 Study
  • 10. Toyota Online Conversations by the Numbers 40% Retailers Selling Toyota Vehicles 11% Opinions on Marketing Campaigns 10% Feedback on Dealer Sales and Service Experiences 9% Opinions on Product Styling and Features 8% People In Market for a Toyota 8% Incident Reports Involving a Toyota Vehicle 7% Feedback on Product Quality 5% Customers Advocating for the Brand 2% Completely Irrelevant 2014 Study 50% Noise
  • 11. Categorize and Prioritize incoming Social Media interactions in Real-Time using Spark MLLib Campaign Opinions Customer Feedback Product Feedback Noise
  • 12.
  • 13. First Spark MLlib Experiment •  Seat Cover Wrinkles/Cracking •  Brake Noise •  Shift Quality •  Oil Leaks •  HVAC Odor •  Dead Battery •  Rodent Wire Harness Damage •  Paint Chips Time-box project to 12 Weeks Classify at min 80% accuracy
  • 16. When I’m backing up in my 2012 Prius, it sounds like something hanging up or scraping as it rotates and only happens in the morning.. I hear a squeak coming from the back wheel of my Prius as I pull out from my driveway in the morning.
  • 17. Extract features from the Verbatim Text Identify Training Data for Brake Noise Model Extract Category (Label) matching Model Objective
  • 18. Social ML Pipeline Natural  Language  Processing  Options Feature  Engineering Options Statistical  Selection (Chi  Square) All Top TopRandomRandom Popular Vectorization   Options Machine  Learning   Model TF*IDF TF  Options Natural Boolean Log Augmented IDF  Options Unary Inverse Inverse  Smooth ProbDF Support  Vector  Machine  (MLlib) Validation  Set   (1  Fold) Cross  Validation:  Repeat  10  times Training  Set (9  Fold) Multiple  Iteration:  Optimal  SVM   Parameter Training  selection  filters StopwordsStemming N-­‐grams   extraction Cleaning   (#,’,  /,  @) Positive  &  Negative   training  Sets Labeled  Data (Surveys,  Hand   Scored  Social) Extracted  n-­‐grams
  • 21.
  • 25. False Positives I just had my friend at the toyota dealer rotate my tires and he said … that the brake pads are getting thin really fast. So what should I do when they get too thin in the future and start to squeak?
  • 26. False Positives i cut the iac hose as shown in figure 20 in the manual but when i start the car, it started gasping for air... choking... sounds like it's about to die out. i bought the power brake check valve (80190 part for kragen)... but either i'm not installing it right or it's the wrong size... i have no idea.
  • 31. Noise squeak grind groan squeal Distance Similarity Between Concepts pads caliper rotor wheel drum Brakes Distance is calculated between concepts based on the Minkowski distance formula.
  • 33.
  • 36. Social ML Pipeline Natural  Language  Processing  Options Feature  Engineering Options Statistical  Selection (Chi  Square) All Top TopRandomRandom Popular Vectorization   Options Machine  Learning   Model TF*IDF TF  Options Natural Boolean Log Augmented IDF  Options Unary Inverse Inverse  Smooth ProbDF Support  Vector  Machine  (MLlib) Validation  Set   (1  Fold) Cross  Validation:  Repeat  10  times Training  Set (9  Fold) Multiple  Iteration:  Optimal  SVM   Parameter Training  selection  filters StopwordsStemming N-­‐grams   extraction Cleaning   (#,’,  /,  @) Positive  &  Negative   training  Sets Labeled  Data (Surveys,  Hand   Scored  Social) Extracted  n-­‐grams Synonyms POS Feature  Manager Negation  Evaluator Distance  Similarity Manhattan   Distance  (L1) -­‐ |x|+  |y| Euclidean   Distance  (L2) -­‐ √(|x|^  2  +  |y|^  2) Concept   Manager Features Concept  Interpreter
  • 37. TODAY
  • 39. TEAM TOYOTA Spark Tips •  Education and Inclusion •  Pace Yourself •  Design and Plan a Transitional Architecture to Incrementally Introduce Spark elements into your Applications •  Use Joda Time for Date Comparisons
  • 40. TEAM TOYOTA Lessons Learned •  Be mindful of AKKA versions when trying to Build a new Spark release to a packaged Hadoop Distribution •  Use SparkSQL versus DSLs for Joins •  Remember to configure Memory Fraction based on the size of your data.
  • 41. Brian Kursar – Sr Data Scientist Toyota Motor Sales IT Research and Development @briankursar Visit us at the TOYOTA Booth here at the Spark Summit Today.