SlideShare une entreprise Scribd logo
1  sur  24
Télécharger pour lire hors ligne
A Microservices Framework
for
Real time Model Scoring
using
Structured Streaming
Vedant Jain
October 2018
#SAISStreaming1
2
About me
• Solutions Architect @ Databricks
• x-Hortonworks, JPMC
3
What will we talk about?
§ Machine learning
§ Model scoring
§ Microservices
§ Structured Streaming
§ Demo
4
Machine Learning
“…what we want is a machine that can learn from experience.”
Types of Machine Learning
• Unsupervised Learning
X Y Z
0.5 -3.4 9.2
-2.3 2.9 3.2
1.3 2.3 -4.5
• Supervised Learning
Types of Machine Learning
Activity
Biking
Standing
Sitting
Labels
X Y Z
0.5 -3.4 9.2
X Y Z
0.5 -3.4 9.2
-2.3 2.9 3.2
1.3 2.3 -4.5
• Supervised Learning
Types of Machine Learning
Activity
Biking
Standing
Sitting
Continuous Learning
Machine Learning Process
Define the
problem
Connect data
source
Acquire Data
Enrich, Transform
Data
Feature
Selection
Build/Train
Model
Serve/Score
Model
9
What will we talk about?
§ Machine learning
§ Model scoring
§ Microservices
§ Structured Streaming
§ Demo
Model Scoring
Propensity: Likelihood of a user to
commit a certain action
Affinity: How similar are two products or
users etc.
Lead: How closely matched lead is to
target profile
Attrition/Churn: Likelihood of a
customer to drop a service and/or start
using a competitor’s service
Credit: Ability of the user to keep
promise if granted access
Anomaly Detection: Identification of
rare or invalid transaction
11
Model Scoring on “Big Data”
Scale Streaming Dynamic
Machine Learning Pipeline
12
• Drop null values
• Calculate moving
averages on 10 minute
window
• Convert to single
parquet
• OneHotEncoder
• RF Classifier
• Cross Tabulation
• K-Means Clustering
• Hyperparameter Tuning
• Pipelines
• Deploy Pipelines to
Docker
• Serve the models using
Python Flask
• Phone/Watch
• Gyroscope/Accelerometer
• CSV
13
What will we talk about?
§ Machine learning
§ Model scoring
§ Microservices
§ Structured Streaming
§ Demo
Microservices
● Properties:
Ø Decomposition: Logic is broken down into multiple independent components
Ø Isolation: Component services are deployed and maintained independently of one another
● Benefits:
ü Reduced Regression Testing time
ü Organizational Autonomy
ü Cloud/On-premise Agnostic
ü Scalability/Reusability
Microservices
Example:
Company A tracks user’s activity through smart
devices and wants to provide tailored content to the
users based on their behavior.
Microservices
X Y Z Activity
0.5 -3.4 9.2 Biking
-2.3 2.9 3.2 Standing
1.3 2.3 -4.5 Sitting
‘Activity’
score
User Bike Stand Sit Cluster
A 0.1 0.3 0.6 0
B 0.0 0.3 0.7 1
Actionable Insights
Web services
Target
content
Affinity Score
Microservices
Microservices
Isolation
Microservices
Scalability
• Consistent
• Scalable
• Fault tolerant
• Complex Event
Processing
Microservices
21
What will we talk about?
§ Machine learning
§ Model scoring
§ Microservices
§ Structured Streaming
§ Demo
Structured Streaming
• High-level streaming API
built on Spark SQL engine
• Runs the same computation
as batch queries in
Datasets/DataFrames
• Event time, windowing,
sessions, sources & sinks
• End-to-end exactly once
semantics
• Late Data Handling
23
ML Limitations in Streaming
• Many models/transformers/estimators are not supported
• Limited to only models built in Spark MLLib
• Not ideal for Continuous Learning
23
24
What will we talk about?
§ Machine Learning
§ Model scoring
§ Microservices
§ Structured Streaming
§ Demo
https://github.com/vedantja/eu_summit_demo

Contenu connexe

Tendances

[CB19] Recent APT attack on crypto exchange employees by Heungsoo Kang
[CB19] Recent APT attack on crypto exchange employees by Heungsoo Kang[CB19] Recent APT attack on crypto exchange employees by Heungsoo Kang
[CB19] Recent APT attack on crypto exchange employees by Heungsoo KangCODE BLUE
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data AnalyticsEMC
 
Lect 1 introduction
Lect 1 introductionLect 1 introduction
Lect 1 introductionhktripathy
 
RNA-seq: A High-resolution View of the Transcriptome
RNA-seq: A High-resolution View of the TranscriptomeRNA-seq: A High-resolution View of the Transcriptome
RNA-seq: A High-resolution View of the TranscriptomeSean Davis
 
Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2
Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2
Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2Daniel Katz
 
My PhD thesis defense presentation
My PhD thesis defense presentationMy PhD thesis defense presentation
My PhD thesis defense presentationSuman Srinivasan
 

Tendances (7)

[CB19] Recent APT attack on crypto exchange employees by Heungsoo Kang
[CB19] Recent APT attack on crypto exchange employees by Heungsoo Kang[CB19] Recent APT attack on crypto exchange employees by Heungsoo Kang
[CB19] Recent APT attack on crypto exchange employees by Heungsoo Kang
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
Lect 1 introduction
Lect 1 introductionLect 1 introduction
Lect 1 introduction
 
RNA-seq: A High-resolution View of the Transcriptome
RNA-seq: A High-resolution View of the TranscriptomeRNA-seq: A High-resolution View of the Transcriptome
RNA-seq: A High-resolution View of the Transcriptome
 
Data science for everyone
Data science for everyoneData science for everyone
Data science for everyone
 
Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2
Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2
Quantitative Methods for Lawyers - Class #19 - Regression Analysis - Part 2
 
My PhD thesis defense presentation
My PhD thesis defense presentationMy PhD thesis defense presentation
My PhD thesis defense presentation
 

Similaire à A Microservices Framework for Real-Time Model Scoring Using Structured Streaming with Vedant Jain

Marwa_Ezzatt_Ahmed_CV
Marwa_Ezzatt_Ahmed_CVMarwa_Ezzatt_Ahmed_CV
Marwa_Ezzatt_Ahmed_CVMarwa Ezzat
 
Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...Sri Ambati
 
Data Management is a Team Sport - IBM
Data Management is a Team Sport - IBMData Management is a Team Sport - IBM
Data Management is a Team Sport - IBMMongoDB
 
1,2,3 … Testing : Is this thing on(line)? with Mike Martin
1,2,3 … Testing : Is this thing on(line)? with Mike Martin1,2,3 … Testing : Is this thing on(line)? with Mike Martin
1,2,3 … Testing : Is this thing on(line)? with Mike MartinNETUserGroupBern
 
Cross Platform Angular 2 and TypeScript Development
Cross Platform Angular 2 and TypeScript DevelopmentCross Platform Angular 2 and TypeScript Development
Cross Platform Angular 2 and TypeScript DevelopmentJeremy Likness
 
Building an ML model with zero code
Building an ML model with zero codeBuilding an ML model with zero code
Building an ML model with zero codeNick Trogh
 
Canada DevOps Summit 2020 Presentation Nov_03_2020
Canada DevOps Summit 2020 Presentation Nov_03_2020Canada DevOps Summit 2020 Presentation Nov_03_2020
Canada DevOps Summit 2020 Presentation Nov_03_2020Varun Manik
 
Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...
Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...
Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...Bhakthi Liyanage
 
Accelerate ML Deployment with H2O Driverless AI on AWS
Accelerate ML Deployment with H2O Driverless AI on AWSAccelerate ML Deployment with H2O Driverless AI on AWS
Accelerate ML Deployment with H2O Driverless AI on AWSSri Ambati
 
5 Steps To Deliver The Fastest Mobile Shopping Experience This Holiday Season
5 Steps To Deliver The Fastest Mobile Shopping Experience This Holiday Season5 Steps To Deliver The Fastest Mobile Shopping Experience This Holiday Season
5 Steps To Deliver The Fastest Mobile Shopping Experience This Holiday SeasonG3 Communications
 
Danny Bickson - Python based predictive analytics with GraphLab Create
Danny Bickson - Python based predictive analytics with GraphLab Create Danny Bickson - Python based predictive analytics with GraphLab Create
Danny Bickson - Python based predictive analytics with GraphLab Create PyData
 
motorized bike j2ee ppt explanation of project
motorized bike j2ee ppt explanation of projectmotorized bike j2ee ppt explanation of project
motorized bike j2ee ppt explanation of projectprabhat kumar
 
Kong Summit 2018 - Microservices: decomposing applications for testability an...
Kong Summit 2018 - Microservices: decomposing applications for testability an...Kong Summit 2018 - Microservices: decomposing applications for testability an...
Kong Summit 2018 - Microservices: decomposing applications for testability an...Chris Richardson
 
Getting Started With Dato - August 2015
Getting Started With Dato - August 2015Getting Started With Dato - August 2015
Getting Started With Dato - August 2015Turi, Inc.
 
Migrating to Microservices Patterns and Technologies (edition 2023)
 Migrating to Microservices Patterns and Technologies (edition 2023) Migrating to Microservices Patterns and Technologies (edition 2023)
Migrating to Microservices Patterns and Technologies (edition 2023)Ahmed Misbah
 
From Monoliths to Services: Paying Your Technical Debt
From Monoliths to Services: Paying Your Technical DebtFrom Monoliths to Services: Paying Your Technical Debt
From Monoliths to Services: Paying Your Technical DebtTechWell
 
Role of artificial intellligence in construction engg & management
Role of artificial intellligence in construction engg & managementRole of artificial intellligence in construction engg & management
Role of artificial intellligence in construction engg & managementKundan Sanap
 

Similaire à A Microservices Framework for Real-Time Model Scoring Using Structured Streaming with Vedant Jain (20)

A Mashup with Backbone
A Mashup with BackboneA Mashup with Backbone
A Mashup with Backbone
 
Marwa_Ezzatt_Ahmed_CV
Marwa_Ezzatt_Ahmed_CVMarwa_Ezzatt_Ahmed_CV
Marwa_Ezzatt_Ahmed_CV
 
Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...
 
Data Management is a Team Sport - IBM
Data Management is a Team Sport - IBMData Management is a Team Sport - IBM
Data Management is a Team Sport - IBM
 
1,2,3 … Testing : Is this thing on(line)? with Mike Martin
1,2,3 … Testing : Is this thing on(line)? with Mike Martin1,2,3 … Testing : Is this thing on(line)? with Mike Martin
1,2,3 … Testing : Is this thing on(line)? with Mike Martin
 
Cross Platform Angular 2 and TypeScript Development
Cross Platform Angular 2 and TypeScript DevelopmentCross Platform Angular 2 and TypeScript Development
Cross Platform Angular 2 and TypeScript Development
 
Building an ML model with zero code
Building an ML model with zero codeBuilding an ML model with zero code
Building an ML model with zero code
 
Canada DevOps Summit 2020 Presentation Nov_03_2020
Canada DevOps Summit 2020 Presentation Nov_03_2020Canada DevOps Summit 2020 Presentation Nov_03_2020
Canada DevOps Summit 2020 Presentation Nov_03_2020
 
Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...
Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...
Integrating Azure Machine Learning and Predictive Analytics with SharePoint O...
 
Accelerate ML Deployment with H2O Driverless AI on AWS
Accelerate ML Deployment with H2O Driverless AI on AWSAccelerate ML Deployment with H2O Driverless AI on AWS
Accelerate ML Deployment with H2O Driverless AI on AWS
 
5 Steps To Deliver The Fastest Mobile Shopping Experience This Holiday Season
5 Steps To Deliver The Fastest Mobile Shopping Experience This Holiday Season5 Steps To Deliver The Fastest Mobile Shopping Experience This Holiday Season
5 Steps To Deliver The Fastest Mobile Shopping Experience This Holiday Season
 
Danny Bickson - Python based predictive analytics with GraphLab Create
Danny Bickson - Python based predictive analytics with GraphLab Create Danny Bickson - Python based predictive analytics with GraphLab Create
Danny Bickson - Python based predictive analytics with GraphLab Create
 
motorized bike j2ee ppt explanation of project
motorized bike j2ee ppt explanation of projectmotorized bike j2ee ppt explanation of project
motorized bike j2ee ppt explanation of project
 
Kong Summit 2018 - Microservices: decomposing applications for testability an...
Kong Summit 2018 - Microservices: decomposing applications for testability an...Kong Summit 2018 - Microservices: decomposing applications for testability an...
Kong Summit 2018 - Microservices: decomposing applications for testability an...
 
Getting Started With Dato - August 2015
Getting Started With Dato - August 2015Getting Started With Dato - August 2015
Getting Started With Dato - August 2015
 
Migrating to Microservices Patterns and Technologies (edition 2023)
 Migrating to Microservices Patterns and Technologies (edition 2023) Migrating to Microservices Patterns and Technologies (edition 2023)
Migrating to Microservices Patterns and Technologies (edition 2023)
 
Ahmed El Mawaziny CV
Ahmed El Mawaziny CVAhmed El Mawaziny CV
Ahmed El Mawaziny CV
 
Conclusion Connect state of IoT 2019 Review io t solutions world congress 2019
Conclusion Connect state of IoT 2019 Review io t solutions world congress 2019Conclusion Connect state of IoT 2019 Review io t solutions world congress 2019
Conclusion Connect state of IoT 2019 Review io t solutions world congress 2019
 
From Monoliths to Services: Paying Your Technical Debt
From Monoliths to Services: Paying Your Technical DebtFrom Monoliths to Services: Paying Your Technical Debt
From Monoliths to Services: Paying Your Technical Debt
 
Role of artificial intellligence in construction engg & management
Role of artificial intellligence in construction engg & managementRole of artificial intellligence in construction engg & management
Role of artificial intellligence in construction engg & management
 

Plus de Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 

Plus de Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Dernier

Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Onlineanilsa9823
 

Dernier (20)

Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 

A Microservices Framework for Real-Time Model Scoring Using Structured Streaming with Vedant Jain

  • 1. A Microservices Framework for Real time Model Scoring using Structured Streaming Vedant Jain October 2018 #SAISStreaming1
  • 2. 2 About me • Solutions Architect @ Databricks • x-Hortonworks, JPMC
  • 3. 3 What will we talk about? § Machine learning § Model scoring § Microservices § Structured Streaming § Demo
  • 4. 4 Machine Learning “…what we want is a machine that can learn from experience.”
  • 5. Types of Machine Learning • Unsupervised Learning
  • 6. X Y Z 0.5 -3.4 9.2 -2.3 2.9 3.2 1.3 2.3 -4.5 • Supervised Learning Types of Machine Learning Activity Biking Standing Sitting Labels X Y Z 0.5 -3.4 9.2
  • 7. X Y Z 0.5 -3.4 9.2 -2.3 2.9 3.2 1.3 2.3 -4.5 • Supervised Learning Types of Machine Learning Activity Biking Standing Sitting
  • 8. Continuous Learning Machine Learning Process Define the problem Connect data source Acquire Data Enrich, Transform Data Feature Selection Build/Train Model Serve/Score Model
  • 9. 9 What will we talk about? § Machine learning § Model scoring § Microservices § Structured Streaming § Demo
  • 10. Model Scoring Propensity: Likelihood of a user to commit a certain action Affinity: How similar are two products or users etc. Lead: How closely matched lead is to target profile Attrition/Churn: Likelihood of a customer to drop a service and/or start using a competitor’s service Credit: Ability of the user to keep promise if granted access Anomaly Detection: Identification of rare or invalid transaction
  • 11. 11 Model Scoring on “Big Data” Scale Streaming Dynamic
  • 12. Machine Learning Pipeline 12 • Drop null values • Calculate moving averages on 10 minute window • Convert to single parquet • OneHotEncoder • RF Classifier • Cross Tabulation • K-Means Clustering • Hyperparameter Tuning • Pipelines • Deploy Pipelines to Docker • Serve the models using Python Flask • Phone/Watch • Gyroscope/Accelerometer • CSV
  • 13. 13 What will we talk about? § Machine learning § Model scoring § Microservices § Structured Streaming § Demo
  • 14. Microservices ● Properties: Ø Decomposition: Logic is broken down into multiple independent components Ø Isolation: Component services are deployed and maintained independently of one another ● Benefits: ü Reduced Regression Testing time ü Organizational Autonomy ü Cloud/On-premise Agnostic ü Scalability/Reusability
  • 15. Microservices Example: Company A tracks user’s activity through smart devices and wants to provide tailored content to the users based on their behavior.
  • 16. Microservices X Y Z Activity 0.5 -3.4 9.2 Biking -2.3 2.9 3.2 Standing 1.3 2.3 -4.5 Sitting ‘Activity’ score User Bike Stand Sit Cluster A 0.1 0.3 0.6 0 B 0.0 0.3 0.7 1 Actionable Insights Web services Target content Affinity Score
  • 19. Microservices Scalability • Consistent • Scalable • Fault tolerant • Complex Event Processing
  • 21. 21 What will we talk about? § Machine learning § Model scoring § Microservices § Structured Streaming § Demo
  • 22. Structured Streaming • High-level streaming API built on Spark SQL engine • Runs the same computation as batch queries in Datasets/DataFrames • Event time, windowing, sessions, sources & sinks • End-to-end exactly once semantics • Late Data Handling
  • 23. 23 ML Limitations in Streaming • Many models/transformers/estimators are not supported • Limited to only models built in Spark MLLib • Not ideal for Continuous Learning 23
  • 24. 24 What will we talk about? § Machine Learning § Model scoring § Microservices § Structured Streaming § Demo https://github.com/vedantja/eu_summit_demo