SlideShare une entreprise Scribd logo
1  sur  39
Télécharger pour lire hors ligne
Extracting information from images using
deep learning and transfer learning.
Application to personalized recommendations.
Introduction and Context
Iterative building of a
recommender system
Labeling Images
Pragmatic deep learning for
dummies
Post Processing
AKA: Image for BI on steroids
Outline
Results
More images !
Dataiku
•  Founded in 2013
•  90 + employees, 100 + clients
•  Paris, New-York, London, San Francisco,
Singapore
Data Science Software Editor of Dataiku DSS
DESIGN
Load and prepare
your data
PREPARE
Build your
models
MODEL
Visualize and share
your work
ANALYSE
Re-execute your
workflow at
ease
AUTOMATE
Follow your production
environment
MONITOR
Get predictions
in real time
SCORE
PRODUCTIO
N
Client Key Figures
E-business vacation retailer
Negotiate the best prize for their clients
Discount luxury
Sale Image is paramount
Purchase is impulsive
18 Millions of clients.
Hundreds of sales opened everyday
Specificities
Highly temporary sales
-> Classical recommender system
fail
-> Time event linked (Christmas, ski,
summer)
Expensive Product
-> Few recurrent buyers
-> Appearance counts a lot
Iterative Building of a Recommender System
Basic Recommendation Engines
Other Factors
One Meta Model to Rule Them All
Recommenders	
  as	
  
features	
  
Machine	
  learning	
  to	
  
op5mize	
  purchasing	
  
probability	
  
Combine	
  
Recommend	
  
Describe	
  
Recommender system for Home Page Ordering
Cleaning, combining
and enrichment of
data
Recommendation
Engines
Optimization of
home display
the application
automatically runs and
compiles heterogeneous
data
Generation of
recommendations based
on user behaviour
Every customer is
shown the 10
sales he is the
most likely to buy
Customer visits
Purchases
Sales Images
Metal model combine
recommendations to
directly optimize
purchasing probability
Meta Model
+7% revenue
Sales information
(A/B testing)
Batch Scoring every night
Why use Image ?
We want do distinguish
« Sun
and
Beach »
« Ski »
A picture is worth a thousand words
Integrating Image Information
Sales Images Labelling
Model
Pool + Palm Trees
Hotel
+ Mountains
Pool + Forest + Hotel
+ Sea
Sea + Beach +Forest +
Hotel
Sales
descriptions
vector
CONTENT	
  BASED	
  
Recommender
System
Image Labelling For Recommendation Engine
Pragma&c	
  Deep	
  learning	
  for	
  “Dummies”	
  
Using Deep Learning models
Common Issues
“I don’t have GPUs server” “I don’t have a deep leaning expert”
“I don’t have labelled data”
(or too few)
“I don’t have the time to wait for model training ”
I don’t want to pay to pay for private apis” / “I’m afraid their labelling will change over time”
“I don’t have (or few) labelled data”
-> Is there similar data ?
Solution 1 : Pre trained models
PLACES	
  DATABASE	
  US	
   SUN	
  DATABASE	
  
205	
  categories	
  
2.5	
  M	
  images	
  
307	
  categories	
  
110	
  K	
  images	
  
tower: 0.53
skyscraper: 0.26
swimming_pool/outdoor: 0.65
inn/outdoor: 0.06
Solution 1 : Pre trained models
If there is open data, there is an open pre trained model !
•  Kudos to the community
•  Check the licensing
Example	
  with	
  Places	
  (Caffe	
  Model	
  Zoo)	
  :	
  
	
  
Solution 2 : Transfer Learning
Credit	
  :	
  	
  Fei-­‐Fei	
  Li	
  &	
  Andrej	
  Karpathy	
  &	
  Jus5n	
  Johnson	
  hYp://cs231n.stanford.edu/slides/winter1516_lecture11.pdf	
  
Solution 2 : Transfer Learning
Use the network as a feature extractor
Fine tune the model Keras makes it super easy !
(Possible to freeze layer to train just the top) hYps://github.com/PGu5/Deep_Learning_tuto	
  
PLACES	
  DATABASE	
   US	
  SUN	
  DATABASE	
  
Training	
  
(op5onal)	
  
Pre-­‐trained	
  model	
  
VGG16	
  
tower: 0.53
skyscraper: 0.26
Re-­‐Training	
  
Transferred	
  Data	
  :	
  
Last	
  convolu5onal	
  
layer	
  features	
  
Re-­‐trained	
  model	
  
TensorFlow	
  
2	
  fully	
  connected	
  layers	
  
Caffe	
  
Model	
  Zoo	
  
	
  
GPU	
  
CPU	
  
GPU	
  
Leverage existing knowledge !
Solution 2 : Transfer Learning
Accuracy:	
  72%,	
  Top-­‐5	
  Acc:	
  90	
  %	
  >	
  state	
  of	
  the	
  art	
  on	
  dataset	
  alone	
  
Transfer	
  
Post Treatment & Results
(Or how we transfer the labelling information)
Using	
  Images	
  informa&on	
  for	
  BI	
  on	
  steroids	
  	
  
Labels post-processing
Complementary information Redondant information
Issue with our approach:
Solution : Matrix Factorization
Topic extraction with Non-Negative Matrix Factorization
•  Non Negative Matrix factorization (NMF)
X = WH
•  X : image x tags, non negative
•  W : image x theme
•  H : theme x tag
(scikit learn implementation)
•  Most represented Themes
•  Swimming-pool_Apartment_Putting-green
•  Ocean_Coast_SandBar
•  Coast_SeaCliff_RockArch
•  Beach_Coast_BoardWalk
•  Bridge_Viaduc_River
•  Palace_BuildingFacade-Mansion
•  Castle_Mansion_Monastery
•  HotelRoom_Bedroom_DormRoom
•  Dimension Reduction
•  200x200 pixels -> 600 tags => 30
themes
•  Faster content based filtering
•  Image often sparse combination of
themes
Faster content based filtering
•  Each theme has the same explication
power
Balanced vector for content based
•  Explicability
Each theme corresponds to a few labels
Image content detection
Topic scores determine the importance of topics in an image
TOPIC	
   TOPIC	
  SCORE	
  
(%)	
  
Golf	
  course	
  –	
  Fairway	
  –	
  PuHng	
  green	
   31	
  
Hotel	
  –	
  Inn	
  –	
  Apartment	
  building	
  
outdoor	
  
30	
  
Swimming	
  pool	
  –	
  Lido	
  Deck	
  –	
  Hot	
  tub	
  
outdoor	
  
22	
  
Beach	
  –	
  Coast	
  -­‐	
  Harbor	
   17	
  
TOPIC	
   TOPIC	
  SCORE	
  (%)	
  
Tower	
  –	
  Skyscraper	
  –	
  Office	
  building	
   62	
  
Bridge	
  –	
  River	
  –	
  Viaduct	
   38	
  
Note on model performance
•  Images labels are used for similarity
Calling herb field “putting green”:
•  Is not important if all herbs field are called this way.
•  Would be if we had lot’s of golf trips sales.
•  Improving the NN performance ?
•  Labels are used in NMF and reduced to themes
•  Themes are used to calculate similarities for CB
recommenders
•  CB Recommenders are used as a feature in meta model
•  Meta model give probabilities of purchase = order
•  Users only check 10 sales…
-> what is the change of online performance for 1% accuracy ?
Results
Results ?
1) Visits :
•  France and Morocco
•  Pool displayed
2) First Recommendation
•  Mostly France &
Mediterranean
•  Fails to display pools
3) Only Images recommendation
•  Pool all around the world
•  Does not respect budget
4) Third column = Right Mix
Results ?
1) Visits :
•  Spain
•  Sun & Beach
•  Pool displayed
2) First Recommendation
•  Displays nature…
3) Only Images ?
•  Pool all around the world
4) Third = Right Mix
•  Get the bungalow
feature !
Learned along the way
What’s next ?
AYrac5veness	
  =	
  %	
  visits	
  with	
  tag	
  /	
  %	
  sales	
  with	
  tag	
  -­‐1	
  	
  
For	
  ski	
  sales,	
  indoor	
  pictures	
  performs	
  beYer!	
  
	
  
What’s Next ?
Kenya
Prague
Berlin
Cambodia
Conclusion
Do iterative data science !
Start simple and grow
Evaluate at each steps
Image labelling = BI on steroids
Transfer Learning
Kick-start your project
Gain time and money
Any Data Scientist can do it
Deep Learning
Don’t start from scratch !
Is there existing data ?
Is there a pre-trained model ?
Thank you for your attention !
Basic Recommendation Engines
•  Implementation
•  Everything in SQL
-  Vertica
-  Then Impala
-  Then Spark
•  Collaborative filtering
•  Score(user, future sale j) = Sum_{i:visited sale} sim(i,j)
•  Content based filtering
•  Sale profile: (sale_id, feature, value)
•  User profile: (user_id, feature, value)
•  Sparse coding + join does the trick
Join	
  on	
  
user_id	
  
Join	
  on	
  user_id	
  +	
  sales	
  id	
  
One Meta Model to Rule Them All
•  Negative sampling
•  Take all purchases tuples : (user, product, timestamp)-> 1
•  Select 5 sales open at the same date the user did not buy -> 0
•  The model directly optimize purchasing probability
•  Machine learning model
•  Features : recommender systems.
•  Logistic Regression
Regularizing effect : we don’t want to overfit leaks.
•  Reranking approach.
Similar to Google or Yandex (Kaggle challenge)
Classification and not Detection problem…
•  Only have probabilities of each class
•  Selecting based on probability threshold fails
•  Keeping all information is not sparse
Labels post-processing
Deep/Transfer Learning
5-10 tags per images2s/image with CPU
x20 speed up with GPU
Our images
Cafe
-> we keep 5 labels
With probabilities per image
Solution 3 : What about APIs ?
Solution 3 : What about APIs ?
•  Price
•  Their cost
often rather cheap. Ex: 100 K request for less than 300$
•  VS the one of redeveloping (probably not as well)
•  Full Database scoring
•  APIs are often limited query per month.
•  Make sure to be able to avoid cold start problem
•  Stability
•  Use model versioning
•  Avoid covariate shift, distribution drift
What about APIs ? Use for generating labels !
How to steal model:
•  1) Score part of the database for training
•  2) Train a model
•  3) Score your entire database !
(Or don’t, it’s illegal)
But I have only 5000 requests ?
-> Use Transfer Learning !
Tramèr, Florian, et al. "Stealing Machine Learning Models via Prediction APIs." arXiv
preprint arXiv:1609.02943 (2016).
What about APIs ? Use for generating labels !
Experiment:
•  5000 requests on API. (4500/500 split)
•  Transfer learning with MIT Places Pre-trained Model
•  Scikit learn Multilabel model
(Or don’t, it’s illegal)
(demo, not used in any real project)
What about APIs ? Results
Precision	
   75	
  
Label	
   Probability	
   Label	
   Probability	
  
landscape 1,0000 sunset 0,9998
sky 1,0000 no person 0,9996
outdoors 1,0000 water 0,9990
nature 1,0000 park 0,9849
rock 1,0000 river 0,9678
travel 1,0000 scenic 0,8031
Label	
   Probability	
   Label	
   Probability	
  
beach 1,0000 ocean 1,0000
summer 1,0000 relaxation 1,0000
sand 1,0000 island 1,0000
tropical 1,0000 idyllic 1,0000
travel 1,0000 seashore 0,9998
seascape 1,0000 water 0,9997
(demo, not used in any real project)
Recall	
   80	
  Accuracy	
   95	
  

Contenu connexe

Tendances

Machine Learning Pipelines
Machine Learning PipelinesMachine Learning Pipelines
Machine Learning Pipelinesjeykottalam
 
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...Sri Ambati
 
Sysml 2019 demo_paper
Sysml 2019 demo_paperSysml 2019 demo_paper
Sysml 2019 demo_paperstrange_loop
 
Porting R Models into Scala Spark
Porting R Models into Scala SparkPorting R Models into Scala Spark
Porting R Models into Scala Sparkcarl_pulley
 
An introduction to Machine Learning with scikit-learn (October 2018)
An introduction to Machine Learning with scikit-learn (October 2018)An introduction to Machine Learning with scikit-learn (October 2018)
An introduction to Machine Learning with scikit-learn (October 2018)Julien SIMON
 
ML Infra for Netflix Recommendations - AI NEXTCon talk
ML Infra for Netflix Recommendations - AI NEXTCon talkML Infra for Netflix Recommendations - AI NEXTCon talk
ML Infra for Netflix Recommendations - AI NEXTCon talkFaisal Siddiqi
 
The Killer Feature Store: Orchestrating Spark ML Pipelines and MLflow for Pro...
The Killer Feature Store: Orchestrating Spark ML Pipelines and MLflow for Pro...The Killer Feature Store: Orchestrating Spark ML Pipelines and MLflow for Pro...
The Killer Feature Store: Orchestrating Spark ML Pipelines and MLflow for Pro...Databricks
 
Using Machine Learning & Artificial Intelligence to Create Impactful Customer...
Using Machine Learning & Artificial Intelligence to Create Impactful Customer...Using Machine Learning & Artificial Intelligence to Create Impactful Customer...
Using Machine Learning & Artificial Intelligence to Create Impactful Customer...Costanoa Ventures
 
“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOpsRui Quintino
 
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at UberKarthik Murugesan
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...Spark Summit
 
Apache Liminal (Incubating)—Orchestrate the Machine Learning Pipeline
Apache Liminal (Incubating)—Orchestrate the Machine Learning PipelineApache Liminal (Incubating)—Orchestrate the Machine Learning Pipeline
Apache Liminal (Incubating)—Orchestrate the Machine Learning PipelineDatabricks
 
Balancing Automation and Explanation in Machine Learning
Balancing Automation and Explanation in Machine LearningBalancing Automation and Explanation in Machine Learning
Balancing Automation and Explanation in Machine LearningDatabricks
 
MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...
MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...
MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...Databricks
 
Netflix Recommendations Feature Engineering with Time Travel
Netflix Recommendations Feature Engineering with Time TravelNetflix Recommendations Feature Engineering with Time Travel
Netflix Recommendations Feature Engineering with Time TravelFaisal Siddiqi
 
Feature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsFeature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsAndrzej Michałowski
 
AI Modernization at AT&T and the Application to Fraud with Databricks
AI Modernization at AT&T and the Application to Fraud with DatabricksAI Modernization at AT&T and the Application to Fraud with Databricks
AI Modernization at AT&T and the Application to Fraud with DatabricksDatabricks
 
Zipline - A Declarative Feature Engineering Framework
Zipline - A Declarative Feature Engineering FrameworkZipline - A Declarative Feature Engineering Framework
Zipline - A Declarative Feature Engineering FrameworkDatabricks
 
Bootstrapping of PySpark Models for Factorial A/B Tests
Bootstrapping of PySpark Models for Factorial A/B TestsBootstrapping of PySpark Models for Factorial A/B Tests
Bootstrapping of PySpark Models for Factorial A/B TestsDatabricks
 
NLP Text Recommendation System Journey to Automated Training
NLP Text Recommendation System Journey to Automated TrainingNLP Text Recommendation System Journey to Automated Training
NLP Text Recommendation System Journey to Automated TrainingDatabricks
 

Tendances (20)

Machine Learning Pipelines
Machine Learning PipelinesMachine Learning Pipelines
Machine Learning Pipelines
 
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
 
Sysml 2019 demo_paper
Sysml 2019 demo_paperSysml 2019 demo_paper
Sysml 2019 demo_paper
 
Porting R Models into Scala Spark
Porting R Models into Scala SparkPorting R Models into Scala Spark
Porting R Models into Scala Spark
 
An introduction to Machine Learning with scikit-learn (October 2018)
An introduction to Machine Learning with scikit-learn (October 2018)An introduction to Machine Learning with scikit-learn (October 2018)
An introduction to Machine Learning with scikit-learn (October 2018)
 
ML Infra for Netflix Recommendations - AI NEXTCon talk
ML Infra for Netflix Recommendations - AI NEXTCon talkML Infra for Netflix Recommendations - AI NEXTCon talk
ML Infra for Netflix Recommendations - AI NEXTCon talk
 
The Killer Feature Store: Orchestrating Spark ML Pipelines and MLflow for Pro...
The Killer Feature Store: Orchestrating Spark ML Pipelines and MLflow for Pro...The Killer Feature Store: Orchestrating Spark ML Pipelines and MLflow for Pro...
The Killer Feature Store: Orchestrating Spark ML Pipelines and MLflow for Pro...
 
Using Machine Learning & Artificial Intelligence to Create Impactful Customer...
Using Machine Learning & Artificial Intelligence to Create Impactful Customer...Using Machine Learning & Artificial Intelligence to Create Impactful Customer...
Using Machine Learning & Artificial Intelligence to Create Impactful Customer...
 
“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps
 
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
 
Apache Liminal (Incubating)—Orchestrate the Machine Learning Pipeline
Apache Liminal (Incubating)—Orchestrate the Machine Learning PipelineApache Liminal (Incubating)—Orchestrate the Machine Learning Pipeline
Apache Liminal (Incubating)—Orchestrate the Machine Learning Pipeline
 
Balancing Automation and Explanation in Machine Learning
Balancing Automation and Explanation in Machine LearningBalancing Automation and Explanation in Machine Learning
Balancing Automation and Explanation in Machine Learning
 
MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...
MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...
MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestratio...
 
Netflix Recommendations Feature Engineering with Time Travel
Netflix Recommendations Feature Engineering with Time TravelNetflix Recommendations Feature Engineering with Time Travel
Netflix Recommendations Feature Engineering with Time Travel
 
Feature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsFeature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systems
 
AI Modernization at AT&T and the Application to Fraud with Databricks
AI Modernization at AT&T and the Application to Fraud with DatabricksAI Modernization at AT&T and the Application to Fraud with Databricks
AI Modernization at AT&T and the Application to Fraud with Databricks
 
Zipline - A Declarative Feature Engineering Framework
Zipline - A Declarative Feature Engineering FrameworkZipline - A Declarative Feature Engineering Framework
Zipline - A Declarative Feature Engineering Framework
 
Bootstrapping of PySpark Models for Factorial A/B Tests
Bootstrapping of PySpark Models for Factorial A/B TestsBootstrapping of PySpark Models for Factorial A/B Tests
Bootstrapping of PySpark Models for Factorial A/B Tests
 
NLP Text Recommendation System Journey to Automated Training
NLP Text Recommendation System Journey to Automated TrainingNLP Text Recommendation System Journey to Automated Training
NLP Text Recommendation System Journey to Automated Training
 

Similaire à Extracting information from images using deep learning and transfer learning — Pierre Gutierrez (dataiku labs) @PAPIs Connect — São Paulo 2017

Pragmatic deep learning for image labelling
Pragmatic deep learning for image labellingPragmatic deep learning for image labelling
Pragmatic deep learning for image labellingPierre Gutierrez
 
From Labelling Open data images to building a private recommender system
From Labelling Open data images to building a private recommender systemFrom Labelling Open data images to building a private recommender system
From Labelling Open data images to building a private recommender systemPierre Gutierrez
 
Cutting Edge Computer Vision for Everyone
Cutting Edge Computer Vision for EveryoneCutting Edge Computer Vision for Everyone
Cutting Edge Computer Vision for EveryoneIvo Andreev
 
Building High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning ApplicationsBuilding High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning ApplicationsYalçın Yenigün
 
Machine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville MeetupMachine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville MeetupSri Ambati
 
The Frontier of Deep Learning in 2020 and Beyond
The Frontier of Deep Learning in 2020 and BeyondThe Frontier of Deep Learning in 2020 and Beyond
The Frontier of Deep Learning in 2020 and BeyondNUS-ISS
 
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Databricks
 
Agile experiments in Machine Learning with F#
Agile experiments in Machine Learning with F#Agile experiments in Machine Learning with F#
Agile experiments in Machine Learning with F#J On The Beach
 
Making Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableMaking Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableJustin Basilico
 
Agile Experiments in Machine Learning
Agile Experiments in Machine LearningAgile Experiments in Machine Learning
Agile Experiments in Machine Learningmathias-brandewinder
 
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...Lucidworks
 
Big Data Analysis : Deciphering the haystack
Big Data Analysis : Deciphering the haystack Big Data Analysis : Deciphering the haystack
Big Data Analysis : Deciphering the haystack Srinath Perera
 
ICTER 2014 Invited Talk: Large Scale Data Processing in the Real World: from ...
ICTER 2014 Invited Talk: Large Scale Data Processing in the Real World: from ...ICTER 2014 Invited Talk: Large Scale Data Processing in the Real World: from ...
ICTER 2014 Invited Talk: Large Scale Data Processing in the Real World: from ...Srinath Perera
 
Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Turi, Inc.
 
CM UTaipei Kaggle Share
CM UTaipei Kaggle ShareCM UTaipei Kaggle Share
CM UTaipei Kaggle Share志明 陳
 
Alex mang patterns for scalability in microsoft azure application
Alex mang   patterns for scalability in microsoft azure applicationAlex mang   patterns for scalability in microsoft azure application
Alex mang patterns for scalability in microsoft azure applicationCodecamp Romania
 
The paradox of big data - dataiku / oxalide APEROTECH
The paradox of big data - dataiku / oxalide APEROTECHThe paradox of big data - dataiku / oxalide APEROTECH
The paradox of big data - dataiku / oxalide APEROTECHDataiku
 
“Machine Learning in Production + Case Studies” by Dmitrijs Lvovs from Epista...
“Machine Learning in Production + Case Studies” by Dmitrijs Lvovs from Epista...“Machine Learning in Production + Case Studies” by Dmitrijs Lvovs from Epista...
“Machine Learning in Production + Case Studies” by Dmitrijs Lvovs from Epista...DevClub_lv
 
Web based interactive big data visualization
Web based interactive big data visualizationWeb based interactive big data visualization
Web based interactive big data visualizationWenli Zhang
 
Large scale computing
Large scale computing Large scale computing
Large scale computing Bhupesh Bansal
 

Similaire à Extracting information from images using deep learning and transfer learning — Pierre Gutierrez (dataiku labs) @PAPIs Connect — São Paulo 2017 (20)

Pragmatic deep learning for image labelling
Pragmatic deep learning for image labellingPragmatic deep learning for image labelling
Pragmatic deep learning for image labelling
 
From Labelling Open data images to building a private recommender system
From Labelling Open data images to building a private recommender systemFrom Labelling Open data images to building a private recommender system
From Labelling Open data images to building a private recommender system
 
Cutting Edge Computer Vision for Everyone
Cutting Edge Computer Vision for EveryoneCutting Edge Computer Vision for Everyone
Cutting Edge Computer Vision for Everyone
 
Building High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning ApplicationsBuilding High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning Applications
 
Machine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville MeetupMachine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville Meetup
 
The Frontier of Deep Learning in 2020 and Beyond
The Frontier of Deep Learning in 2020 and BeyondThe Frontier of Deep Learning in 2020 and Beyond
The Frontier of Deep Learning in 2020 and Beyond
 
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
 
Agile experiments in Machine Learning with F#
Agile experiments in Machine Learning with F#Agile experiments in Machine Learning with F#
Agile experiments in Machine Learning with F#
 
Making Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableMaking Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms Reliable
 
Agile Experiments in Machine Learning
Agile Experiments in Machine LearningAgile Experiments in Machine Learning
Agile Experiments in Machine Learning
 
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
Query-time Nonparametric Regression with Temporally Bounded Models - Patrick ...
 
Big Data Analysis : Deciphering the haystack
Big Data Analysis : Deciphering the haystack Big Data Analysis : Deciphering the haystack
Big Data Analysis : Deciphering the haystack
 
ICTER 2014 Invited Talk: Large Scale Data Processing in the Real World: from ...
ICTER 2014 Invited Talk: Large Scale Data Processing in the Real World: from ...ICTER 2014 Invited Talk: Large Scale Data Processing in the Real World: from ...
ICTER 2014 Invited Talk: Large Scale Data Processing in the Real World: from ...
 
Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015
 
CM UTaipei Kaggle Share
CM UTaipei Kaggle ShareCM UTaipei Kaggle Share
CM UTaipei Kaggle Share
 
Alex mang patterns for scalability in microsoft azure application
Alex mang   patterns for scalability in microsoft azure applicationAlex mang   patterns for scalability in microsoft azure application
Alex mang patterns for scalability in microsoft azure application
 
The paradox of big data - dataiku / oxalide APEROTECH
The paradox of big data - dataiku / oxalide APEROTECHThe paradox of big data - dataiku / oxalide APEROTECH
The paradox of big data - dataiku / oxalide APEROTECH
 
“Machine Learning in Production + Case Studies” by Dmitrijs Lvovs from Epista...
“Machine Learning in Production + Case Studies” by Dmitrijs Lvovs from Epista...“Machine Learning in Production + Case Studies” by Dmitrijs Lvovs from Epista...
“Machine Learning in Production + Case Studies” by Dmitrijs Lvovs from Epista...
 
Web based interactive big data visualization
Web based interactive big data visualizationWeb based interactive big data visualization
Web based interactive big data visualization
 
Large scale computing
Large scale computing Large scale computing
Large scale computing
 

Plus de PAPIs.io

Shortening the time from analysis to deployment with ml as-a-service — Luiz A...
Shortening the time from analysis to deployment with ml as-a-service — Luiz A...Shortening the time from analysis to deployment with ml as-a-service — Luiz A...
Shortening the time from analysis to deployment with ml as-a-service — Luiz A...PAPIs.io
 
Feature engineering — HJ Van Veen (Nubank) @@PAPIs Connect — São Paulo 2017
Feature engineering — HJ Van Veen (Nubank) @@PAPIs Connect — São Paulo 2017Feature engineering — HJ Van Veen (Nubank) @@PAPIs Connect — São Paulo 2017
Feature engineering — HJ Van Veen (Nubank) @@PAPIs Connect — São Paulo 2017PAPIs.io
 
Discovering the hidden treasure of data using graph analytic — Ana Paula Appe...
Discovering the hidden treasure of data using graph analytic — Ana Paula Appe...Discovering the hidden treasure of data using graph analytic — Ana Paula Appe...
Discovering the hidden treasure of data using graph analytic — Ana Paula Appe...PAPIs.io
 
Deep learning for sentiment analysis — André Barbosa (elo7) @PAPIs Connect — ...
Deep learning for sentiment analysis — André Barbosa (elo7) @PAPIs Connect — ...Deep learning for sentiment analysis — André Barbosa (elo7) @PAPIs Connect — ...
Deep learning for sentiment analysis — André Barbosa (elo7) @PAPIs Connect — ...PAPIs.io
 
Battery log data mining — Ramon Oliveira (Datart) @PAPIs Connect — São Paulo ...
Battery log data mining — Ramon Oliveira (Datart) @PAPIs Connect — São Paulo ...Battery log data mining — Ramon Oliveira (Datart) @PAPIs Connect — São Paulo ...
Battery log data mining — Ramon Oliveira (Datart) @PAPIs Connect — São Paulo ...PAPIs.io
 
A tensorflow recommending system for news — Fabrício Vargas Matos (Hearst tv)...
A tensorflow recommending system for news — Fabrício Vargas Matos (Hearst tv)...A tensorflow recommending system for news — Fabrício Vargas Matos (Hearst tv)...
A tensorflow recommending system for news — Fabrício Vargas Matos (Hearst tv)...PAPIs.io
 
Real-world applications of AI - Daniel Hulme @ PAPIs Connect
Real-world applications of AI - Daniel Hulme @ PAPIs ConnectReal-world applications of AI - Daniel Hulme @ PAPIs Connect
Real-world applications of AI - Daniel Hulme @ PAPIs ConnectPAPIs.io
 
Past, Present and Future of AI: a Fascinating Journey - Ramon Lopez de Mantar...
Past, Present and Future of AI: a Fascinating Journey - Ramon Lopez de Mantar...Past, Present and Future of AI: a Fascinating Journey - Ramon Lopez de Mantar...
Past, Present and Future of AI: a Fascinating Journey - Ramon Lopez de Mantar...PAPIs.io
 
Revolutionizing Offline Retail Pricing & Promotions with ML - Daniel Guhl @ P...
Revolutionizing Offline Retail Pricing & Promotions with ML - Daniel Guhl @ P...Revolutionizing Offline Retail Pricing & Promotions with ML - Daniel Guhl @ P...
Revolutionizing Offline Retail Pricing & Promotions with ML - Daniel Guhl @ P...PAPIs.io
 
Demystifying Deep Learning - Roberto Paredes Palacios @ PAPIs Connect
Demystifying Deep Learning - Roberto Paredes Palacios @ PAPIs ConnectDemystifying Deep Learning - Roberto Paredes Palacios @ PAPIs Connect
Demystifying Deep Learning - Roberto Paredes Palacios @ PAPIs ConnectPAPIs.io
 
Predictive APIs: What about Banking? - Natalino Busa @ PAPIs Connect
Predictive APIs: What about Banking? - Natalino Busa @ PAPIs ConnectPredictive APIs: What about Banking? - Natalino Busa @ PAPIs Connect
Predictive APIs: What about Banking? - Natalino Busa @ PAPIs ConnectPAPIs.io
 
Microdecision making in financial services - Greg Lamp @ PAPIs Connect
Microdecision making in financial services - Greg Lamp @ PAPIs ConnectMicrodecision making in financial services - Greg Lamp @ PAPIs Connect
Microdecision making in financial services - Greg Lamp @ PAPIs ConnectPAPIs.io
 
Engineering the Future of Our Choice with General AI - JoEllen Lukavec Koeste...
Engineering the Future of Our Choice with General AI - JoEllen Lukavec Koeste...Engineering the Future of Our Choice with General AI - JoEllen Lukavec Koeste...
Engineering the Future of Our Choice with General AI - JoEllen Lukavec Koeste...PAPIs.io
 
Distributed deep learning with spark on AWS - Vincent Van Steenbergen @ PAPIs...
Distributed deep learning with spark on AWS - Vincent Van Steenbergen @ PAPIs...Distributed deep learning with spark on AWS - Vincent Van Steenbergen @ PAPIs...
Distributed deep learning with spark on AWS - Vincent Van Steenbergen @ PAPIs...PAPIs.io
 
How to predict the future of shopping - Ulrich Kerzel @ PAPIs Connect
How to predict the future of shopping - Ulrich Kerzel @ PAPIs ConnectHow to predict the future of shopping - Ulrich Kerzel @ PAPIs Connect
How to predict the future of shopping - Ulrich Kerzel @ PAPIs ConnectPAPIs.io
 
The emergent opportunity of Big Data for Social Good - Nuria Oliver @ PAPIs C...
The emergent opportunity of Big Data for Social Good - Nuria Oliver @ PAPIs C...The emergent opportunity of Big Data for Social Good - Nuria Oliver @ PAPIs C...
The emergent opportunity of Big Data for Social Good - Nuria Oliver @ PAPIs C...PAPIs.io
 
Automating Machine Learning Workflows: A Report from the Trenches - Jose A. O...
Automating Machine Learning Workflows: A Report from the Trenches - Jose A. O...Automating Machine Learning Workflows: A Report from the Trenches - Jose A. O...
Automating Machine Learning Workflows: A Report from the Trenches - Jose A. O...PAPIs.io
 
Machine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
Machine Learning Services Benchmark - Inês Almeida @ PAPIs ConnectMachine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
Machine Learning Services Benchmark - Inês Almeida @ PAPIs ConnectPAPIs.io
 
Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...
Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...
Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...PAPIs.io
 
How to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs ConnectHow to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs ConnectPAPIs.io
 

Plus de PAPIs.io (20)

Shortening the time from analysis to deployment with ml as-a-service — Luiz A...
Shortening the time from analysis to deployment with ml as-a-service — Luiz A...Shortening the time from analysis to deployment with ml as-a-service — Luiz A...
Shortening the time from analysis to deployment with ml as-a-service — Luiz A...
 
Feature engineering — HJ Van Veen (Nubank) @@PAPIs Connect — São Paulo 2017
Feature engineering — HJ Van Veen (Nubank) @@PAPIs Connect — São Paulo 2017Feature engineering — HJ Van Veen (Nubank) @@PAPIs Connect — São Paulo 2017
Feature engineering — HJ Van Veen (Nubank) @@PAPIs Connect — São Paulo 2017
 
Discovering the hidden treasure of data using graph analytic — Ana Paula Appe...
Discovering the hidden treasure of data using graph analytic — Ana Paula Appe...Discovering the hidden treasure of data using graph analytic — Ana Paula Appe...
Discovering the hidden treasure of data using graph analytic — Ana Paula Appe...
 
Deep learning for sentiment analysis — André Barbosa (elo7) @PAPIs Connect — ...
Deep learning for sentiment analysis — André Barbosa (elo7) @PAPIs Connect — ...Deep learning for sentiment analysis — André Barbosa (elo7) @PAPIs Connect — ...
Deep learning for sentiment analysis — André Barbosa (elo7) @PAPIs Connect — ...
 
Battery log data mining — Ramon Oliveira (Datart) @PAPIs Connect — São Paulo ...
Battery log data mining — Ramon Oliveira (Datart) @PAPIs Connect — São Paulo ...Battery log data mining — Ramon Oliveira (Datart) @PAPIs Connect — São Paulo ...
Battery log data mining — Ramon Oliveira (Datart) @PAPIs Connect — São Paulo ...
 
A tensorflow recommending system for news — Fabrício Vargas Matos (Hearst tv)...
A tensorflow recommending system for news — Fabrício Vargas Matos (Hearst tv)...A tensorflow recommending system for news — Fabrício Vargas Matos (Hearst tv)...
A tensorflow recommending system for news — Fabrício Vargas Matos (Hearst tv)...
 
Real-world applications of AI - Daniel Hulme @ PAPIs Connect
Real-world applications of AI - Daniel Hulme @ PAPIs ConnectReal-world applications of AI - Daniel Hulme @ PAPIs Connect
Real-world applications of AI - Daniel Hulme @ PAPIs Connect
 
Past, Present and Future of AI: a Fascinating Journey - Ramon Lopez de Mantar...
Past, Present and Future of AI: a Fascinating Journey - Ramon Lopez de Mantar...Past, Present and Future of AI: a Fascinating Journey - Ramon Lopez de Mantar...
Past, Present and Future of AI: a Fascinating Journey - Ramon Lopez de Mantar...
 
Revolutionizing Offline Retail Pricing & Promotions with ML - Daniel Guhl @ P...
Revolutionizing Offline Retail Pricing & Promotions with ML - Daniel Guhl @ P...Revolutionizing Offline Retail Pricing & Promotions with ML - Daniel Guhl @ P...
Revolutionizing Offline Retail Pricing & Promotions with ML - Daniel Guhl @ P...
 
Demystifying Deep Learning - Roberto Paredes Palacios @ PAPIs Connect
Demystifying Deep Learning - Roberto Paredes Palacios @ PAPIs ConnectDemystifying Deep Learning - Roberto Paredes Palacios @ PAPIs Connect
Demystifying Deep Learning - Roberto Paredes Palacios @ PAPIs Connect
 
Predictive APIs: What about Banking? - Natalino Busa @ PAPIs Connect
Predictive APIs: What about Banking? - Natalino Busa @ PAPIs ConnectPredictive APIs: What about Banking? - Natalino Busa @ PAPIs Connect
Predictive APIs: What about Banking? - Natalino Busa @ PAPIs Connect
 
Microdecision making in financial services - Greg Lamp @ PAPIs Connect
Microdecision making in financial services - Greg Lamp @ PAPIs ConnectMicrodecision making in financial services - Greg Lamp @ PAPIs Connect
Microdecision making in financial services - Greg Lamp @ PAPIs Connect
 
Engineering the Future of Our Choice with General AI - JoEllen Lukavec Koeste...
Engineering the Future of Our Choice with General AI - JoEllen Lukavec Koeste...Engineering the Future of Our Choice with General AI - JoEllen Lukavec Koeste...
Engineering the Future of Our Choice with General AI - JoEllen Lukavec Koeste...
 
Distributed deep learning with spark on AWS - Vincent Van Steenbergen @ PAPIs...
Distributed deep learning with spark on AWS - Vincent Van Steenbergen @ PAPIs...Distributed deep learning with spark on AWS - Vincent Van Steenbergen @ PAPIs...
Distributed deep learning with spark on AWS - Vincent Van Steenbergen @ PAPIs...
 
How to predict the future of shopping - Ulrich Kerzel @ PAPIs Connect
How to predict the future of shopping - Ulrich Kerzel @ PAPIs ConnectHow to predict the future of shopping - Ulrich Kerzel @ PAPIs Connect
How to predict the future of shopping - Ulrich Kerzel @ PAPIs Connect
 
The emergent opportunity of Big Data for Social Good - Nuria Oliver @ PAPIs C...
The emergent opportunity of Big Data for Social Good - Nuria Oliver @ PAPIs C...The emergent opportunity of Big Data for Social Good - Nuria Oliver @ PAPIs C...
The emergent opportunity of Big Data for Social Good - Nuria Oliver @ PAPIs C...
 
Automating Machine Learning Workflows: A Report from the Trenches - Jose A. O...
Automating Machine Learning Workflows: A Report from the Trenches - Jose A. O...Automating Machine Learning Workflows: A Report from the Trenches - Jose A. O...
Automating Machine Learning Workflows: A Report from the Trenches - Jose A. O...
 
Machine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
Machine Learning Services Benchmark - Inês Almeida @ PAPIs ConnectMachine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
Machine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
 
Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...
Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...
Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...
 
How to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs ConnectHow to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
 

Dernier

Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 

Dernier (20)

Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

Extracting information from images using deep learning and transfer learning — Pierre Gutierrez (dataiku labs) @PAPIs Connect — São Paulo 2017

  • 1. Extracting information from images using deep learning and transfer learning. Application to personalized recommendations.
  • 2. Introduction and Context Iterative building of a recommender system Labeling Images Pragmatic deep learning for dummies Post Processing AKA: Image for BI on steroids Outline Results More images !
  • 3. Dataiku •  Founded in 2013 •  90 + employees, 100 + clients •  Paris, New-York, London, San Francisco, Singapore Data Science Software Editor of Dataiku DSS DESIGN Load and prepare your data PREPARE Build your models MODEL Visualize and share your work ANALYSE Re-execute your workflow at ease AUTOMATE Follow your production environment MONITOR Get predictions in real time SCORE PRODUCTIO N
  • 4. Client Key Figures E-business vacation retailer Negotiate the best prize for their clients Discount luxury Sale Image is paramount Purchase is impulsive 18 Millions of clients. Hundreds of sales opened everyday
  • 5. Specificities Highly temporary sales -> Classical recommender system fail -> Time event linked (Christmas, ski, summer) Expensive Product -> Few recurrent buyers -> Appearance counts a lot
  • 6. Iterative Building of a Recommender System
  • 9. One Meta Model to Rule Them All Recommenders  as   features   Machine  learning  to   op5mize  purchasing   probability   Combine   Recommend   Describe  
  • 10. Recommender system for Home Page Ordering Cleaning, combining and enrichment of data Recommendation Engines Optimization of home display the application automatically runs and compiles heterogeneous data Generation of recommendations based on user behaviour Every customer is shown the 10 sales he is the most likely to buy Customer visits Purchases Sales Images Metal model combine recommendations to directly optimize purchasing probability Meta Model +7% revenue Sales information (A/B testing) Batch Scoring every night
  • 11. Why use Image ? We want do distinguish « Sun and Beach » « Ski » A picture is worth a thousand words
  • 12. Integrating Image Information Sales Images Labelling Model Pool + Palm Trees Hotel + Mountains Pool + Forest + Hotel + Sea Sea + Beach +Forest + Hotel Sales descriptions vector CONTENT  BASED   Recommender System
  • 13. Image Labelling For Recommendation Engine Pragma&c  Deep  learning  for  “Dummies”  
  • 14. Using Deep Learning models Common Issues “I don’t have GPUs server” “I don’t have a deep leaning expert” “I don’t have labelled data” (or too few) “I don’t have the time to wait for model training ” I don’t want to pay to pay for private apis” / “I’m afraid their labelling will change over time”
  • 15. “I don’t have (or few) labelled data” -> Is there similar data ? Solution 1 : Pre trained models PLACES  DATABASE  US   SUN  DATABASE   205  categories   2.5  M  images   307  categories   110  K  images  
  • 16. tower: 0.53 skyscraper: 0.26 swimming_pool/outdoor: 0.65 inn/outdoor: 0.06 Solution 1 : Pre trained models If there is open data, there is an open pre trained model ! •  Kudos to the community •  Check the licensing Example  with  Places  (Caffe  Model  Zoo)  :    
  • 17. Solution 2 : Transfer Learning Credit  :    Fei-­‐Fei  Li  &  Andrej  Karpathy  &  Jus5n  Johnson  hYp://cs231n.stanford.edu/slides/winter1516_lecture11.pdf  
  • 18. Solution 2 : Transfer Learning Use the network as a feature extractor Fine tune the model Keras makes it super easy ! (Possible to freeze layer to train just the top) hYps://github.com/PGu5/Deep_Learning_tuto  
  • 19. PLACES  DATABASE   US  SUN  DATABASE   Training   (op5onal)   Pre-­‐trained  model   VGG16   tower: 0.53 skyscraper: 0.26 Re-­‐Training   Transferred  Data  :   Last  convolu5onal   layer  features   Re-­‐trained  model   TensorFlow   2  fully  connected  layers   Caffe   Model  Zoo     GPU   CPU   GPU   Leverage existing knowledge ! Solution 2 : Transfer Learning Accuracy:  72%,  Top-­‐5  Acc:  90  %  >  state  of  the  art  on  dataset  alone   Transfer  
  • 20. Post Treatment & Results (Or how we transfer the labelling information) Using  Images  informa&on  for  BI  on  steroids    
  • 21. Labels post-processing Complementary information Redondant information Issue with our approach: Solution : Matrix Factorization
  • 22. Topic extraction with Non-Negative Matrix Factorization •  Non Negative Matrix factorization (NMF) X = WH •  X : image x tags, non negative •  W : image x theme •  H : theme x tag (scikit learn implementation) •  Most represented Themes •  Swimming-pool_Apartment_Putting-green •  Ocean_Coast_SandBar •  Coast_SeaCliff_RockArch •  Beach_Coast_BoardWalk •  Bridge_Viaduc_River •  Palace_BuildingFacade-Mansion •  Castle_Mansion_Monastery •  HotelRoom_Bedroom_DormRoom •  Dimension Reduction •  200x200 pixels -> 600 tags => 30 themes •  Faster content based filtering •  Image often sparse combination of themes Faster content based filtering •  Each theme has the same explication power Balanced vector for content based •  Explicability Each theme corresponds to a few labels
  • 23. Image content detection Topic scores determine the importance of topics in an image TOPIC   TOPIC  SCORE   (%)   Golf  course  –  Fairway  –  PuHng  green   31   Hotel  –  Inn  –  Apartment  building   outdoor   30   Swimming  pool  –  Lido  Deck  –  Hot  tub   outdoor   22   Beach  –  Coast  -­‐  Harbor   17   TOPIC   TOPIC  SCORE  (%)   Tower  –  Skyscraper  –  Office  building   62   Bridge  –  River  –  Viaduct   38  
  • 24. Note on model performance •  Images labels are used for similarity Calling herb field “putting green”: •  Is not important if all herbs field are called this way. •  Would be if we had lot’s of golf trips sales. •  Improving the NN performance ? •  Labels are used in NMF and reduced to themes •  Themes are used to calculate similarities for CB recommenders •  CB Recommenders are used as a feature in meta model •  Meta model give probabilities of purchase = order •  Users only check 10 sales… -> what is the change of online performance for 1% accuracy ?
  • 26. Results ? 1) Visits : •  France and Morocco •  Pool displayed 2) First Recommendation •  Mostly France & Mediterranean •  Fails to display pools 3) Only Images recommendation •  Pool all around the world •  Does not respect budget 4) Third column = Right Mix
  • 27. Results ? 1) Visits : •  Spain •  Sun & Beach •  Pool displayed 2) First Recommendation •  Displays nature… 3) Only Images ? •  Pool all around the world 4) Third = Right Mix •  Get the bungalow feature !
  • 28. Learned along the way What’s next ? AYrac5veness  =  %  visits  with  tag  /  %  sales  with  tag  -­‐1     For  ski  sales,  indoor  pictures  performs  beYer!    
  • 30. Conclusion Do iterative data science ! Start simple and grow Evaluate at each steps Image labelling = BI on steroids Transfer Learning Kick-start your project Gain time and money Any Data Scientist can do it Deep Learning Don’t start from scratch ! Is there existing data ? Is there a pre-trained model ?
  • 31. Thank you for your attention !
  • 32. Basic Recommendation Engines •  Implementation •  Everything in SQL -  Vertica -  Then Impala -  Then Spark •  Collaborative filtering •  Score(user, future sale j) = Sum_{i:visited sale} sim(i,j) •  Content based filtering •  Sale profile: (sale_id, feature, value) •  User profile: (user_id, feature, value) •  Sparse coding + join does the trick Join  on   user_id   Join  on  user_id  +  sales  id  
  • 33. One Meta Model to Rule Them All •  Negative sampling •  Take all purchases tuples : (user, product, timestamp)-> 1 •  Select 5 sales open at the same date the user did not buy -> 0 •  The model directly optimize purchasing probability •  Machine learning model •  Features : recommender systems. •  Logistic Regression Regularizing effect : we don’t want to overfit leaks. •  Reranking approach. Similar to Google or Yandex (Kaggle challenge)
  • 34. Classification and not Detection problem… •  Only have probabilities of each class •  Selecting based on probability threshold fails •  Keeping all information is not sparse Labels post-processing Deep/Transfer Learning 5-10 tags per images2s/image with CPU x20 speed up with GPU Our images Cafe -> we keep 5 labels With probabilities per image
  • 35. Solution 3 : What about APIs ?
  • 36. Solution 3 : What about APIs ? •  Price •  Their cost often rather cheap. Ex: 100 K request for less than 300$ •  VS the one of redeveloping (probably not as well) •  Full Database scoring •  APIs are often limited query per month. •  Make sure to be able to avoid cold start problem •  Stability •  Use model versioning •  Avoid covariate shift, distribution drift
  • 37. What about APIs ? Use for generating labels ! How to steal model: •  1) Score part of the database for training •  2) Train a model •  3) Score your entire database ! (Or don’t, it’s illegal) But I have only 5000 requests ? -> Use Transfer Learning ! Tramèr, Florian, et al. "Stealing Machine Learning Models via Prediction APIs." arXiv preprint arXiv:1609.02943 (2016).
  • 38. What about APIs ? Use for generating labels ! Experiment: •  5000 requests on API. (4500/500 split) •  Transfer learning with MIT Places Pre-trained Model •  Scikit learn Multilabel model (Or don’t, it’s illegal) (demo, not used in any real project)
  • 39. What about APIs ? Results Precision   75   Label   Probability   Label   Probability   landscape 1,0000 sunset 0,9998 sky 1,0000 no person 0,9996 outdoors 1,0000 water 0,9990 nature 1,0000 park 0,9849 rock 1,0000 river 0,9678 travel 1,0000 scenic 0,8031 Label   Probability   Label   Probability   beach 1,0000 ocean 1,0000 summer 1,0000 relaxation 1,0000 sand 1,0000 island 1,0000 tropical 1,0000 idyllic 1,0000 travel 1,0000 seashore 0,9998 seascape 1,0000 water 0,9997 (demo, not used in any real project) Recall   80  Accuracy   95