Building Personalized
Data Products with Dato
Trey Causey
trey@dato.com
Questions?
• Now: We are monitoring the chat window
• Later: Email me at trey@dato.com
• dato.com
What are data products?
• Products that produce and consume data.
• Products that improve as they produce and
consume data.
• Products that use data to provide a personalized
experience.
• Personalized experiences increase engagement
and retention.
What data?
• You probably already have this data
• Usage logs, transaction data, etc.
• Need a way to turn this existing data into
an intelligent application
Recommender systems
• Personalized experiences through
recommendations
• Recommend products, social network
connections, events, songs, and more
• Implicitly and explicitly drive many of the
experiences you’re familiar with
Recommender uses
• Netflix, Spotify, LinkedIn, and Facebook are the
most visible examples
• “You May Also Like”
“People You May Know”
“People to Follow”
• Also silently power many other experiences
• Product listings, up-sell options, add-ons
• Netflix Prize: $1MM for a 10% improvement in
rating predictions
What data do you need?
• Required for implicit data
• User identifier
• Product identifier
• That’s it!
• Further customization
• Ratings (explicit data), counts
• Side data
Implicit data
• User x product
interactions
• Consumed / used /
clicked / etc.
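
A minimal sketch of the implicit-data format described above: nothing but (user, product) pairs, here turned into per-user interaction counts. The log contents and the name `build_interactions` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical usage log: one (user_id, product_id) pair per interaction.
log = [
    ("alice", "song_a"),
    ("alice", "song_b"),
    ("bob", "song_b"),
    ("bob", "song_c"),
    ("alice", "song_a"),  # repeat interactions are common in real logs
]

def build_interactions(pairs):
    """Map each user to a dict of product -> interaction count."""
    interactions = defaultdict(lambda: defaultdict(int))
    for user, product in pairs:
        interactions[user][product] += 1
    # Convert to plain dicts so the result is easy to inspect and compare.
    return {user: dict(items) for user, items in interactions.items()}

interactions = build_interactions(log)
```

The counts are optional: dropping them leaves the pure "consumed or not" signal, while keeping them gives the "counts" customization mentioned earlier.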
How do recommenders work?
• Most basic: item similarity
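
One way to sketch item similarity, assuming Jaccard similarity over the sets of users who interacted with each item. The talk does not say which similarity measure is used, so Jaccard is an assumption here, and the function names and toy data are hypothetical.

```python
from collections import defaultdict

def jaccard_item_similarity(pairs):
    """Return {(item_a, item_b): similarity} for items sharing at least one user."""
    users_by_item = defaultdict(set)
    for user, item in pairs:
        users_by_item[item].add(user)
    sims = {}
    items = sorted(users_by_item)
    for idx, a in enumerate(items):
        for b in items[idx + 1:]:
            shared = users_by_item[a] & users_by_item[b]
            if shared:
                union = users_by_item[a] | users_by_item[b]
                sims[(a, b)] = len(shared) / len(union)
    return sims

def recommend(user, pairs, k=3):
    """Score each unseen item by its summed similarity to the user's items."""
    sims = jaccard_item_similarity(pairs)
    seen = {item for u, item in pairs if u == user}
    scores = defaultdict(float)
    for (a, b), s in sims.items():
        if a in seen and b not in seen:
            scores[b] += s
        elif b in seen and a not in seen:
            scores[a] += s
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]

# Hypothetical (user, item) interactions.
pairs = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "B"), ("u3", "C"),
    ("u4", "D"),
]
sims = jaccard_item_similarity(pairs)
recs = recommend("u1", pairs)
```

Here user `u1` has seen A and B, and C is the only unseen item that shares users with either, so it is the only recommendation. Item D shares no users with anything and never gets a score.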
Matrix factorization
• Treat users and products as a giant matrix
with (very) many missing values
• Users have latent factors that describe
how much they like various genres
• Items have latent factors that describe
how strongly they belong to each genre
Matrix factorization
• Turn this into a fill-in-the-missing-value
exercise by learning the latent factors
• Implicit or explicit data
• Part of the winning formula for the Netflix
Prize
• Predict ratings or rankings
Matrix factorization
Fill in the blanks
• Learn the latent factors that minimize
prediction error on the observed values
• Fill in the missing values
• Sort the list by predicted rating &
recommend the unseen items
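
The three steps above can be sketched in plain Python using stochastic gradient descent. This is an illustrative sketch only, not GraphLab Create's implementation; the hyperparameter values, function names, and toy ratings are all assumptions.

```python
import random

def factorize(ratings, n_factors=2, n_epochs=500, lr=0.02, reg=0.02, seed=0):
    """Learn latent factors from observed (user, item, rating) triples via SGD."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    # Small random starting factors for every user and item.
    P = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}
    mean = sum(r for _, _, r in ratings) / len(ratings)
    for _ in range(n_epochs):
        for u, i, r in ratings:
            pred = mean + sum(pf * qf for pf, qf in zip(P[u], Q[i]))
            err = r - pred  # error on an observed value only
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return mean, P, Q

def predict(model, user, item):
    """Fill in one cell of the matrix: global mean plus factor dot product."""
    mean, P, Q = model
    return mean + sum(pf * qf for pf, qf in zip(P[user], Q[item]))

def rmse(model, ratings):
    """Root-mean-square prediction error on the given ratings."""
    se = sum((r - predict(model, u, i)) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5

def recommend(model, ratings, user, k=2):
    """Sort the user's unseen items by predicted rating, highest first."""
    _, _, Q = model
    seen = {i for u, i, _ in ratings if u == user}
    unseen = [i for i in Q if i not in seen]
    return sorted(unseen, key=lambda i: -predict(model, user, i))[:k]

# Hypothetical explicit ratings on a 1-5 scale; each user has one unrated item.
ratings = [
    ("ann", "m1", 5), ("ann", "m2", 4), ("ann", "m3", 1),
    ("bob", "m1", 4), ("bob", "m2", 5), ("bob", "m4", 1),
    ("cat", "m3", 5), ("cat", "m4", 4), ("cat", "m1", 1),
    ("dan", "m3", 4), ("dan", "m4", 5), ("dan", "m2", 2),
]
model = factorize(ratings)
recs = recommend(model, ratings, "ann")
```

Only the observed ratings contribute to the error term. The missing cells are never treated as zeros; they are simply what `predict` fills in afterwards.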
Rankings?
• Often less concerned with predicting
precise scores
• Just want to get the first few items right
• Screen real estate is precious
• Ranking factorization recommender
Side features
• Include information about users
• Geographic, demographic, time of day,
etc.
• Include information about products
• Product subtypes, geographic
availability, etc.
• Help with the cold start problem
How to choose which model?
• Select the appropriate model for your data
(implicit/explicit), decide whether you want
side features, select hyperparameters, tune
them…
• … or let GraphLab Create do it for you and
automatically tune hyperparameters
Evaluation
• Train on a portion of your data
• Test on a held-out portion
• Ratings: RMSE
• Ranking: Precision, recall
• Business metrics
• Evaluate against a popularity baseline
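
A minimal sketch of the metrics listed above, plus a popularity recommender to compare against. The function names, the toy train/held-out split, and the choice of k are hypothetical.

```python
from collections import Counter

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of ratings."""
    se = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return (se / len(actual)) ** 0.5

def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall of the top-k recommendations against held-out items."""
    top_k = recommended[:k]
    hits = len(set(top_k) & set(relevant))
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def popularity_baseline(train_pairs, user, k):
    """Recommend the k most popular training items the user has not seen."""
    seen = {item for u, item in train_pairs if u == user}
    counts = Counter(item for _, item in train_pairs)
    # Most interactions first; ties broken alphabetically for determinism.
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    return [item for item in ranked if item not in seen][:k]

# Hypothetical split: train on these interactions, hold out the rest.
train = [
    ("u1", "A"), ("u2", "A"), ("u3", "A"),
    ("u2", "B"), ("u3", "B"),
    ("u3", "C"),
    ("u1", "D"),
]
held_out = {"u1": ["B", "E"]}

recs = popularity_baseline(train, "u1", k=2)
precision, recall = precision_recall_at_k(recs, held_out["u1"], k=2)
```

A personalized model earns its complexity only if it beats these baseline numbers on the same held-out data.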
Live demo
• Building and deploying a recommender
system with GraphLab Create and Dato
Predictive Services
Thank you!
• dato.com
• @datoinc
• trey@dato.com
