True Fit Skin Care 
Chang Liu 
Fellow at Insight Data Science 2014
What makes it so hard? Overwhelming information
So many products…
So many reviews…
Reviews can be so long…
So many ingredients…
Time spent
Money wasted
Happiness
32k Reviewers 
• w/ 2+ reviews 
~1200 Products 
• ~80 brands 
• 8 categories 
184k Reviews 
• Rating [1-5] 
• Review text 
• Quick take 
Collaborative Filter using User Reviews from Sephora.com 
[Figure: ratings matrix with reviewers 1 … N as rows and products X, Y, … as columns]
Algorithm: 
• Item-centric collaborative filter 
• Pearson's correlation coefficients to measure pairwise similarity
$$\text{Similarity} = c_{XY} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2 \, \sum_{i=1}^{N} (Y_i - \bar{Y})^2}}$$
$$\text{recommendation score}_{ui} = \frac{\sum_{j}^{M} r_{uj} c_{ij}}{\sum_{j}^{M} |c_{ij}|}$$
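The similarity and recommendation score on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code behind the app: the ratings are made up, `pearson`, `similarity_matrix` and `recommendation_score` are hypothetical helper names, each pair of products is compared only over reviewers who rated both, and the score is normalised by the absolute correlations.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length rating lists (None = no review).

    Assumption: only reviewers who rated both products are compared.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if len(pairs) < 2:
        return 0.0
    mean_x = sum(a for a, _ in pairs) / len(pairs)
    mean_y = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    denom = sqrt(var_x * var_y)
    return cov / denom if denom > 0 else 0.0

def similarity_matrix(columns):
    """columns[j] holds every reviewer's rating of product j."""
    m = len(columns)
    return [[1.0 if i == j else pearson(columns[i], columns[j])
             for j in range(m)] for i in range(m)]

def recommendation_score(user_ratings, sim, item):
    """Similarity-weighted average of the ratings the user has already given."""
    num = norm = 0.0
    for j, r in enumerate(user_ratings):
        if j == item or r is None:
            continue  # skip the candidate itself and unrated products
        num += sim[item][j] * r
        norm += abs(sim[item][j])
    return num / norm if norm > 0 else None

# Made-up toy data: one list per product, one entry per reviewer (1-5 stars).
products = [
    [5, 4, 1, 2],   # product 0
    [4, 5, 2, 1],   # product 1
    [4, 3, 3, 2],   # product 2
]
sim = similarity_matrix(products)
new_user = [5, None, 3]   # has not reviewed product 1
score = recommendation_score(new_user, sim, item=1)
```

Products with the highest score among those the user has not yet reviewed would be the ones to recommend.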
Cross Validation
• 5-fold for reviewer
• Leave-one-out for product
• Accuracy = 86.3% ± 1%
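One way to read the cross-validation scheme above is sketched below. The slides do not define how accuracy was computed, so everything here is an assumption: reviewers are split into 5 folds, similarities are computed from the training reviewers only, each rating of a test reviewer is hidden in turn and predicted from that reviewer's remaining ratings, and a prediction counts as correct when it falls on the same side of a hypothetical `like` threshold as the hidden rating. The data and the names `pearson`, `predict` and `cross_validate` are made up for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation over reviewers who rated both products."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if len(pairs) < 2:
        return 0.0
    mx = sum(a for a, _ in pairs) / len(pairs)
    my = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - mx) * (b - my) for a, b in pairs)
    den = sqrt(sum((a - mx) ** 2 for a, _ in pairs) *
               sum((b - my) ** 2 for _, b in pairs))
    return cov / den if den > 0 else 0.0

def predict(row, sim, item):
    """Weighted average of the reviewer's other ratings; `item` is left out."""
    num = norm = 0.0
    for j, r in enumerate(row):
        if j != item and r is not None:
            num += sim[item][j] * r
            norm += abs(sim[item][j])
    return num / norm if norm > 0 else None

def cross_validate(rows, k=5, like=3.5):
    """rows[u][j] = rating of product j by reviewer u, or None.

    ASSUMPTION: accuracy = share of held-out ratings whose prediction lands
    on the same side of the `like` threshold as the true rating.
    """
    m = len(rows[0])
    fold_accuracy = []
    for fold in range(k):
        train = [r for u, r in enumerate(rows) if u % k != fold]
        test = [r for u, r in enumerate(rows) if u % k == fold]
        cols = [[r[j] for r in train] for j in range(m)]
        sim = [[pearson(cols[i], cols[j]) for j in range(m)] for i in range(m)]
        hits = total = 0
        for row in test:
            for item, actual in enumerate(row):
                if actual is None:
                    continue
                guess = predict(row, sim, item)
                if guess is None:
                    continue
                hits += (guess >= like) == (actual >= like)
                total += 1
        if total:
            fold_accuracy.append(hits / total)
    return fold_accuracy

# Made-up ratings: 10 reviewers x 4 products.
rows = [
    [5, 5, 4, 5], [4, 5, 5, 4], [2, 1, 2, 1], [1, 2, 1, 2], [5, 4, 5, 5],
    [2, 2, 1, 1], [4, 4, 5, 5], [1, 1, 2, 2], [5, 5, 5, 4], [2, 1, 1, 2],
]
accuracies = cross_validate(rows, k=5, like=3.5)
mean_accuracy = sum(accuracies) / len(accuracies)
```

The toy data is deliberately easy to separate, so the resulting number says nothing about the 86.3% reported for the Sephora reviews; the sketch only shows the shape of the procedure.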
Visualize the similarity matrix 
White = high similarity 
Black = low similarity 
Sorted alphabetically by brand
There are structures!
A white square = users' reviews are similar for all products in a brand = strong customer loyalty
There are structures! For example…
Expensive!
“Organic & Natural”
Cost effective
Actionable Insights
For Sephora.com: Send marketing emails to new customers of brands with stronger customer loyalty!
Chang Liu 
PhD. in Civil Engineering @CMU 
J8D8L5@gmail.com 
linkedin.com/in/changliucmu 
github.com/R4trtry
Another Validation
Is the rating a good measure of reviewers’ perspective?
• Trained a naive Bayes classifier for sentiment analysis
• W/ 250 thousand reviews from Birchbox.com
• A website that sends out free samples from smaller brands and gathers massive user reviews

Most common words      Most informative features
Word       Count       Negative        Positive
skin       91349       re-wash         Penny
product    82481       garbage         hook
use        64044       mediocre        gorgeous
love       55691       ketchup         perk
feel       47879       trash           stock
face       42615       unimpressive    glowing
like       41427       survey          splurge
great      34155       ineffective     effortless
really     31672       gag             Christmas
smell      27621       worthless       happily

             Text      Quick take
Precision    95.3%     85.4%
Recall       89.8%     93.1%

“Worth every penny!”
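The sentiment check above can be illustrated with a small naive Bayes classifier written from scratch. This is a sketch under stated assumptions, not the classifier used for the slides: the training snippets are made up (they only echo words from the table), the model is a multinomial naive Bayes with add-one smoothing, `train`, `classify` and `precision_recall` are hypothetical names, and treating "pos" as the class for precision and recall is an assumption because the slide does not say which class its figures refer to.

```python
from collections import Counter
from math import log

def train(docs):
    """docs: list of (text, label). Returns the counts needed for prediction."""
    word_counts = {}            # label -> Counter of words seen with that label
    label_counts = Counter()    # label -> number of training documents
    for text, label in docs:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def classify(text, model):
    """Return the label with the highest log-posterior for `text`."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_logp = None, float("-inf")
    for label, n_docs in label_counts.items():
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)   # add-one smoothing
        logp = log(n_docs / total_docs)             # class prior
        for w in text.lower().split():
            if w in vocab:                          # ignore unseen words
                logp += log((counts[w] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label

def precision_recall(pairs, positive="pos"):
    """pairs: list of (predicted, actual) labels."""
    tp = sum(p == positive and a == positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    fn = sum(p != positive and a == positive for p, a in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up examples echoing words from the slide; not the Birchbox data.
training = [
    ("worth every penny gorgeous glowing skin", "pos"),
    ("love it effortless and gorgeous", "pos"),
    ("garbage mediocre product", "neg"),
    ("worthless trash ineffective on my skin", "neg"),
]
model = train(training)
held_out = [("gorgeous glowing", "pos"), ("mediocre trash", "neg")]
results = [(classify(text, model), label) for text, label in held_out]
precision, recall = precision_recall(results)
```

Comparing the predicted sentiment of a review's text with its star rating is one way to check whether ratings reflect what reviewers actually wrote.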
Algorithm: Item-centric collaborative filter
[Figure: Product X and Product Y, similarity 87.4%]
[Figure: ratings matrix with reviewers 1 … N as rows and products X, Y, … as columns]
M products reviewed by N reviewers
Pairwise similarities are measured by Pearson's correlation coefficients:
$$c_{XY} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2 \, \sum_{i=1}^{N} (Y_i - \bar{Y})^2}}$$
Then weight the ratings based on the correlation coefficients:
$$\text{Score}_i = \frac{\sum_{j}^{M} c_{ij} r_{uj}}{\sum_{j}^{M} |c_{ij}|}$$
$r_{uj}$: User u's preference on item j
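A worked example of the weighted score, using made-up numbers rather than values from the Sephora data: suppose user u has rated two items, with $r_{u1} = 5$ and $r_{u2} = 2$, and candidate item i has correlations $c_{i1} = 0.8$ and $c_{i2} = -0.4$ with them. Then

$$\text{Score}_i = \frac{c_{i1} r_{u1} + c_{i2} r_{u2}}{|c_{i1}| + |c_{i2}|} = \frac{0.8 \times 5 + (-0.4) \times 2}{0.8 + 0.4} = \frac{3.2}{1.2} \approx 2.67$$

The absolute values in the denominator keep the normalisation positive even when some correlations are negative.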

Editor's notes

  1. Hi, my name is Chang. I created True Fit Skin Care, a web app that recommends skin care products for you. I’m not an expert. As you can see, the background is crowded with skin care products in boxes, bottles and jars. This is what it looks like in our bathroom. My wife
  2. Estée Lauder is not doing so well: it’s a bit expensive, so there are actually very few reviews per product. FAB, in contrast, is very cost effective and therefore has pretty good customer loyalty. Origins, on the other hand, makes products with organic and natural ingredients, so customers who like its products are paying for this natural concept.
  3. And this is me. I just finished my PhD in civil engineering at Carnegie Mellon University, where I studied pipe monitoring using a data-driven approach. The image here shows the transmission pipelines across the US.