Conventional recommender systems provide recommendations by aggregating preferences from similar users or matching item characteristics to user profiles. However, these approaches have limitations for eBay due to its long-tailed inventory, sparse data, and heterogeneous items. Unconventional recommendation techniques explored by eBay include clustering items to address sparsity, mining relationships between item clusters from co-purchase data, and emphasizing image quality and explanation in the user experience.
3. eBay Inc. confidential
Problem Definition:
What do conventional Recommender Systems do?
• ‘people provide recommendations as inputs, which the system then
aggregates and directs to appropriate recipients.’ P. Resnick, H. Varian,
1997
• ‘any system that produces individualized recommendations as output or has
the effect of guiding the user in a personalized way to interesting or useful
objects in a large space of possible options.’ R. Burke, 2002
• $\forall c \in C,\; s'_c = \arg\max_{s \in S} u(c, s)$, where C is the set of all users, S is the set of all
possible items, and u is a utility function. G. Adomavicius, A. Tuzhilin, 2005
• ‘A recommender selects the product that if acquired by the buyer maximizes
value of both buyer and seller at a given point in time’ F. Martin, Strands,
2009
4. eBay Inc. confidential
Taxonomy of Recommendation Techniques
(Recommender system types, descriptions, and examples, based on the categorization from G. Adomavicius, A. Tuzhilin, 2005)
• Collaborative Filtering (CF)
– Description: Requires only user-item ratings. Estimates a user’s preference for an item from the preferences of other, similar users.
– Example: ‘People Who Bought This Item Also Bought’
• Content-based Filtering (CBF)
– Description: Uses user profiles and item characteristics. Computes how well they match.
– Example: UserProfile = { keywords of interest }, DocProfile = { associated keywords }, d = argmax J(UserProfile, DocProfile)
• Hybrid
– Description: Combines CF and CBF methods in a myriad of ways.
– Example: Regularize the latent-factor CF with constraints derived from user and item profiles (Zhengdong Lu, 2009)
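The content-based matching rule d = argmax J(UserProfile, DocProfile) can be illustrated with a minimal sketch. The slide does not define J; the sketch assumes it is the Jaccard similarity between keyword sets, and the profiles and listing names are made up for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b| (an assumed meaning of J)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_document(user_profile: set, doc_profiles: dict) -> str:
    """Return the document d that maximizes J(UserProfile, DocProfile_d)."""
    return max(doc_profiles, key=lambda d: jaccard(user_profile, doc_profiles[d]))


# Hypothetical keyword profiles.
user_profile = {"ipod", "nano", "16gb"}
doc_profiles = {
    "listing_a": {"ipod", "nano", "8gb", "blue"},
    "listing_b": {"ipod", "nano", "16gb", "black"},
    "listing_c": {"breathalyzer", "alcohol", "tester"},
}
best = best_document(user_profile, doc_profiles)
```

Any other set or vector similarity (for example cosine over TF-IDF weights) could be substituted for `jaccard` without changing the argmax structure.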
5. eBay Inc. confidential
Collaborative Filtering Algorithms
• Two distinct approaches to Collaborative Filtering – neighborhood methods
and latent factor methods (Y. Koren 2009)
• Neighborhood method – e.g., find other users with similar tastes and find
unseen items that they liked. Can also be done in an item-oriented way.
[Figure: neighborhood method with User 1, User 2, and User 3; Baseball is recommended to User 3.]
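A user-oriented neighborhood method can be sketched as follows. This is only an illustration: the ratings are invented, cosine similarity over co-rated items is one of several possible similarity measures, and the function names are hypothetical.

```python
import math

# Hypothetical user -> {item: rating} data.
ratings = {
    "user1": {"football": 5, "basketball": 4, "baseball": 5},
    "user2": {"football": 4, "basketball": 5, "baseball": 4},
    "user3": {"football": 5, "basketball": 4},
}


def cosine(u: dict, v: dict) -> float:
    """Cosine similarity computed over the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def recommend(target: str, ratings: dict, k: int = 2) -> list:
    """Rank items the target has not seen by the similarity-weighted
    average rating given by the k most similar users."""
    neighbors = sorted(
        ((cosine(ratings[target], r), u) for u, r in ratings.items() if u != target),
        reverse=True,
    )[:k]
    scores, weights = {}, {}
    for sim, user in neighbors:
        if sim <= 0:
            continue  # ignore users with no measurable similarity
        for item, rating in ratings[user].items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    return sorted(scores, key=lambda i: scores[i] / weights[i], reverse=True)


suggestions = recommend("user3", ratings)
```

The item-oriented variant swaps the roles: it computes similarities between item columns and scores unseen items by their similarity to items the user already liked.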
6. eBay Inc. confidential
Collaborative Filtering – Latent Factor Model
• Received a lot of attention during the Netflix competition.
• Related to the Latent Semantic Indexing method published in 1988 by researchers at Bellcore (Bell Communications Research).
• Postulates that an item rating is the inner product of a user factor vector and
an item factor vector (both latent): $\hat{r}_{ui} = q_i^{T} p_u$ (Y. Koren 2009)
• Requires factorization of the user-item ratings matrix:
$R_{U \times I} \approx P_{U \times F} \, Q_{F \times I}$, where U is the number of users, I the number of items, and F the number of factors.
• Missing ratings are estimated during factorization.
• Latent factor models are generally superior to neighborhood models when
compared on the Netflix dataset. As a side note, user-based neighborhood models were
found to be vastly inferior to item-based neighborhood models. (R. Bell, Y. Koren, 2007)
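The factorization of the ratings matrix into a users-by-factors matrix and a factors-by-items matrix can be sketched with plain stochastic gradient descent over the observed ratings. This is a toy illustration, not the method used in the Netflix competition entries: the data, learning rate, regularization, and function names are all assumptions. The item factors are stored as an items-by-factors list (the transpose of the F x I matrix).

```python
import random


def factorize(ratings, n_users, n_items, n_factors=2, lr=0.02, reg=0.02,
              epochs=2000, seed=0):
    """Fit r_ui ~ p_u . q_i by SGD, using only the observed (u, i, r) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pf * qf for pf, qf in zip(P[u], Q[i]))
            for f in range(n_factors):
                pf, qf = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qf - reg * pf)
                Q[i][f] += lr * (err * pf - reg * qf)
    return P, Q


def predict(P, Q, u, i):
    """Estimated rating: inner product of user and item factor vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def rmse(P, Q, ratings):
    """Root mean squared error on the given (u, i, r) triples."""
    return (sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
            / len(ratings)) ** 0.5


# Hypothetical sparse ratings: (user index, item index, rating).
observed = [(0, 0, 5), (0, 1, 3), (0, 3, 1), (1, 0, 4), (1, 2, 1),
            (2, 1, 1), (2, 2, 5), (2, 3, 4), (3, 0, 1), (3, 3, 4)]

P0, Q0 = factorize(observed, 4, 4, epochs=0)  # untrained factors (same seed)
P, Q = factorize(observed, 4, 4)              # trained factors
missing_estimate = predict(P, Q, 1, 1)        # user 1 never rated item 1
```

The estimate for a missing cell falls out of the same inner product used for the observed cells, which is how missing ratings get filled in during factorization.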
7. eBay Inc. confidential
Long-Tailed Recommendations
• User and item distributions are long-tailed in retailer sites like eBay. Naïve
CF systems do not perform well with long-tailed users and items.
• The above problem is also related to the cold start problem.
• Y. Park and A. Tuzhilin divided the user-item ratings data into head and tail parts. They then
used clustering to reduce sparsity in the tail data before applying CF
techniques, and found that the errors on the tail items were significantly
lower. (Y. Park et al., 2008)
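The head/tail idea can be sketched in a few lines. This is a simplified illustration rather than the published algorithm: the threshold, the data, and the hand-written `cluster_of` mapping are assumptions (in practice the mapping would come from a clustering step).

```python
from collections import Counter, defaultdict


def split_head_tail(ratings, min_ratings):
    """Items with at least min_ratings ratings form the head; the rest form the tail."""
    counts = Counter(item for _, item, _ in ratings)
    head = {item for item, n in counts.items() if n >= min_ratings}
    tail = set(counts) - head
    return head, tail


def pool_tail_ratings(ratings, tail, cluster_of):
    """Replace each tail item by its cluster id and average each user's
    ratings per resulting column, which makes the tail data denser."""
    sums = defaultdict(lambda: [0.0, 0])
    for user, item, rating in ratings:
        column = cluster_of[item] if item in tail else item
        sums[(user, column)][0] += rating
        sums[(user, column)][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}


# Hypothetical (user, item, rating) triples.
ratings = [
    ("u1", "ipod", 5), ("u2", "ipod", 4), ("u3", "ipod", 5), ("u4", "ipod", 3),
    ("u1", "canopy_a", 4), ("u1", "canopy_b", 2),
    ("u2", "canopy_b", 2), ("u3", "canopy_b", 3),
]
cluster_of = {"canopy_a": "cl_canopy", "canopy_b": "cl_canopy"}  # assumed clustering output

head, tail = split_head_tail(ratings, min_ratings=4)
pooled = pool_tail_ratings(ratings, tail, cluster_of)
```

Any CF technique can then be run on the pooled data, where the two sparse canopy columns have been merged into one denser cluster column.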
8. eBay Inc. confidential
What is Merchandising in e-Commerce?
• Cross-sell and up-sell inventory
– Accessories, related, similar
– Better deals/options.
• Navigation Aid
– Help users explore other options/choices.
– Browse/Top Trends
• Inspirational
– Pre-purchase decisions
– Post-purchase decisions
• Personalized
– Target recommendations to personalize users’ online shopping experience at
eBay.
[Figure labels: Click-through, Customer interests, Past activity, $$$]
9. eBay Inc. confidential
How different is merch from search?
Search (Pull) vs. Merch (Push)
[Figure: two parallel runtime flows.
Search flow: User; Get signal (Query); Boost signal (Query rewriting); Search index; Select & Rank; Results.
Merch flow: User; Get signal* (Query/Item); Boost signal* (Neighborhood search); Search index; Select & Rank*; Results.]
• Assume users always have queries in mind at each page they visit.
• Assume eBay can read their mind!
• Question #1: Merch =? search results with these queries?
• Question #2: Merch runtime flow =? search runtime flow
10. eBay Inc. confidential
Claim: Merch and Search can share the same flow and
many components (with necessary customization)!
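The claim that search and merchandising can share one flow can be sketched as a single function with pluggable "get signal" and "boost signal" stages. Everything here is hypothetical: the tiny inverted index, the synonym and neighbor tables, and the count-based ranking are stand-ins for the real components.

```python
# Hypothetical inverted index: token -> listing ids.
INDEX = {
    "ipod": ["L1", "L2"],
    "nano": ["L1"],
    "mp3": ["L1", "L2", "L3"],
    "case": ["L4"],
    "charger": ["L5"],
}
SYNONYMS = {"ipod": ["mp3"]}              # used for query rewriting (search)
NEIGHBORS = {"ipod": ["case", "charger"]}  # used for neighborhood search (merch)


def tokens(text):
    """Get signal: turn a query or an item title into tokens."""
    return text.lower().split()


def rewrite_query(signal):
    """Search boost: keep the query tokens and add synonyms."""
    return signal + [s for t in signal for s in SYNONYMS.get(t, [])]


def neighborhood(signal):
    """Merch boost: move from the item's tokens to related tokens."""
    return [n for t in signal for n in NEIGHBORS.get(t, [])]


def run_flow(raw, get_signal, boost_signal, index, top_n=3):
    """Shared flow: get signal, boost signal, then select and rank from the index."""
    signal = boost_signal(get_signal(raw))
    scores = {}
    for token in signal:
        for listing in index.get(token, []):
            scores[listing] = scores.get(listing, 0) + 1
    return sorted(scores, key=lambda l: (-scores[l], l))[:top_n]


search_results = run_flow("iPod nano", tokens, rewrite_query, INDEX)
merch_results = run_flow("Apple iPod Nano 16GB", tokens, neighborhood, INDEX)
```

Only the two plugged-in stages differ between the calls; the index lookup and the select-and-rank step are shared, which is the customization point the claim refers to.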
15. eBay Inc. confidential
Example 4 – a breath analyzer to go with your purchase?
[Screenshot: “Items to go with your purchase” module with a “Feedback on our suggestions” link, showing the item bought (ItemBought) alongside the recommendations. Every listing is Buy It Now and carries a “See suggestions” link.]
• NEW Alcohawk Pro Breathalyzer Alcohol Tester/Test – $99.99
• NEW Alcohawk Precision Breathalyzer Alcohol Tester/Test – $69.99
• AlcoHawk Alcohol Breath Test Breathalyzer PRECISION – $76.91
• Alcohawk Precision Alcohol Breathalyzer Breath Test NEW – $47.00
• AlcoHAWK® PT500 Digital Breathalyzer – $165.99, free shipping
• AlcoHawk Digital Alcohol Tester Breathalyzer ABI – $89.00
• AlcoHAWK PT500 Handheld Breathalyzer Alcohol Test Q3i – $159.99
• AlcoHAWK Digital Alcohol Tester Breathalyzer PT500-GB – $149.99
16. eBay Inc. confidential
Collaborative Filtering: Unconventional
recommendations – Example 5
• You bought a Purple Princess Canopy; would you like to buy more?
19. eBay Inc. confidential
Hard Problem to Tackle: Known Challenges
• Limited catalog coverage due to the long tail of inventory.
• Limited structured data and incomplete profile information.
• Inconsistent image quality from sellers.
• Heavy buyers of similar items skew overall behavior.
• Sellers feel they “own” the listing presentation.
• Sellers get recommendations similar to the items they are selling.
• Items in “other” categories are very unstructured, so recommendations within those categories may look off.
20. eBay Inc. confidential
Challenges
• Non-productized inventory, long tail.
– Product coverage exists for only a few categories.
– The majority of items are ad hoc listings not covered by the catalog
taxonomy.
– Maintaining catalogs is a daunting task for the long tail.
– One-of-a-kind inventory; items are short-lived.
• Unstructured data
– Attribute coverage is minimal
• Sparsity in the transactional data
– Very few purchases for certain kinds of items
21. eBay Inc. confidential
Challenges
– Item-item pairs are supported by even fewer users.
• We may not see users buying both a product and its accessories on eBay.
• Large Data
– Much bigger data set in both users and inventory than other
ecommerce sites.
• Scale
– More than 300M listings.
– More than 10M new items every day
• Varying Levels of Item Description
• APPLE IPOD 16GB BLACK NANO 5TH GENERATION FM VIDEO 16 GB
• 16gb 16 gb Black Apple iPod 5th Gen Generation Nano MP3
22. eBay Inc. confidential
Merchandising Challenges
• Historical transaction data at the user level is very sparse
– Netflix sparsity is 1:100; eBay sparsity is even higher (1:10K).
• User-item ratings are not available
• Unstructured and heterogeneous data
• eBay's uniqueness: formats, catalogs/items etc.
• Requires processing more implicit information:
– Item views, saved searches, tracked items, bids, purchases, clicks
eBay has several unique characteristics in the recommendation space!
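Because explicit ratings are unavailable, implicit events have to be turned into a preference signal. One simple way to do this is sketched below; the event names, the weights, and the additive scoring are assumptions made purely for illustration, not values from the source.

```python
from collections import defaultdict

# Hypothetical weights: stronger intent gets a larger weight.
EVENT_WEIGHTS = {
    "view": 1.0,
    "click": 1.0,
    "watch": 3.0,      # tracked item
    "bid": 4.0,
    "purchase": 5.0,
}


def implicit_scores(events):
    """Aggregate (user, item, event) triples into a pseudo-rating per user-item pair."""
    scores = defaultdict(float)
    for user, item, event in events:
        scores[(user, item)] += EVENT_WEIGHTS.get(event, 0.0)
    return dict(scores)


events = [
    ("u1", "i1", "view"),
    ("u1", "i1", "bid"),
    ("u1", "i2", "view"),
    ("u2", "i1", "purchase"),
]
pseudo_ratings = implicit_scores(events)
```

Saved searches are query-level rather than item-level signals, so they are left out of this item-centric sketch; they would need to be mapped to items or clusters first.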
23. eBay Inc. confidential
Our Guiding principles
• Learn from data as much as possible, and use editorial overrides wherever
the relevant relationships are not learnt. Collaborative filtering can only take
you so far.
• Develop extensible data mining pipeline and models
• Feed forward and feedback
• Establish trust in recommendations
• Context sensitive and personalized
• Develop many approaches, use ensemble methods
• Clarify the interface between item clustering for categorization vs item
clustering for recommendation
• UX Matters !!
24. eBay Inc. confidential
Goals for Merchandising
• Buyer’s perspective
– Relevance, diversity, recent, seasonality, interestingness, inspiration
• Seller’s perspective
– Visibility
• eBay’s perspective
– User satisfaction
– Completeness and efficiency for buyers
– Cross-selling, up-selling, surfacing of long tail
• Guiding Principle
– Big Data, extensive data mining to learn relationships ?
27. eBay Inc. confidential
View Item - Popular Watches
What are the popular items in a category, by one or more content features?
Where else is merchandising shown on eBay?
• Home Page
• My eBay
• Search Results Page
• BID/BIN Confirm
• View Item (Active, Closed)
• Checkout Success
• Sign Out Confirm
• Buyer Site Emails etc.
31. eBay Inc. confidential
Similar Item Recommendations
• User bid on item but was outbid and item has ended
– Show similar items as replacement items.
• User was watching an item that has ended
– Show similar items as replacement items
• User viewed an item but did not make a purchase
– Show similar items to showcase more choices.
– Inject diversity in the recommendation.
• What is item similarity for e-commerce?
– Is it an exact-match replacement?
– Is it a similar product that differs in one or more content features?
• E.g., iPod nano blue 8GB vs. iPod nano red 8GB
• Objective Function
– Definition : Find replacement items such that the recommended items have
enough diversity on one or more content features.
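The objective above can be sketched as a greedy selection that keeps at most one replacement per combination of content features. The item fields, the price-based ordering, and the function name are assumptions for illustration; a production ranker would use a richer relevance score.

```python
def diverse_replacements(seed, candidates, features, k=3):
    """Pick up to k items of the same product as the seed such that no two
    picks share the same values on all of the given content features."""
    seen = set()
    picked = []
    # Hypothetical ranking: cheapest first.
    for item in sorted(candidates, key=lambda c: c["price"]):
        if item["product"] != seed["product"]:
            continue  # not a similar item
        combo = tuple(item[f] for f in features)
        if combo in seen:
            continue  # would add no diversity
        seen.add(combo)
        picked.append(item["id"])
        if len(picked) == k:
            break
    return picked


seed = {"id": "s", "product": "ipod nano", "color": "blue", "capacity": "8gb", "price": 100}
candidates = [
    {"id": "a", "product": "ipod nano", "color": "blue", "capacity": "8gb", "price": 95},
    {"id": "b", "product": "ipod nano", "color": "blue", "capacity": "8gb", "price": 99},
    {"id": "c", "product": "ipod nano", "color": "red", "capacity": "8gb", "price": 105},
    {"id": "d", "product": "ipod nano", "color": "blue", "capacity": "16gb", "price": 140},
    {"id": "e", "product": "ipod touch", "color": "black", "capacity": "32gb", "price": 200},
]
picks = diverse_replacements(seed, candidates, ["color", "capacity"])
```

Here one exact-match replacement (blue, 8GB) is kept, the second identical listing is skipped, and the remaining slots go to the red and 16GB variants.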
32. eBay Inc. confidential
Similar Items: Clustering Architecture
[Figure: Off-line Cluster Generation (slow, periodic) produces the Cluster Dictionary. In the New/Revise Flow, new and revised items from the eBay site go through Cluster Assignment (fast), and the item-cluster pairs are written to the Item Cluster Index in the Search Engine.]
Applications:
• Merchandising
• Navigation
• etc.
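The split between slow offline cluster generation and fast online assignment can be sketched as a lookup against a precomputed dictionary. The dictionary contents, the token-overlap score, and the threshold are assumptions; the sketch also assumes every cluster signature is non-empty.

```python
def tokenize(title):
    """Lowercase, whitespace-split token set of a listing title."""
    return set(title.lower().split())


def assign_cluster(title, cluster_dictionary, min_overlap=0.5):
    """Fast path: return the cluster whose signature tokens are best covered
    by the title, or None if no cluster reaches min_overlap."""
    title_tokens = tokenize(title)
    best_id, best_score = None, 0.0
    for cluster_id, signature in cluster_dictionary.items():
        score = len(title_tokens & signature) / len(signature)
        if score > best_score:
            best_id, best_score = cluster_id, score
    return best_id if best_score >= min_overlap else None


# Hypothetical output of the slow, periodic offline job.
cluster_dictionary = {
    "ipod_nano_16gb": {"ipod", "nano", "16gb"},
    "breathalyzer": {"breathalyzer", "alcohol"},
}

first = assign_cluster(
    "APPLE IPOD 16GB BLACK NANO 5TH GENERATION FM VIDEO 16 GB", cluster_dictionary)
second = assign_cluster(
    "16gb 16 gb Black Apple iPod 5th Gen Generation Nano MP3", cluster_dictionary)
unknown = assign_cluster("Purple Princess Canopy", cluster_dictionary)
```

Two differently worded listings of the same product land in the same cluster, while an item with no matching signature stays unassigned until the next offline run.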
37. eBay Inc. confidential
Motivation & Design Principles
• Once a user has purchased an Item, what else can we recommend to the
user to go with his purchase?
• One of the primary goals of Merchandising is to drive incremental
purchases
• On check-out, we want to recommend other items that “go-together” with
the item being purchased.
– E.g. for a cell-phone we may recommend a charger, case, screen protector.
– For a dress shirt, we may recommend a tie, a dress shoe or a jacket.
• Design Principles
– Horizontally scalable solution.
• All sites, all categories.
– Let the data speak for itself.
– Use Map-Reduce for scalability.
– Build upon the existing components.
38. eBay Inc. confidential
Building Blocks
• Non-productized item inventory with short lifetime makes any CF based
approach difficult.
• Map the items to a higher level abstraction (pseudo-product) to handle data
sparsity.
• Category structure is not granular enough to represent items.
• Reuse the item clusters generated for Similar Item Recommendation.
• Use the user’s intent as a logical grouping of items.
• Use item title, item attributes, and category hierarchy as finer-grained features.
• Apply an unsupervised clustering approach to partition the logical group
into finer clusters.
• Treat products as a special type of cluster.
• Merge clusters when necessary. We have a few million leaf-level clusters.
39. eBay Inc. confidential
Building Blocks (Item Cluster)
• Use user query as a logical grouping of items.
• Use item title, item attributes, and category hierarchy as finer-grained features.
• Apply an unsupervised clustering approach to partition the logical group into
finer clusters.
• Treat products as special type of clusters.
• Merge clusters when necessary.
• We have about 6 million leaf level clusters.
40. eBay Inc. confidential
Algorithmic Approach
• Hybrid Approach
– Content-based + Collaborative Filtering (CF)
• Use structured data when available
• Exploit the hierarchical nature of the data
• Utilize multiple types of features
Cluster Relationship Mining
• Create a directed cluster-to-cluster graph from the co-purchase data.
• The latent relationships between clusters exist at different granularities.
• Model the problem of recommending related items as that of “ranking the
outgoing edges of a node”.
41. eBay Inc. confidential
Cluster-Cluster Relations
• From item-item pairs, we generate cluster-cluster relations.
Ex:
• U1 – { (Itm1, Clid1), (Itm2, Clid2), … }
• U2 – { (Itm3, Clid1), (Itm4, Clid2), … }
• (Clid1, Clid2) – { U1, U2 }
• Generate multi-level relations between clusters.
[Figure: items i1 and i2 belong to clusters C1 and C2, which in turn have parent clusters C1p and C2p.]
Feature Extraction on Cluster Graph
• We create a directed graph where “from” node indicates
a purchase in “from cluster” and “to” node indicates a
purchase in “to cluster” at a later time.
• Total number of edges: in the billions.
• Filtered edges using a MIN_SUPPORT value.
• Computed the following types of features for each of the
Cluster-1 – Cluster-2 (co-purchase) relations:
– Features from the graph
– Content (semantic) features
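A minimal sketch of the directed graph and the MIN_SUPPORT filter follows. The threshold value, the timestamps, and the cluster names are made up, and ranking outgoing edges by raw support is only a stand-in for a ranker built on the graph and content features.

```python
from collections import defaultdict

MIN_SUPPORT = 2  # hypothetical threshold


def build_edges(purchases, min_support=MIN_SUPPORT):
    """purchases: user -> list of (timestamp, cluster_id).
    An edge (src, dst) means a user bought in src and strictly later in dst.
    Returns edge -> number of supporting users, after the support filter."""
    support = defaultdict(set)
    for user, events in purchases.items():
        ordered = sorted(events)
        for idx, (t_src, src) in enumerate(ordered):
            for t_dst, dst in ordered[idx + 1:]:
                if dst != src and t_dst > t_src:
                    support[(src, dst)].add(user)
    return {edge: len(users) for edge, users in support.items()
            if len(users) >= min_support}


def rank_outgoing(edges, node):
    """Rank the outgoing edges of a node by support (ties broken by name)."""
    outgoing = [(dst, n) for (src, dst), n in edges.items() if src == node]
    return [dst for dst, _ in sorted(outgoing, key=lambda x: (-x[1], x[0]))]


purchases = {
    "U1": [(1, "phone"), (2, "case"), (3, "charger")],
    "U2": [(1, "phone"), (5, "case")],
    "U3": [(1, "case"), (2, "phone")],
    "U4": [(2, "phone"), (3, "charger"), (4, "case")],
}
edges = build_edges(purchases)
phone_recs = rank_outgoing(edges, "phone")
```

Edges supported by a single user, such as case to phone, are dropped by the filter, so only the better-supported phone to case and phone to charger relations remain.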
45. eBay Inc. confidential
Explaining Recommendation Results
• “Recent research noticed that the acceptance of CF recommender systems
(like Amazon.com, Netflix.com, MovieLens etc.) increases, when users
receive justified recommendations” (P. Symeonidis et al., 2009)
• P. Symeonidis et al. found that users prefer more detailed
justifications. (P. Symeonidis et al., 2009)
49. eBay Inc. confidential
References
• Paul Resnick and Hal R. Varian. Recommender Systems. Communications of the
ACM. March 1997
• Robin Burke. Hybrid Recommender Systems: Survey and Experiments. User
Modeling and User-Adapted Interaction. 2002
• Gediminas Adomavicius and Alexander Tuzhilin. Toward the Next Generation of
Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions.
IEEE Transactions on Knowledge and Data Engineering Vol. 17, No. 6, June 2005
• Francisco J. Martin. Top 10 Lessons Learned Developing, Deploying, and Operating
Real-World Recommender Systems. Invited Talk. RecSys09
• Yehuda Koren, Robert Bell, Chris Volinsky. Matrix Factorization Techniques for
Recommender Systems. IEEE Computer, Vol. 42, No. 8, 2009
• David M. Pennock, Eric Horvitz, Steve Lawrence, C. Lee Giles. Collaborative Filtering
by Personality Diagnosis: A Hybrid Memory- and Model-Based Approach. UAI-2000
• Robert M. Bell and Yehuda Koren. Scalable Collaborative Filtering with Jointly
Derived Neighborhood Interpolation Weights. ICDM 2007
50. eBay Inc. confidential
References (Contd.)
• Simon Funk. Netflix Update http://sifter.org/~simon/journal/20061211.html
• Yehuda Koren. Collaborative Filtering with Temporal Dynamics. KDD 2009
• Zhengdong Lu, Deepak Agarwal, Inderjit S. Dhillon. A Spatio-Temporal Approach to
Collaborative Filtering. RecSys09
• Zeinab Abbassi, Sihem Amer-Yahia, Laks Lakshmanan, Sergei Vassilvitskii, Cong
Yu. Getting Recommender Systems to Think Outside the Box. RecSys09
• Yoon-Joo Park and Alexander Tuzhilin. The Long Tail of Recommender Systems and
How to Leverage It. RecSys08
• Panagiotis Symeonidis, Alexandros Nanopoulos, Yannis Manolopoulos. MoviExplain:
A Recommender System with Explanations. RecSys09
• Guy Shani and Asela Gunawardana. Evaluating Recommendation Systems.
Microsoft Techreport
• Raghunandan H. Keshavan, Andrea Montanari, Sewoong Oh. Matrix Completion
from Noisy Entries. NIPS 2009
51. eBay Inc. confidential
References (Contd.)
• Mohsen Jamali and Martin Ester. Using a Trust Network to Improve Top-N
Recommendation. RecSys09
• Hao Ma, Michael R. Lyu, Irwin King. Learning to Recommend with Trust and Distrust
Relationships. RecSys09
• Jaime Teevan, Susan T. Dumais, Eric Horvitz. Potential for Personalization. ACM
Transactions on Computer-Human Interaction, March 2010