© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS re:INVENT
Building Content Recommendation Systems Using Apache MXNet and Gluon
Cyrus M Vahid, cyrusmv@amazon.com
Principal Deep Learning Solution Architect @ Amazon AI
December 1, 2017
[Screenshots: My Profile – amazon.co.uk and My Profile – amazon.de]
Motivation
• Recommender systems are an important mechanism to personalize and enhance the customer experience.
• Recommender systems can be applied to achieve different goals:
  • increasing the time spent on a platform;
  • bringing to a customer's attention that an item might need complementary pieces.
• At the heart of it all, the intention is to provide the person interacting with the system with the results and content most relevant to that person.
Motivation
Common Approaches to Build a Recommender System
• Collaborative Filtering Using Matrix Factorization
• Step by Step Coding
• Challenges
• Deep Structured Semantic Models (DSSM)
• Production Notes
• User-based recommenders try to identify a short list of other users who are "similar" to you and who have rated the new item. They then take a weighted average of those users' ratings for the item and predict that number as your likely rating for it.
• Item-based recommenders try to identify a short list of other items that are "similar" to the item in question, take a weighted average of the ratings you provided for those top few items, and predict that number as your likely rating for the item.
Collaborative Filtering
https://www.oreilly.com/ideas/deep-matrix-factorization-using-apache-mxnet?cmp=tw-data-na-article-engagement_sponsored+kibird
Collaborative Filtering
• User-item matrices are very sparse. For instance, amazon.com provides a catalogue of millions of items, but each user has rated only a handful of them.
• The intention of a recommendation engine is to complete the sparse matrix based on the known data.
• Matrix factorization decomposes a matrix into separate matrices that, when multiplied together, approximate the completed matrix (formalized below).
• For the sake of efficiency, we would like to factorize the matrix into a tall matrix and a wide matrix.
Matrix Factorization
https://www.oreilly.com/ideas/deep-matrix-factorization-using-apache-mxnet?cmp=tw-data-na-article-engagement_sponsored+kibird
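As a sketch of the idea in symbols (notation mine, consistent with the description above): with $m$ users, $n$ items, and $k \ll m, n$ latent dimensions, matrix factorization seeks

$$R \approx U V^{\top}, \qquad U \in \mathbb{R}^{m \times k}, \; V \in \mathbb{R}^{n \times k},$$

by minimizing the squared error over the observed entries only:

$$\min_{U,V} \sum_{(i,j)\,\in\,\text{observed}} \big( R_{ij} - u_i^{\top} v_j \big)^2 .$$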
# runnable version of the slide's listing (import and indentation restored;
# max_user and max_item are assumed to come from the dataset)
import mxnet as mx

def plain_net(k):
    # input
    user = mx.symbol.Variable('user')
    item = mx.symbol.Variable('item')
    score = mx.symbol.Variable('score')
    # user feature lookup
    user = mx.symbol.Embedding(data=user, input_dim=max_user, output_dim=k)
    # item feature lookup
    item = mx.symbol.Embedding(data=item, input_dim=max_item, output_dim=k)
    # predict by the inner product, which is elementwise product and then sum
    pred = user * item
    pred = mx.symbol.sum(data=pred, axis=1)
    pred = mx.symbol.Flatten(data=pred)
    # loss layer
    pred = mx.symbol.LinearRegressionOutput(data=pred, label=score)
    return pred

net1 = plain_net(64)
mx.viz.plot_network(net1)
Linear Model
[Network diagram: user and item embeddings are combined by an elementwise product and sum into a predicted score output]
def get_one_layer_mlp(hidden, k):
    # input
    user = mx.symbol.Variable('user')
    item = mx.symbol.Variable('item')
    score = mx.symbol.Variable('score')
    # user latent features
    user = mx.symbol.Embedding(data=user, input_dim=max_user, output_dim=k)
    user = mx.symbol.Activation(data=user, act_type='relu')
    user = mx.symbol.FullyConnected(data=user, num_hidden=hidden)
    # item latent features
    item = mx.symbol.Embedding(data=item, input_dim=max_item, output_dim=k)
    item = mx.symbol.Activation(data=item, act_type='relu')
    item = mx.symbol.FullyConnected(data=item, num_hidden=hidden)
    # predict by the inner product
    pred = user * item
    pred = mx.symbol.sum(data=pred, axis=1)
    pred = mx.symbol.Flatten(data=pred)
    # loss layer
    pred = mx.symbol.LinearRegressionOutput(data=pred, label=score)
    return pred
Adding Non-Linearity
[Network diagram: as before, but each embedding passes through dense (fully connected) layers before the elementwise product and sum into the score output]
• The embedding layer is where a network extracts the importance of features from data (illustrated below).
• Embedding is frequently used in NLP. For instance, in sentiment analysis an embedding distills sentiment information from words.
Embedding Layer
https://www.oreilly.com/ideas/deep-matrix-factorization-using-apache-mxnet?cmp=tw-data-na-article-engagement_sponsored+kibird
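As a minimal illustration of this lookup-table view (toy sizes, not from the talk): an embedding layer maps integer ids to learnable dense vectors.

import mxnet as mx
from mxnet import nd, gluon

embed = gluon.nn.Embedding(input_dim=5, output_dim=3)  # 5 ids -> 3-d vectors
embed.initialize()
ids = nd.array([0, 2, 2])    # three lookups; two of the same id
print(embed(ids).shape)      # (3, 3): one 3-d vector per looked-up id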
Common Approaches to Build a Recommender System: Step by Step Coding
Coding Steps in gluon
• Loading Data into an Array Iterator
• Defining the Network
• Initializing the Network
• Choosing the Loss Function
• Choosing Optimizer
• Training the Model
• Measuring Performance
• Adding Non-linearity
• Measuring Performance
# runnable version (imports restored; the Dataset class is defined before use;
# train_data, train_label, test_data, test_label, and batch_size are assumed
# to have been prepared earlier)
import mxnet as mx
from mxnet import gluon, ndarray

class SparseMatrixDataset(gluon.data.Dataset):
    def __init__(self, data, label):
        assert data.shape[0] == len(label)
        self.data = data
        self.label = label
        if isinstance(label, ndarray.NDArray) and len(label.shape) == 1:
            self._label = label.asnumpy()
        else:
            self._label = label

    def __getitem__(self, idx):
        # each sample is (user id, item id, rating)
        return self.data[idx, 0], self.data[idx, 1], self.label[idx]

    def __len__(self):
        return self.data.shape[0]

train_data_iter = gluon.data.DataLoader(SparseMatrixDataset(train_data, train_label),
                                        shuffle=True, batch_size=batch_size)
test_data_iter = gluon.data.DataLoader(SparseMatrixDataset(test_data, test_label),
                                       shuffle=True, batch_size=batch_size)
Loading Data into an Array Iterator
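A hedged usage sketch of the loader above (the data itself, e.g. MovieLens ratings, is assumed prepared earlier):

for user, item, score in train_data_iter:
    # one batch of user ids, item ids, and ratings, as returned by __getitem__
    print(user.shape, item.shape, score.shape)
    break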
from mxnet import nd

class MFBlock(gluon.Block):
    def __init__(self, max_users, max_items, num_emb, dropout_p=0.5):
        super(MFBlock, self).__init__()
        self.max_users = max_users
        self.max_items = max_items
        self.dropout_p = dropout_p
        self.num_emb = num_emb
        with self.name_scope():
            self.user_embeddings = gluon.nn.Embedding(max_users, num_emb)
            self.item_embeddings = gluon.nn.Embedding(max_items, num_emb)

    def forward(self, users, items):
        a = self.user_embeddings(users)
        b = self.item_embeddings(items)
        # inner product of the two embeddings: elementwise product, then sum
        predictions = a * b
        predictions = nd.sum(predictions, axis=1)
        return predictions
Defining the Network
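The initialization one-liner below assumes `net` and `ctx` already exist; a minimal sketch of that setup (the embedding size 64 is illustrative, not from the slides):

ctx = mx.cpu()   # or mx.gpu(0) if a GPU is available
net = MFBlock(max_users, max_items, num_emb=64)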
net.collect_params().initialize(mx.init.Xavier(magnitude=2.24), ctx=ctx, force_reinit=True)
Initializing the Network
loss_function = gluon.loss.L2Loss()
Choosing the Loss Function
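For reference, gluon's L2Loss computes half the squared difference per sample,

$$\ell = \tfrac{1}{2}\,(\hat{y} - y)^2,$$

which matches the squared-error objective of matrix factorization.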
trainer = gluon.Trainer(net.collect_params(), 'sgd',
                        {'learning_rate': lr, 'wd': wd, 'momentum': 0.9})
Choosing Optimizer
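The trainer above references `lr` and `wd`, which are assumed to be defined earlier; purely hypothetical values for illustration:

lr, wd = 0.02, 1e-4   # illustrative only; the talk's actual values are not shown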
epochs = 10

def train(data_iter, net):
    for e in range(epochs):
        print("epoch: {}".format(e))
        for i, (user, item, label) in enumerate(data_iter):
            user = user.as_in_context(ctx).reshape((batch_size,))
            item = item.as_in_context(ctx).reshape((batch_size,))
            label = label.as_in_context(ctx).reshape((batch_size,))
            with mx.autograd.record():
                output = net(user, item)
                loss = loss_function(output, label)
            loss.backward()
            trainer.step(batch_size)
        # eval_net (an RMSE evaluator) is sketched after this listing
        print("EPOCH {}: RMSE ON TRAINING and TEST: {}. {}".format(
            e, eval_net(train_data_iter, net), eval_net(test_data_iter, net)))
    return output
Training the Model
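The training loop calls `eval_net`, which the deck does not show; a minimal RMSE evaluator consistent with the printed output (an assumption, not the talk's actual code):

import math

def eval_net(data_iter, net):
    total_se, n = 0.0, 0
    for user, item, label in data_iter:
        user = user.as_in_context(ctx).reshape((batch_size,))
        item = item.as_in_context(ctx).reshape((batch_size,))
        label = label.as_in_context(ctx).reshape((batch_size,))
        pred = net(user, item)
        total_se += nd.sum((pred - label) ** 2).asscalar()
        n += batch_size
    return math.sqrt(total_se / n)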
EPOCH 0: RMSE ON TRAINING and TEST: 3.5231035657055676. 3.53353935778141
EPOCH 1: RMSE ON TRAINING and TEST: 3.3170668120235205. 3.3881871127337218
EPOCH 2: RMSE ON TRAINING and TEST: 1.7826270855292679. 2.011566595607996
EPOCH 3: RMSE ON TRAINING and TEST: 1.1699977076664567. 1.3101321067839862
EPOCH 4: RMSE ON TRAINING and TEST: 0.9599943867214024. 1.0654756610274314
EPOCH 5: RMSE ON TRAINING and TEST: 0.8661853047579527. 0.951712748876214
EPOCH 6: RMSE ON TRAINING and TEST: 0.8170913911752403. 0.8906555083304644
EPOCH 7: RMSE ON TRAINING and TEST: 0.7873748381830752. 0.8524201203614473
EPOCH 8: RMSE ON TRAINING and TEST: 0.7683705289691687. 0.8290898406863213
EPOCH 9: RMSE ON TRAINING and TEST: 0.7545523364305496. 0.8126902643978595
Measuring Performance
class MFBlock(gluon.Block):
    def __init__(self, max_users, max_items, num_emb, dropout_p=0.5):
        super(MFBlock, self).__init__()
        self.max_users = max_users
        self.max_items = max_items
        self.dropout_p = dropout_p
        self.num_emb = num_emb
        with self.name_scope():
            self.user_embeddings = gluon.nn.Embedding(max_users, num_emb)
            self.item_embeddings = gluon.nn.Embedding(max_items, num_emb)
            # a single dense ReLU layer, applied to both the user and item paths
            self.dense = gluon.nn.Dense(num_emb, activation='relu')

    def forward(self, users, items):
        a = self.user_embeddings(users)
        a = self.dense(a)
        b = self.item_embeddings(items)
        b = self.dense(b)
        predictions = a * b
        predictions = nd.sum(predictions, axis=1)
        return predictions
Adding Non-Linearity
Measuring Performance

With the non-linear model:
EPOCH 0: RMSE ON TRAINING and TEST: 0.7397429539039732. 0.7727593539834022
EPOCH 1: RMSE ON TRAINING and TEST: 0.7340302160739899. 0.7680136477231979
EPOCH 2: RMSE ON TRAINING and TEST: 0.7290934163123369. 0.7636049427747726
EPOCH 3: RMSE ON TRAINING and TEST: 0.7489639871120453. 0.7812525823950768
EPOCH 4: RMSE ON TRAINING and TEST: 0.7219638488218189. 0.7587113852620124
EPOCH 5: RMSE ON TRAINING and TEST: 0.7221305305734277. 0.7613962271451951
EPOCH 6: RMSE ON TRAINING and TEST: 0.6964019835107028. 0.7500802919447422
EPOCH 7: RMSE ON TRAINING and TEST: 0.7040959010727703. 0.7654881411790848
EPOCH 8: RMSE ON TRAINING and TEST: 0.662088219345361. 0.7430168891996145
EPOCH 9: RMSE ON TRAINING and TEST: 0.641668664816767. 0.7430033217519522

For comparison, the linear model (previous results slide) ended at a training RMSE of 0.7545523364305496 and a test RMSE of 0.8126902643978595 after ten epochs; the non-linear model improves on both.
Common Approaches to Build a Recommender System: Challenges
• Matrix factorization is ideal for small catalogues and can perform well on a small amount of data.
• As catalogues get larger, memory becomes a challenge.
• For instance, MovieLens 20M has 27,278 items and 138,493 users, so the full user × item matrix has 138,493 × 27,278 = 3,777,812,054 entries. Of all possible ratings, the dataset includes only 20,000,263. This means only about 0.5 percent of the matrix contains data and roughly 99.5 percent is just sparsity (see the quick check below).
Limitations
ref: Leo Dirac – re:Invent 2016
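The sparsity arithmetic is easy to verify:

cells = 138_493 * 27_278    # possible (user, item) pairs: 3,777,812,054
rated = 20_000_263          # ratings present in MovieLens 20M
print(100 * rated / cells)  # ~0.53: about half a percent is observed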
• Factorization Machines (FMs) combine Support Vector Machines and MF in order to scale and deal with sparsity (see the model equation below).
• The problem with FMs is that they run on a single machine and have huge memory requirements.
• DiFacto, or Distributed Factorization Machines, computes sub-gradients on minibatches asynchronously and thus distributes the load across several machines. (Alex Smola et al. - 2016)
Scaling Is the Issue…
[Diagram: parameter server with asynchronous SGD]
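For reference, the second-order factorization machine model (Rendle 2010, linked in the references) predicts

$$\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle \, x_i x_j,$$

so pairwise feature interactions are factorized through $k$-dimensional vectors $v_i$ rather than stored explicitly, which is what lets FMs cope with extreme sparsity.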
• New users have no previous rating history.
• New items have never been rated before.
• Implementing recommendations without any prior information, on platforms where no user login is required.
Cold-Start Problem
itm0 itm1 itm2 itm3 itm4 itm5 itm6 itm7 itm8 itm9 itm10
u0 2 ? ? ? ? ? 4 ? ? ? ?
u1 ? 2 3 ? 4 ? ? 3 ? 5 ?
u2 2 ? ? ? 4 ? 4 ? ? ? ?
u3 1 ? ? 1 ? ? ? ? 1 ? ?
u4 5 ? ? 2 ? ? ? ? ? 2 ?
u5 ? ? 3 ? ? ? ? ? ? 5 ?
u6 ? 1 ? ? ? ? 4 ? 1 ? ?
u7 ? ? ? ? ? ? 1 ? 2 3 ?
u8 ? ? ? 5 ? 3 ? ? ? ? ?
u9 ? 2 3 ? 3 ? ? 3 ? 5 ?
u10 ? ? ? ? ? ? ? ? ? ? ?
• We might not have user data, but often there is a wealth of product information available. We can use this data to recommend similar products to a user (a sketch follows the table below).
Content (Feature)-Based
code category sub-category weight price colour dimensions
itm1 1 1 20 123.1 blue 23x12x19
itm2 2 1 20 900 white 23x12x20
itm3 1 1 22 123.1 green 20x10x18
itm4 3 1 20 600 blue 23x12x22
itm5 1 2 19 200 yellow 23x12x23
itm6 1 1 1 12 red 2x1x3
itm7 1 1 900 2000 blue 100x80x99
itm8 9 8 20 6000 grey 99x99x99
itm9 7 5 1000 123.1 blue 123x5x8
itm10 9 8 20 5000 brown 99x99x99
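A hedged sketch of the simplest version of this idea, assuming the item features above have already been encoded as numeric vectors (one-hot categories, scaled numerics); the helper name is mine:

import numpy as np

def most_similar(item_vecs, query_idx, top_k=3):
    # cosine similarity between the query item and every item
    q = item_vecs[query_idx]
    norms = np.linalg.norm(item_vecs, axis=1) * np.linalg.norm(q) + 1e-12
    sims = item_vecs @ q / norms
    sims[query_idx] = -np.inf   # exclude the query item itself
    return np.argsort(sims)[::-1][:top_k]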
Content Based
• Hybrid models combine item/user-based models with content-based ones.
• After searching for travel guides for Peru, Argentina, and Brazil while located in Germany, I received the recommendation shown in the opposite pane.
• The recommender has combined my location, offered me a South American travel guide, and given more weight to a specific publisher that I deliberately clicked on more frequently.
Hybrid Models
Common Approaches to Build a Recommender System: Deep Structured Semantic Models (DSSM)
• There is still a wealth of information we have not tapped into:
  • Movies have images.
  • Images can be captioned.
  • Products have names, while we have so far reduced them to item numbers.
  • TV series have episode names.
  • Products have verbose reviews.
• We can harvest all of this information and enrich our recommendation system.
Most of the Data Is Still Untapped
There is a strong similarity in the ambience and composition of these images. [images not extracted]
• A matrix factorization solution is, at its core, the multiplication of two matrices.
• Neural networks are good at picking up semantic intent at the phrase/sentence level.
• Neural networks perform well at image captioning.
• The output of a network is a tensor.
• So we can use the outputs of several networks as our embedding layer for an enriched recommendation system.
DSSM
[Network diagram: user and item embeddings combined by an elementwise product and sum into a score output, as in the earlier models]
DSSM
[Network diagram: the user tower combines user and search bag-of-words inputs; the item tower combines title-word bag-of-words and ResNet image features; each input passes through dense layers, each tower is concatenated and passed through further dense layers with dropout, and the two sides are combined by an elementwise product and sum into a score output. A gluon sketch follows.]
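A minimal gluon sketch of such a two-tower model; all names, sizes, and the choice of inputs are illustrative assumptions, not the talk's code. Image features are assumed pre-extracted (e.g. by a ResNet).

import mxnet as mx
from mxnet import gluon, nd

class TwoTowerBlock(gluon.Block):
    def __init__(self, max_users, num_emb, **kwargs):
        super(TwoTowerBlock, self).__init__(**kwargs)
        with self.name_scope():
            # user tower: id embedding -> dense
            self.user_embeddings = gluon.nn.Embedding(max_users, num_emb)
            self.user_dense = gluon.nn.Dense(num_emb, activation='relu')
            # item tower: title bag-of-words and image features,
            # each through a dense layer, then concatenated
            self.title_dense = gluon.nn.Dense(num_emb, activation='relu')
            self.image_dense = gluon.nn.Dense(num_emb, activation='relu')
            self.item_dense = gluon.nn.Dense(num_emb, activation='relu')
            self.dropout = gluon.nn.Dropout(0.5)

    def forward(self, users, title_bow, img_feats):
        u = self.user_dense(self.user_embeddings(users))
        t = self.title_dense(title_bow)
        v = self.image_dense(img_feats)
        item = self.item_dense(self.dropout(nd.concat(t, v, dim=1)))
        # score by the inner product, exactly as in the earlier MF models
        return nd.sum(u * item, axis=1)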
Common Approaches to Build a Recommender System: Production Notes
Development and Deployment
• One of the most important areas to pay attention to is the loss function.
• Multi-label cross-entropy loss has worked well in the past.
• This is relatively simple to apply to a wide variety of model types. Ranking loss often works better, but is more complex to apply correctly.
• Scalability is always a big challenge. Offline batch computation and saving the results can help.
Logging and Measurement
• Deploying a recommender system requires some care, since a model only succeeds if good behavioral data can be logged.
• Moreover, without good logging it is impossible to assess the quality of the deployment. Tools such as Amazon Kinesis are ideally suited for this purpose.
• Display bias is very strong – this means that customers are more likely to click on a mediocre recommendation that they see than an excellent recommendation they don't see.
• An improved model might not outperform the baseline when evaluated on historical data (and a model that performs well on historical data might not work well on future data).
• Online comparisons are critical. This suggests an iterative strategy to building out a recommender system, starting with a relatively simple model that is easy to deploy, and iteratively testing improvements.
Metrics
• Some common and effective metrics to measure are:
  • Recall: how many of the relevant items were selected.
  • Precision: how many of the selected items were relevant – optimize for precision at top x (see the sketch below).
  • Try to increase the number of customers that viewed a product with some frequency.
  • Weblab: take half of the customers and check the end result.
  • Time spent on the platform
  • Distinct days on platform
  • Forecast customer score
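A hedged sketch of precision and recall at k for a single user, given a ranked recommendation list and the set of truly relevant items (the helper name is mine):

def precision_recall_at_k(ranked_items, relevant_items, k):
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    precision = hits / k
    recall = hits / len(relevant_items) if relevant_items else 0.0
    return precision, recall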
How to Choose a Model

An iterative process; as more data becomes available, move to the next stage:

Data available                                           Relevant algorithms                 Relative complexity
Limited user data; binary user-item interaction          MF, Binary MF                       2
Limited user data; additional user-item interaction      FM, DiFacto                         4
User data; extensive item data                           DSSM                                5
Extensive user data; extensive item data                 Customized and more advanced DSSM   5

Deployment Considerations
• Historical data size – 30d/60d/1y…
• Fine-tuning techniques (daily, weekly, …)
• Inference – compressed model? Trade-off between model complexity and inference latency
• Validation system setup
• Iterate fast and simple
References
• https://www.cs.umd.edu/~samir/498/Amazon-Recommendations.pdf
• http://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf
• https://www.cs.cmu.edu/~muli/file/difacto.pdf
• http://papers.www2017.com.au.s3-website-ap-southeast-2.amazonaws.com/proceedings/p173.pdf
• http://mccormickml.com/2017/01/11/word2vec-tutorial-part-2-negative-sampling/
• https://arxiv.org/pdf/1310.4546.pdf
• http://faculty.washington.edu/xiaohe/UW_DeepNLP_slides.pdf
• https://www.microsoft.com/en-us/research/publication/learning-deep-structured-semantic-models-for-web-search-using-clickthrough-data/
• http://alex.smola.org/teaching/berkeley2012/recommender.html
• https://arxiv.org/pdf/1312.4894.pdf
• http://www.simafore.com/blog/bid/207476/The-intuition-behind-recommender-systems-collaborative-filtering
Thank you!
cyrusmv@amazon.com
Related content

Trending
• Amazon Neptune- 신규 그래프 데이터베이스 서비스 활용::김상필, 강정희::AWS Summit Seoul 2018 (Amazon Web Services Korea)
• Large Scale Fuzzy Name Matching with a Custom ML Pipeline in Batch and Stream... (Databricks)
• Recommender Systems (Lior Rokach)
• Deep Learning for Recommender Systems (inovex GmbH)

Trending (20)
• Personalizing "The Netflix Experience" with Deep Learning
• Recommendation System
• Personalizing Session-based Recommendations with Hierarchical Recurrent Neura...
• Making Netflix Machine Learning Algorithms Reliable
• BERT (v3).pptx
• Deep Learning for Recommender Systems
• Introduction to Matrix Factorization Methods Collaborative Filtering
• Amazon Neptune- 신규 그래프 데이터베이스 서비스 활용::김상필, 강정희::AWS Summit Seoul 2018
• Large Scale Fuzzy Name Matching with a Custom ML Pipeline in Batch and Stream...
• Intermediate Cypher.pdf
• Recommender Systems
• Recommender Systems
• Gpt models
• The How and Why of Feature Engineering
• Neo4j Training Series - Spring Data Neo4j
• Deep Learning for Recommender Systems
• Introduction of Knowledge Graphs
• GRU4Rec v2 - Recurrent Neural Networks with Top-k Gains for Session-based Rec...
• Generative adversarial networks
• Log System As Backbone – How We Built the World's Most Advanced Vector Databa...
Similar to Building Content Recommendation Systems Using Apache MXNet and Gluon - MCL402 - re:Invent 2017 (20)
• Building Content Recommendation Systems using MXNet Gluon
• MCL310_Building Deep Learning Applications with Apache MXNet and Gluon
• Build Deep Learning Applications Using Apache MXNet - Featuring Chick-fil-A (...
• Build Deep Learning Applications Using Apache MXNet, Featuring Workday (AIM40...
• Bring Your Own Apache MXNet and TensorFlow Scripts to Amazon SageMaker (AIM35...
• [NEW LAUNCH!] Introducing Amazon Elastic Inference: Reduce Deep Learning Infe...
• [AWS Dev Day] 실습워크샵 | 모두를 위한 컴퓨터 비전 딥러닝 툴킷, GluonCV 따라하기
• Amazon SageMaker Algorithms: Machine Learning Week San Francisco
• SageMaker Algorithms Infinitely Scalable Machine Learning
• MCL309_Deep Learning on a Raspberry Pi
• Integrating Deep Learning In the Enterprise
• Integrating Deep Learning Into Your Enterprise
• Metering the Hybrid Cloud - ARC404 - re:Invent 2017
• Time series modeling workd AMLD 2018 Lausanne
• Build, Train & Deploy Your ML Application on Amazon SageMaker
• [AWS Innovate 온라인 컨퍼런스] 간단한 Python 코드만으로 높은 성능의 기계 학습 모델 만들기 - 김무현, AWS Sr.데이...
• Integrating Deep Learning into your Enterprise
• Supercharge your Machine Learning Solutions with Amazon SageMaker
• Build, train, and deploy machine learning models at scale - AWS Summit Cape T...
• Getting Started with ML in AdTech - AWS Online Tech Talks
More from Amazon Web Services (20)
• Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
• Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
• Esegui pod serverless con Amazon EKS e AWS Fargate
• Costruire Applicazioni Moderne con AWS
• Come spendere fino al 90% in meno con i container e le istanze spot
• Open banking as a service
• Rendi unica l'offerta della tua startup sul mercato con i servizi Machine Lea...
• OpsWorks Configuration Management: automatizza la gestione e i deployment del...
• Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
• Computer Vision con AWS
• Database Oracle e VMware Cloud on AWS i miti da sfatare
• Crea la tua prima serverless ledger-based app con QLDB e NodeJS
• API moderne real-time per applicazioni mobili e web
• Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
• Tools for building your MVP on AWS
• How to Build a Winning Pitch Deck
• Building a web application without servers
• Fundraising Essentials
• AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
• Introduzione a Amazon Elastic Container Service