This document provides an overview of explainability methods for learning to rank models. It introduces the SHAP (SHapley Additive exPlanations) library, a popular game-theory-based method for explaining machine learning models. A case study demonstrates how to use Tree SHAP to generate summary, force, dependence, and decision plots that explain the features influencing a LambdaMART ranking model. Key advantages and limitations of SHAP are discussed, such as its ability to handle different data types on the one hand and its computational expense on the other. Code examples are provided for generating SHAP values and plots in Python.
1. Explainability for Learning to Rank Models
Anna Ruggero, R&D Software Engineer
Ilaria Petreti, IR/ML Engineer
30th March 2021
2. Who We Are - Anna Ruggero
‣ R&D Search Software Engineer
‣ Master's Degree in Computer Science Engineering
‣ Big Data, Information Retrieval
‣ Organist, music lover
3. Who We Are - Ilaria Petreti
‣ Information Retrieval/Machine Learning Engineer
‣ Master's in Data Science
‣ Passionate about Data Mining and Machine Learning technologies
‣ Sports lover
5. XAI - What is Explainability?
“XAI (eXplainable Artificial Intelligence) refers to methods and techniques in the application of artificial intelligence technology such that the results of the solution can be understood by humans”
6. Explainability - Why is it important?
[Chart: the Explainability-Accuracy trade-off — models such as Neural Networks and Forests of Regression Trees achieve high predictive performance at the cost of explainability.]
7. Explainability - Why is it important?
Why and who needs explanations?
• Model debugging
• Increasing detection of bias
• Verifying accuracy
• Building trust in the model's output
• Building social acceptance
• Increasing transparency
• Satisfying regulatory requirements
• Verifying model safety
Who: model builders, end users, public stakeholders
8. Explainability - Why is it important in Information Retrieval?
Understand your Learning to Rank model:
‣ Why is a search result at a certain position?
‣ How does the model calculate the score?
‣ How is a feature affecting the ranking?
‣ How are the feature values affecting the ranking?
‣ Has the model learned any weird behaviour?
10. Explainability - Methods (Post-Hoc Explainability)

GLOBAL - explain the overall model:
‣ PDP: Partial Dependence Plots
‣ ALE: Accumulated Local Effects
‣ ICE: Individual Conditional Expectation
‣ Feature Importance and Permutation Importance (through ELI5; see the sketch after this list)
‣ ….

LOCAL - explain a single prediction:
‣ Anchors
‣ CEM: Contrastive Explanations Method
‣ LIME
‣ SHAP
‣ DeepLIFT
‣ ….
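As a concrete example of one of the global methods above, here is a minimal permutation-importance sketch with ELI5; `model`, `X_val` and `y_val` are hypothetical names for a fitted estimator and a held-out validation set, not part of the original slides.

import eli5
from eli5.sklearn import PermutationImportance

# Shuffle each feature in turn and measure how much the validation
# score drops: a large drop means the feature matters globally.
perm = PermutationImportance(model, random_state=42).fit(X_val, y_val)

# Render the resulting global feature ranking
eli5.show_weights(perm, feature_names=X_val.columns.tolist())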
11. Explainability - Libraries
References of some popular Python libraries for explainability:

Python Library | Type | Links
ELI5 | Model Agnostic | https://eli5.readthedocs.io/en/latest/overview.html
LIME | Model Agnostic | https://github.com/marcotcr/lime
SHAP | Model Agnostic + Specific Explainers | https://github.com/slundberg/shap
Anchors | Model Agnostic | https://github.com/marcotcr/anchor
DeepLIFT | Neural Network | https://github.com/kundajelab/deeplift
14. Explainability - LIME vs SHAP

LIME (Local Interpretable Model-Agnostic Explanations) | SHAP (SHapley Additive exPlanations)
Python Library | Python Library
Local Explainability | Local and Global Explainability
Model Agnostic | KernelExplainer (model agnostic) + optimised explainers
Importance Scores | Importance Scores
Input Perturbation | Particular case of Input Perturbation → Shapley Values
Fast | Computationally Expensive (especially for large datasets)
Not optimised for all model types (e.g. XGBoost) | Not optimised for all model types (e.g. k-NN)
Not designed to work with one-hot encoded data | Handles categorical data
Interfaces for different data types (Tabular, Text and Image) | Handles all data types

❑ Both libraries produce the same type of explanation family: importance scores, which are meant to communicate the relative contribution made by each input feature to a given prediction.
❑ LIME: for a given observation, it generates local perturbations of the features and captures their impact using a linear model => the weights of the linear model are used as feature importance scores.
❑ SHAP: the Shapley value is computed by examining the model output over all possible coalitions of the input features.
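To make the comparison concrete, here is a minimal sketch of how each library is typically invoked on tabular data; `model`, `X_train` and the explained observation are hypothetical names, not part of the original slides.

from lime.lime_tabular import LimeTabularExplainer
import shap

x_row = X_train.iloc[0]   # one observation to explain (hypothetical)

# LIME: perturb the observation locally and fit a linear surrogate model;
# its weights become the feature importance scores
lime_explainer = LimeTabularExplainer(
    X_train.values,
    feature_names=X_train.columns.tolist(),
    mode='regression')
lime_exp = lime_explainer.explain_instance(x_row.values, model.predict, num_features=5)

# SHAP: model-agnostic KernelExplainer over a background sample;
# returns one Shapley value per feature for the same observation
shap_explainer = shap.KernelExplainer(model.predict, shap.sample(X_train, 100))
shap_values = shap_explainer.shap_values(X_train.iloc[[0]])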
22. SHAP - SHapley Additive exPlanations
❑ Game Theory approach
❑ Explains the output of any machine learning model
❑ Inspects model predictions
https://github.com/slundberg/shap
https://towardsdatascience.com/shap-explained-the-way-i-wish-someone-explained-it-to-me-ab81cc69ef30
23. SHAP - Theory
❑ In game theory we have two components:
❑ Game
❑ Players
❑ Applying the game theory approach to explain a machine learning model, these two elements become:
❑ Game → Output of the model for one observation
❑ Players → Features
24. SHAP - Theory
The Shapley value quantifies the contribution that each player brings to the game.
SHAP quantifies the contribution that each feature brings to the prediction made by the model.
25. SHAP - Theory
What is the impact of each feature on the predictions?
❑ Suppose we want to predict the income of a person.
❑ Suppose our model has 3 features: age, gender and job.
❑ We have to consider all the possible combinations of f features (with f going from 0 to 3).
26. SHAP - Theory
❑ The combinations of features can be seen as a tree
❑ Node = one specific combination of features
❑ Edge = the marginal contribution a feature brings when it is added to the combination
27. SHAP - Theory
❑ Possible combinations = 2^n (the size of the power set)
❑ In our example, possible combinations = 2^3 = 8
❑ SHAP trains a distinct model on each of these combinations, so it considers 2^n models
❑ This is too expensive in practice, so SHAP actually implements a variation of this naïve approach
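A quick way to see the 2^n count is to enumerate the power set directly (a minimal illustration, not from the slides):

from itertools import combinations

features = ['age', 'gender', 'job']

# Every subset of the feature set, from the empty set to the full set
subsets = [set(c) for size in range(len(features) + 1)
           for c in combinations(features, size)]

print(len(subsets))  # 2**3 = 8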
28. SHAP - Theory
❑ Suppose we have trained all 8 models
❑ Consider one observation, called x₀
❑ Let's see what each model predicts for the same observation x₀
29. SHAP - Theory
❑ To get the overall contribution of Age in the model, we have to consider the marginal contribution of Age in all the models where Age is present.
30. SHAP - Theory
We have to consider all the edges connecting two nodes such that:
❑ the upper one does not contain Age, and
❑ the bottom one contains Age.
32. SHAP - Theory
❑ The main assumptions are:
❑ The sum of the weights on each row of the tree should be equal: w1 = w2 + w3 = w4
❑ Each weight inside one row of the tree should be equal: w2 = w3
❑ In our example:
❑ w1 = w4 = 1/3
❑ w2 = w3 = 1/6
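Putting the weights and the marginal contributions together, the SHAP value of Age for the observation x₀ is (a worked form of the slide's tree, where f_S denotes the model trained on the feature subset S, with A = Age, G = Gender, J = Job):

\[
\mathrm{SHAP}_{A}(x_0) = \tfrac{1}{3}\left[f_{\{A\}}(x_0) - f_{\emptyset}(x_0)\right]
+ \tfrac{1}{6}\left[f_{\{A,G\}}(x_0) - f_{\{G\}}(x_0)\right]
+ \tfrac{1}{6}\left[f_{\{A,J\}}(x_0) - f_{\{J\}}(x_0)\right]
+ \tfrac{1}{3}\left[f_{\{A,G,J\}}(x_0) - f_{\{G,J\}}(x_0)\right]
\]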
34. SHAP - Theory
❑ In our example, the formula yields:
❑ SHAP_Age(x₀) = -11.33k $
❑ SHAP_Gender(x₀) = -2.33k $
❑ SHAP_Job(x₀) = +46.66k $
❑ Summing them up gives +33k $
❑ This is exactly the difference between the output of the full model (83k $) and the output of the dummy model with no features (50k $).
35. SHAP - Theory
Generalising:
❑ The weight of an edge is the reciprocal of the total number of edges in the same “row”. Or, equivalently, the weight of a marginal contribution to an f-feature model is the reciprocal of the number of possible marginal contributions to all the f-feature models.
❑ Each f-feature model has f marginal contributions (one per feature), so it is enough to count the number of possible f-feature models and multiply it by f.
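This weighting rule can be coded directly. Below is a brute-force sketch of the exact Shapley value; `predict` is a hypothetical callable that returns, for one fixed observation, the output of the model trained on a given feature subset (so it stands in for the 2^n trained models, and the enumeration only works for small n):

from itertools import combinations
from math import factorial

def shap_value(feature, features, predict):
    """Exact Shapley value of `feature`, enumerating all subsets."""
    others = [f for f in features if f != feature]
    n = len(features)
    value = 0.0
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            s = frozenset(subset)
            # Weight of this marginal contribution: |S|! (n - |S| - 1)! / n!
            # (the 1/3 and 1/6 of the 3-feature example above)
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            # Marginal contribution of `feature` when added to subset S
            value += weight * (predict(s | {feature}) - predict(s))
    return value

# e.g. shap_value('age', ['age', 'gender', 'job'], predict)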
39. SHAP - Theory
❑ We have built the formula for calculating the SHAP value of Age in a 3-feature model.
❑ Generalising to any feature and any number of features F, we obtain the formula reported in the article by Lundberg and Lee:
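The formula itself appeared as an image in the slides; it is the classic Shapley value from the SHAP paper, where F is the set of all features, f_S is the model trained on the subset S, and x_S is the observation restricted to S:

\[
\phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,\left(|F| - |S| - 1\right)!}{|F|!}\left[f_{S \cup \{i\}}\!\left(x_{S \cup \{i\}}\right) - f_S\!\left(x_S\right)\right]
\]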
40. Case Study - TREE SHAP
❑ Let's imagine we have a book e-commerce site
❑ Every book is characterised by several features:
❑ Sales in the last week and total sales
❑ Number of reviews and their average
❑ Genre
❑ Price
❑ Author
❑ …
❑ We train an LTR model using the LambdaMART algorithm (see the sketch below)
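As a bridge to slide 50's code (which passes an `xgb_model` to `shap.TreeExplainer`), here is a minimal sketch of how such a ranker might be trained with XGBoost's LambdaMART implementation; `X`, `y` and `group_sizes` are hypothetical names, not part of the original slides.

import xgboost as xgb

# One row per (query, book) pair; y holds the relevance judgements and
# group_sizes lists how many consecutive rows belong to each query.
xgb_model = xgb.XGBRanker(objective='rank:ndcg', n_estimators=200)
xgb_model.fit(X, y, group=group_sizes)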
41. SHAP Plots
Summary Plot (with bars)
❑ Y-axis: most important features for the model
❑ X-axis: mean absolute SHAP value (average impact on the model score)
42. SHAP Plots
Summary Plot
❑ Y-axis: most important features for the model
❑ X-axis: SHAP value (impact on model score)
❑ Feature value shown in color
❑ Each point is a prediction result
43. SHAP Plots
Force Plot
❑ A single model prediction (the result with the lowest relevance)
❑ Model output: -7.01
❑ Impact of each feature
44. SHAP Plots
Force Plot
❑ A single model prediction (the result with the highest relevance)
❑ Model output: -3.25
❑ Impact of each feature
46. SHAP Plots
Dependence Plot
❑ Shows the interaction between two features
❑ Each point is a prediction
❑ X-axis: value of the first feature
❑ Y-axis: impact on the score
❑ Color encodes the value of a second feature
https://slundberg.github.io/shap/notebooks/plots/dependence_plot.html
48. SHAP Plots
Decision Plot
❑ How the prediction changes during the decision process
❑ Y-axis: feature names
❑ X-axis: output of the model
❑ Each row shows the impact of each feature
❑ Each vertical line is a prediction
❑ Feature values are shown in brackets ()
50. SHAP - Python Code

import shap
import matplotlib.pyplot as plt

# Tree Explainer
explainer = shap.TreeExplainer(xgb_model)

# SHAP values
shap_values = explainer.shap_values(training_data_set)

# To save plots
plt.savefig(images_path + '/summary_plot.png', bbox_inches='tight')
plt.close()

https://github.com/slundberg/shap
https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.html
51. SHAP - Python Code

# Summary Plot
shap.summary_plot(shap_values, training_data_set)

# Summary Plot with bars
shap.summary_plot(shap_values, training_data_set, plot_type="bar")

# Decision Plot (all observations AND a single observation)
shap.decision_plot(explainer.expected_value, shap_values, training_data_set,
                   feature_names=training_data_set.columns.tolist(), ignore_warnings=True)
shap.decision_plot(explainer.expected_value, shap_values[0], training_data_set.iloc[0],
                   feature_names=training_data_set.columns.tolist(), ignore_warnings=True)

# Force Plot (all observations AND a single observation)
html_img = shap.force_plot(explainer.expected_value, shap_values, training_data_set)
shap.force_plot(explainer.expected_value, shap_values[0], training_data_set.iloc[0], matplotlib=True)

# Dependence Plot (feature_to_analyze is the name of the feature on the x-axis)
if 'is_genre_fantasy' in training_data_set.columns:
    shap.dependence_plot(feature_to_analyze, shap_values, training_data_set,
                         interaction_index='is_genre_fantasy')

https://github.com/slundberg/shap
https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.html
52. SHAP - Warnings
❑ Pay attention to data preprocessing
❑ SHAP doesn't consider queries:
extract all the interactions with the same query and then generate the plot (see the sketch below)
❑ The output (score) of the model is NOT the relevance label:
it expresses the relative relevance between products
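For the second warning, a minimal sketch of plotting only the rows that belong to a single query; `query_ids` (a per-row array of query identifiers aligned with `training_data_set`) and `some_query_id` are hypothetical names, not part of the original slides.

import numpy as np

# Keep only the interactions of one query before plotting
mask = np.asarray(query_ids) == some_query_id
shap.summary_plot(shap_values[mask], training_data_set[mask])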
53. Conclusion
❑ Explainability is useful to understand the model behaviour
❑ There are several methods for explainability
❑ SHAP is a very powerful library that provides several tools
❑ SHAP's plots give us both local and global explanations of the model
❑ Pay attention to data preprocessing, queries and relevance during plot interpretation
54. Future Work
‣ Integration of the SHAP library into Apache Solr
‣ Exploration of other explainability libraries/methods that are more specific to ranking algorithms
55. Our Blog Posts about Explainability:
‣ https://sease.io/2020/07/explaining-learning-to-rank-models-with-tree-shap.html
‣ https://sease.io/2021/02/a-learning-to-rank-project-on-a-daily-song-ranking-problem-part-2.html
Thank You!
Keep an eye on our Blog page, as more is coming!