This document provides an overview of explainability methods for learning to rank models. It introduces the SHAP (SHapley Additive exPlanations) library, a popular game-theory-based method for explaining machine learning models. A case study demonstrates how to use Tree SHAP to generate summary, force, dependence, and decision plots that explain the features influencing a LambdaMART ranking model. Key advantages and limitations of SHAP are discussed, such as its ability to handle different data types on the one hand and its computational expense on the other. Code examples are provided for generating SHAP values and plots in Python.
1. Explainability for Learning to Rank Models
Anna Ruggero, R&D Software Engineer
Ilaria Petreti, IR/ML Engineer
30th March 2021
2. Who We Are - Anna Ruggero
‣ R&D Search Software Engineer
‣ Master's Degree in Computer Science Engineering
‣ Big Data, Information Retrieval
‣ Organist, music lover
3. Who We Are - Ilaria Petreti
‣ Information Retrieval/Machine Learning Engineer
‣ Master's in Data Science
‣ Passionate about Data Mining and Machine Learning technologies
‣ Sports lover
5. XAI - What is Explainability?
“XAI (eXplainable Artificial Intelligence) refers to methods and techniques in the application of artificial intelligence technology such that the results of the solution can be understood by humans”
6. Explainability - Why is it important?
[Chart: the Explainability-Accuracy trade-off — models such as Neural Networks and Forests of Regression Trees achieve high predictive performance at the cost of explainability.]
7. Explainability - Why is it important?
Why and who needs explanations?
• Model debugging
• Increasing detection of bias
• Verifying accuracy
• Building trust in the model's output
• Building social acceptance
• Increasing transparency
• Satisfying regulatory requirements
• Verifying model safety
Who: model builders, end users, public stakeholders
8. Explainability - Why is it important in Information Retrieval?
Understand your Learning to Rank model:
‣ Why is a search result at a certain position?
‣ How does the model calculate the score?
‣ How is a feature affecting the ranking?
‣ How are the feature values affecting the ranking?
‣ Has the model learned any weird behaviour?
10. Explainability - Methods (Post-Hoc Explainability)

GLOBAL - explain the overall model:
‣ PDP: Partial Dependence Plots
‣ ALE: Accumulated Local Effects
‣ ICE: Individual Conditional Expectation
‣ Feature Importance and Permutation Importance (through ELI5; see the sketch after this list)
‣ ….

LOCAL - explain a single prediction:
‣ Anchors
‣ CEM: Contrastive Explanations Method
‣ LIME
‣ SHAP
‣ DeepLIFT
‣ ….
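As a concrete example of one of the global methods above, here is a minimal permutation-importance sketch with ELI5; `model`, `X_val` and `y_val` are hypothetical names for a fitted estimator and a held-out validation set, not part of the original slides.

import eli5
from eli5.sklearn import PermutationImportance

# Shuffle each feature in turn and measure how much the validation
# score drops: a large drop means the feature matters globally.
perm = PermutationImportance(model, random_state=42).fit(X_val, y_val)

# Render the resulting global feature ranking
eli5.show_weights(perm, feature_names=X_val.columns.tolist())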
11. Explainability - Libraries
References of some popular Python libraries for explainability:

Python Library | Type | Links
ELI5 | Model Agnostic | https://eli5.readthedocs.io/en/latest/overview.html
LIME | Model Agnostic | https://github.com/marcotcr/lime
SHAP | Model Agnostic + Specific Explainers | https://github.com/slundberg/shap
Anchors | Model Agnostic | https://github.com/marcotcr/anchor
DeepLIFT | Neural Network | https://github.com/kundajelab/deeplift
14. Explainability - LIME vs SHAP

LIME (Local Interpretable Model-Agnostic Explanations) | SHAP (SHapley Additive exPlanations)
Python Library | Python Library
Local Explainability | Local and Global Explainability
Model Agnostic | KernelExplainer (model agnostic) + optimised explainers
Importance Scores | Importance Scores
Input Perturbation | Particular case of Input Perturbation → Shapley Values
Fast | Computationally Expensive (especially for large datasets)
Not optimised for all model types (e.g. XGBoost) | Not optimised for all model types (e.g. k-NN)
Not designed to work with one-hot encoded data | Handles categorical data
Interfaces for different data types (Tabular, Text and Image) | Handles all data types

❑ Both libraries produce the same type of explanation family: importance scores, which are meant to communicate the relative contribution made by each input feature to a given prediction.
❑ LIME: for a given observation, it generates local perturbations of the features and captures their impact using a linear model => the weights of the linear model are used as feature importance scores.
❑ SHAP: the Shapley value is computed by examining the model output over all possible coalitions of the input features.
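To make the comparison concrete, here is a minimal sketch of how each library is typically invoked on tabular data; `model`, `X_train` and the explained observation are hypothetical names, not part of the original slides.

from lime.lime_tabular import LimeTabularExplainer
import shap

x_row = X_train.iloc[0]   # one observation to explain (hypothetical)

# LIME: perturb the observation locally and fit a linear surrogate model;
# its weights become the feature importance scores
lime_explainer = LimeTabularExplainer(
    X_train.values,
    feature_names=X_train.columns.tolist(),
    mode='regression')
lime_exp = lime_explainer.explain_instance(x_row.values, model.predict, num_features=5)

# SHAP: model-agnostic KernelExplainer over a background sample;
# returns one Shapley value per feature for the same observation
shap_explainer = shap.KernelExplainer(model.predict, shap.sample(X_train, 100))
shap_values = shap_explainer.shap_values(X_train.iloc[[0]])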
22. SHAP - SHapley Additive exPlanations
❑ Game Theory approach
❑ Explains the output of any machine learning model
❑ Inspects model predictions
https://github.com/slundberg/shap
https://towardsdatascience.com/shap-explained-the-way-i-wish-someone-explained-it-to-me-ab81cc69ef30
23. SHAP - Theory
❑ In game theory we have two components:
❑ Game
❑ Players
❑ Applying the game theory approach to explain a machine learning model, these two elements become:
❑ Game → Output of the model for one observation
❑ Players → Features
24. SHAP - Theory
The Shapley value quantifies the contribution that each player brings to the game.
SHAP quantifies the contribution that each feature brings to the prediction made by the model.
25. SHAP - Theory
What is the impact of each feature on the predictions?
❑ Suppose we want to predict the income of a person.
❑ Suppose our model has 3 features: age, gender and job.
❑ We have to consider all the possible combinations of f features (with f going from 0 to 3).
26. SHAP - Theory
❑ The combinations of features can be seen as a tree
❑ Node = one specific combination of features
❑ Edge = the marginal contribution a feature brings when it is added to the combination
27. SHAP - Theory
❑ Possible combinations = 2^n (the size of the power set)
❑ In our example, possible combinations = 2^3 = 8
❑ SHAP trains a distinct model on each of these combinations, so it considers 2^n models
❑ This is too expensive in practice, so SHAP actually implements a variation of this naïve approach
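A quick way to see the 2^n count is to enumerate the power set directly (a minimal illustration, not from the slides):

from itertools import combinations

features = ['age', 'gender', 'job']

# Every subset of the feature set, from the empty set to the full set
subsets = [set(c) for size in range(len(features) + 1)
           for c in combinations(features, size)]

print(len(subsets))  # 2**3 = 8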
28. SHAP - Theory
❑ Suppose we have trained all 8 models
❑ Consider one observation, called x₀
❑ Let's see what each model predicts for the same observation x₀
29. SHAP - Theory
❑ To get the overall contribution of Age in the model, we have to consider the marginal contribution of Age in all the models where Age is present.
30. SHAP - Theory
We have to consider all the edges connecting two nodes such that:
❑ the upper one does not contain Age, and
❑ the bottom one contains Age.
32. SHAP - Theory
❑ The main assumptions are:
❑ The sum of the weights on each row of the tree should be equal: w1 = w2 + w3 = w4
❑ Each weight inside one row of the tree should be equal: w2 = w3
❑ In our example:
❑ w1 = w4 = 1/3
❑ w2 = w3 = 1/6
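Putting the weights and the marginal contributions together, the SHAP value of Age for the observation x₀ is (a worked form of the slide's tree, where f_S denotes the model trained on the feature subset S, with A = Age, G = Gender, J = Job):

\[
\mathrm{SHAP}_{A}(x_0) = \tfrac{1}{3}\left[f_{\{A\}}(x_0) - f_{\emptyset}(x_0)\right]
+ \tfrac{1}{6}\left[f_{\{A,G\}}(x_0) - f_{\{G\}}(x_0)\right]
+ \tfrac{1}{6}\left[f_{\{A,J\}}(x_0) - f_{\{J\}}(x_0)\right]
+ \tfrac{1}{3}\left[f_{\{A,G,J\}}(x_0) - f_{\{G,J\}}(x_0)\right]
\]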
34. SHAP - Theory
❑ In our example, the formula yields:
❑ SHAP_Age(x₀) = -11.33k $
❑ SHAP_Gender(x₀) = -2.33k $
❑ SHAP_Job(x₀) = +46.66k $
❑ Summing them up gives +33k $
❑ This is exactly the difference between the output of the full model (83k $) and the output of the dummy model with no features (50k $).
35. SHAP - Theory
Generalising:
❑ The weight of an edge is the reciprocal of the total number of edges in the same “row”. Or, equivalently, the weight of a marginal contribution to an f-feature model is the reciprocal of the number of possible marginal contributions to all the f-feature models.
❑ Each f-feature model has f marginal contributions (one per feature), so it is enough to count the number of possible f-feature models and multiply it by f.
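This weighting rule can be coded directly. Below is a brute-force sketch of the exact Shapley value; `predict` is a hypothetical callable that returns, for one fixed observation, the output of the model trained on a given feature subset (so it stands in for the 2^n trained models, and the enumeration only works for small n):

from itertools import combinations
from math import factorial

def shap_value(feature, features, predict):
    """Exact Shapley value of `feature`, enumerating all subsets."""
    others = [f for f in features if f != feature]
    n = len(features)
    value = 0.0
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            s = frozenset(subset)
            # Weight of this marginal contribution: |S|! (n - |S| - 1)! / n!
            # (the 1/3 and 1/6 of the 3-feature example above)
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            # Marginal contribution of `feature` when added to subset S
            value += weight * (predict(s | {feature}) - predict(s))
    return value

# e.g. shap_value('age', ['age', 'gender', 'job'], predict)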
39. SHAP - Theory
❑ We have built the formula for calculating the SHAP value of Age in a 3-feature model.
❑ Generalising to any feature and any number of features F, we obtain the formula reported in the article by Lundberg and Lee:
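The formula itself appeared as an image in the slides; it is the classic Shapley value from the SHAP paper, where F is the set of all features, f_S is the model trained on the subset S, and x_S is the observation restricted to S:

\[
\phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,\left(|F| - |S| - 1\right)!}{|F|!}\left[f_{S \cup \{i\}}\!\left(x_{S \cup \{i\}}\right) - f_S\!\left(x_S\right)\right]
\]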
40. Case Study - TREE SHAP
❑ Let's imagine we have a book e-commerce site
❑ Every book is characterised by several features:
❑ Sales in the last week and total sales
❑ Number of reviews and their average
❑ Genre
❑ Price
❑ Author
❑ …
❑ We train an LTR model using the LambdaMART algorithm (see the sketch below)
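As a bridge to slide 50's code (which passes an `xgb_model` to `shap.TreeExplainer`), here is a minimal sketch of how such a ranker might be trained with XGBoost's LambdaMART implementation; `X`, `y` and `group_sizes` are hypothetical names, not part of the original slides.

import xgboost as xgb

# One row per (query, book) pair; y holds the relevance judgements and
# group_sizes lists how many consecutive rows belong to each query.
xgb_model = xgb.XGBRanker(objective='rank:ndcg', n_estimators=200)
xgb_model.fit(X, y, group=group_sizes)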
41. SHAP Plots
Summary Plot (with bars)
❑ Y-axis: most important features for the model
❑ X-axis: mean absolute SHAP value (average impact on the model score)
42. SHAP Plots
Summary Plot
❑ Y-axis: most important features for the model
❑ X-axis: SHAP value (impact on model score)
❑ Feature value shown in color
❑ Each point is a prediction result
43. SHAP Plots
Force Plot
❑ A single model prediction (the result with the lowest relevance)
❑ Model output: -7.01
❑ Impact of each feature
44. SHAP Plots
Force Plot
❑ A single model prediction (the result with the highest relevance)
❑ Model output: -3.25
❑ Impact of each feature
46. SHAP Plots
Dependence Plot
❑ Shows the interaction between two features
❑ Each point is a prediction
❑ X-axis: value of the first feature
❑ Y-axis: impact on the score
❑ Color encodes the value of a second feature
https://slundberg.github.io/shap/notebooks/plots/dependence_plot.html
48. SHAP Plots
Decision Plot
❑ How the prediction changes during the decision process
❑ Y-axis: feature names
❑ X-axis: output of the model
❑ Each row shows the impact of each feature
❑ Each vertical line is a prediction
❑ Feature values are shown in brackets ()
50. SHAP - Python Code

import shap
import matplotlib.pyplot as plt

# Tree Explainer
explainer = shap.TreeExplainer(xgb_model)

# SHAP values
shap_values = explainer.shap_values(training_data_set)

# To save plots
plt.savefig(images_path + '/summary_plot.png', bbox_inches='tight')
plt.close()

https://github.com/slundberg/shap
https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.html
51. SHAP - Python Code

# Summary Plot
shap.summary_plot(shap_values, training_data_set)

# Summary Plot with bars
shap.summary_plot(shap_values, training_data_set, plot_type="bar")

# Decision Plot (all observations AND a single observation)
shap.decision_plot(explainer.expected_value, shap_values, training_data_set,
                   feature_names=training_data_set.columns.tolist(), ignore_warnings=True)
shap.decision_plot(explainer.expected_value, shap_values[0], training_data_set.iloc[0],
                   feature_names=training_data_set.columns.tolist(), ignore_warnings=True)

# Force Plot (all observations AND a single observation)
html_img = shap.force_plot(explainer.expected_value, shap_values, training_data_set)
shap.force_plot(explainer.expected_value, shap_values[0], training_data_set.iloc[0], matplotlib=True)

# Dependence Plot (feature_to_analyze is the name of the feature on the x-axis)
if 'is_genre_fantasy' in training_data_set.columns:
    shap.dependence_plot(feature_to_analyze, shap_values, training_data_set,
                         interaction_index='is_genre_fantasy')

https://github.com/slundberg/shap
https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.html
52. SHAP - Warnings
❑ Pay attention to data preprocessing
❑ SHAP doesn't consider queries:
extract all the interactions with the same query and then generate the plot (see the sketch below)
❑ The output (score) of the model is NOT the relevance label:
it expresses the relative relevance between products
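For the second warning, a minimal sketch of plotting only the rows that belong to a single query; `query_ids` (a per-row array of query identifiers aligned with `training_data_set`) and `some_query_id` are hypothetical names, not part of the original slides.

import numpy as np

# Keep only the interactions of one query before plotting
mask = np.asarray(query_ids) == some_query_id
shap.summary_plot(shap_values[mask], training_data_set[mask])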
53. Conclusion
❑ Explainability is useful to understand the model behaviour
❑ There are several methods for explainability
❑ SHAP is a very powerful library that provides several tools
❑ SHAP's plots give us both local and global explanations of the model
❑ Pay attention to data preprocessing, queries and relevance during plot interpretation
54. Future Work
‣ Integration of the SHAP library into Apache Solr
‣ Exploration of other explainability libraries/methods that are more specific to ranking algorithms
55. Our Blog Posts about Explainability:
‣ https://sease.io/2020/07/explaining-learning-to-rank-models-with-tree-shap.html
‣ https://sease.io/2021/02/a-learning-to-rank-project-on-a-daily-song-ranking-problem-part-2.html
Thank You!
Keep an eye on our Blog page, as more is coming!