Transparency in ML and AI (humble views from a concerned academic)
1.
Dr. Paolo Missier
School of Computing
Newcastle University
Innovation Opportunity of the GDPR for AI and ML
Digital Catapult London,
March 2nd, 2018
2.
My current favourite book
How much of Big Data is My Data?
Is Data the problem?
Or the algorithms?
Or how much we trust them?
Is there a problem at all?
3.
What matters?
Decisions made by processes based on algorithmically-generated
knowledge: Knowledge-Generating Systems (KGS)
• automatically filtering job applicants
• approving loans or other credit
• approving access to benefits schemes
• predicting insurance risk levels
• user profiling for policing purposes and to predict risk of criminal
recidivism
• identifying health risk factors
• …
4.
GDPR and algorithmic decision making
Profiling is “any form of automated processing of personal data consisting of the use
of personal data to evaluate certain personal aspects relating to a natural person”
Thus profiling should be construed as a subset of processing, under two conditions:
the processing is automated, and the processing is for the purposes of evaluation.
Article 22: Automated individual decision-making, including profiling, paragraph
1 (see figure 1) prohibits any “decision based solely on automated processing,
including profiling” which “significantly affects” a data subject.
It stands to reason that an algorithm can only be explained if the trained model can be
articulated and understood by a human. It is reasonable to suppose that any adequate
explanation would provide an account of how input features relate to predictions:
- Is the model more or less likely to recommend a loan if the applicant is a minority?
- Which features play the largest role in prediction?
B. Goodman and S. Flaxman, “European Union regulations on algorithmic decision-making and a ‘right to explanation,’”
Proc. 2016 ICML Work. Hum. Interpret. Mach. Learn. (WHI 2016), Jun. 2016.
5.
Heads up on the key questions:
• [to what extent, at what level] should lay people be educated about
algorithmic decision making?
• What mechanisms would you propose to engender trust in
algorithmic decision making?
• With regards to trust and transparency, what should Computer
Science researchers focus on?
• What kind of inter-disciplinary research do you see?
6.
Recidivism Prediction Instruments (RPI)
• Increasingly popular within the criminal justice system
• Used or considered for use in pre-trial decision-making (USA)
Social debate and scholarly arguments…
Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner. Machine bias: There’s software used
across the country to predict future criminals. And it’s biased against blacks. ProPublica, 2016.
https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
Black defendants who did not recidivate over a two-year period were
nearly twice as likely to be misclassified as higher risk compared to
their white counterparts (45 percent vs. 23 percent).
White defendants who re-offended within the next two years were
mistakenly labeled low risk almost twice as often as black re-offenders
(48 percent vs. 28 percent).
A. Chouldechova, “Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction
Instruments,” Big Data, vol. 5, no. 2, pp. 153–163, Jun. 2017.
In this paper we show that the differences in false positive and false negative rates
cited as evidence of racial bias in the ProPublica article are a direct consequence of
applying an instrument that is free from predictive bias to a population in which
recidivism prevalence differs across groups.
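The disparity at the heart of this debate is just a per-group comparison of false positive and false negative rates. A minimal sketch of that computation, on entirely synthetic numbers (the data below does not reproduce the COMPAS analysis):

```python
# Group-wise error rates for a binary risk classifier.
# All numbers are synthetic -- they do not reproduce the COMPAS analysis.

def error_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate)."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    return fp / negatives, fn / positives

# Outcomes (1 = re-offended) and predictions (1 = labelled high risk),
# split by a hypothetical group attribute.
groups = {
    "group_a": ([0, 0, 0, 1, 1, 0, 1, 0], [1, 0, 1, 1, 0, 1, 1, 0]),
    "group_b": ([0, 0, 1, 1, 0, 1, 0, 0], [0, 0, 1, 0, 0, 0, 0, 1]),
}

for name, (y_true, y_pred) in groups.items():
    fpr, fnr = error_rates(y_true, y_pred)
    print(f"{name}: FPR = {fpr:.2f}, FNR = {fnr:.2f}")
```

As Chouldechova shows, matching predictive accuracy across groups does not force these per-group error rates to match when recidivism prevalence differs between the groups.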
7.
Opacity
J. Burrell, “How the machine ‘thinks’: Understanding opacity in machine learning algorithms,” Big Data
Soc., vol. 3, no. 1, p. 2053951715622512, 2016.
Three forms of opacity:
1- intentional corporate or state secrecy: institutional self-protection
2- opacity as technical illiteracy: writing (and reading) code is a specialist skill
• One proposed response is to make code available for scrutiny, through regulatory
means if necessary
3- mismatch between the mathematical optimization in high dimensionality characteristic of
machine learning and the demands of human-scale reasoning and styles of semantic
interpretation
“Ultimately partnerships between legal scholars, social scientists, domain experts,
along with computer scientists may chip away at these challenging questions of
fairness in classification in light of the barrier of opacity”
8.
But, is research focusing on the right problems?
Research and innovation:
React to threats,
Spot opportunities…
10.
Interpretability (of machine learning models)
Z. C. Lipton, “The Mythos of Model Interpretability,” Proc. 2016 ICML Work. Hum. Interpret. Mach.
Learn. (WHI 2016), Jun. 2016.
- Transparency
- Are features understandable?
- Which features are more important?
- Post hoc interpretability
- Natural language explanations
- Visualisations of models
- Explanations by example
- “this tumor is classified as malignant
because to the model it looks a lot like
these other tumors”
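The last bullet, explanation by example, can be sketched very simply: justify a prediction by retrieving the training instances most similar to the one being classified. Everything below (the feature vectors, the labels, the choice of Euclidean distance) is invented for illustration:

```python
# "Explanation by example": present the training instances most similar
# to the instance being classified. All data here is made up.

def nearest_examples(x, training, k=2):
    """Return the k (features, label) pairs closest to x (Euclidean)."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
    return sorted(training, key=lambda item: dist(item[0], x))[:k]

training = [
    ([0.90, 0.80], "malignant"),
    ([0.85, 0.75], "malignant"),
    ([0.10, 0.20], "benign"),
]

query = [0.88, 0.79]  # "this tumor looks a lot like these other tumors"
for features, label in nearest_examples(query, training):
    print(label, features)
```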
11.
“Why Should I Trust You?”
M. T. Ribeiro, S. Singh, and C. Guestrin, “‘Why Should I Trust You?’ : Explaining the Predictions of Any Classifier,” in Proceedings of
the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’16, 2016, pp. 1135–1144.
Interpretability of model predictions has become a hot research topic in Machine Learning
“if the users do not trust a model or a prediction,
they will not use it”
By “explaining a prediction”, we mean presenting textual or visual artifacts that provide qualitative
understanding of the relationship between the instance’s components and the model’s prediction.
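Ribeiro et al.'s LIME makes this concrete by fitting a simple, interpretable model to the black box's behaviour in the neighbourhood of one instance. Below is a stdlib-only sketch of that idea, not the authors' implementation: the toy black box, the switch-features-off perturbation scheme, and the proximity kernel are all simplifications chosen for illustration.

```python
import math
import random

# Toy black box: an opaque scorer over 3 features. In reality this
# would be any trained classifier that we can only query.
def black_box(x):
    return 1.0 if (2.0 * x[0] + x[1] * x[2]) > 1.5 else 0.0

def local_surrogate(f, x, n_samples=500, sigma=1.0, seed=0):
    """LIME-style local explanation: perturb x by switching features
    off, weight samples by proximity to x, and fit a weighted linear
    surrogate. Returns one coefficient per feature."""
    rng = random.Random(seed)
    d = len(x)
    masks, labels, weights = [], [], []
    for _ in range(n_samples):
        mask = [rng.randint(0, 1) for _ in range(d)]   # 1 = keep feature
        z = [xi * m for xi, m in zip(x, mask)]         # perturbed instance
        dist2 = d - sum(mask)                          # features dropped
        masks.append(mask)
        labels.append(f(z))
        weights.append(math.exp(-dist2 / sigma ** 2))  # proximity kernel
    # Weighted least squares via plain gradient descent (stdlib only).
    coef, bias, lr, wsum = [0.0] * d, 0.0, 0.1, sum(weights)
    for _ in range(2000):
        g_coef, g_bias = [0.0] * d, 0.0
        for m, y, w in zip(masks, labels, weights):
            err = bias + sum(c * mj for c, mj in zip(coef, m)) - y
            g_bias += w * err
            for j in range(d):
                g_coef[j] += w * err * m[j]
        bias -= lr * g_bias / wsum
        coef = [c - lr * g / wsum for c, g in zip(coef, g_coef)]
    return coef

# Explain the prediction for one instance: the surrogate should assign
# most of the local importance to feature 0.
print(local_surrogate(black_box, [1.0, 1.0, 1.0]))
```

The surrogate coefficients answer, locally and qualitatively, the question "which of this instance's components pushed the prediction?"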
12.
Explaining image classification
M. T. Ribeiro, S. Singh, and C. Guestrin, “‘Why Should I Trust You?’ : Explaining the Predictions of Any Classifier,” in Proceedings of
the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’16, 2016, pp. 1135–1144.
14.
Features
Volume: how many features contribute to the prediction?
Meaning: how suitable are the features for human interpretation?
• Raw (low-level, non-semantic) signals, such as image pixels
• Deep learning
• Visualisation: the occlusion test
• Cases: object recognition, medical diagnosis
• Many features (thousands is too many)
• Few, high-level features -- is this the only option?
15.
Occlusion test for CNNs
Kermany et al., “Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning,”
Cell, 2018.
Zeiler, et al., Visualizing and Understanding Convolutional Networks, ECCV 2014
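The occlusion test itself is straightforward: slide a masking patch across the input and record how much the model's output drops at each position; large drops mark the regions the prediction depends on. A toy sketch (the "model" and "image" below are stand-ins, not a real CNN):

```python
# Occlusion test sketch (after Zeiler & Fergus): mask each region of
# the input in turn and measure the resulting drop in the model score.

def toy_model(image):
    # Stand-in for a CNN: pretend the class evidence lives in the
    # top-left 2x2 corner of the image.
    return sum(image[r][c] for r in range(2) for c in range(2))

def occlusion_map(model, image, patch=2, fill=0.0):
    h, w = len(image), len(image[0])
    base = model(image)
    heat = [[0.0] * (w - patch + 1) for _ in range(h - patch + 1)]
    for r in range(h - patch + 1):
        for c in range(w - patch + 1):
            occluded = [row[:] for row in image]    # copy the image
            for dr in range(patch):
                for dc in range(patch):
                    occluded[r + dr][c + dc] = fill # mask the patch
            heat[r][c] = base - model(occluded)     # score drop here
    return heat

image = [[1.0] * 4 for _ in range(4)]
for row in occlusion_map(toy_model, image):
    print(["%.1f" % v for v in row])
```

The resulting heat map is largest over the top-left corner, i.e. exactly where this toy model looks, which is what the technique is designed to reveal.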
16.
Attribute Learning: a layer for semantic attributes
Neeraj Kumar, Alexander C. Berg, Peter N. Belhumeur, and Shree K. Nayar, "Describable Visual Attributes for Face
Verification and Image Search,” IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI),
vol. 33, no. 10, pp. 1962--1977, October 2011.
17.
Can we control inferences made about us?
Facebook’s problem (shared by many other marketing companies):
personal characteristics are often hard to observe, because of lack of data or
privacy restrictions.
Solution: firms and governments increasingly depend on statistical inferences
drawn from available information.
Goals of the research:
- How to give online users transparency into why certain inferences are
made about them by statistical models
- How to inhibit those inferences by hiding (“cloaking”) certain personal
information from inference
D. Chen, S. P. Fraiberger, R. Moakler, and F. Provost, “Enhancing Transparency and Control when Drawing Data-Driven
Inferences about Individuals,” in 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI
2016), 2016, pp. 21–25.
privacy invasions via statistical inferences are at least as
troublesome as privacy invasions based on revealing personal data
18.
“Cloaking”
Which “evidence” in the input feature vectors is critical to make an accurate prediction?
Evidence counterfactual: “what would the model have done if this evidence hadn’t been
present?”
Not an easy problem!
(Figure: User 1 greatly affected; User 2 unaffected.)
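A greedy sketch of the cloaking idea, assuming (purely for illustration) a linear scorer over binary evidence items: repeatedly hide the highest-weight item until the score falls below the decision threshold. The weights, feature names, and threshold below are all invented.

```python
# Evidence-counterfactual / cloaking sketch (after Chen et al.):
# greedily hide the evidence item that most reduces the model's score
# until the prediction flips. The model and data are made up.

def score(features, weights):
    # Toy linear scorer over a set of binary evidence items.
    return sum(weights.get(f, 0.0) for f in features)

def cloak(features, weights, threshold):
    """Return a greedy set of evidence items to hide so the score
    falls below the decision threshold (None if impossible)."""
    visible = set(features)
    hidden = []
    while score(visible, weights) >= threshold:
        # Pick the remaining item whose removal lowers the score most.
        candidate = max(visible, key=lambda f: weights.get(f, 0.0),
                        default=None)
        if candidate is None or weights.get(candidate, 0.0) <= 0.0:
            return None  # hiding more evidence cannot flip the decision
        visible.remove(candidate)
        hidden.append(candidate)
    return hidden

weights = {"likes_page_a": 2.0, "likes_page_b": 1.5, "likes_page_c": 0.4}
user = {"likes_page_a", "likes_page_b", "likes_page_c"}
print(cloak(user, weights, threshold=2.0))
```

For a real, non-linear model the counterfactual is much harder, since removing one piece of evidence changes the contribution of the rest, which is why the slide calls it "not an easy problem".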
20.
AI Guardians
A. Etzioni and O. Etzioni, “Designing AI Systems That Obey Our Laws and Values,” Commun. ACM, vol.
59, no. 9, pp. 29–31, Aug. 2016.
Operational AI systems (for example, self-driving cars) need to obey
both the law of the land and our values.
Why do we need oversight systems?
- AI systems learn continuously, so they change over time
- AI systems are becoming opaque
- “black boxes” to human beings
- AI-guided systems have increasing autonomy
- they make choices “on their own.”
“a major mission for AI is to develop in the near future such AI oversight systems”
- Auditors
- Monitors
- Enforcers
- Ethics bots!
21.
AI accountability – your next Pal?
Asked where AI systems are weak today, Veloso (*) says they should be more
transparent. "They need to explain themselves: why did they do this, why did
they do that, why did they detect this, why did they recommend that?
Accountability is absolutely necessary."
(*) Manuela Veloso, head of the Machine Learning Department at Carnegie Mellon University
Gary Anthes. 2017. Artificial intelligence poised to ride a new wave. Commun. ACM 60, 7 (June 2017), 19-21.
DOI: https://doi.org/10.1145/3088342
IBM's Witbrock echoes the call for humanism in AI: …"It's an embodiment of a
human dream of having a patient, helpful, collaborative kind of companion."
22.
A personal view
Hypothesis:
it is technically practical to provide a limited and IP-preserving degree of
transparency by surrounding and augmenting a black-box KGS with
metadata that describes the nature of its input, training and test data, and
can therefore be used to automatically generate explanations that can be
understood by lay persons.
Knowledge-Generating Systems (KGS)
…It’s the meta-data, stupid (*)
(*) https://en.wikipedia.org/wiki/It%27s_the_economy,_stupid
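Purely as an illustration of this hypothesis, a KGS profile could be pictured as structured metadata surrounding the black box. Every field name and value below is invented; the point is only that the profile describes inputs, training data, and evaluation without exposing the model itself:

```python
# Illustrative only: what a machine-readable "KGS profile" -- metadata
# surrounding a black-box Knowledge-Generating System -- might contain.
# All field names and values here are invented.
kgs_profile = {
    "system": "loan-approval-classifier",
    "algorithm_family": "gradient-boosted trees",  # high-level, IP-preserving
    "training_data": {
        "description": "loan applications, 2012-2017",
        "size": 250_000,
        "features": ["income", "employment_length", "credit_history"],
        "known_gaps": ["applicants under 21 under-represented"],
    },
    "evaluation": {
        "test_accuracy": 0.87,
        "audited_for": ["disparate impact"],
    },
    "disclosure_policy": "summary statistics only",
}
print(kgs_profile["algorithm_family"])
```

An explanation generator could render such fields into lay-readable statements ("this decision was made by a system trained on loan applications from 2012-2017...") without ever opening the model.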
23.
Something new to try, perhaps?
Fig. 1 (architecture sketch): two Knowledge-Generating Systems, KGS 1 (e.g. pensions) and KGS 2 (e.g. health), each operate over their own background (Big) data and model, subject to a disclosure policy. Each publishes a limited KGS profile (a descriptive summary of its background data and a high-level characterisation of its algorithm) to a secure ledger (blockchain), together with user instances and classifications. An Explanation Service acts as an infomediary between the KGSs and the users: drawing on a shared vocabulary and metadata, it delivers contextualised classifications and supports an informed co-decision process to which users contribute their own data.
24.
References (to take home)
• Gary Anthes. 2017. Artificial intelligence poised to ride a new wave. Commun. ACM 60, 7 (June 2017), 19-21. DOI:
https://doi.org/10.1145/3088342
• J. Burrell, “How the machine ‘thinks’: Understanding opacity in machine learning algorithms,” Big Data Soc., vol. 3, no. 1, p.
2053951715622512, 2016
• Caruana, Rich, Lou, Yin, Gehrke, Johannes, Koch, Paul, Sturm, Marc, and Elhadad, Noemie. Intelligible models for healthcare:
Predicting pneumonia risk and hospital 30-day readmission. In KDD, 2015
• D. Chen, S. P. Fraiberger, R. Moakler, and F. Provost, “Enhancing Transparency and Control when Drawing Data-Driven
Inferences about Individuals,” in 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), 2016,
pp. 21–25.
• A. Chouldechova, “Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments,” Big Data, vol. 5,
no. 2, pp. 153–163, Jun. 2017.
• A. Etzioni and O. Etzioni, “Designing AI Systems That Obey Our Laws and Values,” Commun. ACM, vol. 59, no. 9, pp. 29–31, Aug.
2016.
• B. Goodman and S. Flaxman, “European Union regulations on algorithmic decision-making and a ‘right to explanation,’” Proc.
2016 ICML Work. Hum. Interpret. Mach. Learn. (WHI 2016), Jun. 2016.
• N. Kumar, A. C. Berg, P. N. Belhumeur, and S. K. Nayar, “Describable Visual Attributes for Face Verification and Image
Search,” IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 33, no. 10, pp. 1962–1977, Oct. 2011.
• Z. C. Lipton, “The Mythos of Model Interpretability,” Proc. 2016 ICML Work. Hum. Interpret. Mach. Learn. (WHI 2016), Jun. 2016.
• M. T. Ribeiro, S. Singh, and C. Guestrin, “‘Why Should I Trust You?’ : Explaining the Predictions of Any Classifier,” in Proceedings
of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’16, 2016, pp. 1135–1144.
• M. D. Zeiler and R. Fergus, “Visualizing and Understanding Convolutional Networks,” in ECCV, 2014.
25.
Questions to you:
• [to what extent, at what level] should lay people be educated about
algorithmic decision making?
• What mechanisms would you propose to engender trust in
algorithmic decision making?
• With regards to trust and transparency, what should Computer
Science researchers focus on?
• What kind of inter-disciplinary research do you see?
26.
Scenarios
What kind of explanations would you request / expect / accept?
• My application for benefits has been denied but I am not sure why
• My insurance premium is higher than my partner’s, and it’s not clear
why
• My work performance has been deemed unsatisfactory, but I don’t
see why
• [can you suggest other scenarios close to your experience?]
Speaker notes
Individuals as well as businesses, which we will initially refer to as subjects (and later upgrade to active participants), increasingly find themselves at the receiving end of impactful decisions made by organisations on their behalf, based on processes that use algorithmically-generated knowledge.
3. We cannot look at the code directly for many important algorithms of classification that are in widespread use. This opacity (at one level) exists because of proprietary concerns. They are closed in order to maintain competitive advantage and/or to keep a few steps ahead of adversaries. Adversaries could be other companies in the market or malicious attackers (relevant in many network security applications). However, it is possible to investigate the general computational designs that we know these algorithms use by drawing from educational materials.
Machine learning models that prove useful (specifically, in terms of the ‘accuracy’ of classification) possess a degree of unavoidable complexity
In a ‘Big Data’ era, billions or trillions of data examples and thousands or tens of thousands of properties of the data (termed ‘features’ in machine learning) may be analyzed. The internal decision logic of the algorithm is altered as it ‘learns’ on training data. Handling a huge number of (especially heterogeneous) properties of data (i.e. not just words in spam email, but also email header info) adds complexity to the code. …
Brings about the issue of trust in the models.
Should I use the prediction?
“Determining trust in individual predictions is an important problem when the model is used for decision making. When using machine learning for medical diagnosis [6] or terrorism detection, for example, predictions cannot be acted upon on blind faith, as the consequences may be catastrophic”
Notes for Paolo: by checking significant performance decrease for masks in different locations
information disclosed on social network sites (such as Facebook) can be used to predict personal characteristics with surprisingly high accuracy
We introduce the idea of a “cloaking device” as a vehicle to offer users control over inferences,
Kim, Been. Interactive and interpretable machine learning models for human machine collaboration. PhD thesis, Massachusetts Institute of Technology, 2015.