This talk explains why machine learning algorithms are prone to bias, shows concrete examples, and examines regulatory, conceptual, and technical means to address these issues.
8. The Raw Ingredients
Deep understanding of a business problem
Data, data, data
Algorithmic capability
9. Data Product Lifecycle
Design: What problem are we solving?
Data Exploration: What does the data look like?
Data Engineering
Data Processing: Is our data ready for use?
Model Prototyping: Will this work? Should we pivot?
Production: Can we harden and scale the model?
Maintenance: How to update as data changes?
13. 70% of machine learning products use supervised learning
https://www.sas.com/en_ca/insights/analytics/machine-learning.html
14. Supervised Learning
Find a proxy (P) for something hard to know (C)
Find a function that captures the correlation between P and C
Use this function to make guesses about C
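The proxy-and-target framing above can be sketched with a tiny supervised learner. Here is a minimal, illustrative example with hypothetical data: the proxy P is a single numeric feature, the hard-to-know quantity C is the label, and the learned "function" is a least-squares line.

```python
# Minimal supervised-learning sketch (illustrative, hypothetical data):
# P = an observable proxy feature, C = the hard-to-know target.
# We fit a line C ~ a*P + b from labeled examples, then "guess" C for new P.

def fit_line(xs, ys):
    """Ordinary least squares for a single 1-D feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x
    return a, b

# Labeled training data: (proxy P, known outcome C)
xs = [1, 2, 3, 4, 5]
ys = [40, 50, 60, 70, 80]

a, b = fit_line(xs, ys)
predict = lambda x: a * x + b
print(predict(6))  # extrapolate to an unseen proxy value -> 90.0
```

The bias risk discussed in this talk enters exactly here: if the proxy P correlates with group membership, the learned function inherits that correlation.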
24. Classical Statistics & ML
Higher unconscious bias in feature selection
Higher explainability in the model
Deep Learning
Lower unconscious bias in feature selection
Lower explainability in the model
30. • Systems use algorithms to identify negative sentiment
• They perform better with strident, unambiguous expressions of emotion
• Men are more likely to use those expressions
• Men attract disproportionate attention from brands
https://blog.dominodatalab.com/video-how-machine-learning-amplifies-societal-privilege/
31. Bluntness and bias
• Precision vs. recall trade-off
• Marketing wants high precision
• Which implies low recall
• We’re better at identifying extremes
• They’re likely in a particular group
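The precision/recall trade-off on this slide can be made concrete with a small, hypothetical example: a blunt classifier that only flags the most strident cases scores perfect precision but misses half the real positives.

```python
# Precision/recall sketch on hypothetical sentiment labels.
# "Positive" here means "flagged as negative sentiment" by the system.

def precision_recall(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# A blunt, high-precision classifier: it flags only the two most
# strident examples, missing the quieter true positives entirely.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]   # actually negative sentiment
y_pred = [1, 1, 0, 0, 0, 0, 0, 0]   # flags only the loudest cases

p, r = precision_recall(y_true, y_pred)
print(p, r)  # 1.0 0.5 -> perfect precision, poor recall
```

If the "quiet" true positives belong disproportionately to one group, optimizing for precision silently under-serves that group.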
34. Inherent Bias in Word Embeddings
Bolukbasi, Chang, Zou, Saligrama, Kalai, 2016
Man : King :: Woman : Queen
Man : Computer Programmer :: Woman : Homemaker
Black Male : Assaulted :: White Male : Entitled To
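The analogies above come from vector arithmetic over word embeddings. A toy sketch of that mechanism, using hand-made 3-D vectors rather than real embeddings, looks like this:

```python
# Toy illustration of the analogy arithmetic behind the Bolukbasi et al.
# findings: vec("king") - vec("man") + vec("woman") lands near vec("queen").
# These 3-D vectors are hand-made stand-ins, not real word embeddings.

import math

vocab = {
    "man":   [1.0, 0.0, 0.2],
    "woman": [-1.0, 0.0, 0.2],
    "king":  [1.0, 1.0, 0.8],
    "queen": [-1.0, 1.0, 0.8],
    "apple": [0.0, -1.0, 0.0],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def analogy(a, b, c):
    """Vocab word closest to vec(b) - vec(a) + vec(c), excluding inputs."""
    target = [vb - va + vc for va, vb, vc in
              zip(vocab[a], vocab[b], vocab[c])]
    candidates = {w: v for w, v in vocab.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(candidates[w], target))

print(analogy("man", "king", "woman"))  # -> queen
```

Real embeddings trained on web text answer the same arithmetic with "homemaker" and "entitled to": the bias lives in the geometry the corpus induces, not in the arithmetic.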
57. Fair representations: treat similar individuals similarly
http://proceedings.mlr.press/v28/zemel13.html
“We formulate fairness as an optimization
problem of finding an intermediate representation
of the data that best encodes the data (i.e., preserving
as much information about the individual’s attributes
as possible), while simultaneously obfuscates aspects of
it, removing any information about membership with
respect to the protected subgroup.”
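Zemel et al. learn that intermediate representation via an optimization over prototypes; as a much simpler stand-in for the "obfuscate group membership" idea, the sketch below removes the direction separating two groups' feature means. This is an illustration of the goal, not the paper's actual algorithm, and the data is hypothetical.

```python
# Simplified fair-representation sketch: project out the direction along
# which two groups' feature means differ, so that direction no longer
# reveals group membership. NOT the Zemel et al. algorithm, just the idea.

def mean(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def project_out(x, d):
    """Remove the component of x along direction d."""
    dd = sum(di * di for di in d)
    coeff = sum(xi * di for xi, di in zip(x, d)) / dd
    return [xi - coeff * di for xi, di in zip(x, d)]

group_a = [[2.0, 1.0], [3.0, 1.2]]   # hypothetical features, group A
group_b = [[0.0, 1.1], [1.0, 0.9]]   # hypothetical features, group B

# Direction along which the two groups differ on average.
ma, mb = mean(group_a), mean(group_b)
d = [a - b for a, b in zip(ma, mb)]

# After projection the group means coincide: membership is obfuscated
# along d, while the remaining within-group variation is preserved.
fair_a = [project_out(x, d) for x in group_a]
fair_b = [project_out(x, d) for x in group_b]
print(mean(fair_a), mean(fair_b))
```

The paper's formulation trades this obfuscation off against preserving individual attribute information; the projection here makes the trade-off visible in two dimensions.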
60. LOVE PEOPLE
Find opportunities to maximize mutual lifetime value
Respect the principles of contextual integrity
Protect individual and corporate data using differential privacy
Consider the goals of the people affected by the systems we build
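The differential-privacy point above can be grounded with the standard Laplace mechanism: add noise calibrated to a query's sensitivity so that any one individual's presence barely shifts the published answer. This is a textbook sketch with hypothetical parameters, not a production implementation.

```python
# Differential-privacy sketch: the Laplace mechanism adds calibrated
# noise to a count query. A count has sensitivity 1 (one person changes
# it by at most 1), so noise with scale 1/epsilon gives epsilon-DP.

import math
import random

def laplace_noise(scale):
    """Sample Laplace(0, scale) via inverse transform sampling."""
    u = random.random() - 0.5
    sign = 1 if u >= 0 else -1
    return -scale * sign * math.log(1 - 2 * abs(u))

def private_count(records, predicate, epsilon):
    """Count matching records with epsilon-differential privacy."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Hypothetical data: publish roughly how many people are 40 or older,
# without letting the output pin down any one individual.
ages = [23, 35, 41, 29, 52, 60, 18]
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5)
print(round(noisy, 2))  # close to the true count of 3, plus noise
```

Smaller epsilon means more noise and stronger protection; the choice of epsilon is a policy decision, which is why it belongs on a principles slide and not only in the code.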