2. Jennifer Prendki, PhD
Founder and CEO, Alectio
More about me:
• Currently Expert Network @ IIA
• Previously VP of Machine Learning @ Figure Eight, Chief Data Scientist @ Atlassian
• Managed Applied Data Science Research in the Search team @ Walmart Labs
• Have built & scaled ML functions in companies of all sizes
3. ALECTIO’S MISSION:
Sustainable Machine Learning
Helping Machine Learning teams build Machine Learning models with fewer resources (starting with less data)
4. AGENDA
• Data: The New Oil?
• Fatally Unprepared?
• Data At All Costs?
• Insane(ly Good) Machine Learning
• Responsible Data Science
ETHICS IN DATA SCIENCE AND MACHINE LEARNING
5. Data: The New Oil?
WHY WE DATA SCIENTISTS LOVE OUR DATA…
23. Progress… or Global Societal Abuse?
o Disappearance of Privacy
o Automation of Unfairness
o Abuses of the Data Economy
o Malevolent Applications
24. Data At All Costs?
THE IMPACT OF THE BIG DATA ECONOMY ON SOCIETY
25. Datafication:
a modern technological trend turning many aspects of our lives into data which is subsequently transferred into information realized as a new form of value.
27. A Brief History of Data Privacy
Privacy in the News
o Google Street View
o Behavior targeting is targeted
o Facebook Apps harvesting data w/out consent
o Voicemail Hacking
o Facebook & Cambridge Analytica
GDPR
o EU Treaty went into effect
o Creation of the European Data Protection Directive
o Proposal of GDPR Released
o Adoption by the EU Parliament
o GDPR valid
36. Data Labeling and the Gig Economy
The human side of A.I.
o A dependency on human labor
Good side: communities
o A challenge for the workers
A tougher job than it might seem…
Slow human <> job matching
Overall
o Inconsistent quality of work
Error-prone tasks
Subjective tasks
41. Biases All Over the Place…
DATA BIAS
o Labeling Bias
Subjective Labeling Tasks
o Subgroup Validity
Simpson’s Paradox
o Representation
Inappropriate Sampling Strategy
ALGORITHMIC BIAS
o Involuntary
Statistical Stereotyping
o Voluntary
Agenda-Based
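Simpson's Paradox, listed above under Subgroup Validity, is easier to see with numbers. The sketch below is illustrative only and is not from the talk: the counts follow the widely cited kidney-stone treatment example, and the names (outcomes, subgroup_rates, pooled_rate) are hypothetical. It shows a treatment that wins in every subgroup yet loses once the subgroups are pooled.

```python
from fractions import Fraction

# (successes, trials) per treatment and subgroup. Counts follow the widely
# cited kidney-stone example and are used here purely for illustration.
outcomes = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def subgroup_rates(groups):
    """Success rate within each subgroup, as exact fractions."""
    return {name: Fraction(s, n) for name, (s, n) in groups.items()}


def pooled_rate(groups):
    """Success rate after merging all subgroups into one population."""
    total_successes = sum(s for s, _ in groups.values())
    total_trials = sum(n for _, n in groups.values())
    return Fraction(total_successes, total_trials)


by_group = {t: subgroup_rates(g) for t, g in outcomes.items()}
pooled = {t: pooled_rate(g) for t, g in outcomes.items()}

# Treatment A has the higher success rate in each subgroup...
a_wins_every_subgroup = all(
    by_group["A"][g] > by_group["B"][g] for g in ("small", "large")
)
# ...but treatment B has the higher rate once subgroups are pooled.
b_wins_pooled = pooled["B"] > pooled["A"]

for treatment in outcomes:
    rates = ", ".join(
        f"{g}: {float(r):.1%}" for g, r in by_group[treatment].items()
    )
    print(f"{treatment}: {rates}, pooled: {float(pooled[treatment]):.1%}")
print("A wins every subgroup:", a_wins_every_subgroup)
print("B wins pooled:", b_wins_pooled)
```

The reversal comes from unequal subgroup sizes: treatment A is applied mostly to the harder "large" cases, so its pooled rate is dragged down. A model validated only on pooled data can therefore hide the opposite trend in every subgroup.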
53. A Fairer AI Economy
o General Patterns > Granular Insights
o Social Impact > Feasibility
o Human + Machine Collaboration > Competition
o Ethics by Design > Legislation
56. Fairness vs. Biases
• With ML, biases are of the essence… and that’s a good thing!
• (Yes, you read that right!)
• Fairness is not ingrained in Machine Learning
• Machines learn what we humans teach them
• (Yes, even in the case of Reinforcement Learning)
Unfairness ≠ Bias
ML is born of biases, but its societal purpose dies with unfairness
57. Responsible A.I.
o Ethical
o Inclusive (not exclusive to a privileged group)
o No harm to society (no weaponization)
o Centered on the well-being of Society
58. Be the Change you want to see in the World
o Machine Learning will not become fair on its own
ML algorithms are by-products of human-generated data
o Society and politicians are not ready
Uneducated users
No appropriate legislation in place
o The one true prevention of unethical use of data is the Data Community
Congress hearing of Mark Zuckerberg in April 2018