Artificial intelligence and machine learning models are increasingly available, but many offer predictions that are difficult to understand, evaluate, and ultimately act upon. We present how scalable interactive visualization can amplify people's ability to understand and interact with large-scale data and complex models. We sample from projects where interactive visualization has provided key leaps of insight: increased explorability of models trained on millions of instances (ActiVis, deployed at Facebook); increased usability of state-of-the-art AI for non-experts (GAN Lab, open-sourced with Google Brain; it went viral!); and our latest work, Summit, the first interactive system that scalably summarizes and visualizes what features a deep learning model has learned and how those features interact to make predictions. We conclude by highlighting the next visual analytics research frontiers in AI.
Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribution
1. Human-Centered AI: Scalable Interactive Tools for Interpretation & Attribution
Polo Chau
Georgia Tech
Associate Professor, Computational Science & Engineering
ML Area Leader, College of Computing
Associate Director, MS Analytics
2. AI + HI: Artificial Intelligence + Human Intelligence
Scalable interactive tools to make sense of
complex large-scale datasets and models
Polo Club of Data Science
3. Polo Club of Data Science | poloclub.github.io
Research areas: Human-Centered AI, Cyber Security, Adversarial ML, Social Good & Health, Large Graph Mining & Visualization
4. Why should we make AI more human-centered?
Accessible & Interpretable
for people who build and use machine learning
8. Our ShapeShifter Attack: Stop Sign → Person
Spotlighted in new DARPA GARD program [PKDD'18; with Intel]
Real stop sign vs. printed adversarial stop sign
10. SHIELD: Fast Practical Defense for Deep Learning via JPEG Compression
[KDD'18 Audience Appreciation Award runner-up; with Intel]
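SHIELD's core intuition is that lossy JPEG compression removes the high-frequency perturbations an adversary adds. Below is a minimal, hypothetical sketch of that idea in pure numpy: blockwise DCT, coarse quantization, inverse DCT. The real JPEG pipeline (and SHIELD, which additionally randomizes compression levels) does much more; this only illustrates the step that washes out fine perturbations. All names and values here are illustrative, not from the SHIELD codebase.

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis (the transform JPEG applies per 8x8 block).
    k = np.arange(n)
    M = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    M[0] *= 1 / np.sqrt(2)
    return M * np.sqrt(2 / n)

D = dct_matrix()

def jpeg_like(channel, q=40):
    """Blockwise DCT -> coarse quantization -> inverse DCT.
    Assumes a single channel whose dimensions are multiples of 8."""
    h, w = channel.shape
    out = np.empty((h, w), dtype=float)
    for i in range(0, h, 8):
        for j in range(0, w, 8):
            block = channel[i:i + 8, j:j + 8].astype(float) - 128
            coef = D @ block @ D.T
            coef = np.round(coef / q) * q          # uniform quantization
            out[i:i + 8, j:j + 8] = D.T @ coef @ D + 128
    return np.clip(np.round(out), 0, 255).astype(np.uint8)

# A random "image" channel: heavy quantization should alter its pixels.
x = np.random.default_rng(0).integers(0, 256, size=(32, 32), dtype=np.uint8)
y = jpeg_like(x, q=40)
```

In a defense pipeline, the classifier would be fed `y` instead of `x`, so perturbations living in the discarded high-frequency coefficients never reach the model.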
11. We do NOT know why AI attacks and defenses work.
Actually, nobody does.
(e.g., which neurons are under attack?)
23. Challenge 2: How to scale up to large datasets?
24. Challenge 3: How to make tools easy to use for various users?
NOVICES · PRACTITIONERS · EXPERTS
[Hohman, Kahng, Pienta, Chau, TVCG, 2018]
25. Our Research: Human-Centered AI by Visual Analytics
ActiVis: visualization for industry-scale models; activation analysis by subsets
MLCube: model comparison
GAN Lab: interactive learning of complex models; experimentation with GANs
31. ActiVis: Visualization for Industry-Scale Deep Models
Research challenges:
1. Many model parameters to visualize
2. Many data instances to analyze
3. Intensive computation for deployment
32. Challenge #1: How to visualize many model parameters?
INPUT → MODEL (many layers) → OUTPUT
Input: "Where is Mercedes-Benz Stadium located?"
Output: Location 81%, Number 11%, Person 8%
Observation: No need to show everything; some parts are particularly useful.
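The observation above, showing only the "particularly useful" parts of a model instead of everything, can be sketched as a simple ranking step. This is a hypothetical illustration, not ActiVis's actual selection logic: given an activation matrix for one layer, keep only the k neurons with the highest mean absolute activation.

```python
import numpy as np

# Hypothetical activations: rows = data instances, cols = neurons in one layer.
acts = np.random.default_rng(1).normal(size=(1000, 512))

def top_k_neurons(acts, k=10):
    """Rank neurons by mean |activation| and keep the top k,
    so a view shows the most active units rather than all 512."""
    score = np.abs(acts).mean(axis=0)
    return np.argsort(score)[::-1][:k]

idx = top_k_neurons(acts)
```

A visualization front-end would then render only the columns in `idx`, keeping the display size constant no matter how wide the layer is.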
35. Challenge #2: How to analyze many data instances?
Observation: two complementary analytics patterns
INSTANCE-LEVEL: How does the model respond to individual instances? Useful for debugging.
SUBSET-LEVEL: How does the model behave at a higher-level categorization (e.g., by topic)? Useful for large datasets.
39. Our Research: Human-Centered AI by Visual Analytics
ActiVis: visualization for industry-scale models; activation analysis by subsets
MLCube: model comparison
GAN Lab: interactive learning of complex models; experimentation with GANs
41. Challenge: Model Selection
Which model to use?

                   Baseline Model   New Model
  overall accuracy     89.5%          90.1%
  Age 13-19            92.0%          97.0%
  Age 20-39            87.0%          69.0%
42. Comparison by Subsets with Data Cube

  age    gender  country   Model A accuracy   Model B accuracy
  *      *       *             89.5%              90.1%
  13-19  *       *             92.0%              97.0%
  20-39  *       *             87.0%              69.0%
  *      F       *             89.6%              89.9%
  13-19  F       *             91.0%              93.0%
  13-19  F       USA           91.1%              94.0%
  *      M       USA           75.5%              74.0%
  20-39  M       Canada        87.2%              73.7%

How to scale to a very large number of possible subsets?
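The data-cube idea in the table above can be sketched in a few lines: every cell of the cube fixes some dimensions to concrete values and wildcards the rest, and the two models' accuracies are compared per cell. The records and values below are made up purely to exercise the enumeration; this is not MLCube's implementation.

```python
from itertools import product

# Toy prediction log: (age_group, gender, country, model_A_correct, model_B_correct).
records = [
    ("13-19", "F", "USA",    1, 1),
    ("13-19", "M", "Canada", 1, 0),
    ("20-39", "F", "USA",    0, 1),
    ("20-39", "M", "Canada", 1, 0),
]

# Each dimension is either a concrete value or the wildcard '*'.
dims = (["13-19", "20-39", "*"], ["F", "M", "*"], ["USA", "Canada", "*"])

def matches(rec, cell):
    return all(c == "*" or c == v for c, v in zip(cell, rec[:3]))

# Enumerate every cube cell and compute both models' accuracies on it.
results = {}
for cell in product(*dims):
    group = [r for r in records if matches(r, cell)]
    if group:
        results[cell] = (sum(r[3] for r in group) / len(group),
                         sum(r[4] for r in group) / len(group))
```

The scaling question follows directly from this loop: the number of cells is the product of (cardinality + 1) over all dimensions, so with many high-cardinality attributes exhaustive enumeration quickly becomes infeasible.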
46. Research Contributions from ActiVis & MLCube
How can we scale visualization to industry-scale models and data?
1. Exploration from overview to details
2. Drilling down into specific parts of data
3. Combination of scalable & interactive methods
(Under review: FairVis: Visual Analytics for Discovering Intersectional Bias in Machine Learning, which discovers interesting subsets automatically)
47. Our Research: Human-Centered AI by Visual Analytics
ActiVis: visualization for industry-scale models; activation analysis by subsets
MLCube: model comparison
GAN Lab: interactive learning of complex models; experimentation with GANs
48. GAN Lab
Interactive understanding of
complex deep learning models
PAIR | People + AI Research Initiative
[Kahng, et al. IEEE VIS’18]
51. Generative Adversarial Networks (GANs)
"the most interesting idea in the last 10 years in ML" - Yann LeCun
Face images generated by BEGAN [Berthelot et al., 2017]
53. Why are GANs hard?
A GAN uses two competing neural networks:
Discriminator: spots fakes (the police spotting fake bills)
Generator: synthesizes outputs (the counterfeiter making fake bills)
59. GAN Lab Research Challenges
Can we design an interactive tool for GANs?
1. Conceptual understanding of GANs
2. Interactive model training
3. Easily accessible for students
61. What type of data to visualize?
2D distributions, instead of high-dimensional images
Discriminator (Police), Generator (Counterfeiter)
Why 2D data points?
1. To focus on GANs' main concepts
2. To easily visualize data distributions
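Working over 2D points keeps the whole adversarial game small enough to write out by hand. As a hypothetical sketch (not GAN Lab's TensorFlow.js code), the loop below trains a linear generator against a logistic-regression discriminator on a 2D Gaussian blob, alternating the two updates that the police/counterfeiter analogy describes; the gradients are derived manually, which only works because both networks are this tiny.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-np.clip(s, -30, 30)))

def sample_real(n):
    # "Real" data: a 2D Gaussian blob centered at (2, 2).
    return rng.normal(loc=2.0, scale=0.5, size=(n, 2))

# Generator ("counterfeiter"): linear map from 2D noise to 2D points.
A = 0.1 * rng.normal(size=(2, 2))
c = np.zeros(2)
# Discriminator ("police"): logistic regression over 2D points.
w = np.zeros(2)
b = 0.0

lr = 0.05
for step in range(2000):
    z = rng.normal(size=(64, 2))
    fake = z @ A.T + c
    real = sample_real(64)

    # Discriminator step: push D(real) -> 1 and D(fake) -> 0.
    for x, target in ((real, 1.0), (fake, 0.0)):
        p = sigmoid(x @ w + b)
        grad_logit = p - target              # dBCE/dlogit per sample
        w -= lr * grad_logit @ x / len(x)
        b -= lr * grad_logit.mean()

    # Generator step: push D(fake) -> 1 (non-saturating generator loss).
    p = sigmoid(fake @ w + b)
    grad_x = np.outer(p - 1.0, w)            # d(-log D)/dx per fake point
    A -= lr * (grad_x.T @ z) / len(z)
    c -= lr * grad_x.mean(axis=0)

# The generated points' mean should drift toward the real blob at (2, 2).
fake = rng.normal(size=(500, 2)) @ A.T + c
```

Because everything is 2D, both the fake samples and the discriminator's decision surface can be plotted directly, which is exactly what makes the setting suitable for an educational tool.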
79. How to visualize the discriminator?
A 2D heatmap representing the binary classification:
data points in one region are likely real; data points in the other are likely fake.
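The heatmap itself is just the discriminator evaluated on a dense grid. A minimal sketch, assuming a hypothetical logistic discriminator in place of GAN Lab's trained network:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# Hypothetical discriminator: a logistic decision boundary in 2D.
w, b = np.array([1.0, -1.0]), 0.0

# Sample D over a grid; the tool renders this as the background heatmap.
xs = np.linspace(-3, 3, 50)
xx, yy = np.meshgrid(xs, xs)
grid = np.stack([xx.ravel(), yy.ravel()], axis=1)
heat = sigmoid(grid @ w + b).reshape(50, 50)
# heat near 1: region judged "likely real"; near 0: "likely fake"
```

Re-rendering this grid after every training step is what makes the discriminator's evolving decision surface visible during interactive training.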
86. GAN Lab broadens education access
Conventional deep learning: model training in Python with GPU ($$$); visualization in JavaScript
87. GAN Lab broadens education access
Everything done in the browser, powered by TensorFlow.js:
model training in JavaScript, accelerated by WebGL; visualization also in JavaScript
88. GAN Lab is Live! Try it at bit.ly/gan-lab
30K visitors from 135 countries; 1.9K likes, 800+ retweets
89. Research Contributions from GAN Lab
Can we design tools for non-experts to understand complex deep models?
1. Visualization of overall structure & components
2. Interactive experimentation with the training process
3. Accessible approach using browsers
90. Visual Analytics in Deep Learning: An Interrogative Survey for the Next Frontiers
Fred Hohman, Minsuk Kahng, Robert Pienta, Polo Chau
TVCG 2018
91. Some Takeaways (bit.ly/va-dl-survey)
1. Most tools are aimed at expert users
2. Instance-based analysis dominates
3. The work is inherently interdisciplinary
4. Actionability is lacking
5. Evaluation is hard
6. State-of-the-art models are not robust
93. How can we make AI more human-centered?
Accessible & interpretable for people who build and use machine learning.
Through the design of visualization tools that are scalable, interactive & usable, we can help users learn and interpret large-scale complex ML systems.
94. Thanks!
Polo Chau, Georgia Tech
Human-Centered AI: Scalable Interactive Tools for Interpretation & Attribution