1. Modern deep learning models require extensive labeled training data. This quickly becomes a hard and expensive problem in the Intelligence Community because crowdsourcing data labeling is not possible for classified data. As a result, intelligence analysts are pulled off mission to support data labeling efforts. Large amounts of intel are currently under-analyzed, and analysts are burdened with tedious manual processes to analyze data. AI methods would help rectify this, but efforts to adopt them have yet to succeed.
GUTENBERG: LESSONS LEARNED
Remmelt Ammerlaan
Alexander Krey
Lawrence Moore
Nini Moorhead
Kelly Shen
THEN NOW
114
Interviews
MS Comp. Math MBA / MPA MS Comp. Math MBA MS Comp. Science
2. HOW WE GOT STARTED
NORMAL H4D PROCESS
Sponsor
Problem
Statement
H4D Teams
3. HOW WE GOT STARTED
GUTENBERG H4D PROCESS: CREATE OUR OWN!
H4D
Data
Labeling
in IC
4. WHERE WE BEGAN: MMC
Key goal:
automate
labeling to
improve
analyst
lives
7. INITIAL SCOPE
Crowdsourcing
Would Help
Machine Learning
is Crucial
+ BUT
That Isn’t Possible in
the IC
Week 0 Week 1 Week 2 Week 3 Week 4 Week 5 Week 6 Week 7 Week 8 Week 9 Week 10 Future
10. TURNED TO SOLUTIONS TOO DEEP, TOO FAST
“Just getting access to data
is a huge problem”
- IC Program Manager
11. TURNED TO SOLUTIONS TOO DEEP, TOO FAST
“We don’t even know what to
Google to start learning ML”
- IC Program Manager
13. KEY LEARNINGS AFTER WEEK 0
❏ DON’T ALWAYS LISTEN TO NINI
❏ CUSTOMER DISCOVERY REQUIRES BREADTH
❏ SOLVE FOR THE END USER
15. STAGE 1: DEFINE ANALYST WORKFLOW + PROBLEMS
MVP 1:
Map of IC text
analytics pipeline
Data
Labeling
Train
Model
Entity
Extraction
16. STAGE 1: DEFINE ANALYST WORKFLOW + PROBLEMS
“You may also be interested in…”
MVP 2:
“Apple News”
for daily
intelligence
17. STAGE 1: WHAT WE HEARD
Welcome to the graveyard of
failed analytic tool pilots…
“We’ve tried
integrating
automated tools
before, and they
mostly have failed”
- DARPA Program
Manager
18. STAGE 1: WHAT WE HEARD
“The higher-ups
want one thing, and
the analysts want
something very
different”
- Former IC Analyst
19. KEY LEARNINGS AFTER STAGE 1
❏ LOW ADOPTION OF ML TOOLS IN IC
❏ DISCONNECT BETWEEN WHAT ANALYSTS AND LEADERS WANT FROM AUTOMATION
❏ ANY TECH SOLUTION MUST BE INTEGRATED INTO ANALYST WORKFLOW
21. STAGE 2: FOSTERING COMMON UNDERSTANDING
MVP 3:
“AI playbook” to identify
capabilities and necessary steps
Boolean
search
Central
data repo
Automatic
doc
curation
Semantic
understanding
Automatic
insight
from text
22. STAGE 2: FOSTERING COMMON UNDERSTANDING
MVP 4:
First chapter in
“playbook” on full-motion
video analysis
23. STAGE 2: WHAT WE HEARD
TRIP TO BEALE AFB
(PROJECT MAVEN TEST SITE)
+ Chick-Fil-A!
24. STAGE 2: WHAT WE HEARD
“The single biggest issue
is the iteration time of
algorithms”
- ISR Analyst at Beale
25. STAGE 2: WHAT WE HEARD
“We need a standardized
process that specifies how
to label and train models”
- ISR Analyst at Beale
26. KEY LEARNINGS AFTER STAGE 2
❏ CURRENT BEST-IN-CLASS SOLUTION DOESN’T MEAN BEST IT COULD BE
❏ THE DOD / IC NEED TO STANDARDIZE TOP-DOWN BEST PRACTICES OF AI
28. STAGE 3: WHAT WE HEARD
“Data centralization is
world hunger for us”
- Colonel, USAF
TRIP TO DIU
(FORWARD DEPLOYED SOCOM TEAM)
29. STAGE 3: WHAT WE HEARD
“The best structure for
data currently means
grouped into folders on a
shared drive”
- Forward Deployed Data
Scientist, SOCOM
30. STAGE 3: WHAT WE HEARD
“Moving data between
security enclaves is
our number one blocker”
- Deployed data lead,
SOCOM
31. KEY LEARNINGS AFTER STAGE 3
❏ FORMAT STANDARDIZATION IS ESSENTIAL TO USING FULL POTENTIAL OF DATA
❏ DATA SHARING IS HINDERED BY NETWORK SEPARATION AND SECURITY ISSUES
32. STAGE 4: THE ROAD TO DC
Here are the issues we
know you are facing
Here are the amazing
things you can do
Here’s the
research and
open source
software to
get there
33. STAGE 4: THE ROAD TO DC
We curated a list of senior AI
players in the IC to brief
34. FINAL MVP: DC BRIEFING
To adapt to the IC’s needs, a machine learning pipeline must be distributed and non-labor-intensive
❏ Three recent trends in academia could make this a reality:
● Semi-supervised learning
● Active / online learning
● Federated learning
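To make the "non-labor-intensive" point concrete, here is a minimal sketch of one of the three trends, active learning via uncertainty sampling: instead of asking analysts to label everything, the pipeline asks them to label only the items the current model is least sure about. This is an illustration under stated assumptions, not the team's implementation. The function names, the probability values, and the `budget` parameter are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(predictions, budget):
    """Return indices of the `budget` most uncertain items, most uncertain first.

    `predictions` is a list of per-item class-probability lists, as a
    hypothetical model might produce for unlabeled documents.
    """
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled items (two classes each).
unlabeled_predictions = [
    [0.99, 0.01],  # model is confident: low labeling value
    [0.55, 0.45],  # model is unsure
    [0.90, 0.10],
    [0.50, 0.50],  # model is maximally unsure
]

# Only the two most uncertain items are sent to an analyst for labeling.
queue = select_for_labeling(unlabeled_predictions, budget=2)
print(queue)
```

In this toy run the analyst would be asked to label items 3 and 1 only; the confident predictions are left alone, which is where the labor saving comes from.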
35. STAGE 4: WHAT WE HEARD
“Data access is on a
need to know basis”
- CIA Data Scientist
TRIP TO DC
(BRIEF DOD AND IC SENIORS)
36. STAGE 4: WHAT WE HEARD
“The infrastructure in place
was never designed with
automated methods in mind”
- CIA Data Scientist
37. STAGE 4: WHAT WE HEARD
“What would make you
want to join the IC?”
- Deputy Director of
National Intelligence
38. DC TAKEAWAYS
❏ THE IC NEEDS TO CREATE AN ELITE CULTURE AROUND DATA SCIENCE
❏ THE IC MUST RETHINK ITS RELATIONSHIP WITH DATA
❏ THE IC AND DOD NEED TO BE ON THE SAME PAGE FOR AI
39. SUMMARIZING OUR JOURNEY
We came in ready
to build...
40. SUMMARIZING OUR JOURNEY
...but that isn’t
what the IC needed
in the end...
41. SUMMARIZING OUR JOURNEY
...LISTEN TO THE
CUSTOMER!!
42. NEXT STEPS
Partners: Follow-Ups
Sponsor: Final Deck
Gutenberg: New Roles
❏ ODNI: Thoughts on IC vs industry differences
❏ NSCAI: Thoughts on ML use in legacy systems
❏ ODNI
❏ IARPA
❏ USAF, 70th ISR
❏ Nini: Vannevar Labs
❏ Kelly: Apple
❏ Lawrence: Waymo
❏ Remmelt: Microsoft
❏ Krey: AWS
43. THANK YOU
Teaching Team
Col Pete Newell, Ret
Steve Blank
Steve Weinstein
Tom Bedecarre
Jeff Decker
Diego Solorzano Cervantes
Nate Simon
Mentors
LtCol Kevin Childs
Lisa Wallace
Sponsor
John Beiler
Dean Souleles
Key Partners
ODNI
IARPA
SOCOM
USAF
IQT
And all others who helped
us along the way!