
Gutenberg H4D Stanford 2019

37,687 views


business model, business model canvas, mission model, mission model canvas, customer development, hacking for defense, H4D, lean launchpad, lean startup, stanford, startup, steve blank, pete newell, bmnt, AI, Machine Learning, ML

Published in: Education


  1. Modern deep learning models require extensive labeled training data. This quickly becomes a hard and expensive problem in the Intelligence Community because crowdsourcing data labeling is not possible for classified data. As a result, intelligence analysts are pulled off mission to support data-labeling efforts. Large amounts of intel are currently under-analyzed, and analysts are burdened with tedious manual processes to analyze data. AI methods would help rectify this, but efforts to adopt them have yet to succeed. GUTENBERG: LESSONS LEARNED Remmelt Ammerlaan, Alexander Krey, Lawrence Moore, Nini Moorhead, Kelly Shen THEN / NOW 114 Interviews MS Comp. Math, MBA / MPA, MS Comp. Math, MBA, MS Comp. Science
  2. HOW WE GOT STARTED NORMAL H4D PROCESS Sponsor Problem Statement H4D Teams
  3. HOW WE GOT STARTED GUTENBERG H4D PROCESS: CREATE OUR OWN! H4D Data Labeling in IC
  4. WHERE WE BEGAN: MMC Key goal: automate labeling to improve analyst lives
  5. INITIAL SCOPE Machine Learning Crucial Week 0 Week 1 Week 2 Week 3 Week 4 Week 5 Week 6 Week 7 Week 8 Week 9 Week 10 Future
  6. INITIAL SCOPE Crowdsourcing Would Help + Machine Learning is Crucial
  7. INITIAL SCOPE Crowdsourcing Would Help + Machine Learning is Crucial BUT That Isn’t Possible in the IC
  8. INITIAL “SOLUTION” “Let’s use semi-supervised learning!” - Gutenberg
  9. INITIAL “SOLUTION” “Let’s use semi-supervised learning!” - Gutenberg
  10. TURNED TO SOLUTIONS TOO DEEP, TOO FAST “Just getting access to data is a huge problem” - IC Program Manager
  11. TURNED TO SOLUTIONS TOO DEEP, TOO FAST “Just getting access to data is a huge problem” - IC Program Manager “We don’t even know what to Google to start learning ML” - IC Program Manager
  12. TURNED TO SOLUTIONS TOO DEEP, TOO FAST “Just getting access to data is a huge problem” - IC Program Manager “We don’t even know what to Google to start learning ML” - IC Program Manager
  13. KEY LEARNINGS AFTER WEEK 0 ❏ DON’T ALWAYS LISTEN TO NINI ❏ CUSTOMER DISCOVERY REQUIRES BREADTH ❏ SOLVE FOR THE END USER
  14. PIVOT “Data labeling is just the tip of the iceberg” - MAVEN Data Administrator
  15. STAGE 1: DEFINE ANALYST WORKFLOW + PROBLEMS MVP 1: Map of IC text analytics pipeline (Data Labeling, Train Model, Entity Extraction)
  16. STAGE 1: DEFINE ANALYST WORKFLOW + PROBLEMS MVP 1: Map of IC text analytics pipeline (Data Labeling, Train Model, Entity Extraction) MVP 2: “Apple News” for daily intelligence (“You may also be interested in…”)
  17. STAGE 1: WHAT WE HEARD Welcome to the graveyard of failed analytic tool pilots… “We’ve tried integrating automated tools before, and they mostly have failed” - DARPA Program Manager
  18. STAGE 1: WHAT WE HEARD Welcome to the graveyard of failed analytic tool pilots… “The higher-ups want one thing, and the analysts want something very different” - Former IC Analyst
  19. KEY LEARNINGS AFTER STAGE 1 ❏ LOW ADOPTION OF ML TOOLS IN IC ❏ DISCONNECT BETWEEN WHAT ANALYSTS AND LEADERS WANT FROM AUTOMATION ❏ ANY TECH SOLUTION MUST BE INTEGRATED INTO ANALYST WORKFLOW
  20. PIVOT Our reaction: “The uncertainty around what AI can do is the single biggest deterrent for adoption” - Data Scientist in IC
  21. STAGE 2: FOSTERING COMMON UNDERSTANDING MVP 3: “AI playbook” to identify capabilities and necessary steps: Boolean search, Central data repo, Automatic doc curation, Semantic understanding, Automatic insight from text
  22. STAGE 2: FOSTERING COMMON UNDERSTANDING MVP 3: “AI playbook” to identify capabilities and necessary steps: Boolean search, Central data repo, Automatic doc curation, Semantic understanding, Automatic insight from text MVP 4: First chapter in “playbook” on full-motion video analysis
  23. STAGE 2: WHAT WE HEARD TRIP TO BEALE AFB (PROJECT MAVEN TEST SITE) + Chick-Fil-A!
  24. STAGE 2: WHAT WE HEARD TRIP TO BEALE AFB (PROJECT MAVEN TEST SITE) “The single biggest issue is the iteration time of algorithms” - ISR Analyst at Beale
  25. STAGE 2: WHAT WE HEARD TRIP TO BEALE AFB (PROJECT MAVEN TEST SITE) “We need a standardized process that specifies how to label and train models” - ISR Analyst at Beale
  26. KEY LEARNINGS AFTER STAGE 2 ❏ CURRENT BEST-IN-CLASS SOLUTION DOESN’T MEAN BEST IT COULD BE ❏ THE DOD / IC NEED TO STANDARDIZE TOP-DOWN BEST PRACTICES OF AI
  27. STAGE 3: DEVELOPING AN ML PIPELINE MVP 6+7: schema administration train test deploy declassify organize labelling
  28. STAGE 3: WHAT WE HEARD TRIP TO DIU (FORWARD DEPLOYED SOCOM TEAM) “Data centralization is world hunger for us” - Colonel, USAF
  29. STAGE 3: WHAT WE HEARD TRIP TO DIU (FORWARD DEPLOYED SOCOM TEAM) “The best structure for data currently means grouped into folders on a shared drive” - Forward Deployed Data Scientist, SOCOM
  30. STAGE 3: WHAT WE HEARD TRIP TO DIU (FORWARD DEPLOYED SOCOM TEAM) “Moving data between security enclaves is our number one blocker” - Deployed data lead, SOCOM
  31. KEY LEARNINGS AFTER STAGE 3 ❏ FORMAT STANDARDIZATION IS ESSENTIAL TO USING FULL POTENTIAL OF DATA ❏ DATA SHARING IS HINDERED BY NETWORK SEPARATION AND SECURITY ISSUES
  32. STAGE 4: THE ROAD TO DC Here are the issues we know you are facing. Here are the amazing things you can do. Here’s the research and open source software to get there.
  33. STAGE 4: THE ROAD TO DC We curated a list of senior AI players in the IC to brief
  34. FINAL MVP: DC BRIEFING To adapt to the IC’s needs, a machine learning pipeline must be distributed and non-labor-intensive ❏ Three recent trends in academia could make this a reality: ● Semi-supervised learning ● Active / online learning ● Federated learning
  35. STAGE 4: WHAT WE HEARD TRIP TO DC (BRIEF DOD AND IC SENIORS) “Data access is on a need-to-know basis” - CIA Data Scientist
  36. STAGE 4: WHAT WE HEARD TRIP TO DC (BRIEF DOD AND IC SENIORS) “The infrastructure in place was never designed with automated methods in mind” - CIA Data Scientist
  37. STAGE 4: WHAT WE HEARD TRIP TO DC (BRIEF DOD AND IC SENIORS) “What would make you want to join the IC?” - Deputy Director of National Intelligence
  38. DC TAKEAWAYS ❏ THE IC NEEDS TO CREATE AN ELITE CULTURE AROUND DATA SCIENCE ❏ THE IC MUST RETHINK ITS RELATIONSHIP WITH DATA ❏ THE IC AND DOD NEED TO BE ON THE SAME PAGE FOR AI
  39. SUMMARIZING OUR JOURNEY We came in ready to build...
  40. SUMMARIZING OUR JOURNEY We came in ready to build... ...but that isn’t what the IC needed in the end...
  41. SUMMARIZING OUR JOURNEY We came in ready to build... ...but that isn’t what the IC needed in the end... ...LISTEN TO THE CUSTOMER!!
  42. NEXT STEPS Partners: Follow-Ups Sponsor: Final Deck Gutenberg: New Roles ❏ ODNI: Thoughts on IC vs industry differences ❏ NSCAI: Thoughts on ML use in legacy systems ❏ ODNI ❏ IARPA ❏ USAF, 70th ISR ❏ Nini: Vannevar Labs ❏ Kelly: Apple ❏ Lawrence: Waymo ❏ Remmelt: Microsoft ❏ Krey: AWS
  43. THANK YOU Teaching Team: Col Pete Newell (Ret), Steve Blank, Steve Weinstein, Tom Bedecarre, Jeff Decker, Diego Solorzano Cervantes, Nate Simon. Mentors: LtCol Kevin Childs, Lisa Wallace. Sponsor: John Beiler, Dean Souleles. Key Partners: ODNI, IARPA, SOCOM, USAF, IQT. And all others who helped us along the way!
