Shift Conference
Transfer Learning
BETTER MACHINE LEARNING
WITH LESS DATA
May 31st, 2019
Split, Croatia
2 | Copyright © 2019 Indico
• CTO of indico
• B2B Intelligent Process Automation
company based in Boston
• Working on deep learning based transfer
learning since 2013
• Guy that plays with embeddings all day
• Vegan baker
Transfer
Learning
1. What is deep learning?
2. What makes it so effective?
3. What’s the catch?
4. Opening the “black box”
5. The unreasonable effectiveness of
embeddings
6. What makes a good embedding?
“DeepMind’s Go-playing AI doesn’t need human
help to beat us anymore”
- The Verge
“New AI Development So Advanced It's Too
Dangerous To Release, Says Scientists”
- Forbes
“AI defeated a top-tier 'Dota 2' esports
team. OpenAI is also inviting everyone
to play.”
- Engadget
“New AI Style Transfer Algorithm
Allows Users to Create Millions of
Artistic Combinations”
- Nvidia
Network Models?
Hebbian Learning
Maybe this is actually the opposite of how things work?
Spike-timing-dependent plasticity
Oh, I guess this doesn't really work in machine learning
Backprop
All-or-nothing neurons all wired together
Connectivity in the brain is complex, all-or-nothing isn't an absolute rule
???
Non-linearities are critical, step functions don't work that well
ReLUs, convolution, recurrence
1940 | 1980 | Today
“Neuroscientists have long
criticised [sic] deep learning
algorithms as incompatible with
current knowledge of
neurobiology.”
- Yoshua Bengio et al.,
Towards Biologically Plausible Deep Learning (2015)
What’s the
big deal?
AlexNet: The shot heard round the world
[Figure annotation: Human Accuracy]
But Why?
Let’s go on an adventure…
“Traditional” Machine Learning
What you have What you need
???
Count Vectorizer
[# of times word 0 shows up, # of times word 1 shows up, …]
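A minimal sketch of the count vector described on this slide, assuming whitespace tokenization and lowercasing. The function names and the two toy documents are hypothetical and not from the talk.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: index for index, tok in enumerate(tokens)}


def count_vector(document, vocabulary):
    """Return [count of word 0, count of word 1, ...] for one document."""
    counts = Counter(document.lower().split())
    # dicts keep insertion order, so this walks the vocabulary by column index
    return [counts.get(tok, 0) for tok in vocabulary]


# Hypothetical toy corpus
documents = ["the cat sat", "the cat saw the dog"]
vocabulary = build_vocabulary(documents)
vectors = [count_vector(doc, vocabulary) for doc in documents]
```

Each document becomes a vector as long as the vocabulary, which is why these representations grow large and sparse on real text.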
TF-IDF (Term Frequency, Inverse Document Frequency)
$f_{t,d}$ = number of times term $t$ appears in document $d$
$D$ = all documents
$T$ = full vocabulary
$v_d$ = document $d$'s tf-idf vector
$$v_d = \left[ \frac{f_{t,d}}{\frac{1}{|D|}\sum_{d'=1}^{|D|} f_{t,d'}} \;\middle|\; t \in T \right]$$
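A short sketch of tf-idf weighting, for intuition. It uses the common textbook variant (term frequency times the log of inverse document frequency), which may differ from the exact normalization on the slide. The function name and toy corpus are hypothetical, and documents are assumed to be non-empty.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return (vocabulary, one tf-idf vector per document)."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # document frequency: in how many documents each term appears
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([
            (counts[tok] / len(doc)) * math.log(n_docs / doc_freq[tok])
            for tok in vocabulary
        ])
    return vocabulary, vectors


# Hypothetical toy corpus
documents = ["the cat sat", "the dog sat", "the cat ran"]
vocabulary, vectors = tfidf_vectors(documents)
```

A term that appears in every document, such as "the" here, gets weight zero, while a rare term such as "dog" is weighted up.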
The Problem With Text
John Malkovitch plays tennis in Winchester. He
has been reporting soreness in his elbow. His
60th birthday is in two weeks. After he returns
from his birthday trip to Casablanca we will
recommend a steroid shot to reduce
inflammation.
Feature(s)
• Name
Feature(s)
• Gender
• Location
• Age
Feature(s)
• Activity
• Prior Affliction/Treatment
• Travel
The Problem With Text
Problem: Linguistic Context
Traditional Solution: Stemming; Synonym sets; Lexicons
Traditional Problem: Brittle; Labor-intensive; Messy real-world data
Problem: Local Context
Traditional Solution: Parse trees; N-grams; Phrase lexicon
Traditional Problem: Inaccurate parsing; Limited context; Messy real-world data
Problem: Out of Vocabulary Issues
Traditional Solution: Lemmatization; Expanded vocabulary; Ignore
Traditional Problem: Computationally expensive; Diminishing returns; Messy real-world data
Manual Feature Engineering
Select Features
Train Model
Evaluate Errors and View Test Error
The Philosophy of Traditional Learning
Raw Data (Text, Image, Audio) → Features (tf-idf, SIFT) → Final Model outputs Outcome
The Philosophy of Deep Learning
Raw Data (Text, Image, Audio) → Features (statistical features derived from data) → Final Model outputs Outcome
What’s going on inside
of a network model
Credit: Zeiler and Fergus (2014)
Enter Embeddings Transfer Learning
What are text embeddings?
[0.1, 0.2, 0.8, 0.1, 0.3, 0.6, 0.8, 0.3, 0.5]
What is an Embedding?
Text Space (e.g. English) → Embedding Method (e.g. Word2Vec) → Embedding Space, a manifold (e.g. ℝ^300)
Linguistic Context (e.g. Wikipedia) → Embedding Method
Words → [0.1, 0.2, 0.8, 0.1, 0.3, 0.6, 0.8, 0.3, …]
Pitfalls
• Sufficient, Diverse Linguistic Context
• Clean Test/Train Splits
• The Curse of Dimensionality
• Effective Benchmarking
King - man + woman ≈ Queen (Royalty)
How do Embeddings Work?
• Meaning is “encoded” into the
embedding space
• Individual dimensions are not
human interpretable
• Embedding method learns by
examining large corpora of
generic language
• Goal is accurate language
representation as a proxy for
downstream performance
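The King/Queen arithmetic can be sketched with a few hand-made vectors. Everything here is hypothetical: the 3-dimensional vectors are invented so that the dimensions loosely mean royalty, maleness, and femaleness, whereas real embeddings are learned and their dimensions are not interpretable.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings, not taken from any trained model
embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.8, 0.1],
    "woman": [0.1, 0.1, 0.8],
    "apple": [0.0, 0.3, 0.3],
}


def analogy(a, b, c, table):
    """Return the word nearest to table[a] - table[b] + table[c], excluding the inputs."""
    target = [x - y + z for x, y, z in zip(table[a], table[b], table[c])]
    candidates = (word for word in table if word not in (a, b, c))
    return max(candidates, key=lambda word: cosine(target, table[word]))


result = analogy("king", "man", "woman", embeddings)
```

With these toy values, `result` is `"queen"`, because subtracting "man" and adding "woman" moves the vector along the gender direction while keeping the royalty component.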
“Word” Embeddings
Examples
• Word2vec
• GloVe
• fastText
In Practice
Token Value
“great” [0.1, 0.3, …]
… …
Training
CBOW: The quick brown fox _____ over the lazy dog
Skip Gram: ___ ___ ____ ___ jumps ___ __ ___ ___
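The two training setups can be illustrated by generating their training examples from one sentence. This is only a sketch of how the (input, target) pairs are formed, assuming a symmetric window of two words; it does not train a model, and the helper names are hypothetical.

```python
def context_window(tokens, index, window):
    """Words within `window` positions of tokens[index], excluding the word itself."""
    left = tokens[max(0, index - window):index]
    right = tokens[index + 1:index + 1 + window]
    return left + right


def cbow_examples(tokens, window=2):
    """CBOW: predict the middle word from its surrounding context words."""
    return [(context_window(tokens, i, window), tok) for i, tok in enumerate(tokens)]


def skipgram_examples(tokens, window=2):
    """Skip-gram: predict each surrounding word from the middle word."""
    return [
        (tok, ctx)
        for i, tok in enumerate(tokens)
        for ctx in context_window(tokens, i, window)
    ]


sentence = "the quick brown fox jumps over the lazy dog".split()
cbow = cbow_examples(sentence)
skipgram = skipgram_examples(sentence)
```

For the word "jumps", CBOW yields one example with the context `["brown", "fox", "over", "the"]`, while skip-gram yields four separate pairs, one per context word.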
Do They Really Preserve Algorithmic Value?
• Embeddings generally
outperform raw text at low data
volumes
• Leveraging large, generic text
corpora improves
generalizability
• This is 4-year-old tech.
Embeddings have improved
drastically. Text has not.
Reported numbers are the average of 5 runs of randomly sampled test/train splits
each reporting the average of a 5-fold cv, within which Logistic Regression
hyperparameters are optimized. Generated using Enso
[Figure: GloVe Benchmark (Movie Review Sentiment Analysis). Accuracy (0.5 to 0.9) versus Number of Data Points (50 to 500) for tf-idf and GloVe.]
Problems with
Small Data
Add Linguistic Context (Semantics)
Add Local Context
Prevent Out of Vocabulary Issues
Text Embeddings
Examples
• Doc2vec
• ELMo
• ULMFiT
In Practice
Often built on top of pre-trained word embeddings
Training
[Figure: one embedding vector per word of “The quick brown fox jumps over the lazy”, e.g. [0.1, 0.2, 0.8, 0.1, 0.3, 0.6, 0.8, 0.3, …]; outputs are labeled Language: “dog” and Supervised: “True”]
Text Embeddings
CNN-Style
[Figure: one embedding vector per word of “The quick brown fox jumps over the lazy”, e.g. [0.1, 0.2, 0.8, 0.1, 0.3, 0.6, 0.8, 0.3, …], combined into a single Prediction]
Example: https://arxiv.org/pdf/1408.5882.pdf
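A rough sketch of the CNN-style idea: slide a filter over consecutive word vectors, apply a ReLU, and keep the strongest response (max-over-time pooling). The filter weights and the 2-dimensional word vectors are hypothetical; in a real model the filters are learned and many of them feed a final classifier.

```python
def conv1d_max_pool(embedded, filt, bias=0.0):
    """Apply one filter to every window of word vectors, then ReLU and max-pool.

    embedded: list of word vectors (one per token)
    filt: list of weight rows, one row per position in the window
    """
    width = len(filt)
    activations = []
    for start in range(len(embedded) - width + 1):
        window = embedded[start:start + width]
        total = bias + sum(
            weight * value
            for weight_row, vector in zip(filt, window)
            for weight, value in zip(weight_row, vector)
        )
        activations.append(max(0.0, total))  # ReLU
    return max(activations)  # max-over-time pooling


# Hypothetical 2-d word vectors for a 4-token sentence
embedded = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# Hypothetical filter spanning 2 consecutive tokens
filt = [[1.0, 0.0], [0.0, 1.0]]
feature = conv1d_max_pool(embedded, filt)
```

Because of the pooling step, the resulting feature has the same size regardless of sentence length, so it can be passed to a fixed-size prediction layer.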
Text Embeddings
RNN-Style
[Figure: one embedding vector per word of “The quick brown fox jumps over the lazy”, e.g. [0.1, 0.2, 0.8, 0.1, 0.3, 0.6, 0.8, 0.3, …]; each step produces an Output vector and a Memory vector, the memory starts at [0.0, 0.0, 0.0, …], a σ gate is applied at each step, and the final state feeds a Prediction]
Example: https://arxiv.org/pdf/1802.05365.pdf
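A heavily simplified sketch of the RNN-style idea: read word vectors left to right and blend each one into a running memory through a sigmoid (σ) gate. This is not an LSTM and not the model from the linked paper; the fixed `gate_weight` is a hypothetical stand-in for weights that a real network would learn.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def run_gated_rnn(embedded, gate_weight=1.0):
    """Return (memory after each step, final memory) for a sequence of word vectors."""
    memory = [0.0] * len(embedded[0])  # memory starts at zeros
    outputs = []
    for vector in embedded:
        # one gate per dimension, in (0, 1): how much of the new input to write
        gates = [sigmoid(gate_weight * value) for value in vector]
        memory = [
            gate * value + (1.0 - gate) * old
            for gate, value, old in zip(gates, vector, memory)
        ]
        outputs.append(list(memory))
    return outputs, memory


# Hypothetical 2-d word vectors for a 2-token sentence
embedded = [[1.0, 0.0], [1.0, 0.0]]
outputs, memory = run_gated_rnn(embedded)
```

The final memory summarizes the whole sequence in a fixed-size vector, which is what makes it usable as input to a prediction layer.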
The Power of Context
“We used a bytepair encoding (BPE) vocabulary… significantly improving upon the state of the art in 9 out of
the 12 tasks studied”
- Improving Language Understanding by Generative Pre-Training*
* https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
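For readers unfamiliar with byte-pair encoding, here is a minimal sketch of how BPE merges are learned: repeatedly find the most frequent adjacent symbol pair and fuse it into one symbol. It is a generic illustration, not the implementation used in the quoted paper; the word frequencies are hypothetical and end-of-word markers are omitted for brevity.

```python
from collections import Counter


def pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    counts = Counter()
    for symbols, freq in vocab.items():
        for pair in zip(symbols, symbols[1:]):
            counts[pair] += freq
    return counts


def merge_pair(pair, vocab):
    """Rewrite every word so that occurrences of `pair` become a single symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = merged.get(tuple(out), 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Return (list of learned merges, final segmented vocabulary)."""
    vocab = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        counts = pair_counts(vocab)
        if not counts:
            break
        best = max(counts, key=counts.get)  # ties go to the pair seen first
        merges.append(best)
        vocab = merge_pair(best, vocab)
    return merges, vocab


# Hypothetical word frequencies
word_freqs = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges, vocab = learn_bpe(word_freqs, num_merges=2)
```

Frequent fragments such as "est" become single tokens, so an unseen word can still be split into known pieces instead of falling out of vocabulary.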
Do They Really Preserve Algorithmic Value?
• Newer transfer learning
techniques have made deep
learning at low data volumes
tractable
• Even when operating on top of
byte-pair encodings, sufficient
context is retained to achieve
state-of-the-art (SOTA) performance
• 4x error reduction over tf-idf
Reported numbers are the average of 5 runs of randomly sampled test/train splits
each reporting the average of a 5-fold cv, within which Logistic Regression
hyperparameters are optimized. Generated using Enso
[Figure: Finetune Benchmark (Movie Review Sentiment Analysis). Accuracy (0.5 to 0.9) versus Number of Data Points (50 to 500) for tf-idf, GloVe, and Finetune.]
Treat it like any other feature vector
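One way to read this advice, sketched under stated assumptions: average the word vectors of a document into a single fixed-length vector, then hand it to an ordinary classifier. The 2-dimensional embeddings, the four toy reviews, and the hand-rolled logistic regression below are all hypothetical stand-ins for a real embedding model and a library classifier.

```python
import math


def mean_pool(tokens, table):
    """Average the vectors of known tokens into one fixed-length document vector."""
    vectors = [table[tok] for tok in tokens if tok in table]
    dim = len(next(iter(table.values())))
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def train_logistic_regression(features, labels, lr=0.5, epochs=500):
    """Plain stochastic gradient descent on the logistic loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            score = sum(w * v for w, v in zip(weights, x)) + bias
            error = 1.0 / (1.0 + math.exp(-score)) - y
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(x, weights, bias):
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0


# Hypothetical toy embeddings and labeled reviews (1 = positive, 0 = negative)
toy_embeddings = {
    "great": [0.9, 0.1], "good": [0.8, 0.2],
    "awful": [0.1, 0.9], "bad": [0.2, 0.8],
    "movie": [0.5, 0.5],
}
reviews = [("great movie", 1), ("good movie", 1), ("awful movie", 0), ("bad movie", 0)]
features = [mean_pool(text.split(), toy_embeddings) for text, _ in reviews]
labels = [label for _, label in reviews]
weights, bias = train_logistic_regression(features, labels)
predictions = [predict(x, weights, bias) for x in features]
```

The classifier never sees the words themselves, only the pooled vectors, so any model that accepts numeric features can be swapped in.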
Thank You
SLATER VICTOROFF
slater@indico.io

Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 

Dernier (20)

Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

Better Machine Learning with Less Data - Slater Victoroff (Indico Data)

  • 1. Shift Conference Transfer Learning BETTER MACHINE LEARNING WITH LESS DATA May 31st, 2019 Split, Croatia
  • 2. Copyright © 2019 Indico • CTO of indico • B2B Intelligent Process Automation company based in Boston • Working on deep learning based transfer learning since 2013 • Guy that plays with embeddings all day • Vegan baker
  • 3. Transfer Learning 1. What is deep learning? 2. What makes it so effective? 3. What’s the catch? 4. Opening the “black box” 5. The unreasonable effectiveness of embeddings 6. What makes a good embedding?
  • 4. “DeepMind’s Go-playing AI doesn’t need human help to beat us anymore” - The Verge “New AI Development So Advanced It's Too Dangerous To Release, Says Scientists” - Forbes “AI defeated a top-tier 'Dota 2' esports team. OpenAI is also inviting everyone to play.” - Engadget “New AI Style Transfer Algorithm Allows Users to Create Millions of Artistic Combinations” - Nvidia
  • 5. Network Models? [Timeline figure: 1940, 1980, Today] Hebbian Learning | Maybe this is actually the opposite of how things work? | Spike timing dependent plasticity | Oh, I guess this doesn't really work in machine learning | Backprop | All-or-nothing neurons all wired together | Connectivity in the brain is complex, all-or-nothing isn't an absolute rule | ??? | Non-linearities are critical, step functions don't work that well | ReLUs, convolution, recurrence
  • 6. “Neuroscientists have long criticised [sic] deep learning algorithms as incompatible with current knowledge of neurobiology.” - Yoshua Bengio et al., Towards Biologically Plausible Deep Learning (2015)
  • 10. Let’s go on an adventure…
  • 11. “Traditional” Machine Learning What you have What you need ???
  • 12. Count Vectorizer: [# of times word 0 shows up, # of times word 1 shows up, …]
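The count-vector idea on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the speaker's code: the whitespace tokenizer, the sorted vocabulary order, and the function names `build_vocabulary` and `count_vectorize` are all assumptions made for the example.

```python
from collections import Counter

def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index (sorted order)."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}

def count_vectorize(document, vocabulary):
    """Return a list whose i-th entry is the number of times word i shows up."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for token, n in counts.items():
        if token in vocabulary:  # tokens outside the vocabulary are dropped
            vector[vocabulary[token]] = n
    return vector

docs = ["the cat sat on the mat", "the dog sat"]
vocab = build_vocabulary(docs)  # cat=0, dog=1, mat=2, on=3, sat=4, the=5
vectors = [count_vectorize(d, vocab) for d in docs]
print(vectors)
```

Every document becomes a vector of the same length regardless of how long the text is, which is what lets a standard classifier consume it.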
  • 13. TF-IDF (Term Frequency, Inverse Document Frequency) $f_{t,d}$ = number of times term $t$ appears in document $d$; $D$ = all documents; $T$ = full vocabulary; $v_d$ = document $d$'s tf-idf vector, $v_d = \left\{ \frac{f_{t,d}}{\frac{1}{|D|}\sum_{d'=1}^{|D|} f_{t,d'}} \;\middle|\; t \in T \right\}$
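A small sketch of the weighting on this slide, under a stated assumption: the slide's formula is read here as each term count divided by that term's mean count across all documents. That reading is a reconstruction of the slide, and it differs from the more common variant that multiplies term frequency by log(|D| / document frequency). The function name `tfidf_vectors` and the whitespace tokenizer are illustrative choices.

```python
from collections import Counter

def tfidf_vectors(documents):
    """Weight each term count by the inverse of the term's mean count per document.

    Assumed reading of the slide: v_d[t] = f(t, d) / (sum over d' of f(t, d') / |D|).
    """
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    counts = [Counter(toks) for toks in tokenized]
    n_docs = len(documents)
    # Total occurrences of each term over the whole collection.
    totals = {t: sum(c[t] for c in counts) for t in vocabulary}
    vectors = [[c[t] * n_docs / totals[t] for t in vocabulary] for c in counts]
    return vocabulary, vectors

docs = ["the cat sat", "the dog sat", "the the bird"]
vocabulary, vectors = tfidf_vectors(docs)
for term, weight in zip(vocabulary, vectors[0]):
    print(term, weight)
```

In the first document, "cat" (which appears nowhere else) receives a larger weight than "the" (which appears everywhere), which is the behavior the weighting is meant to produce.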
  • 14. The Problem With Text: “John Malkovitch plays tennis in Winchester. He has been reporting soreness in his elbow. His 60th birthday is in two weeks. After he returns from his birthday trip to Casablanca we will recommend a steroid shot to reduce inflammation.” Feature(s): • Name
  • 15. The Problem With Text: “John Malkovitch plays tennis in Winchester. He has been reporting soreness in his elbow. His 60th birthday is in two weeks. After he returns from his birthday trip to Casablanca we will recommend a steroid shot to reduce inflammation.” Feature(s): • Name. Feature(s): • Gender • Location • Age
  • 16. The Problem With Text: “John Malkovitch plays tennis in Winchester. He has been reporting soreness in his elbow. His 60th birthday is in two weeks. After he returns from his birthday trip to Casablanca we will recommend a steroid shot to reduce inflammation.” Feature(s): • Name. Feature(s): • Gender • Location • Age. Feature(s): • Activity • Prior Affliction/Treatment • Travel
  • 17. The Problem With Text
      Problem | Traditional Solution | Traditional Problem
      Linguistic Context | Stemming; Synonym sets; Lexicons | Brittle; Labor-intensive; Messy real-world data
      Local Context | Parse trees; N-grams; Phrase lexicon | Inaccurate parsing; Limited context; Messy real-world data
      Out of Vocabulary Issues | Lemmatization; Expanded vocabulary; Ignore | Computationally expensive; Diminishing returns; Messy real-world data
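Two of the traditional fixes named on this slide, n-grams for local context and a fixed vocabulary with unknown words ignored, are easy to see in code. The sketch below is illustrative only: the helper names `ngrams` and `out_of_vocabulary` and the tiny `training_vocabulary` set are hypothetical.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens, a cheap way to capture local context."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def out_of_vocabulary(tokens, vocabulary):
    """Return the tokens a fixed-vocabulary model has never seen."""
    return [t for t in tokens if t not in vocabulary]

tokens = "he has been reporting soreness in his elbow".split()
bigrams = ngrams(tokens, 2)

# Hypothetical vocabulary learned from earlier documents.
training_vocabulary = {"he", "has", "been", "reporting", "pain", "in", "his", "arm"}
unknown = out_of_vocabulary(tokens, training_vocabulary)

print(bigrams[:3])
print(unknown)
```

The bigrams only see one neighboring word (the "limited context" problem), and the two clinically important words are exactly the ones that fall out of the vocabulary.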
  • 19. The Philosophy of Traditional Learning: Raw Data (• Text • Image • Audio) → Features (• tf-idf • SIFT) → Outcome (Final Model outputs)
  • 20. The Philosophy of Deep Learning: Raw Data (• Text • Image • Audio) → Features (Statistical features derived from data) → Outcome (Final Model outputs)
  • 21. What’s going on inside of a network model Credit: Zeiler and Fergus (2014)
  • 23. What are text embeddings? 0.1 0.2 0.8 0.1 0.3 0.6 0.8 0.3 0.5
  • 24. What is an Embedding? Text Space (e.g. English) → Embedding Method (e.g. Word2Vec) → Manifold (e.g. $\mathbb{R}^{300}$): [0.1, 0.2, 0.8, 0.1, 0.3, 0.6, 0.8, 0.3, …]
  • 26. What is an Embedding? Text Space (e.g. English) → Embedding Method (e.g. Word2Vec), with Linguistic Context (e.g. Wikipedia) → Embedding Space (e.g. $\mathbb{R}^{300}$): [0.1, 0.2, 0.8, 0.1, 0.3, 0.6, 0.8, 0.3, …]
  • 27. Pitfalls • Sufficient, Diverse Linguistic Context • Clean Test/Train Splits • The Curse of Dimensionality • Effective Benchmarking
  • 28. How do Embeddings Work? King - man + woman → Queen (Royalty) • Meaning is “encoded” into the embedding space • Individual dimensions are not human interpretable • Embedding method learns by examining large corpora of generic language • Goal is accurate language representation as a proxy for downstream performance
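The King/Queen arithmetic on this slide can be illustrated with a toy example. The vectors below are hypothetical, hand-picked 3-dimensional values chosen so the analogy works; real embeddings have hundreds of dimensions that, as the slide says, are not individually interpretable. The function names are illustrative.

```python
import math

# Hypothetical toy vectors, hand-picked for illustration only.
embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(embeddings[w], target))

print(analogy("king", "man", "woman"))
```

Excluding the three input words from the candidates is a common convention for analogy queries, since the nearest neighbor of the result is otherwise often one of the inputs.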
  • 30. “Word” Embeddings. Token → Value: “great” → [0.1, 0.3, …]. Examples In Practice: • Word2vec • GloVe • fastText
  • 31. “Word” Embeddings. Token → Value: “great” → [0.1, 0.3, …]. Examples In Practice: • Word2vec • GloVe • fastText. Training: CBOW: The quick brown fox _____ over the lazy dog. Skip Gram: ___ ___ ____ ___ jumps ___ __ ___ ___
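The two training setups on this slide differ only in which side of the (context, word) pair is the input. A minimal sketch of how the training examples could be generated follows; the fixed `window` size is an assumption (the slide shows the whole sentence as context), and the function names are illustrative.

```python
def cbow_examples(tokens, window):
    """CBOW: predict the center word from the surrounding context words."""
    examples = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        examples.append((context, target))
    return examples

def skipgram_examples(tokens, window):
    """Skip-gram: predict each surrounding word from the center word."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

sentence = "the quick brown fox jumps over the lazy dog".split()
cbow = cbow_examples(sentence, window=2)
skipgram = skipgram_examples(sentence, window=2)

print(cbow[4])                                   # context -> "jumps"
print([p for p in skipgram if p[0] == "jumps"])  # "jumps" -> each neighbor
```

Neither setup needs labeled data: the text itself supplies the prediction targets, which is why these methods can be trained on large generic corpora.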
  • 32. Do They Really Preserve Algorithmic Value? • Embeddings generally outperform raw text at low data volumes • Leveraging large, generic text corpora improves generalizability • This is 4-year-old tech. Embeddings have improved drastically. Text has not. Reported numbers are the average of 5 runs of randomly sampled test/train splits, each reporting the average of a 5-fold CV, within which Logistic Regression hyperparameters are optimized. Generated using Enso. [Figure: GloVe Benchmark (Movie Review Sentiment Analysis); y-axis: Accuracy, 0.5 to 0.9; x-axis: Number of Data Points, 50 to 500; series: tf-idf, GloVe]
  • 33. Problems with Small Data: • Add Linguistic Context (Semantics) • Add Local Context • Prevent Out of Vocabulary Issues
  • 35. Text Embeddings. Often built on top of pre-trained word embeddings. Examples In Practice: • Doc2vec • ELMo • ULMFiT
  • 36. Text Embeddings. Often built on top of pre-trained word embeddings. Examples In Practice: • Doc2vec • ELMo • ULMFiT. Training: [Figure: each word of “The quick brown fox jumps over the lazy” is shown as an embedding vector (0.1 0.2 0.8 0.1 0.3 0.6 0.8 0.3 …); training targets: Language → “dog”, Supervised → “True”]
  • 37. Text Embeddings, CNN-Style. [Figure: each word of “The quick brown fox jumps over the lazy” is shown as an embedding vector (0.1 0.2 0.8 0.1 0.3 0.6 0.8 0.3 …); the vectors feed a Prediction] Example: https://arxiv.org/pdf/1408.5882.pdf
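A rough sketch of what a CNN-style text encoder computes: filters slide over windows of consecutive word vectors, and the strongest response per filter is kept (max-over-time pooling). This is a simplified illustration in plain Python, not the architecture of the linked paper; the toy vectors, the filter weights, and the function name `conv1d_max_pool` are hypothetical.

```python
def conv1d_max_pool(word_vectors, filters, width):
    """Apply each filter to every window of `width` word vectors, then max-pool.

    Each filter is a flat list of width * dim weights. Assumes the sentence
    has at least `width` words.
    """
    features = []
    for filt in filters:
        responses = []
        for start in range(len(word_vectors) - width + 1):
            # Concatenate the word vectors in this window into one flat list.
            window = [x for vec in word_vectors[start:start + width] for x in vec]
            activation = sum(w * x for w, x in zip(filt, window))
            responses.append(max(0.0, activation))  # ReLU non-linearity
        features.append(max(responses))  # keep the strongest response
    return features

# Hypothetical 2-d word vectors for a 4-word sentence, and two width-2 filters.
word_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
filters = [[1.0, 0.0, 0.0, 1.0], [-1.0, -1.0, -1.0, -1.0]]
print(conv1d_max_pool(word_vectors, filters, width=2))
```

Because of the pooling step, the output has one value per filter no matter how long the sentence is, so it can be passed to a fixed-size prediction layer.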
  • 38. Text Embeddings, RNN-Style. [Figure: each word of “The quick brown fox jumps over the lazy” is shown as an embedding vector (0.1 0.2 0.8 0.1 0.3 0.6 0.8 0.3 …); at each step an Output vector and a Memory vector are produced through a σ unit, with the initial memory set to 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 …; the final state feeds a Prediction] Example: https://arxiv.org/pdf/1802.05365.pdf
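The recurrence suggested by this slide (a memory vector that starts at zero and is updated through a σ unit as each word arrives) can be sketched as a plain Elman-style RNN. This is a deliberately simplified stand-in, not the LSTM-based model of the linked paper; the weight matrices and function names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def rnn_encode(word_vectors, w_input, w_memory):
    """Fold a sequence of word vectors into one memory vector, left to right."""
    memory = [0.0] * len(w_memory)  # the memory starts at all zeros
    for vec in word_vectors:
        from_input = matvec(w_input, vec)
        from_memory = matvec(w_memory, memory)
        memory = [sigmoid(a + b) for a, b in zip(from_input, from_memory)]
    return memory

# Hypothetical 2-d word vectors and weights.
words = [[1.0, 0.0], [0.0, 1.0]]
w_input = [[1.0, 0.0], [0.0, 1.0]]
w_memory = [[0.5, 0.0], [0.0, 0.5]]

forward = rnn_encode(words, w_input, w_memory)
backward = rnn_encode(words[::-1], w_input, w_memory)
print(forward, backward)
```

Unlike a bag of word counts, the result depends on word order: feeding the same words in reverse yields a different memory vector.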
  • 39. Problems with Small Data: • Add Linguistic Context (Semantics) • Add Local Context • Prevent Out of Vocabulary Issues
  • 40. The Power of Context: “We used a bytepair encoding (BPE) vocabulary… significantly improving upon the state of the art in 9 out of the 12 tasks studied” - Improving Language Understanding by Generative Pre-Training* * https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
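To make the byte-pair encoding mentioned in the quote concrete, here is a minimal sketch of the classic character-level BPE merge procedure. It is not the tokenizer used in the quoted paper; the toy word counts, the tie-breaking rule, and the function names are assumptions for illustration.

```python
from collections import Counter

def merge_pair(symbols, pair):
    """Fuse every adjacent occurrence of `pair` in a tuple of symbols."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)

def learn_bpe(word_counts, num_merges):
    """Learn merge rules by repeatedly fusing the most frequent adjacent pair."""
    vocab = {tuple(word): count for word, count in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += count
        if not pair_counts:
            break
        # Ties are broken by the pair itself so the result is deterministic.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        vocab = {merge_pair(s, best): c for s, c in vocab.items()}
    return merges

def encode(word, merges):
    """Split a word into subword units by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)

merges = learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, num_merges=3)
print(merges)
print(encode("lowest", merges))
```

The word "lowest" never appears in the toy training counts, yet it still encodes into known pieces, which is how subword vocabularies sidestep out-of-vocabulary issues.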
  • 41. Problems with Small Data: • Add Linguistic Context (Semantics) • Add Local Context • Prevent Out of Vocabulary Issues
  • 42. Do They Really Preserve Algorithmic Value? • Newer transfer learning techniques have made deep learning at low data volumes tractable • Even when operating on top of byte-pair encodings, sufficient context is retained to achieve state-of-the-art performance • 4x error reduction over tf-idf. Reported numbers are the average of 5 runs of randomly sampled test/train splits, each reporting the average of a 5-fold CV, within which Logistic Regression hyperparameters are optimized. Generated using Enso. [Figure: Finetune Benchmark (Movie Review Sentiment Analysis); y-axis: Accuracy, 0.5 to 0.9; x-axis: Number of Data Points, 50 to 500; series: tf-idf, GloVe, Finetune]
  • 43. Treat it like any other feature vector
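One way to read "treat it like any other feature vector": pool word embeddings into a fixed-length vector per document and hand that to an ordinary classifier. The sketch below is illustrative only. The 2-dimensional `word_vectors` table is a hypothetical stand-in for a pre-trained embedding, mean pooling is one of several possible pooling choices, and the tiny logistic regression is written by hand to keep the example self-contained.

```python
import math

# Hypothetical 2-d vectors standing in for a pre-trained embedding table.
word_vectors = {
    "great": [0.9, 0.1], "loved": [0.8, 0.2],
    "awful": [0.1, 0.9], "boring": [0.2, 0.8],
    "movie": [0.5, 0.5],
}

def embed(text):
    """Mean-pool the vectors of known words (assumes at least one known word)."""
    vectors = [word_vectors[t] for t in text.lower().split() if t in word_vectors]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def score(x, weights, bias):
    return sum(w * v for w, v in zip(weights, x)) + bias

def train_logistic_regression(features, labels, epochs=2000, learning_rate=0.5):
    """Plain stochastic gradient descent on the logistic loss."""
    weights, bias = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            prediction = 1.0 / (1.0 + math.exp(-score(x, weights, bias)))
            error = prediction - y
            weights = [w - learning_rate * error * v for w, v in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias

def predict(text, weights, bias):
    return 1 if score(embed(text), weights, bias) > 0 else 0

texts = ["great movie", "loved movie", "awful movie", "boring movie"]
labels = [1, 1, 0, 0]
weights, bias = train_logistic_regression([embed(t) for t in texts], labels)
print(predict("loved great", weights, bias), predict("boring awful", weights, bias))
```

The classifier never sees raw text, only the pooled vectors, so any model that accepts numeric features could be swapped in.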