Representations from Natural Language Data: Successes and Challenges
Emily Pitler, Google AI
State-of-the-art in Natural Language Understanding in 2017
→ → Custom Recurrent Architectures
P 2
Oct. 2018: One Model with Task-specific Tuning in Minutes
P 3
BERT: Bidirectional Encoder Representations from Transformers
P 4
https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html
Transformers: Attention is All You Need
P 5
Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, Kaiser, Polosukhin, NIPS 2017
From One-Hot Vectors to Word Embeddings & Self-Attention
P 6-10
                   animal    ...  street    ...  it
one-hot            0000…1         0001…0         1000…0
embedding          1.4…3.7        4.9…6.4        2.5…8.0
query, key, value
(self-)attention   0.1, 0.2, 0.7
The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
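The build-up on these slides (one-hot vector, embedding lookup, then query/key/value attention weights such as 0.1, 0.2, 0.7) can be sketched in a few lines. This is a minimal single-head illustration, not the talk's implementation. The 2-dimensional toy embeddings, the identity projection matrices, and the helper names (`embed`, `self_attention`) are all assumptions chosen so the numbers are easy to follow.

```python
import math

def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [dot(row, vec) for row in matrix]

def embed(one_hot, table):
    """A one-hot vector times the embedding table just selects one row."""
    return table[one_hot.index(1)]

def self_attention(embeddings, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a list of vectors."""
    queries = [matvec(w_q, x) for x in embeddings]
    keys = [matvec(w_k, x) for x in embeddings]
    values = [matvec(w_v, x) for x in embeddings]
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        # One weight per position: how much this word attends to each word.
        weights = softmax([dot(q, k) / scale for k in keys])
        dim = len(values[0])
        mixed = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 3-word vocabulary and toy embedding table (not from the slides).
tokens = ["animal", "street", "it"]
one_hots = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
table = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
embeddings = [embed(v, table) for v in one_hots]

identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned projections
outputs, weights = self_attention(embeddings, identity, identity, identity)
it_weights = dict(zip(tokens, weights[2]))  # attention paid by "it"
```

With these toy values, "it" places more weight on "animal" than on "street", because its embedding was deliberately made similar to that of "animal". In a trained model the projections are learned rather than fixed.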
OpenAI: Generative Pretraining
P 11
[Figure: two stacks of Transformer blocks. Input labels: "<s> the … too" and "<s> the … tired". Output labels: "The animal tired" and "Acceptable".]
Understanding Can Need “Future” Information
How far is Jacksonville from Miami?
Jacksonville is in the First Coast region of northeast Florida and is centered on the
banks of the St. Johns River, about 25 miles (40 km) south of the Georgia state line
and about 340 miles (550 km) north of Miami.
Mark which area you want to distress. (Mark: VERB)
Mark, which area do you want to distress? (Mark: NOUN)
P 12
Naive Bidirectionality: Words Can “See Themselves”
P 13
[Figure: two stacks of Transformer blocks. Input labels: "<s> the … too" (both stacks). Output labels: "The animal tired" (both stacks).]
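One way to see why words can "see themselves" is to track which input positions each output position can reach once layers are stacked. The sketch below is an illustration under stated assumptions, not the talk's code. It assumes position i has to predict its own word and so must not see input i. It compares a strict left-to-right mask with a "everything except myself" bidirectional mask. The function names are hypothetical.

```python
def left_to_right_mask(n):
    """Position i may look only at positions strictly before it."""
    return [[j < i for j in range(n)] for i in range(n)]

def all_but_self_mask(n):
    """Naive bidirectional: position i may look at every position except i."""
    return [[j != i for j in range(n)] for i in range(n)]

def stack(upper, lower):
    """Visibility after two layers: i reaches k if i sees some j (upper layer)
    whose representation already saw k (lower layer)."""
    n = len(upper)
    return [
        [any(upper[i][j] and lower[j][k] for j in range(n)) for k in range(n)]
        for i in range(n)
    ]

n = 4
ltr = left_to_right_mask(n)
bidi = all_but_self_mask(n)

# Does any position reach its own input after two layers?
ltr_leaks = [stack(ltr, ltr)[i][i] for i in range(n)]
bidi_leaks = [stack(bidi, bidi)[i][i] for i in range(n)]
```

Under these assumptions the left-to-right stack never leaks, while the bidirectional stack leaks at every position. Each word is hidden from itself in one layer but visible through its neighbours in the next.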
Training BERT
Masked Language Model (Fill-in-the-blank)
Deep learning (also [MASK] [MASK] deep structured learning or [MASK]
learning) is part of a broader family of machine learning methods
[MASK] on [MASK] data representations, as opposed to task-specific
algorithms.
[MASK] is allergic to peaches. Is
P 14
https://en.wikipedia.org/wiki/Deep_learning
https://en.wikipedia.org/wiki/Daniel_Tiger%27s_Neighborhood
BooksCorpus: Zhu, Kiros, Zemel, Salakhutdinov, Urtasun, Torralba, Fidler, CVPR 2015
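The fill-in-the-blank objective can be sketched as a function that hides some tokens and remembers what they were. This is a simplified illustration. The 15% masking rate comes from the BERT paper rather than from this slide, and the paper's extra step of sometimes keeping or randomly replacing a selected token is left out. The function name `mask_tokens` and the whitespace tokenization are assumptions.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_rate=0.15, rng=None):
    """Replace a random subset of tokens with [MASK].

    Returns the masked sequence and a {position: original_token} dict,
    which is what the model would be trained to predict.
    """
    if rng is None:
        rng = random.Random(0)
    n_to_mask = max(1, round(len(tokens) * mask_rate))
    positions = sorted(rng.sample(range(len(tokens)), n_to_mask))
    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]
        masked[pos] = MASK
    return masked, targets

# Whitespace tokenization is a simplification; BERT uses WordPiece.
sentence = "How far is Jacksonville from Miami ?".split()
masked, targets = mask_tokens(sentence, rng=random.Random(1))

# Filling the blanks back in recovers the original sentence.
restored = [targets.get(i, tok) for i, tok in enumerate(masked)]
```

Because the model sees the whole masked sentence at once, it can use context on both sides of each blank without the target word being visible to itself.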
Basic BERT Recipe
P 15
SWAG: Zellers, Bisk, Schwartz, Choi, EMNLP 2018
SQuAD: Rajpurkar, Zhang, Lopyrev, Liang, EMNLP 2016
Results: Commonsense Reasoning and Question Answering
P 18
MRPC: Dolan and Brockett, IWP 2005
Pretraining Tasks Matter...and Bigger = Better *
P 19
Do I Need Full BERT Models for All My Tasks?
P 20
Houlsby, Giurgiu, Jastrzebski, Morrone, de Laroussilhe, Gesmundo, Attariyan, Gelly, arXiv, Feb 2019
Try It Out, Get Faster Training with TPUs
P 21
Mismatches between Training and Realistic Inputs
Two Case Studies: Mixed Language Text and Identifying Commands
P 22
P 23
Multiple Languages: Frequent “In-the-Wild”, Rare in Training
P 24
Multiple Languages: Frequent “In-the-Wild”, Rare in Training
“A Fast, Compact, Accurate Model for Language Identification of Codemixed Text”
Zhang, Riesa, Gillick, Bakalov, Baldridge, Weiss, EMNLP 2018
P 25
Accuracy and Speed of Token-level Language Id
P 26
Useful Preprocessing Step Across Tasks
P 28
Mismatches between Training and Realistic Inputs
Two Case Studies: Mixed Language Text and Identifying Commands
P 29
Noun-Verb Ambiguity
“lives” / Noun → /laIvz/
“lives” / Verb → /lIvz/
flies: NOUN
Mark: VERB
P 30
Elkahky, Webster, Andor, Pitler, EMNLP 2018
Certain insects can damage plumerias, such as mites, flies, or aphids. (flies: NOUN)
Mark which area you want to distress. (Mark: VERB)
P 31
“A Challenge Set and Methods for Noun-Verb Ambiguity”, EMNLP 2018
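The "lives" example shows why the part-of-speech decision matters downstream: the pronunciation depends on it. A minimal sketch of that dependency follows. It assumes the tag comes from some upstream tagger that is not shown here. The table holds only the two entries given on the slide, and the function name is hypothetical.

```python
# (word, part-of-speech tag) -> pronunciation, as written on the slide.
PRONUNCIATIONS = {
    ("lives", "NOUN"): "/laIvz/",
    ("lives", "VERB"): "/lIvz/",
}

def pronounce(word, pos_tag):
    """Look up a homograph's pronunciation given its part-of-speech tag.

    Returns None when the (word, tag) pair is not in the table.
    """
    return PRONUNCIATIONS.get((word.lower(), pos_tag))

noun_reading = pronounce("lives", "NOUN")
verb_reading = pronounce("Lives", "VERB")
unknown = pronounce("flies", "NOUN")
```

A wrong tag silently selects the wrong pronunciation, which is why errors on noun-verb ambiguity carry straight through to the homograph results.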
Accuracy on Noun-Verb Disambiguation
P 32
Pronunciation of Homographs Accuracy
P 33
Mark VERB
Webster, Recasens, Axelrod, Baldridge, TACL 2019
Kwiatkowski, Palomaki, Redfield, Collins, Parikh, Alberti, Epstein, Polosukhin, Kelcey, Devlin, Lee, Toutanova, Jones, Chang, Dai, Uszkoreit, Le, Petrov, TACL 2019
Released Datasets with “In-the-Wild” Natural Challenges
P 34
Summary
P 35

[2024]Digital Global Overview Report 2024 Meltwater.pdf
 

Emily Pitler - Representations from Natural Language Data: Successes and Challenges

  • 1. Representations from Natural Language Data: Successes and Challenges. Emily Pitler, Google AI
  • 2. State-of-the-art in Natural Language Understanding in 2017: Custom Recurrent Architectures
  • 3. Oct. 2018: One Model with Task-specific Tuning in Minutes
  • 4. BERT: Bidirectional Encoder Representations from Transformers
  • 5. Transformers: Attention is All You Need. Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, Kaiser, Polosukhin, NIPS 2017. https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html
  • 6. From One-Hot Vectors to Word Embeddings & Self-Attention. Tokens: animal ... street ... it. One-hot: 0000…1, 0001…0, 1000…0. Embedding: 1.4…3.7, 4.9…6.4, 2.5…8.0. References: The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
  • 7. From One-Hot Vectors to Word Embeddings & Self-Attention. Tokens: animal ... street ... it. One-hot: 0000…1, 0001…0, 1000…0. Embedding: 1.4…3.7, 4.9…6.4, 2.5…8.0. References: The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
  • 8. From One-Hot Vectors to Word Embeddings & Self-Attention. Query, key, value. Tokens: animal ... street ... it. One-hot: 0000…1, 0001…0, 1000…0. Embedding: 1.4…3.7, 4.9…6.4, 2.5…8.0. References: The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
  • 9. From One-Hot Vectors to Word Embeddings & Self-Attention. Query, key, value. Tokens: animal ... street ... it. One-hot: 0000…1, 0001…0, 1000…0. Embedding: 1.4…3.7, 4.9…6.4, 2.5…8.0. (Self-)attention weights: 0.1, 0.2, 0.7. References: The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
  • 10. From One-Hot Vectors to Word Embeddings & Self-Attention. Query, key, value. Tokens: animal ... street ... it. One-hot: 0000…1, 0001…0, 1000…0. Embedding: 1.4…3.7, 4.9…6.4, 2.5…8.0. (Self-)attention weights: 0.1, 0.2, 0.7. References: The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
  • 11. OpenAI: Generative Pretraining. [Diagram: two stacks of Transformer blocks, labeled "The animal tired" / "<s> the … too" and "Acceptable" / "<s> the … tired"]
  • 12. Understanding Can Need “Future” Information. Question: How far is Jacksonville from Miami? Passage: Jacksonville is in the First Coast region of northeast Florida and is centered on the banks of the St. Johns River, about 25 miles (40 km) south of the Georgia state line and about 340 miles (550 km) north of Miami. VERB: "Mark which area you want to distress." NOUN: "Mark, which area do you want to distress?"
  • 13. Naive Bidirectionality: Words Can “See Themselves”. [Diagram: two stacks of Transformer blocks, each labeled "The animal tired" / "<s> the … too"]
  • 14. Training BERT: Masked Language Model (Fill-in-the-blank). "Deep learning (also [MASK] [MASK] deep structured learning or [MASK] learning) is part of a broader family of machine learning methods [MASK] on [MASK] data representations, as opposed to task-specific algorithms." "[MASK] is allergic to peaches. Is" Sources: https://en.wikipedia.org/wiki/Deep_learning, https://en.wikipedia.org/wiki/Daniel_Tiger%27s_Neighborhood. BooksCorpus: Zhu, Kiros, Zemel, Salakhutdinov, Urtasun, Torralba, Fidler, CVPR 2015
  • 18. Results: Commonsense Reasoning and Question Answering. SWAG: Zellers, Bisk, Schwartz, Choi, EMNLP 2018. SQuAD: Rajpurkar, Zhang, Lopyrev, Liang, EMNLP 2016
  • 19. Pretraining Tasks Matter... and Bigger = Better *. MRPC: Dolan and Brockett, IWP 2005
  • 20. Do I Need Full BERT Models for All My Tasks? Houlsby, Giurgiu, Jastrzebski, Morrone, de Laroussilhe, Gesmundo, Attariyan, Gelly, arXiv, Feb 2019
  • 21. Try It Out, Get Faster Training with TPUs
  • 22. Mismatches between Training and Realistic Inputs. Two Case Studies: Mixed Language Text and Identifying Commands
  • 23. Multiple Languages: Frequent “In-the-Wild”, Rare in Training
  • 24. Multiple Languages: Frequent “In-the-Wild”, Rare in Training
  • 25. “A Fast, Compact, Accurate Model for Language Identification of Codemixed Text”, Zhang, Riesa, Gillick, Bakalov, Baldridge, Weiss, EMNLP 2018
  • 26. Accuracy and Speed of Token-level Language Id
  • 27. Accuracy and Speed of Token-level Language Id
  • 28. Useful Preprocessing Step Across Tasks
  • 29. Mismatches between Training and Realistic Inputs. Two Case Studies: Mixed Language Text and Identifying Commands
  • 30. Noun-Verb Ambiguity. “lives” / Noun → /laIvz/; “lives” / Verb → /lIvz/. flies: NOUN; Mark: VERB. Elkahky, Webster, Andor, Pitler, EMNLP 2018
  • 31. “A Challenge Set and Methods for Noun-Verb Ambiguity”, EMNLP 2018. "Certain insects can damage plumerias, such as mites, flies, or aphids." (flies: NOUN). "Mark which area you want to distress." (Mark: VERB)
  • 32. Accuracy on Noun-Verb Disambiguation
  • 34. Released Datasets with “In-the-Wild” Natural Challenges. Mark: VERB. Webster, Recasens, Axelrod, Baldridge, TACL 2019. Kwiatkowski, Palomaki, Redfield, Collins, Parikh, Alberti, Epstein, Polosukhin, Kelcey, Devlin, Lee, Toutanova, Jones, Chang, Dai, Uszkoreit, Le, Petrov, TACL 2019
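
Slides 8 to 10 name query, key and value vectors and show (self-)attention weights for the tokens "animal", "street" and "it". As a rough illustration of that step only, the sketch below computes scaled dot-product self-attention in plain Python. It is not code from the talk: the two-dimensional embeddings are made-up toy values, and the queries, keys and values are taken to be the embeddings themselves, whereas a real Transformer would first apply learned projections.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical toy embeddings (assumption: not the values on the slides).
tokens = ["animal", "street", "it"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.5]]

# Assumption: identity projections, so Q = K = V = embeddings.
outputs, weights = self_attention(embeddings, embeddings, embeddings)

for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row is one token's attention distribution over all three tokens, analogous to the 0.1 / 0.2 / 0.7 column on the slides. With these toy vectors, "it" puts more weight on "animal" than on "street" simply because its made-up embedding is closer to that of "animal".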