SlideShare a Scribd company logo
1 of 37
Download to read offline
UCU ML Summer Workshops 2017
Natural Language Processing: Part 2
Yuriy Guts
ML Engineer
Remaining Agenda
Day 4: Short Text Similarity & Paraphrase Detection
● Paraphrase identification problem
● Traditional and DL approaches to semantic similarity
● Architectural principles for state-of-the-art neural NLP
● DL feature extraction for NLP
Day 5: Sequence-to-Sequence Modeling
● Attention mechanisms in DL for NLP
● Sequence-to-sequence neural models
Where can I get very professional and reliable envelope printing service in Sydney?
Where can I get very affordable branded envelope printing service in Sydney?
Paraphrase Identification
Why are doctors always late?
Why doctors always make you wait for 15-20 minutes before they see you?
Source: Quora Question Pairs Dataset
https://data.quora.com/First-Quora-Dataset-Release-Question-Pairs
Recap: BoW Representations
Image copyright © S. Mukherjee
Recap: Cosine Similarity
Image copyright © C. Perone
Lexical Databases and Ontologies
house, home
dwelling
hermitage cottage
backyard
veranda
study
“A place that serves as the living
quarters of one or more families”
. . .
penthouse
Meronym
Hyponym
HyponymHypernym
Hyponym
Def
WordNet
“The complete meaning of a word is always contextual, and no study of
meaning apart from context can be taken seriously”
Distributed Hypothesis of Language
John R. Firth. The technique of semantics.
Transactions of the Philological Society, 1935.
“Words that occur in the same contexts tend to have similar meanings”
Distributional Hypothesis of Language [Harris, 1954]
Image copyright © D. Kartsaklis
The minimum amount of “work” needed to transform document 1 to document 2.
Inspired by Earth Mover’s Distance, a well-studied transportation problem.
Word Mover’s Distance (WMD)
M. Kusner et al. “From Word Embeddings to Document Distances”, 2015.
http://proceedings.mlr.press/v37/kusnerb15.pdf
WMD: Linear Optimization Problem
M. Kusner et al. “From Word Embeddings to Document Distances”, 2015.
http://proceedings.mlr.press/v37/kusnerb15.pdf
nBOW frequency of the
i-th word in the document
“How much” of word i
travels to word j
Architectural Principles
for State-of-the-Art Neural NLP
Embed
Encode
Attend
Predict
State-of-the-Art NLP Pipeline
Image copyright © Explosion AI
Embed
Image copyright © The Morning Paper
Encode: Convolutions over Word Embeddings
Image copyright © WildML
Y. Kim. “Convolutional Neural Networks for Sentence Classification” [2014]
Encode: RNNs over Word Embeddings
Image copyright © Edwin Chen
Encode: RNNs over Word Embeddings
Image copyright © Edwin Chen
Encode: Gated Recurrent Architectures
Image copyright © Chris Olah
Encode: Gated Recurrent Architectures
Image copyright © Edwin Chen
Two Input Documents? Siamese Network
Run the same encoding step for every input.
Share encoder weights, don’t learn WE1
and WE2
separately.
Distance Learning, Contrastive Loss
Penalize similar pairs by a monotonically increasing function of their learned distance.
Penalize different pairs by a monotonically decreasing function of their learned distance.
Hadsell, Chopra, LeCun “Dimensionality Reduction by Learning an Invariant Mapping”, 2006.
http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
Sequence-to-Sequence Modeling
and Attention Mechanisms in Neural NLP
State-of-the-Art NLP Pipeline
Image copyright © Explosion AI
Relative Complexity of NLP Problems
Easy Medium Hard
• Chunking
• Part-of-Speech Tagging
• Named Entity Recognition
• Spam Detection
• Spellchecking
• Syntactic Parsing
• Paraphrase Identification
• Sentiment Analysis
• Topic Modeling
• Information Retrieval
• Machine Translation
• Text Generation
• Automatic Summarization
• Question Answering
• Conversational Interfaces
Relative Complexity of NLP Problems
Easy Medium Hard
• Chunking
• Part-of-Speech Tagging
• Named Entity Recognition
• Spam Detection
• Spellchecking
• Syntactic Parsing
• Paraphrase Identification
• Sentiment Analysis
• Topic Modeling
• Information Retrieval
• Machine Translation
• Text Generation
• Automatic Summarization
• Question Answering
• Conversational Interfaces
Seq2Seq can be applied here
Seq2Seq Modeling at a Glance
Image copyright © Suriyadeepan Ram
● Two different models: an encoder and a decoder (typically RNNs).
● Encoder represents an input document as a fixed size “thought vector”.
● Decoder unwinds the thought vector into an output sequence.
● Hard for encoder to compress the source documents into a fixed size vector.
● Performance deteriorates rapidly as the size of the document increases
(e.g. good local grammar, but lots of word repetition).
Vanilla Seq2Seq: Challenges
Content credit: Stanford CS224n
Traditional MT: “Word Alignment”
Content credit: Stanford CS224n
It would be a lot more beneficial to let the decoder peek at the original input at
each step of the decoding phase.
Jointly Learning to Align and Translate
D. Bahdanau, K. Cho et al. “Neural Machine Translation by Jointly Learning to
Align and Translate”, 2014.
https://arxiv.org/pdf/1409.0473.pdf
Neural Attention: Scoring
Content credit: Stanford CS224n
Neural Attention: Scoring
Content credit: Stanford CS224n
Attention Mechanism: Context
Content credit: Stanford CS224n
Interpretability of Attention Scores
Image copyright © Bahdanau et al.
Interpretability of Attention Scores
Image copyright © Distill.Pub
MT Performance
Content credit: Stanford CS224n
Recent paper: SotA NMT using only
stacked attention units and positional
embeddings (no CNN, RNN etc.).
Very computationally efficient.
Attention-Only NMT
Vaswani, Shazeer et al. (Google Brain)
“Attention is All You Need”, Jun 2017.
https://arxiv.org/pdf/1706.03762.pdf

More Related Content

What's hot

Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Mariana Soffer
 
Natural Language Processing in Alternative and Augmentative Communication
Natural Language Processing in Alternative and Augmentative CommunicationNatural Language Processing in Alternative and Augmentative Communication
Natural Language Processing in Alternative and Augmentative Communication
Divya Sugumar
 

What's hot (20)

Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
 
Natural Language Processing for Games Research
Natural Language Processing for Games ResearchNatural Language Processing for Games Research
Natural Language Processing for Games Research
 
The Role of Natural Language Processing in Information Retrieval
The Role of Natural Language Processing in Information RetrievalThe Role of Natural Language Processing in Information Retrieval
The Role of Natural Language Processing in Information Retrieval
 
Natural Language Processing: L01 introduction
Natural Language Processing: L01 introductionNatural Language Processing: L01 introduction
Natural Language Processing: L01 introduction
 
Natural Language Processing: L02 words
Natural Language Processing: L02 wordsNatural Language Processing: L02 words
Natural Language Processing: L02 words
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Natural language processing (Python)
Natural language processing (Python)Natural language processing (Python)
Natural language processing (Python)
 
NLP & Machine Learning - An Introductory Talk
NLP & Machine Learning - An Introductory Talk NLP & Machine Learning - An Introductory Talk
NLP & Machine Learning - An Introductory Talk
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Natural Language Processing in Alternative and Augmentative Communication
Natural Language Processing in Alternative and Augmentative CommunicationNatural Language Processing in Alternative and Augmentative Communication
Natural Language Processing in Alternative and Augmentative Communication
 
Natural Language Processing with Python
Natural Language Processing with PythonNatural Language Processing with Python
Natural Language Processing with Python
 
natural language processing help at myassignmenthelp.net
natural language processing  help at myassignmenthelp.netnatural language processing  help at myassignmenthelp.net
natural language processing help at myassignmenthelp.net
 
Word representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2VecWord representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2Vec
 
Natural Language Processing Crash Course
Natural Language Processing Crash CourseNatural Language Processing Crash Course
Natural Language Processing Crash Course
 
Natural Language Processing and Machine Learning
Natural Language Processing and Machine LearningNatural Language Processing and Machine Learning
Natural Language Processing and Machine Learning
 
Learning to understand phrases by embedding the dictionary
Learning to understand phrases by embedding the dictionaryLearning to understand phrases by embedding the dictionary
Learning to understand phrases by embedding the dictionary
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
 
Artificial Intelligence Notes Unit 4
Artificial Intelligence Notes Unit 4Artificial Intelligence Notes Unit 4
Artificial Intelligence Notes Unit 4
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
 

Similar to UCU NLP Summer Workshops 2017 - Part 2

BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
Karthik Murugesan
 

Similar to UCU NLP Summer Workshops 2017 - Part 2 (20)

The Magic of Image processing using Neural Networks
The Magic of Image processing using Neural Networks The Magic of Image processing using Neural Networks
The Magic of Image processing using Neural Networks
 
Dcnn for text
Dcnn for textDcnn for text
Dcnn for text
 
Tomáš Mikolov - Distributed Representations for NLP
Tomáš Mikolov - Distributed Representations for NLPTomáš Mikolov - Distributed Representations for NLP
Tomáš Mikolov - Distributed Representations for NLP
 
OWF14 - Big Data : The State of Machine Learning in 2014
OWF14 - Big Data : The State of Machine  Learning in 2014OWF14 - Big Data : The State of Machine  Learning in 2014
OWF14 - Big Data : The State of Machine Learning in 2014
 
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
 
Conversational AI with Rasa - PyData Workshop
Conversational AI with Rasa - PyData WorkshopConversational AI with Rasa - PyData Workshop
Conversational AI with Rasa - PyData Workshop
 
Cork AI Meetup Number 3
Cork AI Meetup Number 3Cork AI Meetup Number 3
Cork AI Meetup Number 3
 
Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...
 
Beyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLPBeyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLP
 
Building a Neural Machine Translation System From Scratch
Building a Neural Machine Translation System From ScratchBuilding a Neural Machine Translation System From Scratch
Building a Neural Machine Translation System From Scratch
 
Talk from NVidia Developer Connect
Talk from NVidia Developer ConnectTalk from NVidia Developer Connect
Talk from NVidia Developer Connect
 
Image captioning
Image captioningImage captioning
Image captioning
 
Understanding deep learning
Understanding deep learningUnderstanding deep learning
Understanding deep learning
 
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
 
Powering NLU Engine with Apache Spark to Communicate with World with Rahul Kumar
Powering NLU Engine with Apache Spark to Communicate with World with Rahul KumarPowering NLU Engine with Apache Spark to Communicate with World with Rahul Kumar
Powering NLU Engine with Apache Spark to Communicate with World with Rahul Kumar
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Commonsense knowledge for Machine Intelligence - part 1
Commonsense knowledge for Machine Intelligence - part 1Commonsense knowledge for Machine Intelligence - part 1
Commonsense knowledge for Machine Intelligence - part 1
 
Natural Language Processing Advancements By Deep Learning - A Survey
Natural Language Processing Advancements By Deep Learning - A SurveyNatural Language Processing Advancements By Deep Learning - A Survey
Natural Language Processing Advancements By Deep Learning - A Survey
 
Introduction to Text Mining
Introduction to Text MiningIntroduction to Text Mining
Introduction to Text Mining
 
An Introduction to Recent Advances in the Field of NLP
An Introduction to Recent Advances in the Field of NLPAn Introduction to Recent Advances in the Field of NLP
An Introduction to Recent Advances in the Field of NLP
 

More from Yuriy Guts

A Developer Overview of Redis
A Developer Overview of RedisA Developer Overview of Redis
A Developer Overview of Redis
Yuriy Guts
 
Non-Functional Requirements
Non-Functional RequirementsNon-Functional Requirements
Non-Functional Requirements
Yuriy Guts
 
Introduction to Software Architecture
Introduction to Software ArchitectureIntroduction to Software Architecture
Introduction to Software Architecture
Yuriy Guts
 
UML for Business Analysts
UML for Business AnalystsUML for Business Analysts
UML for Business Analysts
Yuriy Guts
 
Intro to Software Engineering for non-IT Audience
Intro to Software Engineering for non-IT AudienceIntro to Software Engineering for non-IT Audience
Intro to Software Engineering for non-IT Audience
Yuriy Guts
 

More from Yuriy Guts (18)

Target Leakage in Machine Learning (ODSC East 2020)
Target Leakage in Machine Learning (ODSC East 2020)Target Leakage in Machine Learning (ODSC East 2020)
Target Leakage in Machine Learning (ODSC East 2020)
 
Automated Machine Learning
Automated Machine LearningAutomated Machine Learning
Automated Machine Learning
 
Target Leakage in Machine Learning
Target Leakage in Machine LearningTarget Leakage in Machine Learning
Target Leakage in Machine Learning
 
Paraphrase Detection in NLP
Paraphrase Detection in NLPParaphrase Detection in NLP
Paraphrase Detection in NLP
 
NoSQL (ELEKS DevTalks #1 - Jan 2015)
NoSQL (ELEKS DevTalks #1 - Jan 2015)NoSQL (ELEKS DevTalks #1 - Jan 2015)
NoSQL (ELEKS DevTalks #1 - Jan 2015)
 
Experiments with Machine Learning - GDG Lviv
Experiments with Machine Learning - GDG LvivExperiments with Machine Learning - GDG Lviv
Experiments with Machine Learning - GDG Lviv
 
A Developer Overview of Redis
A Developer Overview of RedisA Developer Overview of Redis
A Developer Overview of Redis
 
[JEEConf 2015] Lessons from Building a Modern B2C System in Scala
[JEEConf 2015] Lessons from Building a Modern B2C System in Scala[JEEConf 2015] Lessons from Building a Modern B2C System in Scala
[JEEConf 2015] Lessons from Building a Modern B2C System in Scala
 
Redis for .NET Developers
Redis for .NET DevelopersRedis for .NET Developers
Redis for .NET Developers
 
Aspect-Oriented Programming (AOP) in .NET
Aspect-Oriented Programming (AOP) in .NETAspect-Oriented Programming (AOP) in .NET
Aspect-Oriented Programming (AOP) in .NET
 
Non-Functional Requirements
Non-Functional RequirementsNon-Functional Requirements
Non-Functional Requirements
 
Introduction to Software Architecture
Introduction to Software ArchitectureIntroduction to Software Architecture
Introduction to Software Architecture
 
UML for Business Analysts
UML for Business AnalystsUML for Business Analysts
UML for Business Analysts
 
Intro to Software Engineering for non-IT Audience
Intro to Software Engineering for non-IT AudienceIntro to Software Engineering for non-IT Audience
Intro to Software Engineering for non-IT Audience
 
ELEKS DevTalks #4: Amazon Web Services Crash Course
ELEKS DevTalks #4: Amazon Web Services Crash CourseELEKS DevTalks #4: Amazon Web Services Crash Course
ELEKS DevTalks #4: Amazon Web Services Crash Course
 
ELEKS Summer School 2012: .NET 09 - Databases
ELEKS Summer School 2012: .NET 09 - DatabasesELEKS Summer School 2012: .NET 09 - Databases
ELEKS Summer School 2012: .NET 09 - Databases
 
ELEKS Summer School 2012: .NET 06 - Multithreading
ELEKS Summer School 2012: .NET 06 - MultithreadingELEKS Summer School 2012: .NET 06 - Multithreading
ELEKS Summer School 2012: .NET 06 - Multithreading
 
ELEKS Summer School 2012: .NET 04 - Resources and Memory
ELEKS Summer School 2012: .NET 04 - Resources and MemoryELEKS Summer School 2012: .NET 04 - Resources and Memory
ELEKS Summer School 2012: .NET 04 - Resources and Memory
 

Recently uploaded

Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Sérgio Sacani
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
RohitNehra6
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
PirithiRaju
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
Areesha Ahmad
 
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
Lokesh Kothari
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Sérgio Sacani
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
Sérgio Sacani
 
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
ssuser79fe74
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
PirithiRaju
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Sérgio Sacani
 

Recently uploaded (20)

GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 

UCU NLP Summer Workshops 2017 - Part 2

  • 1. UCU ML Summer Workshops 2017 Natural Language Processing: Part 2 Yuriy Guts ML Engineer
  • 2. Remaining Agenda Day 4: Short Text Similarity & Paraphrase Detection ● Paraphrase identification problem ● Traditional and DL approaches to semantic similarity ● Architectural principles for state-of-the-art neural NLP ● DL feature extraction for NLP Day 5: Sequence-to-Sequence Modeling ● Attention mechanisms in DL for NLP ● Sequence-to-sequence neural models
  • 3. Where can I get very professional and reliable envelope printing service in Sydney? Where can I get very affordable branded envelope printing service in Sydney? Paraphrase Identification Why are doctors always late? Why doctors always make you wait for 15-20 minutes before they see you? Source: Quora Question Pairs Dataset https://data.quora.com/First-Quora-Dataset-Release-Question-Pairs
  • 4. Recap: BoW Representations Image copyright © S. Mukherjee
  • 5. Recap: Cosine Similarity Image copyright © C. Perone
  • 6. Lexical Databases and Ontologies house, home dwelling hermitage cottage backyard veranda study “A place that serves as the living quarters of one or more families” . . . penthouse Meronym Hyponym HyponymHypernym Hyponym Def
  • 8. “The complete meaning of a word is always contextual, and no study of meaning apart from context can be taken seriously” Distributed Hypothesis of Language John R. Firth. The technique of semantics. Transactions of the Philological Society, 1935.
  • 9. “Words that occur in the same contexts tend to have similar meanings” Distributional Hypothesis of Language [Harris, 1954] Image copyright © D. Kartsaklis
  • 10. The minimum amount of “work” needed to transform document 1 to document 2. Inspired by Earth Mover’s Distance, a well-studied transportation problem. Word Mover’s Distance (WMD) M. Kusner et al. “From Word Embeddings to Document Distances”, 2015. http://proceedings.mlr.press/v37/kusnerb15.pdf
  • 11. WMD: Linear Optimization Problem M. Kusner et al. “From Word Embeddings to Document Distances”, 2015. http://proceedings.mlr.press/v37/kusnerb15.pdf nBOW frequency of the i-th word in the document “How much” of word i travels to word j
  • 14. State-of-the-Art NLP Pipeline Image copyright © Explosion AI
  • 15. Embed Image copyright © The Morning Paper
  • 16. Encode: Convolutions over Word Embeddings Image copyright © WildML Y. Kim. “Convolutional Neural Networks for Sentence Classification” [2014]
  • 17. Encode: RNNs over Word Embeddings Image copyright © Edwin Chen
  • 18. Encode: RNNs over Word Embeddings Image copyright © Edwin Chen
  • 19. Encode: Gated Recurrent Architectures Image copyright © Chris Olah
  • 20. Encode: Gated Recurrent Architectures Image copyright © Edwin Chen
  • 21. Two Input Documents? Siamese Network Run the same encoding step for every input. Share encoder weights, don’t learn WE1 and WE2 separately.
  • 22. Distance Learning, Contrastive Loss Penalize similar pairs by a monotonically increasing function of their learned distance. Penalize different pairs by a monotonically decreasing function of their learned distance. Hadsell, Chopra, LeCun “Dimensionality Reduction by Learning an Invariant Mapping”, 2006. http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
  • 24. State-of-the-Art NLP Pipeline Image copyright © Explosion AI
  • 25. Relative Complexity of NLP Problems Easy Medium Hard • Chunking • Part-of-Speech Tagging • Named Entity Recognition • Spam Detection • Spellchecking • Syntactic Parsing • Paraphrase Identification • Sentiment Analysis • Topic Modeling • Information Retrieval • Machine Translation • Text Generation • Automatic Summarization • Question Answering • Conversational Interfaces
  • 26. Relative Complexity of NLP Problems Easy Medium Hard • Chunking • Part-of-Speech Tagging • Named Entity Recognition • Spam Detection • Spellchecking • Syntactic Parsing • Paraphrase Identification • Sentiment Analysis • Topic Modeling • Information Retrieval • Machine Translation • Text Generation • Automatic Summarization • Question Answering • Conversational Interfaces Seq2Seq can be applied here
  • 27. Seq2Seq Modeling at a Glance Image copyright © Suriyadeepan Ram ● Two different models: an encoder and a decoder (typically RNNs). ● Encoder represents an input document as a fixed size “thought vector”. ● Decoder unwinds the thought vector into an output sequence.
  • 28. ● Hard for encoder to compress the source documents into a fixed size vector. ● Performance deteriorates rapidly as the size of the document increases (e.g. good local grammar, but lots of word repetition). Vanilla Seq2Seq: Challenges Content credit: Stanford CS224n
  • 29. Traditional MT: “Word Alignment” Content credit: Stanford CS224n
  • 30. It would be a lot more beneficial to let the decoder peek at the original input at each step of the decoding phase. Jointly Learning to Align and Translate D. Bahdanau, K. Cho et al. “Neural Machine Translation by Jointly Learning to Align and Translate”, 2014. https://arxiv.org/pdf/1409.0473.pdf
  • 31. Neural Attention: Scoring Content credit: Stanford CS224n
  • 32. Neural Attention: Scoring Content credit: Stanford CS224n
  • 33. Attention Mechanism: Context Content credit: Stanford CS224n
  • 34. Interpretability of Attention Scores Image copyright © Bahdanau et al.
  • 35. Interpretability of Attention Scores Image copyright © Distill.Pub
  • 37. Recent paper: SotA NMT using only stacked attention units and positional embeddings (no CNN, RNN etc.). Very computationally efficient. Attention-Only NMT Vaswani, Shazeer et al. (Google Brain) “Attention is All You Need”, Jun 2017. https://arxiv.org/pdf/1706.03762.pdf