This was presented to software developers with the goal of introducing them to the basic machine learning workflow, code snippets, possibilities, and the state of the art in NLP, and of giving some clues on where to get started.
1. Meandering through the machine learning maze
NLP & Machine
Learning
Vijay Ganti
** I want to acknowledge that full credit for the content of this deck goes to various sources, including Stanford NLP & Deep Learning content, Machine Learning
lectures by Andrew Ng, Natural Language Processing with Python, and many more. This was purely an educational presentation, with no monetary gain
for any party.
2. NLP powered by ML will change the way business gets done
❖ Conversational agents are becoming an important form of
human-computer communication (Customer support
interactions using chat-bots)
❖ Much of human-human communication is now mediated
by computers (Email, Social Media, Messaging)
❖ An enormous amount of knowledge is now available in
machine readable form as natural language text (web,
proprietary enterprise content)
3. What do I hope to achieve with this talk?
My 3 cents
Make it accessible & real
Get you excited
Give you a sense of what it takes
5. Let’s start with a typical ML workflow
Training: Training Data (Input + Labels) → Feature Extractor → Features + Label → ML Algo → Classifier Model
Prediction: Prediction Data (Input Only) → Feature Extractor → Features → Classifier Model → Prediction
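That pipeline can be sketched in plain Python. The `MajorityModel` "algorithm" and the feature names below are illustrative stand-ins, not a real ML library; the point is that training and prediction share the same feature extractor:

```python
from collections import Counter

def feature_extractor(example):
    # Turn a raw input into a dict of features (illustrative placeholder).
    return {"last_letter": example[-1]}

class MajorityModel:
    """Trivial stand-in for a real ML algorithm: predicts the most
    common label seen for a feature value during training."""
    def __init__(self, table, default):
        self.table, self.default = table, default

    def classify(self, features):
        return self.table.get(features["last_letter"], self.default)

def train(training_data):
    # training_data: list of (input, label) pairs -> classifier model
    by_value = {}
    for x, label in training_data:
        value = feature_extractor(x)["last_letter"]
        by_value.setdefault(value, Counter())[label] += 1
    table = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
    default = Counter(lbl for _, lbl in training_data).most_common(1)[0][0]
    return MajorityModel(table, default)

model = train([("Anna", "female"), ("Maria", "female"), ("John", "male")])
print(model.classify(feature_extractor("Laura")))  # 'female' (ends in 'a')
```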
7. Training Data (Input + Labels)

from nltk.corpus import names  # requires the NLTK 'names' corpus

names_data = [(n, 'male') for n in names.words('male.txt')] + \
             [(n, 'female') for n in names.words('female.txt')]

Feature Extractor

def gender_feature1(name):
    return {'last_letter': name[-1]}

def gender_feature2(name):
    return {'last_letter': name[-1], 'last_two_letters': name[-2:]}

Features

feature_set = [(gender_feature1(n), g) for n, g in names_data]
feature_set = [(gender_feature2(n), g) for n, g in names_data]  # richer alternative

Partition Data

train, devtest, test = (feature_set[1500:],
                        feature_set[500:1500],
                        feature_set[:500])
8. ML Algo

import nltk

classifier = nltk.NaiveBayesClassifier.train(train)

Classifier Model

prediction_accuracy = nltk.classify.accuracy(classifier, devtest)

Prediction

prediction = classifier.classify(gender_feature2('Kathryn'))
prediction = classifier.classify(gender_feature2('Vijay'))
prediction = classifier.classify(gender_feature2('Jenna'))
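Under the hood, a Naive Bayes classifier just combines counts: P(label) times the product of P(feature=value | label). A minimal pure-Python sketch of that idea (not NLTK's actual implementation; add-one smoothing here is a simplification):

```python
import math
from collections import Counter, defaultdict

def train_nb(labeled_features):
    """labeled_features: list of (feature_dict, label) pairs."""
    label_counts = Counter(label for _, label in labeled_features)
    # counts[label][(feature, value)] = how often that pair occurred
    counts = defaultdict(Counter)
    for feats, label in labeled_features:
        for f, v in feats.items():
            counts[label][(f, v)] += 1

    def classify(feats):
        best_label, best_score = None, -math.inf
        for label, n in label_counts.items():
            # log P(label) + sum of log P(feature=value | label),
            # with add-one smoothing to avoid zero probabilities
            score = math.log(n / len(labeled_features))
            for f, v in feats.items():
                score += math.log((counts[label][(f, v)] + 1) / (n + 2))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
    return classify

data = [({'last_letter': 'a'}, 'female'),
        ({'last_letter': 'a'}, 'female'),
        ({'last_letter': 'k'}, 'male'),
        ({'last_letter': 'n'}, 'male')]
classify = train_nb(data)
print(classify({'last_letter': 'a'}))  # 'female'
```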
10. Why is NLP hard?
Violinist Linked to JAL Crash Blossoms
Teacher Strikes Idle Kids
Red Tape Holds Up New Bridges
Hospitals Are Sued by 7 Foot Doctors
Juvenile Court to Try Shooting Defendant
Local High School Dropouts Cut in Half
Ambiguity
11. Why is NLP hard...er?

Segmentation issues
the New York-New Haven Railroad: [New York]-[New Haven], or [New] [York-New] [Haven]?

Idioms
Kodak moment
As cool as a cucumber
Hold your horses
Kick the bucket

Neologisms
Gamify
Double-click
Bromance
Unfriend

Informal English
Great job @justinbieber! Were SOO PROUD of what youve accomplished! U taught us 2 #neversaynever & you yourself should never give up either♥

Tricky NEs
Where is A Bug’s Life playing …
Let It Be was recorded …
… a mutation on the for gene …
12. State of the language technology

MOSTLY SOLVED
❖ Spam Detection ("Let’s go to Agra" vs. "Buy V1AGRA …")
❖ Part of Speech Tagging (POS): big/Adj dog/Noun quickly/Adv China/Noun trump/Verb
❖ Named Entity Recognition (NER): Satya/Person met with LinkedIn/Company in Palo Alto/Place

GOOD PROGRESS
❖ Sentiment Analysis
❖ Coreference Resolution ("Vijay told Arnaud that he was wrong")
❖ Word Sense Disambiguation ("I need new batteries for my mouse")
❖ Information Extraction

STILL VERY HARD
❖ Question Answering ("What kind of cable do I need to project from my Mac?")
❖ Paraphrasing ("Dell acquired EMC last year" ≈ "EMC was taken over by Dell 8 months back.")
❖ Summarization ("Brexit vote shocked everyone" + "Financial markets are in turmoil" + "Young people are not happy" → "Brexit has been disruptive")
❖ Chat/Dialog ("What movie is playing this evening?" → "Casablanca. Do you want to buy tickets?")
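Spam detection, the most "solved" task above, is often bootstrapped with simple patterns before any ML. A toy sketch of catching leet-speak obfuscation like "V1AGRA" (the pattern is illustrative, not from a real spam filter):

```python
import re

# Match 'viagra' with common character substitutions (1 for i, @ for a),
# case-insensitively.
SPAM_PATTERN = re.compile(r'v[i1][a@]gr[a@]', re.IGNORECASE)

def looks_spammy(message):
    return bool(SPAM_PATTERN.search(message))

print(looks_spammy("Buy V1AGRA now"))    # True
print(looks_spammy("Let's go to Agra"))  # False
```

Real systems combine many such patterns as features in a classifier, as the regular-expressions slide later notes.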
13. Why is this a good time to solve hard problems in NLP?
❖ What has changed since 2006?
❖ New methods for unsupervised pre-training have been developed (Restricted
Boltzmann Machines, auto-encoders, contrastive estimation, etc.)
❖ More efficient parameter estimation methods
❖ Better understanding of model regularization
❖ Changes in computing technology favor deep learning
❖ In NLP, speed has traditionally come from exploiting sparsity
❖ But with modern machines, branches and widely spaced memory accesses are costly
❖ Uniform parallel operations on dense vectors are faster. These trends are even
stronger with multi-core CPUs and GPUs
❖ Wide availability of ML implementations and computing infrastructure to run them
Come back to this slide
15. What do you need to learn to make machines learn?
❖ Machine Learning Process & Concepts
❖ Feature engineering
❖ Bias / Variance (overfitting, underfitting)
❖ Performance testing (accuracy, precision, recall, F-score)
❖ Regularization
❖ Parameter selection
❖ Algo selection (wait for a later slide for a quick summary of the algos used at Netflix)
❖ Probability
❖ Linear Algebra (vectors & matrices)
❖ Some calculus
❖ Python
❖ Octave/Matlab
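To make the linear-algebra and calculus items concrete: a logistic-regression prediction is just a dot product pushed through a sigmoid, and the calculus shows up in its gradient. A minimal sketch (plain Python rather than Octave/NumPy, for readability):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, features):
    # Linear algebra: the score is the dot product w . x
    z = sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Calculus: the gradient of the log-loss for one example is (p - y) * x,
# which is what parameter-estimation methods like gradient descent follow.
def gradient(weights, features, label):
    p = predict(weights, features)
    return [(p - label) * x for x in features]

print(predict([2.0, -1.0], [1.0, 0.5]))  # sigmoid(1.5), about 0.8176
```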
16. Why is this a good time to solve hard problems in NLP?
❖ What has changed since 2006?
❖ New methods for unsupervised pre-training have been developed (ex. Restricted
Boltzmann Machines, auto-encoders)
❖ More efficient parameter estimation methods
❖ Better understanding of model regularization
❖ Changes in computing technology favor deep learning
❖ In NLP, speed has traditionally come from exploiting sparsity
❖ But with modern machines, branches and widely spaced memory accesses are costly
❖ Uniform parallel operations on dense vectors are faster. These trends are even
stronger with multi-core CPUs and GPUs
❖ Wide availability of ML implementations and computing infrastructure to run them
Now this old slide will make sense
17. ML Sequence model approach to NER
❖ Training
1. Collect a set of representative training documents
2. Label each token for its entity class or other (O)
3. Design feature extractors appropriate to the text and classes
4. Train a sequence classifier to predict the labels from the data
❖ Testing
1. Receive a set of testing documents
2. Run sequence model inference to label each token
3. Appropriately output the recognized entities
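Step 3 of training (design feature extractors) might look like the sketch below for per-token features. The feature names are illustrative; real NER systems add many more (word shape, gazetteers, POS tags of neighbors):

```python
def token_features(tokens, i):
    """Features for token i of a sentence, fed to a sequence classifier."""
    word = tokens[i]
    return {
        'word.lower': word.lower(),
        'is_capitalized': word[0].isupper(),
        'prev_word': tokens[i - 1].lower() if i > 0 else '<START>',
        'next_word': tokens[i + 1].lower() if i < len(tokens) - 1 else '<END>',
        'suffix3': word[-3:],  # e.g. '-ton', '-ing'
    }

tokens = ['Satya', 'met', 'with', 'LinkedIn', 'in', 'Palo', 'Alto']
print(token_features(tokens, 3))
```

The sequence classifier then predicts a label per token (e.g. Person, Company, Place, or O) from these features plus the labels of neighboring tokens.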
18. Old skills are still very relevant - Role of Regular Expressions

Find me all instances of the word "the" in a text.
the → misses capitalized examples ("The")
[tT]he → incorrectly matches inside "other" or "theology"
[^a-zA-Z][tT]he[^a-zA-Z] → requires a non-letter on each side
❖ Regular expressions play a surprisingly large role
❖ Sophisticated sequences of regular expressions are often the first model for any
text processing
❖ For many hard tasks, we use machine learning classifiers
❖ But regular expressions are used as features in the classifiers
❖ Can be very useful in capturing generalizations
e.g., a regular expression matching phone numbers with extensions:
/^(?:(?:\(?(?:00|\+)([1-4]\d\d|[1-9]\d?)\)?)?[-. /]?)?((?:\(?\d{1,}\)?[-. /]?){0,})(?:[-. /]?(?:#|ext\.?|extension|x)[-. /]?(\d+))?$/i
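The "the" progression above, in Python. The \b word boundary handles the start and end of the string more cleanly than [^a-zA-Z], which requires an actual non-letter character on each side:

```python
import re

text = "The other day, the theology student read the book."

# Naive pattern: misses "The", matches inside "other" and "theology"
print(re.findall(r'the', text))

# Character class fixes capitalization, still matches inside words
print(re.findall(r'[tT]he', text))

# Word boundaries: exactly the standalone word
print(re.findall(r'\b[tT]he\b', text))
```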
20. Algo Selection example - Logistic Regression vs SVM
(where n = the number of features and m = the number of training examples)
❖ If n is large (relative to m), use logistic regression,
or an SVM without a kernel (the "linear kernel")
❖ If n is small and m is intermediate, use an SVM with a
Gaussian kernel
❖ If n is small and m is large, manually create/add
more features, then use logistic regression or an SVM
without a kernel.
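The heuristic above written out as a function. The numeric thresholds are illustrative only; "large" and "small" depend on the problem and the hardware:

```python
def choose_algorithm(n_features, m_examples):
    """Rough algorithm-selection heuristic for logistic regression vs. SVM.
    Thresholds are illustrative, not canonical."""
    if n_features >= m_examples:
        # n large relative to m: a linear decision boundary is usually enough
        return "logistic regression or linear-kernel SVM"
    if m_examples <= 10_000:
        # n small, m intermediate: a kernel SVM is still tractable
        return "SVM with Gaussian (RBF) kernel"
    # n small, m large: kernel SVMs get slow; enrich features instead
    return "engineer more features, then logistic regression or linear-kernel SVM"

print(choose_algorithm(n_features=50_000, m_examples=1_000))
```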