Parts of Speech Tagging

by

Mohd. Yaseen Ansari
From TE CSE

under the guidance of

Prof. Mrs. A. R. Kulkarni
Introduction
Principle
Parts of Speech Classes
What is POS Tagging good for?
Tag Set
Tag Set Example
Why is POS Tagging Hard?
Methods for POS Tagging
Stochastic POS Tagging
Definition of Hidden Markov Model
HMM for Tagging
Viterbi Tagging
Viterbi Algorithm
An Example
Definition
            Parts of Speech Tagging is defined as the task
of labeling each word in a sentence with its appropriate
part of speech.

Example
          The mother kissed the baby on the cheek.

       The[AT] mother[NN] kissed[VBD] the[AT]
baby[NN] on[IN] the[AT] cheek[NN].
The     – Article
mother  – Noun
kissed  – Verb
the     – Article
baby    – Noun
on      – Preposition
the     – Article
cheek   – Noun
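
For a concrete illustration, the same sentence can be run through an off-the-shelf tagger. Below is a minimal sketch using NLTK, which is an assumption for illustration (the slides do not name a library); note that NLTK's tagger uses the Penn Treebank tag set, so it prints DT where the example above uses Brown's AT.

```python
# Minimal sketch using NLTK's off-the-shelf tagger (NLTK is assumed
# for illustration; the slides do not prescribe a library).
import nltk

nltk.download("punkt", quiet=True)                       # tokenizer model
nltk.download("averaged_perceptron_tagger", quiet=True)  # tagger model

tokens = nltk.word_tokenize("The mother kissed the baby on the cheek.")
print(nltk.pos_tag(tokens))
# Expected output (Penn Treebank tags):
# [('The', 'DT'), ('mother', 'NN'), ('kissed', 'VBD'), ('the', 'DT'),
#  ('baby', 'NN'), ('on', 'IN'), ('the', 'DT'), ('cheek', 'NN'), ('.', '.')]
```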
Parts of speech tagging is harder than just having a list of
words and their parts of speech, because some words can
represent more than one part of speech at different
times, and because some parts of speech are complex or
unspoken. A large percentage of word-forms are
ambiguous. For example,

The sailor dogs the barmaid.
Even "dogs", which is usually thought of as just a plural
noun, can also be a verb.
There are two classes of parts of speech:

1) Open classes: nouns, verbs, adjectives, adverbs, etc.

2) Closed classes:

a) Conjunctions: and, or, but, etc.
b) Pronouns: I, she, him, etc.
c) Prepositions: with, on, under, etc.
d) Determiners: the, a, an, etc.
e) Auxiliary verbs: can, could, may, etc.

and there are many others.
1) Useful in:
       a) Information Retrieval
       b) Text to Speech
       c) Word Sense Disambiguation

2) Useful as a preprocessing step for parsing:
a unique tag for each word reduces the number of parses.
POS tagging requires a tag set, so that there is no difficulty in
assigning one tag to each part of speech. Four tag sets are widely
used:

1) Brown Corpus – 87 tags
2) Penn Treebank – 45 tags
3) British National Corpus – 61 tags
4) C7 – 164 tags

There are also tag sets that include tags for phrases.
Tag set example (Penn Treebank, excerpt): PRP – personal pronoun, PRP$ – possessive pronoun.
POS tagging is ambiguous most of the time, which is why the right tag
for each word cannot be found easily. For example, suppose we want to
tag the ambiguous sentence:

Time flies like an arrow.

Possibilities:
1) Time/NN flies/NN like/VB an/AT arrow/NN.

2) Time/VB flies/NN like/IN an/AT arrow/NN.

3) Time/NN flies/VBZ like/IN an/AT arrow/NN.

Here 3) is correct, but there are many possibilities and it is not
obvious which one to choose. Only someone with a good command of
grammar and vocabulary can tell the difference.
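
The blow-up is easy to see by enumerating every tag sequence that a small lexicon permits. A short sketch, where the per-word ambiguity classes are illustrative assumptions:

```python
# Enumerate all candidate tag sequences for the sentence.  The
# per-word tag sets below are illustrative assumptions, not a lexicon.
from itertools import product

possible_tags = {
    "Time":  ["NN", "VB"],
    "flies": ["NN", "VBZ"],
    "like":  ["VB", "IN"],
    "an":    ["AT"],
    "arrow": ["NN"],
}

sentence = ["Time", "flies", "like", "an", "arrow"]
for seq in product(*(possible_tags[w] for w in sentence)):
    print(" ".join(f"{w}/{t}" for w, t in zip(sentence, seq)))
# 2 * 2 * 2 * 1 * 1 = 8 candidate sequences for one short sentence;
# the number grows exponentially with sentence length.
```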
1) Rule-Based POS tagging
* e.g., ENGTWOL Tagger
* large collection (> 1000) of constraints on what
sequences of tags are allowable

2) Stochastic (Probabilistic) tagging
* e.g., HMM Tagger
* I’ll discuss this in a bit more detail

3) Transformation-based tagging
* e.g., Brill’s tagger
* Combination of Rule-Based and Stochastic
methodologies.
Input: a string of words and a tagset (e.g. "Book that flight", Penn
Treebank tagset)

Output: a single best tag for each word (e.g. Book/VB
that/DT flight/NN ./.)

Problem: resolve ambiguity → disambiguation
Example: book (Hand me that book. / Book that flight.)
Set of states – all possible tags
Output alphabet – all words in the language
State/tag transition probabilities
Initial state probabilities: the probability of beginning a
 sentence with a tag t, P(t0 → t)
Output probabilities – producing word w at state t
Output sequence – observed word sequence
State sequence – underlying tag sequence
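
Concretely, these ingredients reduce to three probability tables. A minimal sketch of the parameter layout, with toy numbers invented purely for illustration:

```python
# The HMM parameters as plain probability tables.  All numbers are toy
# values invented for illustration only.
initial = {                  # P(t0 -> t): begin a sentence with tag t
    "AT": 0.6, "NN": 0.3, "VB": 0.1,
}
transition = {               # P(tj -> tk): tag-to-tag transitions
    ("AT", "NN"): 0.9, ("AT", "VB"): 0.1,
    ("NN", "NN"): 0.5, ("NN", "VB"): 0.5,
    ("VB", "AT"): 0.7, ("VB", "NN"): 0.3,
}
emission = {                 # P(w | t): produce word w at state/tag t
    ("AT", "the"): 0.7, ("AT", "a"): 0.3,
    ("NN", "mother"): 0.1, ("NN", "baby"): 0.1,
    ("VB", "kissed"): 0.05,
}
```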
First-order (bigram) Markov assumptions:

  1) Limited Horizon: a tag depends only on the previous tag
       P(ti+1 = tk | t1 = tj1, …, ti = tji) = P(ti+1 = tk | ti = tj)

  2) Time invariance: no change over time
       P(ti+1 = tk | ti = tj) = P(t2 = tk | t1 = tj) = P(tj → tk)

Output probabilities:

  1) Probability of getting word wk for tag tj: P(wk | tj)

  2) Assumption: a word depends only on its own tag,
       P(wi | w1, …, wi-1, t1, …, tn) = P(wi | ti)
     i.e. not dependent on other tags or words!
Probability of a tag sequence:

P(t1 t2 … tn) = P(t1) P(t1 → t2) P(t2 → t3) … P(tn-1 → tn)

Assume t0 is a starting tag:
                = P(t0 → t1) P(t1 → t2) P(t2 → t3) … P(tn-1 → tn)

Probability of a word sequence and tag sequence together:

P(W, T) = ∏i P(ti-1 → ti) P(wi | ti)
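
This product is straightforward to evaluate in log space. A minimal sketch, assuming the same dictionary-based tables as the previous sketch:

```python
import math

def joint_log_prob(words, tags, transition, emission, start="START"):
    """log P(W, T) = sum over i of log P(t_(i-1) -> t_i) + log P(w_i | t_i)."""
    logp, prev = 0.0, start
    for w, t in zip(words, tags):
        logp += math.log(transition[(prev, t)]) + math.log(emission[(t, w)])
        prev = t
    return logp

# Toy tables, invented for illustration.
transition = {("START", "AT"): 0.6, ("AT", "NN"): 0.9}
emission = {("AT", "the"): 0.7, ("NN", "baby"): 0.1}
print(joint_log_prob(["the", "baby"], ["AT", "NN"], transition, emission))
# log(0.6 * 0.7 * 0.9 * 0.1) ≈ -3.28
```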
Labeled training data = each word has a POS tag.

Thus:
         PMLE(tj) = C(tj) / N
         PMLE(tj → tk) = C(tj, tk) / C(tj)
         PMLE(wk | tj) = C(tj : wk) / C(tj)
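
Collecting these counts from a tagged corpus takes only a few lines. A minimal sketch, assuming the corpus is given as lists of (word, tag) pairs:

```python
from collections import Counter

def mle_estimates(tagged_sentences):
    """Count-based MLE for tag, transition, and emission probabilities.

    tagged_sentences: list of sentences, each a list of (word, tag) pairs.
    """
    tag_counts, pair_counts, emit_counts = Counter(), Counter(), Counter()
    for sent in tagged_sentences:
        prev = "START"
        tag_counts[prev] += 1
        for word, tag in sent:
            tag_counts[tag] += 1
            pair_counts[(prev, tag)] += 1      # C(tj, tk)
            emit_counts[(tag, word)] += 1      # C(tj : wk)
            prev = tag
    n = sum(tag_counts.values())
    p_tag = {t: c / n for t, c in tag_counts.items()}
    p_trans = {(a, b): c / tag_counts[a] for (a, b), c in pair_counts.items()}
    p_emit = {(t, w): c / tag_counts[t] for (t, w), c in emit_counts.items()}
    return p_tag, p_trans, p_emit

corpus = [[("the", "AT"), ("baby", "NN")], [("book", "VB"), ("that", "DT")]]
print(mle_estimates(corpus)[1])   # e.g. PMLE(START -> AT) = 1/2 = 0.5
```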
1) D(0, START) = 0
2) for each tag t ≠ START do: D(0, t) = -∞
3) for i ← 1 to N do:
       for each tag tj do:
           D(i, tj) ← maxk [ D(i-1, tk) + lm(tk → tj) + lm(wi | tj) ]
           record best(i, j) = k which yielded the max

Then:
1) log P(W, T) = maxj D(N, tj)
2) Reconstruct the path backwards from the maximizing j

Where: lm(.) = log m(.), and D(i, tj) is the maximum joint log-probability
of the state and word sequences up to position i, ending at tag tj.
Complexity: O(Nt² · N), for Nt tags and a sentence of N words
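
The recurrence translates directly into code. A minimal log-space sketch, assuming the same dictionary-based tables as the earlier sketches:

```python
import math

def viterbi(words, tags, transition, emission, start="START"):
    """Most likely tag sequence under a bigram HMM, computed in log space."""
    lm = lambda p: math.log(p) if p > 0 else float("-inf")
    D = [{start: 0.0}]      # D[i][t] = best log-prob of w_1..w_i ending in t
    best = [{}]             # best[i][t] = previous tag on the best path
    for i, w in enumerate(words, start=1):
        D.append({})
        best.append({})
        for tj in tags:
            k, score = max(
                ((tk, D[i-1][tk] + lm(transition.get((tk, tj), 0.0))
                      + lm(emission.get((tj, w), 0.0)))
                 for tk in D[i-1]),
                key=lambda kv: kv[1])
            D[i][tj], best[i][tj] = score, k
    # Reconstruct the path backwards from the best final tag.
    t = max(D[-1], key=D[-1].get)
    path = [t]
    for i in range(len(words), 1, -1):
        t = best[i][t]
        path.append(t)
    return list(reversed(path))

tags = ["AT", "NN"]
transition = {("START", "AT"): 0.6, ("AT", "NN"): 0.9, ("NN", "NN"): 0.5}
emission = {("AT", "the"): 0.7, ("NN", "baby"): 0.1}
print(viterbi(["the", "baby"], tags, transition, emission))  # ['AT', 'NN']
```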
Most probable tag sequence given the text:

      T*     = arg maxT Pm(T | W)
             = arg maxT Pm(W | T) Pm(T) / Pm(W)
                    (Bayes' theorem)
             = arg maxT Pm(W | T) Pm(T)
                    (Pm(W) is constant for all T)
             = arg maxT ∏i [ m(ti-1 → ti) m(wi | ti) ]
             = arg maxT Σi log[ m(ti-1 → ti) m(wi | ti) ]

There is an exponential number of possible tag sequences, so dynamic
programming is used for efficient computation.
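
For contrast, the brute-force alternative scores every one of the exponentially many tag sequences explicitly, which is exactly the cost the dynamic program avoids. A sketch under the same toy-table assumptions:

```python
from itertools import product
import math

def brute_force_best(words, tags, transition, emission, start="START"):
    """arg max over all |tags|**N tag sequences; exponential, shown only
    to contrast with the Viterbi dynamic program."""
    def score(seq):
        logp, prev = 0.0, start
        for w, t in zip(words, seq):
            p = transition.get((prev, t), 0.0) * emission.get((t, w), 0.0)
            if p == 0.0:
                return float("-inf")
            logp += math.log(p)
            prev = t
        return logp
    return max(product(tags, repeat=len(words)), key=score)

transition = {("START", "AT"): 0.6, ("AT", "NN"): 0.9}
emission = {("AT", "the"): 0.7, ("NN", "baby"): 0.1}
print(brute_force_best(["the", "baby"], ["AT", "NN"], transition, emission))
# ('AT', 'NN'), after scoring 2**2 = 4 sequences; Viterbi needs only
# O(Nt^2 * N) work.
```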
Secretariat/NNP is/VBZ expected/VBN to/TO race/VB
tomorrow/NN

People/NNS continue/VBP to/TO inquire/VB the/DT
reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN

to/TO race/???

the/DT race/???

ti = arg maxj P(tj | ti-1) P(wi | tj)

max[ P(VB|TO) P(race|VB) , P(NN|TO) P(race|NN) ]

Brown corpus estimates:
P(NN|TO) = .021  ×  P(race|NN) = .00041  →  product = .000007
P(VB|TO) = .34   ×  P(race|VB) = .00003  →  product = .00001

The VB product is larger, so race is tagged VB after to/TO.
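
The comparison itself is one line of arithmetic; a quick check of the two products as reported above:

```python
# Check the two products (figures as reported in the slide; the first
# product, 0.021 * 0.00041, is rounded to .000007 there).
p_nn = 0.021 * 0.00041   # ≈ 0.0000086
p_vb = 0.34 * 0.00003    # ≈ 0.0000102
print("race/VB" if p_vb > p_nn else "race/NN")   # race/VB
```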