4. POS TAGGING
Part-of-speech (POS) tagging is a common Natural
Language Processing task in which each word in a
text (corpus) is assigned a part of speech, based
on both the definition of the word and its
context.
20XX PRESENTATION TITLE 4
Why is part-of-speech tagging important in NLP?
POS tagging is widely used in NLP. The same word can play different
grammatical roles in different sentences. Once every word in a
sequence carries a tag, algorithms can tell these uses of the same
word apart.
6. WHAT ARE TAGS?
Tagging converts a sentence into two forms: a
list of words, and a list of tuples, where each
tuple has the form (word, tag). The tag here is a
part-of-speech tag and signifies whether the word
is a noun, adjective, verb, and so on. Default
tagging, which assigns the same tag to every word,
is a basic first step in part-of-speech tagging.
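The two forms described above, a list of words and a list of (word, tag) tuples, can be sketched in a few lines of plain Python. This is a minimal illustration, not a real tagger: the function names `tokenize` and `default_tag` are hypothetical, the tokenizer only splits on whitespace, and the default tag "NN" (noun) is an assumption chosen for the example.

```python
def tokenize(sentence):
    """Split a sentence into words on whitespace (a deliberately naive tokenizer)."""
    return sentence.split()


def default_tag(words, tag="NN"):
    """Give every word the same default tag and return (word, tag) tuples."""
    return [(word, tag) for word in words]


words = tokenize("The quick fox jumps")
tagged = default_tag(words)

print(words)   # ['The', 'quick', 'fox', 'jumps']
print(tagged)  # [('The', 'NN'), ('quick', 'NN'), ('fox', 'NN'), ('jumps', 'NN')]
```

A default tagger is wrong for most words, but it gives a baseline and a fallback for words that a better tagger has never seen.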
A POS tagger assigns
grammatical information to each
word of a sentence.
The process of classifying words
into their parts of speech and
labeling them accordingly is known
as part-of-speech tagging or POS-
tagging, or simply tagging.
7. TYPES OF POS TAGGING
Types of POS taggers
POS-tagging algorithms fall into three distinct groups:
Rule-Based POS Taggers
Stochastic POS Taggers
Transformation POS Taggers
8. RULE-BASED TAGGING
AUTOMATIC PART OF SPEECH TAGGING IS AN AREA OF NATURAL
LANGUAGE PROCESSING WHERE STATISTICAL TECHNIQUES HAVE
BEEN MORE SUCCESSFUL THAN RULE-BASED METHODS.
TYPICAL RULE-BASED APPROACHES USE CONTEXTUAL
INFORMATION TO ASSIGN TAGS TO UNKNOWN OR AMBIGUOUS
WORDS. DISAMBIGUATION IS DONE BY ANALYZING THE
LINGUISTIC FEATURES OF THE WORD, ITS PRECEDING WORD, ITS
FOLLOWING WORD, AND OTHER ASPECTS.
FOR EXAMPLE, IF THE PRECEDING WORD IS AN ARTICLE, THEN
THE WORD IN QUESTION IS LIKELY TO BE A NOUN. THIS
INFORMATION IS CODED IN THE FORM OF RULES.
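The article rule above can be written down directly as code. The sketch below is only an illustration under stated assumptions: the lexicon `LEXICON`, the tag names, and the function `rule_based_tag` are all hypothetical, and the rule itself is a simplification, since an adjective can also follow an article.

```python
ARTICLES = {"a", "an", "the"}

# Hypothetical toy lexicon: each word maps to the tags it may take.
LEXICON = {
    "i": {"PRON"},
    "the": {"DET"},
    "a": {"DET"},
    "read": {"VERB"},
    "flight": {"NOUN"},
    "book": {"NOUN", "VERB"},  # ambiguous word
}


def rule_based_tag(words):
    """Tag words with a lexicon lookup plus one contextual rule."""
    tagged = []
    for i, word in enumerate(words):
        candidates = LEXICON.get(word.lower(), set())
        previous = words[i - 1].lower() if i > 0 else None
        if len(candidates) == 1:
            # Unambiguous word: take its only tag.
            tag = next(iter(candidates))
        elif previous in ARTICLES:
            # Contextual rule: after an article, choose NOUN.
            tag = "NOUN"
        elif "VERB" in candidates:
            # Ambiguous word with no article before it: choose VERB.
            tag = "VERB"
        else:
            # Unknown word with no helpful context: fall back to NOUN.
            tag = "NOUN"
        tagged.append((word, tag))
    return tagged


print(rule_based_tag(["I", "read", "the", "book"]))
# [('I', 'PRON'), ('read', 'VERB'), ('the', 'DET'), ('book', 'NOUN')]
print(rule_based_tag(["Book", "a", "flight"]))
# [('Book', 'VERB'), ('a', 'DET'), ('flight', 'NOUN')]
```

The same word "book" receives different tags in the two sentences because the rule looks at the preceding word.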
9. STOCHASTIC POS TAGGERS
IN THIS APPROACH, STOCHASTIC TAGGERS DISAMBIGUATE WORDS
BASED ON THE PROBABILITY THAT A WORD OCCURS WITH A
PARTICULAR TAG. IN THE SIMPLEST CASE, THE TAG
ENCOUNTERED MOST FREQUENTLY WITH THE WORD IN THE
TRAINING SET IS THE ONE ASSIGNED TO AN AMBIGUOUS
INSTANCE OF THAT WORD.
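The most-frequent-tag idea above can be sketched with simple counting. The training data `TRAINING`, the tag names, and the functions `train`, `tag_probability`, and `most_frequent_tag` are hypothetical and exist only for this illustration. A real stochastic tagger would be trained on a large tagged corpus and would usually also model tag sequences.

```python
from collections import Counter, defaultdict

# Hypothetical hand-tagged training sentences.
TRAINING = [
    [("the", "DET"), ("book", "NOUN"), ("is", "VERB"), ("old", "ADJ")],
    [("read", "VERB"), ("the", "DET"), ("book", "NOUN")],
    [("book", "VERB"), ("a", "DET"), ("flight", "NOUN")],
]


def train(tagged_sentences):
    """Count how often each word occurs with each tag."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return counts


def tag_probability(counts, word, tag):
    """Estimate P(tag | word) as a relative frequency in the training set."""
    seen = counts.get(word.lower(), Counter())
    total = sum(seen.values())
    return seen[tag] / total if total else 0.0


def most_frequent_tag(words, counts, default="NOUN"):
    """Assign each word the tag it carried most often in training."""
    tagged = []
    for word in words:
        seen = counts.get(word.lower())
        tag = seen.most_common(1)[0][0] if seen else default
        tagged.append((word, tag))
    return tagged


counts = train(TRAINING)
print(most_frequent_tag(["Book", "the", "flight"], counts))
# [('Book', 'NOUN'), ('the', 'DET'), ('flight', 'NOUN')]
```

In this toy data "book" is a noun in two of its three occurrences, so it is always tagged NOUN, even in "Book the flight" where it is a verb. This is the main limitation of using word-level frequencies alone, without context.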
10. TRANSFORMATION POS TAGGERS
TRANSFORMATION-BASED TAGGING, ALSO KNOWN AS BRILL TAGGING,
STARTS BY ASSIGNING EACH WORD AN INITIAL TAG, TYPICALLY ITS
MOST FREQUENT ONE. IT THEN APPLIES AN ORDERED LIST OF
TRANSFORMATION RULES, LEARNED FROM A TAGGED TRAINING
CORPUS, EACH OF WHICH CHANGES ONE TAG TO ANOTHER IN A GIVEN
CONTEXT. IT THUS COMBINES IDEAS FROM RULE-BASED AND
STOCHASTIC TAGGING.
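In POS tagging, a transformation-based tagger (in the style of the Brill tagger) first gives every word an initial tag and then applies correction rules that rewrite tags in context. The sketch below illustrates only the rule-application step. The lexicon `INITIAL_TAGS`, the single rule in `RULES`, and the function names are hypothetical and hand-written here, whereas a real transformation-based tagger learns its rules from a tagged corpus.

```python
# Hypothetical initial lexicon: each word's assumed most frequent tag.
INITIAL_TAGS = {
    "i": "PRON", "want": "VERB", "to": "TO",
    "book": "NOUN", "a": "DET", "flight": "NOUN",
}

# Each rule is (from_tag, to_tag, previous_tag):
# change from_tag to to_tag when the previous word has previous_tag.
RULES = [
    ("NOUN", "VERB", "TO"),
]


def initial_tagging(words, lexicon, default="NOUN"):
    """Step 1: assign each word its initial tag (default for unknown words)."""
    return [(word, lexicon.get(word.lower(), default)) for word in words]


def apply_transformations(tagged, rules):
    """Step 2: apply each rule in order, left to right over the sentence."""
    words = [word for word, _ in tagged]
    tags = [tag for _, tag in tagged]
    for from_tag, to_tag, previous_tag in rules:
        for i in range(1, len(tags)):
            if tags[i] == from_tag and tags[i - 1] == previous_tag:
                tags[i] = to_tag
    return list(zip(words, tags))


sentence = ["I", "want", "to", "book", "a", "flight"]
first_pass = initial_tagging(sentence, INITIAL_TAGS)
corrected = apply_transformations(first_pass, RULES)

print(first_pass[3])  # ('book', 'NOUN')
print(corrected[3])   # ('book', 'VERB')
```

The initial pass tags "book" as a noun. The rule then corrects it to a verb because the preceding tag is TO, while "flight" stays a noun because it follows a determiner.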