The document provides contact information for Rushdi Shams including an email address and website. It also lists Rushdi Shams as a lecturer in the computer science department at KUET Bangladesh. The document discusses various components of human language including the lexicon, categorization, and grammar rules.
39. Why use computers in translation?
• Too much translation for humans
• Technical materials too boring for humans
• Greater consistency required
• Need results more quickly
• Not everything needs to be top quality
• Reduce costs
• Any one of these may justify machine translation or computer aids
40. Components of a Language
• There are three components of a language:
1. Lexicon
2. Categorization
3. Grammar Rules
41. Lexicon
stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | ...
is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | ...
right | left | east | south | back | smelly | ...
here | there | nearby | ahead | right | left | east | south | back | ...
me | you | I | it | S=HE | Y’ALL ...
John | Mary | Boston | UCB | PAJC | ...
the | a | an | ...
to | in | on | near | ...
and | or | but | ...
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
42. Categorization
Noun >> stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | ...
Verb >> is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | ...
Adjective >> right | left | east | south | back | smelly | ...
Adverb >> here | there | nearby | ahead | right | left | east | south | back | ...
Pronoun >> me | you | I | it | S=HE | Y’ALL ...
Name >> John | Mary | Boston | UCB | PAJC | ...
Article >> the | a | an | ...
Preposition >> to | in | on | near | ...
Conjunction >> and | or | but | ...
Digit >> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
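The categorized lexicon can be sketched as a mapping from category to words. This is a minimal Python illustration rather than part of the lecture: the names LEXICON and categories_of are hypothetical, and only the words listed on the slide are included (the entries hidden behind "..." are left out).

```python
# Hypothetical sketch: the categorized lexicon from the slide as a dictionary.
LEXICON = {
    "Noun": {"stench", "breeze", "glitter", "nothing", "wumpus",
             "pit", "pits", "gold", "east"},
    "Verb": {"is", "see", "smell", "shoot", "feel", "stinks",
             "go", "grab", "carry", "kill", "turn"},
    "Adjective": {"right", "left", "east", "south", "back", "smelly"},
    "Adverb": {"here", "there", "nearby", "ahead", "right", "left",
               "east", "south", "back"},
    "Pronoun": {"me", "you", "I", "it"},
    "Name": {"John", "Mary", "Boston", "UCB", "PAJC"},
    "Article": {"the", "a", "an"},
    "Preposition": {"to", "in", "on", "near"},
    "Conjunction": {"and", "or", "but"},
    "Digit": set("0123456789"),
}

def categories_of(word):
    """Return every category whose word list contains `word`, sorted by name."""
    return sorted(cat for cat, words in LEXICON.items() if word in words)

print(categories_of("wumpus"))  # ['Noun']
print(categories_of("east"))    # ['Adjective', 'Adverb', 'Noun']
```

A word such as east falls into several categories, so a lookup alone cannot decide its part of speech in a given sentence.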
43. Grammar Structure
• Attending this lecture and the one following it carefully does not mean you will know all of the English language
• That would take studying NLP as a subject for 4 years! ☺
• We will learn how to define the basic grammar structure for NLP systems
• We will also learn what you need to keep in mind while devising such systems
44. Syntactic Tree
• Humans recognize the organization of words in a sentence, according to their POS, with trees.
• Are you denying it?
• Well, you can, because you didn’t learn it this way in your childhood.
• No one did!
• But it has been proved that our brain draws a tree-like structure when we first develop our language skills
• That research is beyond the scope of this lecture
45. Syntactic Tree
• So, if you really do that unintentionally, then why not learn it on pen and paper so that you can understand how you will teach machines to learn languages?
• The tree structure humans contemplate is called a syntactic tree
46. Parsing a Syntactic Tree
• Parsing is the process of using grammar rules to determine whether a sentence is legal, and to obtain its syntactic structure
• ‘The large cat eats the small rat’
49. Parsing
[Tree diagram, first step: each word is labeled with its part of speech, and the last three words are grouped into a noun phrase]
Article adjective noun | Verb | noun phrase (Article adjective noun)
The large cat eats the small rat
50. Parsing
[Tree diagram, second step: the first three words form a noun phrase, and the verb joins the second noun phrase to form a verb phrase]
Noun phrase (Article adjective noun) | verb phrase (Verb + noun phrase (Article adjective noun))
The large cat eats the small rat
51. Parsing
[Tree diagram, final step: the noun phrase and the verb phrase are joined under the sentence node]
sentence (Noun phrase (Article adjective noun) + verb phrase (Verb + noun phrase (Article adjective noun)))
The large cat eats the small rat
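The parsing steps for this sentence can be sketched as a small top-down parser. This is only an illustration under assumptions: the three grammar rules and the six-word lexicon are read off the tree for the example sentence, each phrase type has exactly one rule, and the names POS, GRAMMAR, parse and parse_sentence are hypothetical.

```python
# Assumed toy lexicon and grammar, read off the tree for the example sentence.
POS = {"the": "Article", "large": "Adjective", "small": "Adjective",
       "cat": "Noun", "rat": "Noun", "eats": "Verb"}

GRAMMAR = {
    "Sentence": ["NounPhrase", "VerbPhrase"],
    "NounPhrase": ["Article", "Adjective", "Noun"],
    "VerbPhrase": ["Verb", "NounPhrase"],
}

def parse(symbol, words, start):
    """Try to parse `symbol` at words[start:]; return (tree, next_index) or None."""
    if symbol not in GRAMMAR:  # a part of speech: it must match exactly one word
        if start < len(words) and POS.get(words[start].lower()) == symbol:
            return (symbol, words[start]), start + 1
        return None
    children, pos = [], start
    for part in GRAMMAR[symbol]:
        result = parse(part, words, pos)
        if result is None:
            return None
        child, pos = result
        children.append(child)
    return (symbol, *children), pos

def parse_sentence(text):
    """Return the syntactic tree if `text` is a legal sentence, else None."""
    words = text.split()
    result = parse("Sentence", words, 0)
    if result is None or result[1] != len(words):
        return None
    return result[0]

print(parse_sentence("The large cat eats the small rat"))
print(parse_sentence("The cat large eats"))  # None: not a legal sentence
```

The function answers both questions from the definition of parsing: a None result means the sentence is not legal under the grammar, and any other result is its syntactic structure.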
52. Syntactic Tree
• The point where lines begin or end is called a node
• Each node has a label, like S, PP or chased
• If 2 nodes are connected by a line, the upper node is the immediate dominator of the lower node. D is the immediate dominator of ‘the’
• Upper nodes in a branch are called dominators. NP is the dominator of D, N, ‘the’ and ‘dog’
53. Syntactic Tree
• Two nodes are sisters if they are immediately dominated by the same node. D and N are sisters.
• Their immediate dominator is called their mother. NP is the mother of D and N. Similarly, D and N are daughters of NP
• The immediate dominator of a node is also called its parent
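The relations between nodes can be sketched with a small tree class. This is a hypothetical illustration: the Node class and its method names are not from the lecture, and the fragment NP -> D N with the words the and dog is assumed from the examples on these slides.

```python
class Node:
    """A labeled tree node that remembers its mother (immediate dominator)."""

    def __init__(self, label, *daughters):
        self.label = label
        self.mother = None
        self.daughters = list(daughters)
        for daughter in self.daughters:
            daughter.mother = self

    def dominators(self):
        """All nodes above this one in its branch, nearest first."""
        found, node = [], self.mother
        while node is not None:
            found.append(node)
            node = node.mother
        return found

    def sisters(self):
        """The other nodes immediately dominated by the same mother."""
        if self.mother is None:
            return []
        return [n for n in self.mother.daughters if n is not self]

# Assumed fragment of the example tree: NP -> D N, D -> the, N -> dog.
the, dog = Node("the"), Node("dog")
d, n = Node("D", the), Node("N", dog)
np_node = Node("NP", d, n)

print(the.mother.label)                     # D
print([x.label for x in the.dominators()])  # ['D', 'NP']
print([x.label for x in d.sisters()])       # ['N']
```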
54. Syntactic Tree
• Constituents are the terminal nodes that are all dominated by a single non-terminal node. Chased a cat into the garden are constituents as they are dominated by VP
57. Remember, you may have to practise the reverse: constructing a syntactic tree from label bracketing ☺
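Going from label bracketing back to a tree can be sketched as below. The bracket format itself is an assumption, since the slides that introduce it are not part of this text: each node is written as [LABEL child child ...], and a bare word is a terminal. The names parse_brackets and leaves are hypothetical, and malformed input is not handled gracefully.

```python
def parse_brackets(text):
    """Turn '[NP [D the] [N dog]]' into ('NP', ('D', 'the'), ('N', 'dog'))."""
    tokens = text.replace("[", " [ ").replace("]", " ] ").split()
    pos = 0

    def read_node():
        nonlocal pos
        assert tokens[pos] == "[", "a node must start with '['"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != "]":
            if tokens[pos] == "[":
                children.append(read_node())   # a nested node
            else:
                children.append(tokens[pos])   # a word (terminal node)
                pos += 1
        pos += 1  # skip the closing ']'
        return (label, *children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("extra text after the closing bracket")
    return tree

def leaves(tree):
    """The words of the tree, left to right."""
    words = []
    for child in tree[1:]:
        words.extend(leaves(child) if isinstance(child, tuple) else [child])
    return words

tree = parse_brackets("[S [NP [D the] [N dog]] [VP [V chased] [NP [D a] [N cat]]]]")
print(tree[0])       # S
print(leaves(tree))  # ['the', 'dog', 'chased', 'a', 'cat']
```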
58. Constituents and Categories
• Tree structure provides two pieces of information:
1. It divides the sentence into constituents (in English, these are called phrases)
2. It puts them into categories (NP, VP, etc.)
59. Constituents and Categories
• How do we know the right way to group words into the right category?
• How do we know into the garden is a constituent, but a cat into is not?
• Any words that can be moved as a group are probably constituents: compare the meaning of the dog chased a cat into the garden and into the garden, the dog chased a cat.
• Which words did you move? Into the garden, right?
• And the meaning did not change
• That’s probably our constituent
60. Constituents and Categories
• Any string of words that can be deleted is probably a constituent
• If you omit into the garden from the sentence, nothing is changed grammatically.
• Usually, a constituent makes sense as a unit of meaning. Into the garden is much more meaningful than a cat into
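Once a tree is available, checking whether a string of words is a constituent becomes a lookup. The sketch below is not the movement or deletion test itself, which needs a human judgement; it only lists the word groups that hang under a single node. The tree shape is an assumption, with the PP attached under VP as in the earlier constituent example, and the names TREE, leaves, constituents and is_constituent are hypothetical.

```python
# Assumed tree for "the dog chased a cat into the garden".
TREE = ("S",
        ("NP", ("D", "the"), ("N", "dog")),
        ("VP", ("V", "chased"),
               ("NP", ("D", "a"), ("N", "cat")),
               ("PP", ("P", "into"),
                      ("NP", ("D", "the"), ("N", "garden")))))

def leaves(tree):
    """The words of the tree, left to right."""
    words = []
    for child in tree[1:]:
        words.extend(leaves(child) if isinstance(child, tuple) else [child])
    return words

def constituents(tree):
    """Yield (label, words) for every non-terminal node in the tree."""
    yield tree[0], " ".join(leaves(tree))
    for child in tree[1:]:
        if isinstance(child, tuple):
            yield from constituents(child)

def is_constituent(phrase):
    """True if the phrase is exactly the words under one node of TREE."""
    return any(words == phrase for _, words in constituents(TREE))

print(is_constituent("into the garden"))  # True
print(is_constituent("a cat into"))       # False
```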
61. Constituents and Categories
• However, we are only talking about syntactic structure, not the semantic one.
• The dog, the cat and the garden: their grammar structure says they are all noun phrases.
• It means they can be used interchangeably; no linguist can deny that
• Then what about “The garden chased the cat into the dog”? ☺
• “We will not focus on semantics”, you said before!
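The interchangeability of noun phrases can be shown by filling the three NP slots of the sentence in every order. This is a small hypothetical sketch: the template string and the name all_sentences are illustrative choices, not part of the lecture.

```python
from itertools import permutations

NOUN_PHRASES = ["the dog", "the cat", "the garden"]

def all_sentences():
    """Fill the three NP slots of 'NP chased NP into NP' in every order."""
    return [f"{a} chased {b} into {c}".capitalize()
            for a, b, c in permutations(NOUN_PHRASES)]

for sentence in all_sentences():
    print(sentence)
```

All six outputs are syntactically legal, including "The garden chased the cat into the dog"; rejecting the odd ones needs semantic knowledge that the grammar does not carry.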
62. Ambiguity
• There are 2 types of ambiguity:
1. Lexical Ambiguity: Sentence contains an idiom/word/term that has more than one meaning. Glasses means both drinking glasses and spectacles
2. Structural Ambiguity: Sentence has more than one syntactic tree
I saw the boy with the telescope:
Did you use a telescope to see the boy? Or
Did you see the boy who had a telescope?
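Structural ambiguity can be made concrete by counting the syntactic trees a grammar allows for a sentence. The sketch below uses a CKY-style chart over an assumed toy grammar in which a PP may attach either to a verb phrase or to a noun phrase; the rules, the lexicon and the name count_trees are illustrative assumptions, not the lecture's grammar.

```python
from collections import defaultdict

# Assumed toy grammar: (parent, left child, right child).
BINARY = [("S", "NP", "VP"), ("VP", "V", "NP"), ("VP", "VP", "PP"),
          ("NP", "NP", "PP"), ("NP", "Det", "N"), ("PP", "P", "NP")]
LEXICAL = {"I": ["NP"], "saw": ["V"], "the": ["Det"], "boy": ["N"],
           "telescope": ["N"], "with": ["P"]}

def count_trees(sentence, root="S"):
    """Count the distinct syntactic trees for `sentence` under the toy grammar."""
    words = sentence.split()
    n = len(words)
    # chart[(i, j)][X] = number of trees labeled X that cover words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for category in LEXICAL.get(word, []):
            chart[(i, i + 1)][category] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY:
                    chart[(i, j)][parent] += (chart[(i, k)][left]
                                              * chart[(k, j)][right])
    return chart[(0, n)][root]

print(count_trees("I saw the boy with the telescope"))  # 2
print(count_trees("I saw the boy"))                     # 1
```

The two trees correspond to the two readings: the PP attached to the verb phrase (seeing with the telescope) or to the noun phrase (the boy with the telescope).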
64. Difficulties with Natural Language: Anaphora
• Using pronouns to refer back to entities already introduced in the text
After Mary proposed to John, they found a preacher and got married.
For the honeymoon, they went to Hawaii
Mary saw a ring through the window and asked John for it
Mary threw a rock at the window and broke it
68. Semantics in NL
• I can't untie that knot with one hand.
– The sentence is about the abilities of whoever spoke or wrote it. (Call this person the speaker.)
– It's also about a knot, maybe one that the speaker is pointing at
– The sentence denies that the speaker has a certain ability. (This is the contribution of the word ‘can't’.)
– Untying is a way of making something not tied.
– The sentence doesn't mean that the knot has one hand; it has to do with how many hands are used to do the untying.
69. Problems in Semantics in NL
• If you do not understand certain characteristics of linguistics, you will not be able to understand the semantics.
• If you do understand them, you need to feel them
• If you do feel them, you need to see the context
• If you see the context, you are dealing with both semantics and pragmatics in NL
• ☺
70. Synonymy
• Synonyms are different words (or sometimes phrases) with identical or very similar meanings.
• Words that are synonyms are said to be synonymous, and the state of being a synonym is called synonymy
71. Synonymy
• student and pupil (noun)
• buy and purchase (verb)
• sick and ill (adjective)
• quickly and speedily (adverb)
• on and upon (preposition)
72. Synonymy
• Note that synonyms are defined with respect to certain senses of words
• pupil as the "aperture in the iris of the eye" is not synonymous with student.
• Similarly, he expired means the same as he died, yet my passport has expired cannot be replaced by my passport has died.
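One way to respect senses is to key synonyms on a (word, sense) pair instead of on the word alone. The sketch below is hypothetical: the table holds only the two examples from this slide, and the sense labels "learner", "die" and "stop being valid" are illustrative names, not an established sense inventory.

```python
# Hypothetical sense table built only from the examples above.
SYNONYMS = {
    ("pupil", "learner"): {"student"},
    ("pupil", "aperture in the iris of the eye"): set(),
    ("expire", "die"): {"die"},
    ("expire", "stop being valid"): set(),
}

def can_replace(word, sense, candidate):
    """True if `candidate` is a synonym of `word` in the given sense."""
    return candidate in SYNONYMS.get((word, sense), set())

print(can_replace("pupil", "learner", "student"))        # True
print(can_replace("expire", "stop being valid", "die"))  # False
```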
73. Antonymy
• Antonyms are words with opposite or nearly opposite meanings. For example:
• short and tall
• dead and alive
• increase and decrease
74. Homonymy
• A homonym is one of a group of words that share the same spelling and the same pronunciation but have different meanings, usually as a result of the two words having different origins.
• The state of being a homonym is called homonymy.
• bark (the sound of a dog) and bark (the skin of a tree).
75. Heteronymy
• Heteronyms (also known as heterophones) are words with identical spellings (or characters) but different pronunciations and meanings.
76. Monolingual ambiguity
• morphological ambiguity:
– German -en: noun plural, dative plural, weak noun non-nominative, adjective
masculine non-nominative, etc.
• compound nouns:
– coincide -> coin+cide, cooperate -> cooper+ate
• category ambiguity:
– round: the first round (noun), to round up cattle (verb), the round table (adjective), go on a voyage round the Mediterranean (preposition), it measures three feet round (adverb), etc.
• homographs and polysemes:
– branch: ‘of a tree’, ‘of a bank’; crane (a bird or lifting machine)
– ball: The ball rolled down the hill, The ball lasted until midnight
77. Bilingual lexical ambiguity
• English wall: German Mauer (outside) or Wand (inside)
• English river: French fleuve (major) or rivière (general term)
• English leg: French jambe (human), patte (animal, insect), pied (table), étape (journey)
• English blue: Russian goluboi (pale blue) or sinii (dark blue)
• French louer: English hire or rent
• German leihen: English borrow or lend
• English wear: Japanese haoru (coat/jacket), haku (shoes/trousers), kaburu (hat), hameru
(ring/gloves), shimeru (belt/tie/scarf), tsukeru (brooch/clip), kakeru (glasses/necklace)
• resolvable by:
– rules (indicating allowable or usual categories or types of subjects, objects, verbs, etc.)
– collocations (specifying particular adjacent words)
– frequencies (most probable adjacent or dependent words)
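Resolution by rules can be sketched for the English verb wear, using the Japanese verbs and object types listed above. This is a minimal illustration: the table WEAR_RULES and the function translate_wear are hypothetical names, and a real system would also need collocations, frequencies and a fallback choice.

```python
# Japanese verb for English "wear", chosen by the type of its object
# (pairs taken from the list above).
WEAR_RULES = {
    "haoru": {"coat", "jacket"},
    "haku": {"shoes", "trousers"},
    "kaburu": {"hat"},
    "hameru": {"ring", "gloves"},
    "shimeru": {"belt", "tie", "scarf"},
    "tsukeru": {"brooch", "clip"},
    "kakeru": {"glasses", "necklace"},
}

def translate_wear(obj):
    """Pick the Japanese verb for 'wear' from the object that is worn."""
    for verb, objects in WEAR_RULES.items():
        if obj in objects:
            return verb
    return None  # no rule applies to this object

print(translate_wear("hat"))      # kaburu
print(translate_wear("glasses"))  # kakeru
```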
78. Structural ambiguity
• Flying planes can be dangerous
• The man saw the girl with a telescope
• John mentioned the book I sent to Mary
• I told everyone concerned about the strike
– everyone concerned/involved/relevant, or: everyone disturbed/worried
• He noticed her shaking hands
– either hands that were shaking from cold, or hands that were shaking other hands
• They complained to the guide that they could not hear
– that as relative pronoun (‘whom they could not hear’) or as complementizer (‘that they could
not hear him’)
• The mathematics students sat their examinations
• The mathematics students study today is very complex
– difficulty of identifying noun compound vs. relative clause
• Gas pump prices rose last time oil stocks fell
– each word potentially noun or verb