Drop me a mail: rushdecoder@yahoo.com
Visit me at: http://rushdishams.googlepages.com
Rushdi Shams, Lecturer, Dept of CSE, KUET, Bangladesh
• Peter mentioned the book I sent to Mary
• We will give medicines to pregnant women
and children
• I saw the boy with the telescope
• The painter put on another coat
• We like flying planes
• The judge threw the book at him
• Visiting relatives can be tiresome
• Da Vinci liked to paint his models nude.
• He wrote the note yesterday
• You mean you carried the information by a
bus?
• Connecting wires are tiring in DLD lab
• Squad helps dog bite victim
Why use computers in translation?
• Too much translation for humans
• Technical materials too boring for humans
• Greater consistency required
• Need results more quickly
• Not everything needs to be top quality
• Reduce costs
• Any one of these may justify machine translation or computer aids
Components of a Language
• There are three components of a language:
1. Lexicon
2. Categorization
3. Grammar Rules
Lexicon
stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | ...
is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | ...
right | left | east | south | back | smelly | ...
here | there | nearby | ahead | right | left | east | south | back | ...
me | you | I | it | S/HE | Y'ALL ...
John | Mary | Boston | UCB | PAJC | ...
the | a | an | ...
to | in | on | near | ...
and | or | but | ...
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Categorization
Noun >> stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | ...
Verb >> is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | ...
Adjective >> right | left | east | south | back | smelly | ...
Adverb >> here | there | nearby | ahead | right | left | east | south | back | ...
Pronoun >> me | you | I | it | S/HE | Y'ALL ...
Name >> John | Mary | Boston | UCB | PAJC | ...
Article >> the | a | an | ...
Preposition >> to | in | on | near | ...
Conjunction >> and | or | but | ...
Digit >> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
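The categorization above can be sketched as a small lookup table. This is only an illustration: the names CATEGORIES and categories_of are assumptions, and just a few of the categories are copied in. It shows that one word may belong to several categories, which is a first hint of the ambiguity discussed later.

```python
# A few rows of the categorization table, copied from the slide.
# CATEGORIES and categories_of are hypothetical names used for illustration.
CATEGORIES = {
    "Noun": ["stench", "breeze", "glitter", "nothing", "wumpus",
             "pit", "pits", "gold", "east"],
    "Verb": ["is", "see", "smell", "shoot", "feel", "stinks",
             "go", "grab", "carry", "kill", "turn"],
    "Adjective": ["right", "left", "east", "south", "back", "smelly"],
    "Adverb": ["here", "there", "nearby", "ahead", "right", "left",
               "east", "south", "back"],
    "Article": ["the", "a", "an"],
}


def categories_of(word):
    """Return every category whose word list contains `word`."""
    return [cat for cat, words in CATEGORIES.items() if word in words]


print(categories_of("wumpus"))
print(categories_of("right"))
```

A word such as "right" comes back with more than one category, so the lexicon alone cannot decide its role in a sentence.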
Grammar Structure
• Attending this lecture and the one following it carefully does not mean you will know all of the English language
• That would take studying NLP as a single subject for 4 years! ☺
• We will learn how to define the basic grammar structure for NLP systems
• We will also learn what you need to keep in mind while devising such systems
Syntactic Tree
• Humans recognize the organization of the words in a sentence, according to their POS, as trees.
• Do you disagree?
• Well, you can, because you did not learn it this way in your childhood.
• No one did!
• But research suggests that our brain builds a tree-like structure when we first develop our language skills
• That research is beyond this lecture
Syntactic Tree
• So, if you really do that unintentionally, why not learn it with pen and paper, so that you can understand how you will teach machines to learn languages?
• The tree structure that humans contemplate is called a syntactic tree
Parsing a Syntactic Tree
• Parsing is the process of using grammar rules to determine whether a sentence is legal, and to obtain its syntactic structure
• 'The large cat eats the small rat'
Parsing
[Figure sequence: the syntactic tree of "The large cat eats the small rat" is built bottom-up]
1. Each word gets its category: Article Adjective Noun Verb Article Adjective Noun
2. Article Adjective Noun ("the small rat") are grouped into a Noun phrase
3. Article Adjective Noun ("The large cat") are grouped into a Noun phrase; Verb and the following Noun phrase are grouped into a Verb phrase
4. Noun phrase and Verb phrase are grouped into a Sentence
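The parse of "The large cat eats the small rat" can be sketched in a few lines of Python. This is a minimal sketch, not a full parser: the rule table GRAMMAR, the word table LEXICON and the function names are assumptions made for this example, and the parser simply takes the first rule that fits (enough for this tiny grammar).

```python
# Hypothetical grammar for the example sentence:
#   S -> NP VP, NP -> Article Adjective Noun, VP -> Verb NP
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Article", "Adjective", "Noun"]],
    "VP": [["Verb", "NP"]],
}
LEXICON = {
    "the": "Article", "large": "Adjective", "small": "Adjective",
    "cat": "Noun", "rat": "Noun", "eats": "Verb",
}


def parse(symbol, words, pos):
    """Try to read `symbol` starting at words[pos].

    Returns (tree, next_position) on success, or None on failure.
    """
    if symbol not in GRAMMAR:  # a word category such as Noun
        if pos < len(words) and LEXICON.get(words[pos].lower()) == symbol:
            return (symbol, words[pos]), pos + 1
        return None
    for rhs in GRAMMAR[symbol]:
        children, cur = [], pos
        for part in rhs:
            result = parse(part, words, cur)
            if result is None:
                break
            child, cur = result
            children.append(child)
        else:  # every part of this rule matched
            return (symbol, *children), cur
    return None


def parse_sentence(text):
    """Return the syntactic tree of a legal sentence, or None if it is illegal."""
    words = text.split()
    result = parse("S", words, 0)
    if result is None or result[1] != len(words):
        return None
    return result[0]


print(parse_sentence("The large cat eats the small rat"))
print(parse_sentence("cat the eats"))
```

The first call returns a nested tuple that mirrors the tree (S over NP and VP); the second returns None because the word order breaks the rules.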
Syntactic Tree
• The point where lines begin or end is called a node
• Each node has a label such as S, PP or chased
• If two nodes are connected by a line, the upper node is the immediate dominator of the lower node. D is the immediate dominator of "the"
• Upper nodes in a branch are called dominators. NP is the dominator of D, N, the, dog
Syntactic Tree
• Two nodes are sisters if they are immediately dominated by the same node. D and N are sisters.
• Their immediate dominator is called their mother. NP is the mother of D and N. Similarly, D and N are daughters of NP
• The immediate dominator of a node is also called its parent
Syntactic Tree
• Constituents are the terminal nodes that are all dominated by a single non-terminal node. "Chased a cat into the garden" is a constituent, as its words are all dominated by VP
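The tree vocabulary above (mother, daughters, sisters, dominators, terminal nodes) can be tried out on the NP "the dog". This is a minimal sketch: the Node class and its method names are assumptions made for illustration, not part of the lecture.

```python
class Node:
    """A labelled tree node; `mother` is its immediate dominator."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)  # the daughters
        self.mother = None
        for child in self.children:
            child.mother = self

    def sisters(self):
        """Nodes immediately dominated by the same mother."""
        if self.mother is None:
            return []
        return [n for n in self.mother.children if n is not self]

    def dominators(self):
        """All upper nodes on the branch, nearest first."""
        node, result = self.mother, []
        while node is not None:
            result.append(node)
            node = node.mother
        return result

    def terminals(self):
        """Labels of the terminal nodes dominated by this node."""
        if not self.children:
            return [self.label]
        return [w for child in self.children for w in child.terminals()]


the = Node("the")
dog = Node("dog")
d = Node("D", [the])
n = Node("N", [dog])
noun_phrase = Node("NP", [d, n])

print(d.mother.label)                       # mother of D
print([s.label for s in d.sisters()])       # sisters of D
print([x.label for x in the.dominators()])  # dominators of "the"
print(noun_phrase.terminals())              # words dominated by NP
```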
Label Bracketing
• It is a process of representing the syntactic tree in another way.
Do it yourself: label-bracket the tree
Remember, you may have to practise the reverse: constructing a syntactic tree from label bracketing ☺
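Both directions of label bracketing can be sketched in Python. The square-bracket notation "[NP [D the] [N dog]]" and the function names to_brackets and from_brackets are assumptions made for this example; the tree is stored as nested tuples of the form (label, child, ...).

```python
def to_brackets(tree):
    """Render a (label, child, ...) tuple tree as a labelled bracketing."""
    if isinstance(tree, str):  # a word (terminal node)
        return tree
    label, *children = tree
    return "[" + label + " " + " ".join(to_brackets(c) for c in children) + "]"


def from_brackets(text):
    """Rebuild the tuple tree from a labelled bracketing (the reverse exercise)."""
    tokens = text.replace("[", " [ ").replace("]", " ] ").split()
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token != "[":
            return token       # a plain word
        label = tokens[pos]    # the label follows the opening bracket
        pos += 1
        children = []
        while tokens[pos] != "]":
            children.append(read())
        pos += 1               # skip the closing bracket
        return (label, *children)

    return read()


tree = ("NP", ("D", "the"), ("N", "dog"))
text = to_brackets(tree)
print(text)
print(from_brackets(text) == tree)
```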
Constituents and Categories
• Tree structure provides two pieces of information:
1. It divides the sentence into constituents (in English, these are called phrases)
2. It puts them into categories (NP, VP, etc)
Constituents and Categories
• How do we know the right way to group words into the right category?
• How do we know that "into the garden" is a constituent, but "a cat into" is not?
• Any words that can be moved as a group are probably constituents: compare the meaning of "the dog chased a cat into the garden" and "into the garden, the dog chased a cat".
• Which words did you move? "Into the garden", right?
• And the meaning did not change
• That's probably our constituent
Constituents and Categories
• Any string of words that can be deleted is probably a constituent
• If you omit "into the garden" from the sentence, the sentence stays grammatical.
• Usually, a constituent makes sense as a unit of meaning. "Into the garden" is much more meaningful than "a cat into"
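Once a tree is available, the question "is this string of words a constituent?" becomes a simple check: the words must be exactly the terminal nodes under one non-terminal node. The sketch below assumes one plausible tree for "the dog chased a cat into the garden"; the tree shape and the names TREE, words, constituents and is_constituent are assumptions made for illustration.

```python
# Assumed tree: S -> NP VP, VP -> V NP PP, PP -> P NP, NP -> D N
TREE = ("S",
        ("NP", ("D", "the"), ("N", "dog")),
        ("VP", ("V", "chased"),
               ("NP", ("D", "a"), ("N", "cat")),
               ("PP", ("P", "into"),
                      ("NP", ("D", "the"), ("N", "garden")))))


def words(tree):
    """Terminal nodes under `tree`, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]


def constituents(tree):
    """Yield (label, word list) for every non-terminal node."""
    if isinstance(tree, str):
        return
    yield tree[0], words(tree)
    for child in tree[1:]:
        yield from constituents(child)


def is_constituent(phrase, tree=TREE):
    """True if the phrase is exactly the word list of some node."""
    target = phrase.lower().split()
    return any(ws == target for _, ws in constituents(tree))


print(is_constituent("into the garden"))
print(is_constituent("a cat into"))
```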
Constituents and Categories
• However, we are only talking about syntactic structure, not the semantic one.
• The dog, the cat and the garden: their grammatical structure says they are all noun phrases.
• It means they can be used interchangeably; no linguist can deny that
• Then what about "The garden chased the cat into the dog"? ☺
• But we said before that we would not focus on semantics!
Ambiguity
• There are 2 types of ambiguity:
1. Lexical Ambiguity: the sentence contains an idiom/word/term that has more than one meaning.
Glasses means both drinking glasses and spectacles
2. Structural Ambiguity: the sentence has more than one syntactic tree
I saw the boy with the telescope:
Did you use a telescope to see the boy? Or
Did you see the boy who had a telescope?
Structural Ambiguity
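Structural ambiguity can be made concrete by counting syntactic trees. The sketch below assumes a small hypothetical grammar in which a prepositional phrase may attach either to a noun phrase (NP -> NP PP) or to a verb phrase (VP -> VP PP); the grammar, the lexicon and the function name all_parses are assumptions made for this example. It tries every split point of every span, so it finds all trees rather than just the first.

```python
from functools import lru_cache

# Hypothetical grammar; each rule has one or two symbols on the right.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Pronoun",), ("Article", "Noun"), ("NP", "PP")],
    "VP": [("Verb", "NP"), ("VP", "PP")],
    "PP": [("Preposition", "NP")],
}
LEXICON = {
    "i": "Pronoun", "saw": "Verb", "the": "Article",
    "boy": "Noun", "telescope": "Noun", "with": "Preposition",
}


def all_parses(sentence):
    """Return every tree of category S that covers the whole sentence."""
    words = sentence.lower().split()

    @lru_cache(maxsize=None)
    def parse(symbol, i, j):
        """All trees of category `symbol` spanning words[i:j]."""
        trees = []
        if j - i == 1 and LEXICON.get(words[i]) == symbol:
            trees.append((symbol, words[i]))
        for rhs in GRAMMAR.get(symbol, []):
            if len(rhs) == 1:
                trees += [(symbol, t) for t in parse(rhs[0], i, j)]
            else:
                left, right = rhs
                for k in range(i + 1, j):  # every split point
                    for lt in parse(left, i, k):
                        for rt in parse(right, k, j):
                            trees.append((symbol, lt, rt))
        return tuple(trees)

    return parse("S", 0, len(words))


for tree in all_parses("I saw the boy with the telescope"):
    print(tree)
```

Under this grammar the sentence gets two trees: one where "with the telescope" belongs to the noun phrase "the boy", and one where it belongs to the verb phrase "saw the boy".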
Difficulties with Natural Language:
Anaphora
• Using pronouns to refer back to entities already introduced in the text
After Mary proposed to John, they found a preacher and got married.
For the honeymoon, they went to Hawaii
Mary saw a ring through the window and asked John for it
Mary threw a rock at the window and broke it
Difficulties with Natural Language:
Indexicality
• Indexical sentences refer to the utterance situation (place, time, speaker/hearer, etc.)
I am over here
Why did you do that?
Difficulties with Natural Language:
Metonymy
• Using one noun phrase to stand for another
I've read Shakespeare
Chrysler announced record profits
The ham sandwich on Table 4 wants another beer
Difficulties with Natural Language:
Metaphor
• "Non-literal" usage of words and phrases, often systematic.
I've tried killing the process but it won't die.
Its parent keeps it alive.
Semantics in NL
• I can't untie that knot with one hand.
– The sentence is about the abilities of whoever spoke or wrote it. (Call this person the speaker.)
– It's also about a knot, maybe one that the speaker is pointing at
– The sentence denies that the speaker has a certain ability. (This is the contribution of the word `can't'.)
– Untying is a way of making something not tied.
– The sentence doesn't mean that the knot has one hand; it has to do with how many hands are used to do the untying.
Problems in Semantics in NL
• If you do not understand certain characteristics of linguistics, you will not be able to understand the semantics.
• If you do understand them, you need to feel them
• If you do feel them, you need to see the context
• If you see the context, you are dealing with both semantics and pragmatics in NL ☺
Synonymy
• Synonyms are different words (or sometimes phrases) with identical or very similar meanings.
• Words that are synonyms are said to be synonymous, and the state of being a synonym is called synonymy
Synonymy
• student and pupil (noun)
• buy and purchase (verb)
• sick and ill (adjective)
• quickly and speedily (adverb)
• on and upon (preposition)
Synonymy
• Note that synonyms are defined with respect to certain senses of words
• pupil as the "aperture in the iris of the eye" is not synonymous with student.
• Similarly, he expired means the same as he died, yet my passport has expired cannot be replaced by my passport has died.
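The point that synonymy holds per sense, not per word, can be sketched with a tiny sense table. The table SENSES, its sense labels and the function synonymous_in are assumptions made for illustration; a real system would use a proper lexical resource.

```python
# Hypothetical sense inventory: each word maps to the set of senses it can carry.
SENSES = {
    "pupil": {"learner", "aperture in the iris"},
    "student": {"learner"},
    "expire": {"die", "cease to be valid"},
    "die": {"die"},
}


def synonymous_in(word_a, word_b, sense):
    """True if both words can carry the given sense."""
    return sense in SENSES[word_a] and sense in SENSES[word_b]


print(synonymous_in("pupil", "student", "learner"))
print(synonymous_in("pupil", "student", "aperture in the iris"))
print(synonymous_in("expire", "die", "cease to be valid"))
```

The last call is the passport case: "expire" has the sense "cease to be valid", but "die" does not, so the replacement fails.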
Antonymy
• Antonyms are words with opposite or nearly opposite meanings. For example:
• short and tall
• dead and alive
• increase and decrease
Homonymy
• A homonym is one of a group of words that share the same spelling and the same pronunciation but have different meanings, usually as a result of the two words having different origins.
• The state of being a homonym is called homonymy.
• bark (the sound of a dog) and bark (the skin of a tree).
Heteronymy
• Heteronyms (also known as heterophones) are words with identical spellings (or characters) but different pronunciations and meanings.
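The definitions of synonym, homonym and heteronym differ only in which of spelling, pronunciation and meaning are shared, so they can be written as one small decision function. This is a sketch under assumptions: the Entry type, the informal pronunciation respellings and the "lead" example are not from the lecture, and "identical meaning" is simplified to equal strings.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    spelling: str
    pronunciation: str  # informal respelling, assumed for illustration
    meaning: str


def relation(a, b):
    """Name the relation between two dictionary entries."""
    if a.meaning == b.meaning:
        return "synonym" if a.spelling != b.spelling else "same word"
    if a.spelling == b.spelling:
        if a.pronunciation == b.pronunciation:
            return "homonym"    # same spelling, same sound, different meaning
        return "heteronym"      # same spelling, different sound and meaning
    return "unrelated"


bark_dog = Entry("bark", "bark", "the sound of a dog")
bark_tree = Entry("bark", "bark", "the skin of a tree")
lead_verb = Entry("lead", "leed", "to guide")
lead_metal = Entry("lead", "led", "a heavy metal")
buy = Entry("buy", "by", "to get something by paying")
purchase = Entry("purchase", "PUR-chus", "to get something by paying")

print(relation(bark_dog, bark_tree))
print(relation(lead_verb, lead_metal))
print(relation(buy, purchase))
```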
Monolingual ambiguity
• morphological ambiguity:
– German -en: noun plural, dative plural, weak noun non-nominative, adjective
masculine non-nominative, etc.
• compound nouns:
– coincide -> coin+cide, cooperate -> cooper+ate
• category ambiguity:
– round: the first round (noun), to round up cattle (verb), the round table (adjective), go on a
voyage round the Mediterranean (preposition), it measures three feet round (adverb), etc.
• homographs and polysemes:
– branch: ‘of a tree’, ‘of a bank’; crane (a bird or lifting machine)
– ball: The ball rolled down the hill, The ball lasted until midnight
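The compound problem above (coincide -> coin+cide, cooperate -> cooper+ate) is easy to reproduce with a naive splitter that cuts a word wherever both halves look like known pieces. The piece list MORPHS and the function naive_splits are assumptions made for illustration.

```python
# Hypothetical list of known pieces a careless analyser might contain.
MORPHS = {"coin", "cide", "cooper", "ate"}


def naive_splits(word, morphs=MORPHS):
    """Return every way to cut `word` into two known pieces."""
    return [(word[:i], word[i:])
            for i in range(1, len(word))
            if word[:i] in morphs and word[i:] in morphs]


print(naive_splits("coincide"))
print(naive_splits("cooperate"))
```

Both words come back with a split, although neither is a compound; a system needs the whole words in its lexicon (or extra checks) to block these analyses.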
Bilingual lexical ambiguity
• English wall: German Mauer (outside) or Wand (inside)
• English river: French fleuve (major) or rivière (general term)
• English leg: French jambe (human), patte (animal, insect), pied (table), étape (journey)
• English blue: Russian goluboi (pale blue) or sinii (dark blue)
• French louer: English hire or rent
• German leihen: English borrow or lend
• English wear: Japanese haoru (coat/jacket), haku (shoes/trousers), kaburu (hat), hameru
(ring/gloves), shimeru (belt/tie/scarf), tsukeru (brooch/clip), kakeru (glasses/necklace)
• resolvable by:
– rules (indicating allowable or usual categories or types of subjects, objects, verbs, etc.)
– collocations (specifying particular adjacent words)
– frequencies (most probable adjacent or dependent words)
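Resolution by rules can be sketched with the "wear" example: the type of object decides which Japanese verb is chosen. The table below copies the object types listed above; the names WEAR_BY_OBJECT and translate_wear are assumptions made for illustration.

```python
# Object noun -> Japanese verb for English "wear", as listed on the slide.
WEAR_BY_OBJECT = {
    "coat": "haoru", "jacket": "haoru",
    "shoes": "haku", "trousers": "haku",
    "hat": "kaburu",
    "ring": "hameru", "gloves": "hameru",
    "belt": "shimeru", "tie": "shimeru", "scarf": "shimeru",
    "brooch": "tsukeru", "clip": "tsukeru",
    "glasses": "kakeru", "necklace": "kakeru",
}


def translate_wear(obj):
    """Pick the verb from the object noun; None if no rule applies."""
    return WEAR_BY_OBJECT.get(obj.lower())


print(translate_wear("hat"))
print(translate_wear("Glasses"))
print(translate_wear("umbrella"))
```

An object without a rule returns None, which is where collocations or frequencies would have to take over.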
Structural ambiguity
• Flying planes can be dangerous
• The man saw the girl with a telescope
• John mentioned the book I sent to Mary
• I told everyone concerned about the strike
– everyone concerned/involved/relevant, or: everyone disturbed/worried
• He noticed her shaking hands
– either which were shaking from cold, or which were shaking other hands
• They complained to the guide that they could not hear
– that as relative pronoun (‘whom they could not hear’) or as complementizer (‘that they could
not hear him’)
• The mathematics students sat their examinations
• The mathematics students study today is very complex
– difficulty of identifying noun compound vs. relative clause
• Gas pump prices rose last time oil stocks fell
– each word potentially noun or verb
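The last example shows how quickly category ambiguity multiplies. If each word may be a noun or a verb, the number of possible category sequences doubles with every word. The sketch below simply enumerates them; the names SENTENCE, TAGS and tag_sequences are assumptions made for illustration.

```python
from itertools import product

SENTENCE = "Gas pump prices rose last time oil stocks fell".split()
TAGS = ("Noun", "Verb")


def tag_sequences(words, tags=TAGS):
    """Yield every assignment of one tag to each word."""
    for combo in product(tags, repeat=len(words)):
        yield list(zip(words, combo))


count = sum(1 for _ in tag_sequences(SENTENCE))
print(len(SENTENCE), "words,", count, "possible tag sequences")
```

A parser has to rule out almost all of these sequences before it can settle on a structure.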
Reference
• Richmond Thomason, http://www.eecs.umich.edu/~rthomaso/documents/general/what-is-semantics.html
• NLP for Prolog Programmers by Michael A. Covington, Chapter 4
• Wikipedia, http://en.wikipedia.org/wiki/Thematic_relations

L1 l2 l3 introduction to machine translation

  • 2.
  • 3.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27. Peter mentioned the book I sent to Mary
  • 28. We will give medicines to pregnant women and children
  • 29. I saw the boy with the telescope
  • 34. Da Vinci liked to paint his models nude.
  • 35. He wrote the note yesterday
  • 36. You mean you carried the information by a bus?
  • 37. Connecting wires are tiring in DLD lab
  • 38. Squad helps dog bite victim
  • 39. Why use computers in translation? • Too much translation for humans • Technical materials too boring for humans • Greater consistency required • Need results more quickly • Not everything needs to be top quality • Reduce costs • Any one of these may justify machine translation or computer aids
  • 40. Components of a Language • There are three components of a language: 1. Lexicon 2. Categorization 3. Grammar Rules
  • 41. Lexicon
      stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | ...
      is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | ...
      right | left | east | south | back | smelly | ...
      here | there | nearby | ahead | right | left | east | south | back | ...
      me | you | I | it | S=HE | Y’ALL | ...
      John | Mary | Boston | UCB | PAJC | ...
      the | a | an | ...
      to | in | on | near | ...
      and | or | but | ...
      0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
  • 42. Categorization
      Noun >> stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | ...
      Verb >> is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | ...
      Adjective >> right | left | east | south | back | smelly | ...
      Adverb >> here | there | nearby | ahead | right | left | east | south | back | ...
      Pronoun >> me | you | I | it | S=HE | Y’ALL | ...
      Name >> John | Mary | Boston | UCB | PAJC | ...
      Article >> the | a | an | ...
      Preposition >> to | in | on | near | ...
      Conjunction >> and | or | but | ...
      Digit >> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
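The categorization slide can be read as a lookup table from categories to words. The sketch below is only an illustration: the dictionary layout and the helper name `categories` are assumptions, and the elided entries ("...") and the unclear pronoun entries are left out rather than guessed.

```python
# Illustrative subset of the slide's lexicon, keyed by category.
# The dict layout is an assumption, not something the slides prescribe.
LEXICON = {
    "Noun": {"stench", "breeze", "glitter", "nothing", "wumpus",
             "pit", "pits", "gold", "east"},
    "Verb": {"is", "see", "smell", "shoot", "feel", "stinks",
             "go", "grab", "carry", "kill", "turn"},
    "Adjective": {"right", "left", "east", "south", "back", "smelly"},
    "Adverb": {"here", "there", "nearby", "ahead", "right", "left",
               "east", "south", "back"},
    "Pronoun": {"me", "you", "I", "it"},
    "Name": {"John", "Mary", "Boston", "UCB", "PAJC"},
    "Article": {"the", "a", "an"},
    "Preposition": {"to", "in", "on", "near"},
    "Conjunction": {"and", "or", "but"},
    "Digit": set("0123456789"),
}


def categories(word):
    """Return every category that lists the word, sorted alphabetically."""
    return sorted(cat for cat, words in LEXICON.items() if word in words)


print(categories("wumpus"))  # ['Noun']
print(categories("east"))    # ['Adjective', 'Adverb', 'Noun']
```

A word such as "east" appears under several categories, which is the category ambiguity that later slides return to.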
  • 43. Grammar Structure • Attending this lecture and the one following it carefully does not mean you will know all of the English language • That would take reading NLP as a single subject for 4 years! ☺ • We will learn how to define the basic grammar structure for NLP systems • We will also learn what you need to keep in mind while devising such systems
  • 44. Syntactic Tree • Humans recognize the organization of words in a sentence, according to their POS, with trees. • Do you deny it? • Well, you can, because you didn’t learn it this way in your childhood. • No one did! • But it has been proved that our brain draws a tree-like structure when we first develop our language skills • That research is beyond this lecture
  • 45. Syntactic Tree • So, if you really do that unintentionally, then why not learn it with pen and paper so that you can understand how you will teach machines to learn languages? • The tree structure humans contemplate is called a syntactic tree
  • 46. Parsing a Syntactic Tree • Parsing is the process of using grammar rules to determine whether a sentence is legal, and to obtain its syntactic structure • ‘The large cat eats the small rat’
  • 47. Parsing • The large cat eats the small rat
  • 48. Parsing • Article Adjective Noun Verb Article Adjective Noun • The large cat eats the small rat
  • 49. Parsing • Article Adjective Noun Verb Noun phrase (Article Adjective Noun) • The large cat eats the small rat
  • 50. Parsing • Noun phrase (Article Adjective Noun) Verb phrase (Verb Noun phrase (Article Adjective Noun)) • The large cat eats the small rat
  • 51. Parsing • Sentence (Noun phrase (Article Adjective Noun) Verb phrase (Verb Noun phrase (Article Adjective Noun))) • The large cat eats the small rat
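The bottom-up steps on the parsing slides can be sketched in a few lines of Python. This is a minimal illustration, not a general parser: the three rules (NP from Article Adjective Noun, VP from Verb NP, S from NP VP) are inferred from the slide diagrams, the tiny lexicon covers only the example sentence, and the greedy "reduce the first match" strategy is an assumption that happens to work for this unambiguous grammar.

```python
# Word categories for the example sentence only (assumed mini-lexicon).
LEXICON = {"the": "Article", "large": "Adjective", "small": "Adjective",
           "cat": "Noun", "rat": "Noun", "eats": "Verb"}

# Grammar rules as (right-hand side, left-hand side), inferred from the slides.
RULES = [
    (("Article", "Adjective", "Noun"), "NP"),
    (("Verb", "NP"), "VP"),
    (("NP", "VP"), "S"),
]


def parse(sentence):
    """Return the tree for a legal sentence, or None if it is not legal.

    A node is (label, children); a word node is (category, word).
    """
    try:
        nodes = [(LEXICON[w.lower()], w) for w in sentence.split()]
    except KeyError:
        return None  # a word is missing from the lexicon
    changed = True
    while changed:
        changed = False
        for rhs, lhs in RULES:
            n = len(rhs)
            for i in range(len(nodes) - n + 1):
                if tuple(node[0] for node in nodes[i:i + n]) == rhs:
                    # Replace the matched nodes with one parent node.
                    nodes[i:i + n] = [(lhs, nodes[i:i + n])]
                    changed = True
                    break
            if changed:
                break
    if len(nodes) == 1 and nodes[0][0] == "S":
        return nodes[0]
    return None


tree = parse("The large cat eats the small rat")
print(tree[0])                        # S
print([child[0] for child in tree[1]])  # ['NP', 'VP']
```

A string such as "The cat eats" cannot be reduced to a single S node under these rules, so `parse` returns None for it.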
  • 52. Syntactic Tree • The point where lines begin or end is called a node • Each node has a label such as S, PP or chased • If 2 nodes are connected by a line, the upper node is the immediate dominator of the lower node. D is the immediate dominator of "the" • Upper nodes in a branch are called dominators. NP is the dominator of D, N, the, dog
  • 53. Syntactic Tree • Two nodes are sisters if they are immediately dominated by the same node. D and N are sisters. • Their immediate dominator is called their mother. NP is the mother of D and N. Similarly, D and N are daughters of NP • Immediate dominators are also called parents
  • 54. Syntactic Tree • Constituents are the terminal nodes that are all dominated by a single non‐terminal node. "Chased a cat into the garden" is a constituent, as its words are all dominated by VP
  • 55. Label Bracketing • It is a process of representing the syntactic tree in another way.
  • 56. Do it yourself: label-bracket the tree
  • 57. Remember, you may have to practise the reverse: constructing a syntactic tree from label bracketing ☺
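Both directions of the exercise, tree to label bracketing and back, can be sketched as follows. The bracket style `[NP [D the] [N dog]]`, the tuple representation of nodes and the function names are assumptions chosen for illustration; the slides do not show the exact notation.

```python
def to_brackets(node):
    """Turn a (label, children) tree into a labelled bracket string."""
    label, children = node
    if isinstance(children, str):          # a word node such as ("D", "the")
        return f"[{label} {children}]"
    inner = " ".join(to_brackets(child) for child in children)
    return f"[{label} {inner}]"


def from_brackets(text):
    """Rebuild the (label, children) tree from a labelled bracket string."""
    tokens = text.replace("[", " [ ").replace("]", " ] ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] != "[":
            raise ValueError("expected '['")
        label = tokens[pos + 1]
        pos += 2
        if tokens[pos] != "[":             # a single word follows the label
            word = tokens[pos]
            pos += 2                       # skip the word and its ']'
            return (label, word)
        children = []
        while tokens[pos] == "[":
            children.append(read())
        pos += 1                           # skip the closing ']'
        return (label, children)

    return read()


noun_phrase = ("NP", [("D", "the"), ("N", "dog")])
print(to_brackets(noun_phrase))            # [NP [D the] [N dog]]
```

Feeding the printed string back into `from_brackets` returns the original tuple, which is a quick way to check a hand-drawn bracketing.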
  • 58. Constituents and Categories • Tree structure provides two pieces of information: 1. It divides the sentence into constituents (in English, these are called phrases) 2. It puts them into categories (NP, VP, etc.)
  • 59. Constituents and Categories • How do we know the right way to group words into the right category? • How do we know "into the garden" is a constituent, but "a cat into" is not? • Any words that can be moved as a group are probably a constituent: compare "the dog chased a cat into the garden" and "into the garden, the dog chased a cat". • Which group did you move? "Into the garden", right? • And the meaning did not change • That’s probably our constituent
  • 60. Constituents and Categories • Any string of words that can be deleted is probably a constituent • If you omit "into the garden" from the sentence, nothing changes grammatically. • Usually, a unit of words has a meaning that makes sense. "Into the garden" is much more meaningful than "a cat into"
  • 61. Constituents and Categories • However, we are only talking about syntactic structure, not the semantic one. • The dog, the cat and the garden: their grammatical structure says they are all noun phrases. • It means they can be used interchangeably; no linguist can deny that • Then what about “The garden chased the cat into the dog”? ☺ • You said before that we would not focus on semantics!
  • 62. Ambiguity • There are 2 types of ambiguity: 1. Lexical Ambiguity: the sentence contains an idiom/word/term that has more than one meaning. "Glasses" means both drinking glasses and spectacles 2. Structural Ambiguity: the sentence has more than one syntactic tree. "I saw the boy with the telescope": Did you see the boy by using a telescope? Or did you see the boy who had a telescope?
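Structural ambiguity can be made concrete by counting the syntactic trees a grammar allows for a sentence. The sketch below uses a CKY-style chart; the small binary grammar and lexicon are assumptions written only for this example (for instance, the pronoun "I" is treated directly as an NP), not a grammar given in the lecture.

```python
from collections import defaultdict

# Assumed toy lexicon: each word has exactly one category here.
LEXICON = {"I": "NP", "saw": "V", "the": "D", "boy": "N",
           "with": "P", "telescope": "N"}

# Assumed binary rules as (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # the PP modifies the seeing
    ("NP", "D", "N"),
    ("NP", "NP", "PP"),   # the PP modifies the boy
    ("PP", "P", "NP"),
]


def count_trees(sentence, root="S"):
    """Count the distinct parse trees of the sentence under the toy grammar."""
    words = sentence.split()
    n = len(words)
    chart = defaultdict(int)   # (start, end, label) -> number of trees
    for i, word in enumerate(words):
        chart[i, i + 1, LEXICON[word]] += 1   # KeyError for unknown words
    for span in range(2, n + 1):
        for start in range(n - span + 1):
            end = start + span
            for mid in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    chart[start, end, parent] += (
                        chart[start, mid, left] * chart[mid, end, right]
                    )
    return chart[0, n, root]


print(count_trees("I saw the boy"))                     # 1
print(count_trees("I saw the boy with the telescope"))  # 2
```

The two trees for the longer sentence correspond to the two readings on the slide: the prepositional phrase attaches either to the verb phrase or to the noun phrase "the boy".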
  • 64. Difficulties with Natural Language: Anaphora • Using pronouns to refer back to entities already introduced in the text • After Mary proposed to John, they found a preacher and got married. • For the honeymoon, they went to Hawaii • Mary saw a ring through the window and asked John for it • Mary threw a rock at the window and broke it
  • 66. Difficulties with Natural Language: Metonymy • Using one noun phrase to stand for another • I've read Shakespeare • Chrysler announced record profits • The ham sandwich on Table 4 wants another beer
  • 67. Difficulties with Natural Language: Metaphor • "Non‐literal" usage of words and phrases, often systematic. • I've tried killing the process but it won't die. Its parent keeps it alive.
  • 68. Semantics in NL • I can't untie that knot with one hand. – The sentence is about the abilities of whoever spoke or wrote it. (Call this person the speaker.) – It's also about a knot, maybe one that the speaker is pointing at – The sentence denies that the speaker has a certain ability. (This is the contribution of the word `can't'.) – Untying is a way of making something not tied. – The sentence doesn't mean that the knot has one hand; it has to do with how many hands are used to do the untying.
  • 69. Problems in Semantics in NL • If you do not understand certain characteristics of linguistics, you will not be able to understand the semantics. • If you do understand them, you need to feel them • If you do feel them, you need to see the context • If you see the context, you are dealing with both semantics and pragmatics in NL • ☺
  • 70. Synonymy • Synonyms are different words (or sometimes phrases) with identical or very similar meanings. • Words that are synonyms are said to be synonymous, and the state of being a synonym is called synonymy
  • 71. Synonymy • student and pupil (noun) • buy and purchase (verb) • sick and ill (adjective) • quickly and speedily (adverb) • on and upon (preposition)
  • 72. Synonymy • Note that synonyms are defined with respect to certain senses of words • "pupil" as the "aperture in the iris of the eye" is not synonymous with "student". • Similarly, "he expired" means the same as "he died", yet "my passport has expired" cannot be replaced by "my passport has died".
  • 73. Antonymy • Antonyms are words with opposite or nearly opposite meanings. For example: • short and tall • dead and alive • increase and decrease
  • 74. Homonymy • A homonym is one of a group of words that share the same spelling and the same pronunciation but have different meanings, usually as a result of the two words having different origins. • The state of being a homonym is called homonymy. • bark (the sound of a dog) and bark (the skin of a tree).
  • 75. Heteronymy • Heteronyms (also known as heterophones) are words with identical spellings (or characters) but different pronunciations and meanings.
  • 76. Monolingual ambiguity • morphological ambiguity: – German -en: noun plural, dative plural, weak noun non-nominative, adjective masculine non-nominative, etc. • compound nouns: – coincide -> coin+cide, cooperate -> cooper+ate • category ambiguity: – round: the first round (noun), to round up cattle (verb), the round table (adjective), go on a voyage round the Mediterranean (preposition), it measures three feet round (adverb), etc. • homographs and polysemes: – branch: ‘of a tree’, ‘of a bank’; crane (a bird or lifting machine) – ball: The ball rolled down the hill, The ball lasted until midnight
  • 77. Bilingual lexical ambiguity • English wall: German Mauer (outside) or Wand (inside) • English river: French fleuve (major) or rivière (general term) • English leg: French jambe (human), patte (animal, insect), pied (table), étape (journey) • English blue: Russian goluboi (pale blue) or sinii (dark blue) • French louer: English hire or rent • German leihen: English borrow or lend • English wear: Japanese haoru (coat/jacket), haku (shoes/trousers), kaburu (hat), hameru (ring/gloves), shimeru (belt/tie/scarf), tsukeru (brooch/clip), kakeru (glasses/necklace) • resolvable by: – rules (indicating allowable or usual categories or types of subjects, objects, verbs, etc.) – collocations (specifying particular adjacent words) – frequencies (most probable adjacent or dependent words)
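The "rules" option for resolving bilingual lexical ambiguity can be illustrated with the English "wear" example above. The sketch below picks a Japanese verb from the object noun; the table simply restates the pairs listed on the slide, while the dictionary layout and the function name `translate_wear` are assumptions made for illustration.

```python
# Object nouns that select each Japanese verb, as listed on the slide.
WEAR_BY_OBJECT = {
    "haoru": {"coat", "jacket"},
    "haku": {"shoes", "trousers"},
    "kaburu": {"hat"},
    "hameru": {"ring", "gloves"},
    "shimeru": {"belt", "tie", "scarf"},
    "tsukeru": {"brooch", "clip"},
    "kakeru": {"glasses", "necklace"},
}


def translate_wear(obj):
    """Pick the Japanese verb for 'wear' from its object, or None if unknown."""
    for verb, objects in WEAR_BY_OBJECT.items():
        if obj.lower() in objects:
            return verb
    return None


print(translate_wear("hat"))    # kaburu
print(translate_wear("shoes"))  # haku
```

An object outside the table returns None, which is where a real system would fall back on collocations or frequencies, the other two options on the slide.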
  • 78. Structural ambiguity • Flying planes can be dangerous • The man saw the girl with a telescope • John mentioned the book I sent to Mary • I told everyone concerned about the strike – everyone concerned/involved/relevant, or: everyone disturbed/worried • He noticed her shaking hands – either which were shaking from cold, or which were shaking other hands • They complained to the guide that they could not hear – that as relative pronoun (‘whom they could not hear’) or as complementizer (‘that they could not hear him’) • The mathematics students sat their examinations • The mathematics students study today is very complex – difficulty of identifying noun compound vs. relative clause • Gas pump prices rose last time oil stocks fell – each word potentially noun or verb