SlideShare une entreprise Scribd logo
1  sur  110
Télécharger pour lire hors ligne
Effective Semantics for
Engineering NLP Systems
André Freitas
Lancaster, May 2018
Goals of this Talk
Provide a synthesis of the emerging representation
trends behind NLP systems.
Shift in perspective:
• Effective engineering (task driven, scalable) instead of
sound formalism.
• Best-effort representation.
Outline
• Knowledge Graphs (Frege revisited)
• Information Extraction & Text Classification
• Distributional Semantic Models
• Knowledge Graphs & Distributional Semantics
– (Distributional-Relational Models)
• Applications of DRMs
– KG Completion
– Semantic Parsing
– Natural Language Inference
“On our best behaviour”
“We need to return to our roots in Knowledge
Representation and Reasoning for language and from
language.”
Levesque, 2013
“We should not treat English text as a monolithic source
of information.”
“Instead, we should carefully study how simple
knowledge bases might be used to make sense of the
simple language needed to build slightly more complex
knowledge bases…”
Knowledge Graphs
(Frege Revisited)
Some Perspectives on “What”
“The Knowledge Graph is a knowledge base used by Google to enhance
its search engine's search results.”
“A Knowledge graph (i) mainly describes real world entities and
interrelations, organized in a graph (ii) defines possible classes and
relations of entities in a schema (iii) allows potentially interrelating arbitrary
entities with each other…” [Paulheim H.]
“We define a Knowledge Graph as an RDF graph consists of a set of RDF
triples where each RDF triple (s,p,o)….” [Pujara J. al al.]
KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
• Open world representation of information.
• Every entry point is equal cost.
• Underpin Cortana, Google Assistant, Siri, Alexa.
• Typically (but doesn’t have to be) expressed in RDF.
• No longer a solution in search of a problem!
Dan Bennett, TR
Some Perspectives on “What”
• “Knowledge is Power” Hypothesis (the Knowledge
Principle): “If a program is to perform a complex task
well, it must know a great deal about the world in
which it operates.”
• The Breadth Hypothesis: “To behave intelligently in
unexpected situations, an agent must be capable of
falling back on increasingly general knowledge.”
KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
Some Perspectives on “Why”
• We’re surrounded by entities, which are connected by
relations.
• We need to store them somehow, e.g., using a DB or a
graph.
• Graphs can be processed efficiently and offer a
convenient abstraction.
Some Perspectives on “Why”
KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
Some Perspectives on “Why”
• Knowledge models such as Linked Data and many
problems in machine learning have a natural
representation as relational data.
• Relations between entities are often more important for a
prediction task than attributes.
• For instance, can be easier to predict the party of a vice-
president from the party of his president than from his
attributes.
[Koopman, 2010]
Building Knowledge Graphs
Open Information Extraction
• Extracting unstructured facts from text.
• TextRunner [Banko et al., IJCAI ’07], WOE [Wu & Weld,
ACL ‘10].
• ReVerb [Fader et al., EMNLP ‘11].
• OLLIE [Mausam et al., EMNLP ‘12].
• OpenIE [Mausam et al., IJCAI ‘16].
• Graphene [Niklaus et al, COLING 17].
Graphene
• Captures contextual relations.
• Extends the default Open IE representation in
order to capture inter-proposition relationships.
• Include rhetorical relations.
Cetto et al., Creating a Hierarchy of Semantically-Linked Propositions in Open Information Extraction,
COLING (2018).
Niklaus et al., A Sentence Simplification System for Improving Relation Extraction, COLING (2017)
Transformation Stage
Rhetorical Relations
Extracting Rhetorical
Relations
Extracting Rhetorical
Relations
Clausal & Phrasal
Disembedding
Input Document
Transformation Stage
Relation Extraction
Output
Asian stocks fell anew and the yen rose to session highs
in the afternoon as worries about North Korea simmered,
after a senior Pyongyang official said the U.S. is
becoming ``more vicious and more aggressive'' under
President Donald Trump .
Asian stocks fell anew
The yen rose to session highs in the afternoon
spatial
attribution
after
Worries simmered about North Korea
The U.S. is becoming
becoming `` more vicious and more aggressive ''
under Donald Trump
A senior Pyongyang
official said
background
and
Precision:
Recall:
Improving Open Relation Extraction using Clausal
and Phrasal Disembedding, Under Review, (2017)
What to expect?
(Wikipedia & Newswire)
https://github.com/Lambda-3/Graphene
Niklaus et al., A Sentence Simplification System for Improving Relation Extraction, COLING (2017)
Software: Extracting Knowledge
Graphs from Text
Argumentation Structures
Stab & Gurevych, Parsing Argumentation Structures in Persuasive Essays, 2016.
Argumentative Discourse
Unit Classification
Argumentation Schemes
Douglas Walton
Unified Schema
Argument Mining Approaches
What to expect?
F1-score: 0.74
Stab & Gurevych, Parsing Argumentation
Structures in Persuasive Essays, 2016.
Definition-based
Models
Semantic Roles for Lexical
Definitions
Aristotle’s classic theory of definition introduced important aspects
such as the genus-differentia definition pattern and the
essential/non-essential property differentiation.
Building the Definition Graph
Data: WordNetGraph
Silva et al., Categorization of Semantic Roles for Dictionary Definitions.
Cognitive Aspects of the Lexicon CogALex@COLING, 2017.
https://github.com/Lambda-3/WordnetGraph
RDF graph generated from WordNet.
Emerging perspectives
• The evolution of parsing and classification methods in
NLP is inducing a new lightweight semantic
representation.
• This representation dialogues with elements from logics,
linguistics and the Semantic/Linked Data Web (especially
RDF).
• However, they relax the semantic constraints of previous
models (which were operating under assumptions for
deductive reasoning or databases).
Emerging perspectives
• Knowledge graphs as lexical semantic models
operating under a semantic best-effort mode (canonical
identifiers when possible, otherwise, words).
• Possibly closer to the surface form of the text.
• Priority is on segmenting, categorizing and when
possible, integrating.
• A representation (data model) convenient for AI
engineering.
Categorization
A fact (main clause):
* Can be a taxonomic fact.
s p o
term, URI term, URI term, URI
instance,
class,
triple
type, property,
schema property
instance,
class,
triple
Categorization
A fact with a context:
s0 p0 o0
p1
o1
reification
e.g.
• subordination
(modality, temporality,
spatiality, RSTs)
• fact probability
• polarity
Categorization
Coordinated facts:
s0 p0 o0
s1 p1 o1
p2
e.g.
• coordination
• RSTs
• ADU
https://github.com/Lambda-3/Graphene/blob/master/wiki/RDFNL-
Format.md
RDF-NL
Knowledge Graphs &
Distributional Semantics
(A marriage made in heaven?)
Distributional Semantics
• Computational models that build contextual semantic
representations from corpus data.
• Semantic context is represented by a vector.
• Vectors are obtained through the statistical analysis of
the linguistic contexts of a word.
• Salience of contexts (cf. context weighting scheme).
• Semantic similarity/relatedness as the core operation
over the model.
Distributional Semantic Models
(Word Vector Models)
Distributional Semantics as
Commonsense Knowledge
Commonsense is here
θ
car
dog
cat
bark
run
leash
Semantic Approximation is
here
Semantic Model with low
acquisition effort
Context Weighting Measures
Kiela & Clark, 2014
Similarity Measures
x
… and of course, Glove and W2V
Distributional-Relational
Models
Distributional Relational Networks, AAAI Symposium (2013).
A Compositional-Distributional Semantic Model for Searching Complex Entity
Categories, ACL *SEM (2016)
Barack
Obama
Sonia
Sotomayor
nominated
:is_a
First Supreme Court Justice of
Hispanic descent
…
LSA, ESA, W2V, GLOVE, …
s0 p0 o0
Compositionality of Complex
Nominals
Barack
Obama
Sonia
Sotomayor
nominated
:is_a
First Supreme Court Justice of
Hispanic descent
Building on Word Vector Space
Models
• But how can we represent the meaning of longer phrases?
• By mapping them into the same vector space!
the country of my birth
the place where I was born
How should we map phrases
into a vector space?
Recursive Neural Networks
Mixture vs Function
A Compositional-Distributional Semantic Model for Searching Complex Entity
Categories, *SEM (2016)
Recursive vs recurrent neural
networks
5
Segmented Spaces vs
Unified Space
s0 p0 o0
s0 p0 o0
• Assumes is <s,p,o> naturally
irreconcilable.
• Inherent dimensional reduction
mechanism.
• Facilitates the specialization of
embedding-based approximations.
• Easier to compute identity.
• Requires complex and high-
dimensional tensorial model.
Software: Indra
• Semantic approximation server
• Multi-lingual (12 languages)
• Multi-domain
• Different compositional models
https://github.com/Lambda-3/indra
Semantic Relatedness for All (Languages): A Comparative Analysis of Multilingual
Semantic Relatedness using Machine Translation, EKAW, (2016).
“On our best behaviour”
“It is not enough to build knowledge bases without paying
closer attention to the demands arising from their use.”
Levesque, 2013
“We should explore more thoroughly the space of
computations between fact retrieval and full
automated logical reasoning.”
How to access Distributional-
Knowledge Graphs efficiently?
• Depends on the target operations in the
Knowledge Graphs (more on this later).
How to access Distributional-
Knowledge Graphs efficiently?
s0 p0 o0
s0
q
Inverted index
sharding
disk access
optimization
…
Multiple Randomized
K-d Tree Algorithm
The Priority Search
K-Means Tree algorithm
Database + IR
Query planning
Cardinality
Indexing
Skyline
Bitmap indexes
…
Structured Queries Approximation Queries
How to access Distributional-
Knowledge Graphs efficiently?
s0 p0 o0
Database + IR
Structured Queries Approximation Queries
Software: StarGraph
• Distributional Knowledge Graph Database.
• Word embedding Database.
https://github.com/Lambda-3/Stargraph
Freitas et al., Natural Language Queries over Heterogeneous Linked Data Graphs: A
Distributional-Compositional Semantics Approach, 2014.
Emerging perspectives
• Graph-based data models + Distributional Semantic Models
(Word embeddings) have complementary semantic value.
• Graph-based Data Models:
– Facilitates querying, integration and rule-based reasoning.
• Distributional Semantic Models:
– Supports semantic approximation, coping with vocabulary variation.
Emerging perspectives
• AI systems require access to comprehensive background
knowledge for semantic interpretation tasks.
• Inheriting from Information Retrieval and Databases:
– General Indexing schemes,
– Particular Indexing schemes,
• Spatial, temporal, topological, probabilistic, causal, …
– Query planning,
– Data compression,
– Distribution,
– … even supporting hardware strategies.
Emerging perspectives
• One size of embedding does not fit all: Operate with
multiple distributional + compositional models for different
data model types (I, C, P), different domains and different
languages.
Effective Semantic Parsing
for Large KBs
The Vocabulary Problem
Barack
Obama
Sonia
Sotomayor
nominated
:is_a
First Supreme Court Justice of
Hispanic descent
The Vocabulary Problem
Barack
Obama
Sonia
Sotomayor
nominated
:is_a
First Supreme Court Justice of
Hispanic descent
Latino origins
selected
JudgeHigh
Obama
Last US president
Vocabulary Problem for KGs
Schema-agnostic
query mechanisms
Distributional
Inverted Index
Distributional-
Relational Model
Reference
Commonsense
corpora
Core semantic approximation
& composition operations
Semantic Parser
Query Plan
Scalable semantic
parsing
Learn to
Rank
Question Answers
Minimizing the Semantic Entropy
for the Semantic Matching
Definition of a semantic pivot: first query term to be resolved in the
database.
• Maximizes the reduction of the semantic configuration space.
• Less prone to more complex synonymic expressions and
abstraction-level differences.
• Semantic pivot serves as interpretation context for the remaining
alignments.
• proper nouns >> nouns >> complex nominals >> adjectives ,
verbs.
I P C
𝒒 = 𝒕Γ
𝟎, … , 𝒕Γ
𝒏
t h
0t m1
0
t m2
0
Γ= {𝑰, 𝑷, 𝑪, 𝑽}
…
lexical specificity # of senses lexical category
…
… …
- Vector neighborhood density
- Semantic differential
I P C
𝒒 = 𝒕Γ
𝟎, … , 𝒕Γ
𝒏
t h
0t m1
0
t m2
0
Γ= {𝑰, 𝑷, 𝑪, 𝑽}
…
lexical specificity # of senses lexical category
…
… …
𝜌
- Vector neighborhood density
- Semantic differential
I P C
𝒒 = 𝒕Γ
𝟎, … , 𝒕Γ
𝒏
t h
0t m1
0
t m2
0
Γ= {𝑰, 𝑷, 𝑪, 𝑽} …
lexical specificity # of senses lexical category
…
… …
Δ𝑠𝑟
Δ𝑟
Semantic pivoting
- Vector neighborhood density
- Semantic differential
- Distributional compositionality
I P C
𝒒 = 𝒕Γ
𝟎, … , 𝒕Γ
𝒏
t h
0t m1
0
t m2
0
Γ= {𝑰, 𝑷, 𝑪, 𝑽}
…
lexical specificity # of senses lexical category
…
… …
t h
0t m1
0
t m2
0
o t h
0t m1
0
t m1
0 =
… …
… …
Search and Composition
Operations
 Instance search
- Proper nouns
- String similarity + node cardinality
 Class (unary predicate) search
- Nouns, adjectives and adverbs
- String similarity + Distributional semantic relatedness
 Property (binary predicate) search
- Nouns, adjectives, verbs and adverbs
- Distributional semantic relatedness
 Navigation
 Extensional expansion
- Expands the instances associated with a class.
 Operator application
- Aggregations, conditionals, ordering, position
 Disjunction & Conjunction
 Disambiguation dialog (instance, predicate)
Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributional-
Compositional Semantics Approach, IUI 2014
What to expect (@ QALD1)
F1-Score: 0.72
MRR: 0.5
Freitas & Curry, Natural Language Queries over Heterogeneous Linked
Data Graphs, IUI (2014).
Software: StarGraph
• Semantic parsing.
https://github.com/Lambda-3/Stargraph
Freitas et al., Natural Language Queries over Heterogeneous Linked Data Graphs: A
Distributional-Compositional Semantics Approach, 2014.
Emerging perspectives
Semantic Parsing:
• Structured queries over KGs as explanations.
• Semantic pivoting heuristics.
• Diversity of distributional/compositional models as key.
• End-to-end vs componentised architectures.
Knowledge Graph
Completion
The Problem
Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
The Problem
Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
Formulating the Distributional-
Relational Representation
Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
Relation Paths
• Complex Inference patterns for composition.
Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
Representation of Relation Paths
Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
What to expect (PTransE@FB15K)
Relation Prediction
Natural Language
Inference
Recognizing and Justifying
Text Entailments (TE)
using Definition KGs
Distributional semantic relatedness as a
Selectivity Heuristics
Distributional
heuristics
target
source
answer
Distributional semantic relatedness as a
Selectivity Heuristics
Distributional
heuristics
target
source
answer
Distributional semantic relatedness as a
Selectivity Heuristics
Distributional
heuristics
target
source
answer
Pre-Processing
Abductive Inference
Generation
What to expect (TE@Boeing-Princeton-ISI)
F1-Score: 0.59
What to expect (TE@Guardian Headline Samples)
F1-Score: 0.53
Santos et al., Recognizing and Justifying Text Entailment through Distributional
Navigation on Definition Graphs, AAAI, 2018.
Explainable Findings
From Tensor Inferences Back to KGs
Explainable Findings
From Tensor Inferences Back to KGs
Emerging perspectives
• Distributional-relational models in KB completion
explored a large range of representation paradigms.
– Opportunity for exporting these representation models to other
tasks.
• Definition-based models can provide a corpus-viable,
low-data and explainable alternative to embedding-
based models.
Architecture
Entity Linking
Open IE
Taxonomy
Extraction
Integration
Arg. Classif.Co-reference
Resolution
KG Completion
Natural
Language
Inference
Named Entity
Recognition
Semantic
Parsing
KG Construction
Inference
Distributional
Semantics
Server
Query By
Example
Query
spatial
temporal
probabilistic
causal
Indexes
NL
Generation
NL Query
Answers
Explanations
Definition
Extraction
Entity Linking
Integration
Co-reference
Resolution
KG Completion
Natural
Language
Inference
Named Entity
Recognition
Semantic
Parsing
KG Construction
Inference
Distributional
Semantics
Server
Query By
Example
Query
spatial
temporal
probabilistic
causal
Indexes
NL
Generation
NL Query
Answers
Explanations
M
T
M
T
Open IE
Taxonomy
Extraction
Arg. Classif.
Definition
Extraction
Take
Home
Message
Take Home Message
• The evolution of methods, tools and the availability of data in
NLP creates the demand for knowledge representation models to
support complex AI systems.
• A relaxed version of RDF (RDF-NL) can provide this answer.
– Establishes a dialogue with a standard (with existing data).
– Inherits optimization aspects from Databases.
• Word-vectors (DSMs) + compositional models + RDF-NL.
• Moving beyond facts and taxonomies: rhetorical structures,
arguments, polarity, stories.
Take Home Message
• Syntactic and lexical features can go a long way for
structuring text.
– Context-preserving open information extraction.
• Integration (entity reconciliation) as semantic-best effort.
– Embrace schema on read.
• KGs can support explainable AI:
– Meeting point between extraction, reasoning and querying.
– Definition-based models.
• Inherit infrastructures from DB and IR.
Take Home Message
Opportunities:
• ML orchestrated pipelines with:
– Richer discourse-representation models.
– Explicit semantic representations (centered on KGs).
– Different compositional/distributional models (beyond W2V & Glove)
• KGs and impact on explainability.
• Quantifying domain and language transportability.
Acknowledgements
References
Vivian S. Silva, André Freitas, Siegfried Handschuh, Recognizing and Justifying Text Entailment through Distributional Navigation
on Definition Graphs, Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), New Orleans, USA, 2018 (pdf).
Matthias Cetto, Christina Niklaus, André Freitas and Siegfried Handschuh, Creating a Hierarchy of Semantically-Linked
Propositions in Open Information Extraction, In Proceedings of the 27th International Conference on Computational Linguistics
(COLING), New-Mexico, USA, 2018.
Christina Niklaus, Matthias Cetto, André Freitas and Siegfried Handschuh, A Survey on Open Information Extraction, In
Proceedings of the 27th International Conference on Computational Linguistics (COLING), New-Mexico, USA, 2018.
Siamak Barzegar, Brian Davis, Siegfried Handschuh, André Freitas, Multilingual Semantic Relatedness using Lightweight
Machine Translation, 12th IEEE International Conference on Semantic Computing (ICSC), USA, 2018 (pdf).
Siamak Barzegar, Brian Davis,, Manel Zarrouk, Siegfried Handschuh, André Freitas, SemR-11: A Multi-Lingual Gold-Standard for
Semantic Similarity and Relatedness for Eleven Languages, 11th Language Resources and Evaluation Conference (LREC),
Japan, 2018 (pdf).
Thomas Gaillat, Manel Zarrouk, André Freitas, Brian Davis, The SSIX Corpora: Three Gold Standard Corpora for Sentiment
Analysis in English, Spanish and German Financial Microblogs, 11th Language Resources and Evaluation Conference (LREC),
Japan, 2018 (pdf).
Juliano Efson Sales, Leonardo Souza, Siamak Barzegar, Brian Davis, André Freitas and Siegfried Handschuh, Indra: A Word
Embedding and Semantic Relatedness Server, 11th Language Resources and Evaluation Conference (LREC), Japan, 2018
(pdf).
Andre Freitas, Schema-agnostic queries over large-schema databases: a distributional semantics approach (pdf).
References
Vivian S. Silva, André Freitas, Siegfried Handschuh, Building a Knowledge Graph from Natural Language Definitions for
Interpretable Text Entailment Recognition, 11th Language Resources and Evaluation Conference (LREC), Japan, 2018 (pdf).
Juliano Efson Sales, André Freitas, Brian Davis, Siegfried Handschuh, A Compositional-Distributional Semantic Model for
Searching Complex Entity Categories, 5th Joint Conference on Lexical and Computational Semantics (*SEM), Berlin, 2016. (Full
Conference Paper) (pdf).
André Freitas, Siamak Barzegar, Juliano E. Sales, Siegfried Handschuh and Brian Davis, Semantic Relatedness for All
(Languages): A Comparative Analysis of Multilingual Semantic Relatedness using Machine Translation, 20th International
Conference on Knowledge Engineering and Knowledge Management (EKAW), 2016. (Full Conference Paper) (pdf).
Vivian S. Silva, André Freitas and Siegfried Handschuh, Supersense Word Tagging with Foundational Ontology Classes, 20th
International Conference on Knowledge Engineering and Knowledge Management (EKAW), 2016. (Full Conference Paper) (pdf).
Christina Niklaus, Bernhard Bermeitinger, Siegfried Handschuh, André Freitas, A Sentence Simplification System for Improving
Relation Extraction, 26th International Conference on Computational Linguistics, (COLING), Osaka, 2016. (Demonstration in
Proceedings) (pdf).
Vivian S. Silva, Siegfried Handschuh and André Freitas, Categorization of Semantic Roles for Dictionary Definitions, Cognitive
Aspects of the Lexicon (CogALex-V), Workshop at the 26th International Conference on Computational Linguistics, (COLING),
Osaka, 2016. (Full Workshop Paper) (pdf).
André Freitas, Juliano Efson Sales, Siegfried Handschuh, Edward Curry, How hard is the Query? Measuring the Semantic
Complexity of Schema-Agnostic Queries, In Proceedings of the 11th International Conference on Computational Semantics
(IWCS), London, 2015. (Full Conference Paper) (pdf).
André Freitas, Edward Curry, Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributional-
Compositional Semantics Approach, In Proceedings of the 19th International Conference on Intelligent User Interfaces (IUI),
Haifa, 2014. (Full Conference Paper) (pdf).
André Freitas, João Carlos Pereira Da Silva, Edward Curry, Paul Buitelaar, A Distributional Semantics Approach for Selective
Reasoning on Commonsense Graph Knowledge Bases, In Proceedings of the 19th International Conference on Applications of
Natural Language to Information Systems (NLDB), Montpellier, 2014. (Full Conference Paper) (pdf).

Contenu connexe

Tendances

Provenance and Reuse of Open Data (PILOD 2.0 June 2014)
Provenance and Reuse of Open Data (PILOD 2.0 June 2014)Provenance and Reuse of Open Data (PILOD 2.0 June 2014)
Provenance and Reuse of Open Data (PILOD 2.0 June 2014)
Rinke Hoekstra
 
Introduction to question answering for linked data & big data
Introduction to question answering for linked data & big dataIntroduction to question answering for linked data & big data
Introduction to question answering for linked data & big data
Andre Freitas
 
Semantics based Summarization of Entities in Knowledge Graphs
Semantics based Summarization of Entities in Knowledge GraphsSemantics based Summarization of Entities in Knowledge Graphs
Semantics based Summarization of Entities in Knowledge Graphs
Artificial Intelligence Institute at UofSC
 

Tendances (19)

Smart Data Webinar: Advances in Natural Language Processing I - Understanding
Smart Data Webinar: Advances in Natural Language Processing I - UnderstandingSmart Data Webinar: Advances in Natural Language Processing I - Understanding
Smart Data Webinar: Advances in Natural Language Processing I - Understanding
 
Neural Information Retrieval: In search of meaningful progress
Neural Information Retrieval: In search of meaningful progressNeural Information Retrieval: In search of meaningful progress
Neural Information Retrieval: In search of meaningful progress
 
Knowledge Representation on the Web
Knowledge Representation on the WebKnowledge Representation on the Web
Knowledge Representation on the Web
 
Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)
 
The Semantic Web: It's for Real
The Semantic Web: It's for RealThe Semantic Web: It's for Real
The Semantic Web: It's for Real
 
[DL Hacks 実装]A simple neural network module for relational reasoning
[DL Hacks 実装]A simple neural network module for relational reasoning[DL Hacks 実装]A simple neural network module for relational reasoning
[DL Hacks 実装]A simple neural network module for relational reasoning
 
4 semantic web and ontology
4 semantic web and ontology4 semantic web and ontology
4 semantic web and ontology
 
Semantic Web: The Inside Story
Semantic Web: The Inside StorySemantic Web: The Inside Story
Semantic Web: The Inside Story
 
Question answering
Question answeringQuestion answering
Question answering
 
An Ecosystem for Linked Humanities Data
An Ecosystem for Linked Humanities DataAn Ecosystem for Linked Humanities Data
An Ecosystem for Linked Humanities Data
 
Prov-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance VisualizationProv-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance Visualization
 
Provenance and Reuse of Open Data (PILOD 2.0 June 2014)
Provenance and Reuse of Open Data (PILOD 2.0 June 2014)Provenance and Reuse of Open Data (PILOD 2.0 June 2014)
Provenance and Reuse of Open Data (PILOD 2.0 June 2014)
 
Introduction to question answering for linked data & big data
Introduction to question answering for linked data & big dataIntroduction to question answering for linked data & big data
Introduction to question answering for linked data & big data
 
The Semantic Web
The Semantic WebThe Semantic Web
The Semantic Web
 
Open domain Question Answering System - Research project in NLP
Open domain  Question Answering System - Research project in NLPOpen domain  Question Answering System - Research project in NLP
Open domain Question Answering System - Research project in NLP
 
Lecture: Semantic Word Clouds
Lecture: Semantic Word CloudsLecture: Semantic Word Clouds
Lecture: Semantic Word Clouds
 
Semantics based Summarization of Entities in Knowledge Graphs
Semantics based Summarization of Entities in Knowledge GraphsSemantics based Summarization of Entities in Knowledge Graphs
Semantics based Summarization of Entities in Knowledge Graphs
 
User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...
User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...
User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...
 
Inverse Modeling for Cognitive Science "in the Wild"
Inverse Modeling for Cognitive Science "in the Wild"Inverse Modeling for Cognitive Science "in the Wild"
Inverse Modeling for Cognitive Science "in the Wild"
 

Similaire à Effective Semantics for Engineering NLP Systems

5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval
Bhaskar Mitra
 
Artificial Intelligence - A modern approach 3ed
Artificial Intelligence - A modern approach 3edArtificial Intelligence - A modern approach 3ed
Artificial Intelligence - A modern approach 3ed
RohanMistry15
 
Schema-agnositc queries over large-schema databases: a distributional semanti...
Schema-agnositc queries over large-schema databases: a distributional semanti...Schema-agnositc queries over large-schema databases: a distributional semanti...
Schema-agnositc queries over large-schema databases: a distributional semanti...
Andre Freitas
 

Similaire à Effective Semantics for Engineering NLP Systems (20)

Explanations in Dialogue Systems through Uncertain RDF Knowledge Bases
Explanations in Dialogue Systems through Uncertain RDF Knowledge BasesExplanations in Dialogue Systems through Uncertain RDF Knowledge Bases
Explanations in Dialogue Systems through Uncertain RDF Knowledge Bases
 
Artificial_Intelligence__A_Modern_Approach_Nho Vĩnh Share.pdf
Artificial_Intelligence__A_Modern_Approach_Nho Vĩnh Share.pdfArtificial_Intelligence__A_Modern_Approach_Nho Vĩnh Share.pdf
Artificial_Intelligence__A_Modern_Approach_Nho Vĩnh Share.pdf
 
Semantic Web: introduction & overview
Semantic Web: introduction & overviewSemantic Web: introduction & overview
Semantic Web: introduction & overview
 
Our World is Socio-technical
Our World is Socio-technicalOur World is Socio-technical
Our World is Socio-technical
 
The Role Of Ontology In Modern Expert Systems Dallas 2008
The Role Of Ontology In Modern Expert Systems   Dallas   2008The Role Of Ontology In Modern Expert Systems   Dallas   2008
The Role Of Ontology In Modern Expert Systems Dallas 2008
 
ONTOLOGY BASED DATA ACCESS
ONTOLOGY BASED DATA ACCESSONTOLOGY BASED DATA ACCESS
ONTOLOGY BASED DATA ACCESS
 
Graph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora LassilaGraph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora Lassila
 
Objective Fiction, i-semantics keynote
Objective Fiction, i-semantics keynoteObjective Fiction, i-semantics keynote
Objective Fiction, i-semantics keynote
 
Neural Models for Information Retrieval
Neural Models for Information RetrievalNeural Models for Information Retrieval
Neural Models for Information Retrieval
 
Semantic Integration for Heterogeneous Domain-specific Information: The NIF Case
Semantic Integration for Heterogeneous Domain-specific Information: The NIF CaseSemantic Integration for Heterogeneous Domain-specific Information: The NIF Case
Semantic Integration for Heterogeneous Domain-specific Information: The NIF Case
 
AI, Knowledge Representation and Graph Databases -
 Key Trends in Data Science
AI, Knowledge Representation and Graph Databases -
 Key Trends in Data ScienceAI, Knowledge Representation and Graph Databases -
 Key Trends in Data Science
AI, Knowledge Representation and Graph Databases -
 Key Trends in Data Science
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
 
ESWC 2011 BLOOMS+
ESWC 2011 BLOOMS+ ESWC 2011 BLOOMS+
ESWC 2011 BLOOMS+
 
5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval
 
4KN Editted 2012.ppt
4KN Editted 2012.ppt4KN Editted 2012.ppt
4KN Editted 2012.ppt
 
Artificial Intelligence - A modern approach 3ed
Artificial Intelligence - A modern approach 3edArtificial Intelligence - A modern approach 3ed
Artificial Intelligence - A modern approach 3ed
 
Semantic Interoperability - grafi della conoscenza
Semantic Interoperability - grafi della conoscenzaSemantic Interoperability - grafi della conoscenza
Semantic Interoperability - grafi della conoscenza
 
Schema-agnositc queries over large-schema databases: a distributional semanti...
Schema-agnositc queries over large-schema databases: a distributional semanti...Schema-agnositc queries over large-schema databases: a distributional semanti...
Schema-agnositc queries over large-schema databases: a distributional semanti...
 
Ontology
Ontology Ontology
Ontology
 
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
 

Plus de Andre Freitas

AI & Scientific Discovery in Oncology: Opportunities, Challenges & Trends
AI & Scientific Discovery in Oncology: Opportunities, Challenges & TrendsAI & Scientific Discovery in Oncology: Opportunities, Challenges & Trends
AI & Scientific Discovery in Oncology: Opportunities, Challenges & Trends
Andre Freitas
 
Schema-Agnostic Queries (SAQ-2015): Semantic Web Challenge
Schema-Agnostic Queries (SAQ-2015): Semantic Web ChallengeSchema-Agnostic Queries (SAQ-2015): Semantic Web Challenge
Schema-Agnostic Queries (SAQ-2015): Semantic Web Challenge
Andre Freitas
 
Introduction to Distributional Semantics
Introduction to Distributional SemanticsIntroduction to Distributional Semantics
Introduction to Distributional Semantics
Andre Freitas
 

Plus de Andre Freitas (20)

AI & Scientific Discovery in Oncology: Opportunities, Challenges & Trends
AI & Scientific Discovery in Oncology: Opportunities, Challenges & TrendsAI & Scientific Discovery in Oncology: Opportunities, Challenges & Trends
AI & Scientific Discovery in Oncology: Opportunities, Challenges & Trends
 
AI Systems @ Manchester
AI Systems @ ManchesterAI Systems @ Manchester
AI Systems @ Manchester
 
SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on Financial Microblogs ...
SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on Financial Microblogs ...SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on Financial Microblogs ...
SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on Financial Microblogs ...
 
Semantic Relation Classification: Task Formalisation and Refinement
Semantic Relation Classification: Task Formalisation and RefinementSemantic Relation Classification: Task Formalisation and Refinement
Semantic Relation Classification: Task Formalisation and Refinement
 
Categorization of Semantic Roles for Dictionary Definitions
Categorization of Semantic Roles for Dictionary DefinitionsCategorization of Semantic Roles for Dictionary Definitions
Categorization of Semantic Roles for Dictionary Definitions
 
Word Tagging with Foundational Ontology Classes
Word Tagging with Foundational Ontology ClassesWord Tagging with Foundational Ontology Classes
Word Tagging with Foundational Ontology Classes
 
WiSS Challenge - Day 2
WiSS Challenge - Day 2WiSS Challenge - Day 2
WiSS Challenge - Day 2
 
WISS QA Do it yourself Question answering over Linked Data
WISS QA Do it yourself Question answering over Linked DataWISS QA Do it yourself Question answering over Linked Data
WISS QA Do it yourself Question answering over Linked Data
 
Schema-Agnostic Queries (SAQ-2015): Semantic Web Challenge
Schema-Agnostic Queries (SAQ-2015): Semantic Web ChallengeSchema-Agnostic Queries (SAQ-2015): Semantic Web Challenge
Schema-Agnostic Queries (SAQ-2015): Semantic Web Challenge
 
How hard is this Query? Measuring the Semantic Complexity of Schema-agnostic ...
How hard is this Query? Measuring the Semantic Complexity of Schema-agnostic ...How hard is this Query? Measuring the Semantic Complexity of Schema-agnostic ...
How hard is this Query? Measuring the Semantic Complexity of Schema-agnostic ...
 
Semantics at Scale: A Distributional Approach
Semantics at Scale: A Distributional ApproachSemantics at Scale: A Distributional Approach
Semantics at Scale: A Distributional Approach
 
A Semantic Web Platform for Automating the Interpretation of Finite Element ...
A Semantic Web Platform for Automating the Interpretation of Finite Element ...A Semantic Web Platform for Automating the Interpretation of Finite Element ...
A Semantic Web Platform for Automating the Interpretation of Finite Element ...
 
How Semantic Technologies can help to cure Hearing Loss?
How Semantic Technologies can help to cure Hearing Loss?How Semantic Technologies can help to cure Hearing Loss?
How Semantic Technologies can help to cure Hearing Loss?
 
Towards a Distributional Semantic Web Stack
Towards a Distributional Semantic Web StackTowards a Distributional Semantic Web Stack
Towards a Distributional Semantic Web Stack
 
On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study
On the Semantic Mapping of Schema-agnostic Queries: A Preliminary StudyOn the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study
On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study
 
Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...
Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...
Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...
 
Introduction to Distributional Semantics
Introduction to Distributional SemanticsIntroduction to Distributional Semantics
Introduction to Distributional Semantics
 
On the Semantic Representation and Extraction of Complex Category Descriptors
On the Semantic Representation and Extraction of Complex Category DescriptorsOn the Semantic Representation and Extraction of Complex Category Descriptors
On the Semantic Representation and Extraction of Complex Category Descriptors
 
A Distributional Semantics Approach for Selective Reasoning on Commonsense Gr...
A Distributional Semantics Approach for Selective Reasoning on Commonsense Gr...A Distributional Semantics Approach for Selective Reasoning on Commonsense Gr...
A Distributional Semantics Approach for Selective Reasoning on Commonsense Gr...
 
Coping with Data Variety in the Big Data Era: The Semantic Computing Approach
Coping with Data Variety in the Big Data Era: The Semantic Computing ApproachCoping with Data Variety in the Big Data Era: The Semantic Computing Approach
Coping with Data Variety in the Big Data Era: The Semantic Computing Approach
 

Dernier

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Dernier (20)

ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 

Effective Semantics for Engineering NLP Systems

  • 1. Effective Semantics for Engineering NLP Systems André Freitas Lancaster, May 2018
  • 2. Goals of this Talk Provide a synthesis of the emerging representation trends behind NLP systems. Shift in perspective: • Effective engineering (task driven, scalable) instead of sound formalism. • Best-effort representation.
  • 3.
  • 4. Outline • Knowledge Graphs (Frege revisited) • Information Extraction & Text Classification • Distributional Semantic Models • Knowledge Graphs & Distributional Semantics – (Distributional-Relational Models) • Applications of DRMs – KG Completion – Semantic Parsing – Natural Language Inference
  • 5. “On our best behaviour” “We need to return to our roots in Knowledge Representation and Reasoning for language and from language.” Levesque, 2013 “We should not treat English text as a monolithic source of information.” “Instead, we should carefully study how simple knowledge bases might be used to make sense of the simple language needed to build slightly more complex knowledge bases…”
  • 7. Some Perspectives on “What” “The Knowledge Graph is a knowledge base used by Google to enhance its search engine's search results.” “A Knowledge graph (i) mainly describes real world entities and interrelations, organized in a graph (ii) defines possible classes and relations of entities in a schema (iii) allows potentially interrelating arbitrary entities with each other…” [Paulheim H.] “We define a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o)….” [Pujara J. al al.] KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
  • 8. • Open world representation of information. • Every entry point is equal cost. • Underpin Cortana, Google Assistant, Siri, Alexa. • Typically (but doesn’t have to be) expressed in RDF. • No longer a solution in search of a problem! Dan Bennett, TR Some Perspectives on “What”
  • 9. • “Knowledge is Power” Hypothesis (the Knowledge Principle): “If a program is to perform a complex task well, it must know a great deal about the world in which it operates.” • The Breadth Hypothesis: “To behave intelligently in unexpected situations, an agent must be capable of falling back on increasingly general knowledge.” KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014 Some Perspectives on “Why”
  • 10. • We’re surrounded by entities, which are connected by relations. • We need to store them somehow, e.g., using a DB or a graph. • Graphs can be processed efficiently and offer a convenient abstraction. Some Perspectives on “Why” KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
  • 11. Some Perspectives on “Why” • Knowledge models such as Linked Data and many problems in machine learning have a natural representation as relational data. • Relations between entities are often more important for a prediction task than attributes. • For instance, can be easier to predict the party of a vice- president from the party of his president than from his attributes. [Koopman, 2010]
  • 13. Open Information Extraction • Extracting unstructured facts from text. • TextRunner [Banko et al., IJCAI ’07], WOE [Wu & Weld, ACL ‘10]. • ReVerb [Fader et al., EMNLP ‘11]. • OLLIE [Mausam et al., EMNLP ‘12]. • OpenIE [Mausam et al., IJCAI ‘16]. • Graphene [Niklaus et al, COLING 17].
  • 14. Graphene • Captures contextual relations. • Extends the default Open IE representation in order to capture inter-proposition relationships. • Include rhetorical relations. Cetto et al., Creating a Hierarchy of Semantically-Linked Propositions in Open Information Extraction, COLING (2018). Niklaus et al., A Sentence Simplification System for Improving Relation Extraction, COLING (2017)
  • 19.
  • 25. Asian stocks fell anew and the yen rose to session highs in the afternoon as worries about North Korea simmered, after a senior Pyongyang official said the U.S. is becoming ``more vicious and more aggressive'' under President Donald Trump . Asian stocks fell anew The yen rose to session highs in the afternoon spatial attribution after Worries simmered about North Korea The U.S. is becoming becoming `` more vicious and more aggressive '' under Donald Trump A senior Pyongyang official said background and
  • 26.
  • 27.
  • 28.
  • 29. Precision: Recall: Improving Open Relation Extraction using Clausal and Phrasal Disembedding, Under Review, (2017) What to expect? (Wikipedia & Newswire)
  • 30. https://github.com/Lambda-3/Graphene Niklaus et al., A Sentence Simplification System for Improving Relation Extraction, COLING (2017) Software: Extracting Knowledge Graphs from Text
  • 31. Argumentation Structures Stab & Gurevych, Parsing Argumentation Structures in Persuasive Essays, 2016.
  • 35. Argument Mining Approaches What to expect? F1-score: 0.74 Stab & Gurevych, Parsing Argumentation Structures in Persuasive Essays, 2016.
  • 37. Semantic Roles for Lexical Definitions Aristotle’s classic theory of definition introduced important aspects such as the genus-differentia definition pattern and the essential/non-essential property differentiation.
  • 39. Data: WordNetGraph Silva et al., Categorization of Semantic Roles for Dictionary Definitions. Cognitive Aspects of the Lexicon CogALex@COLING, 2017. https://github.com/Lambda-3/WordnetGraph RDF graph generated from WordNet.
  • 40. Emerging perspectives • The evolution of parsing and classification methods in NLP is inducing a new lightweight semantic representation. • This representation dialogues with elements from logics, linguistics and the Semantic/Linked Data Web (especially RDF). • However, they relax the semantic constraints of previous models (which were operating under assumptions for deductive reasoning or databases).
  • 41. Emerging perspectives • Knowledge graphs as lexical semantic models operating under a semantic best-effort mode (canonical identifiers when possible, otherwise, words). • Possibly closer to the surface form of the text. • Priority is on segmenting, categorizing and when possible, integrating. • A representation (data model) convenient for AI engineering.
  • 42. Categorization A fact (main clause): * Can be a taxonomic fact. s p o term, URI term, URI term, URI instance, class, triple type, property, schema property instance, class, triple
  • 43. Categorization A fact with a context: s0 p0 o0 p1 o1 reification e.g. • subordination (modality, temporality, spatiality, RSTs) • fact probability • polarity
  • 44. Categorization Coordinated facts: s0 p0 o0 s1 p1 o1 p2 e.g. • coordination • RSTs • ADU https://github.com/Lambda-3/Graphene/blob/master/wiki/RDFNL- Format.md RDF-NL
  • 45. Knowledge Graphs & Distributional Semantics (A marriage made in heaven?)
  • 47. • Computational models that build contextual semantic representations from corpus data. • Semantic context is represented by a vector. • Vectors are obtained through the statistical analysis of the linguistic contexts of a word. • Salience of contexts (cf. context weighting scheme). • Semantic similarity/relatedness as the core operation over the model. Distributional Semantic Models (Word Vector Models)
  • 48. Distributional Semantics as Commonsense Knowledge Commonsense is here θ car dog cat bark run leash Semantic Approximation is here Semantic Model with low acquisition effort
  • 49. Context Weighting Measures Kiela & Clark, 2014 Similarity Measures x … and of course, Glove and W2V
  • 50. Distributional-Relational Models Distributional Relational Networks, AAAI Symposium (2013). A Compositional-Distributional Semantic Model for Searching Complex Entity Categories, ACL *SEM (2016) Barack Obama Sonia Sotomayor nominated :is_a First Supreme Court Justice of Hispanic descent … LSA, ESA, W2V, GLOVE, … s0 p0 o0
  • 52. Building on Word Vector Space Models • But how can we represent the meaning of longer phrases? • By mapping them into the same vector space! the country of my birth the place where I was born
  • 53. How should we map phrases into a vector space? Recursive Neural Networks
  • 54. Mixture vs Function A Compositional-Distributional Semantic Model for Searching Complex Entity Categories, *SEM (2016)
  • 55. Recursive vs recurrent neural networks 5
  • 56. Segmented Spaces vs Unified Space s0 p0 o0 s0 p0 o0 • Assumes is <s,p,o> naturally irreconcilable. • Inherent dimensional reduction mechanism. • Facilitates the specialization of embedding-based approximations. • Easier to compute identity. • Requires complex and high- dimensional tensorial model.
  • 57. Software: Indra • Semantic approximation server • Multi-lingual (12 languages) • Multi-domain • Different compositional models https://github.com/Lambda-3/indra Semantic Relatedness for All (Languages): A Comparative Analysis of Multilingual Semantic Relatedness using Machine Translation, EKAW, (2016).
  • 58. “On our best behaviour” “It is not enough to build knowledge bases without paying closer attention to the demands arising from their use.” Levesque, 2013 “We should explore more thoroughly the space of computations between fact retrieval and full automated logical reasoning.”
  • 59. How to access Distributional- Knowledge Graphs efficiently? • Depends on the target operations in the Knowledge Graphs (more on this later).
  • 60. How to access Distributional- Knowledge Graphs efficiently? s0 p0 o0 s0 q Inverted index sharding disk access optimization … Multiple Randomized K-d Tree Algorithm The Priority Search K-Means Tree algorithm Database + IR Query planning Cardinality Indexing Skyline Bitmap indexes … Structured Queries Approximation Queries
  • 61. How to access Distributional- Knowledge Graphs efficiently? s0 p0 o0 Database + IR Structured Queries Approximation Queries
  • 62. Software: StarGraph • Distributional Knowledge Graph Database. • Word embedding Database. https://github.com/Lambda-3/Stargraph Freitas et al., Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributional-Compositional Semantics Approach, 2014.
  • 63. Emerging perspectives • Graph-based data models + Distributional Semantic Models (Word embeddings) have complementary semantic value. • Graph-based Data Models: – Facilitates querying, integration and rule-based reasoning. • Distributional Semantic Models: – Supports semantic approximation, coping with vocabulary variation.
  • 64. Emerging perspectives • AI systems require access to comprehensive background knowledge for semantic interpretation tasks. • Inheriting from Information Retrieval and Databases: – General Indexing schemes, – Particular Indexing schemes, • Spatial, temporal, topological, probabilistic, causal, … – Query planning, – Data compression, – Distribution, – … even supporting hardware strategies.
  • 65. Emerging perspectives • One size of embedding does not fit all: Operate with multiple distributional + compositional models for different data model types (I, C, P), different domains and different languages.
  • 68. The Vocabulary Problem Barack Obama Sonia Sotomayor nominated :is_a First Supreme Court Justice of Hispanic descent Latino origins selected JudgeHigh Obama Last US president
  • 69. Vocabulary Problem for KGs Schema-agnostic query mechanisms
  • 70. Distributional Inverted Index Distributional- Relational Model Reference Commonsense corpora Core semantic approximation & composition operations Semantic Parser Query Plan Scalable semantic parsing Learn to Rank Question Answers
  • 71. Minimizing the Semantic Entropy for the Semantic Matching Definition of a semantic pivot: first query term to be resolved in the database. • Maximizes the reduction of the semantic configuration space. • Less prone to more complex synonymic expressions and abstraction-level differences. • Semantic pivot serves as interpretation context for the remaining alignments. • proper nouns >> nouns >> complex nominals >> adjectives , verbs.
  • 72. I P C 𝒒 = 𝒕Γ 𝟎, … , 𝒕Γ 𝒏 t h 0t m1 0 t m2 0 Γ= {𝑰, 𝑷, 𝑪, 𝑽} … lexical specificity # of senses lexical category … … …
  • 73. - Vector neighborhood density - Semantic differential I P C 𝒒 = 𝒕Γ 𝟎, … , 𝒕Γ 𝒏 t h 0t m1 0 t m2 0 Γ= {𝑰, 𝑷, 𝑪, 𝑽} … lexical specificity # of senses lexical category … … … 𝜌
  • 74. - Vector neighborhood density - Semantic differential I P C 𝒒 = 𝒕Γ 𝟎, … , 𝒕Γ 𝒏 t h 0t m1 0 t m2 0 Γ= {𝑰, 𝑷, 𝑪, 𝑽} … lexical specificity # of senses lexical category … … … Δ𝑠𝑟 Δ𝑟 Semantic pivoting
  • 75. - Vector neighborhood density - Semantic differential - Distributional compositionality I P C 𝒒 = 𝒕Γ 𝟎, … , 𝒕Γ 𝒏 t h 0t m1 0 t m2 0 Γ= {𝑰, 𝑷, 𝑪, 𝑽} … lexical specificity # of senses lexical category … … … t h 0t m1 0 t m2 0 o t h 0t m1 0 t m1 0 = … … … …
  • 76. Search and Composition Operations  Instance search - Proper nouns - String similarity + node cardinality  Class (unary predicate) search - Nouns, adjectives and adverbs - String similarity + Distributional semantic relatedness  Property (binary predicate) search - Nouns, adjectives, verbs and adverbs - Distributional semantic relatedness  Navigation  Extensional expansion - Expands the instances associated with a class.  Operator application - Aggregations, conditionals, ordering, position  Disjunction & Conjunction  Disambiguation dialog (instance, predicate) Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributional- Compositional Semantics Approach, IUI 2014
  • 77. What to expect (@ QALD1) F1-Score: 0.72 MRR: 0.5 Freitas & Curry, Natural Language Queries over Heterogeneous Linked Data Graphs, IUI (2014).
  • 78. Software: StarGraph • Semantic parsing. https://github.com/Lambda-3/Stargraph Freitas et al., Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributional-Compositional Semantics Approach, 2014.
  • 79. Emerging perspectives Semantic Parsing: • Structured queries over KGs as explanations. • Semantic pivoting heuristics. • Diversity of distributional/compositional models as key. • End-to-end vs componentised architectures.
  • 81. The Problem Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  • 82. The Problem Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  • 83. Formulating the Distributional- Relational Representation Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  • 84. Relation Paths • Complex Inference patterns for composition. Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  • 85. Representation of Relation Paths Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  • 86. What to expect (PTransE@FB15K) Relation Prediction
  • 88. Recognizing and Justifying Text Entailments (TE) using Definition KGs
  • 89. Distributional semantic relatedness as a Selectivity Heuristics Distributional heuristics target source answer
  • 90. Distributional semantic relatedness as a Selectivity Heuristics Distributional heuristics target source answer
  • 91. Distributional semantic relatedness as a Selectivity Heuristics Distributional heuristics target source answer
  • 95. What to expect (TE@Boeing-Princeton-ISI) F1-Score: 0.59 What to expect (TE@Guardian Headline Samples) F1-Score: 0.53 Santos et al., Recognizing and Justifying Text Entailment through Distributional Navigation on Definition Graphs, AAAI, 2018.
  • 96. Explainable Findings From Tensor Inferences Back to KGs
  • 97. Explainable Findings From Tensor Inferences Back to KGs
  • 98.
  • 99.
  • 100. Emerging perspectives • Distributional-relational models in KB completion explored a large range of representation paradigms. – Opportunity for exporting these representation models to other tasks. • Definition-based models can provide a corpus-viable, low-data and explainable alternative to embedding- based models.
  • 102. Entity Linking Open IE Taxonomy Extraction Integration Arg. Classif.Co-reference Resolution KG Completion Natural Language Inference Named Entity Recognition Semantic Parsing KG Construction Inference Distributional Semantics Server Query By Example Query spatial temporal probabilistic causal Indexes NL Generation NL Query Answers Explanations Definition Extraction
  • 103. Entity Linking Integration Co-reference Resolution KG Completion Natural Language Inference Named Entity Recognition Semantic Parsing KG Construction Inference Distributional Semantics Server Query By Example Query spatial temporal probabilistic causal Indexes NL Generation NL Query Answers Explanations M T M T Open IE Taxonomy Extraction Arg. Classif. Definition Extraction
  • 105. Take Home Message • The evolution of methods, tools and the availability of data in NLP creates the demand for knowledge representation models to support complex AI systems. • A relaxed version of RDF (RDF-NL) can provide this answer. – Establishes a dialogue with a standard (with existing data). – Inherits optimization aspects from Databases. • Word-vectors (DSMs) + compositional models + RDF-NL. • Moving beyond facts and taxonomies: rhetorical structures, arguments, polarity, stories.
  • 106. Take Home Message • Syntactic and lexical features can go a long way for structuring text. – Context-preserving open information extraction. • Integration (entity reconciliation) as semantic-best effort. – Embrace schema on read. • KGs can support explainable AI: – Meeting point between extraction, reasoning and querying. – Definition-based models. • Inherit infrastructures from DB and IR.
  • 107. Take Home Message Opportunities: • ML orchestrated pipelines with: – Richer discourse-representation models. – Explicit semantic representations (centered on KGs). – Different compositional/distributional models (beyond W2V & Glove) • KGs and impact on explainability. • Quantifying domain and language transportability.
  • 109. References Vivian S. Silva, André Freitas, Siegfried Handschuh, Recognizing and Justifying Text Entailment through Distributional Navigation on Definition Graphs, Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), New Orleans, USA, 2018 (pdf). Matthias Cetto, Christina Niklaus, André Freitas and Siegfried Handschuh, Creating a Hierarchy of Semantically-Linked Propositions in Open Information Extraction, In Proceedings of the 27th International Conference on Computational Linguistics (COLING), New-Mexico, USA, 2018. Christina Niklaus, Matthias Cetto, André Freitas and Siegfried Handschuh, A Survey on Open Information Extraction, In Proceedings of the 27th International Conference on Computational Linguistics (COLING), New-Mexico, USA, 2018. Siamak Barzegar, Brian Davis, Siegfried Handschuh, André Freitas, Multilingual Semantic Relatedness using Lightweight Machine Translation, 12th IEEE International Conference on Semantic Computing (ICSC), USA, 2018 (pdf). Siamak Barzegar, Brian Davis,, Manel Zarrouk, Siegfried Handschuh, André Freitas, SemR-11: A Multi-Lingual Gold-Standard for Semantic Similarity and Relatedness for Eleven Languages, 11th Language Resources and Evaluation Conference (LREC), Japan, 2018 (pdf). Thomas Gaillat, Manel Zarrouk, André Freitas, Brian Davis, The SSIX Corpora: Three Gold Standard Corpora for Sentiment Analysis in English, Spanish and German Financial Microblogs, 11th Language Resources and Evaluation Conference (LREC), Japan, 2018 (pdf). Juliano Efson Sales, Leonardo Souza, Siamak Barzegar, Brian Davis, André Freitas and Siegfried Handschuh, Indra: A Word Embedding and Semantic Relatedness Server, 11th Language Resources and Evaluation Conference (LREC), Japan, 2018 (pdf). Andre Freitas, Schema-agnostic queries over large-schema databases: a distributional semantics approach (pdf).
  • 110. References Vivian S. Silva, André Freitas, Siegfried Handschuh, Building a Knowledge Graph from Natural Language Definitions for Interpretable Text Entailment Recognition, 11th Language Resources and Evaluation Conference (LREC), Japan, 2018 (pdf). Juliano Efson Sales, André Freitas, Brian Davis, Siegfried Handschuh, A Compositional-Distributional Semantic Model for Searching Complex Entity Categories, 5th Joint Conference on Lexical and Computational Semantics (*SEM), Berlin, 2016. (Full Conference Paper) (pdf). André Freitas, Siamak Barzegar, Juliano E. Sales, Siegfried Handschuh and Brian Davis, Semantic Relatedness for All (Languages): A Comparative Analysis of Multilingual Semantic Relatedness using Machine Translation, 20th International Conference on Knowledge Engineering and Knowledge Management (EKAW), 2016. (Full Conference Paper) (pdf). Vivian S. Silva, André Freitas and Siegfried Handschuh, Supersense Word Tagging with Foundational Ontology Classes, 20th International Conference on Knowledge Engineering and Knowledge Management (EKAW), 2016. (Full Conference Paper) (pdf). Christina Niklaus, Bernhard Bermeitinger, Siegfried Handschuh, André Freitas, A Sentence Simplification System for Improving Relation Extraction, 26th International Conference on Computational Linguistics, (COLING), Osaka, 2016. (Demonstration in Proceedings) (pdf). Vivian S. Silva, Siegfried Handschuh and André Freitas, Categorization of Semantic Roles for Dictionary Definitions, Cognitive Aspects of the Lexicon (CogALex-V), Workshop at the 26th International Conference on Computational Linguistics, (COLING), Osaka, 2016. (Full Workshop Paper) (pdf). André Freitas, Juliano Efson Sales, Siegfried Handschuh, Edward Curry, How hard is the Query? Measuring the Semantic Complexity of Schema-Agnostic Queries, In Proceedings of the 11th International Conference on Computational Semantics (IWCS), London, 2015. (Full Conference Paper) (pdf). André Freitas, Edward Curry, Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributional- Compositional Semantics Approach, In Proceedings of the 19th International Conference on Intelligent User Interfaces (IUI), Haifa, 2014. (Full Conference Paper) (pdf). André Freitas, João Carlos Pereira Da Silva, Edward Curry, Paul Buitelaar, A Distributional Semantics Approach for Selective Reasoning on Commonsense Graph Knowledge Bases, In Proceedings of the 19th International Conference on Applications of Natural Language to Information Systems (NLDB), Montpellier, 2014. (Full Conference Paper) (pdf).