André Freitas
Viktor Schlegel
Building AI Applications
using
Knowledge Graphs
TWC / WWW 2018
Lyon
Organisation
• Goals of this Tutorial:
– Provide a broad view of the multiple perspectives
underlying knowledge graphs.
– Show knowledge graphs as a foundation for building AI
systems.
• Method:
– Focus on the contemporary and emerging perspectives.
– Sampling exemplar approaches and infrastructures on
each of these emerging perspectives (not an exhaustive
survey).
Disclaimer
• Not a standard academic tutorial.
• Big picture, not a survey.
• Focuses on principles and systems.
• Biased!
Other Tutorials On KGs
(Complementary perspective)
Bordes & Gabrilovich, Constructing and Mining Web-scale Knowledge Graphs,
KDD 2014 Tutorial.
Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015.
Mottin & Lissandrini, New Trends on Exploratory Methods for Data Analytics,
VLDB 2017 Tutorial.
Ren, Su & Yan, Construction and Querying of Large-scale Knowledge Bases,
CIKM 2017 Tutorial.
Outline
1. What is a Knowledge Graph?
2. Building a Knowledge Graph
3. Querying Knowledge Graphs
4. Inferences over Knowledge Graphs
5. Uses of Knowledge Graphs
What is a Knowledge
Graph?
Some Perspectives on “What”
“The Knowledge Graph is a knowledge base used by Google to enhance
its search engine's search results with semantic-search information
gathered from a wide variety of sources.”
“A Knowledge graph (i) mainly describes real world entities and
interrelations, organized in a graph (ii) defines possible classes and
relations of entities in a schema (iii) allows potentially interrelating arbitrary
entities with each other…” [Paulheim H.]
“We defines a Knowledge Graph as an RDF graph consists of a set of RDF
triples where each RDF triple (s,p,o) is an ordered set of following RDF
term ….” [Pujara J. et al.]
KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
• Open world representation of information.
• Every entry point is equal cost.
• Underpin Cortana, Google Assistant, Siri, Alexa.
• Typically (but doesn’t have to be) expressed in RDF.
• No longer a solution in search of a problem!
Dan Bennett, Thomson Reuters
Some Perspectives on “What”
Defining KG by Example
KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
• “Knowledge is Power” Hypothesis (the Knowledge
Principle): “If a program is to perform a complex task
well, it must know a great deal about the world in
which it operates.”
• The Breadth Hypothesis: “To behave intelligently in
unexpected situations, an agent must be capable of
falling back on increasingly general knowledge.”
KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
Some Perspectives on “Why”
• We’re surrounded by entities, which are connected by
relations.
• We need to store them somehow, e.g., using a DB or a
graph.
• Graphs can be processed efficiently and offer a
convenient abstraction.
Some Perspectives on “Why”
KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
Some Perspectives on “Why”
• Knowledge models such as Linked Data and many
problems in machine learning have a natural
representation as relational data.
• Relations between entities are often more important for a
prediction task than attributes.
• For instance, it can be easier to predict the party of a vice-president
from the party of the president than from the vice-president's own
attributes.
[Koopman, 2010]
Schema on Write
• Fixed data model
• Slow to change
• Strong enforcement
Schema on Read
• Capture everything
• Apply logic (schema) on read
• No standards
The Data Management Perspective
Dan Bennett, Thomson Reuters
From Closed to Open
Communication
Intuitions on the Connection
Between KGs and AI
AI Archetypal Problem
• Question Answering: Develop an algorithmic
approach which answers natural language
queries no matter how the data or the question
are expressed in natural language.
From Text to Structure
Regularities in Natural
Language
Structural/Logical Form
Rephrasing it
Now we can answer this
query
It computes!
Extrapolating
Breaking away from the linear
order imposed by the medium
What did we get?
• By integrating terms (mapping to a canonical form) we
reduced complexity.
– ‘Atomization’ of knowledge allows integration.
• Entities, attributes and relationships are now explicit
and interconnected.
– Predicate-argument structures.
• Ability to focus (“query”) and operate on specific sets
and relations.
• Entities are organized into hierarchies of sets.
– With that we can express knowledge at different abstraction
levels (make generalizations).
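The gains listed above can be illustrated with a tiny in-memory triple store. This is a minimal sketch under stated assumptions, not the tutorial's implementation: the identifiers (`barack_obama`, `is_a`, `subclass_of`, ...) and the helper functions `match`, `superclasses` and `types_of` are hypothetical names chosen for illustration.

```python
# Hypothetical toy graph: every fact is an atomic (subject, predicate, object) triple.
TRIPLES = {
    ("barack_obama", "nominated", "sonia_sotomayor"),
    ("sonia_sotomayor", "is_a", "supreme_court_justice"),
    ("supreme_court_justice", "subclass_of", "judge"),
    ("judge", "subclass_of", "person"),
}


def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def superclasses(cls):
    """Walk subclass_of edges upwards to collect all more general classes."""
    found = []
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for _, _, parent in match(s=current, p="subclass_of"):
            if parent not in found:
                found.append(parent)
                frontier.append(parent)
    return found


def types_of(entity):
    """Direct classes of an entity plus everything they generalize to."""
    result = []
    for _, _, cls in match(s=entity, p="is_a"):
        result.append(cls)
        result.extend(c for c in superclasses(cls) if c not in result)
    return result
```

Here `match(p="nominated")` focuses on one relation, while `types_of("sonia_sotomayor")` climbs the class hierarchy to reach more abstract descriptions of the same entity.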
Building Knowledge Graphs
“On our best behaviour”
“We need to return to our roots in Knowledge
Representation and Reasoning for language and from
language.”
Levesque, 2013
“We should not treat English text as a monolithic
source of information. Instead, we should carefully
study how simple knowledge bases might be used to
make sense of the simple language needed to build
slightly more complex knowledge bases, and so on.”
Open Information Extraction
• Extracting unstructured facts from text.
• TextRunner [Banko et al., IJCAI ’07], WOE [Wu & Weld, ACL ’10].
• ReVerb [Fader et al., EMNLP ’11].
• OLLIE [Mausam et al., EMNLP ’12].
• OpenIE [Mausam et al., IJCAI ’16].
• Graphene [Niklaus et al., COLING ’17].
Graphene
• Captures contextual relations.
• Extends the default Open IE representation in
order to capture inter-proposition relationships.
• Increases the informativeness and
expressiveness of the extracted tuples.
Niklaus et al., A Sentence Simplification System for Improving Relation Extraction, COLING (2017)
Transformation Stage
Rhetorical Relations
Extracting Rhetorical
Relations
Clausal & Phrasal
Disembedding
Pipeline: Input Document → Transformation Stage → Relation Extraction → Output
Input:
Asian stocks fell anew and the yen rose to session highs
in the afternoon as worries about North Korea simmered,
after a senior Pyongyang official said the U.S. is
becoming ``more vicious and more aggressive'' under
President Donald Trump.
Extracted propositions:
• Asian stocks fell anew
• The yen rose to session highs in the afternoon
• Worries simmered about North Korea
• A senior Pyongyang official said
• The U.S. is becoming ``more vicious and more aggressive'' under Donald Trump
Relations linking the propositions (as labelled in the figure): and, background, after, attribution, spatial
Precision:
Recall:
Improving Open Relation Extraction using Clausal
and Phrasal Disembedding, Under Review, (2017)
What to expect
(Wikipedia & Newswire)
https://github.com/Lambda-3/Graphene
Niklaus et al., A Sentence Simplification System for Improving Relation Extraction, COLING (2017)
Software: Extracting Knowledge
Graphs from Text
https://github.com/knowitall/openie/
Harinder & Mausam. "Demonyms and Compound Relational Nouns in Nominal Open IE". Workshop
on Automated Knowledge Base Construction (AKBC) at NAACL. San Diego, CA, USA. June 2016.
Mausam. "Open Information Extraction Systems and Downstream Applications". Invited Paper for
Early Career Spotlight Track. International Joint Conference on Artificial Intelligence (IJCAI). New
York, NY. July 2016.
Software: OpenIE 4.0
Frame-based Extraction
(Ontology Grounded)
Background theories:
• Combinatory Categorial Grammar [C&C],
• Discourse Representation Theory [DRT, Boxer],
• Frame Semantics [Fillmore 1976]
• Ontology Design Patterns [Ontology Handbook].
Frameworks:
• Named Entity Resolution [Stanbol, TagMe],
• Coreference Resolution [CoreNLP]
• Word Sense Disambiguation [Boxer, IMS].
Gangemi et al., Semantic Web Machine Reading with FRED, Semantic Web Journal, 2017
Frame-based Extraction
(Ontology Grounded)
N-ary relation and Event extraction by using frame detection
Relation extraction between frames, events, concepts and entities
Negation representation
Modality representation
Adjective semantics
Temporal relation extraction from tense expressions
Semantic annotation of text fragments
Coreference resolution
Type and taxonomy induction
Incremental role propagation
Entity linking to Semantic Web data
Word-sense disambiguation
Pattern-based subgraph extraction
Named-graph generation
Gangemi et al., Semantic Web Machine Reading with FRED, Semantic Web Journal, 2017
Frame-based Extraction
The New York Times reported that John McCarthy died. He invented the programming language LISP.
Gangemi et al., Semantic Web Machine Reading with FRED, Semantic Web Journal, 2017
http://wit.istc.cnr.it/stlab-tools/fred/
Gangemi et al., Semantic Web Machine Reading with FRED, Semantic Web Journal, 2017
Software: FRED
Argumentation Structures
Stab & Gurevych, Parsing Argumentation Structures in Persuasive Essays, 2016.
Argumentative Discourse
Unit Classification
Argumentation &
Rhetorical Relations
• Support: background, cause, evidence, justify,
list, motivation, reason, restatement, result.
• Rebuttal: antithesis, contrast, unless.
• Undercut: concession.
Argumentation Schemes
“Argumentation Schemes are forms of argument
(structures of inference) that represent structures of
common types of arguments used in everyday
discourse, as well as in special contexts like those
of legal argumentation and scientific
argumentation.”
Douglas Walton
Argumentation Schemes
Unified Scheme
Classification
Argument Mining Approaches
What to expect?
F1-score: 0.74
Stab & Gurevych, Parsing Argumentation
Structures in Persuasive Essays, 2016.
https://www.ukp.tu-darmstadt.de/data/argumentation-mining/argument-annotated-essays/
Stab & Gurevych, Parsing Argumentation Structures in Persuasive Essays, 2016.
Software: Argumentation Mining
Taxonomy
Extraction
Term Extraction – ACL
Anthology
Terms to Taxonomy
Taxonomy Extraction
Approach
Global Generality
Georgeta Bordea (2013) Domain adaptive extraction of topical hierarchies for Expertise Mining. PhD Thesis, National University of
Ireland, Galway
Taxonomy Extraction –
ACL Anthology
Semantic Roles for Lexical
Definitions
Aristotle’s classic theory of definition introduced important aspects
such as the genus-differentia definition pattern and the
essential/non-essential property differentiation.
Building the Definition Graph
Data: WordNetGraph
Silva et al., Categorization of Semantic Roles for Dictionary Definitions.
Cognitive Aspects of the Lexicon CogALex@COLING, 2017.
https://github.com/Lambda-3/WordnetGraph
RDF graph generated from WordNet.
Event
Extraction
EventKG: A Multilingual Event-Centric
Temporal Knowledge Graph
Simon Gottschalk, Elena Demidova. ESWC 2018
EventKG is an open knowledge graph containing
event-centric information: http://eventkg.l3s.uni-hannover.de/
- EventKG V1.1: 690K events and 2.3M temporal relations
- 2014 FIFA World Cup
- “The Space Shuttle Challenger is launched on its maiden voyage”
- <Jennifer Aniston, married to, Brad Pitt, [2000-07-29,2005-10-02]>
- Extracted from Wikidata, YAGO, DBpedia and Wikipedia
- Integrated data in five languages: EN, FR, DE, RU, PT
- Provides provenance information
- High coverage of event times and locations due to integration:
                        EventKG    Wikidata    DBpedia (en)
Events with Time        50.82%     33.00%      7.00%
Events with Location    26.13%     11.70%      6.21%
EventKG: A Multilingual Event-Centric
Temporal Knowledge Graph
Simon Gottschalk, Elena Demidova. ESWC 2018
Which science-related events took place in Lyon?
- 1921: “À Lyon, fusion de la Société de médecine et de la Société des
sciences médicales”
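# Standard namespace IRIs are assumed for the prefixes below.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX sem: <http://semanticweb.cs.vu.nl/2009/11/sem/>
PREFIX dbr: <http://dbpedia.org/resource/>
# Descriptions of events related to Lyon whose text mentions "science"
SELECT DISTINCT ?description WHERE {
  ?event rdf:type sem:Event .
  ?relation rdf:object ?lyon .
  ?relation rdf:subject ?event .
  ?event dcterms:description ?description .
  FILTER regex(?description, "science", "i") .
  ?lyon owl:sameAs dbr:Lyon .
}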
EventKG: A Multilingual Event-Centric
Temporal Knowledge Graph
Simon Gottschalk, Elena Demidova. ESWC 2018
• EventKG builds upon and extends the Simple Event Model
(SEM).
• Example: The participation of Barack Obama in his second
inauguration as US president in 2013 in EventKG.
EventKG+TL: Creating Cross-Lingual Timelines
from an Event-Centric Knowledge Graph
Simon Gottschalk, Elena Demidova. ESWC 2018
- An overview of events related to a query entity over a time
period, across languages, using EventKG.
- Example: Brexit-related events. The size of each pie chart shows the overall
(i.e. language-independent) event relevance; the colored slices show the share
of that relevance in each language context.
Emerging perspectives
• The evolution of parsing and classification methods in
NLP is inducing a new lightweight semantic
representation.
• This representation draws on elements from logic,
linguistics and the Semantic/Linked Data Web (especially
RDF).
• However, it relaxes the semantic constraints of previous
models (which operated under assumptions made for
deductive reasoning or databases).
Emerging perspectives
• Knowledge graphs as lexical semantic models
operating under a semantic best-effort mode (canonical
identifiers when possible, otherwise, words).
• Possibly closer to the surface form of the text.
• Priority is on segmenting, categorizing and when
possible, integrating.
• A representation (data model) convenient for AI
engineering.
Categorization
A fact (main clause)*:
             s                          p                                  o
form         term, URI                  term, URI                          term, URI
category     instance, class, triple    type, property, schema property    instance, class, triple
* Can be a taxonomic fact.
Categorization
A fact with a context:
(s0 p0 o0) --p1--> o1
(the fact s0 p0 o0 is reified so that p1 can link it to o1)
e.g.
• subordination
(modality, temporality,
spatiality, RSTs)
• fact probability
• polarity
Categorization
Coordinated facts:
(s0 p0 o0) --p2--> (s1 p1 o1)
e.g.
• coordination
• RSTs
• ADU
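The three categories above (a plain fact, a fact with a context, and coordinated facts) can be sketched as simple data types. This is an illustrative sketch only: the class names, the `reify` helper and the `subject`/`predicate`/`object` property names are hypothetical, and the example values reuse the marriage relation and the stock-market sentence quoted earlier in the deck.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Fact:
    """A main clause: subject, predicate, object (terms or URIs)."""
    s: str
    p: str
    o: str


@dataclass(frozen=True)
class ContextualFact:
    """A fact (s0, p0, o0) qualified by context pairs (p1, o1)."""
    fact: Fact
    context: Tuple[Tuple[str, str], ...] = ()


@dataclass(frozen=True)
class CoordinatedFacts:
    """Two facts linked by a relation p2 (coordination, RST relation, ADU)."""
    left: Fact
    relation: str
    right: Fact


def reify(cf: ContextualFact, node_id: str) -> List[Tuple[str, str, str]]:
    """Flatten a contextual fact into plain triples around a statement node."""
    triples = [
        (node_id, "subject", cf.fact.s),
        (node_id, "predicate", cf.fact.p),
        (node_id, "object", cf.fact.o),
    ]
    triples.extend((node_id, p1, o1) for p1, o1 in cf.context)
    return triples


# Temporal context attached to a fact (hypothetical property names begin/end).
marriage = ContextualFact(
    fact=Fact("Jennifer Aniston", "married to", "Brad Pitt"),
    context=(("begin", "2000-07-29"), ("end", "2005-10-02")),
)

# Two main clauses joined by a coordination relation.
markets = CoordinatedFacts(
    left=Fact("Asian stocks", "fell", "anew"),
    relation="and",
    right=Fact("the yen", "rose to", "session highs"),
)
```

Calling `reify(marriage, "stmt1")` yields five plain triples, which shows how contextual information can be stored in a graph that only knows `(s, p, o)`.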
Knowledge Graphs &
Distributional Semantics
(A marriage made in heaven?)
Distributional Semantics
• Computational models that build contextual
semantic representations from corpus data.
• Semantic context is represented by a vector.
• Vectors are obtained through the statistical
analysis of the linguistic contexts of a word.
• Salience of contexts (cf. context weighting
scheme).
• Semantic similarity/relatedness as the core
operation over the model.
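A minimal sketch of these ideas, assuming a made-up four-sentence corpus and sentence-level co-occurrence as the context definition. Raw counts are used here; a real model would normally apply a context weighting scheme on top of them. The names `SENTENCES`, `build_vectors` and `cosine` are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Toy, pre-tokenized corpus (an assumption made for illustration only).
SENTENCES = [
    ["dog", "bark", "run", "leash"],
    ["cat", "run", "leash"],
    ["car", "run", "road"],
    ["dog", "bark", "cat"],
]


def build_vectors(sentences):
    """Represent each word by counts of the words it co-occurs with."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            for j, context in enumerate(tokens):
                if i != j:
                    vectors[word][context] += 1
    return vectors


def cosine(u, v):
    """Cosine of the angle between two sparse context vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


VECTORS = build_vectors(SENTENCES)
dog_cat = cosine(VECTORS["dog"], VECTORS["cat"])
dog_car = cosine(VECTORS["dog"], VECTORS["car"])
```

On this toy corpus `dog_cat` comes out larger than `dog_car`, because "dog" and "cat" share more contexts than "dog" and "car".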
Distributional Semantic Models
Distributional Semantic Models
• Semantic Model with low acquisition effort
(automatically built from text)
Simplification of the representation
• Enables the construction of comprehensive
commonsense/semantic KBs
• What is the cost?
Some level of noise
(semantic best-effort)
Limited semantic model
Distributional Semantics as
Commonsense Knowledge
Commonsense is here
[Figure: vector space containing car, dog, cat, bark, run and leash; θ is the angle between two vectors.]
Semantic Approximation is here
Semantic Model with low
acquisition effort
Context Weighting Measures
Kiela & Clark, 2014
Similarity Measures
… and of course, Glove and W2V
Distributional-Relational
Networks
Distributional Relational Networks, AAAI Symposium (2013).
A Compositional-Distributional Semantic Model for Searching Complex Entity
Categories, ACL *SEM (2016)
Barack Obama --nominated--> Sonia Sotomayor
Sonia Sotomayor --:is_a--> First Supreme Court Justice of Hispanic descent
…
LSA, ESA, W2V, GLOVE, …
s0 p0 o0
The vector space is segmented
Dimensional reduction
mechanism!
A Distributional Structured Semantic Space for Querying RDF Graph
Data, IJSC 2012
Compositionality of Complex
Nominals
Barack Obama --nominated--> Sonia Sotomayor
Sonia Sotomayor --:is_a--> First Supreme Court Justice of Hispanic descent
On Complex Nominals
Building on Word Vector Space
Models
• But how can we represent the meaning of longer phrases?
• By mapping them into the same vector space!
the country of my birth
the place where I was born
Compositionality
• The meaning of a complex expression is a
function of the meaning of its constituent parts.
carnivorous plants digest
slowly
Compositionality Principles
Words in which the meaning
is directly determined by
their distributional behaviour
(e.g., nouns).
Words that act as functions
transforming the
distributional profile of other
words (e.g., verbs,
adjectives, …).
Compositionality Principles
• Take the syntactic structure to constitute the backbone
guiding the assembly of the semantic representations of
phrases.
• A correspondence between syntactic categories and
distributional objects.
Mixture vs Function
Distributional functions
as linear transformations
• Distributional functions are linear transformations on
semantic vector/tensor spaces.
• Matrix: First-order, one argument distributional
functions.
• Used to represent adjectives and adverbs.
Example: Adjective + Noun
• Adjective = a function from nouns to nouns,
Inducing distributional functions
from corpus data
- Distributional functions are
induced from input to output
transformation examples
- Regression techniques
commonly used in machine
learning.
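A minimal sketch of inducing an adjective as a linear map from input/output examples. The 2-dimensional vectors are invented for illustration, and plain gradient descent on the squared error is used here as a simple stand-in for the regression techniques mentioned above; `apply_adjective`, `induce` and `TRAINING_PAIRS` are hypothetical names.

```python
def apply_adjective(matrix, noun):
    """Adjective as a function from nouns to nouns: a matrix-vector product."""
    return [sum(m * x for m, x in zip(row, noun)) for row in matrix]


def induce(pairs, dim, lr=0.1, steps=200):
    """Fit a dim x dim matrix A so that A @ noun approximates the phrase vector."""
    A = [[0.0] * dim for _ in range(dim)]
    for _ in range(steps):
        grad = [[0.0] * dim for _ in range(dim)]
        for noun, phrase in pairs:
            predicted = apply_adjective(A, noun)
            for i in range(dim):
                error = predicted[i] - phrase[i]
                for j in range(dim):
                    # Gradient of the squared error with respect to A[i][j].
                    grad[i][j] += 2 * error * noun[j]
        for i in range(dim):
            for j in range(dim):
                A[i][j] -= lr * grad[i][j]
    return A


# (noun vector, observed vector of the "adjective + noun" phrase); toy values.
TRAINING_PAIRS = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([0.0, 1.0], [0.5, 2.0]),
    ([1.0, 1.0], [1.5, 2.0]),
]

ADJECTIVE = induce(TRAINING_PAIRS, dim=2)
unseen_phrase = apply_adjective(ADJECTIVE, [2.0, 1.0])
```

Once induced, the same matrix can be applied to a noun vector that never appeared with the adjective in training, which is the point of treating the adjective as a function rather than as a vector.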
How should we map phrases
into a vector space?
Recursive Neural Networks
Compositional-distributional
model for paraphrases
A Compositional-Distributional Semantic Model for Searching Complex
Entity Categories, *SEM (2016)
Software: Indra
• Semantic approximation server
• Multi-lingual (12 languages)
• Multi-domain
• Different compositional models
https://github.com/Lambda-3/indra
Semantic Relatedness for All (Languages): A Comparative Analysis of Multilingual
Semantic Relatedness using Machine Translation, EKAW, (2016).
Software: Gensim
https://radimrehurek.com/gensim/models/word2vec.html
Řehůřek & Sojka, Software Framework for Topic Modelling with Large Corpora,
LREC, 2010
Data: PPDB
Ganitkevitch et al., PPDB: The Paraphrase Database, LREC, 2013
http://paraphrase.org/#/download
• 16 languages
Recursive vs recurrent neural
networks
Segmented Spaces vs
Unified Space
s0 p0 o0
s0 p0 o0
• Assumes that <s,p,o> are naturally
irreconcilable.
• Inherent dimensional reduction
mechanism.
• Facilitates the specialization of
embedding-based approximations.
• Easier to compute identity.
• Requires complex and high-
dimensional tensorial model.
How to access Distributional-
Knowledge Graphs efficiently?
• Depends on the target operations in the
Knowledge Graphs (more on this later).
How to access Distributional-
Knowledge Graphs efficiently?
[Figure: a query q against the elements s0, p0, o0]
Structured Queries (Database + IR):
• Inverted index, sharding, disk access optimization, …
• Query planning, cardinality, indexing, skyline, bitmap indexes, …
Approximation Queries:
• Multiple Randomized K-d Tree Algorithm
• The Priority Search K-Means Tree algorithm
Software: StarGraph
• Distributional Knowledge Graph Database.
• Word embedding Database.
https://github.com/Lambda-3/Stargraph
Freitas et al., Natural Language Queries over Heterogeneous Linked Data Graphs: A
Distributional-Compositional Semantics Approach, 2014.
Emerging perspectives
• Graph-based data models + Distributional Semantic Models
(Word embeddings) have complementary semantic values.
• Graph-based Data Models:
– Facilitate querying, integration and rule-based reasoning.
• Distributional Semantic Models:
– Support semantic approximation, coping with vocabulary variation.
Emerging perspectives
• AI systems require access to comprehensive background
knowledge for semantic interpretation tasks.
• Inheriting from Information Retrieval and Databases:
– General Indexing schemes,
– Particular Indexing schemes,
• Spatial, temporal, topological, probabilistic, causal, …
– Query planning,
– Data compression,
– Distribution,
– … even supporting hardware strategies.
Emerging perspectives
• One size of embedding does not fit all: Operate with
multiple distributional + compositional models for different
data model types (I, C, P), different domains and different
languages.
• Inheriting from Information Retrieval and Databases:
– Indexing schemes,
– Query planning,
– Data compression,
– Query distribution,
– even supporting hardware.
Best-effort
Semantic Integration
Semantic Integration
• Task: Mapping near-synonymic term references to a
canonical identifier.
• Goal: Reduce the entropy (complexity) of the underlying
KG.
• Operations:
– Co-reference Resolution
– (Named) Entity Linking
– Predicate Reconciliation
• Common aspects:
– Highly dependent on the context of the mention.
– Highly dependent on target entity background knowledge.
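The task can be sketched as a lookup from surface mentions to canonical identifiers. This is a deliberately simplified, best-effort sketch: the alias table, the `link` function and the 0.8 threshold are assumptions made for illustration, and the mention context that real co-reference and entity linking systems depend on is not modelled.

```python
from difflib import SequenceMatcher

# Hypothetical alias table: canonical identifier -> known surface forms.
ALIASES = {
    "dbr:Barack_Obama": ["barack obama", "obama", "president obama"],
    "dbr:Sonia_Sotomayor": ["sonia sotomayor", "sotomayor", "justice sotomayor"],
}


def link(mention, threshold=0.8):
    """Return the best canonical identifier, or the mention itself if none fits."""
    normalized = mention.lower().strip()
    best_id, best_score = None, 0.0
    for canonical, forms in ALIASES.items():
        for form in forms:
            score = SequenceMatcher(None, normalized, form).ratio()
            if score > best_score:
                best_id, best_score = canonical, score
    # Semantic best-effort: canonical identifier when possible, otherwise the word.
    return best_id if best_score >= threshold else mention
```

For example, `link("Obama")` and `link("President Obama")` collapse to the same identifier, while an unknown mention such as `link("Lyon")` is kept as a plain term.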
Software: Cobalt
• KG-based co-reference resolution.
https://github.com/semanc/cobalt (to appear)
Software: AGDISTIS
• Agnostic Disambiguation of Named Entities Using Linked Open Data
http://aksw.org/Projects/AGDISTIS.html
Usbeck et al. AGDISTIS - Agnostic Disambiguation of Named Entities Using
Linked Open Data, ECAI, 2015
Software: StarGraph
• Predicate Reconciliation (Distributional-semantics based).
https://github.com/Lambda-3/Stargraph
Freitas et al., Natural Language Queries over Heterogeneous Linked Data Graphs: A
Distributional-Compositional Semantics Approach, 2014.
Effective Semantic Parsing
for Large KBs
Sparse, large vocabulary,
heterogeneous, schema-less
databases
• before 2000: 10s-100s attributes
• circa 2017: 1,000s-1,000,000s attributes
Brodie & Liu, 2010
The Vocabulary Problem
Barack Obama --nominated--> Sonia Sotomayor
Sonia Sotomayor --:is_a--> First Supreme Court Justice of Hispanic descent
The Vocabulary Problem
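Barack Obama --nominated--> Sonia Sotomayor
Sonia Sotomayor --:is_a--> First Supreme Court Justice of Hispanic descent
Alternative wordings a query may use:
• Barack Obama: Obama, Last US president
• nominated: selected
• Supreme Court Justice: Judge, High
• Hispanic descent: Latino origins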
“On our best behaviour”
“It is not enough to build knowledge bases without paying
closer attention to the demands arising from their use.”
Levesque, 2013
“We should explore more thoroughly the space of
computations between fact retrieval and full
automated logical reasoning.”
Schema-agnostic queries
Query approaches over structured databases that
allow users to satisfy complex information needs
without understanding the representation
(schema) of the database.
First-level independence
(Relational Model)
“… it provides a basis for a high level data language which
will yield maximal independence between programs on the
one hand and representation and organization of data on
the other”
Codd, 1970
Second-level independence
(Schema-agnosticism)
Vocabulary Problem for
Databases
Schema-agnostic query
mechanisms
Query-DB
Semantic Gap
From Semantic Tractability to
Semantic Resolvability
Semantic Tractability (Popescu et al., 2004)
- Focuses on soundness and completeness conditions for mapping
natural language queries to databases.
- Focuses on a restricted class of semantic mappings.
Semantic Resolvability (Freitas et al., 2014)
- Provides a formal model for classifying query-dataset mappings for
schema-agnostic queries.
Freitas et al., On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study, NLIWOD 2014
George (2005) and Sheth & Kashyap (1990)
Towards an Information-
Theoretical Model for Schema-
agnostic Semantic Matching
Semantic Complexity & Entropy: Configuration
space of semantic matchings.
• Query-DB semantic gap.
• Ambiguity, synonymy, indeterminacy,
vagueness.
Freitas et al. How hard is this query? Measuring the Semantic Complexity of Schema-
agnostic Queries, IWCS 2015
Minimizing the Semantic Entropy
for the Semantic Matching
Definition of a semantic pivot: first query term to be resolved in the
database.
• Maximizes the reduction of the semantic configuration space.
• Less prone to more complex synonymic expressions and
abstraction-level differences.
• Semantic pivot serves as interpretation context for the remaining
alignments.
• proper nouns >> nouns >> complex nominals >> adjectives ,
verbs.
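The ordering heuristic can be sketched as a sort over query terms by lexical category. This is a minimal sketch, assuming the terms have already been tagged; the category labels, `PIVOT_PRIORITY` and `order_terms` are hypothetical names, and only the priority order itself comes from the slide.

```python
# Priority order from the slide: a lower number means the term is resolved earlier.
PIVOT_PRIORITY = {
    "proper_noun": 0,
    "noun": 1,
    "complex_nominal": 2,
    "adjective": 3,
    "verb": 3,
}


def order_terms(tagged_terms):
    """Sort (term, category) pairs so that the semantic pivot comes first."""
    return sorted(tagged_terms, key=lambda pair: PIVOT_PRIORITY[pair[1]])


# Toy query terms with assumed lexical categories.
query_terms = [
    ("nominated", "verb"),
    ("Barack Obama", "proper_noun"),
    ("Supreme Court Justice", "complex_nominal"),
]

resolution_order = order_terms(query_terms)
pivot = resolution_order[0]
```

The proper noun is resolved first and then serves as the interpretation context for aligning the remaining, more ambiguous terms.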
Distributional Semantic
Relatedness
?
Semantic Relatedness Measure
as a Ranking Function
A Distributional Approach for Terminological Semantic Search on the
Linked Data Web, ACM SAC, 2012
Semantic Pivoting + Distributional
Semantics
• Contextual mechanism for the distributional
semantic approximation.
Search and Composition
Operations
• Instance search
- Proper nouns
- String similarity + node cardinality
• Class (unary predicate) search
- Nouns, adjectives and adverbs
- String similarity + Distributional semantic relatedness
• Property (binary predicate) search
- Nouns, adjectives, verbs and adverbs
- Distributional semantic relatedness
• Navigation
• Extensional expansion
- Expands the instances associated with a class.
• Operator application
- Aggregations, conditionals, ordering, position
• Disjunction & Conjunction
• Disambiguation dialog (instance, predicate)
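As an illustration of the first operation, the sketch below ranks candidate instances by string similarity and uses node cardinality to break ties. How the two signals are actually combined is not stated in the slides, so the ranking rule, the `INSTANCES` table and its cardinality values are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical instance labels with their node cardinality (number of edges).
INSTANCES = {
    "Barack Obama": 950,
    "Barack Obama Sr.": 40,
    "Obama, Fukui": 25,
}


def instance_search(term, top_k=2):
    """Rank instances by string similarity, then by node cardinality."""
    scored = []
    for label, cardinality in INSTANCES.items():
        similarity = SequenceMatcher(None, term.lower(), label.lower()).ratio()
        scored.append((similarity, cardinality, label))
    # Higher similarity first; among equally similar labels, better-connected nodes first.
    scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [label for _, _, label in scored[:top_k]]
```

For the term "Obama", the labels "Barack Obama" and "Obama, Fukui" are equally similar as strings, so the better-connected node is preferred.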
Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributional-
Compositional Semantics Approach, IUI 2014
Distributional
Inverted Index
Distributional-Relational
Model
Reference
Commonsense
corpora
Core semantic approximation &
composition operations
Semantic Parser
Query Plan
Scalable semantic parsing
Learn to
Rank
Question Answers
[Figure, shown in four build steps: a query is analysed against the instance (I), property (P) and class (C) spaces.]
$q = t^{\Gamma}_0, \ldots, t^{\Gamma}_n$, with $\Gamma = \{I, P, C, V\}$
Terms shown: $t^{h}_0$, $t^{m1}_0$, $t^{m2}_0$
Term features: lexical specificity, # of senses, lexical category
Measures and operations ($\rho$, $\Delta_{sr}$ and $\Delta_{r}$ appear as figure labels):
- Vector neighborhood density
- Semantic differential
- Semantic pivoting
- Distributional compositionality
… …
Query Pre-Processing
(Question Analysis)
• Transform natural language queries into triple
patterns.
Query Plan
Instance Search
Query Plan Execution
What to expect (@ QALD1)
F1-Score: 0.72
MRR: 0.5
Freitas & Curry, Natural Language Queries over Heterogeneous Linked Data Graphs,
IUI (2014).
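For readers unfamiliar with the two measures reported above, the sketch below shows how F1 and mean reciprocal rank (MRR) are commonly computed. The function names and the toy rankings are hypothetical and are not taken from the QALD evaluation itself.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean_reciprocal_rank(ranked_answers, gold):
    """Average of 1/rank of the first correct answer per query (0 if none)."""
    total = 0.0
    for answers, correct in zip(ranked_answers, gold):
        for rank, answer in enumerate(answers, start=1):
            if answer in correct:
                total += 1.0 / rank
                break
    return total / len(gold)


# Toy example: correct answer at rank 1, at rank 2, and missing.
toy_rankings = [["a", "b"], ["x", "y"], ["q"]]
toy_gold = [{"a"}, {"y"}, {"z"}]
toy_mrr = mean_reciprocal_rank(toy_rankings, toy_gold)
```

In this toy case the reciprocal ranks are 1, 1/2 and 0, so the MRR is their mean.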
Addressing the Vocabulary
Problem
• Hierarchy of approximation spaces
– Probabilistic justification.
– Semantic pivoting.
How hard is the Query? Measuring the Semantic Complexity of Schema-Agnostic Queries, IWCS
(2015).
A Distributional Approach for Terminological Semantic Search on the Linked Data Web, ACM SAC
(2012).
Schema-agnostic queries over large-schema databases: a distributional semantics approach,
PhD Thesis (2015).
On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study, NLIWoD
(2015).
How to cope with the vocabulary
problem?
• Embed resources on distributional spaces.
• Use heuristics to minimize approximation errors.
• Interaction Pattern (Semantic best-effort):
Search >> Disambiguate >> Learn
Distributional semantics
= domain and language transportability.
= high search recall and relevance ranking.
Semantic Parsing on Freebase from Question-Answer Pairs, Berant et al., EMNLP 2013
Sempre
• Semantic parser for QA.
• Parses natural language into SPARQL queries
over Freebase.
• Doesn’t require logical forms for training, can be
trained from questions / answers directly.
Percy Liang, SEMPRE: Semantic Parsing with Execution, 2015.
Formulating the ML Problem
Percy Liang, SEMPRE: Semantic Parsing with Execution, 2015.
Learning
Alignment
Percy Liang, SEMPRE: Semantic Parsing with Execution, 2015.
Overnight Semantic Parsing
Percy Liang, SEMPRE: Semantic Parsing with Execution, 2015.
Neural Relation Detection
• Goal: Detect KB-specific entities and relations in
NL, map directly to query.
• Problems: relation chains, relations unseen in
training data…
Improved Neural Relation Detection for Knowledge Base Question Answering. Yu et al., ACL 2017
Neural Relation Detection
• Encode relations on different levels: relation level and word level.
• Use a multi-layer (Bi-)LSTM for question encoding.
• Utilize both layers in the final representation vector.
• Result: an abstract way of representing context.
Neural Relation Detection
What to expect (@SimpleQuestions)
Accuracy: 0.93
Neural Semantic Parsing over Multiple Knowledge-bases. Herzig et al, ACL 2017
Neural Semantic Parsing
• Treat Semantic Parsing as a Sequence-to-Sequence
problem
Neural Semantic Parsing over Multiple Knowledge-bases. Herzig et al, ACL 2017
(Liang, 2013)
Shared structural
regularity
What to expect (@ Overnight)
Accuracy: 0.79
Neural Semantic Parsing over Multiple Knowledge-bases. Herzig et al, ACL 2017
Software: StarGraph
• Semantic parsing.
https://github.com/Lambda-3/Stargraph
Freitas et al., Natural Language Queries over Heterogeneous Linked Data Graphs: A
Distributional-Compositional Semantics Approach, 2014.
Software: Sempre
Semantic Parser & QA System
https://github.com/percyliang/sempre
Berant et al., Semantic Parsing on Freebase from Question-Answer Pairs, EMNLP 2013
Memory Networks
Slide credit: Jason Weston
• Class of models that combine a large memory with a learning
component that can read from and write to it.
• Most ML has limited memory which is more-or-less all that’s
needed for “low level” tasks e.g. object detection.
• Motivation: long-term memory is required to read a story
(or watch a movie) and then e.g. answer questions about it.
• We study this by building a simple simulation to generate
“stories”. We also try on some real QA data.
MCTest comprehension data
(Richardson et al.)
James the Turtle was always getting in trouble. Sometimes he'd reach into
the freezer and empty out all the food. Other times he'd sled on the deck
and get a splinter. His aunt Jane tried as hard as she could to keep him out
of trouble, but he was sneaky and got into lots of trouble behind her back.
One day, James thought he would go into town and see what kind of trouble
he could get into. He went to the grocery store and pulled all the pudding off
the shelves and ate two jars. Then he walked to the fast food restaurant and
ordered 15 bags of fries. He didn't pay, and instead headed home.
His aunt was waiting for him in his room. She told James that she loved
him, but he would have to start acting like a well-behaved turtle.
After about a month, and after getting into lots of trouble, James finally
made up his mind to be a better turtle.
Q: What did James pull off of the shelves in the grocery store?
A) pudding B) fries C) food D) splinters
Q: Where did James go after he went to the grocery store?
…
Slide credit: Jason Weston
Problems: … it’s hard for this data to lead us to design good ML models …
1) Not enough data to train on (660 stories total). What ends up happening …
2) If we get something wrong we don’t really understand why: every question potentially
involves a different kind of reasoning, our model has to do a lot of different things.
Our solution: focus on simpler (toy) subtasks where we can generate data to check what
the models we design can and cannot do.
Example
Dataset in simulation command format:
antoine go kitchen
antoine get milk
antoine go office
antoine drop milk
antoine go bathroom
where is milk ? (A: office)
where is antoine ? (A: bathroom)
Dataset after adding a simple grammar:
Antoine went to the kitchen.
Antoine picked up the milk.
Antoine travelled to the office.
Antoine left the milk there.
Antoine went to the bathroom.
Where is the milk now? (A: office)
Where is Antoine? (A: bathroom)
Simulation Data Generation
Aim: build a simple simulation which behaves much like a classic text
adventure game. The idea is that generating text within this simulation
allows us to ground the language used.
Actions:
go <location>, get <object>, get <object1> from <object2>,
put <object1> in/on <object2>, give <object> to <actor>,
drop <object>, look, inventory, examine <object>.
Constraints on actions:
• an actor cannot get something that they or someone else
already has
• they cannot go to a place they are already at
• cannot drop something they do not already have
• …
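The simulation described above can be sketched in a few lines. This is a minimal illustration, not the generator used to produce the actual tasks: the class name `World`, its method names, the starting location `hallway`, the rule that an object must be in the actor's current location to be picked up, and the restriction to `go`, `get` and `drop` are all assumptions made for brevity.

```python
class World:
    """Tiny text-adventure-style world tracking actor and object locations."""

    def __init__(self, actor_locations, object_locations):
        self.actor_at = dict(actor_locations)    # actor -> location
        self.object_at = dict(object_locations)  # object -> location (while not held)
        self.holder = {}                         # object -> actor holding it

    def go(self, actor, location):
        # Constraint: an actor cannot go to a place they are already at.
        if self.actor_at[actor] == location:
            return False
        self.actor_at[actor] = location
        return True

    def get(self, actor, obj):
        # Constraint: cannot get something that they or someone else already has.
        if obj in self.holder:
            return False
        # Assumption: the object must be where the actor currently is.
        if self.object_at.get(obj) != self.actor_at[actor]:
            return False
        self.holder[obj] = actor
        del self.object_at[obj]
        return True

    def drop(self, actor, obj):
        # Constraint: cannot drop something they do not already have.
        if self.holder.get(obj) != actor:
            return False
        del self.holder[obj]
        self.object_at[obj] = self.actor_at[actor]
        return True

    def where_is(self, name):
        """Answer 'where is X ?' for an actor or an object."""
        if name in self.actor_at:
            return self.actor_at[name]
        if name in self.holder:
            return self.actor_at[self.holder[name]]
        return self.object_at[name]


def run(world, commands):
    """Execute 'actor verb argument' commands; return the ones that were legal."""
    legal = []
    for line in commands:
        actor, verb, arg = line.split()
        if getattr(world, verb)(actor, arg):
            legal.append(line)
    return legal


# Hypothetical initial state; the story itself is the one shown on the slide.
world = World({"antoine": "hallway"}, {"milk": "kitchen"})
story = ["antoine go kitchen", "antoine get milk", "antoine go office",
         "antoine drop milk", "antoine go bathroom"]
executed = run(world, story)
print(world.where_is("milk"))     # office
print(world.where_is("antoine"))  # bathroom
```

Because the world state is known at every step, question/answer pairs and their supporting facts can be emitted automatically, which is what makes the toy tasks useful for diagnosing models.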
(1) Factoid QA with Single Supporting Fact
John is in the playground.
Bob is in the office.
Where is John? A:playground
(2) Factoid QA with Two Supporting Facts
John is in the playground.
Bob is in the office.
John picked up the football.
Bob went to the kitchen.
Where is the football? A:playground
Where was Bob before the kitchen? A:office
… (total 20 Tasks)
Matching function
• For a given Q, we want a good match to the relevant memory slot(s)
containing the answer, e.g.:
Match(Where is the football ?, John picked up the football)
• Use a $q^\top U^\top U d$ embedding model with word embedding features.
• LHS features: Q:Where Q:is Q:the Q:football Q:?
• RHS features: D:John D:picked D:up D:the D:football
QDMatch:the QDMatch:football
(QDMatch:football is a feature to say there’s a Q&A word match, which can help.)
The parameters U are trained with a margin ranking loss:
supporting facts should score higher than non-supporting facts.
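The matching function and its loss can be written out as a small sketch. It assumes a bag-of-words feature map in which each feature has one embedding vector (a column of U), random initial embeddings, an embedding size of 4 and a margin of 0.1; these values and all function names are illustrative and not taken from the slides. No training loop is shown.

```python
import random


def lhs_features(question):
    """Question-side features, e.g. Q:Where Q:is Q:the Q:football Q:?"""
    return ["Q:" + w for w in question.split()]


def rhs_features(question, fact):
    """Memory-side features plus QDMatch flags for words shared with the question."""
    q_words = set(question.split())
    feats = ["D:" + w for w in fact.split()]
    feats += ["QDMatch:" + w for w in fact.split() if w in q_words]
    return feats


def embed(features, U, dim):
    """Sum the embedding vectors of the active features (U times a bag-of-words vector)."""
    vec = [0.0] * dim
    for f in features:
        for i, x in enumerate(U[f]):
            vec[i] += x
    return vec


def match(question, fact, U, dim):
    """Score q^T U^T U d as the dot product of the two embedded bags."""
    q = embed(lhs_features(question), U, dim)
    d = embed(rhs_features(question, fact), U, dim)
    return sum(a * b for a, b in zip(q, d))


def margin_ranking_loss(question, supporting, other, U, dim, margin=0.1):
    """Hinge loss: the supporting fact should outscore the other fact by the margin."""
    return max(0.0, margin
               - match(question, supporting, U, dim)
               + match(question, other, U, dim))


question = "Where is the football ?"
facts = ["John picked up the football", "Bob went to the kitchen"]
dim = 4
rng = random.Random(0)

# Build one (randomly initialised) embedding per feature seen in the example.
vocab = set(lhs_features(question))
for fact in facts:
    vocab.update(rhs_features(question, fact))
U = {feat: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for feat in sorted(vocab)}

loss = margin_ranking_loss(question, facts[0], facts[1], U, dim)
print(rhs_features(question, facts[0]))
print(loss)
```

Training would lower this loss by gradient steps on U, so that the supporting memory ends up scoring above non-supporting ones.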
What to expect (@ QA over Reverb Data)
F1-Score: 0.82
MemNN (BoW Features)
image credit: Towards AI-complete question answering: a set of prerequisite toy tasks
Positional Reasoning
The triangle is to the right of the blue square.
The red square is on top of the blue square.
The red sphere is to the right of the blue square.
Is the red sphere to the right of the blue square? A:yes
Is the red square to the left of the triangle? A:yes
Path Finding
The kitchen is north of the hallway.
The den is east of the hallway.
How do you go from den to kitchen? A: west, north
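To make explicit what kind of reasoning the path-finding task demands, here is a small symbolic sketch that solves the example with breadth-first search. It is not a model from the tutorial; the fact encoding `(a, direction, b)` for "a is <direction> of b" and the function names are assumptions for illustration.

```python
from collections import deque

OPPOSITE = {"north": "south", "south": "north", "east": "west", "west": "east"}


def build_graph(facts):
    """Turn (a, direction, b) facts, read 'a is <direction> of b', into moves."""
    moves = {}
    for a, direction, b in facts:
        # From b, travelling <direction> reaches a; the reverse move also holds.
        moves.setdefault(b, {})[direction] = a
        moves.setdefault(a, {})[OPPOSITE[direction]] = b
    return moves


def find_path(moves, start, goal):
    """Breadth-first search returning the list of directions, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        place, path = queue.popleft()
        if place == goal:
            return path
        for direction, nxt in moves.get(place, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [direction]))
    return None


facts = [("kitchen", "north", "hallway"), ("den", "east", "hallway")]
print(find_path(build_graph(facts), "den", "kitchen"))  # ['west', 'north']
```

The answer requires inverting one stated relation (den is east of hallway, so go west) and chaining it with another, which is the multi-step composition the memory models struggle with.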
Failing Tasks
Counting
Daniel picked up the football.
Daniel dropped the football.
Daniel got the milk.
Daniel took the apple.
How many objects is Daniel holding? A: two
Lists / Sets
Daniel picks up the football.
Daniel drops the newspaper.
Daniel picks up the milk.
What is Daniel holding? A: milk, football
Failing Tasks
Key-Value Memory Networks
• Represent memories as key-value pairs:
=> more fine-grained attention mechanism.
• key: information important for comparing memory with question.
• value: information important for comparing memory with answer.
Key-Value Memory Networks for Directly Reading Documents, Miller et al., EMNLP 2016
[Figure labels: Question, Answer, K (keys), V (values)]
Question Answering on Knowledge Bases and Text using Universal Schema and Memory Networks, Das et al., ACL 2017
Key-Value Memory Networks
• How to tune? = what to encode in key/value
• Possibilities for keys: entities, KB subjects, …
• Possibilities for values: sentences, word windows,
relations, document title
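The key/value split can be illustrated with a single memory read: the question is compared against the keys, and the resulting attention weights are used to average the values. This is a simplified sketch with hand-picked toy vectors; in the actual models the question, key and value embeddings are learned and several read hops may be chained. All names below are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def kv_read(query, keys, values):
    """One memory read: address memories by key, return the weighted sum of values."""
    weights = softmax([dot(query, k) for k in keys])
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Toy embeddings (hypothetical): two memory slots, 2-d keys, 3-d values.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
query = [4.0, 0.0]  # aligned with the first key

weights, output = kv_read(query, keys, values)
print(weights)
print(output)
```

Choosing what goes into the key (for example a KB subject and relation) and what goes into the value (for example the object, or a word window) is the tuning decision the slide refers to.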
Attention-based LSTM
• Use LSTM to obtain a representation of <q,a>.
• Better: implement attention to reflect that some words
in q are more discriminative for a than others.
• Even better: utilize different aspects of answer
entity a (such as type, context).
An End-to-End Model for Question Answering over Knowledge Base with Cross-Attention
Combining Global Knowledge, Hao et al., ACL 2017.
What to expect (@ WebQuestions)
F1-Score (Attention-based LSTM): 42.9
F1-Score (MemNN): 42.2
Reverse Engineering
Queries
VLDB 2017 Tutorial D. Mottin, M. Lissandrini, T. Palpanas, Y. Velegrakis
Problem Definition
[Tran et al. 2013]
Entity Search by Example
Metzger et al. 2013, Sobczak et al. 2015
Use Case:
• Similar Products
• User recommendation
Entity Search by Example
Bonifati et al. 2015
Use Case:
• Proteins interactions/co-expression
• Similar processes/behaviour
Reverse Engineering
SPARQL Queries
Use Case:
• Schema-agnostic/End-user queries
Arenas et al. 2016
Reverse Engineering
SPARQL Queries
Mottin et al. 2014
Graph Query by Example
Jayaram et al. 2015
Emerging perspectives
Semantic Parsing:
• Structured queries as explanations.
• Semantic pivoting heuristics.
• Diversity of compositional/distributional models as key.
• Memory/Key-value NNs: architecture to model story
timelines.
• End-to-end vs componentised architectures.
Emerging perspectives
Reverse Engineering Queries:
• Can work as an inference method:
– Learn-by-example
– Analogical reasoning
– Similarity queries
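A minimal sketch of the learn-by-example idea: given a few example entities, derive the properties they share (in effect a reverse-engineered conjunctive query) and return the other entities that satisfy it. This is a naive illustration, not any of the cited methods; the toy triples and function names are invented for the example.

```python
def shared_properties(triples, examples):
    """(predicate, object) pairs that every example entity has."""
    props = None
    for entity in examples:
        own = {(p, o) for s, p, o in triples if s == entity}
        props = own if props is None else props & own
    return props or set()


def search_by_example(triples, examples):
    """Entities, other than the examples, that have all the shared properties."""
    props = shared_properties(triples, examples)
    if not props:
        return []
    candidates = {s for s, _, _ in triples} - set(examples)
    return sorted(
        c for c in candidates
        if props <= {(p, o) for s, p, o in triples if s == c}
    )


# Toy knowledge graph (hypothetical data).
triples = [
    ("Lyon", "type", "City"), ("Lyon", "country", "France"),
    ("Paris", "type", "City"), ("Paris", "country", "France"),
    ("Nice", "type", "City"), ("Nice", "country", "France"),
    ("Turin", "type", "City"), ("Turin", "country", "Italy"),
]

print(sorted(shared_properties(triples, ["Lyon", "Paris"])))
print(search_by_example(triples, ["Lyon", "Paris"]))  # ['Nice']
```

The shared property set plays the role of the inferred query, so the user never has to know the schema; practical systems relax the exact-intersection requirement and rank candidates by similarity instead.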
Knowledge Graph
Completion
The Problem
Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
Formulating the Distributional-Relational Representation
Complex Relations
TransD
KG2E
Complex Relations:
RL4KG
RL4KG with Entity Descriptions
Relation Paths
• Complex Inference patterns for composition.
Path-based TransE
Representation of Relation Paths
Composition operations: addition, multiplication, RNN
Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
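The translation idea behind TransE and the additive and multiplicative path compositions can be sketched with plain vectors. TransE treats a relation as a translation, so a plausible triple (h, r, t) has h + r close to t; a relation path is embedded by composing its relation vectors and compared with the direct relation. The 2-d embeddings and relation names below are hand-picked for illustration and are not learned; the RNN composition is omitted.

```python
def add(a, b):
    return [x + y for x, y in zip(a, b)]


def mul(a, b):
    return [x * y for x, y in zip(a, b)]


def l1_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def transe_energy(h, r, t):
    """TransE energy ||h + r - t||_1; lower means more plausible."""
    return l1_distance(add(h, r), t)


def compose(relations, op=add):
    """Embed a relation path by folding its relation vectors with `op`."""
    path = relations[0]
    for r in relations[1:]:
        path = op(path, r)
    return path


# Hypothetical relation embeddings.
born_in_city = [1.0, 0.0]
city_in_country = [0.0, 1.0]
nationality = [1.0, 1.0]

# Additive composition: the two-hop path should land near the direct relation.
path = compose([born_in_city, city_in_country])
print(l1_distance(path, nationality))  # 0.0

# TransE energy for a triple whose tail is exactly head + relation.
print(transe_energy([0.0, 0.0], nationality, [1.0, 1.0]))  # 0.0

# Element-wise multiplication is the alternative composition.
print(compose([born_in_city, city_in_country], op=mul))  # [0.0, 0.0]
```

During training, the distance between a reliable path embedding and the direct relation is pushed down, which lets composition patterns support relation prediction.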
What to expect (PTransE@FB15K)
Relation Prediction
Software: KB2E
Relation Extraction
Knowledge Graph Embeddings including TransE, TransH,
TransR and PTransE.
https://github.com/thunlp/KB2E
Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, Xuan Zhu. Learning Entity and Relation
Embeddings for Knowledge Graph Completion. The 29th AAAI Conference on Artificial
Intelligence (AAAI'15).
Natural Language
Inference
Recognizing and Justifying
Text Entailments (TE)
using Definition KGs
Distributional semantic relatedness as a Selectivity Heuristic
[Figure labels: distributional heuristics, source, target, answer]
Distributional Navigation
Algorithm
Pre-Processing
Inference
Generation
Explainable AI
“The right to explanation”
“The data subject should have
the right not to be subject to a
decision, which may include a
measure, evaluating personal
aspects relating to him or her
which is based solely on
automated processing …”
“… such processing should be
subject to suitable safeguards,
… to obtain an explanation of
the decision …”
What to expect (TE@Boeing-Princeton-ISI)
F1-Score: 0.59
What to expect (TE@Guardian Headline Samples)
F1-Score: 0.53
Santos et al., Recognizing and Justifying Text Entailment through Distributional
Navigation on Definition Graphs, AAAI, 2018.
Emerging perspectives
• Distributional-relational models in KB completion
explored a large range of representation paradigms.
– Opportunity for exporting these representation models to other
tasks.
• Definition-based models can provide a corpus-viable,
low-data and explainable alternative to embedding-based
models.
Architecture
[Figure: architecture diagram. Component labels: KG Construction, Named Entity Recognition, Entity Linking, Co-reference Resolution, Open IE, Taxonomy Extraction, Definition Extraction, Arg. Classif., Integration, Semantic Parsing, Inference, KG Completion, Natural Language Inference, Distributional Semantics Server, Query, Query By Example, NL Query, NL Generation, Answers, Explanations, Indexes, spatial, temporal, probabilistic, causal.]
Evaluation
Platform: Gerbil, Hobbit
• Benchmark Platform for Linked Data.
https://project-hobbit.eu/
http://aksw.org/Projects/GERBIL.html
Datasets: QA over KB
• SimpleQuestions: 108k manually annotated question-triple pairs
over Freebase.
• WebQuestions: 5810 Wh-Questions (with variations) with 1 entity
generated by Google Suggest API, manually annotated answers
using the Freebase page of the Entity.
• QALD: Evaluation suite for Linked Data.
• Spades: Fill-in-the-blank queries generated from ClueWeb.
Answers contain at least 1 entity connected to Freebase.
• GraphQuestions: 5166 NL/query pairs by manually paraphrasing
500 Freebase Queries to NL.
Datasets: Semantic parsing
• WebQuestionsSP: subset of WebQuestions,
question/semantic parse pairs.
• Overnight: 13682 NL/logical form pairs from 8 different
domains. Logical forms generated and annotated by
humans.
• GeoQuery: 880 NL/DB query pairs about US geography.
• Free917: 917 NL/logical form pairs generated from
Freebase properties.
Datasets: QA over Text
• WikiMovies: >100k Question-Answer pairs
about movies/actors.
• WikiQA: ~1000 Question/Sentence pairs.
Sentences gathered from Wikipedia.
Datasets: Misc
• bAbI: 20 tasks mostly involving QA over synthetic
stories.
• Paraphrases: 18M question-paraphrase pairs gathered
from WikiAnswers.
• FB15k: Knowledge Base completion Dataset derived
from Freebase.
• VQA: >200k images with human-annotated questions &
answers.
• GeoQA: 263 question/answer pairs about geographical
entities and spatial/logical relations.
• ScienceExamQuestions: 2524 4th grade and 675 8th
grade multiple choice questions.
Knowledge Graphs
in Use
KGs in Use
… and many others
Asked Financial Analysts “What ruins your day?”
What they told
“An analyst used to cover 40 firms, now it is
150 and the tools haven’t changed.”
Director, $4B US L/S equity fund
“They all still do it manually”
Equities, Market data team $20B+ hedge fund
“The problem has gotten worse with more
data and more information”
Director, Research/Tech, $10B Multi-strat
Customer ranked pain points
1. Information overload
2. Understanding relationships
3. Unable to track impact events
4. Cannot link internal & external research, especially text
[Chart: Analyst workday. Segments: Assemble, Synthesize, Interpret, Meetings, Financial Modeling, Communicate. Values: 20%, 15%, 20%, 10%, 15%, 20%.]
Source: Customer meetings, TR internal analyst survey
Geoff Horrell & Dan Bennett
What’s in the Graph?
• Organizations – including names, address, identifiers, Country of
HQ/Incorp
• Industry Classification
• Hierarchy – Parent, Ultimate Parent, Affiliates
• Officers & Directors
• Job History, Education
• Suppliers & Customers
• Comparable Companies
• Joint Ventures & Strategic Alliances
• Meta-Data
• 125,000,000 equity instruments & quotes.
• 75,000,000 meta-data objects (countries, regions, cities,
currencies, commodities, holidays, industry schemes,
scripts, languages, time zones, value domains, units,
identifiers).
• 200,000,000 strategic relationships (supplier, customer,
competitor, joint venture, alliance, industry, ownership,
affiliate).
All Connections between
Onshore Oil Drilling & Venezuela
One Week Snapshot
• 6,778 news articles with company news where at least one
organization has 80% relevance to the article.
• 135,267 companies are 2 steps away.
• 217,387 strategic relationships.
• Typical analyst portfolio is 200 companies.
• Each customer creates their own relative weights for each
type of relationship.
• Requires around 800,000 shortest path calculations to
deliver the ranked news feed. Each calculation optimised to
take 10ms.
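The ranking step can be sketched as a shortest-path computation in which each relationship type carries a customer-specific cost, followed by the arithmetic implied by the figures above. This is an illustrative sketch only: the company names, the weights and the assumption that weights act as additive edge costs (lower meaning a stronger connection) are invented, and the source does not describe the production algorithm.

```python
import heapq


def shortest_path_cost(graph, weights, source, target):
    """Dijkstra over (neighbour, relationship_type) edges with per-type costs."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, rel in graph.get(node, []):
            new_cost = cost + weights[rel]
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(heap, (new_cost, neighbour))
    return None  # target not reachable


# Hypothetical graph: a portfolio company and a company mentioned in the news.
graph = {
    "PortfolioCo": [("SupplierCo", "supplier"), ("RivalCo", "competitor")],
    "SupplierCo": [("NewsCo", "customer")],
    "RivalCo": [("NewsCo", "joint venture")],
}
# Hypothetical per-customer weights for each relationship type.
weights = {"supplier": 1.0, "customer": 1.0, "competitor": 3.0, "joint venture": 0.5}

print(shortest_path_cost(graph, weights, "PortfolioCo", "NewsCo"))  # 2.0

# Cost of the weekly feed if the calculations ran one after another.
calculations = 800_000
seconds = calculations * 10 / 1000   # 10 ms each
print(seconds)                        # 8000.0 seconds
print(round(seconds / 3600, 2))       # 2.22 hours
```

At 10 ms per calculation the full set takes roughly 2.2 hours of sequential compute, which gives a sense of why each path calculation had to be optimised.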
Bordea & Buitelaar
journalist expertise
Explainable Findings
From Tensor Inferences Back to KGs
Take-away Message
• The evolution of methods, tools and the availability of data in
NLP creates the demand for a knowledge representation method
to support complex AI systems.
• A relaxed version of RDF (RDF-NL?) can provide this answer.
– Establishes a dialogue with a standard (with existing data).
– Inherits optimization aspects from Databases.
• Word-embeddings (DSMs) + compositional models + RDF.
• Moving beyond facts and taxonomies: rhetorical structures,
arguments, polarity, stories, pragmatics.
Take-away Message
• Syntactical and lexical features can go a long way for
structuring text.
– Context-preserving.
• Integration (entity reconciliation) as semantic best-effort.
– Embrace schema on read.
• KGs can support explainable AI:
– Meeting point between extraction, reasoning and querying.
– Definition-based models.
• Inherit infrastructures from DB and IR.
Take-away Message
Opportunities:
• ML orchestrated pipelines with:
– Richer discourse-representation models.
– Explicit semantic representations (centered on KGs).
– Different compositional/distributional models (beyond W2V & GloVe).
• KGs and impact on explainability.
• Quantifying domain and language transportability.
References
Bordes & Gabrilovich, Constructing and Mining Web-scale Knowledge Graphs, KDD 2014 Tutorial.
Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015.
Mottin & Lissandrini, New Trends on Exploratory Methods for Data Analytics, VLDB 2017 Tutorial.
Ren, Su & Yan, Construction and Querying of Large-scale Knowledge Bases, CIKM 2017 Tutorial.
Gottschalk & Demidova, EventKG: A Multilingual Event-Centric Temporal Knowledge Graph. Proc. of
the Extended Semantic Web Conference (ESWC 2018).
Gottschalk & Demidova, EventKG+TL: Creating Cross-Lingual Timelines from an Event-Centric
Knowledge Graph. Proc. of the ESWC 2018 Satellite Events.
Bennett, Building a Knowledge Graph,
https://www.slideshare.net/DanBennett47/building-a-knowledge-graph-86792821
Geoff Horrell, Thomson Reuters Knowledge Graph Feed
https://www.youtube.com/watch?v=MzGkfIfSrko

Contenu connexe

Tendances

Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceANOOP V S
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data ArchitectureGuido Schmutz
 
Speeding Time to Insight with a Modern ELT Approach
Speeding Time to Insight with a Modern ELT ApproachSpeeding Time to Insight with a Modern ELT Approach
Speeding Time to Insight with a Modern ELT ApproachDatabricks
 
Enterprise Knowledge Graph
Enterprise Knowledge GraphEnterprise Knowledge Graph
Enterprise Knowledge GraphLukas Masuch
 
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...Simplilearn
 
Career Prospects and Scope of Data Science in India
Career Prospects and Scope of Data Science in IndiaCareer Prospects and Scope of Data Science in India
Career Prospects and Scope of Data Science in Indiaachaljain11
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks DeltaDatabricks
 
Generative AI and law.pptx
Generative AI and law.pptxGenerative AI and law.pptx
Generative AI and law.pptxChris Marsden
 
Knowledge Graph Introduction
Knowledge Graph IntroductionKnowledge Graph Introduction
Knowledge Graph IntroductionSören Auer
 
Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)Yaman Hajja, Ph.D.
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshJeffrey T. Pollock
 
Data Catalog as a Business Enabler
Data Catalog as a Business EnablerData Catalog as a Business Enabler
Data Catalog as a Business EnablerSrinivasan Sankar
 
Introduction to Knowledge Graphs
Introduction to Knowledge GraphsIntroduction to Knowledge Graphs
Introduction to Knowledge Graphsmukuljoshi
 
Data science and Artificial Intelligence
Data science and Artificial IntelligenceData science and Artificial Intelligence
Data science and Artificial IntelligenceSuman Srinivasan
 
Getting Started with Knowledge Graphs
Getting Started with Knowledge GraphsGetting Started with Knowledge Graphs
Getting Started with Knowledge GraphsPeter Haase
 
Introduction to Looker Studio.pptx
Introduction to Looker Studio.pptxIntroduction to Looker Studio.pptx
Introduction to Looker Studio.pptxNirzar Bhaidkar
 
The Power of Generative AI in Accelerating No Code Adoption.pdf
The Power of Generative AI in Accelerating No Code Adoption.pdfThe Power of Generative AI in Accelerating No Code Adoption.pdf
The Power of Generative AI in Accelerating No Code Adoption.pdfSaeed Al Dhaheri
 

Tendances (20)

Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
 
Big data
Big dataBig data
Big data
 
Enterprise Knowledge Graph
Enterprise Knowledge GraphEnterprise Knowledge Graph
Enterprise Knowledge Graph
 
Speeding Time to Insight with a Modern ELT Approach
Speeding Time to Insight with a Modern ELT ApproachSpeeding Time to Insight with a Modern ELT Approach
Speeding Time to Insight with a Modern ELT Approach
 
Enterprise Knowledge Graph
Enterprise Knowledge GraphEnterprise Knowledge Graph
Enterprise Knowledge Graph
 
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
 
Career Prospects and Scope of Data Science in India
Career Prospects and Scope of Data Science in IndiaCareer Prospects and Scope of Data Science in India
Career Prospects and Scope of Data Science in India
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks Delta
 
Generative AI and law.pptx
Generative AI and law.pptxGenerative AI and law.pptx
Generative AI and law.pptx
 
Knowledge Graph Introduction
Knowledge Graph IntroductionKnowledge Graph Introduction
Knowledge Graph Introduction
 
Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to Mesh
 
Data Catalog as a Business Enabler
Data Catalog as a Business EnablerData Catalog as a Business Enabler
Data Catalog as a Business Enabler
 
Introduction to Knowledge Graphs
Introduction to Knowledge GraphsIntroduction to Knowledge Graphs
Introduction to Knowledge Graphs
 
Data science and Artificial Intelligence
Data science and Artificial IntelligenceData science and Artificial Intelligence
Data science and Artificial Intelligence
 
Getting Started with Knowledge Graphs
Getting Started with Knowledge GraphsGetting Started with Knowledge Graphs
Getting Started with Knowledge Graphs
 
Tara Raafat
Tara RaafatTara Raafat
Tara Raafat
 
Introduction to Looker Studio.pptx
Introduction to Looker Studio.pptxIntroduction to Looker Studio.pptx
Introduction to Looker Studio.pptx
 
The Power of Generative AI in Accelerating No Code Adoption.pdf
The Power of Generative AI in Accelerating No Code Adoption.pdfThe Power of Generative AI in Accelerating No Code Adoption.pdf
The Power of Generative AI in Accelerating No Code Adoption.pdf
 

Similaire à Building AI Applications using Knowledge Graphs

Open IE tutorial 2018
Open IE tutorial 2018Open IE tutorial 2018
Open IE tutorial 2018Andre Freitas
 
Effective Semantics for Engineering NLP Systems
Effective Semantics for Engineering NLP SystemsEffective Semantics for Engineering NLP Systems
Effective Semantics for Engineering NLP SystemsAndre Freitas
 
AI Beyond Deep Learning
AI Beyond Deep LearningAI Beyond Deep Learning
AI Beyond Deep LearningAndre Freitas
 
Fueling the future with Semantic Web patterns - Keynote at WOP2014@ISWC
Fueling the future with Semantic Web patterns - Keynote at WOP2014@ISWCFueling the future with Semantic Web patterns - Keynote at WOP2014@ISWC
Fueling the future with Semantic Web patterns - Keynote at WOP2014@ISWCValentina Presutti
 
Knowledge Representation on the Web
Knowledge Representation on the WebKnowledge Representation on the Web
Knowledge Representation on the WebRinke Hoekstra
 
Different Semantic Perspectives for Question Answering Systems
Different Semantic Perspectives for Question Answering SystemsDifferent Semantic Perspectives for Question Answering Systems
Different Semantic Perspectives for Question Answering SystemsAndre Freitas
 
C++ plus data structures, 3rd edition (2003)
C++ plus data structures, 3rd edition (2003)C++ plus data structures, 3rd edition (2003)
C++ plus data structures, 3rd edition (2003)SHC
 
Semantic Perspectives for Contemporary Question Answering Systems
Semantic Perspectives for Contemporary Question Answering SystemsSemantic Perspectives for Contemporary Question Answering Systems
Semantic Perspectives for Contemporary Question Answering SystemsAndre Freitas
 
ArXiv Literature Exploration using Social Network Analysis
ArXiv Literature Exploration using Social Network AnalysisArXiv Literature Exploration using Social Network Analysis
ArXiv Literature Exploration using Social Network AnalysisTanat Iempreedee
 
GraphTour 2020 - Graphs & AI: A Path for Data Science
GraphTour 2020 - Graphs & AI: A Path for Data ScienceGraphTour 2020 - Graphs & AI: A Path for Data Science
GraphTour 2020 - Graphs & AI: A Path for Data ScienceNeo4j
 
A Knowledge Discovery Framework for Planetary Defense
A Knowledge Discovery Framework for Planetary DefenseA Knowledge Discovery Framework for Planetary Defense
A Knowledge Discovery Framework for Planetary DefenseYongyao Jiang
 
Should we be afraid of Transformers?
Should we be afraid of Transformers?Should we be afraid of Transformers?
Should we be afraid of Transformers?Dominik Seisser
 
Semantic Data Retrieval: Search, Ranking, and Summarization
Semantic Data Retrieval: Search, Ranking, and SummarizationSemantic Data Retrieval: Search, Ranking, and Summarization
Semantic Data Retrieval: Search, Ranking, and SummarizationGong Cheng
 
Construction and Querying of Dynamic Knowledge Graphs
Construction and Querying of Dynamic Knowledge GraphsConstruction and Querying of Dynamic Knowledge Graphs
Construction and Querying of Dynamic Knowledge GraphsSutanay Choudhury
 
Knowledge Graph Research and Innovation Challenges
Knowledge Graph Research and Innovation ChallengesKnowledge Graph Research and Innovation Challenges
Knowledge Graph Research and Innovation ChallengesSören Auer
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesMatthew Lease
 
NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...
NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...
NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...Maryam Farooq
 
ESWC 2011 BLOOMS+
ESWC 2011 BLOOMS+ ESWC 2011 BLOOMS+
ESWC 2011 BLOOMS+ Prateek Jain
 

Similaire à Building AI Applications using Knowledge Graphs (20)

Open IE tutorial 2018
Open IE tutorial 2018Open IE tutorial 2018
Open IE tutorial 2018
 
Effective Semantics for Engineering NLP Systems
Effective Semantics for Engineering NLP SystemsEffective Semantics for Engineering NLP Systems
Effective Semantics for Engineering NLP Systems
 
AI Beyond Deep Learning
AI Beyond Deep LearningAI Beyond Deep Learning
AI Beyond Deep Learning
 
Fueling the future with Semantic Web patterns - Keynote at WOP2014@ISWC
Fueling the future with Semantic Web patterns - Keynote at WOP2014@ISWCFueling the future with Semantic Web patterns - Keynote at WOP2014@ISWC
Fueling the future with Semantic Web patterns - Keynote at WOP2014@ISWC
 
Knowledge Representation on the Web
Knowledge Representation on the WebKnowledge Representation on the Web
Knowledge Representation on the Web
 
Different Semantic Perspectives for Question Answering Systems
Different Semantic Perspectives for Question Answering SystemsDifferent Semantic Perspectives for Question Answering Systems
Different Semantic Perspectives for Question Answering Systems
 
C++ plus data structures, 3rd edition (2003)
C++ plus data structures, 3rd edition (2003)C++ plus data structures, 3rd edition (2003)
C++ plus data structures, 3rd edition (2003)
 
Semantic Perspectives for Contemporary Question Answering Systems
Semantic Perspectives for Contemporary Question Answering SystemsSemantic Perspectives for Contemporary Question Answering Systems
Semantic Perspectives for Contemporary Question Answering Systems
 
ArXiv Literature Exploration using Social Network Analysis
ArXiv Literature Exploration using Social Network AnalysisArXiv Literature Exploration using Social Network Analysis
ArXiv Literature Exploration using Social Network Analysis
 
GraphTour 2020 - Graphs & AI: A Path for Data Science
GraphTour 2020 - Graphs & AI: A Path for Data ScienceGraphTour 2020 - Graphs & AI: A Path for Data Science
GraphTour 2020 - Graphs & AI: A Path for Data Science
 
A Knowledge Discovery Framework for Planetary Defense
A Knowledge Discovery Framework for Planetary DefenseA Knowledge Discovery Framework for Planetary Defense
A Knowledge Discovery Framework for Planetary Defense
 
Should we be afraid of Transformers?
Should we be afraid of Transformers?Should we be afraid of Transformers?
Should we be afraid of Transformers?
 
Semantic Data Retrieval: Search, Ranking, and Summarization
Semantic Data Retrieval: Search, Ranking, and SummarizationSemantic Data Retrieval: Search, Ranking, and Summarization
Semantic Data Retrieval: Search, Ranking, and Summarization
 
Text Analytics - JCC2014 Kimelfeld
Text Analytics - JCC2014 KimelfeldText Analytics - JCC2014 Kimelfeld
Text Analytics - JCC2014 Kimelfeld
 
Construction and Querying of Dynamic Knowledge Graphs
Construction and Querying of Dynamic Knowledge GraphsConstruction and Querying of Dynamic Knowledge Graphs
Construction and Querying of Dynamic Knowledge Graphs
 
Knowledge Graph Research and Innovation Challenges
Knowledge Graph Research and Innovation ChallengesKnowledge Graph Research and Innovation Challenges
Knowledge Graph Research and Innovation Challenges
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
 
NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...
NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...
NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...
 
Our World is Socio-technical
Our World is Socio-technicalOur World is Socio-technical
Our World is Socio-technical
 
ESWC 2011 BLOOMS+
ESWC 2011 BLOOMS+ ESWC 2011 BLOOMS+
ESWC 2011 BLOOMS+
 

Building AI Applications using Knowledge Graphs

  • 1. André Freitas Viktor Schlegel Building AI Applications using Knowledge Graphs TWC / WWW 2018 Lyon
  • 2. Organisation • Goals of this Tutorial: – Provide a broad view of the multiple perspectives underlying knowledge graphs. – Show knowledge graphs as a foundation for building AI systems. • Method: – Focus on the contemporary and emerging perspectives. – Sampling exemplar approaches and infrastructures on each of these emerging perspectives (not an exhaustive survey).
  • 3. Disclaimer • Not a standard academic tutorial. • Big picture, not a survey. • Focuses on principles and systems. • Biased!
  • 4. Other Tutorials On KGs (Complementary perspective) Bordes & Gabrilovich, Constructing and Mining Web-scale Knowledge Graphs, KDD 2014 Tutorial. Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015. Mottin & Lissandrini, New Trends on Exploratory Methods for Data Analytics, VLDB 2017 Tutorial. Ren, Su & Yan, Construction and Querying of Large-scale Knowledge Bases, CIKM 2017 Tutorial.
  • 6. Outline 1. What is a Knowledge Graph? 2. Building a Knowledge Graph 3. Querying Knowledge Graphs 4. Inferences over Knowledge Graphs 5. Uses of Knowledge Graphs
  • 7. What is a Knowledge Graph?
  • 8. Some Perspectives on “What” “The Knowledge Graph is a knowledge base used by Google to enhance its search engine's search results with semantic-search information gathered from a wide variety of sources.” “A Knowledge graph (i) mainly describes real world entities and interrelations, organized in a graph (ii) defines possible classes and relations of entities in a schema (iii) allows potentially interrelating arbitrary entities with each other…” [Paulheim H.] “We defines a Knowledge Graph as an RDF graph consists of a set of RDF triples where each RDF triple (s,p,o) is an ordered set of following RDF term ….” [Pujara J. et al.] KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
  • 9. Some Perspectives on “What” • Open world representation of information. • Every entry point is equal cost. • Underpin Cortana, Google Assistant, Siri, Alexa. • Typically (but doesn’t have to be) expressed in RDF. • No longer a solution in search of a problem! Dan Bennett, Thomson Reuters
  • 10. Defining KG by Example KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
  • 11. Some Perspectives on “Why” • “Knowledge is Power” Hypothesis (the Knowledge Principle): “If a program is to perform a complex task well, it must know a great deal about the world in which it operates.” • The Breadth Hypothesis: “To behave intelligently in unexpected situations, an agent must be capable of falling back on increasingly general knowledge.” KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
  • 12. Some Perspectives on “Why” • We’re surrounded by entities, which are connected by relations. • We need to store them somehow, e.g., using a DB or a graph. • Graphs can be processed efficiently and offer a convenient abstraction. KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
  • 13. Some Perspectives on “Why” • Knowledge models such as Linked Data and many problems in machine learning have a natural representation as relational data. • Relations between entities are often more important for a prediction task than attributes. • For instance, it can be easier to predict the party of a vice-president from the party of their president than from the vice-president's own attributes. [Koopman, 2010]
  • 14. The Data Management Perspective. Schema on Write: • Fixed data model • Slow to change • Strong enforcement. Schema on Read: • Capture everything • Apply logic (schema) on read • No standards. Dan Bennett, Thomson Reuters
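The contrast on slide 14 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SCHEMA` fields, the helper names `write_strict` and `read_with_schema`, and the sample record are all hypothetical and do not come from the tutorial.

```python
# Hypothetical schema: field name -> expected Python type (illustrative only).
SCHEMA = {"name": str, "founded": int}


def write_strict(table, record):
    """Schema on write: reject any record that does not match the fixed model."""
    if set(record) != set(SCHEMA):
        raise ValueError(f"unexpected or missing fields: {sorted(record)}")
    for field, expected in SCHEMA.items():
        if not isinstance(record[field], expected):
            raise ValueError(f"{field} must be {expected.__name__}")
    table.append(record)


def read_with_schema(store, schema):
    """Schema on read: keep raw records, project and coerce them when reading."""
    for raw in store:
        yield {
            field: (cast(raw[field]) if field in raw else None)
            for field, cast in schema.items()
        }


# Schema on read captures everything as-is, including fields nobody planned for.
raw_store = [{"name": "Acme", "founded": "1999", "hq": "Springfield"}]
```

The strict writer would refuse the record in `raw_store` (extra field, wrong type), while the reader accepts it and applies the schema only at query time.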
  • 15. From Closed to Open Communication
  • 17. Intuitions on the Connection Between KGs and AI
  • 18. AI Archetypal Problem • Question Answering: Develop an algorithmic approach which answers natural language queries no matter how the data or the question are expressed in natural language.
  • 21. From Text to Structure
  • 31. Now we can answer this query
  • 40. Breaking away from the linear order imposed by the medium
  • 41. What did we get? • By integrating terms (mapping to a canonical form) we reduced complexity. – ‘Atomization’ of knowledge allows integration. • Entities, attributes and relationships are now explicit and interconnected. – Predicate-argument structures. • Ability to focus (“query”) and operate on specific sets and relations. • Entities are organized into hierarchies of sets. – With that we can express knowledge at different abstraction levels (make generalizations).
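A minimal Python sketch of the points on slide 41: surface forms are mapped to a canonical identifier, facts are stored as explicit triples, patterns can be queried, and a class hierarchy supports generalisation. The `TinyKG` class, the `CANONICAL` table and the relation names `is_a` and `subclass_of` are hypothetical choices made for this illustration, not part of the tutorial.

```python
from collections import defaultdict

# Hypothetical surface-form -> canonical-identifier table (illustrative only).
CANONICAL = {
    "JFK": "John_F_Kennedy",
    "John Kennedy": "John_F_Kennedy",
    "President Kennedy": "John_F_Kennedy",
}


def canon(term):
    """Map a surface form to its canonical identifier, if one is known."""
    return CANONICAL.get(term, term)


class TinyKG:
    def __init__(self):
        self.triples = set()
        self.superclasses = defaultdict(set)  # class -> direct superclasses

    def add(self, s, p, o):
        s, o = canon(s), canon(o)
        if p == "subclass_of":
            self.superclasses[s].add(o)
        self.triples.add((s, p, o))

    def query(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        s = canon(s) if s is not None else None
        o = canon(o) if o is not None else None
        return sorted(
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )

    def ancestors(self, cls):
        """All superclasses reachable from cls (transitive closure)."""
        seen, stack = set(), [cls]
        while stack:
            for parent in self.superclasses.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def instances_of(self, cls):
        """Entities typed with cls or with any of its subclasses."""
        return sorted(
            s for (s, p, o) in self.triples
            if p == "is_a" and (o == cls or cls in self.ancestors(o))
        )


kg = TinyKG()
kg.add("JFK", "is_a", "Politician")
kg.add("John Kennedy", "born_in", "Brookline")
kg.add("Politician", "subclass_of", "Person")
```

Because both surface forms resolve to the same identifier, `kg.query(s="President Kennedy")` returns both facts, and `kg.instances_of("Person")` finds the entity through the hierarchy.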
  • 43. “On our best behaviour” “We need to return to our roots in Knowledge Representation and Reasoning for language and from language.” “We should not treat English text as a monolithic source of information. Instead, we should carefully study how simple knowledge bases might be used to make sense of the simple language needed to build slightly more complex knowledge bases, and so on.” Levesque, 2013
  • 44. Open Information Extraction • Extracting unstructured facts from text. • TextRunner [Banko et al., IJCAI ’07], WOE [Wu & Weld, ACL ’10]. • ReVerb [Fader et al., EMNLP ’11]. • OLLIE [Mausam et al., EMNLP ’12]. • OpenIE [Mausam et al., IJCAI ’16]. • Graphene [Niklaus et al., COLING ’17].
  • 45. Graphene • Captures contextual relations. • Extends the default Open IE representation in order to capture inter-proposition relationships. • Increases the informativeness and expressiveness of the extracted tuples. Niklaus et al., A Sentence Simplification System for Improving Relation Extraction, COLING (2017)
  • 56. Input sentence: “Asian stocks fell anew and the yen rose to session highs in the afternoon as worries about North Korea simmered, after a senior Pyongyang official said the U.S. is becoming ``more vicious and more aggressive'' under President Donald Trump.” [Figure: extracted propositions “Asian stocks fell anew”, “The yen rose to session highs in the afternoon”, “Worries simmered about North Korea”, “A senior Pyongyang official said” and “The U.S. is becoming ``more vicious and more aggressive'' under Donald Trump”, connected by the relation labels spatial, attribution, after, background and and.]
  • 59. What to expect (Wikipedia & Newswire) [Figure: precision and recall results] Improving Open Relation Extraction using Clausal and Phrasal Disembedding, Under Review (2017)
  • 60. Software: Extracting Knowledge Graphs from Text https://github.com/Lambda-3/Graphene Niklaus et al., A Sentence Simplification System for Improving Relation Extraction, COLING (2017)
  • 61. Software: OpenIE 4.0 https://github.com/knowitall/openie/ Harinder & Mausam. "Demonyms and Compound Relational Nouns in Nominal Open IE". Workshop on Automated Knowledge Base Construction (AKBC) at NAACL. San Diego, CA, USA. June 2016. Mausam. "Open Information Extraction Systems and Downstream Applications". Invited Paper for Early Career Spotlight Track. International Joint Conference on Artificial Intelligence (IJCAI). New York, NY. July 2016.
  • 62. Frame-based Extraction (Ontology Grounded) Background theories: • Combinatory Categorial Grammar [C&C], • Discourse Representation Theory [DRT, Boxer], • Frame Semantics [Fillmore 1976], • Ontology Design Patterns [Ontology Handbook]. Frameworks: • Named Entity Resolution [Stanbol, TagMe], • Coreference Resolution [CoreNLP], • Word Sense Disambiguation [Boxer, IMS]. Gangemi et al., Semantic Web Machine Reading with FRED, Semantic Web Journal, 2017
  • 63. Frame-based Extraction (Ontology Grounded) • N-ary relation and event extraction by using frame detection • Relation extraction between frames, events, concepts and entities • Negation representation • Modality representation • Adjective semantics • Temporal relation extraction from tense expressions • Semantic annotation of text fragments • Coreference resolution • Type and taxonomy induction • Incremental role propagation • Entity linking to Semantic Web data • Word-sense disambiguation • Pattern-based subgraph extraction • Named-graph generation. Gangemi et al., Semantic Web Machine Reading with FRED, Semantic Web Journal, 2017
  • 64. Frame-based Extraction. Example input: “The New York Times reported that John McCarthy died. He invented the programming language LISP.” Gangemi et al., Semantic Web Machine Reading with FRED, Semantic Web Journal, 2017
  • 66. Software: FRED http://wit.istc.cnr.it/stlab-tools/fred/ Gangemi et al., Semantic Web Machine Reading with FRED, Semantic Web Journal, 2017
  • 67. Argumentation Structures Stab & Gurevych, Parsing Argumentation Structures in Persuasive Essays, 2016.
  • 70. Argumentation & Rhetorical Relations • Support: background, cause, evidence, justify, list, motivation, reason, restatement, result. • Rebuttal: antithesis, contrast, unless. • Undercut: concession.
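The grouping on slide 70 amounts to a lookup from rhetorical relation to argumentative role. A small sketch, where the names `ARGUMENT_ROLE` and `argument_role` are hypothetical and only the relation lists come from the slide:

```python
# Relation lists copied from slide 70; the dictionary layout is an assumption.
ARGUMENT_ROLE = {
    "support": [
        "background", "cause", "evidence", "justify", "list",
        "motivation", "reason", "restatement", "result",
    ],
    "rebuttal": ["antithesis", "contrast", "unless"],
    "undercut": ["concession"],
}

# Inverted index: rhetorical relation -> argumentative role.
RELATION_TO_ROLE = {
    relation: role
    for role, relations in ARGUMENT_ROLE.items()
    for relation in relations
}


def argument_role(rhetorical_relation):
    """Return the argumentative role of a rhetorical relation, or None if unlisted."""
    return RELATION_TO_ROLE.get(rhetorical_relation.lower())
```

A relation that the slide does not list simply yields `None`, which leaves the decision about unlisted relations to the caller.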
  • 71. Argumentation Schemes “Argumentation Schemes are forms of argument (structures of inference) that represent structures of common types of arguments used in everyday discourse, as well as in special contexts like those of legal argumentation and scientific argumentation.” Douglas Walton
  • 76. Argument Mining Approaches What to expect? F1-score: 0.74 Stab & Gurevych, Parsing Argumentation Structures in Persuasive Essays, 2016.
  • 77. Software: Argumentation Mining https://www.ukp.tu-darmstadt.de/data/argumentation-mining/argument-annotated-essays/ Stab & Gurevych, Parsing Argumentation Structures in Persuasive Essays, 2016.
  • 79. Term Extraction – ACL Anthology
  • 81. Taxonomy Extraction Approach Global Generality Georgeta Bordea (2013) Domain adaptive extraction of topical hierarchies for Expertise Mining. PhD Thesis, National University of Ireland, Galway
  • 84. Semantic Roles for Lexical Definitions Aristotle’s classic theory of definition introduced important aspects such as the genus-differentia definition pattern and the essential/non-essential property differentiation.
  • 86. Data: WordNetGraph Silva et al., Categorization of Semantic Roles for Dictionary Definitions. Cognitive Aspects of the Lexicon CogALex@COLING, 2017. https://github.com/Lambda-3/WordnetGraph RDF graph generated from WordNet.
  • 88. EventKG: A Multilingual Event-Centric Temporal Knowledge Graph. Simon Gottschalk, Elena Demidova. ESWC 2018. EventKG is an open knowledge graph containing event-centric information: http://eventkg.l3s.uni-hannover.de/
      - EventKG V1.1: 690K events and 2.3M temporal relations
      - 2014 FIFA World Cup
      - “The Space Shuttle Challenger is launched on its maiden voyage”
      - <Jennifer Aniston, married to, Brad Pitt, [2000-07-29,2005-10-02]>
      - Extracted from Wikidata, YAGO, DBpedia and Wikipedia
      - Integrated data in five languages: EN, FR, DE, RU, PT
      - Provides provenance information
      - High coverage of event times and locations due to integration:

                               EventKG   Wikidata   DBpedia (en)
        Events with Time       50.82%    33.00%     7.00%
        Events with Location   26.13%    11.70%     6.21%
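The temporal relation example on slide 88 attaches a validity interval to a triple. A minimal sketch of that idea follows; the `TemporalRelation` class and its `holds_on` method are hypothetical names for this illustration and do not reflect EventKG's actual data model, which builds on the Simple Event Model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TemporalRelation:
    """A triple plus the closed interval during which it holds."""
    subject: str
    predicate: str
    obj: str
    start: date
    end: date

    def holds_on(self, day):
        """True if the relation is valid on the given day (bounds inclusive)."""
        return self.start <= day <= self.end


# The example relation from the slide.
marriage = TemporalRelation(
    "Jennifer Aniston", "married to", "Brad Pitt",
    date(2000, 7, 29), date(2005, 10, 2),
)
```

With the interval stored explicitly, a question such as "was this relation valid in 2003?" becomes `marriage.holds_on(date(2003, 1, 1))`.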
  • 89. EventKG: A Multilingual Event-Centric Temporal Knowledge Graph. Simon Gottschalk, Elena Demidova. ESWC 2018. Which science-related events took place in Lyon? - 1921: “À Lyon, fusion de la Société de médecine et de la Société des sciences médicales” (in English: “In Lyon, merger of the Société de médecine and the Société des sciences médicales”)
      SELECT DISTINCT ?description {
        ?event rdf:type sem:Event .
        ?relation rdf:object ?lyon .
        ?relation rdf:subject ?event .
        ?event dcterms:description ?description .
        FILTER regex(?description, "science", "i") .
        ?lyon owl:sameAs dbr:Lyon .
      }
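The query on slide 89 is shown without prefix declarations, so it cannot be sent to an endpoint as-is. The sketch below adds the standard namespace IRIs for the five prefixes and URL-encodes the result as a `query` parameter, the usual form for a SPARQL GET request. The endpoint `http://example.org/sparql` is a placeholder assumption, since the slide gives only the project home page; the helper names are hypothetical, and nothing is actually sent over the network.

```python
from urllib.parse import urlencode

# Standard namespace IRIs for the prefixes used in the slide's query.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "dcterms": "http://purl.org/dc/terms/",
    "sem": "http://semanticweb.cs.vu.nl/2009/11/sem/",
    "dbr": "http://dbpedia.org/resource/",
}

# Query body as on the slide, with the optional WHERE keyword written out.
QUERY_BODY = """SELECT DISTINCT ?description WHERE {
  ?event rdf:type sem:Event .
  ?relation rdf:object ?lyon .
  ?relation rdf:subject ?event .
  ?event dcterms:description ?description .
  FILTER regex(?description, "science", "i") .
  ?lyon owl:sameAs dbr:Lyon .
}"""


def build_query(prefixes, body):
    """Prepend one PREFIX declaration per namespace to the query body."""
    header = "\n".join(f"PREFIX {name}: <{iri}>" for name, iri in prefixes.items())
    return header + "\n" + body


def build_request_url(endpoint, query):
    """Encode the query as the 'query' parameter of a GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


full_query = build_query(PREFIXES, QUERY_BODY)
# Placeholder endpoint: replace with the real SPARQL endpoint before use.
request_url = build_request_url("http://example.org/sparql", full_query)
```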
  • 90. EventKG: A Multilingual Event-Centric Temporal Knowledge Graph Simon Gottschalk, Elena Demidova. ESWC 2018 • EventKG builds upon and extends the Simple Event Model (SEM). • Example: The participation of Barack Obama in his second inauguration as US president in 2013 in EventKG.
  • 91. EventKG+TL: Creating Cross-Lingual Timelines from an Event-Centric Knowledge Graph. Simon Gottschalk, Elena Demidova. ESWC 2018 - An overview of events related to a query entity over a time period across languages using EventKG - Example: Brexit-related events. The size of each pie chart shows the overall (i.e. language-independent) event relevance. The colored slices show the ratio of the relevance in each language context.
  • 92. Emerging perspectives • The evolution of parsing and classification methods in NLP is inducing a new lightweight semantic representation. • This representation is in dialogue with elements from logic, linguistics and the Semantic/Linked Data Web (especially RDF). • However, it relaxes the semantic constraints of previous models (which operated under the assumptions of deductive reasoning or databases).
  • 93. Emerging perspectives • Knowledge graphs as lexical semantic models operating under a semantic best-effort mode (canonical identifiers when possible, otherwise, words). • Possibly closer to the surface form of the text. • Priority is on segmenting, categorizing and when possible, integrating. • A representation (data model) convenient for AI engineering.
  • 94. Categorization. A fact (main clause): (s, p, o), where s is a term or URI (instance, class, triple), p is a term or URI (type, property, schema property) and o is a term or URI (instance, class, triple). * Can be a taxonomic fact.
  • 95. Categorization. A fact with a context (reification): the fact (s0, p0, o0) is further qualified by (p1, o1), e.g. • subordination (modality, temporality, spatiality, RSTs) • fact probability • polarity
  • 96. Categorization. Coordinated facts: (s0, p0, o0) and (s1, p1, o1) linked by p2, e.g. • coordination • RSTs • ADU
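The three categories on slides 94 to 96 can be expressed with one recursive structure: a fact whose subject or object may itself be a fact. The sketch below is one possible reading of the slides; the `Fact` class, the `atomic_facts` helper, the predicate names `probability` and `and`, and the value `0.9` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Fact:
    """A triple whose subject and object are plain terms/URIs or nested facts."""
    s: "Node"
    p: str
    o: "Node"


Node = Union[str, Fact]


def atomic_facts(node):
    """Yield the innermost facts, i.e. those whose subject and object are plain terms."""
    if isinstance(node, Fact):
        nested = [x for x in (node.s, node.o) if isinstance(x, Fact)]
        if not nested:
            yield node
        for inner in nested:
            yield from atomic_facts(inner)


# A plain fact (slide 94) and a taxonomic fact.
nominated = Fact("Barack Obama", "nominated", "Sonia Sotomayor")
taxonomic = Fact("Sonia Sotomayor", "is_a", "Supreme Court Justice")

# A fact with a context (slide 95): the fact itself is the subject of (p1, o1).
with_context = Fact(nominated, "probability", "0.9")

# Coordinated facts (slide 96): two facts linked by a third predicate p2.
coordinated = Fact(nominated, "and", taxonomic)
```

Keeping a single node type means any traversal written for plain triples, such as `atomic_facts`, also works on contextualised and coordinated facts.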
  • 97. Knowledge Graphs & Distributional Semantics (A marriage made in heaven?)
  • 99. Distributional Semantic Models • Computational models that build contextual semantic representations from corpus data. • Semantic context is represented by a vector. • Vectors are obtained through the statistical analysis of the linguistic contexts of a word. • Salience of contexts (cf. context weighting scheme). • Semantic similarity/relatedness as the core operation over the model.
  • 100. Distributional Semantic Models • Semantic model with low acquisition effort (automatically built from text): simplification of the representation. • Enables the construction of comprehensive commonsense/semantic KBs. • What is the cost? Some level of noise (semantic best-effort) and a limited semantic model.
  • 101. Distributional Semantics as Commonsense Knowledge Commonsense is here θ car dog cat bark run leash Semantic Approximation is here Semantic Model with low acquisition effort
  • 102. Context Weighting Measures Kiela & Clark, 2014 Similarity Measures x … and of course, Glove and W2V
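As a rough illustration of the pipeline described on the preceding slides (context counting, context weighting, similarity), the sketch below builds co-occurrence vectors from a toy corpus, weights them with positive PMI and compares words with cosine similarity. The corpus, the window size and the choice of PPMI are assumptions made for the example; they are one option among the weighting and similarity measures surveyed by Kiela & Clark.

```python
import math
from collections import Counter, defaultdict

# Toy corpus (invented for the example).
corpus = [
    "the dog barks on the leash",
    "the dog runs",
    "the cat runs",
    "the car runs on the road",
]


def cooccurrence(sentences, window=2):
    """Count how often each word appears within `window` tokens of another."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    return counts


def ppmi(counts):
    """Context weighting: keep only positive pointwise mutual information."""
    total = sum(sum(ctx.values()) for ctx in counts.values())
    row = {w: sum(ctx.values()) for w, ctx in counts.items()}
    col = Counter()
    for ctx in counts.values():
        col.update(ctx)
    vectors = {}
    for w, ctx in counts.items():
        vectors[w] = {}
        for c, n in ctx.items():
            pmi = math.log((n * total) / (row[w] * col[c]))
            if pmi > 0:
                vectors[w][c] = pmi
    return vectors


def cosine(u, v):
    """Semantic relatedness as the cosine between two sparse vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


vectors = ppmi(cooccurrence(corpus))
dog_cat = cosine(vectors["dog"], vectors["cat"])
dog_car = cosine(vectors["dog"], vectors["car"])
```

On a corpus this small the scores are not meaningful; the point is only the shape of the computation.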
  • 103. Distributional-Relational Networks Distributional Relational Networks, AAAI Symposium (2013). A Compositional-Distributional Semantic Model for Searching Complex Entity Categories, ACL *SEM (2016) Barack Obama Sonia Sotomayor nominated :is_a First Supreme Court Justice of Hispanic descent … LSA, ESA, W2V, GLOVE, … s0 p0 o0
  • 104. The vector space is segmented. Dimensional reduction mechanism! A Distributional Structured Semantic Space for Querying RDF Graph Data, IJSC 2012
  • 108. Building on Word Vector Space Models • But how can we represent the meaning of longer phrases? • By mapping them into the same vector space! the country of my birth the place where I was born
  • 109. Compositionality • The meaning of a complex expression is a function of the meaning of its constituent parts. carnivorous plants digest slowly
  • 110. Compositionality Principles Words in which the meaning is directly determined by their distributional behaviour (e.g., nouns). Words that act as functions transforming the distributional profile of other words (e.g., verbs, adjectives, …).
  • 111. Compositionality Principles • Take the syntactic structure to constitute the backbone guiding the assembly of the semantic representations of phrases. • A correspondence between syntactic categories and distributional objects.
  • 113. Distributional functions as linear transformations • Distributional functions are linear transformations on semantic vector/tensor spaces. • Matrix: First-order, one argument distributional functions. • Used to represent adjectives and adverbs.
  • 114. Example: Adjective + Noun • Adjective = a function from nouns to nouns,
  • 115. Inducing distributional functions from corpus data - Distributional functions are induced from input to output transformation examples - Regression techniques commonly used in machine learning.
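A minimal sketch of the idea on the last three slides: an adjective is a matrix that maps noun vectors to phrase vectors, and that matrix is induced from (noun, adjective+noun) example pairs by regression. Everything concrete here is an assumption for illustration: the 2-dimensional toy vectors, the "true" matrix used to generate the targets, and the use of plain gradient descent on a squared-error loss as the regression technique.

```python
def matvec(matrix, vector):
    """Apply a distributional function (matrix) to a noun vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def fit_adjective(nouns, phrases, steps=500, lr=0.1):
    """Induce the adjective matrix from input/output examples by least squares."""
    dim_out, dim_in = len(phrases[0]), len(nouns[0])
    weights = [[0.0] * dim_in for _ in range(dim_out)]
    for _ in range(steps):
        grad = [[0.0] * dim_in for _ in range(dim_out)]
        for noun, phrase in zip(nouns, phrases):
            predicted = matvec(weights, noun)
            for i in range(dim_out):
                error = predicted[i] - phrase[i]
                for j in range(dim_in):
                    grad[i][j] += 2.0 * error * noun[j]
        for i in range(dim_out):
            for j in range(dim_in):
                weights[i][j] -= lr * grad[i][j]
    return weights


# Toy data: noun vectors and the corresponding observed "adjective noun" vectors.
TRUE_ADJECTIVE = [[1.0, 0.5], [0.0, 2.0]]  # hypothetical, used only to generate targets
nouns = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
phrases = [matvec(TRUE_ADJECTIVE, n) for n in nouns]

learned = fit_adjective(nouns, phrases)
unseen_phrase = matvec(learned, [2.0, 1.0])  # apply the adjective to a new noun
```

In practice the noun and phrase vectors would both come from corpus statistics, and a regularized regression would usually replace the bare gradient descent used here.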
  • 116. How should we map phrases into a vector space? Recursive Neural Networks
  • 117. Compositional-distributional model for paraphrases A Compositional-Distributional Semantic Model for Searching Complex Entity Categories, *SEM (2016)
  • 118. Software: Indra • Semantic approximation server • Multi-lingual (12 languages) • Multi-domain • Different compositional models https://github.com/Lambda-3/indra Semantic Relatedness for All (Languages): A Comparative Analysis of Multilingual Semantic Relatedness using Machine Translation, EKAW, (2016).
  • 119. Software: Gensim https://radimrehurek.com/gensim/models/word2vec.html Radim & Sojka, Software Framework for Topic Modelling with Large Corpora, LREC, 2010
  • 120. Data: PPDB Ganitkevitch et al., PPDB: The Paraphrase Database, LREC, 2013 http://paraphrase.org/#/download • 16 languages
  • 121. Recursive vs recurrent neural networks
  • 122. Segmented Spaces vs Unified Space s0 p0 o0 s0 p0 o0 • Assumes <s,p,o> is naturally irreconcilable. • Inherent dimensional reduction mechanism. • Facilitates the specialization of embedding-based approximations. • Easier to compute identity. • Requires complex and high-dimensional tensorial model.
  • 123. How to access Distributional-Knowledge Graphs efficiently? • Depends on the target operations in the Knowledge Graphs (more on this later).
  • 124. How to access Distributional-Knowledge Graphs efficiently? s0 p0 o0; s0 q • Structured Queries (Database + IR): inverted index, sharding, disk access optimization, query planning, cardinality, indexing, skyline, bitmap indexes, … • Approximation Queries: Multiple Randomized K-d Tree algorithm, Priority Search K-Means Tree algorithm.
  • 125. How to access Distributional-Knowledge Graphs efficiently? s0 p0 o0 Database + IR Structured Queries Approximation Queries
  • 126. Software: StarGraph • Distributional Knowledge Graph Database. • Word embedding Database. https://github.com/Lambda-3/Stargraph Freitas et al., Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributional-Compositional Semantics Approach, 2014.
  • 127. Emerging perspectives • Graph-based data models + Distributional Semantic Models (Word embeddings) have complementary semantic values. • Graph-based Data Models: – Facilitates querying, integration and rule-based reasoning. • Distributional Semantic Models: – Supports semantic approximation, coping with vocabulary variation.
  • 128. Emerging perspectives • AI systems require access to comprehensive background knowledge for semantic interpretation tasks. • Inheriting from Information Retrieval and Databases: – General Indexing schemes, – Particular Indexing schemes, • Spatial, temporal, topological, probabilistic, causal, … – Query planning, – Data compression, – Distribution, – … even supporting hardware strategies.
  • 129. Emerging perspectives • One size of embedding does not fit all: Operate with multiple distributional + compositional models for different data model types (I, C, P), different domains and different languages. • Inheriting from Information Retrieval and Databases: – Indexing schemes, – Query planning, – Data compression, – Query distribution, – even supporting hardware.
  • 131. Semantic Integration • Task: Mapping near-synonymic term references to a canonical identifier. • Goal: Reduce the entropy (complexity) of the underlying KG. • Operations: – Co-reference Resolution – (Named) Entity Linking – Predicate Reconciliation • Common aspects: – Highly dependent on the context of the mention. – Highly dependent on target entity background knowledge.
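The integration task above (map a mention to a canonical identifier when possible, otherwise keep the word) can be sketched with plain string similarity. This is a deliberately naive illustration: the label table, the `link` function and the 0.8 threshold are assumptions, and the sketch ignores the mention context and entity background knowledge that the slide identifies as essential.

```python
from difflib import SequenceMatcher

# Hypothetical label table: canonical identifier -> known surface forms.
CANONICAL = {
    "dbr:Barack_Obama": ["Barack Obama", "Obama", "President Obama"],
    "dbr:Sonia_Sotomayor": ["Sonia Sotomayor", "Justice Sotomayor"],
}


def link(mention, threshold=0.8):
    """Semantic best-effort: return a canonical identifier if a label is
    similar enough, otherwise fall back to the mention itself."""
    best_id, best_score = None, 0.0
    for identifier, labels in CANONICAL.items():
        for label in labels:
            score = SequenceMatcher(None, mention.lower(), label.lower()).ratio()
            if score > best_score:
                best_id, best_score = identifier, score
    return best_id if best_score >= threshold else mention


linked = link("obama")      # resolves to the canonical identifier
unlinked = link("leash")    # no good match, stays a word
```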
  • 132. Software: Cobalt • KG-based co-reference resolution. https://github.com/semanc/cobalt (to appear)
  • 133. Software: AGDISTIS • Agnostic Disambiguation of Named Entities Using Linked Open Data http://aksw.org/Projects/AGDISTIS.html Usbeck et al. AGDISTIS - Agnostic Disambiguation of Named Entities Using Linked Open Data, ECAI, 2015
  • 134. Software: StarGraph • Predicate Reconciliation (Distributional-semantics based). https://github.com/Lambda-3/Stargraph Freitas et al., Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributional-Compositional Semantics Approach, 2014.
  • 137. Sparse, large vocabulary, heterogeneous, schema-less databases • Before 2000: 10s-100s attributes. • Circa 2017: 1,000s-1,000,000s attributes. Brodie & Liu, 2010
  • 139. The Vocabulary Problem Barack Obama Sonia Sotomayor nominated :is_a First Supreme Court Justice of Hispanic descent Latino origins selected JudgeHigh Obama Last US president
  • 140. “On our best behaviour” “It is not enough to build knowledge bases without paying closer attention to the demands arising from their use.” Levesque, 2013 “We should explore more thoroughly the space of computations between fact retrieval and full automated logical reasoning.”
  • 141. Schema-agnostic queries Query approaches over structured databases that allow users to satisfy complex information needs without understanding the representation (schema) of the database.
  • 142. First-level independency (Relational Model) “… it provides a basis for a high level data language which will yield maximal independence between programs on the one hand and representation and organization of data on the other” Codd, 1970 Second-level independency (Schema-agnosticism)
  • 145. From Semantic Tractability to Semantic Resolvability Semantic Tractability (Popescu et al., 2004) - Focuses on soundness and completeness conditions for mapping natural language queries to databases. - Focuses on a restricted class of semantic mappings. Semantic Resolvability (Freitas et al., 2014) - Provides a formal model for classifying query-dataset mappings for schema-agnostic queries. Freitas et al., On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study, NLIWOD 2014 George (2005) and Sheth & Kashyap (1990)
  • 147. Towards an Information- Theoretical Model for Schema- agnostic Semantic Matching Semantic Complexity & Entropy: Configuration space of semantic matchings. • Query-DB semantic gap. • Ambiguity, synonymy, indeterminacy, vagueness. Freitas et al. How hard is this query? Measuring the Semantic Complexity of Schema- agnostic Queries, IWCS 2015
  • 149. Minimizing the Semantic Entropy for the Semantic Matching Definition of a semantic pivot: first query term to be resolved in the database. • Maximizes the reduction of the semantic configuration space. • Less prone to more complex synonymic expressions and abstraction-level differences. • Semantic pivot serves as interpretation context for the remaining alignments. • proper nouns >> nouns >> complex nominals >> adjectives , verbs.
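The pivot heuristic above amounts to ordering query terms by lexical category and resolving the first one against the database. A minimal sketch, assuming the category tags are already available from a tagger (the tag names, the `PRIORITY` table and the example query are made up for illustration):

```python
# Lower number = resolved earlier:
# proper nouns >> nouns >> complex nominals >> adjectives, verbs.
PRIORITY = {
    "proper_noun": 0,
    "noun": 1,
    "complex_nominal": 2,
    "adjective": 3,
    "verb": 3,
}


def resolution_order(terms):
    """Sort (term, category) pairs by the pivot priority; ties keep query order."""
    return sorted(terms, key=lambda term: PRIORITY[term[1]])


query = [
    ("nominated", "verb"),
    ("Barack Obama", "proper_noun"),
    ("justice", "noun"),
]

order = resolution_order(query)
pivot = order[0]  # first term to be resolved; context for the remaining alignments
```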
  • 151. Semantic Relatedness Measure as a Ranking Function A Distributional Approach for Terminological Semantic Search on the Linked Data Web, ACM SAC, 2012
  • 152. Semantic Pivoting + Distributional Semantics • Contextual mechanism for the distributional semantic approximation.
  • 153. Search and Composition Operations • Instance search - Proper nouns - String similarity + node cardinality • Class (unary predicate) search - Nouns, adjectives and adverbs - String similarity + Distributional semantic relatedness • Property (binary predicate) search - Nouns, adjectives, verbs and adverbs - Distributional semantic relatedness • Navigation • Extensional expansion - Expands the instances associated with a class. • Operator application - Aggregations, conditionals, ordering, position • Disjunction & Conjunction • Disambiguation dialog (instance, predicate) Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributional-Compositional Semantics Approach, IUI 2014
  • 154. Distributional Inverted Index Distributional-Relational Model Reference Commonsense corpora Core semantic approximation & composition operations Semantic Parser Query Plan Scalable semantic parsing Learn to Rank Question Answers
  • 155. I P C Query: $q = t^{\Gamma}_0, \ldots, t^{\Gamma}_n$, with $\Gamma = \{I, P, C, V\}$; terms $t^{h}_0$, $t^{m_1}_0$, $t^{m_2}_0$. Term features: lexical specificity, # of senses, lexical category, …
  • 156. - Vector neighborhood density - Semantic differential I P C Query: $q = t^{\Gamma}_0, \ldots, t^{\Gamma}_n$, with $\Gamma = \{I, P, C, V\}$; terms $t^{h}_0$, $t^{m_1}_0$, $t^{m_2}_0$. Term features: lexical specificity, # of senses, lexical category, … $\rho$
  • 157. - Vector neighborhood density - Semantic differential I P C Query: $q = t^{\Gamma}_0, \ldots, t^{\Gamma}_n$, with $\Gamma = \{I, P, C, V\}$; terms $t^{h}_0$, $t^{m_1}_0$, $t^{m_2}_0$. Term features: lexical specificity, # of senses, lexical category, … $\Delta sr$, $\Delta r$ Semantic pivoting
  • 158. - Vector neighborhood density - Semantic differential - Distributional compositionality I P C Query: $q = t^{\Gamma}_0, \ldots, t^{\Gamma}_n$, with $\Gamma = \{I, P, C, V\}$; terms $t^{h}_0$, $t^{m_1}_0$, $t^{m_2}_0$. Term features: lexical specificity, # of senses, lexical category, … [Diagram: composition ($\circ$) of the terms $t^{h}_0$, $t^{m_1}_0$, $t^{m_2}_0$]
  • 159. Query Pre-Processing (Question Analysis) • Transform natural language queries into triple patterns.
  • 179. What to expect (@ QALD1) F1-Score: 0.72 MRR: 0.5 Freitas & Curry, Natural Language Queries over Heterogeneous Linked Data Graphs, IUI (2014).
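For readers unfamiliar with the two figures reported above, the sketch below shows how F1 and mean reciprocal rank (MRR) are commonly computed for a QA system. It is a generic illustration with made-up answers, not the QALD evaluation script, and the exact averaging used in the cited paper may differ.

```python
def f1_score(predicted, gold):
    """Harmonic mean of precision and recall over answer sets."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def mean_reciprocal_rank(ranked_lists, gold_answers):
    """Average of 1/rank of the first correct answer per question (0 if none)."""
    total = 0.0
    for ranked, gold in zip(ranked_lists, gold_answers):
        for rank, candidate in enumerate(ranked, start=1):
            if candidate in gold:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


# Made-up example: two questions.
f1 = f1_score(["a", "b"], ["a", "c"])
mrr = mean_reciprocal_rank([["x", "a"], ["b"]], [{"a"}, {"b"}])
```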
  • 180. Addressing the Vocabulary Problem • Hierarchy of approximation spaces – Probabilistic justification. – Semantic pivoting. How hard is this query? Measuring the Semantic Complexity of Schema-agnostic Queries, IWCS (2015). A Distributional Approach for Terminological Semantic Search on the Linked Data Web, ACM SAC (2012). Schema-agnostic queries over large-schema databases: a distributional semantics approach, PhD Thesis (2015). On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study, NLIWoD (2014).
  • 181. How to cope with the vocabulary problem? • Embed resources on distributional spaces. • Use heuristics to minimize approximation errors. • Interaction Pattern (Semantic best-effort): Search >> Disambiguate >> Learn Distributional semantics = domain and language transportability. = high search recall and relevance ranking.
  • 182. Semantic Parsing on Freebase from Question-Answer Pairs, Berant et al., EMNLP 2013 Sempre • Semantic parser for QA. • Parses Natural language to Freebase SPARQL query. • Doesn’t require logical forms for training, can be trained from questions / answers directly.
  • 183. Percy Liang, SEMPRE: Semantic Parsing with Execution, 2015.
  • 186. Formulating the ML Problem Percy Liang, SEMPRE: Semantic Parsing with Execution, 2015.
  • 188. Alignment Percy Liang, SEMPRE: Semantic Parsing with Execution, 2015.
  • 189. Overnight Semantic Parsing Percy Liang, SEMPRE: Semantic Parsing with Execution, 2015.
  • 190. Neural Relation Detection • Goal: Detect KB-specific entities and relations in NL, map directly to query. • Problems: relation chains, relations unseen in training data… Improved Neural Relation Detection for Knowledge Base Question Answering. Yu et al., ACL 2017
  • 192. Neural Relation Detection • Encode relations on different levels: relation and word level. • Use multi-layer (Bi-)LSTM for question encoding. • Utilize both layers in the final representation vector => abstract way of representing context.
  • 193. What to expect (@SimpleQuestions) Accuracy: 0.93 Neural Semantic Parsing over Multiple Knowledge-bases. Herzig et al, ACL 2017
  • 194. Neural Semantic Parsing • Treat Semantic Parsing as a Sequence-to-Sequence problem Neural Semantic Parsing over Multiple Knowledge-bases. Herzig et al, ACL 2017 (Liang, 2013) Shared structural regularity
  • 196. What to expect (@ Overnight) Accuracy: 0.79 Neural Semantic Parsing over Multiple Knowledge-bases. Herzig et al, ACL 2017
  • 197. Software: StarGraph • Semantic parsing. https://github.com/Lambda-3/Stargraph Freitas et al., Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributional-Compositional Semantics Approach, 2014.
  • 198. Software: Sempre • Semantic Parser & QA System. https://github.com/percyliang/sempre Berant et al., Semantic Parsing on Freebase from Question-Answer Pairs, EMNLP 2013.
  • 199. Memory Networks Slide credit: Jason Weston • Class of models that combine a large memory with a learning component that can read and write to it. • Most ML has limited memory, which is more-or-less all that’s needed for “low level” tasks, e.g. object detection. • Motivation: long-term memory is required to read a story (or watch a movie) and then e.g. answer questions about it. • We study this by building a simple simulation to generate “stories”. We also try on some real QA data.
  • 200. MCTest comprehension data (Richardson et al.) James the Turtle was always getting in trouble. Sometimes he'd reach into the freezer and empty out all the food. Other times he'd sled on the deck and get a splinter. His aunt Jane tried as hard as she could to keep him out of trouble, but he was sneaky and got into lots of trouble behind her back. One day, James thought he would go into town and see what kind of trouble he could get into. He went to the grocery store and pulled all the pudding off the shelves and ate two jars. Then he walked to the fast food restaurant and ordered 15 bags of fries. He didn't pay, and instead headed home. His aunt was waiting for him in his room. She told James that she loved him, but he would have to start acting like a well-behaved turtle. After about a month, and after getting into lots of trouble, James finally made up his mind to be a better turtle. Q: What did James pull off of the shelves in the grocery store? A) pudding B) fries C) food D) splinters … Slide credit: Jason Weston
  • 201. MCTest comprehension data (Richardson et al.) James the Turtle was always getting in trouble. Sometimes he'd reach into the freezer and empty out all the food. Other times he'd sled on the deck and get a splinter. His aunt Jane tried as hard as she could to keep him out of trouble, but he was sneaky and got into lots of trouble behind her back. One day, James thought he would go into town and see what kind of trouble he could get into. He went to the grocery store and pulled all the pudding off the shelves and ate two jars. Then he walked to the fast food restaurant and ordered 15 bags of fries. He didn't pay, and instead headed home. His aunt was waiting for him in his room. She told James that she loved him, but he would have to start acting like a well-behaved turtle. After about a month, and after getting into lots of trouble, James finally made up his mind to be a better turtle. Q: What did James pull off of the shelves in the grocery store? A) pudding B) fries C) food D) splinters Q: Where did James go after he went to the grocery store? … Slide credit: Jason Weston Problems: … it’s hard for this data to lead us to design good ML models … 1) Not enough data to train on (660 stories total). 2) If we get something wrong we don’t really understand why: every question potentially involves a different kind of reasoning, so our model has to do a lot of different things. Our solution: focus on simpler (toy) subtasks where we can generate data to check what the models we design can and cannot do.
  • 202. Example Slide credit: Jason Weston Dataset in simulation command format. Dataset after adding a simple grammar. antoine go kitchen antoine get milk antoine go office antoine drop milk antoine go bathroom where is milk ? (A: office) where is antoine ? (A: bathroom) Antoine went to the kitchen. Antoine picked up the milk. Antoine travelled to the office. Antoine left the milk there. Antoine went to the bathroom. Where is the milk now? (A: office) Where is Antoine? (A: bathroom)
  • 203. Simulation Data Generation Slide credit: Jason Weston Aim: build a simple simulation which behaves much like a classic text adventure game. The idea is that generating text within this simulation allows us to ground the language used. Actions: go <location>, get <object>, get <object1> from <object2>, put <object1> in/on <object2>, give <object> to <actor>, drop <object>, look, inventory, examine <object>. Constraints on actions: • an actor cannot get something that they or someone else already has • they cannot go to a place they are already at • cannot drop something they do not already have • …
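To make the simulation idea concrete, here is a small sketch that executes commands in the simulation command format, enforces the three constraints listed above, and answers "where is X?" from the resulting world state. It covers only `go`, `get` and `drop`; the function names and state layout are assumptions for this example, not the original generator.

```python
def run(commands):
    """Execute '<actor> <action> <argument>' commands and return the world state."""
    actor_at = {}    # actor -> current location
    holder = {}      # object -> actor currently holding it
    object_at = {}   # object -> location where it was last dropped
    for line in commands:
        actor, action, arg = line.split()
        if action == "go":
            if actor_at.get(actor) == arg:
                raise ValueError(f"{actor} is already at {arg}")
            actor_at[actor] = arg
        elif action == "get":
            if arg in holder:
                raise ValueError(f"{arg} is already held by {holder[arg]}")
            holder[arg] = actor
            object_at.pop(arg, None)
        elif action == "drop":
            if holder.get(arg) != actor:
                raise ValueError(f"{actor} does not have {arg}")
            del holder[arg]
            object_at[arg] = actor_at.get(actor)
        else:
            raise ValueError(f"unsupported action: {action}")
    return actor_at, holder, object_at


def where_is(thing, state):
    """Answer a 'where is ...?' question from the simulated state."""
    actor_at, holder, object_at = state
    if thing in actor_at:
        return actor_at[thing]
    if thing in holder:  # a held object is wherever its holder is
        return actor_at.get(holder[thing])
    return object_at.get(thing)


story = [
    "antoine go kitchen",
    "antoine get milk",
    "antoine go office",
    "antoine drop milk",
    "antoine go bathroom",
]
state = run(story)
milk_location = where_is("milk", state)
antoine_location = where_is("antoine", state)
```

Because the answers are derived from the state rather than written by hand, any number of stories and questions can be generated and checked automatically, which is the point made on the slide.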
  • 204. (1) Factoid QA with Single Supporting Fact John is in the playground. Bob is in the office. Where is John? A:playground (2) Factoid QA with Two Supporting Facts John is in the playground. Bob is in the office. John picked up the football. Bob went to the kitchen. Where is the football? A:playground Where was Bob before the kitchen? A:office … (total 20 Tasks) Slide credit: Jason Weston
  • 205. Matching function Slide credit: Jason Weston • For a given Q, we want a good match to the relevant memory slot(s) containing the answer, e.g.: Match(Where is the football ?, John picked up the football). • Use a $q^{\top} U^{\top} U d$ embedding model with word embedding features. • LHS features: Q:Where Q:is Q:the Q:football Q:? • RHS features: D:John D:picked D:up D:the D:football • QDMatch:the QDMatch:football (QDMatch:football is a feature to say there’s a Q&A word match, which can help.) The parameters U are trained with a margin ranking loss: supporting facts should score higher than non-supporting facts.
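The matching function can be sketched as follows: question and memory are turned into bag-of-features vectors, both are projected by the same embedding matrix U, and the score is the dot product of the two projections. The sketch only evaluates the score and the margin ranking loss with randomly initialised embeddings; the training loop is omitted. The embedding size, the margin, and placing the QDMatch features on the memory side are assumptions made for this example.

```python
import random

DIM = 8                      # embedding size (arbitrary for the example)
_rng = random.Random(0)
U = {}                       # feature -> column of U, created on first use


def column(feature):
    if feature not in U:
        U[feature] = [_rng.gauss(0.0, 0.1) for _ in range(DIM)]
    return U[feature]


def embed(features):
    """U applied to a bag-of-features vector = sum of the feature columns."""
    vec = [0.0] * DIM
    for feature in features:
        for i, x in enumerate(column(feature)):
            vec[i] += x
    return vec


def featurize(question, memory):
    q_words, d_words = question.split(), memory.split()
    lhs = ["Q:" + w for w in q_words]
    rhs = ["D:" + w for w in d_words]
    # Extra features flagging words shared by question and memory.
    rhs += ["QDMatch:" + w for w in d_words if w in q_words]
    return lhs, rhs


def score(question, memory):
    """Dot product of the embedded question and the embedded memory."""
    lhs, rhs = featurize(question, memory)
    return sum(a * b for a, b in zip(embed(lhs), embed(rhs)))


def margin_ranking_loss(question, supporting, non_supporting, margin=0.1):
    """Zero once the supporting fact outscores the other fact by `margin`."""
    return max(0.0, margin - score(question, supporting) + score(question, non_supporting))


question = "Where is the football ?"
lhs, rhs = featurize(question, "John picked up the football")
loss = margin_ranking_loss(question, "John picked up the football", "Bob went to the kitchen")
```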
  • 206. What to expect (@ QA over Reverb Data) F1-Score: 0.82 MemNN (BoW Features)
  • 207. image credit: Towards AI-complete question answering: a set of prerequisite toy tasks
  • 208. Slide credit: Jason Weston Positional Reasoning The triangle is to the right of the blue square. The red square is on top of the blue square. The red sphere is to the right of the blue square. Is the red sphere to the right of the blue square? A:yes Is the red square to the left of the triangle? A:yes Path Finding The kitchen is north of the hallway. The den is east of the hallway. How do you go from den to kitchen? A: west, north Failing Tasks
  • 209. Slide credit: Jason Weston Counting Daniel picked up the football. Daniel dropped the football. Daniel got the milk. Daniel took the apple. How many objects is Daniel holding? A: two Lists / Sets Daniel picks up the football. Daniel drops the newspaper. Daniel picks up the milk. What is Daniel holding? A: milk, football Failing Tasks
  • 210. Key-Value Memory Networks • Represent memories as key-value pairs: => more fine-grained attention mechanism. • key: information important for comparing memory with question. • value: information important for comparing memory with answer. Key-Value Memory Networks for Directly Reading Documents, Miller et al., EMNLP 2016. Question Answering on Knowledge Bases and Text using Universal Schema and Memory Networks, Das et al., ACL 2017.
  • 211. Key-Value Memory Networks • How to tune? = what to encode in key/value • Possibilities for keys: entities, KB subjects, … • Possibilities for values: sentences, word windows, relations, document title
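A minimal sketch of a single key-value memory read, under the usual formulation: the question is compared against the keys, the resulting scores are normalised with a softmax, and the output is the attention-weighted sum of the values. The vectors are hand-made toy values and the function names are assumptions; learned encoders for questions, keys and values are omitted.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(query, keys, values):
    """Address memory slots by key, return the weighted sum of their values."""
    weights = softmax([dot(query, key) for key in keys])
    dim = len(values[0])
    output = [sum(w * value[i] for w, value in zip(weights, values)) for i in range(dim)]
    return output, weights


# Toy memory: keys live in the question space, values in the answer space.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
query = [10.0, 0.0]  # strongly aligned with the first key

output, weights = read(query, keys, values)
```

Separating keys from values is what allows, for example, a KB subject or entity to be used for addressing while a relation, sentence or word window is what gets returned.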
  • 212. Attention-based LSTM • Use LSTM to obtain a representation of <q,a>. • Better: Implement attention to reflect that different words in q are mostly discriminative for a. • Even better: utilize different aspects of answer entity a (such as type, context). An End-to-End Model for Question Answering over Knowledge Base with Cross-Attention Combining Global Knowledge, Hao et al., ACL 2017.
  • 215. What to expect (@ WebQuestions) F1-Score (Attention-based LSTM): 42.9 F1-Score (MemNN): 42.2
  • 216. Reverse Engineering Queries VLDB 2017 Tutorial D. Mottin, M. Lissandrini, T. Palpanas, Y. Velegrakis
  • 217. Problem Definition [Tran et al. 2013] VLDB 2017 Tutorial D. Mottin, M. Lissandrini, T. Palpanas, Y. Velegrakis
  • 218. Entity Search by Example Metzger et al. 2013, Sobczak et al. 2015 Use Case: • Similar Products • User recommendation VLDB 2017 Tutorial D. Mottin, M. Lissandrini, T. Palpanas, Y. Velegrakis
  • 219. Entity Search by Example Bonifati et al. 2015 Use Case: • Proteins interactions/co-expression • Similar processes/behaviour VLDB 2017 Tutorial D. Mottin, M. Lissandrini, T. Palpanas, Y. Velegrakis
  • 220. Reverse Engineering SPARQL Queries Use Case: • Schema-agnostic/End-user queries Arenas et al. 2016 VLDB 2017 Tutorial D. Mottin, M. Lissandrini, T. Palpanas, Y. Velegrakis
  • 221. Reverse Engineering SPARQL Queries Arenas et al. 2016 VLDB 2017 Tutorial D. Mottin, M. Lissandrini, T. Palpanas, Y. Velegrakis
  • 222. Reverse Engineering SPARQL Queries Mottin et al. 2014 VLDB 2017 Tutorial D. Mottin, M. Lissandrini, T. Palpanas, Y. Velegrakis
  • 223. Graph Query by Example Jayaram et al. 2015 VLDB 2017 Tutorial D. Mottin, M. Lissandrini, T. Palpanas, Y. Velegrakis
  • 224. Emerging perspectives Semantic Parsing: • Structured queries as explanations. • Semantic pivoting heuristics. • Diversity of compositional/distributional models as key. • Memory/Key-value NNs: architecture to model story timelines. • End-to-end vs componentised architectures.
  • 225. Emerging perspectives Reverse Engineering Queries: • Can work as an inference method: – Learn-by-example – Analogical reasoning – Similarity queries
  • 227. The Problem Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  • 229. Formulating the Distributional- Relational Representation Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  • 231. Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  • 232. Complex Relations Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  • 233. TransD Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  • 234. KG2E Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  • 235. Complex Relations: RL4KG Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  • 236. RL4KG with Entity Descriptions Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  • 238. Relation Paths • Complex Inference patterns for composition. Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  • 241. Representation of Relation Paths Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  • 242. Path-based TransE Path composition operations: addition, multiplication, RNN. Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
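The following sketch illustrates the translation idea behind TransE (a triple is plausible when head + relation lands near tail) and two of the path composition operations named above, addition and element-wise multiplication. The 2-dimensional vectors and relation names are invented, the L1 distance is one common choice, and the RNN composition and the training objective are left out; it is a toy in the spirit of path-based TransE rather than the KB2E implementation.

```python
from functools import reduce


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def multiply(u, v):
    return [a * b for a, b in zip(u, v)]


def transe_energy(head, relation, tail):
    """L1 distance between head + relation and tail; lower = more plausible."""
    return sum(abs(h + r - t) for h, r, t in zip(head, relation, tail))


def compose(path, operation):
    """Combine the relation vectors along a path into a single vector."""
    return reduce(operation, path)


# Invented toy embeddings.
person = [0.0, 0.0]
country = [1.0, 1.0]
born_in_city = [1.0, 0.0]
city_in_country = [0.0, 1.0]

path_add = compose([born_in_city, city_in_country], add)
path_mul = compose([born_in_city, city_in_country], multiply)

# With additive composition the two-step path translates person onto country.
energy_add = transe_energy(person, path_add, country)
energy_mul = transe_energy(person, path_mul, country)
```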
  • 243. What to expect (PTransE@FB15K) Relation Prediction
  • 245. Software: KB2E • Relation Extraction • Knowledge Graph Embeddings including TransE, TransH, TransR and PTransE. https://github.com/thunlp/KB2E Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, Xuan Zhu. Learning Entity and Relation Embeddings for Knowledge Graph Completion. The 29th AAAI Conference on Artificial Intelligence (AAAI'15).
  • 247. Recognizing and Justifying Text Entailments (TE) using Definition KGs
  • 248. Distributional semantic relatedness as a Selectivity Heuristics Distributional heuristics target source answer
  • 257. Explainable AI “The right to explanation” “The data subject should have the right not to be subject to a decision, which may include a measure, evaluating personal aspects relating to him or her which is based solely on automated processing …” “… such processing should be subject to suitable safeguards, … to obtain an explanation of the decision …”
  • 258. What to expect (TE@Boeing-Princeton-ISI) F1-Score: 0.59 What to expect (TE@Guardian Headline Samples) F1-Score: 0.53 Santos et al., Recognizing and Justifying Text Entailment through Distributional Navigation on Definition Graphs, AAAI, 2018.
  • 259. Emerging perspectives • Distributional-relational models in KB completion explored a large range of representation paradigms. – Opportunity for exporting these representation models to other tasks. • Definition-based models can provide a corpus-viable, low-data and explainable alternative to embedding- based models.
  • 261. Entity Linking Open IE Taxonomy Extraction Integration Arg. Classif. Co-reference Resolution KG Completion Natural Language Inference Named Entity Recognition Semantic Parsing KG Construction Inference Distributional Semantics Server Query By Example Query spatial temporal probabilistic causal Indexes NL Generation NL Query Answers Explanations Definition Extraction
  • 262. Entity Linking Integration Co-reference Resolution KG Completion Natural Language Inference Named Entity Recognition Semantic Parsing KG Construction Inference Distributional Semantics Server Query By Example Query spatial temporal probabilistic causal Indexes NL Generation NL Query Answers Explanations M T M T Open IE Taxonomy Extraction Arg. Classif. Definition Extraction
  • 264. Platform: Gerbil, Hobbit • Benchmark Platform for Linked Data. https://project-hobbit.eu/ http://aksw.org/Projects/GERBIL.html
  • 267. Datasets: QA over KB • SimpleQuestions: 108k manually annotated question-triple pairs over Freebase. • WebQuestions: 5810 Wh-questions (with variations) with 1 entity, generated by the Google Suggest API, with answers manually annotated using the Freebase page of the entity. • QALD: Evaluation suite for Linked Data. • Spades: Fill-in-the-blank queries generated from ClueWeb. Answers contain at least 1 entity connected to Freebase. • GraphQuestions: 5166 NL/query pairs obtained by manually paraphrasing 500 Freebase queries into NL.
  • 268. Datasets: Semantic parsing • WebQuestionsSP: subset of WebQuestions, question/semantic parse pairs. • Overnight: 13682 NL/logical form pairs from 8 different domains. Logical forms generated and annotated by humans. • GeoQuery: 880 NL/DB query pairs about US geography. • Free917: 917 NL/logical form pairs generated from Freebase properties.
  • 269. Datasets: QA over Text • WikiMovies: >100k question-answer pairs about movies/actors. • WikiQA: ~1000 question/sentence pairs. Sentences gathered from Wikipedia.
  • 270. Datasets: Misc • bAbI: 20 tasks mostly involving QA over synthetic stories. • Paraphrases: 18M question-paraphrase pairs gathered from WikiAnswers. • FB15k: Knowledge Base completion dataset derived from Freebase. • VQA: >200k images with human-annotated questions & answers. • GeoQA: 263 question/answer pairs about geographical entities and spatial/logical relations. • ScienceExamQuestions: 2524 4th grade and 675 8th grade multiple choice questions.
  • 273. KGs in Use … and many others
  • 274. Asked Financial Analysts “What ruins your day?” Analyst workday (chart): Assemble, Synthesize, Interpret, Meetings, Financial Modeling, Communicate; segment shares: 20%, 15%, 20%, 10%, 15%, 20%. Customer ranked pain points: 1. Information overload 2. Understanding relationships 3. Unable to track impact events 4. Cannot link internal & external research, especially text. What they told: “An analyst used to cover 40 firms, now it is 150 and the tools haven’t changed.” (Director, $4B US L/S equity fund) “They all still do it manually.” (Equities, Market data team, $20B+ hedge fund) “The problem has gotten worse with more data and more information.” (Director, Research/Tech, $10B Multi-strat) Source: Customer meetings, TR internal analyst survey. Geoff Horrell & Dan Bennett
  • 275. What’s in the Graph? • Organizations – including names, address, identifiers, Country of HQ/Incorp • Industry Classification • Hierarchy – Parent, Ultimate Parent, Affiliates • Officers & Directors • Job History, Education • Suppliers & Customers • Comparable Companies • Joint Ventures & Strategic Alliances • Meta-Data Geoff Horrell & Dan Bennett
  • 276. What’s in the Graph? • 125,000,000 (equity instruments & quotes). • 75,000,000 meta-data objects (countries, regions, cities, currencies, commodities, holidays, industry schemes, scripts, languages, time zones, value domains, units, identifiers). • 200,000,000 strategic relationships (supplier, customer, competitor, joint venture, alliance, industry, ownership, affiliate). Geoff Horrell & Dan Bennett
  • 277. What’s in the Graph? Geoff Horrell & Dan Bennett
  • 278. All Connections between Onshore Oil Drilling & Venezuela Geoff Horrell & Dan Bennett
  • 279. One Week Snapshot • 6,778 news articles with company news where at least one organization has 80% relevance to the article. • 135,267 companies are 2 steps away. • 217,387 strategic relationships. • Typical analyst portfolio is 200 companies. • Each customer creates their own relative weights for each type of relationship. • Requires around 800,000 shortest path calculations to deliver the ranked news feed. Each calculation optimised to take 10ms. Geoff Horrell & Dan Bennett
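The ranking step on this slide can be illustrated with a minimal sketch. It assumes that each relationship type carries a customer-specific cost and that an article is ranked by the cheapest weighted path from any portfolio company to the company it mentions. The slide does not state the actual algorithm or data model, so Dijkstra's algorithm, the toy graph, the weights, and every name below (`WEIGHTS`, `EDGES`, `build_graph`, `shortest_distances`, `rank_news`) are hypothetical.

```python
import heapq

# Hypothetical per-customer costs for each relationship type (lower = closer).
WEIGHTS = {"supplier": 1.0, "customer": 1.0, "competitor": 2.0, "joint_venture": 1.5}

# Hypothetical toy graph: (company_a, company_b, relationship_type), undirected.
EDGES = [
    ("AcmeOil", "DrillCo", "supplier"),
    ("DrillCo", "PipeWorks", "joint_venture"),
    ("AcmeOil", "RivalOil", "competitor"),
    ("RivalOil", "PipeWorks", "customer"),
]


def build_graph(edges, weights):
    """Build an adjacency list whose edge costs come from the customer's weights."""
    graph = {}
    for a, b, rel in edges:
        cost = weights[rel]
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    return graph


def shortest_distances(graph, source):
    """Dijkstra's algorithm: cheapest weighted distance from source to each reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, cost in graph.get(node, []):
            candidate = d + cost
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist


def rank_news(graph, portfolio, articles):
    """Order (headline, company) articles by distance to the nearest portfolio company."""
    per_source = [shortest_distances(graph, company) for company in portfolio]
    scored = []
    for headline, company in articles:
        best = min(d.get(company, float("inf")) for d in per_source)
        scored.append((best, headline))
    return [headline for _, headline in sorted(scored)]


graph = build_graph(EDGES, WEIGHTS)
articles = [("PipeWorks news", "PipeWorks"), ("DrillCo news", "DrillCo")]
ranked = rank_news(graph, ["AcmeOil"], articles)
```

As a rough check on the slide's figures: 800,000 calculations at 10 ms each come to 8,000 seconds, a little over two hours if run strictly one after another. The slide does not say how the work is scheduled or parallelised.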
  • 286. Explainable Findings From Tensor Inferences Back to KGs
  • 288. Take-away Message • The evolution of methods and tools and the availability of data in NLP create a demand for a knowledge representation method to support complex AI systems. • A relaxed version of RDF (RDF-NL?) can provide this answer. – Establishes a dialogue with a standard (with existing data). – Inherits optimization aspects from Databases. • Word-embeddings (DSMs) + compositional models + RDF. • Moving beyond facts and taxonomies: rhetorical structures, arguments, polarity stories, pragmatics.
  • 289. Take-away Message • Syntactic and lexical features can go a long way for structuring text. – Context-preserving. • Integration (entity reconciliation) as semantic best-effort. – Embrace schema on read. • KGs can support explainable AI: – Meeting point between extraction, reasoning and querying. – Definition-based models. • Inherit infrastructures from DB and IR.
  • 290. Take-away Message Opportunities: • ML-orchestrated pipelines with: – Richer discourse-representation models. – Explicit semantic representations (centered on KGs). – Different compositional/distributional models (beyond W2V & GloVe). • KGs and impact on explainability. • Quantifying domain and language transportability.
  • 291. References • Bordes & Gabrilovich, Constructing and Mining Web-scale Knowledge Graphs, KDD 2014 Tutorial. • Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015. • Mottin & Lissandrini, New Trends on Exploratory Methods for Data Analytics, VLDB 2017 Tutorial. • Ren, Su & Yan, Construction and Querying of Large-scale Knowledge Bases, CIKM 2017 Tutorial. • Gottschalk & Demidova, EventKG: A Multilingual Event-Centric Temporal Knowledge Graph, Proc. of the Extended Semantic Web Conference (ESWC 2018). • Gottschalk & Demidova, EventKG+TL: Creating Cross-Lingual Timelines from an Event-Centric Knowledge Graph, Proc. of the ESWC 2018 Satellite Events. • Bennett, Building a Knowledge Graph, https://www.slideshare.net/DanBennett47/building-a-knowledge-graph-86792821 • Horrell, Thomson Reuters Knowledge Graph Feed, https://www.youtube.com/watch?v=MzGkfIfSrko