RDF2vec is a method for creating embedding vectors for entities in knowledge graphs. In this talk, I introduce the basic idea of RDF2vec as well as the latest developments and extensions, such as different walk strategies, the order-aware flavour of RDF2vec, RDF2vec for dynamic knowledge graphs, and more.
New Adventures in RDF2vec
Slide 1
New Adventures in RDF2vec
Heiko Paulheim
University of Mannheim
Also includes the latest adventures!
Alert: contains spoilers on future publications!
Slide 2
Graphs vs. Vectors
• Data Science tools for prediction etc.
– Python, Weka, R, RapidMiner, …
– Algorithms that work on vectors, not graphs
• Bridges built over the past years:
– FeGeLOD (Weka, 2012), RapidMiner LOD Extension (2015), Python KG Extension (2021)
Slide 4
Graphs vs. Vectors
• Observations with simple propositionalization strategies
– Even simple features (e.g., adding all numeric values and all types) can help on many problems
– More sophisticated features often bring additional improvements
• Combinations of relations and individuals
– e.g., movies directed by Steven Spielberg
• Combinations of relations and types
– e.g., movies directed by Oscar-winning directors
• …
– But
• The search space is enormous!
• “Generate first, filter later” does not scale well
Slide 5
Towards RDF2vec
• Excursion: word embeddings
– word2vec proposed by Mikolov et al. (2013)
– predict a word from its context or vice versa
• Idea: similar words appear in similar contexts, like
– Jobs, Wozniak, and Wayne founded Apple Computer Company in April 1976
– Google was officially founded as a company in January 2006
– usually trained on large text corpora
• the projection layer yields the embedding vectors
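For illustration, a minimal word2vec training run with the gensim library (the two example sentences stand in for a large corpus):

```python
from gensim.models import Word2Vec

# Toy corpus; in practice, word2vec is trained on billions of tokens.
sentences = [
    "jobs wozniak and wayne founded apple computer company in april 1976".split(),
    "google was officially founded as a company in january 2006".split(),
]

# Skip-gram (sg=1): predict the context words from the target word.
model = Word2Vec(sentences, vector_size=50, window=5, min_count=1, sg=1)

print(model.wv["apple"])  # the 50-dimensional embedding vector for "apple"
```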
Slide 6
From Word Embeddings to Graph Embeddings
• Basic idea:
– extract random walks from an RDF graph:
Mulholland_Dr. → director → David_Lynch → nationality → US
– feed walks into word2vec algorithm
• Order of magnitude (e.g., DBpedia)
– ~6M entities (“words”)
– start up to 500 random walks per entity, length up to 8
→ corpus of >20B tokens
• Result:
– entity embeddings
– most often outperform other propositionalization techniques
Ristoski and Paulheim (2016): RDF2vec: RDF graph embeddings for data mining
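A minimal sketch of this pipeline (toy graph; the helper names are mine, not from the original RDF2vec implementation):

```python
import random
from gensim.models import Word2Vec

# Toy knowledge graph: subject -> list of (predicate, object) edges.
graph = {
    "Mulholland_Dr.": [("director", "David_Lynch")],
    "David_Lynch": [("nationality", "US")],
}

def random_walk(start, depth):
    """One random walk of up to `depth` hops, as a token sequence."""
    walk = [start]
    node = start
    for _ in range(depth):
        edges = graph.get(node)
        if not edges:
            break
        predicate, node = random.choice(edges)
        walk += [predicate, node]
    return walk

# Extract several walks per entity, then feed them into word2vec.
walks = [random_walk(e, depth=4) for e in graph for _ in range(10)]
model = Word2Vec(walks, vector_size=50, window=5, min_count=1, sg=1)
print(model.wv["David_Lynch"])  # the entity embedding
```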
Slide 7
A First Glance at RDF2vec Embeddings
• Observation: similar entities are projected close to one another
– can be exploited by downstream ML algorithms (think: k-NN)
Ristoski and Paulheim (2016): RDF2vec: RDF graph embeddings for data mining
Slide 8
The End of Petar’s PhD Journey…
• ...and the beginning of the RDF2vec adventure
Slide 9
Embeddings for Link Prediction
• RDF2vec example
– similar instances form clusters; the direction of a relation is roughly stable
– link prediction by analogy reasoning (Japan – Tokyo ≈ China – Beijing)
Ristoski & Paulheim: RDF2vec: RDF Graph Embeddings for Data Mining. ISWC, 2016
Slide 10
Embeddings for Link Prediction
• In RDF2vec, relation preservation is a by-product
• TransE (and its descendants): direct modeling
– Formulates RDF embedding as an optimization problem
– Find a mapping of entities and relations to ℝⁿ such that
• across all triples ⟨s,p,o⟩, Σ ||s + p − o|| is minimized
• the error for existing triples is smaller than for non-existing ones
Bordes et al: Translating Embeddings for Modeling Multi-relational Data. NIPS 2013.
Fan et al.: Learning Embedding Representations for Knowledge Inference on Imperfect and Incomplete
Repositories. WI 2016
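Spelled out, the TransE objective from Bordes et al. is a margin-based ranking loss, where S is the set of observed triples, S' contains corrupted triples (head or tail replaced), and γ is a margin:

```latex
\mathcal{L} \;=\; \sum_{(s,p,o)\in S}\;\sum_{(s',p,o')\in S'}
\left[\,\gamma + \lVert \mathbf{s}+\mathbf{p}-\mathbf{o} \rVert
- \lVert \mathbf{s}'+\mathbf{p}-\mathbf{o}' \rVert\,\right]_{+}
```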
Slide 11
Link Prediction vs. Node Embedding
• Hypothesis:
– Embeddings for link prediction also cluster similar entities
– Node embeddings can also be used for link prediction
Portisch et al. (2022): Knowledge Graph Embedding for Data Mining vs. Knowledge Graph Embedding
for Link Prediction - Two Sides of the Same Coin?
Slide 12
Using RDF2vec for Link Prediction
• Use embeddings for head and relation, predict tail
– Train separate network for head prediction
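A sketch of such a tail-prediction network (all data here is toy; any regressor from the concatenated head and relation vectors to a tail vector would do):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Hypothetical pre-trained RDF2vec vectors (toy: random 50-d vectors).
names = ["Japan", "Tokyo", "China", "Beijing", "capital"]
emb = {n: rng.normal(size=50) for n in names}
triples = [("Japan", "capital", "Tokyo"), ("China", "capital", "Beijing")]

# Input: concatenated head and relation vectors; target: the tail vector.
X = np.array([np.concatenate([emb[s], emb[p]]) for s, p, o in triples])
y = np.array([emb[o] for _, _, o in triples])
net = MLPRegressor(hidden_layer_sizes=(200,), max_iter=500).fit(X, y)

# Rank candidate tails by distance to the predicted vector.
pred = net.predict(np.concatenate([emb["Japan"], emb["capital"]])[None, :])[0]
ranked = sorted(names, key=lambda n: np.linalg.norm(emb[n] - pred))
print(ranked[:3])
```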
Slide 13
Local Embeddings: RDF2vec Light
• Recap: order of magnitude (e.g., DBpedia)
– ~6M entities (“words”)
– start up to 500 random walks per entity, length up to 8
→ corpus of >20B tokens
– “Train once, reuse often”
• In some cases, only a small subset (of 6M) is of interest
– RDF2vec light: “train when needed”
– Runtime: minutes instead of days
Portisch et al. (2020): RDF2Vec Light – A Lightweight Approach for Knowledge
Graph Embeddings
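With the pyRDF2Vec library, "train when needed" might look as follows (a sketch assuming the pyRDF2Vec 0.2.x API; parameter names may differ in other releases, and the DBpedia SPARQL endpoint requires network access):

```python
from pyrdf2vec import RDF2VecTransformer
from pyrdf2vec.embedders import Word2Vec
from pyrdf2vec.graphs import KG
from pyrdf2vec.walkers import RandomWalker

# Walks are extracted only for the entities of interest,
# not for all ~6M DBpedia entities.
entities = [
    "http://dbpedia.org/resource/Mannheim",
    "http://dbpedia.org/resource/Hamburg",
]

transformer = RDF2VecTransformer(
    Word2Vec(epochs=10),
    walkers=[RandomWalker(max_depth=4, max_walks=100)],
)
embeddings, literals = transformer.fit_transform(
    KG("https://dbpedia.org/sparql"), entities
)
```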
Slide 14
Local Embeddings: RDF2vec Light
• Results:
– Many classification and regression tasks work fine with light
• As good as or sometimes even better (!) than normal RDF2vec
– ...but there is a huge performance drop in tasks like document similarity
• First takeaway: RDF2vec Light works well for homogeneous sets of entities
Portisch et al. (2020): RDF2Vec Light – A Lightweight Approach for Knowledge
Graph Embeddings
Slide 15
Random vs. non-random Walks
• Maybe random walks are not such a good idea
– They may give too much weight to less important entities and facts
• Strategies:
– Prefer edges with more frequent predicates
– Prefer nodes with higher indegree or PageRank
– …
– They may cover less important entities and facts too little
• Strategies:
– The opposite of all of the above strategies
• The results are mixed
• External signals (e.g., human notions of importance)
– generally work better than graph-internal signals
Cochez et al. (2017): Biased Graph Walks for RDF Graph Embeddings
Al Taweel and Paulheim (2020): Towards Exploiting Implicit Human Feedback for Improving RDF2vec
Embeddings
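For illustration, one biased walk step that prefers nodes with higher PageRank (toy graph and scores; the helper name is mine):

```python
import random

# Toy graph and node-importance scores (e.g., precomputed PageRank).
graph = {"Hamburg": [("country", "Germany"), ("leader", "Peter_Tschentscher")]}
pagerank = {"Germany": 0.9, "Peter_Tschentscher": 0.1}

def biased_step(node):
    """Choose the next edge with probability proportional to the
    PageRank of the target node (prefer 'important' neighbors)."""
    edges = graph[node]
    weights = [pagerank[obj] for _, obj in edges]
    return random.choices(edges, weights=weights, k=1)[0]

print(biased_step("Hamburg"))  # mostly ('country', 'Germany')
```

The "opposite" strategies simply invert the weights, e.g., 1/pagerank[obj], so that rarely visited nodes get sampled more often.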
Slide 16
Random vs. non-random Walks
• Other walking strategies include, but are not limited to…
– Walks with community hops (i.e., random jumps between similar nodes)
– Walklets (i.e., smaller subwalks fed into word2vec)
– Hierarchical walks (i.e., ignoring rarer hops, putting more emphasis on
common connections)
– Walks with wildcards
• The results, again, are mixed
Steenwinckel et al. (2021): Walk Extraction Strategies for Node Embeddings with RDF2Vec in Knowledge
Graphs. Database and Expert Systems Applications - DEXA 2021 Workshops
Slide 17
Similarity vs. Relatedness
• Closest 10 entities to Angela Merkel in different vector spaces
Portisch et al. (2022): Knowledge Graph Embedding for Data Mining vs. Knowledge Graph Embedding for
Link Prediction - Two Sides of the Same Coin?
Slide 18
Similarity vs. Relatedness
• Why bother?
– Use case: table interpretation (a special case of entity disambiguation)
(in a table, entities in the same column are similar, while entities in the same row are related)
Slide 19
Similarity vs. Relatedness
• Recap word embeddings:
– Jobs, Wozniak, and Wayne founded Apple Computer Company in April 1976
– Google was officially founded as a company in January 2006
• Graph walks:
– Hamburg → country → Germany → leader → Angela_Merkel
– Germany → leader → Angela_Merkel → birthPlace → Hamburg
– Hamburg → leader → Peter_Tschentscher → residence → Hamburg
Slide 20
Similarity vs. Relatedness
• Surrounding entities indicate relatedness
– Hamburg → country → Germany → leader → Angela_Merkel
– Germany → leader → Angela_Merkel → birthPlace → Hamburg
• Same entities in similar positions indicate similarity
– Germany → leader → Angela_Merkel → birthPlace → Hamburg
– Hamburg → leader → Peter_Tschentscher → residence → Hamburg
• Someone is a leader vs. something has a leader
• Solution approach: use an embedding approach that respects positions
– CWINDOW / Structured Skip-gram
Portisch and Paulheim (2021): Putting RDF2vec in Order.
Slide 21
Order-Aware RDF2vec
• Using an order-aware variant of word2vec
• Experimental results:
– order-aware RDF2vec most often outperforms classic RDF2vec
– a bit more computation heavy, but still scales to DBpedia etc.
Ling et al. (2015): Two/Too Simple Adaptations of Word2Vec for Syntax Problems.
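The change in Ling et al.'s structured skip-gram is small: instead of one output matrix shared across the whole context window, each relative position p gets its own output matrix, so "leader one step before" and "leader one step after" are no longer interchangeable (the notation below is mine):

```latex
\text{skip-gram:}\quad
P(w_{t+p} \mid w_t) \propto \exp\!\left(\mathbf{o}_{w_{t+p}}^{\top}\,\mathbf{v}_{w_t}\right)
\qquad
\text{structured:}\quad
P(w_{t+p} \mid w_t) \propto \exp\!\left(\mathbf{o}_{p,\,w_{t+p}}^{\top}\,\mathbf{v}_{w_t}\right)
```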
Slide 22
Similarity vs. Relatedness
• (s-)RDF2vec allows an explicit trade-off via different walk strategies
[Figure: example knowledge graph around Mannheim, with nodes Mannheim, Baden-Württemberg, Germany, Adler_Mannheim, SAP_Arena, and Reiss-Engelhorn-Museum, connected by edges such as city, stadium, location, federal state, and country]
Walk generation:
“Classic” RDF2vec walks:
Adler_Mannheim → city → Mannheim → country → Germany
Adler_Mannheim → stadium → SAP_Arena → location → Mannheim
SAP_Arena → location → Mannheim → country → Germany
...
s-RDF2vec walks:
city → Mannheim → country
stadium → SAP_Arena → location
location → Mannheim → country
...
[Pipeline figure: the RDF2vec “classic”, RDF2vec “edge”, and RDF2vec “union walks” embeddings are concatenated into a single vector; the concatenated space is then compacted, either by a global PCA or by a weighted (w1, w2) local PCA over a task-specific subset of test cases]
Portisch et al. (under review): s-RDF2vec: Injecting Knowledge Graph Structure
Into RDF2vec Entity Embeddings.
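A sketch of the combination step (toy vectors and names are mine; the weights w1 and w2 trade off the two spaces before PCA compacts the concatenation):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
entities = ["Mannheim", "Germany", "SAP_Arena"]

# Hypothetical embeddings from two walk strategies (e.g., classic and edge).
classic = {e: rng.normal(size=100) for e in entities}
edge = {e: rng.normal(size=100) for e in entities}

w1, w2 = 0.7, 0.3  # weighted combination of the two spaces
combined = np.array(
    [np.concatenate([w1 * classic[e], w2 * edge[e]]) for e in entities]
)

# Compact the concatenated space (toy: 2 components; with real data,
# one would keep e.g. 100 or 200 dimensions).
compact = PCA(n_components=2).fit_transform(combined)
```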
Slide 23
Similarity vs. Relatedness
• s-RDF2vec
– using different walk strategies
– combining different vector spaces (weighted combinations are possible)
• 10 closest neighbors to Mannheim:
Portisch et al. (under review): s-RDF2vec: Injecting Knowledge Graph Structure
Into RDF2vec Entity Embeddings.
Slide 24
To Materialize or Not to Materialize?
May I ask you a question?
Sure, go ahead!
Slide 25
To Materialize or Not to Materialize?
Rumor has it that
RDF2vec performs worse
if you run a reasoner to add inferences
to the graph first...
???
Slide 26
To Materialize or Not to Materialize?
I know it sounds counterintuitive...
Hmmm...
Slide 27
To Materialize or Not to Materialize?
Hmmm… sounds reasonable.
(Pun intended)
Okay, there might
be an explanation...
Slide 28
To Materialize or Not to Materialize?
We need more beer experiments
Slide 30
Experimental Setup
RDF2vec vs. RDF2vec + inferences, evaluated on:
• Classification
• Regression
• Entity Similarity
• Entity Relatedness
• Document Similarity
Iana and Paulheim (2020): More is not always better: The negative impact of a-box materialization on
RDF2vec knowledge graph embeddings
Slide 31
Experimental Results
• Classification: unmaterialized is better in 60/80 cases
• Regression: unmaterialized is better in 39/60 cases
• Entity similarity: unmaterialized is better in 16/20 cases
• Entity relatedness: unmaterialized is better in 13/20 cases
• But: document similarity: materialized is always better
– task has a very different nature
– more heterogeneity
Iana and Paulheim (2020): More is not always better: The negative impact of a-box materialization on
RDF2vec knowledge graph embeddings
Slide 32
To Materialize or not to Materialize?
• Explanation 1: materialization skews property distributions
Iana and Paulheim (2020): More is not always better: The negative impact of a-box materialization on
RDF2vec knowledge graph embeddings
Slide 33
To Materialize or not to Materialize?
• Explanation 2 is a bit more complex...
• Thought experiment:
– DBpedia mostly does not include persons’ gender
– learn classifier for gender
• Spouse is a symmetric property, but…
– distribution is highly uneven
– 80% of all subjects of spouse are women
Example triple: Ayda_Field spouse Robbie_Williams .
Graells-Garrido et al. (2015): First Women, Second Sex: Gender Bias in Wikipedia
Slide 34
To Materialize or not to Materialize?
• Thought experiment: learn classifier for gender
• Spouse is a symmetric property, but…
– 80% of all subjects of spouse are women
• Assume that an embedding captures that information
– e.g., order-aware RDF2vec
→ a downstream classifier can reach >80% accuracy
• On the other hand
– Materialization completely erases that information
• Bottom line: missing information can be a signal
– Machine learning terminology: MAR (missing at random) vs. MNAR (missing not at random)
Iana and Paulheim (2020): More is not always better: The negative impact of a-box materialization on RDF2vec knowledge graph embeddings
Slide 35
Dynamic Knowledge Graphs
• In theory, RDF2vec can also produce embeddings for dynamic knowledge graphs, to a certain extent
– given that a new entity’s neighbors are all known
– experiments are still under way
Slide 36
Understanding the RDF2vec Model Zoo
• Variations
– Walk extraction (e.g., classic, s-RDF2vec, e-RDF2vec)
– Ordered vs. non-ordered
– Skip-gram vs. CBOW
• This alone gives us 12 combinations
of how to train an RDF2vec model
• We assume that not all of them are equally good
Slide 38
Understanding the RDF2vec Model Zoo
• Variations
– Walk extraction (e.g., classic, s-RDF2vec, e-RDF2vec)
– Ordered vs. non-ordered
– Skip-gram vs. CBOW
• Build a systematic collection of basic classification problems
• For example, ∃r.{e} vs. ¬∃r.{e}
– e.g., person born in NYC vs. person not born in NYC
– here, s-RDF2vec should not be able to solve this
Slide 39
Embeddings and Interpretability
• Hot topic: Explainable AI
– Knowledge Graphs are a favored ingredient
– Human/machine interpretable knowledge → explainable systems
• However:
– Embeddings replace interpretable axioms
with numeric vectors over non-interpretable dimensions
– Where did the semantics go?
Paulheim (2018): Make Embeddings Semantic Again!
Slide 42
Towards Semantic Vector Space Embeddings
[Figure: embedding space with clusters labeled “cartoon” and “superhero”]
Paulheim (2018): Make Embeddings Semantic Again!
Slide 43
Towards Semantic Vector Space Embeddings
• Approach 1: learn interpretation function
• Each dimension of the embedding model is a target for a separate learning problem
• Learn a function to explain the dimension
• E.g., y ≈ −|∃character.Superhero|
• Just an approximation, used for explanations and justifications
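A sketch of Approach 1 (illustrative names and synthetic data): pick one embedding dimension as the regression target and fit an interpretable surrogate over symbolic features extracted from the graph:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 200

# Hypothetical symbolic features per entity, e.g., counts such as
# |∃character.Superhero| or rdf:type memberships from the graph.
X = rng.integers(0, 5, size=(n, 3)).astype(float)
# One embedding dimension; here synthesized so it tracks -X[:, 0].
dim_17 = -1.0 * X[:, 0] + 0.1 * rng.normal(size=n)

# The learned coefficients approximate the dimension,
# e.g., y ≈ -|∃character.Superhero|.
surrogate = LinearRegression().fit(X, dim_17)
print(surrogate.coef_)  # roughly [-1, 0, 0]
```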
Slide 44
Towards Semantic Vector Space Embeddings
• Approach 2: learn inherently interpretable embeddings
• Step 1: learn typical patterns that exist in a knowledge graph
– e.g., graph pattern learning
– e.g., Horn clauses
• Step 2a: use those patterns
as embedding dimensions
– probably not low dimensional
• Step 2b: compact the space
– e.g., use dimensions for mutually exclusive patterns
Slide 45
Towards Semantic Vector Space Embeddings
• Different angle: learn interpretation for similarity function
– e.g., entities may be close because they have a similar type, share the same country, or are connected to the same entity
Slide 46
Explaining Predictions with RDF2vec
• Recap: we can, in principle, create vectors for new entities
• Some explanation models, like LIME, do this:
– Create new artificial entities by perturbation
• In our KG context: add/remove connections
• Predict for new entities
• Learn explanation for predictions
• With that approach, LIME should be applicable to predictions with RDF2vec
Ribeiro et al. (2016): "Why Should I Trust You?": Explaining the Predictions of Any Classifier
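A LIME-style sketch in the KG setting (everything here is illustrative: binary indicators over an entity's connections stand in for perturbed entities, and a linear surrogate explains a stand-in black-box classifier):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
neighbors = ["director:David_Lynch", "country:US", "genre:Thriller"]

def black_box(mask):
    """Stand-in for: rebuild the entity with the masked connections,
    embed it (e.g., with RDF2vec Light), run the downstream classifier."""
    return float(mask[0])  # toy model: prediction hinges on the director

# Perturb: randomly drop connections, record the black-box prediction.
masks = rng.integers(0, 2, size=(100, len(neighbors)))
preds = np.array([black_box(m) for m in masks])

# Local surrogate: its coefficients say which connections mattered.
surrogate = Ridge().fit(masks, preds)
for name, w in zip(neighbors, surrogate.coef_):
    print(f"{name}: {w:+.2f}")  # high weight -> important connection
```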
Slide 47
Summary
• Knowledge Graph Embeddings with RDF2vec
– Effective processing of large-scale knowledge sources
• Light variant possible for scalability
– Variations visited: walk extraction, order-awareness, materialization, ...
– Encoding of similarity and/or relatedness
• RDF2vec: explicit trade-off is possible!
– Additional insights that are not explicit in the graph
• aka latent semantics
Slide 48
More on RDF2vec
• Collection of
– Implementations
– Pre-trained models
– >45 use cases in various domains