Paper introduction: Graph Pattern Entity Ranking Model for Knowledge Graph Completion
1. Graph Pattern Entity Ranking Model
for Knowledge Graph Completion
Takuma Ebisu @Sokendai, Ryutaro Ichise @NIIC
NAACL 2019
Introduced by Naomi Shiraishi
January 29th, 2020
2. “Graph Pattern Entity Ranking Model for Knowledge Graph
Completion”
• Why this work?
• Knowledge graphs are important for many AI tasks.
• But many facts (= relations, edges) are missing -> link prediction task
• Many knowledge graph embedding models have been developed to tackle the task,
however…
• Black box! The user has no idea how the information is processed.
• The models are difficult to interpret.
• Approach
• Use Graph Patterns in knowledge graph
• GPro: Use conditional probability
• GRank: Use Graph Pattern Association Rules as a ranking model of entities
• SOTA on link prediction!
• Easy to implement
• Validation
• Dataset: WN18, FB15k, WN18RR, FB15k-237
• Metrics: MRR, HITS@n
3. Preparation: Graph Pattern Association Rules (GPARs)
• Proposed by Fan et al. (2015) for social network graphs
• This paper proposes GPARs for knowledge graphs
[Figure labels: h, r, t]
4. Preparation: Graph Pattern Association Rules (GPARs)
Graph Pattern:
Graph Pattern Association Rules:
Example
Set of variables
Set of Relations
Located_in
5. Preparation: Queries and answers
• Divide a knowledge graph G into queries and answers to use as
training data for the model
Training queries
The answers of training queries
[Figure: a triple (h, r, t) gives the training queries (h, r, ?) and (?, r, t)]
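The split into queries and answers can be sketched as follows. This is a minimal illustration, assuming the graph is a list of (h, r, t) tuples; the function name `build_queries` and the toy triples are hypothetical and do not come from the paper.

```python
from collections import defaultdict

def build_queries(triples):
    """Group triples into tail queries (h, r, ?) and head queries (?, r, t).

    Returns two dicts that map each query to the set of its answers.
    """
    tail_answers = defaultdict(set)  # (h, r) -> set of tails t
    head_answers = defaultdict(set)  # (r, t) -> set of heads h
    for h, r, t in triples:
        tail_answers[(h, r)].add(t)
        head_answers[(r, t)].add(h)
    return dict(tail_answers), dict(head_answers)

# Toy knowledge graph (made-up data for illustration only).
triples = [
    ("Tokyo", "located_in", "Japan"),
    ("Kyoto", "located_in", "Japan"),
    ("Japan", "part_of", "Asia"),
]
tail_answers, head_answers = build_queries(triples)
```

Each triple contributes one answer to a tail query and one to a head query, so a query such as (?, located_in, Japan) ends up with several answers.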
6. Preparation: Graph Pattern Matching Function
• A matching function of GP(x, y) on (h, t) ∈ E × E is
an injective function m: V_GP → E
• Matching function satisfies:
• : set of all matching functions
Example
matching function of
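A brute-force sketch of enumerating matching functions is given here. The conditions on the slide did not survive extraction, so this assumes they are m(x) = h, m(y) = t, and that every pattern atom p(z_i, z_j) maps to a triple (m(z_i), p, m(z_j)) in G. The representation of a pattern as a list of (p, z_i, z_j) tuples and the function name are hypothetical choices for illustration.

```python
from itertools import permutations

def matching_functions(pattern, graph, entities, h, t):
    """Enumerate injective maps m: variables -> entities with m(x)=h, m(y)=t.

    pattern: list of atoms (p, zi, zj) over variable names; "x" and "y" are fixed.
    graph:   set of (head, relation, tail) triples.
    Assumption: every atom must map to a triple that exists in the graph.
    """
    if h == t:
        return []  # an injective m cannot send x and y to the same entity
    variables = sorted({v for _, a, b in pattern for v in (a, b)} | {"x", "y"})
    free = [v for v in variables if v not in ("x", "y")]
    others = sorted(e for e in entities if e not in (h, t))
    matches = []
    # permutations() yields assignments without repetition, which keeps m injective.
    for assignment in permutations(others, len(free)):
        m = {"x": h, "y": t, **dict(zip(free, assignment))}
        if all((m[a], p, m[b]) in graph for p, a, b in pattern):
            matches.append(m)
    return matches

graph = {
    ("Tokyo", "located_in", "Japan"),
    ("Kyoto", "located_in", "Japan"),
    ("Japan", "part_of", "Asia"),
}
entities = {"Tokyo", "Kyoto", "Japan", "Asia"}
# Pattern: located_in(x, z) and part_of(z, y)
pattern = [("located_in", "x", "z"), ("part_of", "z", "y")]
matches = matching_functions(pattern, graph, entities, "Tokyo", "Asia")
```

Enumerating all assignments is exponential in the number of free variables, which is tolerable only because the patterns considered in this work are small.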
7. Link prediction 1: Graph Pattern Probability Model (GPro)
• Use 2 confidence values (confidence for tail and head)
Conditional Probability
Number of candidate entities reached from h through the graph pattern
Number of entities reached from h through the graph pattern that are correct
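The two counts above suggest a confidence of the form (correct entities) / (candidate entities). The sketch below is a simplified illustration, not the paper's exact definition: it assumes the pattern is a simple path of relations from x to y, and it only counts heads that appear in at least one triple of the target relation. Both restrictions, and all names, are assumptions made for this example.

```python
from collections import defaultdict

def tail_confidence(graph, path, target_relation):
    """Estimate P(target_relation(h, t) | path connects h to t) by counting.

    graph: set of (head, relation, tail) triples.
    path:  list of relations followed in order from h (a path-shaped pattern).
    """
    out_edges = defaultdict(set)
    for h, r, t in graph:
        out_edges[(h, r)].add(t)

    heads = {h for h, r, _ in graph if r == target_relation}
    candidates = correct = 0
    for h in heads:
        frontier = {h}
        for r in path:
            frontier = {t for e in frontier for t in out_edges.get((e, r), ())}
        reached = frontier - {h}  # x and y must be distinct entities
        candidates += len(reached)
        correct += sum((h, target_relation, t) in graph for t in reached)
    return correct / candidates if candidates else 0.0

# Made-up toy data for illustration only.
graph = {
    ("alice", "born_in", "Paris"), ("Paris", "city_of", "France"),
    ("alice", "nationality", "France"),
    ("bob", "born_in", "Lyon"), ("Lyon", "city_of", "France"),
    ("bob", "nationality", "Germany"),
}
conf = tail_confidence(graph, ["born_in", "city_of"], "nationality")
```

The head-side confidence would be computed in the same way with the path followed in the reverse direction from t.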
9. Link prediction 2: GRank
• Consider GPARs as entity ranking system
• Define a scoring function whose arguments are Graph Pattern and a
pair of entities:
• When applying a GPAR to answer a query, create rankings of the
candidate entities by their score
• We can evaluate GPAR as an entity ranking system by evaluating
output rankings.
10. Link prediction 2: GRank
• Distributed Ranking
• Given a pattern GP and a query (h, r, ?), we obtain distributed
rankings of entities according to their scores:
$$\mathrm{drank}_{i,j} = \begin{cases} 1/b & \text{when } a + 1 \le i \le a + b \\ 0 & \text{otherwise} \end{cases}$$
[Figure: the drank matrix, indexed by rank i (1, 2, 3, …) and entity j]
a: number of entities whose score is greater than that of entity j
b: number of entities whose score is equal to that of entity j
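The distributed ranking can be sketched as a small matrix computation. This assumes b counts entity j itself, so that each entity's weight is spread evenly over ranks a + 1 to a + b and sums to 1; the function name and the toy scores are hypothetical.

```python
def distributed_ranking(scores):
    """Build the distributed ranking matrix from a dict entity -> score.

    Returns (entities, drank), where drank[i - 1][j] is the weight of
    entity j at rank i. Tied entities share their ranks evenly.
    """
    entities = list(scores)
    n = len(entities)
    drank = [[0.0] * n for _ in range(n)]
    for j, e in enumerate(entities):
        a = sum(s > scores[e] for s in scores.values())   # strictly better entities
        b = sum(s == scores[e] for s in scores.values())  # ties, including e itself
        for i in range(a + 1, a + b + 1):                 # ranks a+1 .. a+b
            drank[i - 1][j] = 1.0 / b
    return entities, drank

# B and C are tied, so each gets weight 0.5 at ranks 2 and 3.
entities, drank = distributed_ranking({"A": 3, "B": 1, "C": 1, "D": 0})
```

Without ties, every column has a single 1 and the matrix reduces to an ordinary ranking.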
11. Link prediction 2: GRank
• To evaluate GPAR, define metric to evaluate distributed rankings of
entities by generalizing the average precision
• For a pattern GP and a training query (h, r, ?), the distributed
precision at rank k is defined:
[Figure: the distributed ranking matrix (rank i by entity j), cut off at rank k]
12. Link prediction 2: GRank
• For a pattern GP and a training query (h, r, ?), the distributed average
precision is defined:
• The distributed mean average precision
for a GPAR GP->r is defined:
[Figure: the distributed ranking matrix (rank k = 1, 2, 3, … by entity j)]
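The formulas on these slides did not survive extraction, so the sketch below follows one natural generalization of average precision to distributed rankings and may differ in detail from the paper. It assumes that the expected number of correct entities at rank i is the sum of drank[i][j] over correct entities j, that distributed precision at k is the cumulative sum of those values divided by k, and that dMAP is the mean over training queries. All function names are hypothetical.

```python
def distributed_ranking(scores):
    """drank[i - 1][j]: weight of entity j at rank i; ties share ranks evenly."""
    entities = list(scores)
    n = len(entities)
    drank = [[0.0] * n for _ in range(n)]
    for j, e in enumerate(entities):
        a = sum(s > scores[e] for s in scores.values())
        b = sum(s == scores[e] for s in scores.values())
        for i in range(a, a + b):  # 0-based rows for ranks a+1 .. a+b
            drank[i][j] = 1.0 / b
    return entities, drank

def distributed_average_precision(scores, answers):
    """Assumed form: sum_k dPre_k * hits_k / |answers|."""
    entities, drank = distributed_ranking(scores)
    # hits[k - 1]: expected number of correct entities sitting at rank k
    hits = [sum(w for w, e in zip(row, entities) if e in answers) for row in drank]
    dap, cumulative = 0.0, 0.0
    for k, hit in enumerate(hits, start=1):
        cumulative += hit                # expected correct entities in the top k
        dap += (cumulative / k) * hit    # distributed precision at k, weighted by hits
    return dap / len(answers) if answers else 0.0

def distributed_map(queries):
    """Mean of the distributed average precision over (scores, answers) pairs."""
    return sum(distributed_average_precision(s, a) for s, a in queries) / len(queries)

scores = {"A": 3, "B": 1, "C": 1, "D": 0}
dap_top = distributed_average_precision(scores, {"A"})
dap_tied = distributed_average_precision(scores, {"B"})
dmap = distributed_map([(scores, {"A"}), (scores, {"B"})])
```

When no scores are tied, this reduces to the usual average precision, which is the property a generalization should preserve.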
13. Experiments: Baselines
• Knowledge graph embedding models
• Translation-based: TransE (Bordes et al. 2013), TorusE (Ebisu and Ichise, 2018)
• Simple (h+r = t) and effective
• Bilinear: RESCAL (Nickel et al., 2011)
• Neural network-based: GCNs (Duvenaud et al., 2015)
• Observed Feature models:
predict from features that are directly observable in the graph
• Node+LinkFeat (Toutanova et al., 2015)
• uses only one-hop information
• PRA (Lao and Cohen, 2010)
• uses multi-hop information
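The translation idea behind TransE (h + r = t) can be sketched as a distance-based score. This is a minimal illustration with hand-set two-dimensional vectors, not trained embeddings; the function name and the vectors are hypothetical.

```python
import math

def transe_score(h_vec, r_vec, t_vec):
    """Negative L2 distance between h + r and t; higher means more plausible."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(h_vec, r_vec, t_vec)))

# Hand-set toy embeddings (assumption: chosen so that tokyo + located_in == japan).
tokyo, located_in = [1.0, 0.0], [0.0, 1.0]
japan, asia = [1.0, 1.0], [3.0, 3.0]

score_japan = transe_score(tokyo, located_in, japan)
score_asia = transe_score(tokyo, located_in, asia)
```

For a query (h, r, ?), candidate tails would be sorted by this score. The score itself gives no readable explanation of a prediction, which is the interpretability gap noted earlier.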
14. Results
• WN18 from WordNet (1995): well constructed, with few missing or wrong facts
• FB15k from Freebase (2008): many missing facts
• WN18 and FB15k contain redundancy in the form of reverse relations
-> WN18RR and FB15k-237: relations that are inverses of other relations are removed
• Graph patterns are restricted to connected, closed patterns (no branches between x and y)
• The size of a GP is limited to <= 3
[Results table annotations]
• MRR: mean reciprocal rank
• HITS@n: the proportion of test queries whose corresponding entities are ranked in the top n of the obtained rankings
• Rankings are obtained after eliminating entities whose corresponding triples (except the target triple) were included in the training dataset
• Rows are grouped into knowledge graph embedding models and observed feature models
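The evaluation metrics can be sketched as follows. The elimination step is implemented by skipping entities already known to be correct for the query. Ties are broken optimistically here, which is an assumption of this sketch, and all names and numbers are hypothetical.

```python
def filtered_rank(scores, target, known_answers):
    """1-based rank of `target`, ignoring other entities known to be correct.

    scores:        dict entity -> score for one test query
    known_answers: entities whose triples for this query were seen in training
    """
    target_score = scores[target]
    better = sum(
        1 for e, s in scores.items()
        if s > target_score and e != target and e not in known_answers
    )
    return 1 + better

def mrr(ranks):
    """Mean reciprocal rank over the test queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at(ranks, n):
    """Proportion of test queries whose target is ranked in the top n."""
    return sum(r <= n for r in ranks) / len(ranks)

# Made-up scores for one query whose target answer is "Asia".
scores = {"Japan": 0.9, "Asia": 0.8, "France": 0.3}
rank_filtered = filtered_rank(scores, "Asia", known_answers={"Japan"})
rank_raw = filtered_rank(scores, "Asia", known_answers=set())

ranks = [1, 2, 4]  # ranks of the targets for three hypothetical test queries
```

Filtering keeps a model from being penalized for ranking another true answer above the target.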
15. Results
• Examples of antecedent patterns ranked highly by GRank (dMAP_tail) on FB15k
[Figure: example patterns, labeled High, Low, Low]
• This only works when an individual has more than two siblings
• Individual’s parents are often missing in FB15k
16. Conclusions
• Defined GPAR for a knowledge graph.
• Defined the standard confidence measures of GPARs for link prediction.
• Introduced GPro model
• Introduced GRank model using distributed ranking -> SOTA
• Future work: Extend GRank to use more complex patterns
• Graph pattern size > 3