A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relation Extraction
1. A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relation Extraction
The 50th Annual Meeting of the Association for Computational Linguistics
(ACL 2012)
July 11th, 2012, Jeju
Seokhwan Kim (Institute for Infocomm Research)
Gary Geunbae Lee (POSTECH)
4. Problem Definition
• Relation Extraction
To identify semantic relations between a pair of entities
Example (Birthplace): Barack Obama [PER] was born in Honolulu [LOC] , Hawaii [LOC] .
Considered as a classification problem
5. Related Work (1)
• Supervised Learning
Many supervised machine learning approaches have been
successfully applied
• (Kambhatla, 2004; Zhou et al., 2005; Zelenko et al., 2003; Culotta and
Sorensen, 2004; Bunescu and Mooney, 2005; Zhang et al., 2006)
• Semi-supervised Learning
To obtain the annotations of unlabeled instances from the seed
information
• (Brin, 1999; Riloff and Jones, 1999; Agichtein and Gravano, 2000;
Sudo et al., 2003; Yangarber, 2003; Stevenson and Greenwood, 2006;
Zhang, 2004; Chen et al., 2006; Zhou et al., 2009)
6. Motivation
• Resources for Relation Extraction
Supervised/Semi-supervised Approaches
• Labeled corpora for supervised learning
• Seed instances for semi-supervised learning
• Available for only a few languages
ACE 2003 Multilingual Training Dataset
• English (252 articles)
• Chinese (221 articles)
• Arabic (206 articles)
• No resources for other languages
Korean
7. Related Work (2)
• Self-supervised Learning
To obtain the annotated dataset without any human effort
Using the information obtained from external resources
• Heuristic-based Method (Banko et al., 2007; Banko et al., 2008)
• Wikipedia-based Methods (Wu and Weld, 2010)
• Cross-lingual Annotation Projection
To leverage parallel corpora to project the relation annotations on
the resource-rich source language to the resource-poor target
language (Kim et al., 2010; Kim et al., 2011)
9. Overall Architecture
[Figure: overall architecture. A parallel corpus provides sentences in Ls (source) and Lt (target); both sides are preprocessed (POS tagging, parsing). Annotation: NER and relation extraction on the Ls side produce annotated sentences in Ls. Projection: word alignment between Ls and Lt is used to project the annotations, producing annotated sentences in Lt.]
10. Direct Projection
(Kim et al., 2010)
• Annotation
• Projection
fE (<Barack Obama, Honolulu>) = 1
Barack Obama was born in Honolulu , Hawaii .
버락 오바마 는 하와이 의 호놀룰루 에서 태어났다
(beo-rak-o-ba-ma) (neun) (ha-wa-i) (ui) (ho-nol-rul-ru) (e-seo) (tae-eo-nat-da)
fK (<버락 오바마, 호놀룰루>) = 1
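The projection step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names `project_span` and `project_pair`, the inclusive token-span representation, and the toy word alignment are all assumptions made for this example. In the presented system the alignments come from a word aligner rather than being written by hand.

```python
def project_span(span, alignment):
    """Map a source token span (start, end), inclusive, to the smallest
    target span that covers every target token aligned to it.
    Returns None when no token in the span is aligned."""
    start, end = span
    targets = sorted(t for s, t in alignment if start <= s <= end)
    if not targets:
        return None
    return (targets[0], targets[-1])


def project_pair(src_pair, label, alignment):
    """Copy the source label of an entity pair onto the aligned target pair.
    Returns None when either entity cannot be projected."""
    arg1 = project_span(src_pair[0], alignment)
    arg2 = project_span(src_pair[1], alignment)
    if arg1 is None or arg2 is None:
        return None
    return (arg1, arg2), label


english = "Barack Obama was born in Honolulu , Hawaii .".split()
korean = "버락 오바마 는 하와이 의 호놀룰루 에서 태어났다".split()

# Hypothetical (source index, target index) word alignment for this pair.
alignment = [(0, 0), (1, 1), (2, 7), (3, 7), (4, 6), (5, 5), (7, 3)]

# f_E(<Barack Obama, Honolulu>) = 1 on the English side.
projected = project_pair(((0, 1), (5, 5)), 1, alignment)
(k_arg1, k_arg2), k_label = projected
print(korean[k_arg1[0]:k_arg1[1] + 1], korean[k_arg2[0]:k_arg2[1] + 1], k_label)
```

Because the label is copied in one pass from whatever the aligner and the source-side annotator produce, any error in either input is carried straight into the target annotation.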
11. Limitations of Direct Projection
• The direct projection approach is still vulnerable to
erroneous inputs generated by the preprocessors
• Main causes of this limitation
It considers only the alignment between entity candidates, without any
contextual information
It is performed as a single-pass process
12. Graph-based Learning
• Semi-supervised learning algorithm
• Defining a graph
The nodes represent labeled and unlabeled examples in a dataset
The edges reflect the similarity of examples
• Learning a labeling function in an iterative manner
It should be close to the given labels on the similar labeled nodes
It should be smooth on the whole graph
• Related Work
Graph-based Learning for Relation Extraction (Chen et al., 2006)
Bilingual projection of POS tagging (Das and Petrov, 2011)
13. Graph Construction
• Graph Nodes
Instance Nodes
• Defined for all pairs of entity candidates in both languages
• Each instance node has a soft label vector Y = [y+ y-]
Context Nodes
• For identifying the relation descriptors of the positive instances
• Defined for each trigram which is located between a given entity pair
which is semantically related
• Each context node has a soft label vector Y = [y+ y-]
Example: <ARG1> was born in <ARG2>
• Trigram context nodes: "<ARG1> was born", "was born in", "born in <ARG2>"
14. Graph Construction
• Edge Weights
Between instance node and context node in the same language
$$ w(v_{i,j}, u^k) = \begin{cases} 1 & \text{if } v_{i,j} \text{ has } u^k \text{ as a contextual subsequence,} \\ 0 & \text{otherwise.} \end{cases} $$
Between context nodes in a language
$$ w(u^k, u^l) = J(u^k, u^l) = \frac{|u^k \cap u^l|}{|u^k \cup u^l|} $$
Between context nodes in source and target languages
$$ w(u_s^k, u_t^l) = \frac{\mathrm{count}(u_s^k, u_t^l)}{\sum_{u_t^m} \mathrm{count}(u_s^k, u_t^m)} $$
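The context nodes and the three edge weights can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: all function names are hypothetical, the Jaccard coefficient is computed here over the token sets of two trigrams (the slides do not say which sets are compared), and the cross-lingual co-occurrence counts are toy values supplied by hand.

```python
from collections import Counter


def context_trigrams(tokens, arg1_end, arg2_start):
    """Trigrams over <ARG1>, the tokens between the two entities, and <ARG2>."""
    seq = ["<ARG1>"] + tokens[arg1_end + 1:arg2_start] + ["<ARG2>"]
    return [tuple(seq[i:i + 3]) for i in range(len(seq) - 2)]


def instance_context_weight(instance_trigrams, u):
    """1 if the instance has trigram u as a contextual subsequence, else 0."""
    return 1 if u in instance_trigrams else 0


def jaccard(u_k, u_l):
    """Jaccard coefficient; ASSUMPTION: taken over the trigrams' token sets."""
    a, b = set(u_k), set(u_l)
    return len(a & b) / len(a | b)


def cross_lingual_weights(cooccurrence):
    """Normalize count(u_s, u_t) by the total count of u_s over all targets."""
    totals = Counter()
    for (src, _tgt), count in cooccurrence.items():
        totals[src] += count
    return {pair: count / totals[pair[0]] for pair, count in cooccurrence.items()}


tokens = "Barack Obama was born in Honolulu , Hawaii .".split()
trigrams = context_trigrams(tokens, arg1_end=1, arg2_start=5)

same_language = jaccard(trigrams[0], trigrams[1])

# Toy co-occurrence counts between one English and two Korean trigrams.
src = ("was", "born", "in")
tgt_a = ("호놀룰루", "에서", "태어났다")
tgt_b = ("의", "호놀룰루", "에서")
weights = cross_lingual_weights(Counter({(src, tgt_a): 3, (src, tgt_b): 1}))
```

For the example sentence this yields the three context nodes shown on the previous slide, and the cross-lingual weights leaving a given source trigram sum to 1.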
16. Label Propagation
• Algorithm
Input
• A transition matrix T
• An initial label matrix Y0
Output
• The updated label matrix Yt
Procedure
• Initialize T
• Normalize T
• Initialize Y
• Update Y
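The four steps can be sketched as a small pure-Python routine. This is a generic label propagation variant written for illustration and is not necessarily what the toolkit used in the presented system does: the row normalization, the clamping of labeled nodes back to their initial labels, the fixed iteration count, and the names `normalize_rows` and `propagate` are all assumptions.

```python
def normalize_rows(T):
    """Scale each row of the weight matrix so that it sums to 1."""
    normalized = []
    for row in T:
        total = sum(row)
        normalized.append([w / total if total else 0.0 for w in row])
    return normalized


def propagate(T, Y0, labeled, iterations=50):
    """Repeatedly replace each node's label vector with the weighted
    average of its neighbours' vectors, then reset the labeled nodes
    (ASSUMPTION: clamping) to their initial labels."""
    T = normalize_rows(T)
    Y = [row[:] for row in Y0]
    n, k = len(Y), len(Y[0])
    for _ in range(iterations):
        Y = [[sum(T[i][j] * Y[j][c] for j in range(n)) for c in range(k)]
             for i in range(n)]
        for i in labeled:
            Y[i] = Y0[i][:]
    return Y


# Toy graph: node 0 is a positive seed, node 2 a negative seed,
# node 1 is unlabeled and more strongly connected to node 0.
T = [[0, 3, 0],
     [3, 0, 1],
     [0, 1, 0]]
Y0 = [[1.0, 0.0],   # Y = [y+ y-]
      [0.0, 0.0],
      [0.0, 1.0]]
Y = propagate(T, Y0, labeled={0, 2})
```

In this toy graph the unlabeled node ends up with a larger y+ than y- because its edge to the positive seed carries more weight.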
19. Implementation
• Dataset
English-Korean parallel corpus
• 266,982 bi-sentence pairs in English and Korean
• Aligned by GIZA++
• Annotation
ReVerb (Fader et al., 2011)
• English Open IE system
• Label Propagation
Junto Label Propagation Toolkit
• Learning
Tree kernel-based SVM classifier
• Shortest path dependency kernel (Bunescu and Mooney, 2005)
• SVM-Light (Joachims, 1998)
20. Evaluation
• Dataset
Manually annotated Korean dataset
• Obtained from the Web following Bunescu and Mooney (2007)
• 500 sentences with manual annotations for four relation types
Acquisition
Birthplace
Inventor Of
Won Prize
• Evaluation Metrics
Precision/Recall/F-measure
21. Experimental Results
• Direct Projection vs. Graph-based Projection
Type          Direct Projection       Graph-based Projection
              P      R      F         P      R      F
Acquisition   51.6   87.7   64.9      55.3   91.2   68.9
Birthplace    69.8   84.5   76.4      73.8   87.3   80.0
Inventor Of   62.4   85.3   72.1      66.3   89.7   76.3
Won Prize     73.3   80.5   76.7      76.4   82.9   79.5
Total         63.9   84.2   72.7      67.7   87.4   76.3
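As a quick check on how the F column relates to P and R, the sketch below assumes the balanced F-measure (the harmonic mean of precision and recall); the slides only say "F-measure", so the exact variant is an assumption. The function name `f_measure` is hypothetical.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Total row of the table, in percent.
direct_total = round(f_measure(63.9, 84.2), 1)
graph_total = round(f_measure(67.7, 87.4), 1)
print(direct_total, graph_total)
```

Under that assumption the computed values match the reported Total F scores for both methods.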
22. Experimental Results
• Comparisons to other self-supervised approaches
Heuristic-based Approach (Banko et al., 2007; Banko et al., 2008)
• Korean Treebank and Syntactic Heuristics
Wikipedia-based Approach (Wu and Weld, 2010)
• Korean Wikipedia articles and Infoboxes
Approach           P       R       F
Heuristic-based    92.31   17.27   29.09
Wikipedia-based    66.67   66.91   66.79
Projection-based   67.69   87.41   76.30
24. Conclusion
• Summary
A graph-based projection approach for relation extraction
• Label propagation algorithm
• On a graph that represents the instance and context features of both
the source and target languages
Experimental results show that our approach improves relation
extraction performance compared to the other approaches
• Future work
To reduce the high complexity of the approach
To handle more expanded graph structures to improve
extraction performance