Facilitating Human Intervention in
Coreference Resolution with
Comparative Entity Summaries
Danyun Xu, Gong Cheng, Yuzhong Qu
Nanjing University, China
Presented at ESWC 2014, Crete, Greece
Coreference resolution
TimBL
givenName: “Tim”
surname: “Berners-Lee”
altName: “Tim BL”
type: Scientist
gender: “male”
isDirectorOf: W3C
TBL
name: “Tim Berners-Lee”
type: ComputerScientist
type: RoyalSocietyFellow
sex: “Male”
invented: WWW
founded: WSRI
Wendy
fullName: “Wendy Hall”
type: ComputerScientist
type: RoyalSocietyFellow
sex: “Female”
birthplace: London
founded: WSRI
Methods with humans in the loop
(or, coordinating “ings”)
• Active learning
• Crowdsourcing
• Pay-as-you-go
Candidate coreferent entities
…
TimBL ------ Wendy
TimBL ------ TBL
ChrisB ------ Bizer
…
Select & Present
Verify
Existing focus
Methods with humans in the loop
(or, coordinating “ings”)
• Active learning
• Crowdsourcing
• Pay-as-you-go
Candidate coreferent entities
…
TimBL ------ Wendy
TimBL ------ TBL
ChrisB ------ Bizer
…
Select & Present
Verify
Our focus
Present entire entity descriptions?
Present a compact comparative summary!
givenName: “Tim”
surname: “Berners-Lee”
isDirectorOf: W3C
name: “Tim Berners-Lee”
invented: WWW
Which property-value (PV) pairs are more helpful?
Four aspects of a good comparative summary
1. Reflecting commonality
2. Reflecting difference
3. Providing information on identity
4. Providing diverse information
1. Commonality
• Common PV pairs =
comparable properties + similar values
• More helpful properties =
more like an Inverse Functional Property (IFP)
TimBL
givenName: “Tim”
surname: “Berners-Lee”
altName: “Tim BL”
type: Scientist
gender: “male”
isDirectorOf: W3C
TBL
name: “Tim Berners-Lee”
type: ComputerScientist
type: RoyalSocietyFellow
sex: “Male”
invented: WWW
founded: WSRI
1. Commonality (details)
• Comparability between properties
• Learned from known coreferent entities
• String similarity
• Similarity between values
• String similarity
• Likeness to an IFP
• Estimated based on the data set
\[ \text{Likeness} = \frac{\text{Number of distinct values}}{\text{Number of all values}} \]
Comparable properties = Properties having similar values
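As a rough illustration of the IFP-likeness ratio, the sketch below computes it per property over a list of (subject, property, value) triples. The triple layout, the function name `ifp_likeness`, and the toy data (including the entities `E2` and `E3`) are assumptions made for this example; "number of all values" is read here as the number of value occurrences of the property in the data set.

```python
from collections import defaultdict

def ifp_likeness(triples):
    """Map each property to (distinct values) / (all value occurrences)."""
    values = defaultdict(list)
    for _subject, prop, value in triples:
        values[prop].append(value)
    return {prop: len(set(vs)) / len(vs) for prop, vs in values.items()}

# Hypothetical toy data set, not taken from the evaluation.
triples = [
    ("TimBL", "surname", "Berners-Lee"),
    ("E2", "surname", "Hall"),
    ("TimBL", "gender", "male"),
    ("E2", "gender", "female"),
    ("E3", "gender", "male"),
]

scores = ifp_likeness(triples)
```

In this toy data every surname is unique, so `surname` scores higher than `gender`, whose values repeat across entities.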
1. Commonality (weakness)
• Only reflecting commonality can be misleading.
TBL
name: “Tim Berners-Lee”
type: ComputerScientist
type: RoyalSocietyFellow
sex: “Male”
invented: WWW
founded: WSRI
Wendy
fullName: “Wendy Hall”
type: ComputerScientist
type: RoyalSocietyFellow
sex: “Female”
birthplace: London
founded: WSRI
2. Difference
• Different PV pairs =
comparable properties + dissimilar values
• More helpful properties =
more like a Functional Property (FP)
TBL
name: “Tim Berners-Lee”
type: ComputerScientist
type: RoyalSocietyFellow
sex: “Male”
invented: WWW
founded: WSRI
Wendy
fullName: “Wendy Hall”
type: ComputerScientist
type: RoyalSocietyFellow
sex: “Female”
birthplace: London
founded: WSRI
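The definitions of common and different PV pairs can be sketched as follows. This is only an illustration: the slides say comparability is learned from known coreferent entities, whereas here it is a hand-written set of property pairs. `difflib.SequenceMatcher` stands in for the unspecified string similarity, and the 0.9 threshold is an arbitrary assumption. The entity data is a subset of the TBL and Wendy example.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def compare(entity_a, entity_b, comparable, threshold=0.9):
    """Split comparable PV pairs into common and different ones."""
    common, different = [], []
    for prop_a, value_a in entity_a:
        for prop_b, value_b in entity_b:
            if (prop_a, prop_b) not in comparable:
                continue  # properties are not comparable: ignore the pair
            pair = ((prop_a, value_a), (prop_b, value_b))
            if similarity(value_a, value_b) >= threshold:
                common.append(pair)
            else:
                different.append(pair)
    return common, different

tbl = [("name", "Tim Berners-Lee"), ("sex", "Male"), ("founded", "WSRI")]
wendy = [("fullName", "Wendy Hall"), ("sex", "Female"), ("founded", "WSRI")]
# Assumed comparability; in the slides this part is learned.
comparable = {("name", "fullName"), ("sex", "sex"), ("founded", "founded")}

common, different = compare(tbl, wendy, comparable)
```

With these inputs the `founded` pair comes out as common, while the name and sex pairs come out as different, which is the evidence a summary needs to show for a non-coreferent pair.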
2. Difference (details)
• Comparability between properties
• Learned from known coreferent entities
• String similarity
• Dissimilarity between values
• String similarity
• Likeness to an FP
• Estimated based on the data set
\[ \text{Likeness} = \frac{\text{Number of distinct subjects}}{\text{Number of all subjects}} \]
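A minimal sketch of the FP-likeness ratio, mirroring the IFP case but counting subjects instead of values. The function name `fp_likeness` and the triple layout are assumptions for this example, and "number of all subjects" is read as the number of subject occurrences of the property. The toy triples reuse the TBL and Wendy descriptions.

```python
from collections import defaultdict

def fp_likeness(triples):
    """Map each property to (distinct subjects) / (all subject occurrences)."""
    subjects = defaultdict(list)
    for subject, prop, _value in triples:
        subjects[prop].append(subject)
    return {prop: len(set(ss)) / len(ss) for prop, ss in subjects.items()}

triples = [
    ("TBL", "type", "ComputerScientist"),
    ("TBL", "type", "RoyalSocietyFellow"),
    ("Wendy", "type", "ComputerScientist"),
    ("Wendy", "type", "RoyalSocietyFellow"),
    ("TBL", "sex", "Male"),
    ("Wendy", "sex", "Female"),
]

scores = fp_likeness(triples)
```

Here `sex` has one value per subject and scores 1.0, while the multi-valued `type` scores 0.5, so a mismatch on `sex` is the stronger sign of non-coreference.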
3. Information on identity
TimBL
givenName: “Tim”
surname: “Berners-Lee”
altName: “Tim BL”
type: Scientist
gender: “male”
isDirectorOf: W3C
TBL
name: “Tim Berners-Lee”
type: ComputerScientist
type: RoyalSocietyFellow
sex: “Male”
invented: WWW
founded: WSRI
3. Information on identity (details)
• Information on identity
• Estimated based on the data set
\[ \text{Information} = 1 - \frac{\log(\text{Number of entities having this PV pair})}{\log(\text{Number of all entities})} \]
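The information measure can be worked through with a few numbers. The function name and the entity counts below are hypothetical; the sketch assumes at least one entity has the PV pair and that the data set holds more than one entity, so both logarithms are defined and the denominator is nonzero.

```python
import math

def identity_information(entities_with_pair, total_entities):
    """1 - log(entities having the PV pair) / log(all entities)."""
    if entities_with_pair < 1 or total_entities < 2:
        raise ValueError("need at least 1 matching entity and 2 entities overall")
    return 1 - math.log(entities_with_pair) / math.log(total_entities)

# Hypothetical counts for a data set of 1000 entities.
unique_pair = identity_information(1, 1000)       # held by a single entity
rare_pair = identity_information(10, 1000)        # held by 10 entities
universal_pair = identity_information(1000, 1000) # held by every entity
```

A PV pair held by a single entity scores 1, a pair held by every entity scores 0, and the score falls logarithmically in between (about 0.67 for 10 out of 1000).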
4. Diversity of information
• Overlapping PV pairs =
similar properties or similar values
TimBL
givenName: “Tim”
surname: “Berners-Lee”
altName: “Tim BL”
type: Scientist
gender: “male”
isDirectorOf: W3C
Overlapping
To find an optimal summary
(or, to find the most helpful PV pairs)
• Maximize
• Commonality
• Difference
• Information on identity
• Diversity of information
• Subject to
• A length limit
• Formulated as a binary quadratic knapsack problem
• Solved by GRASP-based local search
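To make the optimization concrete, here is a generic GRASP sketch for a binary quadratic knapsack problem. It is not the authors' procedure: the restricted-candidate-list rule with `alpha`, the add/swap neighbourhood, and all numbers are assumptions. In this toy encoding, `profit[i]` stands for how helpful PV pair `i` is on its own, a negative `pair_profit[i][j]` penalizes overlapping pairs (diversity), and `weight` and `capacity` model the length limit.

```python
import random

def objective(selected, profit, pair_profit):
    """Item profits plus pairwise profits over all selected pairs."""
    items = sorted(selected)
    total = sum(profit[i] for i in items)
    for a, i in enumerate(items):
        for j in items[a + 1:]:
            total += pair_profit[i][j]
    return total

def construct(profit, pair_profit, weight, capacity, alpha, rng):
    """Greedy randomized construction using a restricted candidate list."""
    selected, used = set(), 0
    while True:
        gains = {}
        for i in range(len(profit)):
            if i in selected or used + weight[i] > capacity:
                continue
            g = profit[i] + sum(pair_profit[i][j] for j in selected)
            if g > 0:
                gains[i] = g
        if not gains:
            return selected
        best, worst = max(gains.values()), min(gains.values())
        threshold = best - alpha * (best - worst)
        choice = rng.choice([i for i in sorted(gains) if gains[i] >= threshold])
        selected.add(choice)
        used += weight[choice]

def local_search(selected, profit, pair_profit, weight, capacity):
    """Apply improving add or swap moves until none is left."""
    selected = set(selected)
    improved = True
    while improved:
        improved = False
        current = objective(selected, profit, pair_profit)
        used = sum(weight[i] for i in selected)
        for out in sorted(selected) + [None]:  # None means a pure add move
            for inn in range(len(profit)):
                if inn in selected:
                    continue
                new_used = used + weight[inn] - (0 if out is None else weight[out])
                if new_used > capacity:
                    continue
                trial = (selected - {out}) | {inn}
                if objective(trial, profit, pair_profit) > current + 1e-12:
                    selected, improved = trial, True
                    break
            if improved:
                break
    return selected

def grasp(profit, pair_profit, weight, capacity, iterations=20, alpha=0.3, seed=0):
    """Repeat construction plus local search and keep the best solution."""
    rng = random.Random(seed)
    best, best_value = set(), 0.0
    for _ in range(iterations):
        start = construct(profit, pair_profit, weight, capacity, alpha, rng)
        candidate = local_search(start, profit, pair_profit, weight, capacity)
        value = objective(candidate, profit, pair_profit)
        if value > best_value:
            best, best_value = candidate, value
    return best, best_value

# Hypothetical instance: four PV pairs, pairs 0 and 1 overlap heavily.
profit = [0.9, 0.8, 0.7, 0.4]
pair_profit = [[0.0, -0.6, 0.0, 0.0],
               [-0.6, 0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0, 0.0]]
weight = [1, 1, 1, 1]
best, best_value = grasp(profit, pair_profit, weight, capacity=2)
```

On this instance the two individually best pairs (0 and 1) are not chosen together because of their overlap penalty; the search settles on pairs 0 and 2 instead.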
Evaluation method
• 4 approaches to be blindly tested
• 20 subjects (university students)
• 24 random tasks for each subject
• 4 approaches * (3 positive cases + 3 negative cases)
• Sorted in random order
givenName: “Tim”
surname: “Berners-Lee”
isDirectorOf: W3C
name: “Tim Berners-Lee”
invented: WWW
[Figure: the entity summary is presented to the subject, who verifies the pair as Coreferent, Non-coreferent, or Not sure]
Data sets and tasks
• Data sets
• Tasks
Paris
http://dbpedia.org/resource/Paris,_Texas
http://dbpedia.org/resource/Paris
http://sws.geonames.org/4717560/
http://sws.geonames.org/2988507/
disambiguates
sameAs
(positive case)
sameAs
(positive case)
(negative cases)
Places
Films
Approaches
Approach     Description
NOSUMM       Present entire entity descriptions
GENERIC      Information on identity [3]; diversity of information
COMPSUMM     Commonality; difference; information on identity; diversity of information
COMPSUMM-C   Commonality; difference; information on identity; diversity of information
[3] Gong Cheng et al. RELIN: Relatedness and Informativeness-based Centrality for Entity Summarization (ISWC 2011)
Results (1)
• Accuracy of verification
• COMPSUMM ≈ NOSUMM > COMPSUMM-C > GENERIC
Results (2)
• Efficiency of verification
• COMPSUMM > NOSUMM (2.7 to 2.9 times faster)
Take-home messages
• Provide entity summaries for verifying coreference.
• improves efficiency (2.7 to 2.9 times faster)
• without notably affecting accuracy
• Provide comparative (but not just generic) summaries.
• Show both commonality and difference.
Future work
• Present = Summarize + Visualize
Thanks for your attention
Results (3)
• Erroneous decisions
• COMPSUMM-C > COMPSUMM (mostly in negative cases)
Performance testing
• Offline computation
• Comparability between properties (the learning part)
• Likeness to an IFP/FP
• Information on identity
• Online computation
• Similarity between properties/values
• Optimization
• Results
• Places (DBpedia and GeoNames): 24 ms per case
• Films (DBpedia and LinkedMDB): 35 ms per case

 
05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx
05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx
05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx
 
Sunlight Spectacle 2024 Practical Action Launch Event 2024-04-08
Sunlight Spectacle 2024 Practical Action Launch Event 2024-04-08Sunlight Spectacle 2024 Practical Action Launch Event 2024-04-08
Sunlight Spectacle 2024 Practical Action Launch Event 2024-04-08
 
General Elections Final Press Noteas per M
General Elections Final Press Noteas per MGeneral Elections Final Press Noteas per M
General Elections Final Press Noteas per M
 
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
 
Don't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunity
Don't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunityDon't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunity
Don't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunity
 
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRRINDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
 
Scootsy Overview Deck - Pan City Delivery
Scootsy Overview Deck - Pan City DeliveryScootsy Overview Deck - Pan City Delivery
Scootsy Overview Deck - Pan City Delivery
 
GESCO SE Press and Analyst Conference on Financial Results 2024
GESCO SE Press and Analyst Conference on Financial Results 2024GESCO SE Press and Analyst Conference on Financial Results 2024
GESCO SE Press and Analyst Conference on Financial Results 2024
 
Application of GIS in Landslide Disaster Response.pptx
Application of GIS in Landslide Disaster Response.pptxApplication of GIS in Landslide Disaster Response.pptx
Application of GIS in Landslide Disaster Response.pptx
 
Understanding Post Production changes (PPC) in Clinical Data Management (CDM)...
Understanding Post Production changes (PPC) in Clinical Data Management (CDM)...Understanding Post Production changes (PPC) in Clinical Data Management (CDM)...
Understanding Post Production changes (PPC) in Clinical Data Management (CDM)...
 
Engaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptxEngaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptx
 
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...
 

Facilitating Human Intervention in Coreference Resolution with Comparative Entity Summaries

  • 1. Facilitating Human Intervention in Coreference Resolution with Comparative Entity Summaries Danyun Xu, Gong Cheng, Yuzhong Qu Nanjing University, China Presented at ESWC 2014, Crete, Greece
  • 2. Coreference resolution TimBL givenName: “Tim” surname: “Berners-Lee” altName: “Tim BL” type: Scientist gender: “male” isDirectorOf: W3C TBL name: “Tim Berners-Lee” type: ComputerScientist type: RoyalSocietyFellow sex: “Male” invented: WWW founded: WSRI Wendy fullName: “Wendy Hall” type: ComputerScientist type: RoyalSocietyFellow sex: “Female” birthplace: London founded: WSRI
  • 3. Methods with humans in the loop (or, coordinating “ings”) • Active learning • Crowdsourcing • Pay-as-you-go
  • 4. Methods with humans in the loop (or, coordinating “ings”) • Active learning • Crowdsourcing • Pay-as-you-go Candidate coreferent entities … TimBL ------ Wendy TimBL ------ TBL ChrisB ------ Bizer … Select & Present Verify
  • 5. Methods with humans in the loop (or, coordinating “ings”) • Active learning • Crowdsourcing • Pay-as-you-go Candidate coreferent entities … TimBL ------ Wendy TimBL ------ TBL ChrisB ------ Bizer … Select & Present Verify Existing focus
  • 6. Methods with humans in the loop (or, coordinating “ings”) • Active learning • Crowdsourcing • Pay-as-you-go Candidate coreferent entities … TimBL ------ Wendy TimBL ------ TBL ChrisB ------ Bizer … Select & Present Verify Our focus
  • 7. Present entire entity descriptions?
  • 8. Present a compact comparative summary! givenName: “Tim” surname: “Berners-Lee” isDirectorOf: W3C name: “Tim Berners-Lee” invented: WWW
  • 9. Present a compact comparative summary! Which property-value (PV) pairs are more helpful?
  • 10. Four aspects of a good comparative summary 1. Reflecting commonality 2. Reflecting difference 3. Providing information on identity 4. Providing diverse information
  • 11. 1. Commonality • Common PV pairs = comparable properties + similar values TimBL givenName: “Tim” surname: “Berners-Lee” altName: “Tim BL” type: Scientist gender: “male” isDirectorOf: W3C TBL name: “Tim Berners-Lee” type: ComputerScientist type: RoyalSocietyFellow sex: “Male” invented: WWW founded: WSRI
  • 12. 1. Commonality • Common PV pairs = comparable properties + similar values • More helpful properties = more like an Inverse Functional Property (IFP) TimBL givenName: “Tim” surname: “Berners-Lee” altName: “Tim BL” type: Scientist gender: “male” isDirectorOf: W3C TBL name: “Tim Berners-Lee” type: ComputerScientist type: RoyalSocietyFellow sex: “Male” invented: WWW founded: WSRI
  • 13. 1. Commonality (details) • Comparability between properties • Learned from known coreferent entities • String similarity Comparable properties = Properties having similar values
  • 14. 1. Commonality (details) • Comparability between properties • Learned from known coreferent entities • String similarity • Similarity between values • String similarity Comparable properties = Properties having similar values
  • 15. 1. Commonality (details) • Comparability between properties • Learned from known coreferent entities • String similarity • Similarity between values • String similarity • Likeness to an IFP • Estimated based on the data set: $\text{Likeness} = \frac{\text{Number of distinct values}}{\text{Number of all values}}$ Comparable properties = Properties having similar values
  • 16. 1. Commonality (weakness) • Only reflecting commonality can be misleading. TBL name: “Tim Berners-Lee” type: ComputerScientist type: RoyalSocietyFellow sex: “Male” invented: WWW founded: WSRI Wendy fullName: “Wendy Hall” type: ComputerScientist type: RoyalSocietyFellow sex: “Female” birthplace: London founded: WSRI
  • 17. 2. Difference • Different PV pairs = comparable properties + dissimilar values TBL name: “Tim Berners-Lee” type: ComputerScientist type: RoyalSocietyFellow sex: “Male” invented: WWW founded: WSRI Wendy fullName: “Wendy Hall” type: ComputerScientist type: RoyalSocietyFellow sex: “Female” birthplace: London founded: WSRI
  • 18. 2. Difference • Different PV pairs = comparable properties + dissimilar values • More helpful properties = more like a Functional Property (FP) TBL name: “Tim Berners-Lee” type: ComputerScientist type: RoyalSocietyFellow sex: “Male” invented: WWW founded: WSRI Wendy fullName: “Wendy Hall” type: ComputerScientist type: RoyalSocietyFellow sex: “Female” birthplace: London founded: WSRI
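
The definitions of common and different PV pairs on the slides above can be sketched roughly as follows. This is not the authors' implementation: the `COMPARABLE` set is hand-written here (the slides say comparability is learned from known coreferent entities and from string similarity), `difflib.SequenceMatcher` stands in for the unspecified string similarity measure, and the 0.9 threshold is an arbitrary assumption.

```python
from difflib import SequenceMatcher

# Hypothetical, hand-written set of comparable property pairs (assumption).
COMPARABLE = {("name", "fullName"), ("type", "type"),
              ("sex", "sex"), ("founded", "founded")}


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify(e1, e2, comparable=COMPARABLE, threshold=0.9):
    """Split comparable PV pairs into common (similar values)
    and different (dissimilar values)."""
    common, different = [], []
    for p1, v1 in e1:
        for p2, v2 in e2:
            if (p1, p2) not in comparable:
                continue  # properties are not comparable: ignore the pair
            pair = ((p1, v1), (p2, v2))
            if similarity(v1, v2) >= threshold:
                common.append(pair)
            else:
                different.append(pair)
    return common, different


# Entity descriptions copied from the slides.
tbl = [("name", "Tim Berners-Lee"), ("type", "ComputerScientist"),
       ("type", "RoyalSocietyFellow"), ("sex", "Male"),
       ("invented", "WWW"), ("founded", "WSRI")]
wendy = [("fullName", "Wendy Hall"), ("type", "ComputerScientist"),
         ("type", "RoyalSocietyFellow"), ("sex", "Female"),
         ("birthplace", "London"), ("founded", "WSRI")]

common, different = classify(tbl, wendy)
```

For TBL and Wendy, `founded: WSRI` lands in `common`, while the `name`/`fullName` and `sex` pairs land in `different`. The cross pairs of the multi-valued `type` property are also reported as different, which suggests why differences on properties resembling a Functional Property are the more helpful ones.
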
  • 19. 2. Difference (details) • Comparability between properties • Learned from known coreferent entities • String similarity • Dissimilarity between values • String similarity • Likeness to an FP • Estimated based on the data set: $\text{Likeness} = \frac{\text{Number of distinct subjects}}{\text{Number of all subjects}}$
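
The two likeness ratios (to an IFP and to an FP) can be estimated from a set of triples along these lines. The sketch assumes that "number of all values" and "number of all subjects" mean the number of triples using the property, which is one reading of the slides; the function names and the toy triples are illustrative only.

```python
def ifp_likeness(triples, prop):
    """Distinct values / all value occurrences for `prop`.
    Equals 1.0 when no value is shared by two triples."""
    values = [v for _, p, v in triples if p == prop]
    return len(set(values)) / len(values) if values else 0.0


def fp_likeness(triples, prop):
    """Distinct subjects / all subject occurrences for `prop`.
    Equals 1.0 when every subject has exactly one value."""
    subjects = [s for s, p, _ in triples if p == prop]
    return len(set(subjects)) / len(subjects) if subjects else 0.0


# Toy data set built from the two entity descriptions on the slides.
triples = [
    ("TBL", "name", "Tim Berners-Lee"),
    ("TBL", "type", "ComputerScientist"),
    ("TBL", "type", "RoyalSocietyFellow"),
    ("TBL", "sex", "Male"),
    ("TBL", "founded", "WSRI"),
    ("Wendy", "type", "ComputerScientist"),
    ("Wendy", "type", "RoyalSocietyFellow"),
    ("Wendy", "sex", "Female"),
    ("Wendy", "founded", "WSRI"),
]

founded_ifp = ifp_likeness(triples, "founded")  # WSRI is shared by two subjects
founded_fp = fp_likeness(triples, "founded")    # each subject has one value
type_fp = fp_likeness(triples, "type")          # each subject has two types
```

On a data set this small the estimates are not meaningful (for example, `sex` looks perfectly inverse functional with only two entities); the ratios only become informative over a full data set.
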
  • 20. 3. Information on identity TimBL givenName: “Tim” surname: “Berners-Lee” altName: “Tim BL” type: Scientist gender: “male” isDirectorOf: W3C TBL name: “Tim Berners-Lee” type: ComputerScientist type: RoyalSocietyFellow sex: “Male” invented: WWW founded: WSRI
  • 21. 3. Information on identity (details) • Information on identity • Estimated based on the data set: $\text{information} = 1 - \frac{\log(\text{Number of entities having this PV pair})}{\log(\text{Number of all entities})}$
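
A small worked example of the information formula, as a sketch: the function name `information` and the entity counts below are made up for illustration. The base of the logarithm cancels in the ratio, so any base gives the same score.

```python
import math


def information(n_with_pair: int, n_entities: int) -> float:
    """1 - log(#entities having the PV pair) / log(#all entities).
    Rare PV pairs score near 1, ubiquitous ones near 0."""
    if n_entities < 2 or not 1 <= n_with_pair <= n_entities:
        raise ValueError("need 1 <= n_with_pair <= n_entities and n_entities >= 2")
    return 1 - math.log(n_with_pair) / math.log(n_entities)


# Hypothetical data set of 1000 entities.
unique_pair = information(1, 1000)     # held by a single entity
rare_pair = information(10, 1000)      # held by 10 entities
common_pair = information(1000, 1000)  # held by every entity
```

A PV pair held by one entity scores 1, a pair held by every entity scores 0, and a pair held by 10 of 1000 entities scores about 0.67.
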
  • 22. 4. Diversity of information • Overlapping PV pairs = similar properties or similar values TimBL givenName: “Tim” surname: “Berners-Lee” altName: “Tim BL” type: Scientist gender: “male” isDirectorOf: W3C Overlapping
  • 23. To find an optimal summary (or, to find the most helpful PV pairs) • Maximize • Commonality • Difference • Information on identity • Diversity of information • Subject to • A length limit
  • 24. To find an optimal summary (or, to find the most helpful PV pairs) • Maximize • Commonality • Difference • Information on identity • Diversity of information • Subject to • A length limit • Formulated as a binary quadratic knapsack problem • Solved by GRASP-based local search
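
The slides only name the formulation (binary quadratic knapsack) and the solver family (GRASP-based local search). The following is a generic, minimal sketch of that technique, not the authors' algorithm: how the four aspects are folded into `profit` and `pair_profit`, the use of a negative pairwise term to penalise overlapping PV pairs, the swap neighbourhood, and all parameter values are assumptions.

```python
import random


def objective(selected, profit, pair_profit):
    """Linear profits plus pairwise profits over the selected items."""
    items = sorted(selected)
    total = sum(profit[i] for i in items)
    for a, i in enumerate(items):
        for j in items[a + 1:]:
            total += pair_profit.get((i, j), 0.0)  # keys use i < j
    return total


def grasp(profit, pair_profit, weight, capacity,
          iterations=50, alpha=0.3, seed=0):
    rng = random.Random(seed)
    n = len(profit)
    score = lambda s: objective(s, profit, pair_profit)
    best, best_val = set(), 0.0
    for _ in range(iterations):
        # Greedy randomized construction.
        sel, used = set(), 0
        while True:
            base = score(sel)
            gains = {i: score(sel | {i}) - base for i in range(n)
                     if i not in sel and used + weight[i] <= capacity}
            gains = {i: g for i, g in gains.items() if g > 0}
            if not gains:
                break
            hi, lo = max(gains.values()), min(gains.values())
            # Restricted candidate list: items close to the best gain.
            rcl = [i for i, g in gains.items() if g >= hi - alpha * (hi - lo)]
            pick = rng.choice(rcl)
            sel.add(pick)
            used += weight[pick]
        # Local search: swap one selected item for one unselected item.
        improved = True
        while improved:
            improved = False
            current = score(sel)
            for i in list(sel):
                for j in range(n):
                    if j in sel or used - weight[i] + weight[j] > capacity:
                        continue
                    candidate = (sel - {i}) | {j}
                    if score(candidate) > current:
                        sel, used = candidate, used - weight[i] + weight[j]
                        improved = True
                        break
                if improved:
                    break
        if score(sel) > best_val:
            best, best_val = set(sel), score(sel)
    return best, best_val


# Toy instance: four PV pairs, items 0 and 1 overlap, summary limit of two.
profit = [0.9, 0.8, 0.7, 0.3]
pair_profit = {(0, 1): -0.6}
weight = [1, 1, 1, 1]
best, best_val = grasp(profit, pair_profit, weight, capacity=2)
```

On this toy instance the two highest-scoring items overlap, so the search settles on items 0 and 2 instead of 0 and 1. GRASP is a heuristic and does not guarantee an optimal selection in general.
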
  • 25. Evaluation method • 4 approaches to be blindly tested • 20 subjects (university students) • 24 random tasks for each subject • 4 approaches * (3 positive cases + 3 negative cases) • Sorted in random order givenName: “Tim” surname: “Berners-Lee” isDirectorOf: W3C name: “Tim Berners-Lee” invented: WWW Entity summary Subject Coreferent Non-coreferent Not sure Present Verify
  • 26. Data sets and tasks • Data sets Places Films
  • 27. Data sets and tasks • Data sets • Tasks http://dbpedia.org/resource/Paris,_Texas http://dbpedia.org/resource/Paris http://sws.geonames.org/4717560/ http://sws.geonames.org/2988507/ sameAs (positive case) sameAs (positive case) Places Films
  • 28. Data sets and tasks • Data sets • Tasks Paris http://dbpedia.org/resource/Paris,_Texas http://dbpedia.org/resource/Paris http://sws.geonames.org/4717560/ http://sws.geonames.org/2988507/ disambiguates sameAs (positive case) sameAs (positive case) (negative cases) Places Films
  • 29. Approaches (Approach: Description) • NOSUMM: Present entire entity descriptions • GENERIC: Information on identity [3]; Diversity of information • COMPSUMM: Commonality; Difference; Information on identity; Diversity of information • COMPSUMM-C: Commonality; Difference; Information on identity; Diversity of information • [3] Gong Cheng et al. RELIN: Relatedness and Informativeness-based Centrality for Entity Summarization (ISWC 2011)
  • 30. Results (1) • Accuracy of verification • COMPSUMM ≈ NOSUMM > COMPSUMM-C > GENERIC
  • 31. Results (2) • Efficiency of verification • COMPSUMM > NOSUMM (2.7 to 2.9 times faster)
  • 32. Take-home messages • Provide entity summaries for verifying coreference. • improves efficiency (2.7 to 2.9 times faster) • without notably affecting accuracy • Provide comparative (but not just generic) summaries. • Show both commonality and difference.
  • 33. Future work • Present = Summarize + Visualize Candidate coreferent entities … TimBL ------ Wendy TimBL ------ TBL ChrisB ------ Bizer … Select & Present Verify Our focus
  • 34. Thanks for your attention
  • 35. Results (3) • Erroneous decisions • COMPSUMM-C > COMPSUMM (mostly in negative cases)
  • 36. Performance testing • Offline computation • Comparability between properties (the learning part) • Likeness to an IFP/FP • Information on identity
  • 37. Performance testing • Offline computation • Comparability between properties (the learning part) • Likeness to an IFP/FP • Information on identity • Online computation • Similarity between properties/values • Optimization • Results • Places (DBpedia and GeoNames): 24ms per case • Films (DBpedia and LinkedMDB): 35ms per case