Ontology-Based Word Sense Disambiguation for Scientific Literature
1. Ontology-Based Word Sense Disambiguation for Scientific Literature
Roman Prokofyev, Gianluca Demartini, Philippe Cudre-Mauroux, Alexey Boyarsky, and Oleg Ruchayskiy
eXascale Infolab
University of Fribourg, Switzerland
March 25, ECIR 2013, Moscow
2. Problem definition
Example of ambiguity (each of these names abbreviates to SSM):
Supersymmetric Standard Model
State Space Model
Sequential Standard Model
• Machine translation: correct lexical choice.
• Information retrieval: ambiguity in queries, result diversification, etc.
• Knowledge extraction: proper text analysis and classification (our case).
Our contribution: leveraging the structure of a community-based ontology to identify the correct sense more accurately.
Datasets
• ScienceWISE abstract dataset + ScienceWISE (SW) ontology: http://sciencewise.info
• MSH abstract dataset + ontology from bioontology.org
Available at http://exascale.info/papers/ecir2013disambig
3. Base models
• Concept Context Vectors
Star formation efficiency (SFE): (Instability, 4), (Supernova, 2), (Milky Way, 3), …
• Document Concept Context Vectors
Document 1: (Milky Way, 1), (Electron neutrino, 1), (Electron antineutrino, 1), …
Document 2: (Local analysis, 1), (White dwarf, 3), (Poynting-Robertson effect, 1), …
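The context vectors above pair each co-occurring concept with a count. A minimal sketch of how such vectors could be built and compared follows. The slides do not say which similarity measure is used, so cosine similarity is an assumption here, and the function names `context_vector` and `cosine` are hypothetical. The sample data contains only the entries visible on the slide.

```python
from collections import Counter
from math import sqrt


def context_vector(concept, documents):
    """Count the concepts that co-occur with `concept` across `documents`.

    `documents` is a list of concept-name lists, one list per document.
    """
    counts = Counter()
    for doc_concepts in documents:
        if concept in doc_concepts:
            counts.update(c for c in doc_concepts if c != concept)
    return counts


def cosine(u, v):
    """Cosine similarity of two sparse count vectors stored as dicts.

    Assumption: the slides do not name the similarity measure.
    """
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Only the entries visible on the slide; the elided ones are left out.
sfe = {"Instability": 4, "Supernova": 2, "Milky Way": 3}
doc_1 = {"Milky Way": 1, "Electron neutrino": 1, "Electron antineutrino": 1}
doc_2 = {"Local analysis": 1, "White dwarf": 3, "Poynting-Robertson effect": 1}

print(cosine(sfe, doc_1))  # shares "Milky Way" with the SFE vector
print(cosine(sfe, doc_2))  # no shared concept, so 0.0
```

Under this assumption, a candidate sense would be preferred when its concept context vector is most similar to the vector of the document being disambiguated.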
4. Ontology shortest path
Min distance
Minimum over the ontological paths to other concepts in the document
Average distance
Average distance to other concepts in the document
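Both distance measures can be sketched with a breadth-first search over the ontology, treated here as an unweighted, undirected graph. This is an illustration under assumptions: the adjacency-dict representation, the function names, and the toy graph are hypothetical, and the slides do not say how unreachable concepts are handled (this sketch skips them).

```python
from collections import deque


def shortest_path_lengths(graph, source):
    """Breadth-first search: hop count from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def min_and_avg_distance(graph, candidate, doc_concepts):
    """Return (min, average) path length from a candidate sense to the
    other concepts in the document; unreachable concepts are skipped."""
    dist = shortest_path_lengths(graph, candidate)
    lengths = [dist[c] for c in doc_concepts if c != candidate and c in dist]
    if not lengths:
        return float("inf"), float("inf")
    return min(lengths), sum(lengths) / len(lengths)


# Hypothetical toy ontology: a chain A - B - C - D plus an isolated node E.
toy = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"], "E": []}
print(min_and_avg_distance(toy, "A", ["B", "D", "E"]))  # (1, 2.0)
```

A disambiguator built on this would compute the pair for each candidate sense and prefer the sense with the smaller distance.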
Nearest neighbors
Co-occurring 1-hop neighbors from the ontology
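One way to read the nearest-neighbors idea is to count, for each candidate sense, how many of its direct (1-hop) ontology neighbors also occur in the document. The sketch below follows that reading. The function names and the ontology links in the toy graph are invented for illustration and are not taken from the ScienceWISE ontology.

```python
def neighbor_overlap(graph, candidate, doc_concepts):
    """Number of 1-hop ontology neighbors of `candidate` found in the document."""
    return len(set(graph.get(candidate, ())) & set(doc_concepts))


def pick_sense(graph, candidates, doc_concepts):
    """Choose the candidate with the most co-occurring 1-hop neighbors.

    Ties go to the earliest candidate in `candidates`.
    """
    return max(candidates, key=lambda c: neighbor_overlap(graph, c, doc_concepts))


# Hypothetical ontology fragment; the edges are made up for this example.
ontology = {
    "Supersymmetric Standard Model": ["Supersymmetry", "Higgs boson"],
    "State Space Model": ["Kalman filter", "Time series"],
}
senses = ["Supersymmetric Standard Model", "State Space Model"]
document = ["Supersymmetry", "Higgs boson", "Kalman filter"]

print(pick_sense(ontology, senses, document))  # Supersymmetric Standard Model
```

Only direct neighbors are consulted, so this signal is cheaper than a shortest-path search, but it says nothing when no neighbor of any candidate appears in the document.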