Hybrid Acquisition of Temporal Scopes for
RDF Data
Anisa Rula¹, Matteo Palmonari¹, Axel-Cyrille Ngonga Ngomo², Daniel Gerber², Jens Lehmann², and Lorenz Bühmann²
¹ University of Milano-Bicocca, SITI Lab
² Universität Leipzig, Institut für Informatik, AKSW
Outline
1. Introduction & Motivation
2. Approach Overview
3. Details of the Approach
4. Experimental Evaluation
5. Conclusions
Temporal Scoping of RDF triples
• Some facts are always valid, while other facts are valid only for a certain time interval (volatile facts)
• Volatile facts are represented by triples whose validity is defined by a time interval, i.e. the temporal scope
[Figure: temporally annotated RDF triples. <Alexandre_Pato, team, A.C._Milan> 2007-2013; <Alexandre_Pato, team, S.C._Corinthians> 2013-2014. The temporal scopes are represented by time intervals.]
Motivation & Challenges
Motivation
• The world changes: relations represented in RDF triples may be valid only for a specific time interval [Gutierrez et al., 2005]
  o E.g. <Alexandre_Pato, team, A.C._Milan> [2007,2013]
• Many applications need temporally annotated RDF triples
  o E.g. Temporal Query Answering, Question Answering over KBs, Temporal Reasoning, Timelines
Challenges
• Low availability and quality of temporal information in RDF data
• NLP challenges for web-scale temporal information extraction (scalability, availability of corpora, conflicting information) [Derczynski et al., 2013]
Temporally annotated RDF triples are largely unavailable or incomplete in the LOD (Rula et al., 2012)
Approach Overview: Use the Web as Source of Evidence
• Use evidence from the Web for temporal scoping of RDF triples
[Figure: the World Wide Web (1.8 Billion) serves as the source of evidence for the Web of Data - RDF (61.9 Billion). The triples <Alexandre_Pato, team, A.C._Milan> and <Alexandre_Pato, team, S.C._Corinthians> become temporally annotated RDF triples with the scopes 2007-2013 and 2013-2014, respectively.]
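Approach Overview: Hybrid Acquisition of Time Scopes
[Figure: processing pipeline for an input triple <s,p,o>.
 Temporal Information Extraction:
   from the Web of Documents, a list of time points with their occurrence counts for the fact (t1 occ1, t2 occ2, t3 occ3, t4 occ4);
   from the Web of Data, a set of time points t1, t2, t3, …, tn arranged as the rows and columns of a matrix.
 Mapping facts to time intervals: Matching, Selection, Reasoning.
 Output: temporally annotated RDF triples, each with a set of disconnected time intervals: <s,p,o>[x1,y1],…,[xn,yn]]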
Temporal Information Extraction - Web Documents
DeFacto [Lehmann et al., 2012]
• Retrieves a set of webpages that confirm the given RDF triple
• The RDF triple issued to the search engine is verbalized using natural language patterns
Temporal Extension for DeFacto (TempDeFacto)
• Apply a named entity tagger to extract the entities of class Date
• Look for occurrences of the subject and object labels within fewer than 20 tokens of each other
• Analyze the context window of n characters before and after the subject-object occurrences in order to retrieve the time points
• Return a distribution vector of dates and their numbers of occurrences
Temporal Information Extraction - Web Documents (example)
Triple: <Alexandre_Pato, team, A.C._Milan>
Verbalizations:
• “Alexandre Pato” “played for” “A.C. Milan”
• “Pato” “’s striker” “Milan”
• “CR7” “Mi”
Sentences found on the Web:
• Pato played for A.C. Milan from 2007 to 2013.
• A.C. Milan’s top striker Pato left in 2013.
• In 2013 Pato visited Milan for a short holiday.
Steps: occurrences of the labels of the subject and object → context window of n characters before and after the subject-object occurrences → NamedEntityTagger → DeFacto Vector (dfv)
DeFacto Vector (dfv), as year: occurrences
2013: 17
2007: 11
2006: 1
…
2010: 4
2009: 4
1989: 2
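The context-window step can be sketched roughly as follows. This is only an illustration under stated assumptions: a regular expression for four-digit years stands in for the named entity tagger, and the function name `years_near_pair`, the window size `n_chars=40`, and the whitespace tokenization are hypothetical choices, not part of TempDeFacto.

```python
import re
from collections import Counter

# Assumption: a four-digit-year regex replaces the named entity tagger.
YEAR = re.compile(r"\b(?:1\d{3}|20\d{2})\b")

def years_near_pair(text, subj, obj, max_tokens=20, n_chars=40):
    """Return years found around a close co-occurrence of subj and obj."""
    s, o = text.find(subj), text.find(obj)
    if s < 0 or o < 0:
        return []  # one of the labels does not occur
    lo = min(s, o)
    hi = max(s + len(subj), o + len(obj))
    # Keep only co-occurrences spanning fewer than max_tokens tokens.
    if len(text[lo:hi].split()) >= max_tokens:
        return []
    # Context window of n_chars characters before and after the pair.
    window = text[max(0, lo - n_chars):hi + n_chars]
    return [int(y) for y in YEAR.findall(window)]

sentences = [
    "Pato played for A.C. Milan from 2007 to 2013.",
    "A.C. Milan's top striker Pato left in 2013.",
    "In 2013 Pato visited Milan for a short holiday.",
]

# Distribution vector: year -> number of occurrences.
dfv = Counter()
for sentence in sentences:
    dfv.update(years_near_pair(sentence, "Pato", "Milan"))

print(dfv.most_common())
```

Note that the third sentence also contributes a count for 2013 although it does not support the fact, which is the kind of noise the later matching and selection steps have to cope with.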
Temporal Information Extraction - Web of Data
• The RDF document d describing <Alexandre_Pato> is retrieved via content negotiation
• Relevant time points are extracted from d with regular expressions:
  T_Alexandre_Pato = {1989, 2000, 2006, 2007, 2008, 2013}
• Relevant Interval Matrix (RIM): the set of time intervals for a given triple, with starting and ending time points defined by the set of relevant time points

$$\forall\, rim_{t_i t_j} \in RIM_e \ \text{with}\ i, j > 0: \qquad rim_{t_i t_j} = \begin{cases} \text{null} & \text{for } i \le j \\ 0 & \text{for } i > j \end{cases}$$

Relevant Interval Matrix (RIM), rows = start point t_i, columns = end point t_j:
        1989  2000  2006  2007  2008  2013
1989    null  null  null  null  null  null
2000    0     null  null  null  null  null
2006    0     0     null  null  null  null
2007    0     0     0     null  null  null
2008    0     0     0     0     null  null
2013    0     0     0     0     0     null
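A minimal sketch of how such a matrix could be initialized from the relevant time points, following the rule stated on this slide (null for i ≤ j, 0 for i > j). The function name `build_rim` and the list-of-lists representation are assumptions made for illustration; `None` plays the role of null.

```python
def build_rim(time_points):
    """Return the sorted time points and an upper-triangular RIM."""
    pts = sorted(set(time_points))
    n = len(pts)
    # Row i is the start point, column j the end point.
    # Cells with i <= j are candidate intervals (None = not yet scored);
    # cells with i > j would end before they start and are set to 0.
    rim = [[None if i <= j else 0 for j in range(n)] for i in range(n)]
    return pts, rim

pts, rim = build_rim({1989, 2000, 2006, 2007, 2008, 2013})

# Candidate intervals are the cells that are still None.
candidates = [
    (pts[i], pts[j])
    for i in range(len(pts))
    for j in range(len(pts))
    if rim[i][j] is None
]
print(len(candidates))
```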
Mapping Facts to Time Intervals - Matching
Steps: Matching → Selection → Reasoning
1. Matching the temporal distribution (dfv) against the relevant time interval matrix (RIM) built from RDF data

dfv (year: occurrences)
2013: 17
2007: 11
2006: 1
2011: 6
2008: 2
2016: 3
2012: 15
2010: 4
2009: 4
1989: 2

Worked example for the interval [2007, 2008]:

$$sm_{2007:2008} = \frac{11 + 2}{2} = 6.5$$

Significance Matrix (SM), rows = start point, columns = end point:
        1989   2000   2006   2007   2008   2013
1989    0.004  0.166  0.166  0.736  0.8    2.48
2000    0      0      0.142  1.5    1.555  4.2
2006    0      0      0.002  6      4.666  7.5
2007    0      0      0      0.026  6.5    8.428
2008    0      0      0      0      0.004  8
2013    0      0      0      0      0      0.040
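The worked example and the off-diagonal entries of the SM appear consistent with a simple rule: sum the dfv occurrences of all years inside the interval and divide by the number of years the interval covers. The sketch below implements that reading. It is an inference from the numbers on the slide, not a formula stated in the source, and the names `significance` and `build_sm` are hypothetical. The diagonal entries seem to use a normalization that is not given on the slide, so they are left out.

```python
# dfv copied from the slide: year -> number of occurrences.
dfv = {2013: 17, 2007: 11, 2006: 1, 2011: 6, 2008: 2,
       2016: 3, 2012: 15, 2010: 4, 2009: 4, 1989: 2}

def significance(dfv, start, end):
    """Average number of occurrences per year inside [start, end]."""
    total = sum(dfv.get(year, 0) for year in range(start, end + 1))
    return total / (end - start + 1)

def build_sm(dfv, time_points):
    """Score every interval with start < end over the relevant time points."""
    pts = sorted(time_points)
    return {
        (s, e): significance(dfv, s, e)
        for i, s in enumerate(pts)
        for e in pts[i + 1:]
    }

sm = build_sm(dfv, [1989, 2000, 2006, 2007, 2008, 2013])
print(sm[(2007, 2008)])            # (11 + 2) / 2
print(round(sm[(2007, 2013)], 3))  # 59 / 7
```

Under this reading, years such as 2016 that fall outside every candidate interval simply never contribute to a score.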
Mapping Facts to Time Intervals - Selection
Steps: Matching → Selection → Reasoning
2. Mapping Selection, applied to the significance matrix (SM):
• top-k: selects the k intervals that have the highest scores in the SM
• neighbor-x: selects the set of intervals whose significance score is close to the maximum significance score in the SM, up to a certain threshold x
• neighbor-k-x: selects the top-k intervals in the neighborhood of the interval with the highest significance score
Examples on the SM for <Alexandre_Pato, team, A.C._Milan>:
• top-k, k = 3: [2006, 2013] [2007, 2013] [2008, 2013]
• neighbor-x, x = 23: [2007, 2008] [2006, 2013] [2007, 2013] [2008, 2013]
• neighbor-k-x, k = 2, x = 23: [2007, 2013] [2008, 2013]
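The three selection functions can be sketched as below. One assumption is made loudly here: x is read as a percentage of the maximum score, so neighbor-x keeps every interval scoring at least (1 - x/100) times the maximum. That reading reproduces the examples on this slide but is not spelled out in the source. The scores are a subset of the off-diagonal SM values copied from the slides, and the function names are hypothetical.

```python
# Subset of the off-diagonal SM scores from the slides.
sm = {
    (2000, 2013): 4.2,
    (2006, 2007): 6.0,
    (2006, 2008): 4.666,
    (2006, 2013): 7.5,
    (2007, 2008): 6.5,
    (2007, 2013): 8.428,
    (2008, 2013): 8.0,
}

def ranked(sm):
    """Intervals ordered by decreasing significance score."""
    return sorted(sm, key=sm.get, reverse=True)

def top_k(sm, k):
    return ranked(sm)[:k]

def neighbor_x(sm, x):
    # Assumption: x is a percentage of the maximum score.
    cutoff = max(sm.values()) * (1 - x / 100)
    return [iv for iv in ranked(sm) if sm[iv] >= cutoff]

def neighbor_k_x(sm, k, x):
    return neighbor_x(sm, x)[:k]

print(top_k(sm, 3))
print(neighbor_x(sm, 23))
print(neighbor_k_x(sm, 2, 23))
```

With x = 23 the cutoff is about 6.49, so [2007, 2008] (6.5) is kept while [2006, 2007] (6) is dropped, matching the slide.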
Mapping Facts to Time Intervals - Reasoning
Steps: Matching → Selection → Reasoning
3. Interval merging via reasoning based on the relations of Allen’s algebra
Example for <Alexandre_Pato, playsFor, A.C._Milan>:
[2007, 2013] [2008, 2013] → [2007, 2013]
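A minimal sketch of the merging step, assuming that two intervals are merged whenever they share at least one time point (which covers Allen relations such as overlaps, starts, during, finishes and equals on closed intervals). The slides do not say which relations trigger a merge, so this rule and the name `merge_intervals` are assumptions.

```python
def merge_intervals(intervals):
    """Merge closed intervals that share at least one time point."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Shares a point with the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            # Disconnected from everything seen so far: keep it separate.
            merged.append((start, end))
    return merged

print(merge_intervals([(2007, 2013), (2008, 2013)]))
print(merge_intervals([(2000, 2002), (2007, 2013)]))
```

Disconnected intervals stay separate, so the result is a set of disconnected time intervals per triple, as in the approach overview.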
Experimental Setup - Dataset
Dataset   # facts   Domain        Property   Equivalent property (Freebase)   Equivalent property (Yago2)
DBpedia   1000      Sport         team       team                             playsFor
DBpedia   1000      Politicians   office     government_positions_held        holdsPoliticalPosition
DBpedia   500       Celebrities   spouse     spouse                           ismarriedTo
Dataset: 2500 DBpedia triples with semantically equivalent triples in Freebase and Yago2
Gold standard: triples annotated with temporal scopes in Yago2
• manually curated to correct missing or wrong values
Experimental Setup - Evaluation Measures
The evaluation measures capture the degree of overlap between the retrieved intervals and the intervals in the gold standard
• Precision (for a triple): fraction of the time points in the temporal scope that fall into the time interval in the gold standard
• Recall (for a triple): fraction of the time points in the gold standard that are covered by the temporal scope
• F1 measure (for a triple): the harmonic mean of precision and recall
• Macro-averaged F1 (avgF1): aggregated measure for a set of triples
[Figure: retrieved intervals R compared with the reference interval Ref = [2007, 2011].
 R = [2007, 2011]: F1 = 1
 R = [2006, 2012]: F1 = 0.83
 R = [2008, 2010]: F1 = 0.75]
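The measures can be illustrated with a short sketch that treats each interval as the set of years it contains. Yearly granularity, single-interval scopes, and the function names are assumptions made for this example.

```python
def precision_recall_f1(retrieved, gold):
    """Overlap-based scores for two closed year intervals (start, end)."""
    r = set(range(retrieved[0], retrieved[1] + 1))
    g = set(range(gold[0], gold[1] + 1))
    overlap = len(r & g)
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(r)   # retrieved points that are correct
    recall = overlap / len(g)      # gold points that are covered
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

def macro_avg_f1(pairs):
    """Mean of the per-triple F1 scores."""
    scores = [precision_recall_f1(r, g)[2] for r, g in pairs]
    return sum(scores) / len(scores)

ref = (2007, 2011)
for candidate in [(2007, 2011), (2006, 2012), (2008, 2010)]:
    print(candidate, round(precision_recall_f1(candidate, ref)[2], 2))
```

For R = [2006, 2012], five of the seven retrieved years lie in Ref, so precision is 5/7 and recall is 1; for R = [2008, 2010], precision is 1 and recall is 3/5.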
Experimental Results - Accuracy of Best Configurations for all Properties
• Different sources for the creation of the RIM
• Different configurations of the selection and reasoning steps:
  o E.g. config top-3 refers to selection function top-3 and reasoning = yes

                         DBpedia                      Freebase                      TemporalDeFacto
Temp prop                Config      #facts  avgF1    Config       #facts  avgF1    Config   #facts  avgF1
playsFor                 top-1 loc   264     0.505    top-1 loc    213     0.477    top-3    311     0.511
holdsPoliticalPosition   neigh-10    702     0.699    neigh-10-2   242     0.549    top-3    709     0.586
ismarriedTo              neigh-10    702     0.600    neigh-10     524     0.547    top-3    709     0.545

• Good quality of the approach, with an avgF1 of up to 70%
• Using evidence from RDF documents, the performance can be significantly improved (significantly better results for two properties and negligibly worse results for one property)
Experimental Results - Accuracy with vs. without Reasoning for all Properties
• The best configurations for the three properties

                                                        With reasoning     Without reasoning
Temp prop                Source        Configuration    #facts  avgF1      #facts  avgF1
playsFor                 TempDeFacto   top-3            311     0.511      505     0.467
holdsPoliticalPosition   DBpedia       neigh-10         702     0.699      822     0.667
ismarriedTo              DBpedia       neigh-10         705     0.600      977     0.563

• The best results are obtained when reasoning is enabled
Conclusions & Future Work
Summary
• Temporal extension of the DeFacto framework
• Modeling a space of relevant time intervals given an RDF triple
• Mapping volatile facts to time intervals based on a three-phase algorithm
• Unsupervised method
Future work
• Determine whether or not to add the temporal scope, based on the confidence of the acquisition process
• Collect additional relevant time points to improve the overall results
• Show the effectiveness of acquired temporal scopes in temporal query answering
Thank you for your attention
Questions?
#eswc2014Rula
References
• [Rula&2012] Anisa Rula, Matteo Palmonari, Andreas Harth, Steffen Stadtmüller, Andrea Maurino: On the Diversity and Availability of Temporal Information in Linked Open Data. International Semantic Web Conference (1) 2012: 492-507
• [Gutierrez&2005] C. Gutierrez, C. A. Hurtado, and A. A. Vaisman. Temporal RDF. In The 2nd ESWC, pages 93-107, 2005.
• [Lehmann&2012] Jens Lehmann, Daniel Gerber, Mohamed Morsey, Axel-Cyrille Ngonga Ngomo: DeFacto - Deep Fact Validation. International Semantic Web Conference (1) 2012: 312-327
• [Ling&2010] X. Ling and D. S. Weld. Temporal information extraction. In 25th AAAI, 2010.
• [Derczynski&2013] L. Derczynski and R. Gaizauskas. Information retrieval for temporal bounding. In 4th ICTIR, pages 29:129–29:130. ACM, 2013.

❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Creating and Analyzing Definitive Screening Designs
Creating and Analyzing Definitive Screening DesignsCreating and Analyzing Definitive Screening Designs
Creating and Analyzing Definitive Screening Designs
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
 

Hybrid Acquisition of Temporal Scopes for RDF Data

  • 1. Hybrid Acquisition of Temporal Scopes for RDF Data
      Anisa Rula (1), Matteo Palmonari (1), Axel-Cyrille Ngonga Ngomo (2), Daniel Gerber (2), Jens Lehmann (2), and Lorenz Bühmann (2)
      (1) University of Milano-Bicocca, SITI Lab
      (2) Universität Leipzig, Institut für Informatik, AKSW
  • 2. Outline
      1. Introduction & Motivation
      2. Approach Overview
      3. Details of the Approach
      4. Experimental Evaluation
      5. Conclusions
  • 3. Temporal Scoping of RDF Triples
      - Some facts are always valid, while other facts (volatile facts) are valid only for a certain time interval.
      - Volatile facts are represented by triples whose validity is defined by a time interval, i.e. the temporal scope.
      Example of temporally annotated RDF triples (temporal scopes represented by time intervals):
        <Alexandre_Pato, team, A.C._Milan>        2007-2013
        <Alexandre_Pato, team, S.C._Corinthians>  2013-2014
  • 4. Motivation & Challenges
      Motivation
      - The world changes: relations represented in RDF triples may be valid only for a specific time interval [Gutierrez et al., 2005].
          o E.g. <Alexandre_Pato, team, A.C._Milan> [2007, 2013]
      - Many applications have to use temporally annotated RDF triples.
          o E.g. temporal query answering, question answering over KBs, temporal reasoning, timelines
      Challenges
      - Low availability and quality of temporal information in RDF data: temporally annotated RDF triples are largely unavailable or incomplete in the LOD (Rula et al., 2012).
      - NLP challenges for web-scale temporal information extraction (scalability, availability of corpus, conflicting information) [Derczynski et al., 2013].
  • 5. Approach Overview: Use the Web as Source of Evidence
      - Use evidence from the Web for temporal scoping of RDF triples.
      - Sources of evidence: Web of Data - RDF (61.9 billion) and World Wide Web (1.8 billion).
      - Output: temporally annotated RDF triples, e.g. <Alexandre_Pato, team, A.C._Milan> 2007-2013 and <Alexandre_Pato, team, S.C._Corinthians> 2013-2014.
  • 6. Approach Overview: Hybrid Acquisition of Time Scopes
      - Input: a fact <s,p,o>.
      - Temporal information extraction from the Web of Documents: time points t1, ..., t4 with their occurrence counts occ1, ..., occ4.
      - Temporal information extraction from the Web of Data: relevant time points t1, t2, t3, ..., tn.
      - Mapping facts to time intervals in three steps: Matching, Selection, Reasoning.
      - Output: temporally annotated RDF triples with a set of disconnected time intervals, <s,p,o>[x1,y1],...,[xn,yn].
  • 7. Temporal Information Extraction - Web Documents
      DeFacto [Lehmann et al., 2012]
      - Retrieves a set of web pages that confirm the given RDF triple.
      - The RDF triple issued to the search engine is verbalized by using natural language patterns.
      Temporal Extension for DeFacto (TempDeFacto)
      - Apply a named entity tagger to extract the entities of type Date.
      - Find the places where the labels of the subject and the object occur within fewer than 20 tokens of each other.
      - Analyze the context window of n characters before and after each subject-object occurrence in order to retrieve the time points.
      - Return a distribution vector of dates and their number of occurrences.
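The TempDeFacto steps on this slide can be illustrated with a minimal sketch. It is not the DeFacto implementation: a plain four-digit-year regular expression stands in for the named entity tagger, and the function name `date_vector` and the parameters `max_gap` and `window` are hypothetical names chosen for illustration.

```python
import re
from collections import Counter

# Assumption: a simple year pattern replaces the named entity tagger.
YEAR = re.compile(r"\b(?:1[89]\d{2}|20\d{2})\b")

def date_vector(text, subject_label, object_label, max_gap=20, window=40):
    """Count years found near co-occurrences of the two labels."""
    def spans(label):
        return [m.span() for m in re.finditer(re.escape(label), text)]

    counts = Counter()
    for s_start, s_end in spans(subject_label):
        for o_start, o_end in spans(object_label):
            # Tokens lying strictly between the two label occurrences.
            between = text[min(s_end, o_end):max(s_start, o_start)]
            if len(between.split()) >= max_gap:
                continue
            # Context: `window` characters before and after the pair.
            lo = max(0, min(s_start, o_start) - window)
            hi = max(s_end, o_end) + window
            counts.update(int(y) for y in YEAR.findall(text[lo:hi]))
    return counts

dfv = date_vector("Pato played for Milan from 2007 to 2013.", "Pato", "Milan")
```

Each subject-object pair contributes its own context window, so a year that lies near several pairs is counted several times; whether the original system deduplicates such hits is not stated in the slides.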
  • 8. Temporal Information Extraction - Web Documents (example)
      Input triple: <Alexandre_Pato, team, A.C._Milan>
      Verbalizations passed to DeFacto: “Alexandre Pato” “played for” “A.C. Milan”; “Pato” “’s striker” “Milan”; “CR7” “Mi”
      Retrieved text with occurrences of the labels of the subject and object:
        Pato played for A.C. Milan from 2007 to 2013.
        A.C. Milan’s top striker Pato left in 2013.
        In 2013 Pato visited Milan for a short holiday.
      The named entity tagger is applied to the context window of n characters before and after the subject-object occurrences.
      Resulting DeFacto vector (dfv):
        Year   Occurrences
        2013   17
        2007   11
        2006   1
        ….     ….
        2010   4
        2009   4
        1989   2
  • 9. Temporal Information Extraction - Web of Data
      - The RDF document d describing <Alexandre_Pato> is retrieved via content negotiation.
      - Regular expressions extract the relevant time points from d:
          T_Alexandre_Pato = {1989, 2000, 2006, 2007, 2008, 2013}
      - The set of time intervals for a given triple has starting and ending time points drawn from the set of relevant time points. These intervals form the Relevant Interval Matrix (RIM):
                1989  2000  2006  2007  2008  2013
          1989  null  null  null  null  null  null
          2000  0     null  null  null  null  null
          2006  0     0     null  null  null  null
          2007  0     0     0     null  null  null
          2008  0     0     0     0     null  null
          2013  0     0     0     0     0     null
      - Definition:
          $$\forall\, rim_{t_i t_j} \in RIM_e \text{ with } i, j > 0: \quad rim_{t_i t_j} = \begin{cases} \text{null} & \text{for } i \le j \\ 0 & \text{for } i > j \end{cases}$$
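As a small illustration of the RIM definition on this slide, the sketch below builds the matrix from a set of relevant time points. The function name `build_rim` is a hypothetical choice, and Python's `None` plays the role of the slide's `null`.

```python
def build_rim(time_points):
    """Return the sorted time points and the Relevant Interval Matrix.

    Cells with i <= j are None (candidate intervals still to be scored);
    cells with i > j are 0 (start after end, not a valid interval).
    """
    pts = sorted(set(time_points))
    n = len(pts)
    rim = [[None if i <= j else 0 for j in range(n)] for i in range(n)]
    return pts, rim

pts, rim = build_rim({1989, 2000, 2006, 2007, 2008, 2013})
# Number of candidate intervals, including the single-year diagonal cells.
candidates = sum(cell is None for row in rim for cell in row)
```

For the six time points of the example this yields 21 candidate cells, matching the upper triangle (diagonal included) of the matrix shown on the slide.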
  • 10. Mapping Facts to Time Intervals - Matching (steps: Matching, Selection, Reasoning)
      1. Matching the temporal distribution (dfv) against the relevant time interval matrix (RIM)
      Inputs: the RIM built from RDF data over the time points 1989, 2000, 2006, 2007, 2008, 2013 (all candidate cells null), and the dfv:
        Year   Occurrences
        2013   17
        2007   11
        2006   1
        2011   6
        2008   2
        2016   3
        2012   15
        2010   4
        2009   4
        1989   2
      Output: the Significance Matrix (SM):
                1989   2000   2006   2007   2008   2013
          1989  0.004  0.166  0.166  0.736  0.8    2.48
          2000  0      0      0.142  1.5    1.555  4.2
          2006  0      0      0.002  6      4.666  7.5
          2007  0      0      0      0.026  6.5    8.428
          2008  0      0      0      0      0.004  8
          2013  0      0      0      0      0      0.040
      Example: $sm_{2007:2008} = \frac{11 + 2}{2} = 6.5$
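The off-diagonal cells of the SM can be reproduced with a short sketch, assuming each cell is the sum of the dfv counts for all years in the interval divided by the number of years in the interval (this reading matches the worked example for 2007:2008). The diagonal cells use a different, penalizing formula that the slides do not spell out, so they are left out here. The name `significance_matrix` is hypothetical.

```python
# dfv and relevant time points as shown on the slide.
DFV = {2013: 17, 2007: 11, 2006: 1, 2011: 6, 2008: 2,
       2016: 3, 2012: 15, 2010: 4, 2009: 4, 1989: 2}
TIME_POINTS = [1989, 2000, 2006, 2007, 2008, 2013]

def significance_matrix(dfv, time_points):
    """Score every interval [start, end] with start < end.

    Assumed formula: average number of occurrences per year in the interval.
    """
    pts = sorted(set(time_points))
    sm = {}
    for i, start in enumerate(pts):
        for end in pts[i + 1:]:
            n_years = end - start + 1
            total = sum(dfv.get(y, 0) for y in range(start, end + 1))
            sm[(start, end)] = total / n_years
    return sm

SM = significance_matrix(DFV, TIME_POINTS)
print(SM[(2007, 2008)])            # 6.5, as in the slide's example
print(round(SM[(2007, 2013)], 3))  # 8.429; the slide shows 8.428
```

Years outside the relevant time points (such as 2016) never contribute, because no candidate interval covers them.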
  • 11. Mapping Facts to Time Intervals - Selection (steps: Matching, Selection, Reasoning)
      Input: the Significance Matrix (SM):
                1989   2000   2006   2007   2008   2013
          1989  0.004  0.166  0.166  0.736  0.8    2.48
          2000  0      0      0.142  1.5    1.555  4.2
          2006  0      0      0.002  6      4.666  7.5
          2007  0      0      0      0.026  6.5    8.428
          2008  0      0      0      0      0.004  8
          2013  0      0      0      0      0      0.040
      2. Mapping selection:
      - top-k: selects the k intervals that have the highest scores in the SM.
      - neighbor-x: selects the set of intervals whose significance score is close to the maximum significance score in the SM, up to a certain threshold x.
      - neighbor-k-x: selects the top-k intervals in the neighborhood of the interval with the highest significance score.
      Examples:
        top-k, k = 3:                  [2006, 2013] [2007, 2013] [2008, 2013]
        neighbor-x, x = 23:            [2007, 2008] [2006, 2013] [2007, 2013] [2008, 2013]
        neighbor-k-x, k = 2, x = 23:   [2007, 2013] [2008, 2013]
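The three selection functions can be sketched as follows. The threshold x is interpreted here as a percentage of the maximum score, which is an assumption; it does reproduce the slide's examples for x = 23. The function names and the reduced `SM` dictionary (only the higher-scoring cells from the slide) are for illustration only.

```python
# A subset of the SM cells from the slide, keyed by (start, end).
SM = {
    (2006, 2007): 6.0, (2006, 2008): 4.666, (2006, 2013): 7.5,
    (2007, 2008): 6.5, (2007, 2013): 8.428, (2008, 2013): 8.0,
    (2000, 2013): 4.2, (1989, 2013): 2.48,
}

def top_k(sm, k):
    """The k intervals with the highest significance scores."""
    return sorted(sm, key=sm.get, reverse=True)[:k]

def neighbor_x(sm, x):
    """Intervals scoring within x percent of the maximum (assumed meaning of x)."""
    threshold = max(sm.values()) * (1 - x / 100)
    close = [iv for iv, score in sm.items() if score >= threshold]
    return sorted(close, key=sm.get, reverse=True)

def neighbor_k_x(sm, k, x):
    """The top-k intervals among the neighbors of the best interval."""
    return neighbor_x(sm, x)[:k]

print(top_k(SM, 3))             # [(2007, 2013), (2008, 2013), (2006, 2013)]
print(neighbor_x(SM, 23))       # adds (2007, 2008) to the three above
print(neighbor_k_x(SM, 2, 23))  # [(2007, 2013), (2008, 2013)]
```

With x = 23 the cut-off is about 6.49, so the interval [2007, 2008] with score 6.5 is just inside the neighborhood while [2006, 2007] with score 6 is excluded.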
  • 12. Mapping Facts to Time Intervals - Reasoning (steps: Matching, Selection, Reasoning)
      3. Interval merging via reasoning based on the relations of Allen’s interval algebra
      Example for <Alexandre_Pato, playsFor, A.C._Milan>: the selected intervals [2007, 2013] and [2008, 2013] are merged into [2007, 2013].
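A minimal sketch of the merging step, assuming that two intervals are merged whenever they share at least one time point (which covers Allen relations such as overlaps, starts, during, finishes, equals, and meets at a shared endpoint). The slides do not list the exact rules used, so this is a simplification, and `merge_intervals` is a hypothetical name.

```python
def merge_intervals(intervals):
    """Merge intervals that share at least one time point.

    Intervals are (start, end) tuples with inclusive year bounds.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Shares a time point with the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged

print(merge_intervals([(2007, 2013), (2008, 2013)]))  # [(2007, 2013)]
```

Intervals that do not touch are kept apart, so the result can still be a set of disconnected time intervals, as in the output format <s,p,o>[x1,y1],...,[xn,yn].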
  • 13. Experimental Setup - Dataset
        Dataset   # facts   Domain        Property   Equivalent property (Freebase)   Equivalent property (Yago2)
        DBpedia   1000      Sport         team       team                             playsFor
        DBpedia   1000      Politicians   office     government_positions_held        holdsPoliticalPosition
        DBpedia   500       Celebrities   spouse     spouse                           ismarriedTo
      - Dataset: 2500 DBpedia triples with semantically equivalent triples in Freebase and Yago2.
      - Gold standard: triples annotated with temporal scopes in Yago2, manually curated to correct missing or wrong values.
  • 14. Experimental Setup - Evaluation Measures
      The evaluation measures capture the degree of overlap between the retrieved intervals and the intervals in the gold standard.
      - Precision (for a triple): the fraction of time points in the retrieved temporal scope that fall into the time interval in the gold standard.
      - Recall (for a triple): the fraction of time points in the gold standard that are covered by the retrieved temporal scope.
      - F1 measure (for a triple): the harmonic mean of precision and recall.
      - Macro-averaged F1 (avgF1): aggregated measure for a set of triples.
      Examples (Ref = gold-standard interval, R = retrieved interval):
        Ref [2007, 2011], R [2007, 2011]: F1 = 1
        Ref [2007, 2011], R [2006, 2012]: F1 = 0.83
        Ref [2007, 2011], R [2008, 2010]: F1 = 0.75
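The measures on this slide can be checked with a short sketch, assuming time points are years and interval bounds are inclusive (this reading reproduces the three F1 values in the example). The function names are hypothetical.

```python
def years(intervals):
    """Set of years covered by a list of inclusive (start, end) intervals."""
    return {y for start, end in intervals for y in range(start, end + 1)}

def precision_recall_f1(retrieved, gold):
    """Overlap-based precision, recall and F1 for one triple."""
    r, g = years(retrieved), years(gold)
    overlap = len(r & g)
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(r)
    recall = overlap / len(g)
    return precision, recall, 2 * precision * recall / (precision + recall)

def macro_f1(pairs):
    """Average of the per-triple F1 scores over (retrieved, gold) pairs."""
    scores = [precision_recall_f1(r, g)[2] for r, g in pairs]
    return sum(scores) / len(scores)

gold = [(2007, 2011)]
for retrieved in ([(2007, 2011)], [(2006, 2012)], [(2008, 2010)]):
    print(retrieved, round(precision_recall_f1(retrieved, gold)[2], 2))
```

For the retrieved interval [2006, 2012] precision is 5/7 and recall is 1, giving F1 of about 0.83; for [2008, 2010] precision is 1 and recall is 3/5, giving F1 = 0.75.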
  • 15. Experimental Results - Accuracy of Best Configurations for all Properties
      - Different sources are used for the creation of the RIM.
      - Different configurations are set up in the selection and reasoning steps.
          o E.g. config top-3 refers to selection function top-3 and reasoning = yes
                                   DBpedia                      Freebase                       TemporalDeFacto
        Temp prop                  Config      #facts  avgF1    Config       #facts  avgF1     Config  #facts  avgF1
        playsFor                   top-1 loc   264     0.505    top-1 loc    213     0.477     top-3   311     0.511
        holdsPoliticalPosition     neigh-10    702     0.699    neigh-10-2   242     0.549     top-3   709     0.586
        ismarriedTo                neigh-10    702     0.600    neigh-10     524     0.547     top-3   709     0.545
      - Good quality of the approach, with an avgF1 of up to 70%.
      - Using evidence from RDF documents, the performance can be significantly improved (significantly better results for two properties and negligibly worse results for one property).
  • 16. Experimental Results - Accuracy with vs. without Reasoning for all Properties
      - The best configurations for the three properties:
                                                                  With reasoning     Without reasoning
        Temp prop                Source        Configuration      #fact   avgF1      #fact   avgF1
        playsFor                 TempDeFacto   top-3              311     0.511      505     0.467
        holdsPoliticalPosition   DBpedia       neigh-10           702     0.699      822     0.667
        ismarriedTo              DBpedia       neigh-10           705     0.600      977     0.563
      - The best results are obtained when reasoning is enabled.
  • 17. Conclusions & Future Work
      Summary
      - Temporal extension of the DeFacto framework
      - Modeling a space of relevant time intervals given an RDF triple
      - Mapping volatile facts to time intervals based on a three-phase algorithm
      - Unsupervised method
      Future work
      - Determine when to add or not to add the temporal scope, based on the confidence of the acquisition process
      - Collect additional relevant time points to improve the overall results
      - Show the effectiveness of acquired temporal scopes in temporal query answering
  • 18. Thank you for your attention. Questions? #eswc2014Rula
  • 19. References
      - [Rula&2012] Anisa Rula, Matteo Palmonari, Andreas Harth, Steffen Stadtmüller, Andrea Maurino: On the Diversity and Availability of Temporal Information in Linked Open Data. International Semantic Web Conference (1) 2012: 492-507.
      - [Gutiérrez&2005] C. Gutierrez, C. A. Hurtado, and A. A. Vaisman: Temporal RDF. In the 2nd ESWC, pages 93-107, 2005.
      - [Lehmann&2012] Jens Lehmann, Daniel Gerber, Mohamed Morsey, Axel-Cyrille Ngonga Ngomo: DeFacto - Deep Fact Validation. International Semantic Web Conference (1) 2012: 312-327.
      - [Ling&2010] X. Ling and D. S. Weld: Temporal information extraction. In 25th AAAI, 2010.
      - [Derczynsk&2013] L. Derczynski and R. Gaizauskas: Information retrieval for temporal bounding. In 4th ICTIR, pages 29:129–29:130. ACM, 2013.

Editor's notes

  1. The 1.8 billion figure is from http://www.worldwidewebsize.com/. Note: we also consider more than one temporal annotation per triple.
  2. Temporally annotated RDF triples are useful for many reasons. Facts are usually considered time-invariant, while in reality they change dynamically. The problem space is large (even at coarse temporal granularity levels, e.g. all possible time intervals at year granularity). Time can be used as a dimension along which facts can be organized, ranked or explored, and for relevancy ranking purposes.
  3. The 1.8 billion figure is from http://www.worldwidewebsize.com/.
  4. Finally, we return a distribution of all dates and their number of occurrences in a given context. Hence, the output of temporal DeFacto for a fact f = <s, p, o> can be regarded as a vector DFV over all possible time points ti, whose i-th entry is the number of co-occurrences of s or o with ti.
  5. The links between the facts and the dates are lost. We assume that temporal triples contain relevant time points. Dates are considered at year level.
  6. Each cell in the SM represents the significance of the interval identified by the cell for the given fact, based on the distribution of time points acquired from the Web. We inject the time distribution vector into the entity-level RIM, producing a significance matrix SM. Each cell of the matrix where i < j is calculated as the average number of occurrences of the time points contained in the interval [i, j]. For the diagonal we provide an alternative formula that penalizes intervals of one year by giving a weight to the number of time points in the diagonal.
  7. %
  8. Macro precision is computed as the average over all facts.
  9. Difficulty of the task: it requires sufficient relevant time points. With macro averaging, the task is difficult since the result depends on the number of available time points.
  10. For many facts we are very precise, for others less so. Future work: understand why the imprecise ones are imprecise, and provide a confidence value.