1. Introduction A theory of concept drift Case studies Summary and future work
What is concept drift and how to measure it?
Shenghui Wang, Stefan Schlobach, Michel Klein
Vrije Universiteit Amsterdam
EKAW 2010
Lisbon
Outline
1 Introduction
2 A theory of concept drift
3 Case studies
Concept drift in political communication
Concept drift in DBpedia
Concept drift in LKIF-Core
4 Summary and future work
Introduction
Knowledge organisation systems (KOS) play a crucial role in
providing semantic interoperability
formal ontologies (modelled in OWL)
thesauri or taxonomies (described in SKOS)
other term classification schemes
Concepts are the central constructs
However, it is also recognised that concepts drift
the meaning of a concept changes over time, location, or
culture
Example 1: Follow the Fashion?
Example 2: Women’s role?
Suffragettes said that women’s role in society is unacceptable
Pope says that women’s role in society is unacceptable
Example 3: European Union
(1979) The European Community is a common denominator for the
European Economic Community (EEC), the European Coal and Steel
Community (ECSC), and the European Atomic Energy Community
(EAEC). – DTV Atlas
(1999) The European Community is the new stage in the implementation
of increasing the Union of the European people. – Brockhaus:
Europaeische Gemeinschaft
(2003) The European Union or EU is an international organisation of
European states, established by the Treaty on European Union. –
Wikipedia 2003
(2006) The European Union (EU) is a supranational and intergovernmental
union of 25 independent, democratic member states. – Wikipedia
2006
(2010) The European Union is an international organisation comprising 27
European countries and governing common economic, social, and
security policies. – Encyclopaedia Britannica
Research questions
1 What is concept drift, and how can it be formalised?
2 Can we identify the impact of concept drift?
The meaning of a concept
We consider the intension, extension and label as three
components of the meaning of a concept:
Definition
The meaning $C_t$ of a concept $C$ at some moment in time $t$ is a
triple $(label_t(C), int_t(C), ext_t(C))$, where $label_t(C)$ is a String,
$int_t(C)$ a set of properties (the intension of $C$), and $ext_t(C)$ a
subset of the universe (the extension of $C$).
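The triple in this definition can be sketched as a small data structure. This is only an illustration: the class name `ConceptMeaning` and the example property strings are assumptions, and the member counts are taken from the European Union quotes earlier in the slides.

```python
from dataclasses import dataclass
from typing import FrozenSet, Hashable


@dataclass(frozen=True)
class ConceptMeaning:
    """Meaning of a concept C at one moment in time t: (label, intension, extension)."""
    label: str                       # label_t(C), a string
    intension: FrozenSet[Hashable]   # int_t(C), a set of properties
    extension: FrozenSet[Hashable]   # ext_t(C), a subset of the universe


# Two hypothetical variants of one concept at different moments in time.
eu_2006 = ConceptMeaning(
    label="European Union",
    intension=frozenset({"supranational", "intergovernmental"}),
    extension=frozenset(f"member_state_{i}" for i in range(25)),
)
eu_2010 = ConceptMeaning(
    label="European Union",
    intension=frozenset({"international organisation"}),
    extension=frozenset(f"member_state_{i}" for i in range(27)),
)
```

Here the label is unchanged while the intension and extension differ, which is the kind of difference the later definitions measure.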
Identity
Identity allows us to compare two variants of the same concept at
different moments in time even if the meaning (either label,
extension or the non-rigid part of its intension) has changed.
Definition
Two concepts $C_1$ and $C_2$ are considered identical if and only if
their rigid intensions are equivalent, i.e., $int_r(C_1) = int_r(C_2)$.
Concept drift
This definition of drift is based on the idea that a concept retains
its identity over time, i.e., remains the same at least temporarily.
Definition
A concept $C$ has extensionally drifted between time $t_i$ and $t_j$ if and
only if $sim_{ext}(C_{t_i}, C_{t_j}) \neq 1$. Intensional and label drift are defined
similarly. The meaning of a concept has drifted if at least one of these
aspects has drifted.
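A minimal sketch of this drift test follows: an aspect has drifted when its similarity between two variants is not 1. The choice of Jaccard similarity for the set-valued aspects and exact match for labels is an assumption made here for illustration. The framework leaves the similarity functions to each use case, and `drifted_aspects` is a hypothetical helper name.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def exact(a, b):
    """Crude label similarity: 1.0 for equal strings, otherwise 0.0."""
    return 1.0 if a == b else 0.0


def drifted_aspects(variant_i, variant_j, sims):
    """Return the aspects whose similarity between the two variants is not 1."""
    return [aspect for aspect, sim in sims.items()
            if sim(variant_i[aspect], variant_j[aspect]) != 1]


sims = {"label": exact, "intension": jaccard, "extension": jaccard}
before = {"label": "Childcare", "intension": {"parents", "subsidy"},
          "extension": {"s1", "s2"}}
after = {"label": "Childcare", "intension": {"parents", "subsidy"},
         "extension": {"s1", "s2", "s3"}}

drifted = drifted_aspects(before, after, sims)   # only the extension changed
meaning_has_drifted = bool(drifted)
```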
Concept shift
Definition
The meaning of a concept $C$ extensionally shifts between two of its
variants $C_{t_i}$ and $C_{t_j}$ if the extension of $C_{t_j}$ is more similar to the
extension of a non-identical concept than to the extension
of $C_{t_i}$. Intensional and label shift are defined similarly.
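[Figure: the variants of a concept $C^1$ at times $t_1$ and $t_2$ and a variant of a second concept $C^2$ at $t_2$, placed along a time axis from $t_1$ to $t_2$.]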
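The shift definition can be sketched as a comparison of similarities. Assumptions in this sketch: Jaccard similarity stands in for the extensional similarity function, the candidate non-identical concepts are supplied by the caller (the slide does not fix which moment in time they are taken from), and `has_shifted` is a hypothetical name.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def has_shifted(ext_ti, ext_tj, other_extensions, sim=jaccard):
    """True if the later variant is closer to some non-identical concept
    than to the earlier variant of the same concept."""
    own_similarity = sim(ext_tj, ext_ti)
    return any(sim(ext_tj, other) > own_similarity
               for other in other_extensions.values())


# Hypothetical extensions, given as sets of instance identifiers.
c_earlier = {1, 2, 3, 4}
c_later = {1, 5, 6, 7}
others = {"D": {5, 6, 7}}

shifted = has_shifted(c_earlier, c_later, others)
```

In this toy data the later variant shares one instance with its earlier self but three with concept `D`, so it counts as shifted.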
(In)stability
The more the meaning of a concept drifts, the more unstable it
becomes.
We put the variants of one concept at different moments into
a chain, i.e., $chain(C, t_1, t_n) = C_{t_1} \to C_{t_2} \to \dots \to C_{t_n}$.
We take the average similarity of all steps along this chain as
the stability measure.
As a relative measure, it tells whether one concept is more
stable than another over a certain period of time.
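The stability measure described above, the average similarity over consecutive steps of the chain, can be sketched as follows. The function name `stability` and the use of Jaccard similarity in the usage example are assumptions; any similarity function with values in [0, 1] could be passed in.

```python
def stability(chain, sim):
    """Average similarity over consecutive variants C_t1 -> C_t2 -> ... -> C_tn."""
    if len(chain) < 2:
        raise ValueError("a chain needs at least two variants")
    steps = [sim(a, b) for a, b in zip(chain, chain[1:])]
    return sum(steps) / len(steps)


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical extensions of one concept at three moments in time.
chain = [{1, 2}, {1, 2}, {1, 2, 3, 4}]
score = stability(chain, jaccard)   # steps are 1.0 and 0.5
```

Because the measure is relative, a score is mainly useful for ranking concepts against each other over the same period.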
Applying the framework
To apply our framework for concept drift in a specific use-case, the
following steps are required:
1 to define intension, extension and a labelling function
2 to define the identity of concepts
3 to define similarity functions over intension, extension and
labels
Case studies
Political communication (a political vocabulary described in
SKOS)
DBpedia (a general-purpose ontology modelled in RDF(S))
LKIF-Core (a legal ontology modelled in OWL)
Concept drift in political communication
Concept drift in political vocabularies
Communication scientists use certain vocabularies to annotate
newspapers, so that they can do content analysis.
We studied five variants of a SKOS vocabulary of political
concepts used during five recent Dutch national election
campaigns, which took place in 1994, 1998, 2002, 2003 and
2006.
We also collected all newspaper articles which were manually
annotated with the concepts from the particular variant of
that year.
Manual mappings are used as the identities.
Intension, extension and label of political concepts
The label of a concept is obtained using the SKOS Core
labelling property skos:prefLabel.
The extension $ext(C_t)$ of a concept $C_t \in V_t$ at time $t$ is the
set of all sentences annotated by $C_t$, i.e.,
$ext_s(C_t) = \{ s \in \Delta_t \mid s \text{ is annotated by } C_t \}$.
The intension $int(C_t)$ of a concept is determined by its most strongly
associated concepts: for each concept $C$, the intension is the set
of concepts that co-occur with it most often in the sentences they
annotate at one moment in time.
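Both definitions can be sketched over a toy annotation table. Everything concrete here is an assumption for illustration: the sentence identifiers, the annotations, the cut-off `k` for "most associated" (the slides give no threshold), and the function names.

```python
from collections import Counter


def extension(concept, annotations):
    """Sentences annotated by the concept at one moment in time."""
    return {s for s, concepts in annotations.items() if concept in concepts}


def intension(concept, annotations, k=2):
    """The k concepts that co-occur most often with the concept
    in the sentences it annotates (k is an assumed cut-off)."""
    counts = Counter()
    for concepts in annotations.values():
        if concept in concepts:
            counts.update(c for c in concepts if c != concept)
    return {c for c, _ in counts.most_common(k)}


# Hypothetical annotations for one election year: sentence id -> concepts.
annotations_2003 = {
    "s1": {"Democracy", "Referendum"},
    "s2": {"Democracy", "Referendum", "Bureaucracy"},
    "s3": {"Democracy", "Bureaucracy", "Islam"},
    "s4": {"Employers", "Unions"},
}

ext_democracy = extension("Democracy", annotations_2003)
int_democracy = intension("Democracy", annotations_2003, k=2)
```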
Similarity measures
Edit distance between concept labels
Jaccard similarity between concept intensions
Instance-matching based similarity between concept extensions
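The first two measures can be sketched in a few lines. Normalising the edit distance by the longer label to obtain a similarity in [0, 1] is an assumption, since the slides do not say how the distance is turned into a similarity. The instance-matching based measure for extensions is not sketched here.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def label_similarity(a, b):
    """Edit distance rescaled to [0, 1]; the rescaling is an assumed choice."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def intension_similarity(a, b):
    """Jaccard similarity between two intensions given as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


label_sim = label_similarity("Childcare", "Free Childcare")
int_sim = intension_similarity({"unions", "employees"}, {"employees", "social pact"})
```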
Stability of political concepts
Figure: Intension of the concept Democracy in three years (2002, 2003 and 2006), with an average stability of $S_{int} = 0.02$. The three panels link Democracy to the concepts Environmental Activist, Moroccans, Rechtsstaat, High Incomes, Referendum, Bureaucracy, Islam, Voting Computers and Sharia, with link weights between 0.01 and 0.04.
Figure: Intension of the concept Employers in three years (2002, 2003 and 2006), with an average stability of $S_{int} = 0.15$. The three panels link Employers to the concepts unions, employees, Socio-Economic Council, social pact, work migration and discrimination, with link weights between 0.032 and 0.266.
Concept shifts of political concepts
Figure: Examples of (a) label shift and (b) extensional shift. Panel (a) involves the concepts Military and Dutch military deployment over the years 1994, 1998 and 2006; panel (b) involves the concepts Childcare and Free Childcare over the years 2003 and 2006. Red links indicate that two concepts are identical according to our domain experts, while blue links connect the most similar concepts in terms of the corresponding aspect.
Concept drift in DBpedia
We studied 4 versions of DBpedia: 3.2, 3.3, 3.4 and 3.5.
We use URI references as identities of concepts
RDF(S) concepts and their meaning
Definition
Let $O$ be the DBpedia ontology, i.e., a set of triples $(s, p, o)$, and
$O^*$ the semantic closure of $O$.
The rdf-label $lab_r(C)$ of $C$ is defined as the object $o$ of the
triple $(C, \text{rdfs:label}, o)$.
The rdf-extension $ext_r(C)$ of $C$ is defined as the set of
resources $r$ such that $(r, \text{rdf:type}, C) \in O^*$.
The rdf-intension $int_r(C)$ of $C$ is defined as the set of all
triples $(C, p, o) \in O^*$ where $p = \text{rdfs:subClassOf}$, together with all
triples $(s, p, C) \in O^*$ where $p \in \{\text{rdfs:subClassOf}, \text{rdfs:domain}, \text{rdfs:range}\}$.
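These three definitions can be sketched over a closure held as a plain set of string triples. This sketch assumes the closure has already been computed elsewhere, writes the subclass property with its standard RDFS name `rdfs:subClassOf`, and uses made-up `dbo:` resources as example data.

```python
RDF_TYPE = "rdf:type"
RDFS_LABEL = "rdfs:label"
SUBCLASS_OF = "rdfs:subClassOf"
SCHEMA_PROPS = {SUBCLASS_OF, "rdfs:domain", "rdfs:range"}


def rdf_label(c, closure):
    """Object of the (C, rdfs:label, o) triple, or None if C has no label."""
    labels = sorted(o for s, p, o in closure if s == c and p == RDFS_LABEL)
    return labels[0] if labels else None


def rdf_extension(c, closure):
    """Resources r with (r, rdf:type, C) in the closure."""
    return {s for s, p, o in closure if p == RDF_TYPE and o == c}


def rdf_intension(c, closure):
    """Superclass triples leaving C plus subclass/domain/range triples pointing at C."""
    outgoing = {(s, p, o) for s, p, o in closure if s == c and p == SUBCLASS_OF}
    incoming = {(s, p, o) for s, p, o in closure if o == c and p in SCHEMA_PROPS}
    return outgoing | incoming


# Hypothetical closure; these triples are illustrative, not taken from DBpedia.
closure = {
    ("dbo:City", "rdfs:label", "city"),
    ("dbo:City", "rdfs:subClassOf", "dbo:Settlement"),
    ("dbo:Capital", "rdfs:subClassOf", "dbo:City"),
    ("dbo:mayor", "rdfs:domain", "dbo:City"),
    ("dbo:Amsterdam", "rdf:type", "dbo:City"),
    ("dbo:Amsterdam", "rdf:type", "dbo:Settlement"),
}

city_label = rdf_label("dbo:City", closure)
city_ext = rdf_extension("dbo:City", closure)
city_int = rdf_intension("dbo:City", closure)
```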
Stability ranking of DBpedia concepts
Rank   Extensional        Intensional
1      Planet             SportsEvent
2      Road               FormulaOneRacer
3      Infrastructure     WineRegion
4      Cyclist            Cleric
5      LunarCrater        WrestlingEvent
...    ...                ...
163    OfficeHolder       Vein
164    Politician         BasketballPlayer
165    City               EthnicGroup
166    College            Band
167    ChemicalCompound   BritishRoyalty
Table: The top 5 most stable and last 5 least stable DBpedia concepts in
terms of their extension and intension (of the 167 concepts present in all
four versions)
Concept shifts in DBpedia
dbpedia32          dbpedia33                 dbpedia34                 dbpedia35
SportsEvent        SportsEvent (0.98)        SportsEvent (0.98)        SportsEvent (0.97)
Protista           Protista (0.89)           Fungus (0.77)             Fungus (0.89)
City               City (0.99)               City (0.84)               Settlement (0.60)
River              River (0.99)              River (0.78)              Stream (0.62)
ChemicalCompound   ChemicalCompound (0.64)   ChemicalCompound (0.47)   ChemicalCompound (0.71)
(The value in parentheses is the number shown on the link from the concept in the preceding version.)
Concept drift in LKIF-Core
LKIF-Core ontology
The Legal Knowledge Interchange Format (LKIF) Core
Ontology is a core ontology of basic legal concepts, developed
by the ESTRELLA consortium
We study 4 major versions of LKIF-Core: 1.0, 1.0.2, 1.0.3 and
1.1.
Unfortunately, rdfs:label was rarely used; only 4
concepts specify labels, and these stay constant across all
variants.
There are no instances associated with these legal concepts.
The meaning of OWL concepts
Definition
Let $O$ be the OWL ontology and $O^*$ denote its OWLIM-inferred
semantic closure. The owl-label $lab_o(C)$ of $C$ is defined as
the object $o$ of the triple $(C, \text{rdfs:label}, o)$. The owl-intension $int_o(C)$ of $C$
is defined as the set of:
1 all triples $(C, p, o) \in O^*$ and $(s, p, C) \in O^*$,
2 all triples in chains $\{(C, p_1, o_1) \circ (s_2, p_2, o_2) \circ \dots \circ (s_n, p_n, o_n)\}$
where $s_k = o_{k-1}$, plus
3 all triples in chains
$\{(s_1, p_1, o_1) \circ (s_2, p_2, o_2) \circ \dots \circ (s_n, p_n, C)\}$ where $s_{k+1} = o_k$,
the connecting nodes being blank nodes.
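One possible reading of this definition is sketched below: collect the triples that touch C directly, then follow chains outward and inward as long as the connecting node is a blank node. The `_:` prefix for blank nodes, the function names and the example triples are all assumptions, and the closure is taken as given rather than inferred with OWLIM.

```python
def is_blank(node):
    """Blank nodes are assumed to be written with a '_:' prefix."""
    return isinstance(node, str) and node.startswith("_:")


def owl_intension(c, closure):
    """Triples touching C, plus chains reached from C through blank nodes."""
    triples = set(closure)
    direct = {t for t in triples if t[0] == c or t[2] == c}
    result = set(direct)

    # Chains leaving C: continue from every blank object.
    frontier = {o for s, p, o in direct if s == c and is_blank(o)}
    seen = set()
    while frontier:
        node = frontier.pop()
        if node in seen:
            continue
        seen.add(node)
        for t in triples:
            if t[0] == node:
                result.add(t)
                if is_blank(t[2]):
                    frontier.add(t[2])

    # Chains arriving at C: continue from every blank subject.
    frontier = {s for s, p, o in direct if o == c and is_blank(s)}
    seen = set()
    while frontier:
        node = frontier.pop()
        if node in seen:
            continue
        seen.add(node)
        for t in triples:
            if t[2] == node:
                result.add(t)
                if is_blank(t[0]):
                    frontier.add(t[0])
    return result


# Hypothetical closure with two restrictions modelled through blank nodes.
closure = {
    ("ex:Mandate", "rdfs:subClassOf", "_:b1"),
    ("_:b1", "owl:onProperty", "ex:actor"),
    ("_:b1", "owl:someValuesFrom", "_:b2"),
    ("_:b2", "owl:unionOf", "ex:list"),
    ("_:b3", "owl:allValuesFrom", "ex:Mandate"),
    ("ex:Delegation", "rdfs:subClassOf", "_:b3"),
    ("ex:Other", "rdfs:subClassOf", "ex:Thing"),
}

mandate_intension = owl_intension("ex:Mandate", closure)
```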
Stable and unstable concepts
Most stable concepts              Most unstable concepts
norm.owl#Custom                   legal-action.owl#Mandate
expression.owl#Promise            legal-action.owl#Public Law
norm.owl#Potestative Expression   legal-action.owl#Assignment
norm.owl#Hohfeldian Power         legal-action.owl#Act of Law
relative-places.owl#Place         legal-action.owl#Delegation
Table: Top 5 stable and unstable concepts.
Intensional shift in LKIF-Core
lkif1.0:action.owl#Speech Act               lkif1.0.2:expression.owl#Speech Act
lkif1.0:action.owl#Termination              lkif1.0.2:process.owl#Termination
lkif1.0.2:lkif-top.owl#Mental Concept       lkif1.0.3:lkif-top.owl#Mental Entity
lkif1.0.2:lkif-top.owl#Physical Concept     lkif1.0.3:lkif-top.owl#Physical Entity
Table: Examples of confirmed intensional shift in LKIF-Core
Summary
We proposed a general theory to study concept drift based on
concept identity.
We introduced a theoretical foundation for the notions of
drift, shift and stability
We applied the general mechanism in three practical
applications modelled in SKOS, RDFS and OWL respectively.
Future work
Investigate alternative theories for concept drift, such as ones based
on morphing
Develop systematic evaluation methods
Develop applications which leverage the detected concept drift