Presented at Journal Paper Track, The Web Conference, Lyon, France, April 15, 2018
https://doi.org/10.1145/3184558.3186234
Abstract: Linked Open Data (LOD) technology enables a web of data and exchangeable knowledge graphs over the Internet. However, knowledge changes everywhere and all the time, and linking data precisely becomes challenging because terms and concepts can be misinterpreted or misunderstood when their meaning differs across time contexts and community knowledge. To address this issue, we introduce an approach to preserving knowledge graphs, and we select the biodiversity domain as our case study because knowledge in this domain changes frequently and all changes are clearly documented. Our work produces an ontology, transformation rules, and an application to demonstrate that it is feasible to present and preserve knowledge graphs and to provide open and accurate access to linked data. It covers changes in names and their relationships across different times and communities, as can be seen in the cases of taxonomic knowledge.
Presenting and Preserving the Change in Taxonomic Knowledge for Linked Data
1. Presenting and Preserving the Change in Taxonomic Knowledge for Linked Data
Rathachai Chawuthai (rathachai.ch@kmitl.ac.th)
Hideaki Takeda (Professor)
Vilas Wuwongse (Professor)
Utsugi Jinbo (Entomologist, Taxonomist)
Journal Paper Track, The Web Conference, Lyon, France, April 15, 2018
Semantic Web, vol. 7, no. 6, pp. 589-616, 2016, DOI: 10.3233/SW-150192
2. Agenda
Change in Taxonomy
LTK: A Logical Model for Linking Taxonomic Knowledge
Result
4. Knowledge on Biodiversity Domain
Taxonomy:
Description, identification, nomenclature, and classification of organisms.
Taxa (taxon)
Scientific Names
Information on taxa
Taxonomic concept description, interspecies interaction, ecological information, food web, etc.
Databases: GBIF, uBio, TDWG, ZooBank, MycoBank, etc.
Most of them are based on scientific names.
Problem: Taxonomic knowledge is dynamic
Biologists continue to discover more knowledge.
Changes in taxonomic knowledge are common because of new discoveries and new viewpoints among biologists.
Biodiversity Knowledge
8. [Figure: timeline with the years 1758, 1827, 1964, and 1995, showing I. bullockii merged into I. galbula and later split into I. galbula and I. bullockii.]
9. How to represent and preserve changes in taxonomy?
Current knowledge is not the only knowledge of value. Past knowledge should be preserved correctly.
How to publish these changes as Linked Data with
Machine/Human-Readable Entities (taxon concept with name & context)
Light-weight expressions (compatible with the current use of taxon in other DBs)?
Challenge
11. LTK: Linked Taxonomic Knowledge
Linked Taxonomic Knowledge (LTK) is a model for preserving and presenting the change in taxonomic knowledge for linked data.
The model can manage changes in taxonomic knowledge.
The model preserves each change as an event, along with its time and provenance.
The model supports changes in either taxa or associations between taxa.
The model allows tracing the background knowledge of the changes by linking cause and effect between them.
The model can be used to publish a dataset in a format suitable for linked open data.
The linked data model uses simple identifiers for Semantic Web resources so that the linked data is easily recognized by both humans and machines.
The model provides a sequence of changes in taxa.
The model presents temporal data on the basis of a given time point.
12. Definition
Entities for LTK
Nominal Entity, Simple Nominal Entity, and Contextual Nominal Entity
Operations of Change
Change in Conceptions:
Merge, Split, and Replace
Change in Relations:
Change higher taxon, subdivide, combine, synonym link, etc.
Data Models
Event-Centric Model, Transition Model, and Snapshot Model
Symbols in the following Diagrams
(nom) is an instance of a nominal entity,
(sim) is an instance of a simple nominal entity,
(con) is an instance of a contextual nominal entity,
(OPR) is a class of a change entity (operation),
(opr) is an instance of an operation, and
(event) is an instance of an event entity.
13. A taxon can be a species, a genus, a family, etc.
However, a taxon may become a synonym over time, and vice versa.
Entity Issue: Taxon and Name
Merging of 2 genera, Bubo and Nyctea, into Bubo causes Nyctea scandiaca to become a synonym of Bubo scandiacus.
[Figure: timeline from 1999 to now, labeling Nyctea scandiaca as a taxon or as a name.]
14. Introduce terms that satisfy the use case of biologists
Taxon ID for Linked Data
[Figure: Taxon, Concept, and Name, each identified by a URI.]
Nominal Entity (nom): A concept and an Internet resource used for taxonomic knowledge; it can be a taxon concept or a name (e.g., a synonym).
Simple Nominal Entity (sim): A subset of the Nominal Entity that corresponds to a single scientific name.
- genus:Bubo (accepted) is a taxon concept.
- genus:Nyctea (obsoleted) is a name.
Contextual Nominal Entity (con): A version of a nominal entity specified by an accepted period.
genus:Bubo_1999 dct:isVersionOf genus:Bubo.
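The relation between a simple and a contextual nominal entity can be sketched in a few lines of Python. This is only an illustration: the helper names make_contextual and readable_name are hypothetical, and the sketch assumes that a version is appended to the URI as a numeric suffix after an underscore, as in genus:Bubo_1999. Only the URIs genus:Bubo, genus:Bubo_1999, genus:Nyctea and the property dct:isVersionOf come from the slide.

```python
from typing import Tuple

Triple = Tuple[str, str, str]


def make_contextual(simple_uri: str, version: str) -> Tuple[str, Triple]:
    """Build a (con) URI from a (sim) URI and link it with dct:isVersionOf.

    Assumption: the version is appended as "_<version>".
    """
    con_uri = f"{simple_uri}_{version}"
    return con_uri, (con_uri, "dct:isVersionOf", simple_uri)


def readable_name(uri: str) -> str:
    """Recover the scientific name embedded in a (sim) or (con) URI.

    Assumption: a trailing "_<digits>" is a version suffix, not part of the name.
    """
    local = uri.split(":", 1)[1]            # drop the prefix, e.g. "genus"
    head, sep, tail = local.rpartition("_")
    return head if sep and tail.isdigit() else local


con, triple = make_contextual("genus:Bubo", "1999")
print(con)                            # genus:Bubo_1999
print(triple)
print(readable_name(con))             # Bubo
print(readable_name("genus:Nyctea"))  # Nyctea
```

Because the name is part of the identifier, a reader or a program can recognize the taxon from the URI alone, without an extra lookup.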
15. Ontology for Knowledge Change
• Change in taxonomic knowledge is modeled as operations.
• The operations are organized as the ontology.
16. It is an RDF format for presenting the operations of change together with time and references. It also provides links between operations to show the reasons behind a change. It is an n-ary relation, so it is complicated by design, but it is flexible enough for use by other applications.
Event-Centric Model
[Figure: event-centric model. The operation instances (opr) ex:merge1 and ex:reclass1 have rdf:type ltk:TaxonMerger and ltk:ChangeHigherTaxon (OPR), respectively. The event ex:event1 (event) has a cka:interval with tl:beginsAtDateTime “t1” and tl:endsAtDateTime “t2”. The diagram also shows a cka:effect link and the contextual nominal entities (con) ex:A_1, ex:B_1, ex:A_2, and ex:X_1.]
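To make the n-ary shape of the event-centric model more concrete, the following Python sketch writes the example as a list of triples and follows a cka:effect link. It is an assumption-laden illustration rather than the LTK vocabulary itself: ex:before and ex:after are hypothetical placeholder properties (the slide does not show which properties connect an operation to its entities), the blank node _:interval1 is invented, and the direction of cka:effect and cka:assures is assumed.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# ex:before / ex:after are hypothetical placeholders, not LTK properties.
# The direction of cka:effect and cka:assures is an assumption.
event_centric: List[Triple] = [
    ("ex:merge1", "rdf:type", "ltk:TaxonMerger"),
    ("ex:merge1", "ex:before", "ex:A_1"),
    ("ex:merge1", "ex:before", "ex:B_1"),
    ("ex:merge1", "ex:after", "ex:A_2"),
    ("ex:reclass1", "rdf:type", "ltk:ChangeHigherTaxon"),
    ("ex:merge1", "cka:effect", "ex:reclass1"),
    ("ex:event1", "cka:assures", "ex:merge1"),
    ("ex:event1", "cka:interval", "_:interval1"),
    ("_:interval1", "tl:beginsAtDateTime", "t1"),
    ("_:interval1", "tl:endsAtDateTime", "t2"),
]


def objects(graph: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def effects_of(graph: List[Triple], operation: str) -> List[str]:
    """Follow cka:effect links to find the operations a change led to."""
    return objects(graph, operation, "cka:effect")


print(effects_of(event_centric, "ex:merge1"))            # ['ex:reclass1']
print(objects(event_centric, "ex:merge1", "ex:before"))  # ['ex:A_1', 'ex:B_1']
```

Even this small example needs ten triples for one merge and its consequence, which illustrates why the model is described as complicated by design.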
17. It is transformed from the event-centric model by Semantic Web rules in order to generate flat, straightforward, and easily linkable triples that represent the chronological changes of taxon concepts or their names.
Transition Model
[Figure: rules transform the event-centric model into the transition model. Labels in the diagram: ltk:TaxonMerger (OPR), rdfs:subClassOf cka:ConceptEvolution; ex:merge1 (opr), rdf:type; ex:A_1, ex:B_1, and ex:A_2 (con); ltk:mergedInto; ex:event1 (event) with cka:interval, tl:beginsAtDateTime “t1”, and tl:endsAtDateTime “t2”; cka:assures; ltk:majorMergedInto; ltk:majorLink; ex:inv1; ltk:entered and ltk:expired with “t1” and “t2”.]
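The flattening step described for the transition model can be sketched as a plain Python function. This stands in for the Semantic Web rules mentioned on the slide and is not the authors' rule set: the dictionary layout with the keys "before" and "after" is a hypothetical simplification of the event-centric data, and the time properties ltk:entered and ltk:expired shown in the diagram are left out.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical, simplified record of one event-centric merge operation.
merge1: Dict = {
    "id": "ex:merge1",
    "type": "ltk:TaxonMerger",
    "before": ["ex:A_1", "ex:B_1"],
    "after": ["ex:A_2"],
}


def to_transition(op: Dict) -> List[Triple]:
    """Flatten a merge operation into one ltk:mergedInto triple per source."""
    if op["type"] != "ltk:TaxonMerger":
        return []  # other operation types would need their own rules
    return [(b, "ltk:mergedInto", a) for b in op["before"] for a in op["after"]]


transition = to_transition(merge1)
for t in transition:
    print(t)
```

Each resulting triple links a taxon before the merge directly to the taxon after it, so a consumer can walk the history with a single property.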
18. It is a set of simple, regular triples that are transformed from the event-centric model for a given time point using Semantic Web rules, so the triples present a snapshot of knowledge at that time point.
Snapshot Model
[Figure: rules transform the event-centric model into the snapshot model. Labels in the diagram: ltk:ChangeHigherTaxon (OPR), rdfs:subClassOf cka:RelationshipEvolution; ex:reclass1 (opr), rdf:type; cka:relation ltk:higherTaxon; ex:A_2, ex:X_1, and ex:B_1 (con); ex:event1 (event) with cka:interval, tl:beginsAtDateTime “t1”, and tl:endsAtDateTime “t2”; cka:assures; ex:inv1. In the snapshot model, ex:inv1 is the name of a named graph with tl:beginsAtDateTime “t1” and tl:endsAtDateTime “t2”, and the graph contains an ltk:higherTaxon triple between ex:A_2 and ex:X_1.]
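Selecting the triples that hold at a given time point can be sketched as an interval filter in Python. Several details are assumptions made for illustration: the years are invented, ex:W_1 is a hypothetical earlier higher taxon, the direction of ltk:higherTaxon (child to parent) is assumed, and intervals are treated as half-open, with None meaning that the relation is still valid.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]
Timed = Tuple[Triple, int, Optional[int]]  # (triple, begins, ends or None)

# Hypothetical data: years and ex:W_1 are invented for this sketch.
timed_relations: List[Timed] = [
    (("ex:A_1", "ltk:higherTaxon", "ex:W_1"), 1900, 1999),
    (("ex:A_2", "ltk:higherTaxon", "ex:X_1"), 1999, None),
]


def snapshot(relations: List[Timed], at: int) -> List[Triple]:
    """Return the triples whose interval [begins, ends) contains `at`."""
    return [
        triple
        for triple, begins, ends in relations
        if begins <= at and (ends is None or at < ends)
    ]


print(snapshot(timed_relations, 1950))  # the relation valid in 1950
print(snapshot(timed_relations, 2005))  # the relation valid in 2005
```

The output for each time point is an ordinary set of triples, which is what allows the snapshot to be used like any other linked data graph.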
19. Role of LTK (right) in the LOD Cloud (left), which contains example datasets. Ovals with a single letter or ID number are general concepts, ovals with a version are versions of general concepts, dashed lines show the same URIs, :sameAs is owl:sameAs, :isVer is dct:isVersionOf, :re is ltk:replacedInto, and :mg is ltk:mergedInto.
LTK with LOD Cloud
[Figure: the LOD Cloud, with Example Dataset 1 (GBIF) and Example Dataset 2 (LODAC), connected through External Links (for managing linked data with external datasets) to Linked Taxonomic Knowledge, which consists of the Event-Centric Model (for presenting change) and the Transition Model/Snapshot Model (for linked data). The general concepts a, b, and c and the ID numbers 01 to 04 appear alongside the versions a_1, a_2, b_1, and c_3, with :re and :mg links between versions.]
21. Evaluation against Use Cases
Changes of moth species of the family Saturniidae among 3 checklists: Inoue (1982), Jinbo (2008), and Kishida (2011).
The LTK model covers all cases, including: creating a concept, obsoleting a concept, replacing a taxon, merging taxa, splitting a taxon, linking a synonym, changing a higher taxon, subdividing a taxon, and combining taxa.
Outcome
Implementation: http://rc.lod.nii.ac.jp/ltk
22. Comparison & Discussion
Criteria | TaxMeOn (& its enhancement) | LTK
Change in Knowledge
Capturing changes in taxonomy | Yes | Yes (Event-Centric Model)
Presenting context in a graph | No | Yes (Event-Centric Model)
Linking background between changes | No (limited by design, because a single binary relation presents changes) | Yes (Event-Centric Model)
Human-Readable Identifiers
Including a human-readable name in a URI | Rare (only in the schema, not in taxon concepts) | Yes (SIM & CON)
Light-Weight Triples
Accessing a name of a taxon | Use 1 triple (taxon and name are split) | Get directly from the URI (SIM & CON)
Accessing taxa before and after merging or splitting | Use 2 triples | Use 1 triple (Transition Model)
Presenting a relation between two names | Use 3 triples | Use 1 triple (CON & Transition/Snapshot Model)
Accessing temporal information | By full-text linking to a taxon | Yes (Snapshot Model)
23. The LTK framework allows the capability of a system to be extended to other domains with other vocabularies.
Developers can create other operations under either the class of change in conception (cka:ConceptEvolution) or the class of change in triple (cka:RelationshipEvolution), reusing or adapting the Semantic Web rules.
Extensibility
Geographic Area Representations in Statistical Linked Open Data of Japan, D. Yamamoto, et al. Joint Proceedings of the International Workshops on Hybrid Statistical Semantic Understanding and Emerging Semantics, and Semantic Statistics, co-located with 16th International Semantic Web Conference (ISWC 2017)
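The extension mechanism described on this slide, creating a new operation under one of the two change classes, can be sketched in Python as follows. The function declare_operation and the class ex:AreaMerger are hypothetical names chosen for illustration; only cka:ConceptEvolution, cka:RelationshipEvolution, and rdfs:subClassOf appear in the slides.

```python
from typing import Tuple

Triple = Tuple[str, str, str]

# The two parent classes named on the slide.
PARENT_CLASSES = {"cka:ConceptEvolution", "cka:RelationshipEvolution"}


def declare_operation(new_class: str, parent: str) -> Triple:
    """Declare a new operation class under one of the two change classes."""
    if parent not in PARENT_CLASSES:
        raise ValueError(f"{parent} is not a known change class")
    return (new_class, "rdfs:subClassOf", parent)


# ex:AreaMerger is a hypothetical operation for another domain.
area_merger = declare_operation("ex:AreaMerger", "cka:ConceptEvolution")
print(area_merger)
```

A new operation declared this way could then reuse or adapt the existing transformation rules, since it sits under the same parent class as the taxonomic operations.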
Icterus galbula has been known since 1758.
Icterus bullockii has been known since 1827.
Because the name “galbula” is the earlier name, it becomes the accepted name.
So, these names are synonyms.
I. galbula is the senior synonym, whereas I. bullockii is the junior synonym.
Of course, knowledge of these names must be combined.
Moreover, after this day, if researchers discover new knowledge about this bird, they record the new information along with this name.
If we need to find information about “galbula”, we can query by this name.
However, some information from the year 1960 includes knowledge of “bullockii”.
On the other hand, some information about “bullockii” is missing, because some knowledge between 1960 and 1995 is recorded under the name “galbula”.
Therefore, the correct temporal context of concepts and the reasons for their changes become necessary for understanding a taxon concept as well.
First of all, I would like to introduce some background and terms.
If RDF is used to capture context, the information will be rich, but the graph becomes much more complex.
So, we need to think about lightweight expressions.
First of all, I would like to introduce some terms to draw clear borders among the uses of URIs.
The outcome of this project is that ….
After that, I compare our work against TaxMeOn.
LTK: Every name (and change) has URI.
TaxMeOn: Every taxon has URI.