The role of Thesauri
and Standard Vocabularies
in linking data
Dr. Johannes Keizer
FAO of the United Nations
Office of Knowledge Exchange, Research and Extension
Knowledge and Capacity for Development
dr johannes keizer - FAO of the United Nations - knowledge and capacity for development
Thesaurus Workshop – CAS Beijing, 2010-10-22
The Development of the Internet
Open World Mindset
• “Closed” (“normal”) IT environments
  • Data sources carefully controlled
  • Data formats “custom-defined” for an application
• Linked data based on an “open world mindset”
  • Integrating data from the open Web
  • Systems designed to incorporate new information incrementally
  • By design, tolerance of incomplete information
The Linked Data Universe:
http://www.linkeddata.org (July 2009)
The Linked Data Universe:
http://www.linkeddata.org (July 2010)
Example: BBC Wildlife Finder
Humboldt Squid page, pulled together from a diversity of Linked Data sources:
• Animal Diversity Web: nocturnal way of life
• BBC TV documentary
• BBC News item
• Wikipedia
RDF: a grammar for the language of data
Resource to resource: ResourceA relatedTo ResourceB
Resource to text: ResourceA describedBy “Some text”
1. Describe resources using interrelated “statements” (“triples”).
2. Use URIs (unique, globally managed identifiers) as the “words” of statements.
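The two rules on this slide can be sketched in a few lines of Python. This is a minimal illustration rather than an RDF library: statements are plain (subject, predicate, object) tuples, and every `http://example.org/` URI is a made-up placeholder.

```python
# A statement ("triple") is (subject, predicate, object).
# All example.org URIs below are hypothetical placeholders.
EX = "http://example.org/"

triples = {
    # resource -> resource
    (EX + "ResourceA", EX + "relatedTo", EX + "ResourceB"),
    # resource -> literal text
    (EX + "ResourceA", EX + "describedBy", "Some text"),
}

def describe(subject, graph):
    """Return every (predicate, object) pair stated about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)

for predicate, obj in describe(EX + "ResourceA", triples):
    print(predicate, "->", obj)
```

Because subjects and predicates are URIs, the same identifier means the same thing in any dataset that uses it.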
RDF as a common format for merging data
Source: http://www.w3.org/2007/Talks/0221-Bangalore-IH/
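Why a common triple format makes merging easy can be shown with a small sketch: when two sources use the same URI for the same thing, merging is a set union and the statements line up on that URI. The two datasets and all `example.org` URIs are invented for illustration.

```python
# Two independently produced datasets, as sets of (subject, predicate, object).
# The URIs and values are hypothetical.
BOOK = "http://example.org/book/1"

library_data = {
    (BOOK, "http://example.org/title", "Maize farming"),
    (BOOK, "http://example.org/author", "http://example.org/person/7"),
}
shop_data = {
    (BOOK, "http://example.org/price", "12.50"),
    (BOOK, "http://example.org/title", "Maize farming"),  # duplicate statement
}

def merge(*graphs):
    """Merging graphs is a set union; identical triples collapse."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged

merged = merge(library_data, shop_data)
about_book = {p: o for s, p, o in merged if s == BOOK}
print(len(merged))        # 3 distinct statements
print(sorted(about_book))
```

No schema mapping step is needed here because both sources already agree on the identifier for the book.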
Finding things related to “genes” across
databases
Source: Joanne Luciano, Mitre, and the W3C HCLS IG
Role of thesauri/concept schemes
• Born as tools to ensure consistency in the indexing of library collections
• Thesauri were based on “terms”, but terms already represented concepts implicitly
• Hierarchical and associative relationships represented generic ontological domain knowledge
• Candidate building blocks for the Semantic Web
From thesaurus to ontologies
AGROVOC today
• Around 30,000 concepts
• 600,000 labels in around 20 languages
• One-stop shop for terminological knowledge related to agriculture in general
• A knowledge base of related concepts organized in ontological relationships (hierarchical, associative, equivalence)
• A concept/term/string-based system
• Concepts may be organized in multiple categories
Semantic Relationships
• Concept to Concept: isA (hierarchy), isPestOf, hasPest
• Concept to Term: has_lexicalization (links concepts to their lexical realizations)
• Term to Term: isSynonymOf, isTranslationOf, hasAcronym, hasAbbreviation
• Term to String: hasSpellingVariant, hasSingular
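The concept/term/string layering can be sketched with two small record types. The class and field names are assumptions made for this illustration, not the AGROVOC schema; only the relationship names in the comments (has_lexicalization, isSynonymOf, hasSpellingVariant, isA, hasPest) come from the slide. The data and URIs are placeholders.

```python
from dataclasses import dataclass, field

@dataclass
class Term:
    """A lexicalization of a concept in one language."""
    text: str
    lang: str
    spelling_variants: list = field(default_factory=list)  # Term to String
    synonyms: list = field(default_factory=list)            # Term to Term

@dataclass
class Concept:
    """A language-independent unit of meaning."""
    uri: str
    lexicalizations: list = field(default_factory=list)  # Concept to Term
    broader: list = field(default_factory=list)           # Concept to Concept (isA)
    pests: list = field(default_factory=list)             # Concept to Concept (hasPest)

# Hypothetical data; the URIs are placeholders.
maize_en = Term("maize", "en")
corn_en = Term("corn", "en")
maize_en.synonyms.append(corn_en)  # isSynonymOf
cereals = Concept("http://example.org/c_cereals")
maize = Concept(
    "http://example.org/c_maize",
    lexicalizations=[maize_en, corn_en, Term("maïs", "fr")],
    broader=[cereals],
)

def labels(concept, lang):
    """All terms that lexicalize a concept in the given language."""
    return [t.text for t in concept.lexicalizations if t.lang == lang]

print(labels(maize, "en"))
```

Keeping terms as objects of their own, rather than plain strings on the concept, is what lets term-level relationships such as synonymy or translation be stated at all.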
The AGROVOC SKOS-XL Model
[Figure: graph of AGROVOC concepts 8171, 1474, 12332 and 6211 (rdf:type SKOS Concept) linked by skos:broader and attached to the Agrovoc Concept Scheme (rdf:type SKOS Concept Scheme) via skos:inScheme and skos:topConceptOf. Label nodes :foo and :bar (rdf:type SKOS Label) are attached through skosxl:prefLabel and skosxl:altLabel and carry the skosxl:literalForm values “corn” and “maize”.]
http://www.w3.org/2004/02/skos/
SKOS-XL output
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/agrovocScheme">
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#ConceptScheme"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/c_330829">
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
    <skos:inScheme rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
    <skos:topConceptOf rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/xl_en_1278479064610">
    <literalForm xmlns="http://www.w3.org/2008/05/skos-xl#" xml:lang="en">subjects</literalForm>
    <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
  </rdf:Description>
</rdf:RDF>
URI of AGROVOC concept
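To show how such output reduces to triples, here is a minimal sketch using only Python's standard library. It handles just the flat `rdf:Description` style seen on this slide and is not a general RDF/XML parser. The `rdf:RDF` wrapper and the function name `to_triples` are additions for this illustration; the URIs inside the fragment are the ones from the slide.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"

# Part of the slide's fragment, wrapped in an rdf:RDF root so it parses as XML.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/c_330829">
  <rdf:type rdf:resource="{SKOS}Concept"/>
  <skos:inScheme rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/xl_en_1278479064610">
  <literalForm xmlns="{SKOSXL}" xml:lang="en">subjects</literalForm>
  <rdf:type rdf:resource="{SKOSXL}Label"/>
</rdf:Description>
</rdf:RDF>"""

def to_triples(xml_text):
    """Flatten rdf:Description elements into (subject, predicate, object)."""
    triples = []
    for node in ET.fromstring(xml_text).findall(f"{{{RDF}}}Description"):
        subject = node.get(f"{{{RDF}}}about")
        for prop in node:
            # ElementTree names tags as "{namespace}local"; join them into a URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            # The object is either a linked resource or the literal text.
            obj = prop.get(f"{{{RDF}}}resource", prop.text)
            triples.append((subject, predicate, obj))
    return triples

for triple in to_triples(DOC):
    print(triple)
```

For real data a dedicated RDF library would be the safer choice, since RDF/XML allows many equivalent serializations that this sketch ignores.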
Linking vocabularies
AGROVOC EUROVOC UNBIS Relationship
agroforestry: http://aims.fao.org/aos/agrovoc/c_207 skos:exactMatch / owl:sameAs http://eurovoc.europa.eu/219055
MILK: http://aims.fao.org/aos/agrovoc/c_4826 skos:exactMatch / owl:sameAs http://eurovoc.europa.eu/220018
MAIZE: http://aims.fao.org/aos/agrovoc/c_12332 skos:exactMatch / owl:sameAs http://eurovoc.europa.eu/219871
http://agris.fao.org/agris-search/search/display.do?f=2004/ZA/ZA04002.xml;ZA2004000049
http://aims.fao.org/aos/agrovoc/c_7825
http://eurovoc.europa.eu/218754
Linking data through common URIs
[Figure: the concept “Maize” in three vocabularies (AGROVOC, Eurovoc, UNBIS), each with skosxl:literalForm “Maize”, connected by owl:sameAs/exactMatch. Each concept is attached to a document in its own repository.]
AGROVOC: http://aims.fao.org/aos/agrovoc/c_12332
  document: http://agris.fao.org/agris-search/search/display.do?f=1996/TR/TR96001.xml;TR9600026
Eurovoc: http://eurovoc.europa.eu/219871
  document: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2010:202:0011:0015:EN:PDF
UNBIS:
  document: http://unbisnet.un.org:8080/ipac20/ipac.jsp?session=128F308557F34.283092&profile=bib&uri=full=3100001~!685149~!1&ri=1&aspect=subtab124&menu=search&source=~!horizon
http://aims.fao.org/aos/agrovoc/c_12332 owl:sameAs http://eurovoc.europa.eu/219871
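How equivalence links let one query reach documents in several repositories can be sketched as follows. The concept URIs and the two document URLs are taken from the slides; the `INDEX` structure, the function names, and the assumption that each document is indexed under exactly the concept shown are illustrative only. UNBIS is left out because its concept URI is not given.

```python
# Equivalence links taken from the slides (AGROVOC <-> Eurovoc).
SAME_AS = [
    ("http://aims.fao.org/aos/agrovoc/c_12332", "http://eurovoc.europa.eu/219871"),  # maize
    ("http://aims.fao.org/aos/agrovoc/c_4826", "http://eurovoc.europa.eu/220018"),   # milk
]

# Assumed toy index: each repository tags documents with its own vocabulary.
INDEX = {
    "http://aims.fao.org/aos/agrovoc/c_12332": [
        "http://agris.fao.org/agris-search/search/display.do?f=1996/TR/TR96001.xml;TR9600026",
    ],
    "http://eurovoc.europa.eu/219871": [
        "http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2010:202:0011:0015:EN:PDF",
    ],
}

def equivalents(uri, links):
    """All URIs reachable from `uri` by following links in either direction."""
    seen, todo = {uri}, [uri]
    while todo:
        current = todo.pop()
        for a, b in links:
            for src, dst in ((a, b), (b, a)):
                if src == current and dst not in seen:
                    seen.add(dst)
                    todo.append(dst)
    return seen

def documents_about(uri):
    """Documents indexed under the concept or any equivalent concept."""
    return sorted(doc for u in equivalents(uri, SAME_AS) for doc in INDEX.get(u, []))

for doc in documents_about("http://eurovoc.europa.eu/219871"):
    print(doc)
```

Starting from either the AGROVOC or the Eurovoc URI for maize returns the same two documents, which is the point of linking vocabularies rather than merging them.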
What are we doing with unstructured data?
• We have enormous amounts of unstructured material
• Most of the documents that we produce are still semantically unstructured
• Human cataloguing and indexing is becoming ever rarer
• We need machines to do automatic semantic markup of text
• If machines are trained on and based on concept schemes, they are able to do so
AgroTagger
• Identifies concepts in unstructured texts
• Uses AGROVOC as a controlled vocabulary
• Prototype under testing with excellent results (entire repository of ICARDA indexed)
• Will in future produce structured RDF files that can be used to link data, like “OpenCalais”
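Concept identification against a controlled vocabulary can be illustrated with a deliberately naive dictionary matcher. This is not how AgroTagger works (the slides do not describe its algorithm); it only shows the idea of mapping words in free text to concept URIs. The label list is a tiny hand-made sample, and the URIs are the AGROVOC URIs shown elsewhere in this deck.

```python
import re

# Tiny sample vocabulary: label -> concept URI. The labels are illustrative.
VOCAB = {
    "maize": "http://aims.fao.org/aos/agrovoc/c_12332",
    "corn": "http://aims.fao.org/aos/agrovoc/c_12332",
    "milk": "http://aims.fao.org/aos/agrovoc/c_4826",
    "agroforestry": "http://aims.fao.org/aos/agrovoc/c_207",
}

def tag(text, vocab=VOCAB):
    """Count how often each concept is mentioned, via any of its labels."""
    counts = {}
    for word in re.findall(r"[a-z]+", text.lower()):
        uri = vocab.get(word)
        if uri:
            counts[uri] = counts.get(uri, 0) + 1
    # Most frequently mentioned concepts first, ties broken by URI.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

sample = "Corn yields rose, and maize prices fell while milk stayed flat."
for uri, hits in tag(sample):
    print(hits, uri)
```

Because "corn" and "maize" resolve to the same URI, the text is tagged with one concept rather than two strings, which is what makes the result usable for linking.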
Live demo: semantic markup:
http://viewer.opencalais.com/
http://agropedialabs.iitk.ac.in/Tagger/Agrotagger_text.php
The concept scheme workbench
The workbench
• Is a web-based working environment for managing the AGROVOC Concept Server
• Facilitates the collaborative editing of multilingual terminology and semantic concept information
• Includes administration and group management features
• Includes workflows for maintenance, validation and quality assurance of the data pool
• The CS is freely accessible to everybody to facilitate collaborative editing
Group/Action/Status
GROUP
Non registered users
Term editors
Ontology editors
Validators
Publishers
Administrators
ACTION
concept-create
concept-delete
concept-edit
term-create
term-edit
term-delete
..........
STATUS
Proposed by guest
Proposed
Revised by guest
Revised
Validated
Published
Proposed deprecated
Deprecated
Concept Life Cycle
GUEST <concept-create> → Proposed by guest
VALIDATOR <validates> → Validated
PUBLISHER <publishes> → Published
TERM EDITOR <concept-edit> → Revised
ADMINISTRATOR <validates> → Published
ONTOLOGY EDITOR <concept-delete> → Proposed deprecated
PUBLISHER <validates> → Deprecated
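The life cycle can be read as a lookup from (role, action) to the resulting status. The sketch below encodes exactly the seven combinations listed on the slide. The slide does not say which status each transition starts from, so this sketch ignores the current status; a real workflow would check it. The names `TRANSITIONS`, `NotAllowed` and `perform` are hypothetical.

```python
# (role, action) -> resulting status, as listed on the slide.
TRANSITIONS = {
    ("GUEST", "concept-create"): "Proposed by guest",
    ("VALIDATOR", "validates"): "Validated",
    ("PUBLISHER", "publishes"): "Published",
    ("TERM EDITOR", "concept-edit"): "Revised",
    ("ADMINISTRATOR", "validates"): "Published",
    ("ONTOLOGY EDITOR", "concept-delete"): "Proposed deprecated",
    ("PUBLISHER", "validates"): "Deprecated",
}

class NotAllowed(Exception):
    """Raised when a role attempts an action it has no transition for."""

def perform(role, action):
    """Return the status a concept gets when `role` performs `action`."""
    try:
        return TRANSITIONS[(role, action)]
    except KeyError:
        raise NotAllowed(f"{role} may not perform {action}") from None

history = [
    perform("GUEST", "concept-create"),
    perform("VALIDATOR", "validates"),
    perform("PUBLISHER", "publishes"),
]
print(" -> ".join(history))
```

Keeping the rules in one table makes it easy to see that a deletion is never immediate: it only proposes deprecation, which a publisher must then confirm.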
Modules
• Home
• Search
• Concept/Term Management
• Relationship Management
• Classification Scheme Management
• Validation
• Consistency Check
• Import/Export
• User/Group Management
• Statistics/Preferences
Search
• by string: the user can specify whether the system should search by exact match, beginning with, contains or fuzzy
• by URI or term code, or by range of term codes (e.g. between 123 and 9876)
• by classification scheme
• by creation or modification date
• by specific relationship (e.g. search all concepts using “has_pest”)
• by status or language
• by notes/attributes
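The four string-search modes can be sketched with Python's standard library. The slides do not say how the workbench implements fuzzy matching, so the use of `difflib.SequenceMatcher` and the 0.8 threshold are assumptions, as are the function names and the sample labels.

```python
from difflib import SequenceMatcher

def matches(query, label, mode, threshold=0.8):
    """Compare a query with a label using one of the four string modes."""
    q, text = query.lower(), label.lower()
    if mode == "exact":
        return q == text
    if mode == "starts_with":
        return text.startswith(q)
    if mode == "contains":
        return q in text
    if mode == "fuzzy":
        # Similarity ratio in [0, 1]; the 0.8 cut-off is an arbitrary choice.
        return SequenceMatcher(None, q, text).ratio() >= threshold
    raise ValueError(f"unknown mode: {mode}")

def search(query, labels, mode):
    """Return the labels that match the query under the given mode."""
    return [label for label in labels if matches(query, label, mode)]

LABELS = ["maize", "maize flour", "sweet maize", "milk"]  # sample labels
print(search("maize", LABELS, "exact"))
print(search("maize", LABELS, "starts_with"))
print(search("maize", LABELS, "contains"))
print(search("maiz", LABELS, "fuzzy"))
```

Each mode widens the previous one in a different way, so offering the choice to the user trades precision against recall explicitly.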
Graph Visualization
• Java applet based on TouchGraph
• Visualizes concepts and their relationships with other concepts in a graphical view
Web services
[Figure labels: AGROVOC CS Workbench, SKOS Triple Store, Other Applications; arrows labelled maintain, access, response, uses.]
AGROVOC Web Services
Architecture of the System
System Overview
[Figure: layers labelled Front end, Intermediate Layer, Middleware and Back end. Components shown: Google Web Toolkit (GWT), GWT Incubator, Graph Visualization, Web services, Gilead, Hibernate Layer, Protégé OWL API, Administrative Database (MySQL), Protégé Triple Store (MySQL).]
Giving it a try…
A demo version of the AWB, with all functionalities, available to users for testing purposes:
http://202.73.13.50:55234/agrovocdevv10d/
Latest stable release version 1.0 (read/write):
http://202.73.13.50:55381/agrovocv10i/
Latest stable release version 1.0 (read only, for visitors with view privilege only):
http://202.73.13.50:55481/agrovocv10i/
…and more: http://aims.fao.org
Thank You!

Istic thesaurus ws-keizer_2010-10-22

  • 1. The role of Thesauri and Standard Vocabularies in linking data Dr. Johannes Keizer FAO of the United Nations Office of Knowledge Exchange, Research and Extension Knowledge and Capacity for Development
  • 2. dr johannes keizer - FAO of the United Nations - knowledge and capacity for development ThesaurusWorkshop–CASBeijing,2010-10-22 The Development of the Internet
  • 3. dr johannes keizer - FAO of the United Nations - knowledge and capacity for development ThesaurusWorkshop–CASBeijing,2010-10-22  “Closed” (“normal”) IT environments  Data sources carefully controlled.  Data formats “custom-defined” for an application.  Linked data based on an “open world mindset”  Integrating data from the open Web  Systems designed to incorporate new information incrementally  By design, tolerance of incomplete information Open World Mindset
  • 4. dr johannes keizer - FAO of the United Nations - knowledge and capacity for development ThesaurusWorkshop–CASBeijing,2010-10-22 The Linked Data Universe: http://www.linkeddata.org (july 2009) 4
  • 5. dr johannes keizer - FAO of the United Nations - knowledge and capacity for development ThesaurusWorkshop–CASBeijing,2010-10-22 The Linked Data Universe: http://www.linkeddata.org (july 2010)
  • 6. dr johannes keizer - FAO of the United Nations - knowledge and capacity for development ThesaurusWorkshop–CASBeijing,2010-10-22 Example: BBC Wildlife Finder
  • 7. dr johannes keizer - FAO of the United Nations - knowledge and capacity for development ThesaurusWorkshop–CASBeijing,2010-10-22 Humboldt Squid page, pulled together from a diversity of Linked Data sources Animal Diversity Web: Nocturnal way of life BBC TV Documentary BBC News item Wikipedia
  • 8. RDF: a grammar for the language of data. Diagram: ResourceA relatedTo ResourceB; ResourceA describedBy “Some text”. 1. Describe resources using interrelated “statements” (“triples”). 2. Use URIs (unique, globally managed identifiers) as the “words” of statements.
  • 9. RDF as a common format for merging data (source: http://www.w3.org/2007/Talks/0221-Bangalore-IH/)
  • 10. Finding things related to “genes” across databases. Source: Joanne Luciano, Mitre, and the W3C HCLS IG
  • 11. Role of thesauri/concept schemes. Born as tools to assure consistency in the indexing of library collections. Thesauri were based on “terms”, but terms already represented concepts in a non-explicit way. Hierarchical and associative relationships represented generic ontological domain knowledge. Candidate building blocks for the semantic web.
  • 12. ...from Thesaurus to Ontologies...
  • 13. AGROVOC today: around 30,000 concepts; 600,000 labels in around 20 languages; a one-stop shop for terminological knowledge related to agriculture in general; a knowledge base of related concepts organized in ontological relationships (hierarchical, associative, equivalence); a concept/term/string-based system; concepts may be organized in multiple categories.
  • 14. Semantic Relationships. Concept to Concept: isA (hierarchy), isPestOf, hasPest. Concept to Term: has_lexicalization (links concepts to their lexical realizations). Term to Term: isSynonymOf, isTranslationOf, hasAcronym, hasAbbreviation. Term to String: hasSpellingVariant, hasSingular.
  • 15. The AGROVOC SKOS-XL Model (diagram: concept nodes 8171, 1474, 12332 and 6211, of rdf:type SKOS Concept, connected by skos:broader and attached to the Agrovoc Concept Scheme, of rdf:type SKOS Concept Scheme, through skos:inScheme and skos:topConceptOf; label nodes :foo and :bar, of rdf:type SKOS Label, attached through skosxl:prefLabel and skosxl:altLabel, with the skosxl:literalForm values “corn” and “maize”)
  • 16. http://www.w3.org/2004/02/skos/
  • 17. SKOS-XL output (slide callout: URI of AGROVOC concept)
      <rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/agrovocScheme">
        <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#ConceptScheme"/>
      </rdf:Description>
      <rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/c_330829">
        <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
        <skos:inScheme rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
        <skos:topConceptOf rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
      </rdf:Description>
      <rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/xl_en_1278479064610">
        <literalForm xmlns="http://www.w3.org/2008/05/skos-xl#" xml:lang="en">subjects</literalForm>
        <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
      </rdf:Description>
  • 18. Linking vocabularies
      AGROVOC | EUROVOC | UNBIS | Relationship
      http://aims.fao.org/aos/agrovoc/c_207 | http://eurovoc.europa.eu/219055 | agroforestry | skos:exactMatch / owl:sameAs
      http://aims.fao.org/aos/agrovoc/c_4826 | http://eurovoc.europa.eu/220018 | MILK | skos:exactMatch / owl:sameAs
      http://aims.fao.org/aos/agrovoc/c_12332 | http://eurovoc.europa.eu/219871 | MAIZE | skos:exactMatch / owl:sameAs
  • 19. http://agris.fao.org/agris-search/search/display.do?f=2004/ZA/ZA04002.xml;ZA2004000049
  • 20. http://aims.fao.org/aos/agrovoc/c_7825 http://eurovoc.europa.eu/218754
  • 21. Linking data through common URIs (diagram). AGROVOC: concept http://aims.fao.org/aos/agrovoc/c_12332, skosxl:literalForm “Maize”, with the attached resource http://agris.fao.org/agris-search/search/display.do?f=1996/TR/TR96001.xml;TR9600026. Eurovoc: concept http://eurovoc.europa.eu/219871, skosxl:literalForm “Maize”, with the attached resource http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2010:202:0011:0015:EN:PDF. UNBIS: skosxl:literalForm “Maize”, with the attached resource http://unbisnet.un.org:8080/ipac20/ipac.jsp?session=128F308557F34.283092&profile=bib&uri=full=3100001~!685149~!1&ri=1&aspect=subtab124&menu=search&source=~!horizon. The three vocabularies are connected by owl:sameAs/exactMatch.
  • 22. What are we doing with unstructured data? • We have enormous amounts of unstructured material • Most of the documents that we produce are still semantically unstructured • Human cataloguing and indexing work is becoming ever rarer • We need machines to do automatic semantic markup of text • If machines are trained on and based on concept schemes, they are able to do so
  • 24. AgroTagger • Does concept identification in unstructured texts • Uses AGROVOC as a controlled vocabulary • Prototype under testing with excellent results (entire repository of ICARDA indexed) • In future, will produce structured RDF files that can be used to link data, like “Open Calais”
  • 28. Live demo: semantic markup: http://viewer.opencalais.com/ http://agropedialabs.iitk.ac.in/Tagger/Agrotagger_text.php
  • 29. The concept scheme workbench
  • 30. The workbench. Is a web-based working environment for managing the AGROVOC Concept Server. Facilitates the collaborative editing of multilingual terminology and semantic concept information. Includes administration and group management features. Includes workflows for maintenance, validation and quality assurance of the data pool. The CS is freely accessible to everybody, to facilitate collaborative editing.
  • 31. Group/Action/Status. GROUP: Non-registered users, Term editors, Ontology editors, Validators, Publishers, Administrators. ACTION: concept-create, concept-delete, concept-edit, term-create, term-edit, term-delete, ... STATUS: Proposed by guest, Proposed, Revised by guest, Revised, Validated, Published, Proposed deprecated, Deprecated.
  • 32. Concept Life Cycle (role, action, resulting status). GUEST <concept-create>: Proposed by guest; VALIDATOR <validates>: Validated; PUBLISHER <publishes>: Published; TERM EDITOR <concept-edit>: Revised; ADMINISTRATOR <validates>: Published; ONTOLOGY EDITOR <concept-delete>: Proposed deprecated; PUBLISHER <validates>: Deprecated.
  • 33. Modules • Home • Search • Concept/Term Management • Relationship Management • Classification Scheme Management • Validation • Consistency Check • Import/Export • User/Group Management • Statistics/Preferences
  • 34. Search • by string: the user can specify whether the system should search by exact match, beginning with, contains or fuzzy • by URI or term code, or by range of term codes (e.g. between 123 and 9876) • by classification scheme • by creation or modification date • by specific relationship (e.g. search all concepts using “has_pest”) • by status, language • by notes/attributes
  • 35. Graph Visualization. TouchGraph based on Java applets. Visualizes a concept and its relationships with other concepts in a graphical view.
  • 36. Web services (diagram with the nodes AGROVOC CS Workbench, SKOS Triple Store and Other Applications, and the edge labels maintain, access, response, uses)
  • 37. AGROVOC Web Services
  • 38. Architecture of the System
  • 39. System Overview. Front end: Google Web Toolkit (GWT), GWT Incubator, Graph Visualization, Web services. Middleware: Hibernate Layer, Protégé OWL API, Gilead Intermediate Layer. Back end: Administrative Database (MySQL), Protégé Triple Store (MySQL).
  • 40. Giving it a try... A demo version of the AWB, with all functionalities, available to users for testing purposes: http://202.73.13.50:55234/agrovocdevv10d/ Latest stable release, version 1.0 (read/write): http://202.73.13.50:55381/agrovocv10i/ Latest stable release, version 1.0 (read only; visitors have view privileges only): http://202.73.13.50:55481/agrovocv10i/
  • 41. ...and more: http://aims.fao.org
  • 42. Thank You!
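
Slides 8, 18 and 21 describe data as subject-predicate-object statements whose "words" are URIs, and show vocabularies being linked through owl:sameAs / skos:exactMatch. The idea can be sketched in a few lines of Python, using plain tuples rather than an RDF library. The two concept URIs and the two document URLs come from slide 21; the predicate `HAS_SUBJECT` (an example.org URI) and the function names are hypothetical and exist only for this illustration.

```python
# Triples are plain (subject, predicate, object) tuples; no RDF library is used.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
# Hypothetical predicate: "this document is indexed with this concept".
HAS_SUBJECT = "http://example.org/hasSubject"

AGROVOC_MAIZE = "http://aims.fao.org/aos/agrovoc/c_12332"
EUROVOC_MAIZE = "http://eurovoc.europa.eu/219871"
AGRIS_RECORD = ("http://agris.fao.org/agris-search/search/display.do"
                "?f=1996/TR/TR96001.xml;TR9600026")
EURLEX_DOC = ("http://eur-lex.europa.eu/LexUriServ/LexUriServ.do"
              "?uri=OJ:L:2010:202:0011:0015:EN:PDF")

triples = {
    (AGROVOC_MAIZE, OWL_SAME_AS, EUROVOC_MAIZE),
    (AGRIS_RECORD, HAS_SUBJECT, AGROVOC_MAIZE),
    (EURLEX_DOC, HAS_SUBJECT, EUROVOC_MAIZE),
}


def equivalent_concepts(concept, graph):
    """Return the concept plus everything reachable through owl:sameAs,
    following the links in both directions."""
    seen = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in graph:
            if predicate != OWL_SAME_AS:
                continue
            if subject == current:
                other = obj
            elif obj == current:
                other = subject
            else:
                continue
            if other not in seen:
                seen.add(other)
                frontier.append(other)
    return seen


def resources_about(concept, graph):
    """Return all documents indexed with the concept or an equivalent one."""
    same = equivalent_concepts(concept, graph)
    return sorted(s for s, p, o in graph if p == HAS_SUBJECT and o in same)


if __name__ == "__main__":
    for url in resources_about(AGROVOC_MAIZE, triples):
        print(url)
```

Asking for resources about the AGROVOC concept also returns the document indexed with the Eurovoc concept, because the only thing joining the two datasets is the shared URI link.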

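The SKOS-XL output on slide 17 is an RDF/XML fragment. As a rough sketch, it can be read with Python's standard `xml.etree.ElementTree` module. The slide shows neither a root element nor namespace declarations, so the wrapping `rdf:RDF` element and its `xmlns` declarations are an assumption added here to make the fragment parseable; the function name `describe` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
SKOSXL_NS = "http://www.w3.org/2008/05/skos-xl#"

# The three descriptions from slide 17, wrapped in an assumed rdf:RDF root.
FRAGMENT = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/agrovocScheme">
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#ConceptScheme"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/c_330829">
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
    <skos:inScheme rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
    <skos:topConceptOf rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/xl_en_1278479064610">
    <literalForm xmlns="http://www.w3.org/2008/05/skos-xl#" xml:lang="en">subjects</literalForm>
    <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
  </rdf:Description>
</rdf:RDF>
"""


def describe(xml_text):
    """Map each described URI to its rdf:type values and, if present,
    its skosxl:literalForm text."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        about = desc.get(f"{{{RDF_NS}}}about")
        types = [t.get(f"{{{RDF_NS}}}resource")
                 for t in desc.findall(f"{{{RDF_NS}}}type")]
        literal = desc.find(f"{{{SKOSXL_NS}}}literalForm")
        result[about] = {
            "types": types,
            "literal": literal.text if literal is not None else None,
        }
    return result


if __name__ == "__main__":
    for uri, info in describe(FRAGMENT).items():
        print(uri, info)
```

The output makes the three kinds of node visible: the concept scheme, a concept that belongs to it, and a label resource carrying the literal form.
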
Editor's Notes

  1. This graph, elaborated by Nova Spivack of Radar Networks, is popular at the moment. The Y-axis is for the increase of information connections. The X-axis is for the increase of social connections. Whereas the Web Operating System in 2030 is still a brilliant guess about the future, the development of the Semantic Web, or Web 3.0, has now gained considerable momentum.
  2. One of the key developments in the semantic web is “Linked Open Data”. The Linked Open Data paradigm claims that existing structured data need to be released from the proprietary silos in which they sit at the moment. With the existence of RDF (Resource Description Framework), the semantic tools to do so are there. There is also technology to use RDF. More on this later.
  3. This is a snapshot one year later. The growth is enormous. A central point is DBpedia, “triplified” information from Wikipedia. The different colours represent the different information types, with “life sciences” and “publications” being the most populated areas, but with the area “government” growing strongly. Interesting newcomers in the last months are the two VIVO datasets from the United States describing expertise in science. VIVO is actually a project that started at the agricultural library of Cornell University.
  4. What does this mean in practice? I will show this with an example from the BBC. The biggest consumers (and producers) of LOD are, as far as I know, the BBC and the New York Times (but now also the US government).
  5. During the Web 1.0 phase, web pages were composed by humans. Today most web pages are driven by databases that can be dynamically queried. Through RSS feeds they also contain data from other websites. This BBC web page is a big jump further. It has not been composed by humans and it is not generated from one database. It is generated from different data sources that were present as linked open data, linked only through common URIs.
  6. The “technology” that makes linked open data possible is RDF. Everything in RDF is made of “triples”. A triple means a statement with “Subject-Predicate-Object”, as shown in this example. Ideally, all elements of a triple are represented by a URI, an unambiguous, machine-readable definition of a concept, but triples can also be built from simple letter strings.
  7. What is now the role of thesauri, and specifically the role of our thesauri, in this setup?
  8. In our team we had the idea very early that thesauri would become important in the development of Web information management. Within the AOS (Agricultural Ontology Service) initiative we have gone down a long and winding road. The Google search shows our 2003 paper in JODI. But now AGROVOC has become a showcase for the use of thesauri to build concept schemes.
  9. Some self-appreciation.
  10. This is the AGROVOC SKOS model that was developed and decided on in April 2010 with the active collaboration of Tom Baker, who was a member of the W3C SKOS working group.
  11. SKOS-XL was published as a W3C standard one year ago. The initial versions of SKOS were not sufficient to express the complexities of multilingual thesauri. Margherita Sini from FAO was a member of the SKOS working group, and we are very satisfied that in the end a standard emerged that caters for our needs.
  12. Here you can see the AGROVOC encoding in SKOS.
  13. The table shows 3 descriptors that are in AGROVOC, EUROVOC and UNBIS. In AGROVOC and EUROVOC they are already encoded as URIs. We could easily establish relationships like owl:sameAs between the concepts or skos:exactMatch between labels.
  14. In a bibliographical record there is much more hidden information than is displayed with the metadata. Many of the highly structured data link to other information on the web. In AGRIS we have now introduced something that we call “naive linking”. An AGRIS record links automatically to Google Maps for the location of the center and to Google to retrieve the full text of the resource, citation lists or other publications from the authors. This often works, but clearly not always, as it is controlled not by semantics but only through identity of strings. For an uneducated machine, unfortunately, COW and C.O.W. are the same, whereas peanuts and groundnuts are something different.
  15. If resources are marked up with semantically defined and machine-readable concepts, they can be linked and mashed up precisely, as we have seen in the example from the BBC. In this example we start with an AGRIS record on hazardous waste, which is indexed with AGROVOC. Already now we can easily link to material indexed with Eurovoc, here an example from EUR-Lex. If the UNBIS thesaurus were restructured into a concept scheme and published as LOD, related UN documents could be attached automatically by the machine.
  16. How does this work? A resource is connected with each concept URI in the web. The concepts in the three vocabularies have the same literal and are connected by an owl:sameAs/exactMatch relationship. As we are speaking about thesauri and not ontologies, we kept the choice of relation purposely vague. The concepts could be matched with owl:sameAs, or the terms could be matched with skos:exactMatch. A lot of discussion on this is ongoing.
  17. One of the groundbreaking enterprises in this area is Thomson Reuters' “Open Calais”. This is a web service that provides semantic markup for any unstructured text that you feed into their service. The service is free of charge. Why? I will show you later.
  18. My team, in collaboration with the Indian Institute of Technology in Kanpur, is developing a similar service for our subject area.
  19. We have here a text from 1964 about a plant protection issue, without a bibliographic record at hand.
  20. Open Calais is very good in those areas in which they have their own elaborated concept scheme against which the texts are analyzed (“Places”, “Persons”, “Business Processes”, “Industry Terms”), but it is weak in specific topic analysis, what they call “social tags”.
  21. AgroTagger still lacks many of the sophisticated features of “Open Calais”, but is much, much better in the subject analysis of the text.
  22. We will now try a live demo.
  23. During the discussions on the AGROVOC model, we also did some software engineering. The result is the concept scheme workbench, a web-based working environment for managing the AGROVOC Concept Server. Already now not only AGROVOC is on the workbench, but also the FAO Open Archive authority data. We can host any concept scheme.
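
Note 14 contrasts "naive linking" by string identity with linking through semantically defined concepts. The difference can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the normalisation rule, the function names, and the example.org concept URIs, which are placeholders and not real AGROVOC identifiers.

```python
def normalize(label):
    """Naive key: lower-case the label and keep only letters and digits."""
    return "".join(ch for ch in label.lower() if ch.isalnum())


def naive_match(a, b):
    """String-identity linking: two labels match if their naive keys agree."""
    return normalize(a) == normalize(b)


# Hypothetical label-to-concept index standing in for a thesaurus lookup.
LABEL_TO_CONCEPT = {
    "peanuts": "http://example.org/concept/groundnut",
    "groundnuts": "http://example.org/concept/groundnut",
    "cow": "http://example.org/concept/cattle",
    "c.o.w.": "http://example.org/concept/some-acronym",
}


def concept_match(a, b):
    """Concept-based linking: two labels match if they resolve to the same
    concept URI in the index."""
    concept_a = LABEL_TO_CONCEPT.get(a.lower())
    concept_b = LABEL_TO_CONCEPT.get(b.lower())
    return concept_a is not None and concept_a == concept_b


if __name__ == "__main__":
    for pair in [("COW", "C.O.W."), ("peanuts", "groundnuts")]:
        print(pair, "naive:", naive_match(*pair), "concept:", concept_match(*pair))
```

The naive comparison merges COW with C.O.W. and separates peanuts from groundnuts, while the concept lookup does the opposite, which is the behaviour the note asks for.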