1. Vocabularies and Linked Open Data. Dr. Johannes Keizer, Office of Knowledge Exchange, Research and Extension, Food and Agriculture Organization of the UN. Talk at the Library of Congress, 2011-05-18
2. "We will promote research for food and agriculture, including research to adapt to, and mitigate climate change, and access to research results and technologies at national, regional and international levels. We will reinvigorate national research systems and will share information and best practices. We will improve access to knowledge." World Food Summit 2009
14. (Quite easy to do: bibliographic data map well to RDF.) Then everyone who knows how to write SPARQL queries could get all these publications in one shot for a new website on toxic wastes.
15. Vocabularies and LOD. Simply publishing your data as RDF does not link them to other data sets. Creating these links by hand is interesting in detail, but unrealistic for mass processing. Linking 2 standard vocabularies can link 200 datasets that use these standard vocabularies.
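The point of slide 15 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original talk: the vocabulary prefixes, concept identifiers, dataset names and record IDs are all hypothetical, and a real setup would hold these as RDF triples rather than Python dictionaries.

```python
from collections import defaultdict

# Hypothetical concept-level mapping between two vocabularies
# (illustrative identifiers only, not real URIs).
MAPPING = {"vocA:waste": "vocB:refuse"}

# Hypothetical datasets: each record is indexed with one vocabulary concept.
DATASETS = {
    "dataset_1": [("rec-1", "vocA:waste")],
    "dataset_2": [("rec-7", "vocB:refuse")],
    "dataset_3": [("rec-9", "vocB:refuse"), ("rec-10", "vocB:soil")],
}


def linked_records(mapping, datasets):
    """Return pairs of records that become linked through the vocabulary mapping."""
    # Index every record by the concept it was tagged with.
    by_concept = defaultdict(list)
    for name, records in datasets.items():
        for rec_id, concept in records:
            by_concept[concept].append((name, rec_id))

    # One mapped concept pair connects every record on either side of it.
    links = []
    for source, target in mapping.items():
        for left in by_concept[source]:
            for right in by_concept[target]:
                links.append((left, right))
    return links


links = linked_records(MAPPING, DATASETS)
for left, right in links:
    print(left, "<->", right)
```

A single mapping entry links the record in `dataset_1` to records in both other datasets; no record-level link was created by hand.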
16. …just out of the pipeline. -----Original Message----- From: Antoine Isaac [mailto:aisaac@few.vu.nl] Sent: Thursday, May 12, 2011 7:19 PM To: UDC Summary Cc: Anibaldi, Stefano (OEKC); Dan Brickley Subject: Re: AGRIS Journals and UDC URIs/ checking. Aida, Stefano, ….. Of course the first hints re. URIs is to keep it short. www.udcc.org/udcclass_631.1/50900 seems a bit long. Then it might be interesting to use "class" somewhere, if you're going to release entities with a different type one day. On the most difficult issue, class numbers vs. DB identifiers. Probably you will have to create both, if you want to intercept these cases where concepts have changed class number. …………
18. AGROVOC. A multilingual agricultural vocabulary organized as a concept scheme in 20 languages. Covers agriculture, forestry, fisheries and related themes (food security, land use, environment, etc.). Organized in sub-vocabularies, e.g. chemicals, fisheries terms, scientific/common names of organisms. Maintained by a global community (e.g. librarians, terminologists, information managers) using VocBench.
20. AGROVOC - Restructuring. Goal: transform AGROVOC from a traditional thesaurus into a concept scheme with a distinction between the conceptual level and the terminological level. Overall revision done by FAO in collaboration with the KSI (Knowledge Sharing and Innovation) team at ICRISAT, Hyderabad, India. Top concepts reduced from 918 to 25. Around 85,000 term relations revised. Non-hierarchical relationships refined by semantic relations. Ca. 4,000 non-preferred terms changed to preferred terms.
30. AGROVOC links after 3 weeks as LOD. Outlinks: GEMET-AGROVOC 1,198; RAMEAU-AGROVOC 700; total outlinks 1,898. Inlinks: AGROVOC-EUROVOC 1,297; AGROVOC-GEMET 1,198; AGROVOC-LCSH 1,093; AGROVOC-NAL 13,390; AGROVOC-STW 1,136; AGROVOC-RAMEAU 700; total inlinks 18,814.
31. Europe (it is better to use this example during the presentation): http://aims.fao.org/aos/agrovoc/c_2724 From the top concept, ref: http://aims.fao.org/aos/agrovoc/c_7644 VocBench (Production), ref: http://agrovoc.mimos.my/vocbenchv1.1i/ VocBench (Sandbox), ref: http://agrovoc.mimos.my/vocbenchv1.1i/
52. AgroTagger. In future it will produce structured RDF files that can be used to link data, like “Open Calais”.
56. agINFRA - the elements. RING: route map to information nodes and gateways. VocBench: concepts and entities reference triples. Cloud: storage for RDF data triples. Tools: LOD-enabled software. LOD Generator: triplifier, concept and entity identifier. Data Services: web services + APIs to triple stores.
If resources are marked up with semantically defined and machine-readable concepts, they can be linked and mashed up precisely as we have seen in the example from the BBC. In this example we start with an AGRIS record on hazardous waste, which is indexed with AGROVOC. Already now we can easily link to material indexed with Eurovoc, here an example from EUR-Lex. If the UNBIS thesaurus were restructured into a concept scheme and published as LOD, related UN documents could be attached automatically by the machine.
How does this work? Each resource is connected to a concept URI on the web. The matching concepts in the three vocabularies share the same literal label and are connected by an owl:sameAs or skos:exactMatch relationship. As we are speaking about thesauri and not ontologies, we purposely left the choice of relation vague. The concepts could be matched with owl:sameAs, or the terms could be matched with skos:exactMatch. A lot of discussion on this is ongoing.
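The mechanism described above can be sketched as a small graph traversal. This is only an illustration under stated assumptions: all concept and resource identifiers below are hypothetical placeholders rather than real URIs, the links are treated as symmetric and transitive (as SKOS defines skos:exactMatch), and a production system would query a triple store instead of Python dictionaries.

```python
from collections import deque

# Hypothetical skos:exactMatch links between three vocabularies
# (placeholder identifiers, not real concept URIs).
EXACT_MATCH = [
    ("agrovoc:hazardous_wastes", "eurovoc:dangerous_waste"),
    ("eurovoc:dangerous_waste", "unbis:hazardous_wastes"),
]

# Hypothetical resources, each indexed with a single concept.
INDEXED = {
    "agris:record-1": "agrovoc:hazardous_wastes",
    "eurlex:doc-1": "eurovoc:dangerous_waste",
    "un:doc-1": "unbis:hazardous_wastes",
    "agris:record-2": "agrovoc:maize",
}


def equivalent_concepts(start, matches):
    """Collect every concept reachable from `start` via exactMatch links."""
    neighbours = {}
    for a, b in matches:
        # Store each link in both directions because the relation is symmetric.
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def related_resources(resource, indexed, matches):
    """Return other resources indexed with a concept equivalent to this one's."""
    concepts = equivalent_concepts(indexed[resource], matches)
    return sorted(r for r, c in indexed.items() if c in concepts and r != resource)


print(related_resources("agris:record-1", INDEXED, EXACT_MATCH))
```

Starting from the hypothetical AGRIS record, the traversal reaches the documents indexed with the matched concepts in the other two vocabularies and ignores the unrelated record.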
Note: we identified outlinks to RAMEAU and GEMET, and they have taken them as inlinks to their own thesauri.
- All links are checked by a domain expert.
Once a content provider (icon: person thinking) has decided to publish a bibliographic database as Linked Open Data….
(arrow in red) 1. What kinds of entities and relationships are involved in bibliographic resource description? Defining a conceptual model gives an overall picture of the entities and relationships involved in bibliographic description and establishes a common understanding of the data models involved. LODE-BD proposes a simple conceptual model based on three entities: resource, agent and thema.
(arrow in blue) 2. What properties should be considered for publishing meaningful and useful LOD-ready bibliographic data? In the Linked Data context any content provider can expose anything contained in its local database. In the case of bibliographic data, however, standardized types of information should be considered in order to maximize the impact of exposing, sharing, and connecting data. LODE-BD has identified nine groups of common properties for describing bibliographic resources: about two dozen properties for describing a bibliographic resource, plus two additional sets of properties for describing relations between bibliographic resources or between agents. They form the backbone of LODE-BD and the basis of the decision trees (the next slide).
(arrow in orange) 3. What metadata standards should be used for preparing LOD-ready metadata? LODE-BD has selected a number of well-accepted and widely used metadata vocabularies, such as dc, dcterms, bibo, agmes…, and uses their metadata terms in the recommendations. New metadata standards can be added to the list in the future, depending on the needs of the Linked Open Data community.
(arrow in green) 4. What metadata terms are appropriate for any given property when publishing LOD-ready metadata based on a local database? Metadata terms from the DCMES (dc:) and DCMI Metadata Terms (dcterms:) namespaces are the foundation of the LODE-BD Recommendations, while metadata terms from other namespaces are added when additional needs have to be satisfied. LODE-BD has prepared a crosswalk table that includes all metadata terms used in the Recommendations.
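To make the idea of a crosswalk concrete, here is a minimal Python sketch. It is not the LODE-BD crosswalk table: the local field names, the record and its identifier are hypothetical, and the mapping below only pairs a few invented local fields with existing DCMI terms for illustration.

```python
# Hypothetical crosswalk from local database fields to DCMI metadata terms.
# The field names on the left are invented; the terms on the right exist in
# the dcterms: namespace, but this selection is illustrative only.
CROSSWALK = {
    "title": "dcterms:title",
    "author": "dcterms:creator",
    "year": "dcterms:issued",
    "keyword": "dcterms:subject",
}


def to_statements(record_id, record, crosswalk):
    """Turn a local record into (subject, property, value) statements."""
    statements = []
    for field, value in record.items():
        term = crosswalk.get(field)
        if term is None:
            # Fields without a crosswalk entry are skipped, not guessed.
            continue
        statements.append((record_id, term, value))
    return statements


# Hypothetical local record with one field that has no mapping.
record = {
    "title": "Hazardous waste in agriculture",
    "year": "1964",
    "shelfmark": "A-12",
}

for statement in to_statements("local:rec-1", record, CROSSWALK):
    print(statement)
```

Skipping unmapped fields keeps the output limited to standardized properties, which is the behaviour the recommendations argue for; a provider could equally log such fields for later review.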
This part of the LODE-BD report aims to assist in the metadata term selection process to be carried out by any bibliographic data provider. LODE-BD uses flowcharts to present individualized decision trees for the properties included in each of the nine groups (refer to the previous chapter). Starting from the property that describes a resource instance, each flowchart presents decision points and gives a step-by-step solution to a given metadata encoding problem. The flowcharts are designed to help data providers select the strategy that fits their situation, while all work towards the goal of data exchange and reuse. At the end of each flowchart there are alternative sets of metadata terms to select from. Each chart is followed by a text-based explanation of the flowchart, with notes, steps, and examples in tables wherever necessary.
One of the groundbreaking enterprises in this area is Thomson Reuters “Open Calais”. This is a web service that provides semantic markup for any unstructured text that you feed into their service. The service is free of charge. Why? I will show you later.
My team, in collaboration with the Indian Institute of Technology in Kanpur, is developing a similar service for our subject area.
We have here a text from 1964, without a bibliographic record at hand, about a plant protection issue.
Open Calais is very good in those areas in which they have their own elaborated concept scheme against which the texts are analyzed: “Places”, “Persons”, “Business Processes”, “Industry Terms”. But it is weak in the specific topic analysis, what they call “social tags”.
AgroTagger still lacks many of the sophisticated features of “Open Calais”, but is much, much better in the subject analysis of the text.
The main integration works through common semantics. The core of the agINFRA technology is a LOD store of shared, encoded knowledge organization systems and an automatic markup that links structured and unstructured data sources through these shared knowledge organization systems. Sharing within the R.I.N.G.: partners register their services, with no technical limitation. LOD wrapper for all participating institutions: for all registered services a “triplification wrapper” will be set up. The triplifier works with “agConcepts and agIdentities” to create linked data. Steadily growing LOD ecosystem: the agINFRA LOD ecosystem offers web services for the www.