SlideShare une entreprise Scribd logo
1  sur  12
ISOcatA short introduction Marc Kemps-Snijdersa, Sue Ellen Wrightb, MenzoWindhouwera aMax Planck Institute for Psycholinguistics, bKent State University marc.kemps-snijders@mpi.nl, sellenwright@gmail.com, menzo.windhouwer@mpi.nl February 19, 2010 1 CLARIN-NL Bijeenkomst
ISOcat: a data category registry ISO 12620:2009 Terminology and other content and language resources — Specification of data categories and management of a Data Category Registry for language resources February 19, 2010 CLARIN-NL Bijeenkomst 2
Data category The result of the specification of a given data field A data category is an elementary descriptor in a linguistic structure or an annotation scheme. Model consists of 3 main parts: Administrative part Administration and identification Descriptive part Documentation in various working languages Linguistic part Conceptual domain(s for various object languages) February 19, 2010 3 CLARIN-NL Bijeenkomst
Data Category Registry DCR Board ,[object Object]
Data categories can be submitted to the standardization process, in which case they are assigned to a Thematic Domain Group which judges it.
At regular intervals, snapshots of the standardized subset of the DCR will be submitted to ISO. TDG metadata TDG ….. TDG morphosyntax TDG terminology February 19, 2010 4 CLARIN-NL Bijeenkomst
Standardization February 19, 2010 CLARIN-NL Bijeenkomst 5 Decision Group Submission group Data Category Registry Board Thematic Domain Group Stewardship group Validation Evaluation rejected rejected Publication
Data categories and linguistic resources February 19, 2010 CLARIN-NL Bijeenkomst 6 partOfSpeech Lemma writtenForm writtenForm Word Form grammaticalGender lexicalType grammaticalGender wordOrder Lexicon 1..* A (schema for a) typological database Lexical Entry Shared semantics! 0..* 1..* Form Sense 0..* A (schema for a) lexicon
<dcif:dataCategorypid="http://www.isocat.org/datcat/DC-1345" type="complex"> <dcif:administrationInformation> 		<dcif:administrationRecord>   			<dcif:identifier>partOfSpeech</dcif:identifier>   			<dcif:version>0.0.0</dcif:version>   			<dcif:registrationStatus>candidate</dcif:registrationStatus>     			<dcif:origin>?</dcif:origin> 			<dcif:creation>   				<dcif:creationDate>2004-07-09</dcif:creationDate>   				<dcif:changeDescriptionxml:lang="en"> 				… </dcif:changeDescription>     			</dcif:creation>   		</dcif:administrationRecord>   	</dcif:administrationInformation> 	<dcif:descriptionSection>   		<dcif:profile>MorphoSyntax</dcif:profile> 		<dcif:languageSection> 			<dcif:language>en</dcif:language> 			<dcif:definitionSection> 	<dcif:definitionxml:lang="en"> 		Term used to describe how a particular word is used in a sentence. 		… 	…  … </dcif:dataCategory> Data category persistent identifier (PID): http://isocat.org/datcat/ISO-DC-1345 HTML content type Default content type HTTP 307 redirect http://www.isocat.org/rest/dc/1345.html http://www.isocat.org/rest/dc/1345.dcif Referencing data categories February 19, 2010 CLARIN-NL Bijeenkomst 7
Annotating linguistic resources February 19, 2010 CLARIN-NL Bijeenkomst 8 ,[object Object]
for example ODD from TEI<elementSpecident="pos"> 	 <equiv name="partOfSpeech" uri="http://isocat.org/datcat/ISO-DC-369"/>  …  </elementSpec> ,[object Object]
for schemas or instances

Contenu connexe

Tendances

Xml dtd- Document Type Definition- Web Technology
Xml dtd- Document Type Definition- Web TechnologyXml dtd- Document Type Definition- Web Technology
Xml dtd- Document Type Definition- Web TechnologyRajan Shah
 
Analyzing Files
Analyzing FilesAnalyzing Files
Analyzing Filesguest7c8b8
 
Introduction to XML and Databases
Introduction to XML and DatabasesIntroduction to XML and Databases
Introduction to XML and Databasestorp42
 
Xml Java
Xml JavaXml Java
Xml Javacbee48
 
Introduction to XML
Introduction to XMLIntroduction to XML
Introduction to XMLyht4ever
 
On the way to a Relation Registry for ISOcat data categories
On the way to a Relation Registry for ISOcat data categoriesOn the way to a Relation Registry for ISOcat data categories
On the way to a Relation Registry for ISOcat data categoriesMenzo Windhouwer
 
Document type definition
Document type definitionDocument type definition
Document type definitionRaghu nath
 
Atla dublin core basics workshop
Atla dublin core basics workshopAtla dublin core basics workshop
Atla dublin core basics workshopLisa Gonzalez
 
Preservation of 3D models
Preservation of 3D modelsPreservation of 3D models
Preservation of 3D modelsdatable_be
 
Dublin Core Basic Syntax Tutorial
Dublin Core Basic Syntax TutorialDublin Core Basic Syntax Tutorial
Dublin Core Basic Syntax TutorialEduserv Foundation
 
A Brief Introduction to Encoded Archival Description
A Brief Introduction to Encoded Archival DescriptionA Brief Introduction to Encoded Archival Description
A Brief Introduction to Encoded Archival DescriptionKevin Schlottmann
 
XML and Databases
XML and DatabasesXML and Databases
XML and DatabasesCittrex
 
Ks2008 Semanticweb In Action
Ks2008 Semanticweb In ActionKs2008 Semanticweb In Action
Ks2008 Semanticweb In ActionRinke Hoekstra
 
Linked Open Data: A simple how-to
Linked Open Data: A simple how-toLinked Open Data: A simple how-to
Linked Open Data: A simple how-tonvitucci
 

Tendances (20)

Introduction to LDL 2012
Introduction to LDL 2012Introduction to LDL 2012
Introduction to LDL 2012
 
Dublin Core Intro
Dublin Core IntroDublin Core Intro
Dublin Core Intro
 
Xml dtd
Xml dtdXml dtd
Xml dtd
 
Xml dtd- Document Type Definition- Web Technology
Xml dtd- Document Type Definition- Web TechnologyXml dtd- Document Type Definition- Web Technology
Xml dtd- Document Type Definition- Web Technology
 
Analyzing Files
Analyzing FilesAnalyzing Files
Analyzing Files
 
Introduction to XML and Databases
Introduction to XML and DatabasesIntroduction to XML and Databases
Introduction to XML and Databases
 
Xml Java
Xml JavaXml Java
Xml Java
 
Introduction to XML
Introduction to XMLIntroduction to XML
Introduction to XML
 
On the way to a Relation Registry for ISOcat data categories
On the way to a Relation Registry for ISOcat data categoriesOn the way to a Relation Registry for ISOcat data categories
On the way to a Relation Registry for ISOcat data categories
 
Document Type Definitions
Document Type DefinitionsDocument Type Definitions
Document Type Definitions
 
Document type definition
Document type definitionDocument type definition
Document type definition
 
Atla dublin core basics workshop
Atla dublin core basics workshopAtla dublin core basics workshop
Atla dublin core basics workshop
 
Dtd
DtdDtd
Dtd
 
Preservation of 3D models
Preservation of 3D modelsPreservation of 3D models
Preservation of 3D models
 
Dublin Core Basic Syntax Tutorial
Dublin Core Basic Syntax TutorialDublin Core Basic Syntax Tutorial
Dublin Core Basic Syntax Tutorial
 
A Brief Introduction to Encoded Archival Description
A Brief Introduction to Encoded Archival DescriptionA Brief Introduction to Encoded Archival Description
A Brief Introduction to Encoded Archival Description
 
Dtd
DtdDtd
Dtd
 
XML and Databases
XML and DatabasesXML and Databases
XML and Databases
 
Ks2008 Semanticweb In Action
Ks2008 Semanticweb In ActionKs2008 Semanticweb In Action
Ks2008 Semanticweb In Action
 
Linked Open Data: A simple how-to
Linked Open Data: A simple how-toLinked Open Data: A simple how-to
Linked Open Data: A simple how-to
 

Similaire à ISOcat: a short introduction

IMS Learning Tools Interoperability @ UCLA
IMS Learning Tools Interoperability @ UCLAIMS Learning Tools Interoperability @ UCLA
IMS Learning Tools Interoperability @ UCLACharles Severance
 
DC-2008 Tutorial 3 - Dublin Core and other metadata schemas
DC-2008 Tutorial 3 - Dublin Core and other metadata schemasDC-2008 Tutorial 3 - Dublin Core and other metadata schemas
DC-2008 Tutorial 3 - Dublin Core and other metadata schemasMikael Nilsson
 
Consuming open and linked data with open source tools
Consuming open and linked data with open source toolsConsuming open and linked data with open source tools
Consuming open and linked data with open source toolsJoanne Cook
 
Jayson lorenzen iptc_rnews_overview
Jayson lorenzen iptc_rnews_overviewJayson lorenzen iptc_rnews_overview
Jayson lorenzen iptc_rnews_overviewJayson Lorenzen
 
Web Services Part 1
Web Services Part 1Web Services Part 1
Web Services Part 1patinijava
 
XML, XML Databases and MPEG-7
XML, XML Databases and MPEG-7XML, XML Databases and MPEG-7
XML, XML Databases and MPEG-7Deniz Kılınç
 
New Directions in Metadata
New Directions in MetadataNew Directions in Metadata
New Directions in Metadatasuyu22
 
The Rhizomer Semantic Content Management System
The Rhizomer Semantic Content Management SystemThe Rhizomer Semantic Content Management System
The Rhizomer Semantic Content Management SystemRoberto García
 
Mark Logic StrangeLoop 2010
Mark Logic StrangeLoop 2010Mark Logic StrangeLoop 2010
Mark Logic StrangeLoop 2010Christopher Biow
 
NEOOUG 2010 Oracle Data Integrator Presentation
NEOOUG 2010 Oracle Data Integrator PresentationNEOOUG 2010 Oracle Data Integrator Presentation
NEOOUG 2010 Oracle Data Integrator Presentationaskankit
 

Similaire à ISOcat: a short introduction (20)

The ISO-DCR
The ISO-DCRThe ISO-DCR
The ISO-DCR
 
Metadata Cloud
Metadata CloudMetadata Cloud
Metadata Cloud
 
IMS Learning Tools Interoperability @ UCLA
IMS Learning Tools Interoperability @ UCLAIMS Learning Tools Interoperability @ UCLA
IMS Learning Tools Interoperability @ UCLA
 
ISOcat to LMF to TEI
ISOcat to LMF to TEIISOcat to LMF to TEI
ISOcat to LMF to TEI
 
DC-2008 Tutorial 3 - Dublin Core and other metadata schemas
DC-2008 Tutorial 3 - Dublin Core and other metadata schemasDC-2008 Tutorial 3 - Dublin Core and other metadata schemas
DC-2008 Tutorial 3 - Dublin Core and other metadata schemas
 
Consuming open and linked data with open source tools
Consuming open and linked data with open source toolsConsuming open and linked data with open source tools
Consuming open and linked data with open source tools
 
Use of ISOcat within CMDI
Use of ISOcat within CMDIUse of ISOcat within CMDI
Use of ISOcat within CMDI
 
Jayson lorenzen iptc_rnews_overview
Jayson lorenzen iptc_rnews_overviewJayson lorenzen iptc_rnews_overview
Jayson lorenzen iptc_rnews_overview
 
Web Services Part 1
Web Services Part 1Web Services Part 1
Web Services Part 1
 
XML, XML Databases and MPEG-7
XML, XML Databases and MPEG-7XML, XML Databases and MPEG-7
XML, XML Databases and MPEG-7
 
Processing XML with Java
Processing XML with JavaProcessing XML with Java
Processing XML with Java
 
New Directions in Metadata
New Directions in MetadataNew Directions in Metadata
New Directions in Metadata
 
The Rhizomer Semantic Content Management System
The Rhizomer Semantic Content Management SystemThe Rhizomer Semantic Content Management System
The Rhizomer Semantic Content Management System
 
Mark Logic StrangeLoop 2010
Mark Logic StrangeLoop 2010Mark Logic StrangeLoop 2010
Mark Logic StrangeLoop 2010
 
Odp
OdpOdp
Odp
 
NEOOUG 2010 Oracle Data Integrator Presentation
NEOOUG 2010 Oracle Data Integrator PresentationNEOOUG 2010 Oracle Data Integrator Presentation
NEOOUG 2010 Oracle Data Integrator Presentation
 
Unit 5
Unit 5Unit 5
Unit 5
 
6 311 W
6 311 W6 311 W
6 311 W
 
test
testtest
test
 
6 311 W
6 311 W6 311 W
6 311 W
 

Plus de Menzo Windhouwer

Fedora Commons in the CLARIN Infrastructure
Fedora Commons in the CLARIN InfrastructureFedora Commons in the CLARIN Infrastructure
Fedora Commons in the CLARIN InfrastructureMenzo Windhouwer
 
ISOcat and RELcat, two cooperating semantic registries
	ISOcat and RELcat, two cooperating semantic registries	ISOcat and RELcat, two cooperating semantic registries
ISOcat and RELcat, two cooperating semantic registriesMenzo Windhouwer
 
Semantic Mapping in CLARIN Component Metadata.
Semantic Mapping in CLARIN Component Metadata.Semantic Mapping in CLARIN Component Metadata.
Semantic Mapping in CLARIN Component Metadata.Menzo Windhouwer
 
Collaboratively Defining Widely Accepted Linguistic Data Categories in the IS...
Collaboratively Defining Widely Accepted Linguistic Data Categories in the IS...Collaboratively Defining Widely Accepted Linguistic Data Categories in the IS...
Collaboratively Defining Widely Accepted Linguistic Data Categories in the IS...Menzo Windhouwer
 
A CMD Core Model for CLARIN Web Services
A CMD Core Model for CLARIN Web ServicesA CMD Core Model for CLARIN Web Services
A CMD Core Model for CLARIN Web ServicesMenzo Windhouwer
 
LDL 2012 - Linking to ISOcat Data Categories
LDL 2012 - Linking to ISOcat Data CategoriesLDL 2012 - Linking to ISOcat Data Categories
LDL 2012 - Linking to ISOcat Data CategoriesMenzo Windhouwer
 
What do cats have to do with explicit semantics?
What do cats have to do with explicit semantics?What do cats have to do with explicit semantics?
What do cats have to do with explicit semantics?Menzo Windhouwer
 
Sustainable operability: Keeping complex linguistic resources alive.
Sustainable operability: Keeping complex linguistic resources alive.Sustainable operability: Keeping complex linguistic resources alive.
Sustainable operability: Keeping complex linguistic resources alive.Menzo Windhouwer
 

Plus de Menzo Windhouwer (9)

CMD2RDF
CMD2RDFCMD2RDF
CMD2RDF
 
Fedora Commons in the CLARIN Infrastructure
Fedora Commons in the CLARIN InfrastructureFedora Commons in the CLARIN Infrastructure
Fedora Commons in the CLARIN Infrastructure
 
ISOcat and RELcat, two cooperating semantic registries
	ISOcat and RELcat, two cooperating semantic registries	ISOcat and RELcat, two cooperating semantic registries
ISOcat and RELcat, two cooperating semantic registries
 
Semantic Mapping in CLARIN Component Metadata.
Semantic Mapping in CLARIN Component Metadata.Semantic Mapping in CLARIN Component Metadata.
Semantic Mapping in CLARIN Component Metadata.
 
Collaboratively Defining Widely Accepted Linguistic Data Categories in the IS...
Collaboratively Defining Widely Accepted Linguistic Data Categories in the IS...Collaboratively Defining Widely Accepted Linguistic Data Categories in the IS...
Collaboratively Defining Widely Accepted Linguistic Data Categories in the IS...
 
A CMD Core Model for CLARIN Web Services
A CMD Core Model for CLARIN Web ServicesA CMD Core Model for CLARIN Web Services
A CMD Core Model for CLARIN Web Services
 
LDL 2012 - Linking to ISOcat Data Categories
LDL 2012 - Linking to ISOcat Data CategoriesLDL 2012 - Linking to ISOcat Data Categories
LDL 2012 - Linking to ISOcat Data Categories
 
What do cats have to do with explicit semantics?
What do cats have to do with explicit semantics?What do cats have to do with explicit semantics?
What do cats have to do with explicit semantics?
 
Sustainable operability: Keeping complex linguistic resources alive.
Sustainable operability: Keeping complex linguistic resources alive.Sustainable operability: Keeping complex linguistic resources alive.
Sustainable operability: Keeping complex linguistic resources alive.
 

ISOcat: a short introduction

  • 1. ISOcatA short introduction Marc Kemps-Snijdersa, Sue Ellen Wrightb, MenzoWindhouwera aMax Planck Institute for Psycholinguistics, bKent State University marc.kemps-snijders@mpi.nl, sellenwright@gmail.com, menzo.windhouwer@mpi.nl February 19, 2010 1 CLARIN-NL Bijeenkomst
  • 2. ISOcat: a data category registry ISO 12620:2009 Terminology and other content and language resources — Specification of data categories and management of a Data Category Registry for language resources February 19, 2010 CLARIN-NL Bijeenkomst 2
  • 3. Data category The result of the specification of a given data field A data category is an elementary descriptor in a linguistic structure or an annotation scheme. Model consists of 3 main parts: Administrative part Administration and identification Descriptive part Documentation in various working languages Linguistic part Conceptual domain(s for various object languages) February 19, 2010 3 CLARIN-NL Bijeenkomst
  • 4.
  • 5. Data categories can be submitted to the standardization process, in which case they are assigned to a Thematic Domain Group which judges it.
  • 6. At regular intervals, snapshots of the standardized subset of the DCR will be submitted to ISO. TDG metadata TDG ….. TDG morphosyntax TDG terminology February 19, 2010 4 CLARIN-NL Bijeenkomst
  • 7. Standardization February 19, 2010 CLARIN-NL Bijeenkomst 5 Decision Group Submission group Data Category Registry Board Thematic Domain Group Stewardship group Validation Evaluation rejected rejected Publication
  • 8. Data categories and linguistic resources February 19, 2010 CLARIN-NL Bijeenkomst 6 partOfSpeech Lemma writtenForm writtenForm Word Form grammaticalGender lexicalType grammaticalGender wordOrder Lexicon 1..* A (schema for a) typological database Lexical Entry Shared semantics! 0..* 1..* Form Sense 0..* A (schema for a) lexicon
  • 9. <dcif:dataCategorypid="http://www.isocat.org/datcat/DC-1345" type="complex"> <dcif:administrationInformation> <dcif:administrationRecord> <dcif:identifier>partOfSpeech</dcif:identifier> <dcif:version>0.0.0</dcif:version> <dcif:registrationStatus>candidate</dcif:registrationStatus> <dcif:origin>?</dcif:origin> <dcif:creation> <dcif:creationDate>2004-07-09</dcif:creationDate> <dcif:changeDescriptionxml:lang="en"> … </dcif:changeDescription> </dcif:creation> </dcif:administrationRecord> </dcif:administrationInformation> <dcif:descriptionSection> <dcif:profile>MorphoSyntax</dcif:profile> <dcif:languageSection> <dcif:language>en</dcif:language> <dcif:definitionSection> <dcif:definitionxml:lang="en"> Term used to describe how a particular word is used in a sentence. … … … </dcif:dataCategory> Data category persistent identifier (PID): http://isocat.org/datcat/ISO-DC-1345 HTML content type Default content type HTTP 307 redirect http://www.isocat.org/rest/dc/1345.html http://www.isocat.org/rest/dc/1345.dcif Referencing data categories February 19, 2010 CLARIN-NL Bijeenkomst 7
  • 10.
  • 11.
  • 12. for schemas or instances
  • 13.
  • 14. ISOcat status February 19, 2010 CLARIN-NL Bijeenkomst 10 ISOcat is under active development: Now: You can access public data categories and selections You can create your own data categories and selections You can share your data categories and selections with others (everyone, or a specified group) In progress: Cleanup of profiles by TDGs Standardization workflow Some social features (forum to discuss specific data categories) Import external ‘data category’ sets, such as: parts of the ISO Concept Database Dublin Core TEI Future: High availability (mirrors) Relation registry
  • 15. ISOcat workshop Utrecht, Thursday March 25, 2010 Especially aimed at supporting Call 1 projects Signup at: www.clarin.nl Program: A deeper introduction to ISOcat A tutorial on using ISOcat How to annotate specific linguistic resources? February 19, 2010 CLARIN-NL Bijeenkomst 11 Invitation Send examples of the types of linguistic resources your project wants to annotate with data category references to isocat@mpi.nl and we will discuss them at the workshop!
  • 16. February 19, 2010 CLARIN-NL Bijeenkomst 12 Thank you for your attention! Visit www.isocat.org Questions? isocat@mpi.nl