A preliminary discussion of why and how application profiles should be built for different subject domains and different vocabulary structures, based on the FRSAD model. Presented at the joint meeting of LLD XG and DCMI Architecture Forum.
2. Questions to be discussed
1. Why are APs needed for subject authority data?
2. How formally (or informally) can this style of “application profile” be defined?
3. In what ways are application profiles for subject domains different from APs for descriptive metadata?
3. FRSAD Conceptual Model
Thema = “any entity used as a subject of a work”.
Nomen = any sign or sequence of signs (alphanumeric characters, symbols, sound, etc.) that a thema is known by, referred to or addressed as.
Note: in a given controlled vocabulary and within a domain, a nomen should be an appellation of only one thema.
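The note on this slide can be read as a uniqueness constraint: within one scheme, a nomen names at most one thema. A minimal Python sketch of that reading follows. The `NomenRegistry` class and the thema identifiers are hypothetical and are not part of FRSAD.

```python
class NomenRegistry:
    """Hypothetical registry: within one scheme, a nomen names one thema."""

    def __init__(self):
        # Maps (scheme, nomen string) to a thema identifier.
        self._appellations = {}

    def add(self, scheme, nomen, thema_id):
        key = (scheme, nomen)
        existing = self._appellations.get(key)
        if existing is not None and existing != thema_id:
            raise ValueError(
                f"{nomen!r} in {scheme!r} already names thema {existing!r}"
            )
        self._appellations[key] = thema_id

    def thema_for(self, scheme, nomen):
        return self._appellations[(scheme, nomen)]


registry = NomenRegistry()
registry.add("schemaA", "senior", "thema:1")
registry.add("schemaB", "senior", "thema:2")  # allowed: a different scheme
```

The same string may appear in two schemes without conflict, because the constraint is scoped to one controlled vocabulary.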
4. 1. Why are APs needed?
Cologne, July 20, 2010
a) thema types
Depending on the implementation, themas can be categorised in various ways, even within the same discipline/subject domain
6. Health/Medical
UMLS
Entities
Physical Object
Organism
Anatomical Structure
Manufactured Object
Substance
Conceptual Entity
Idea or Concept
Finding
Organism Attribute
Intellectual Product
Language
Occupation or Discipline
Organization
Group Attribute
Group
Events
Activity
Phenomenon or
The Foundational Model of Anatomy (FMA)
• Anatomical Entity
• Non-physical anatomical entity
• Physical anatomical entity
• Attribute Entity
• Cell morphology
• Cell shape type
• Cell surface feature
• Concept name
• Miscellaneous term
• Organ part phenotype
• Physical attribute relationship
• Physical state
• Structural relationship value
• Dimensional Entity
• Line
International Classification of Diseases (ICD)
• DISEASES AND INJURIES
• PROCEDURES
•+ EXTERNAL CAUSES OF INJURY AND POISONING
•+ FACTORS INFLUENCING HEALTH STATUS AND CONTACT
7. The situation is just like this:
Andy Corbett, James Reid, David Medyckyj-Scott, Cressida Chappell
(Universities of Edinburgh and Essex): Geo-Crosswalk: A gazetteer
service and server for the UK. JCDL2002 NKOS Workshop July
18, 2002, Portland, Oregon.
8. (cont.) 1. Why are APs needed?
b) thema-to-thema relationships
General relationships between themas
(applicable to all types)
Hierarchical
Partitive
Generic
Instance
Polyhierarchical
Associative (=other)
Other thema-to-thema relationships are
implementation-dependent
9. Geography
FAO Country Profiles -- The Geopolitical Ontology
http://www.fao.org/countryprofiles/geoinfo.asp
Area types:
• Groups
• Territories
Data associated with areas:
• Names (multilingual)
• International codes
• Coordinates
• DBPedia ID
• Currency names and codes
• Adjectives of nationality
• Basic statistical data
Relations:
• Group membership
• Land borders
• Historic changes: predecessor, successor, valid since, valid until
ADL Digital Gazetteer
Relationships between
entries
Inherently spatial
Containment
Overlap
Proximity
Directional
Explicitly stated
PartOf
AdministrativePartOf
AdministrativePartitionMemberOf
AdministrativeSeatOf
ConventionallyQualifiedBy
SubfeatureOf
GeophysicalPartitionMemberOf
10. Nomens in different types of KOS
themas represented by:
• thesauri: terms (preferred & non-preferred)
• classification schemes: notations
• subject heading systems: terms of pre-coordinated strings
• taxonomies: category labels (with or without notations)
• ontologies: terms or identifiers
• picklists: terms
• … …: … …
11. 2. How formally (or informally) can this style of application profile be defined?
Functional Requirements*
Domain Model*
Description Set Profile*
Usage Guidelines
Encoding syntax guidelines
*mandatory
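The component list on this slide, with its mandatory markers, can be turned into a simple completeness check for a draft profile. The sketch below is illustrative only: the function name `missing_mandatory` and the draft profile are assumptions, while the component names and mandatory flags come from the slide.

```python
# Component name -> whether the slide marks it as mandatory (*).
DCAP_COMPONENTS = {
    "Functional Requirements": True,
    "Domain Model": True,
    "Description Set Profile": True,
    "Usage Guidelines": False,
    "Encoding syntax guidelines": False,
}


def missing_mandatory(profile_sections):
    """Return the mandatory components absent from a draft, in slide order."""
    present = set(profile_sections)
    return [
        name
        for name, mandatory in DCAP_COMPONENTS.items()
        if mandatory and name not in present
    ]


# Hypothetical draft profile that only has two sections so far.
draft = ["Functional Requirements", "Usage Guidelines"]
print(missing_mandatory(draft))
```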
12. (cont.) 2. How formally (or informally) can this style of application profile be defined?
DCAP: Functional Requirements* (describes what a community wants to accomplish with its application)
FRSAD-AP: vocab control for retrieval, organizing/categorizing, navigation, reasoning, provenance …
13. (cont.) 2. How formally (or informally) can this style of application profile be defined?
DCAP FRSAD-AP
FRSAD is a general model;
more specific ones are needed for
• different types (e.g., classification vs. thesaurus vs. subject headings)
• different subject domains (e.g., medical vs. consumer health)
14. E.g., what are the basic entities in a classification system?
thema : class
nomen: notation
themas :
class
. including built classes[1]
. memberInClass[2]
. . .
nomens:
notation
caption
nameOfMember-inScopeNote
index term
… …
[Diagram: class@ddc (has nomen ‘546.663’ @ ddc; has caption ‘*Mercury’ @ en) has super class class@ddc (has nomen ‘546.66’ @ ddc; has caption ‘Group 12’ @ en)]
A notation has its semantic value and an
ordinal value
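The DDC example on this slide can be sketched as a small data model in which a class is a thema and its notation and caption are nomens. This is one possible reading, not a definitive FRSAD-AP. The class names `Nomen` and `ClassThema` are hypothetical, and pairing 546.663 with "*Mercury" and 546.66 with "Group 12" is an assumption based on the slide's labels.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Nomen:
    value: str
    kind: str       # "notation" or "caption" in this sketch
    qualifier: str  # scheme or language tag, e.g. "ddc" or "en"


@dataclass
class ClassThema:
    nomens: list
    super_class: Optional["ClassThema"] = None

    def nomen(self, kind):
        """Return the value of the first nomen of the given kind."""
        return next(n.value for n in self.nomens if n.kind == kind)


group_12 = ClassThema(
    nomens=[Nomen("546.66", "notation", "ddc"), Nomen("Group 12", "caption", "en")]
)
mercury = ClassThema(
    nomens=[Nomen("546.663", "notation", "ddc"), Nomen("*Mercury", "caption", "en")],
    super_class=group_12,
)
```

Keeping notation and caption as separate nomen objects, rather than as string attributes of the class, leaves room for attaching attributes (scheme, language) to each of them.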
15. E.g., how to describe the orders/sequences of coordinate classes (not just hierarchical relationships)
Semantically meaningful orders in a classification system
Classes are arranged according to
• stages in a process (e.g., brewing processes, packaging of product processes);
• time or evolutionary sequence (e.g., ancient Greek sculptures, paleontology, stars);
• degree of complexity (e.g., geometric figures);
• size (e.g., towns, cities, metropolises, and other administrative units);
• the Literary Warrant principle (e.g., arrange literature according to publication amount);
• the User Warrant principle (e.g., arrange services and products according to popularity)
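One way to record such an order is to give each coordinate class an explicit ordinal value and to name the ordering principle on the array of siblings. The sketch below assumes hypothetical names (`CoordinateClass`, `ClassArray`) and uses the size example from the slide.

```python
from dataclasses import dataclass


@dataclass
class CoordinateClass:
    caption: str
    ordinal: int  # position within the array of sibling classes


@dataclass
class ClassArray:
    ordering_principle: str  # e.g. "size", "stages in a process"
    members: list

    def in_sequence(self):
        """Return captions in the semantically meaningful order."""
        ordered = sorted(self.members, key=lambda c: c.ordinal)
        return [c.caption for c in ordered]


settlements = ClassArray(
    ordering_principle="size",
    members=[
        CoordinateClass("metropolis", 3),
        CoordinateClass("town", 1),
        CoordinateClass("city", 2),
    ],
)
print(settlements.in_sequence())
```

The order is stored as data on the siblings, so it is independent of the hierarchical relationship to their shared parent class.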
16. (cont.) 2. How formally (or informally) can this style of application profile be defined?
DCAP (cont.): Description Set Profile* (enumerates the metadata terms to be used)
FRSAD-AP (cont.): Properties of entities. APs may need specific attributes and/or values, e.g., for notation & caption
[other questions]
17. Nomen general attributes (include but are not limited to)
Type of nomen (identifier, controlled name, …)*
Scheme (LCSH, DDC, UDC, ULAN, ISO 8601…)
Reference Source of nomen (Encyclopedia Britannica…)
Representation of nomen (alphanumeric, sound, visual,...)
Language of nomen (English, Japanese, Slovenian,…)
Script of nomen (Cyrillic, Thai, Chinese-simplified,…)
Script conversion (Pinyin, ISO 3601, Romanisation of Japanese…)
Form of nomen (full name, abbreviation, formula…)
Time of validity of nomen (until xxxx, after xxxx, from… to …)
Audience (English-speaking users, scientists, children …)
Status of nomen (provisional, accepted, official,...)
*note: examples of attribute values in parentheses
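The attribute list can be sketched as a record type in which every attribute is optional, so that an application profile decides which ones to require. The class name `NomenDescription`, the field names, and the example values are assumptions made for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class NomenDescription:
    value: str
    type_of_nomen: Optional[str] = None
    scheme: Optional[str] = None
    reference_source: Optional[str] = None
    representation: Optional[str] = None
    language: Optional[str] = None
    script: Optional[str] = None
    script_conversion: Optional[str] = None
    form: Optional[str] = None
    time_of_validity: Optional[str] = None
    audience: Optional[str] = None
    status: Optional[str] = None

    def recorded_attributes(self):
        """List the attribute names that carry a value, in declaration order."""
        return [
            f.name
            for f in fields(self)
            if f.name != "value" and getattr(self, f.name) is not None
        ]


# Illustrative values only; treating a notation as an "identifier" is an assumption.
notation = NomenDescription(
    value="546.663",
    type_of_nomen="identifier",
    scheme="DDC",
    representation="alphanumeric",
)
print(notation.recorded_attributes())
```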
18. Example: Notations -- Rules
Classification numbers may be built according to rules
Example from DDC:
821.008 Collections of English poetry
is built with
82 (following the instruction at 820.1-828 Subdivisions of English literature)
plus 100 (following the instruction at T3B--1001-T3B--1009 Standard subdivisions;
collections; history, description, critical appraisal)
plus 8 Collections of literary texts from the add table at T3B--1-T3B--8 Specific forms.
821 English poetry
821.008 English poetry--collections
821.00803543 Love--poetry--English literature--collections, . . .
821.0080355 English poetry--social themes--collections, . . .
821.008036 English poetry--nature--collections, . . .
821.0080382 English poetry--religious themes--collections, . . .
821.009 English poetry--history and criticism
821.04 English poetry--lyric poetry, . . .
821.0708 Humorous poetry--English literature--collections, . . .
http://ddc.typepad.com/025431/ddc_tip_of_the_week/
Source: One Zero or Two? Dewey Blog. September 28, 2006
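The concatenation step in the DDC example (82 plus 100 plus 8 giving 821.008) can be sketched as below. The function `build_ddc_number` is hypothetical and only mimics joining the digit segments and placing the decimal point after the third digit. Actual number building follows the instructions in the schedules and tables, which this sketch does not model.

```python
def build_ddc_number(*segments):
    """Join digit segments and put a decimal point after the third digit."""
    digits = "".join(segments)
    if not digits.isdigit():
        raise ValueError("segments must contain digits only")
    if len(digits) <= 3:
        # Pad short numbers to the three-digit minimum.
        return digits.ljust(3, "0")
    return f"{digits[:3]}.{digits[3:]}"


# Base number 82, then 100, then 8, as in the slide's example.
print(build_ddc_number("82", "100", "8"))
```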
19. General Nomen relationships
Partitive
Equivalence
Equivalence can be specified further, e.g.:
Replaces/is replaced by
Has variant form/is variant form
Has derivation/is derived from
Has acronym/is acronym
Has abbreviation/is abbreviation
Has transliterated form/is transliteration
APs may need more specific relationships, e.g., for
notation & caption
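The paired labels in the equivalence list suggest that each relationship is recorded together with its inverse. A minimal sketch, assuming a hypothetical helper `with_inverse` and an illustrative acronym pair that is not taken from the slides:

```python
# Forward label -> inverse label, as paired on the slide.
INVERSE = {
    "replaces": "is replaced by",
    "has variant form": "is variant form",
    "has derivation": "is derived from",
    "has acronym": "is acronym",
    "has abbreviation": "is abbreviation",
    "has transliterated form": "is transliteration",
}
# Make the lookup work in both directions.
INVERSE.update({inverse: forward for forward, inverse in list(INVERSE.items())})


def with_inverse(source, relation, target):
    """Return the stated nomen-to-nomen relationship and its inverse."""
    return [(source, relation, target), (target, INVERSE[relation], source)]


statements = with_inverse("World Health Organization", "has acronym", "WHO")
```

An application profile that needs more specific relationships, for example between notation and caption, could extend the table with further pairs.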
20. (cont.) 2. How formally (or informally) can this style of application profile be defined?
DCAP (cont.): Usage Guidelines; Encoding syntax guidelines
FRSAD-AP (cont.): Usage Guidelines. Recommendation: e.g., SKOS & extensions; MADS, BS8723-5, ISO25964, …
22. 3. In what ways are application profiles for subject
domains different from APs for descriptive
metadata?
Descriptive metadata Subject domain vocabularies
Describing a thema
-- what a concept is about
-- where it belongs
Serious sameAs issue
-- senior@schemaA =? senior@schemaB
-- sunflower@mesh =? sunflower@aat
Integrity relies on the domain model and the properties around a thema and a nomen
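The sameAs issue can be illustrated by refusing to equate two scheme-qualified nomens on string identity alone. In the sketch below, equivalence holds only when a mapping has been explicitly asserted. The `SameAsMappings` class and `qualify` helper are hypothetical, and the check is deliberately not transitive.

```python
def qualify(nomen, scheme):
    """Write a nomen in the slide's nomen@scheme form."""
    return f"{nomen}@{scheme}"


class SameAsMappings:
    """Hypothetical store of explicitly asserted equivalences."""

    def __init__(self):
        self._pairs = set()

    def assert_same(self, a, b):
        self._pairs.add(frozenset((a, b)))

    def same_thema(self, a, b):
        # Identical qualified nomens match; otherwise a mapping is required.
        return a == b or frozenset((a, b)) in self._pairs


mappings = SameAsMappings()
mesh = qualify("sunflower", "mesh")
aat = qualify("sunflower", "aat")
print(mappings.same_thema(mesh, aat))  # no mapping asserted yet
```

Whether such a mapping should be asserted depends on the domain model and the properties recorded for each thema and nomen, which is the point the slide makes.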
23. Questions to be discussed
1. Why are APs needed for subject authority data?
2. How formally (or informally) can this style of “application profile” be defined?
3. In what ways are application profiles for subject domains different from APs for descriptive metadata?
The main focus of the model is intellectual property and rights management, but it also overlaps significantly with FRBR. The basic entities are defined as:
Percept: an entity which is perceived directly with at least one of the five senses.
Being: an entity which has characteristics of animate life; anything which lives and dies
Thing: an entity without the characteristics of animate life
Concept: an entity which cannot be perceived directly through the mode of one of the five senses; an abstract entity, a notion or idea; an abstract noun; an unobservable proposition which exists independently of time and space
Relation: the interaction of percepts and/or concepts; a connection between two or more entities
Event: a dynamic relation involving two or more entities; something that happens; a relation through which an attribute of an entity is changed, added or removed
Situation: a static relation involving two or more entities; something that continues to be the case; a relation in which the attributes of entities remain unchanged
Ranganathan
Personality
Matter
Energy
Space
Time
OpenCYC chart: http://www.bioinfo.de/isb/2002020017/
“These are the first people that a country writes to when it changes its name, I'm told. So in terms of provenance, this is the most authoritative. It's not ideal since it does not deal with intra-country concepts. Also it's a bit "design-y" for my tastes. When I asked about this they said they had not envisaged other types of user wanting make reference to this material.
If one wants to define a set of reference ontologies to use which are as standard as possible then I'd suggest that this should be used for reasons of provenance. I'm in the process of trying to integrate the EDM Council Semantics Repository upper ontology terms into this, for that reason, and replacing the "holding" terms we currently use, which you are welcome to look at on www.hypercube.co.uk/edmcouncil under "Global Terms/Geographical".
The intra-country terms are more problematic, and I'm also looking for what is the "most definitive" ontology for such terms, i.e. with provenance from a standards or governing body. The recognised authority on countries and country components is ISO 3166 at http://www.iso.org/iso/country_codes.htm but unfortunately that simply itemises the various intra-country components without attempting to create common concepts (e.g. a federal province is a common concept, whether a given federation calls it a State, a Province, a Canton or whatever, subject to different legal nuances of course).”
Mike Bennett <mbennett@hypercube.co.uk> 3/27 via "[ontolog-forum]" <ontolog-forum@ontolog.cim3.net>
The Guidelines for Dublin Core Application Profiles document provides a framework for the content and structure of any Dublin Core Application Profile (DCAP). The document explains the key components of a Dublin Core Application Profile and walks through the process of developing a profile. According to these guidelines, “[a] DCAP is a document (or set of documents) that specifies and describes the metadata used in a particular application. To accomplish this, a profile:
describes what a community wants to accomplish with its application (Functional Requirements);
characterizes the types of things described by the metadata and their relationships (Domain Model);
enumerates the metadata terms to be used and the rules for their use (Description Set Profile and Usage Guidelines); and
defines the machine syntax that will be used to encode the data (Syntax Guidelines and Data Formats)” (Coyle and Baker, 2009).
A SKOS concept scheme can be viewed as an aggregation of one or more SKOS concepts.
Semantic relationships (links) between those concepts may also be viewed as part of a concept scheme.
schedule
The series of numbers, captions and accompanying instructions or notes that constitute the core of a classification scheme. For the Library of Congress Classification, the schedules are designated from A-Z with one, two, or three letters denoting the class or subclass within the schedule. In the Dewey Decimal Classification the schedules are designated by the series of DDC numbers 001-999.
This is why nomen (in general) has to be an entity, not an attribute of thema.
In a particular implementation the relationship between a thema and nomen can be compressed into the nomen becoming an attribute of thema
Image source: nal.usda.gov
•Group 1 entities are defined as the products of intellectual or artistic endeavors: work, expression, manifestation, and item
•Group 2 entities are actors, those who are responsible for the intellectual or artistic content, the physical production and dissemination, or the custodianship, of Group 1 entities: person, corporate body
•Group 3 entities are the subjects of works, intellectual or artistic endeavor