IDs in and out of the database
Entomological Collection Network (ECN) 2012
November 10 – 11, Knoxville, TN
Debbie Paul, Greg Riccardi
Overview
• What good is identification?
• How are identifiers used by consumers
• Providing IDs
• Resolving IDs in a server
– Strategies for storing IDs in databases
• Linked Data
• Annotations ~ all sorts
• Feedback
What good is identification?
• Aggregation
– If you get info from 2 sources that are about
the same object, you can combine the info
• Resolution (finding information about object)
– Types of resolution
• Determine where to get information
• Determine how to get information
• Providing information
– How to create IDs
– How to publish IDs
– How to fetch database information for IDs
HTTP URIs
• Biggest problem
– Identification and 2 types of resolution
are commingled
• Resolution
– Where to get information
• Look somewhere
– How to get information
• Fetch information using some
protocol
DOI example
• The DOI is
• 10.3897/zookeys.209.3135
• URI (for aggregating) is
• doi:10.3897/zookeys.209.3135
• A URL for information retrieval (proxy resolution) is
• http://dx.doi.org/10.3897/zookeys.209.3135
• Information fetched from
– HTML:
• http://www.pensoft.net/journals/zookeys/article/3135/abstract/five-task-clusters-that-enable-efficient-and-effective-digitization-of-biological-collections
– RDF:
• http://data.crossref.org/10.3897/zookeys.209.3135
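The three forms on this slide can be derived from one another mechanically. The following is a minimal sketch, assuming the `dx.doi.org` proxy shown above; the function name `doi_forms` and the dictionary keys are illustrative only, not part of any DOI library.

```python
from urllib.parse import quote


def doi_forms(doi: str) -> dict:
    """Return the bare DOI, its URI form, and a proxy URL for retrieval."""
    bare = doi.strip()
    # Accept input that already carries the "doi:" prefix.
    if bare.lower().startswith("doi:"):
        bare = bare[4:]
    return {
        "doi": bare,                   # the identifier itself
        "uri": "doi:" + bare,          # string to compare when aggregating
        # Proxy resolution: where to send the ID to fetch information.
        "proxy_url": "http://dx.doi.org/" + quote(bare, safe="/"),
    }


forms = doi_forms("10.3897/zookeys.209.3135")
print(forms["uri"])
print(forms["proxy_url"])
```

Keeping the URI form separate from the proxy URL means the string used for comparison does not change if the resolution service does.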
What’s in an ID?
• For consumer:
– NOTHING! No information
– Might as well be UUID
• Can’t type it, remember it, parse it,
resolve it
– Useful for comparison and aggregation
• Equal strings (persistence)
• Different strings about the same object
– fetching information
• Send the ID somewhere for info
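Aggregation by equal strings can be sketched in a few lines. This is an illustration only: the record fields (`collector`, `image_count`) and their values are hypothetical, and the Morphbank URLs are used purely as opaque ID strings.

```python
from collections import defaultdict


def aggregate(*sources):
    """Combine records from several sources that carry the same ID string."""
    merged = defaultdict(dict)
    for source in sources:
        for record in source:
            # The ID is treated as an opaque string: only equality matters.
            fields = {k: v for k, v in record.items() if k != "id"}
            merged[record["id"]].update(fields)
    return dict(merged)


# Hypothetical records from two providers describing the same object.
source_a = [{"id": "http://www.morphbank.net/818505", "collector": "hypothetical collector"}]
source_b = [
    {"id": "http://www.morphbank.net/818505", "image_count": 3},
    {"id": "http://www.morphbank.net/643261", "image_count": 1},
]

combined = aggregate(source_a, source_b)
```

Nothing in the sketch parses or resolves the ID, which is the point of the slide: for the consumer, comparison is enough to combine information.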
What’s in an ID?
• For Provider/resolver:
– Use ID to find local storage of
information
– E.g.
• parse out the DwC triplet
• Extract the database table and
primary key
• Look up the ID in a table of IDs
• Look up ID in a URI field of a
database table
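Two of these strategies, looking the ID up in a table of IDs and parsing out the DwC triplet, might be combined as below. This is a rough sketch under stated assumptions: the in-memory dictionaries stand in for real database tables, and the names `RECORDS`, `ID_TABLE`, `CATALOG_INDEX`, `parse_dwc_triplet` and `resolve` are all hypothetical. The sample values are the specimen identifiers used elsewhere in these slides.

```python
# Hypothetical local storage: primary key -> record.
RECORDS = {19537: {"catalog_number": "bbbrc000123"}}

# Strategy: a table of known IDs, each pointing at a primary key.
ID_TABLE = {
    "urn:uuid:424854d7-baec-42cf-a142-805b64117b9f": 19537,
    "http://ids.flmnh.ufl.edu/herb/bbbrc000123": 19537,
}

# Strategy: an index keyed by (institution, collection, catalog number).
CATALOG_INDEX = {("flmnh", "herb", "bbbrc000123"): 19537}


def parse_dwc_triplet(identifier):
    """Return (institution, collection, catalog number) or None."""
    text = identifier
    if text.startswith("urn:catalog:"):
        text = text[len("urn:catalog:"):]
    elif text.startswith(("urn:", "http:", "https:")):
        return None  # some other kind of identifier, not a triplet
    parts = text.split(":")
    if len(parts) != 3 or not all(parts):
        return None
    return tuple(parts)


def resolve(identifier):
    """Find the locally stored record for an ID, or None if unknown."""
    key = ID_TABLE.get(identifier)
    if key is None:
        triplet = parse_dwc_triplet(identifier)
        if triplet is not None:
            key = CATALOG_INDEX.get(triplet)
    if key is None:
        return None
    return RECORDS.get(key)
```

Trying the table of IDs first keeps opaque identifiers such as UUIDs working without any parsing; the triplet parser is only a fallback for IDs that encode their own context.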
What’s in an id for the provider?
• record id 112234
• uuid 954c8760-e1a6-4b4b-ab82-6bf7311c25f3
• lsid urn:lsid:example.org:specimen:22545
• uri
• ezid http://n2t.net/ark:/99999/fk42b9hdf
• doi doi:10.1038/ng0609-637
What about Specimen identifiers?
• identifier on the specimen?
– readable text
– encoded data
– barcode is a contextual identifier
• identifier in the database?
– http://ids.usms.edu/herb/0014097
– http://ids.usms.edu/herb/0303134303937
How do providers identify?
• Notice online databases and your
database and find the identifiers of
the various objects
• Some identifiers are local (e.g. primary
key)
• Some identifiers are globally unique
• Some identifiers are URIs
Identification in the field
Storing IDs in databases
• your contextual ids?, your guids?
• What to use for IDs?
–record id
–uuid
–lsid
–uri
• what’s in your wallet database?
• Morphbank Example
IDs in Morphbank
• Morphbank Example
• http://www.morphbank.net/818505
IDs in Morphbank
• Morphbank Example
• http://www.morphbank.net/643261
Sharing data with IDs
• into a publication
• uploaded to the web
• data shared with a database integrator /
aggregator
– GBIF
– iDigBio
– VertNet
– Morphbank
• what is it exactly in the publication?
– an id?, a guid? a link to more information?
– what will be cited? searched for?
Feedback with IDs
• Annotations
– Target of annotation
• http://www.morphbank.net/818505
– filtered PUSH
• linked data ~ the semantic web
– (benefits – in a minute)
• updating the database
– be(a)ware
– Remember previous IDs
What’s coming up next?
• expect guids for all sorts of objects
–collection objects (example: specimen)
–georeferences
–taxon concepts
–determinations
–people
GUIDs are key
• 1 to many IDs known for a given object
• store and share the ones you know about
Identifier type                          Example value
Specimen RecordID                        19537
Specimen Previous Catalog Number         212345
Specimen Catalog Number / bar code       bbbrc000123
Darwin Core Triplet (DwC)                flmnh:herb:bbbrc000123
DwC Occurrence URI                       urn:catalog:flmnh:herb:bbbrc000123
Specimen GUID of type lsid               urn:lsid:biocol.org:flmnh:bbbrc000123
Specimen Opaque Identifier (UUID)        424854d7-baec-42cf-a142-805b64117b9f
URI for UUID                             urn:uuid:424854d7-baec-42cf-a142-805b64117b9f
Specimen GUID of type HTTP-URI           http://ids.flmnh.ufl.edu/herb/bbbrc000123
*Cannot enforce single identifier per object
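One way to store every ID known for an object is a separate identifier table with many rows per specimen. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table names, column names and `id_type` labels are assumptions made for illustration, not the schema of any particular collection database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE specimen (record_id INTEGER PRIMARY KEY);
CREATE TABLE specimen_identifier (
    id_type    TEXT NOT NULL,     -- hypothetical labels, e.g. 'lsid', 'uuid'
    identifier TEXT NOT NULL,
    record_id  INTEGER NOT NULL REFERENCES specimen(record_id),
    PRIMARY KEY (id_type, identifier)
);
""")

conn.execute("INSERT INTO specimen (record_id) VALUES (19537)")
conn.executemany(
    "INSERT INTO specimen_identifier (id_type, identifier, record_id) VALUES (?, ?, ?)",
    [
        ("previous_catalog_number", "212345", 19537),
        ("catalog_number", "bbbrc000123", 19537),
        ("dwc_triplet", "flmnh:herb:bbbrc000123", 19537),
        ("lsid", "urn:lsid:biocol.org:flmnh:bbbrc000123", 19537),
        ("uuid", "424854d7-baec-42cf-a142-805b64117b9f", 19537),
        ("http_uri", "http://ids.flmnh.ufl.edu/herb/bbbrc000123", 19537),
    ],
)


def find_records(identifier):
    """Return the record IDs of every specimen known by this identifier."""
    rows = conn.execute(
        "SELECT DISTINCT record_id FROM specimen_identifier "
        "WHERE identifier = ? ORDER BY record_id",
        (identifier,),
    ).fetchall()
    return [row[0] for row in rows]
```

The lookup returns a list because local identifiers such as catalog numbers are not guaranteed to be unique across collections, and because old identifiers stay in the table so that previously shared IDs keep working.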
caring for guids
• store them
– database adjustments
– tweaking current standard practices
• share them
– data standards
– 3 ways to modify darwin core
• reap the benefits
caring for guids – reap the benefits
• Data quality feedback
• Dialog based on annotation
• Tracking objects through analysis and use
• Maintaining attribution to provider
• Find related objects
• Find a way to take advantage of efforts
of many smart dedicated people
– BHL, biscicol, filtered PUSH, GNA, TNRS,
SGR,…
Thanks from iDigBio

  • 1. IDs in and out of the database Entomological Collection Network (ECN) 2012 November 10 – 11, Knoxville, TN Debbie Paul, Greg Riccardi
  • 2. • What good is identification? • How are identifiers used by consumers • Providing IDs • Resolving IDs in a server –Strategies for storing IDs in databases • Linked Data • Annotations ~ all sorts • Feedback Overview
  • 3. What good is identification? • Aggregation – If you get info from 2 sources that are about the same object, you can combine the info • Resolution (finding information about object) – Types of resolution • Determine where to get information • Determine how to get information • Providing information – How to create IDs – How to publish IDs – How to fetch database information for IDs
  • 4. HTTP URIs • Biggest problem – Identification and 2 types of resolution are comingled • Resolution – Where to get information • Look somewhere – How to get information • Fetch information using some protocol
  • 5. DOI example • The DOI is • 10.3897/zookeys.209.3135 • URI (for aggregating) is • doi:10.3897/zookeys.209.3135 • A URL for information retrieval (proxy resolution) is • http://dx.doi.org/10.3897/zookeys.209.3135 • Information fetched from – HTML: • http://www.pensoft.net/journals/zookeys/article/3 135/abstract/five-task-clusters-that-enable- efficient-and-effective-digitization-of-biological- collections – RDF: • http://data.crossref.org/10.3897/zookeys.209.3135
  • 6. What’s in an ID? • For consumer: – NOTHING! No information – Might as well be UUID • Can’t type it, remember it, parse it, resolve it – Useful for comparison and aggregation • Equal strings (persistence) • Different strings about the same object – fetching information • Send the ID somewhere for info
  • 7. What’s in an ID? • For Provider/resolver: – Use ID to find local storage of information – E.g. • parse out the DWC triple • Extract the database table and primary key • Look up the ID in a table of IDs • Look up ID in a URI field of a database table
  • 8. What’s in an id for the provider? • record id 112234 • uuid 954c8760-e1a6-4b4b-ab82-6bf7311c25f3 • lsid urn:lsid:example.org:specimen:22545 • uri • ezid http://n2t.net/ark:/99999/fk42b9hdf • doi doi:10.1038/ng0609-637
  • 9. What about Specimen identifiers? • identifier on the specimen? – readable text – encoded data – barcode is a contextual identifier • identifier in the database? – http://ids.usms.edu/herb/0014097 – http://ids.usms.edu/herb/0303134303937
  • 10. How do providers identify? • Notice online databases and your database and find the identifiers of the various objects • Some identifiers are local (e.g. primary key) • Some identifiers are globally unique • Some identifiers are URIs
  • 12. Storing IDs in databases • your contextual ids?, your guids? • What to use for IDs? –record id –uuid –lsid –uri • what’s in your wallet database? • Morphbank Example
  • 13. IDs in Morphbank • Morphbank Example • http://www.morphbank.net/818505
  • 14. IDs in Morphbank • Morphbank Example • http://www.morphbank.net/643261
  • 15. Sharing data with IDs • into a publication • uploaded to the web • data shared with a database integrator / aggregator – GBIF – iDigBio – VertNet – Morphbank • what is it exactly in the publication? – an id?, a guid? a link to more information? – what will be cited? searched for?
  • 16. Feedback with IDs • Annotations – Target of annotation • http://www.morphbank.net/818505 – filtered PUSH • linked data ~ the semantic web – (benefits – in a minute) • updating the database – be(a)ware – Remember previous IDs
  • 17. What’s coming up next? • expect guids for all sorts of objects –collection objects (example: specimen) –georeferences –taxon concepts –determinations –people
  • 18. GUIDs are key • 1 to many IDs known for a given object • store and share the ones you know about
      – Specimen RecordID: 19537
      – Specimen Previous Catalog Number: 212345
      – Specimen Catalog Number / bar code: bbbrc000123
      – Darwin Core Triplet (DwC): flmnh:herb:bbbrc000123
      – DwC Occurrence URI: urn:catalog:flmnh:herb:bbbrc000123
      – Specimen GUID of type lsid: urn:lsid:biocol.org:flmnh:bbbrc000123
      – Specimen Opaque Identifier (UUID): 424854d7-baec-42cf-a142-805b64117b9f
      – URI for UUID: urn:uuid:424854d7-baec-42cf-a142-805b64117b9f
      – Specimen GUID of type HTTP-URI: http://ids.flmnh.ufl.edu/herb/bbbrc000123
      *Cannot enforce single identifier per object
  • 19. caring for guids • store them – database adjustments – tweaking current standard practices • share them – data standards – 3 ways to modify darwin core • reap the benefits
  • 20. caring for guids – reap the benefits • Data quality feedback • Dialog based on annotation • Tracking objects through analysis and use • Maintaining attribution to provider • Find related objects • Find a way to take advantage of efforts of many smart dedicated people – BHL, biscicol, filtered PUSH, GNA, TNRS, SGR,…
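
Slides 7 and 18 describe a provider-side lookup: store every identifier known for an object and use any of them to find the local record. A minimal Python sketch of that "table of IDs" strategy follows, using the identifiers from slide 18. The function names (`build_id_table`, `resolve`), the in-memory dict standing in for a database table, and the whitespace-trimming rule are assumptions for illustration, not part of the presentation.

```python
from typing import Dict, Iterable, Optional

# Local primary key from slide 18 ("Specimen RecordID").
RECORD_ID = 19537

# Globally scoped identifiers listed for that specimen on slide 18.
SLIDE_18_IDS = [
    "flmnh:herb:bbbrc000123",                         # Darwin Core triplet
    "urn:catalog:flmnh:herb:bbbrc000123",             # DwC occurrence URI
    "urn:lsid:biocol.org:flmnh:bbbrc000123",          # LSID
    "424854d7-baec-42cf-a142-805b64117b9f",           # UUID
    "urn:uuid:424854d7-baec-42cf-a142-805b64117b9f",  # URI for the UUID
    "http://ids.flmnh.ufl.edu/herb/bbbrc000123",      # HTTP URI
]


def build_id_table(record_id: int, identifiers: Iterable[str]) -> Dict[str, int]:
    """Map each known identifier string to the local record id."""
    # Hypothetical stand-in for a database table of (identifier, record_id) rows.
    return {identifier.strip(): record_id for identifier in identifiers}


def resolve(table: Dict[str, int], identifier: str) -> Optional[int]:
    """Return the local record id for an identifier, or None if it is unknown."""
    return table.get(identifier.strip())


table = build_id_table(RECORD_ID, SLIDE_18_IDS)
print(resolve(table, "urn:uuid:424854d7-baec-42cf-a142-805b64117b9f"))  # 19537
print(resolve(table, "urn:lsid:example.org:specimen:22545"))            # None
```

Many identifier strings map to one record here, which matches the slide's point that a single identifier per object cannot be enforced. An unknown identifier returns `None`, which only means the provider has not stored it, not that the object does not exist.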

Editor's notes

  1. iDigBio Summit 2011: input from initial members.
  2. Careful: id1 = id2 means same object; id1 != id2 does not mean different objects.
  3. Aggregation and resolution are separate issues, comingled by HTTP URIs. A uniform resource identifier (URI) is a string of characters used to identify a name or a resource. URIs can be classified as locators (URLs), as names (URNs), or as both. A uniform resource name (URN) functions like a person's name, while a uniform resource locator (URL) resembles that person's street address. In other words: the URN defines an item's identity, while the URL provides a method for finding it (from Wikipedia). One can use a URN to talk about a resource without implying its location or how to access it. The resource does not necessarily need to be accessible over a network. For example, the URN urn:isbn:0-395-36341-1 is a URI that specifies the identifier system, i.e. international standard book number (ISBN), as well as the unique reference within that system, and allows one to talk about a book, but the URI doesn't suggest where and how to obtain an actual copy of it.
  4. Not comingled. Identifier and resolution (proxy) are separate. The consumer has to know somewhere to look for info. Requires an organization to manage allocation of ID space and proxy resolution. Members pay for the service.
  5. Last of the digression. Careful: id1 = id2 means same object; id1 != id2 does not mean different objects.
  6. Back to the primary purpose, managing identifiers as a provider/creator
  7. The standard for identification advocated by W3C is to use Uniform Resource Identifiers (URIs): a URI is a string that begins with a scheme name (or protocol), e.g. http, https, mailto, doi, ftp, urn. UUID (sometimes GUID): definitely unique, e.g. 954c8760-e1a6-4b4b-ab82-6bf7311c25f3; hard to type in; not resolvable; not always DB friendly; opaque. LSID: urn:lsid:authority:namespace:identifier; http://lsid.tdwg.org/urn:lsid:authority:namespace:identifier
  8. The standard for identification advocated by W3C is to use Uniform Resource Identifiers (URIs): a URI is a string that begins with a scheme name (or protocol), e.g. http, https, mailto, doi, ftp, urn. The second URI has a hex encoding of “0014097”. UUID (sometimes GUID): assured unique, e.g. d6610130-5248-11e1-b86c-0800200c9a66; hard to type in; not resolvable; not always DB friendly; opaque.
  9. Emerging trends in data collection, data sharing, data integration for research, and data citation. Little science to big science. Imagine getting credit for all your digitization efforts!
  10. Example – Jeremy Miller – link to collections instead of each identifier.
  11. ‘target’ is a property of the annotation
  12. For more on GUIDs for upload to iDigBio, see our suggestions / policy at: https://www.idigbio.org/sites/default/files/iDigBio-GUID-Statement20MAR2012.pdf Serve data for your objects. If you are serving data for other institutions, that needs to be clear in the Darwin Core Owner Institution ID, Institution Code, and Collection Code fields. If that institution starts serving its own data, stop serving it for them.
  13. (1) Go through the community process of extending the Darwin Core. (2) Extend uniquely in your community, as a set of terms needed by your community to share data concepts not currently in Darwin Core (an example might be a paleo extension). (3) GBIF extension process, to be able to extend the IPT. (3a) It is possible to create needed extensions: http://vocabularies.gbif.org/node/124372 and http://vocabularies.gbif.org/extensions Note: base your database on your needs. It does make it easier to match (map) if you use standard terms where possible. So, if adding georef fields to your database, try to use the standard terms if they exist.
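
Notes 3, 4, and 7 separate naming an object from locating information about it. As a rough illustration, the Python sketch below reads the scheme off an identifier string and turns a `doi:` URI into the proxy URL from slide 5. The helper names (`scheme_of`, `doi_to_proxy_url`) and the `DOI_PROXY` constant are assumptions made for this example; the proxy address is simply the one shown on the slide.

```python
from typing import Optional

# Proxy resolver address as it appears on slide 5 (assumed still appropriate).
DOI_PROXY = "http://dx.doi.org/"


def scheme_of(identifier: str) -> Optional[str]:
    """Return the lowercased text before the first ':', or None if there is no ':'."""
    scheme, sep, _rest = identifier.partition(":")
    return scheme.lower() if sep else None


def doi_to_proxy_url(uri: str) -> str:
    """Turn 'doi:<name>' (identification) into a proxy URL (resolution)."""
    scheme, _sep, name = uri.partition(":")
    if scheme.lower() != "doi" or not name:
        raise ValueError(f"not a doi: URI: {uri!r}")
    return DOI_PROXY + name


print(scheme_of("urn:isbn:0-395-36341-1"))                # urn
print(scheme_of("954c8760-e1a6-4b4b-ab82-6bf7311c25f3"))  # None
print(doi_to_proxy_url("doi:10.3897/zookeys.209.3135"))
# http://dx.doi.org/10.3897/zookeys.209.3135
```

The `doi:` string can be compared and aggregated without any network access; only the second function commits to a place to look. This check is deliberately naive: a Darwin Core triplet such as flmnh:herb:bbbrc000123 also contains colons, so `scheme_of` would report "flmnh" for it even though that is not a registered URI scheme.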