Addressing the name:meaning drift
challenge in open biodiversity
information environments
Please
@taxonbytes
Nico M. Franz¹, Salvatore A. Anzaldo¹, Edward E. Gilbert¹,
M. Andrew Jansen¹, M. Andrew Johnston¹ & Bertram Ludäscher²
¹ School of Life Sciences, Arizona State University
² iSchool, University of Illinois at Urbana-Champaign
Symposium: Building the Biodiversity Knowledge Graph for Insects – Components, Progress, and Challenges
2016 XXV International Congress of Entomology, Orlando, FL – September 26, 2016 (#ICE2016)
Presentation available @ SlideShare: http://tinyurl.com/franz-et-al-ice-2016
Our biodiversity informatics research program, summarized
• We are no longer just putting articles and monographs on library shelves.
• This is more than 'just technology'; we must develop new systematic theory
to deal with inherently dynamic, open data systems.
• The concept taxonomy approach has practical implications for strengthening
the roles that individual experts play in big biodiversity data environments.
Products – concept taxonomy in theory and in practice
ZooKeys. doi:10.3897/zookeys.528.6001
Semantic Web. doi:10.3233/SW-160220
Biological Theory (in review). doi:10.1101/022145
PloS ONE. doi:10.1371/journal.pone.0118247
Systematics Biodiv. doi:10.1080/14772000.2013.806371
Systematic Biology. doi:10.1093/sysbio/syw023
Biodiversity Data Journal (in review). #6093
Research Ideas and Outcomes (in review). #6302
Premise: We're lucky that insect revisions are not so frequent
"In biology, there are many taxa that are so under-studied that
they are only known from their original description and
none or very few subsequent references […].
The name alone, so long as it is a unique name,
is sufficient to locate all related material."
– David Remsen 2016: 213
Source: Remsen. 2016. The use and limits of scientific names […]. ZooKeys 550: 207–223. doi:10.3897/zookeys.550.9546
Diagnosis:
What happens in dynamic, open systems?
Snapshot of a more frequently revised organismal lineage
Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review)
• 9 schemata for the NA Cleistes/Cleistesiopsis complex (orchids)
• Vertical sections identify taxonomic concept regions
• Colors identify lineages of taxonomic names (epithets) in use
• There is no consensus! Five incongruent schemata are used concurrently
Premise:
If incongruent taxonomies are endorsed
– locally, provisionally, and democratically –
then what is the impact for
aggregated biodiversity data?
Conclusion:
→ Taxonomy becomes a variable
that we need to represent,
and control for
Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review)
Example: the Cleistes use case ("Controlling the taxonomic variable")
• Query: "Where do these orchid
species occur?"
• Same set of 250 orchid specimens,
according to 4 taxonomies.
[Figure: four panels labeled: The 'consensus'; The 'bible'; The (formerly) federal 'standard'; The 'best', latest regional flora. Annotation: "Just bad"]
Expert views
are in conflict
Impact:
Name-based aggregation has created
a novel synthesis that nobody believes in
Solution:
Instead of aggregating
an artificial 'consensus',
build translation services
Expert views
are reconciled
Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review)
Challenges:
How can we redesign aggregation to yield
high-quality biodiversity data packages?
What does this mean for Darwin Core¹
and how we use this aggregation standard?
¹ Wieczorek et al. 2012. Darwin Core: an evolving […]. PLoS ONE 7(1): e29715. doi:10.1371/journal.pone.0029715
Preview of solution with 8 steps
• DwC is insufficient, and part of the problem
Step 7:
# 1: Represent only taxonomic concept labels (TCLs)¹
• Syntax (TCL): taxonomic name [author, year, page] sec. source
¹ Multi-taxonomy input/alignment visualizations generated with Euler/X toolkit: https://github.com/EulerProject/EulerX
Cleistes divaricata
sec. Gregg & Catling 1993
Pogonia
sec. Brown & Wunderlin 1997
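The TCL syntax above can be sketched in code. This is only an illustration: the `TCL` class and the `parse_tcl` helper are hypothetical names (they are not part of Darwin Core or Euler/X), and the sketch assumes that the literal token " sec. " separates the name from its source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TCL:
    """Taxonomic concept label: a name as used according to (sec.) one source."""
    name: str    # taxonomic name, optionally with author/year/page
    source: str  # the treatment that circumscribes the concept

    def __str__(self) -> str:
        return f"{self.name} sec. {self.source}"


def parse_tcl(label: str) -> TCL:
    """Split 'name sec. source' at the first ' sec. ' token (hypothetical helper)."""
    name, sep, source = label.partition(" sec. ")
    if not sep or not name.strip() or not source.strip():
        raise ValueError(f"not a TCL, missing name or 'sec. source': {label!r}")
    return TCL(name.strip(), source.strip())


a = parse_tcl("Cleistes divaricata sec. Gregg & Catling 1993")
b = parse_tcl("Cleistes divaricata sec. Smith et al. 2004")
same_name = a.name == b.name   # the name string is shared
same_label = a == b            # but the labels differ, because the sources differ
```

Keeping the source inside the identity of the label is the point of the design: two usages of one name compare as different until an expert articulation says otherwise.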
# 1: DwC score keeping → TCLs are optional; < 1% realized?
• TCL ~ DwC: nameAccordingTo
• SCAN: 19,722 of nearly 9 million records have TCLs (0.2%)
• Because TCL use is not enforced, the standard is less big data-ready
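A score of this kind could be computed roughly as follows. The sketch assumes records are plain dictionaries keyed by Darwin Core term names; `tcl_coverage` is a hypothetical helper and the toy records are illustrative, not taken from SCAN.

```python
def tcl_coverage(records):
    """Fraction of records whose nameAccordingTo field is non-empty."""
    if not records:
        return 0.0
    with_tcl = sum(
        1 for r in records if (r.get("nameAccordingTo") or "").strip()
    )
    return with_tcl / len(records)


# Toy records; the values are illustrative, not taken from SCAN.
records = [
    {"scientificName": "Cleistes divaricata",
     "nameAccordingTo": "Smith et al. 2004"},
    {"scientificName": "Cleistes divaricata", "nameAccordingTo": ""},
    {"scientificName": "Cleistes bifaria"},
    {"scientificName": "Pogonia", "nameAccordingTo": None},
]
coverage = tcl_coverage(records)

# The slide's own figure, taking "nearly 9 million" as 9,000,000 (an assumption).
scan_share = 19_722 / 9_000_000
```

Missing, empty, and null values are all counted as "no TCL", since each leaves the name without a stated source.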
DwC record with nameAccordingTo (TCL)
(BDJ)
"Who authors GBIF's Backbone?"
https://storify.com/taxonbytes/who-authors-gbif-s-backbone
# 2: Represent each source coherently (Parent-Child relationships)
• Syntax (PC): TCL1 is a child/parent of TCL2 [where TCL1/2 = same source]
Cleistesiopsis bifaria sec. Pans. & de Barr. 2008
is a child of
Cleistesiopsis sec. Pans. & de Barr. 2008
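The same-source constraint on parent-child relations can be made explicit in a small sketch. The `TCL` tuple and the `add_child` function are hypothetical names used only for illustration.

```python
from typing import Dict, NamedTuple


class TCL(NamedTuple):
    name: str
    source: str


def add_child(parents: Dict[TCL, TCL], child: TCL, parent: TCL) -> None:
    """Record 'child is a child of parent'; both must come from one source."""
    if child.source != parent.source:
        raise ValueError("PC relations may only link TCLs of the same source")
    parents[child] = parent


parents: Dict[TCL, TCL] = {}
src = "Pans. & de Barr. 2008"
add_child(parents, TCL("Cleistesiopsis bifaria", src), TCL("Cleistesiopsis", src))
```

Rejecting cross-source links at insertion time keeps each source's hierarchy coherent; relations between sources are left to the articulations introduced in step 4.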
# 2: DwC score keeping → Not (adequately) represented
• PC ~ DwC: genus, family, order (etc.; higherClassification)
• However, higher-level names in DwC are not modeled as TCLs
• Taxonomic coherence of sources cannot be preserved with DwC alone
DwC record with higherClassification
(BDJ)
# 3: Do not force a single hierarchy onto all tip-level TCLs
• Syntax (PC): Tip-level TCL1 , TCL2 , etc. [where TCL1/2 = different sources]
# 3: DwC score keeping → Optional; not (ever?) practiced
• No PC ~ DwC: infra-/specificEpithet only
• Typically, a single, 'unitary' higher-level classification is represented
• Combinations of algorithmic and social practices achieve the single hierarchy
"Who authors GBIF's Backbone?"
https://storify.com/taxonbytes/who-authors-gbif-s-backbone
# 4: Link TCLs via expert-provided RCC–5 articulations
• Syntax (RCC–5): TCL1 {==, >, <, ><, !} TCL2 [where TCL1/2 = diff. sources]
• RCC–5 = Region Connection Calculus
• 14 articulations provided by: http://tinyurl.com/Weakley-Flora-2015
Cleistes bifaria "Coastal Populations" sec. Smith et al. 2004
== (is congruent with)
Cleistesiopsis oricamporum sec. Brown & Pans. 2009
==
Source: Thau, D.M. 2010. Reasoning about taxonomies. Thesis, UC Davis. http://gradworks.proquest.com/3422778.pdf
Region Connection Calculus (semantics: set constraints)
== < > >< !
• Two regions N, M are either:
• congruent (N == M)
• properly inclusive (N < M)
• inversely properly inclusive (N > M)
• overlapping (N >< M)
• exclusive of each other (N ! M)
• RCC–5 articulations answer the query: "can we join regions N and M?"
• Taxonomies have multiple RCC–5 alignable components: nodes (parents,
children), node-associated traits, even node-anchoring specimens
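Read as set constraints, the five relations can be computed directly when the members of two regions are known. The sketch below assumes each region is given as a non-empty Python set of member identifiers; `rcc5` is a hypothetical helper name.

```python
def rcc5(n: set, m: set) -> str:
    """Return the RCC-5 relation of two non-empty regions given as sets."""
    if not n or not m:
        raise ValueError("regions must be non-empty")
    if n == m:
        return "=="   # congruent
    if n < m:
        return "<"    # n is properly included in m
    if n > m:
        return ">"    # n properly includes m
    if n & m:
        return "><"   # overlapping, neither includes the other
    return "!"        # exclusive of each other


examples = {
    "==": rcc5({1, 2}, {1, 2}),
    "<": rcc5({1}, {1, 2}),
    ">": rcc5({1, 2}, {1}),
    "><": rcc5({1, 2}, {2, 3}),
    "!": rcc5({1}, {2}),
}
```

In practice the full membership of a taxonomic concept is rarely enumerable, which is why the articulations are supplied by experts rather than computed this way.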
# 4: DwC score keeping → Not (adequately) represented
• RCC–5 ~ DwC: accepted(Scientific)Name(Usage), relationshipOfResource,
taxonomicStatus (etc.; nomenclatural relationships)
• Nomenclatural relationships are type-focused, not region-focused
• "Taxonomic Concept Schema" → yes! (however: http://www.tdwg.org/standards/117)
Source: Vane-Wright. 2003. Indifferent philosophy versus […]. Syst. Biodiv. 1: 3–11. doi:10.1017/S1477200003001063
Example:
Milkweed butterflies
Oscillating meanings of the epithet hyalites – 1911 to 2003
Phenotypic diversity
Type-anchored name identity relations
# 5: Identify occurrence records only to TCLs
Records: EKY39235, MTSU003611, NCSC00040204, …
Records: BOON8098, CLEMS0061133, WILLI39399, …
Records: GMUF-0039355, IBE006808, USCH58399, …
Records: CONV0006268, MDKY00006482, NCU00038930, …
Records: BRYV0023582, BRYV0023584, KHD00032030, MISS0016604, MMNS000227, NCSC00040206, USMS_000002923, USMS_000002924, VSC0053223, VSC0065528, …
Records: ARIZ393087, DBG39049, USCH51217, …
Records: NCU00040710, USCH96248, VSC0053218, …
Records: CLEMS0012881, FUGR0003293, GA023130, …
Records: BOON8100, NCSC00040210, SJNM45487, …
Records: GA023144, LSU00012494, MISS0016608, …
Records: IBE006810, IND-0012374, MMNS000227
Records: NY8654
• Syntax (ID): Occurrence / organism is identified to TCL
"CLEMS0012881"
is identified to
Cleistes divaricata sec. Smith et al. 2004
[additional ID metadata]
DwC record with Identification metadata
(BDJ)
# 5: DwC score keeping → ID metadata optional; > 50% realized
• ID ~ DwC: Identification, (date)identified(By), identificationReference
• SCAN: 4,715,277 of nearly 9 million records have ID metadata (52.5%)
• Enforcing ID metadata would still also require the use of TCLs
# 6: Generate comprehensive, consistent RCC–5 alignments
• Euler/X is a toolkit that infers logically consistent RCC–5 alignments
• Value-added: MIR – set of Maximally Informative Relations containing
the RCC–5 articulation for every possible TCL pair → scalability
Reasoner inference
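Why a reasoner can fill in articulations that no expert stated can be illustrated with the RCC-5 composition table: given TCL1 R1 TCL2 and TCL2 R2 TCL3, which relations remain possible between TCL1 and TCL3? The sketch below derives the table by brute force over the seven Venn cells of three sets. It is not how Euler/X is implemented; all function names are hypothetical.

```python
from itertools import product

RELS = ("==", "<", ">", "><", "!")


def rel(cells, x, y):
    """RCC-5 relation between sets x and y, given the non-empty Venn cells.

    Each cell is a bitmask saying which of the three sets contain it.
    """
    both = any(c & x and c & y for c in cells)
    x_only = any(c & x and not c & y for c in cells)
    y_only = any(c & y and not c & x for c in cells)
    if not both:
        return "!"
    if x_only and y_only:
        return "><"
    if x_only:
        return ">"
    if y_only:
        return "<"
    return "=="


def composition_table():
    """Map (A rel B, B rel C) to the set of possible A rel C relations."""
    A, B, C = 1, 2, 4
    table = {pair: set() for pair in product(RELS, RELS)}
    # Each of the 7 Venn cells of A, B, C is either empty or not.
    for flags in product((False, True), repeat=7):
        cells = [c for c, on in zip(range(1, 8), flags) if on]
        # All three regions must be non-empty.
        if not all(any(c & s for c in cells) for s in (A, B, C)):
            continue
        table[rel(cells, A, B), rel(cells, B, C)].add(rel(cells, A, C))
    return table


TABLE = composition_table()
```

Some compositions are fully determined (for example, "<" followed by "<" leaves only "<"), while others leave several or all five relations open; a complete alignment workflow has to resolve or report such ambiguity.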
# 7: Joining occurrence-to-TCL identifications & RCC–5 alignments
Records:
BOON8098, CLEMS0061133, CONV0006268, EKY39235
GMUF-0039355, IBE006808, IBE006810, IND-0012374
MDKY00006482, MMNS000227, MTSU003611, NCSC00040204
NCU00038930, NY8654, USCH58399, WILLI39399
…
Records:
ARIZ393087, BRYV0023582, BRYV0023584, DBG39049
KHD00032030, MISS0016604, MMNS00022, NCSC00040206
USMS_000002923, USMS_000002924, VSC0053223, VSC0065528
…
Records:
BOON8100, CLEMS0012881, FUGR0003293
GA023130, GA023144, LSU00012494
MISS0016608, NCSC00040210, NCU00040710
SJNM45487, USCH96248, VSC0053218
…
• Specimen integration is fully driven by TCL-to-TCL RCC–5 signals
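The join can be sketched as follows, assuming identifications and articulations are available as simple Python structures. The TCL labels, record ids, and the `translate` function are hypothetical, and treating "==" and "<" as certain while flagging ">" and "><" as ambiguous is one reasonable reading of the RCC–5 semantics, not a statement of the authors' implementation.

```python
from collections import defaultdict

# Occurrence-to-TCL identifications (toy data; labels and ids are hypothetical).
identified_to = {
    "rec1": "A sec. 2004",
    "rec2": "A sec. 2004",
    "rec3": "B sec. 2004",
}

# Articulations from the MIR: (source TCL, RCC-5 relation, target TCL).
articulations = [
    ("A sec. 2004", "<", "X sec. 2015"),
    ("B sec. 2004", "><", "X sec. 2015"),
    ("B sec. 2004", "><", "Y sec. 2015"),
]


def translate(identified_to, articulations):
    """Map records to target TCLs, separating certain from ambiguous cases."""
    certain, ambiguous = defaultdict(set), defaultdict(set)
    for record, tcl in identified_to.items():
        for src, relation, target in articulations:
            if src != tcl:
                continue
            if relation in ("==", "<"):
                # Source region lies entirely inside the target region.
                certain[target].add(record)
            elif relation in (">", "><"):
                # Only part of the source region falls in the target.
                ambiguous[target].add(record)
            # "!" contributes nothing: the regions share no members.
    return dict(certain), dict(ambiguous)


certain, ambiguous = translate(identified_to, articulations)
```

Records in the ambiguous group cannot be placed by logic alone; they would need re-identification against the target treatment.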
Impact:
"Please select your preference (A – D);
we can perform all translations"
# 8: "Do you trust us now?" Aggregation as a translational service
• We can now respond to queries such as:
• "Show all specimens identified to the taxonomic name Cleistes divaricata"
• Returns many records → resolves incongruent lineage of name usages
• "Now show specimens with the TCL Cleistesiopsis divaricata sec. Weakley 2015"
• Returns record subset → resolving only one narrowly circumscribed concept
• "Now show specimens identified to the TCL Cleistes divaricata sec. RAB 1968,
yet translated into the more granular TCLs sec. Weakley 2015"
• Returns (again) many records, yet represents and contrasts two treatments,
as opposed to providing the ambiguous lineage view (above)
• "Show all specimens with ambiguous 2010/2015 TCL identifications…" (etc.)
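The difference between the first two queries can be shown with a toy example. The record ids and their identifications are hypothetical (they are not from the Cleistes data set), and `by_name` and `by_tcl` are illustrative helper names.

```python
# Toy identifications; record ids are hypothetical, not from the Cleistes data set.
records = [
    {"id": "r1", "name": "Cleistes divaricata", "sec": "RAB 1968"},
    {"id": "r2", "name": "Cleistes divaricata", "sec": "Smith et al. 2004"},
    {"id": "r3", "name": "Cleistesiopsis divaricata", "sec": "Weakley 2015"},
    {"id": "r4", "name": "Cleistes divaricata", "sec": None},  # no TCL recorded
]


def by_name(records, name):
    """Name-based query: every usage of the name, whatever its source."""
    return sorted(r["id"] for r in records if r["name"] == name)


def by_tcl(records, name, sec):
    """TCL-based query: only the usage according to one source."""
    return sorted(
        r["id"] for r in records if r["name"] == name and r["sec"] == sec
    )


name_hits = by_name(records, "Cleistes divaricata")
tcl_hits = by_tcl(records, "Cleistesiopsis divaricata", "Weakley 2015")
```

The name-based result mixes usages from different sources, including one with no source at all; the TCL-based result is narrower but its meaning is unambiguous.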
Conclusions – designing trusted biodiversity data services
• The Darwin Core standard for aggregating biodiversity data:
(1) Has under-utilized options for better representing taxonomic expertise
(2) Is part of a design paradigm that undermines the plurality of expertise
• We are developing new solutions – including TCLs, PC relations, RCC–5, and
scalable logic applications – that realize data aggregation via translational
services, without disrupting the formation of expert-licensed, high-quality
biodiversity data packages
• All of us – not just aggregators – "own" the responsibility of designing
systems where the plurality of taxonomic expertise is fairly accommodated
Acknowledgments & links to products
• Cleistes use case: Alan Weakley (UNC)
• Euler/X toolkit: Shizhuo Yu (UC Davis)
• Data trajectories: Beckett Sterner (ASU)
• OBKMS design: Viktor Senderov (Pensoft)
• NSF DEB–1155984, DBI–1342595 (PI Franz)
• NSF IIS–118088, DBI–1147273 (PI Ludäscher)
• Euler/X code @ https://github.com/EulerProject/EulerX
• Franz et al. 2016. Two influential primate classifications logically aligned.
Systematic Biology 65(4): 561–582. doi:10.1093/sysbio/syw023
Interested in exploring
multi-taxonomy and/or
-phylogeny alignments?
Please contact me.
nico.franz@asu.edu
@taxonbytes
https://biokic.asu.edu/

Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 

Franz et al ice 2016 addressing the name meaning drift challenge in open ended biodiversity information environments

  • 1. Addressing the name:meaning drift challenge in open biodiversity information environments Please @taxonbytes Nico M. Franz1 , Salvatore A. Anzaldo1, Edward E. Gilbert1, M. Andrew Jansen1, M. Andrew Johnston1 & Bertram Ludäscher2 1 School of Life Sciences, Arizona State University 2 iSchool, University of Illinois at Urbana-Champaign Symposium: Building the Biodiversity Knowledge Graph for Insects – Components, Progress, and Challenges 2016 XXV International Congress of Entomology, Orlando, FL – September 26, 2016 (#ICE2016) Presentation available @ SlideShare: http://tinyurl.com/franz-et-al-ice-2016
  • 2. Our biodiversity informatics research program, summarized • We are no longer just putting articles and monographs on library shelves.
  • 3. Our biodiversity informatics research program, summarized • We are no longer just putting articles and monographs on library shelves. • This is more than 'just technology'; we must develop new systematic theory to deal with inherently dynamic, open data systems.
  • 4. Our biodiversity informatics research program, summarized • We are no longer just putting articles and monographs on library shelves. • This is more than 'just technology'; we must develop new systematic theory to deal with inherently dynamic, open data systems. • The concept taxonomy approach has practical implications for strengthening the roles that individual experts play in big biodiversity data environments.
  • 5. Products – concept taxonomy in theory and in practice ZooKeys. doi:10.3897/zookeys.528.6001 Semantic Web. doi:10.3233/SW-160220 Biological Theory (in review). doi:10.1101/022145 PLoS ONE. doi:10.1371/journal.pone.0118247 Systematics Biodiv. doi:10.1080/14772000.2013.806371 Systematic Biology. doi:10.1093/sysbio/syw023 Biodiversity Data Journal (in review). #6093 Research Ideas and Outcomes (in review). #6302
  • 6. Premise: We're lucky that insect revisions are not so frequent "In biology, there are many taxa that are so under-studied that they are only known from their original description and none or very few subsequent references […]. The name alone, so long as it is a unique name, is sufficient to locate all related material." – David Remsen 2016: 213 Source: Remsen. 2016. The use and limits of scientific names […]. ZooKeys 550: 207–223. doi:10.3897/zookeys.550.9546
  • 7. Diagnosis: What happens in dynamic, open systems?
  • 8. Snapshot of a more frequently revised organismal lineage Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review) • 9 schemata for the NA Cleistes/Cleistesiopsis complex (orchids)
  • 9. Snapshot of a more frequently revised organismal lineage Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review) • 9 schemata for the NA Cleistes/Cleistesiopsis complex (orchids) • Vertical sections identify taxonomic concept regions
  • 10. Snapshot of a more frequently revised organismal lineage Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review) • 9 schemata for the NA Cleistes/Cleistesiopsis complex (orchids) • Vertical sections identify taxonomic concept regions • Colors identify lineages of taxonomic names (epithets) in use
  • 11. Snapshot of a more frequently revised organismal lineage Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review) • 9 schemata for the NA Cleistes/Cleistesiopsis complex (orchids) • Vertical sections identify taxonomic concept regions • Colors identify lineages of taxonomic names (epithets) in use • There is no consensus! Five incongruent schemata are used concurrently
  • 12. Premise: If incongruent taxonomies are endorsed – locally, provisionally, and democratically – then what is the impact for aggregated biodiversity data?
  • 13. Conclusion:  Taxonomy becomes a variable that we need to represent, and control for
  • 14. Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review) The 'consensus' • Query: "Where do these orchid species occur?" • Same set of 250 orchid specimens, according to 4 taxonomies. "Controlling the taxonomic variable" Example: the Cleistes use case
  • 15. Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review) The 'consensus' The 'bible' "Controlling the taxonomic variable" • Query: "Where do these orchid species occur?" • Same set of 250 orchid specimens, according to 4 taxonomies. Example: the Cleistes use case
  • 16. Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review) The 'consensus' The 'bible' The (formerly) federal 'standard' "Controlling the taxonomic variable"
  • 17. Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review) The 'consensus' The 'bible' The (formerly) federal 'standard' The 'best', latest regional flora "Controlling the taxonomic variable"
  • 18. Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review) The 'consensus' The 'bible' The (formerly) federal 'standard' The 'best', latest regional flora "Controlling the taxonomic variable" Expert views are in conflict
  • 19. Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review) The 'consensus' The 'bible' The (formerly) federal 'standard' The 'best', latest regional flora "Controlling the taxonomic variable" Expert views are in conflict "Just bad"
  • 20. Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review) The 'consensus' The 'bible' The (formerly) federal 'standard' The 'best', latest regional flora Impact: Name-based aggregation has created a novel synthesis that nobody believes in "Controlling the taxonomic variable" "Just bad"
  • 21. Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review) The 'consensus' The 'bible' The (formerly) federal 'standard' The 'best', latest regional flora "Controlling the taxonomic variable" "Just bad" Expert views are in conflict Solution: Instead of aggregating an artificial 'consensus', …
  • 22. Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review) The 'consensus' The 'bible' The (formerly) federal 'standard' The 'best', latest regional flora "Controlling the taxonomic variable" "Just bad" Expert views are reconciled Solution: Instead of aggregating an artificial 'consensus', build translation services
  • 23. Challenges: How can we redesign aggregation to yield high-quality biodiversity data packages?
  • 24. Challenges: How can we redesign aggregation to yield high-quality biodiversity data packages? What does this mean for Darwin Core1 and how we use this aggregation standard? 1 Wieczorek et al. 2012. Darwin Core: an evolving […]. PLoS ONE 7(1): e29715. doi:10.1371/journal.pone.0029715
  • 25. Preview of solution with 8 steps • DwC is insufficient, and part of the problem Step 7:
  • 26. # 1: Represent only taxonomic concept labels (TCLs) 1 • Syntax (TCL): taxonomic name [author, year, page] sec. source 1 Multi-taxonomy input/alignment visualizations generated with Euler/X toolkit: https://github.com/EulerProject/EulerX Cleistes divaricata sec. Gregg & Catling 1993 Pogonia sec. Brown & Wunderlin 1997
  • 27. # 1: DwC score keeping → TCLs are optional; < 1% realized? • TCL ~ DwC: nameAccordingTo • SCAN: 19,722 of nearly 9 million records have TCLs (0.2%) • Lack of enforcement of TCL use makes the standard less big-data-ready DwC record with nameAccordingTo (TCL) (BDJ) "Who authors GBIF's Backbone?" https://storify.com/taxonbytes/who-authors-gbif-s-backbone
  • 28. # 2: Represent each source coherently (Parent-Child relationships) • Syntax (PC): TCL1 is a child/parent of TCL2 [where TCL1/2 = same source] Cleistesiopsis bifaria sec. Pans. & de Barr. 2008 is a child of Cleistesiopsis sec. Pans. & de Barr. 2008
  • 29. # 2: DwC score keeping → Not (adequately) represented • PC ~ DwC: genus, family, order (etc.; higherClassification) • However, higher-level names in DwC are not modeled as TCLs • Taxonomic coherence of sources cannot be preserved with DwC alone DwC record with higherClassification (BDJ)
  • 30. # 3: Do not force a single hierarchy onto all tip-level TCLs • Syntax (PC): Tip-level TCL1 , TCL2 , etc. [where TCL1/2 = different sources]
  • 31. # 3: DwC score keeping → Optional; not (ever?) practiced • No PC ~ DwC: infra-/specificEpithet only • Typically, a single, 'unitary' higher-level classification is represented • Combinations of algorithmic and social practices achieve the single hierarchy "Who authors GBIF's Backbone?" https://storify.com/taxonbytes/who-authors-gbif-s-backbone
  • 32. # 4: Link TCLs via expert-provided RCC–5 articulations • Syntax (RCC–5): TCL1 {==, >, <, ><, !} TCL2 [where TCL1/2 = diff. sources] • RCC–5 = Region Connection Calculus • 14 articulations provided by: http://tinyurl.com/Weakley-Flora-2015 • Example: Cleistes bifaria "Coastal Populations" sec. Smith et al. 2004 == (is congruent with) Cleistesiopsis oricamporum sec. Brown & Pans. 2009
  • 33. Source: Thau, D.M. 2010. Reasoning about taxonomies. Thesis, UC Davis. http://gradworks.proquest.com/3422778.pdf Region Connection Calculus (semantics: set constraints) == < > >< ! • Two regions N, M are either: • congruent (N == M) • properly inclusive (N < M) • inversely properly inclusive (N > M) • overlapping (N >< M) • exclusive of each other (N ! M)
  • 34. Source: Thau, D.M. 2010. Reasoning about taxonomies. Thesis, UC Davis. http://gradworks.proquest.com/3422778.pdf Region Connection Calculus (semantics: set constraints) == < > >< ! • Two regions N, M are either: • congruent (N == M) • properly inclusive (N < M) • inversely properly inclusive (N > M) • overlapping (N >< M) • exclusive of each other (N ! M) • RCC–5 articulations answer the query: "can we join regions N and M?" • Taxonomies have multiple RCC–5 alignable components: nodes (parents, children), node-associated traits, even node-anchoring specimens
  • 35. # 4: DwC score keeping → Not (adequately) represented • RCC–5 ~ DwC: accepted(Scientific)Name(Usage), relationshipOfResource, taxonomicStatus (etc.; nomenclatural relationships) • Nomenclatural relationships are type-focused, not region-focused • "Taxonomic Concept Schema" → yes! (however: http://www.tdwg.org/standards/117) Source: Vane-Wright. 2003. Indifferent philosophy versus […]. Syst. Biodiv. 1: 3–11. doi:10.1017/S1477200003001063 Example: Milkweed butterflies
  • 36. Oscillating meanings of the epithet hyalites – 1911 to 2003 Phenotypic diversity; type-anchored name identity relations Source: Vane-Wright. 2003. Indifferent philosophy versus […]. Syst. Biodiv. 1: 3–11. doi:10.1017/S1477200003001063
  • 37. # 5: Identify occurrence records only to TCLs Records: EKY39235 MTSU003611 NCSC00040204 … Records: BOON8098 CLEMS0061133 WILLI39399 … Records: GMUF-0039355 IBE006808 USCH58399 … Records: CONV0006268 MDKY00006482 NCU00038930 … Records: BRYV0023582, BRYV0023584 KHD00032030, MISS0016604 MMNS000227, NCSC00040206 USMS_000002923, USMS_000002924 VSC0053223, VSC0065528 … Records: ARIZ393087 DBG39049 USCH51217 … Records: NCU00040710 USCH96248 VSC0053218 … Records: CLEMS0012881 FUGR0003293 GA023130 … Records: BOON8100 NCSC00040210 SJNM45487 … Records: GA023144 LSU00012494 MISS0016608 … Records: IBE006810, IND-0012374, MMNS000227 Records: NY8654 • Syntax (ID): Occurrence / organism is identified to TCL "CLEMS0012881" is identified to Cleistes divaricata sec. Smith et al. 2004 [additional ID metadata]
  • 38. # 5: DwC score keeping → ID metadata optional; > 50% realized • ID ~ DwC: Identification, (date)identified(By), identificationReference • SCAN: 4,715,277 of nearly 9 million records have ID metadata (52.5%) • Enforcement… still also require use of TCLs DwC record with Identification metadata (BDJ)
  • 39. # 6: Generate comprehensive, consistent RCC–5 alignments • Euler/X is a toolkit that infers logically consistent RCC–5 alignments
  • 40. # 6: Generate comprehensive, consistent RCC–5 alignments • Value-added: MIR – set of Maximally Informative Relations containing the RCC–5 articulation for every possible TCL pair → scalability Reasoner inference
  • 41. # 7: Joining occurrence-to-TCL identifications & RCC–5 alignments Records: BOON8098, CLEMS0061133, CONV0006268, EKY39235 GMUF-0039355, IBE006808, IBE006810, IND-0012374 MDKY00006482, MMNS000227, MTSU003611, NCSC00040204 NCU00038930, NY8654, USCH58399, WILLI39399 … Records: ARIZ393087, BRYV0023582, BRYV0023584, DBG39049 KHD00032030, MISS0016604, MMNS00022, NCSC00040206 USMS_000002923, USMS_000002924, VSC0053223, VSC0065528 … Records: BOON8100, CLEMS0012881, FUGR0003293 GA023130, GA023144, LSU00012494 MISS0016608, NCSC00040210, NCU00040710 SJNM45487, USCH96248, VSC0053218 … • Specimen integration is fully driven by TCL-to-TCL RCC–5 signals
  • 42. Source: Franz et al. 2016. Controlling the taxonomic variable […]. Research Ideas and Outcomes (RIO). (In Review) The 'consensus' The 'bible' The (formerly) federal 'standard' The 'best', latest regional flora "Controlling the taxonomic variable" Impact: "Please select your preference (A – D); we can perform all translations"
  • 43. # 8: "Do you trust us now?" Aggregation as a translational service • We can now respond to queries such as: • "Show all specimens identified to the taxonomic name Cleistes divaricata" • Returns many records → resolves incongruent lineage of name usages
  • 44. # 8: "Do you trust us now?" Aggregation as a translational service • We can now respond to queries such as: • "Show all specimens identified to the taxonomic name Cleistes divaricata" • Returns many records → resolves incongruent lineage of name usages • "Now show specimens with the TCL Cleistesiopsis divaricata sec. Weakley 2015" • Returns record subset → resolving only one narrowly circumscribed concept
  • 45. # 8: "Do you trust us now?" Aggregation as a translational service • We can now respond to queries such as: • "Show all specimens identified to the taxonomic name Cleistes divaricata" • Returns many records → resolves incongruent lineage of name usages • "Now show specimens with the TCL Cleistesiopsis divaricata sec. Weakley 2015" • Returns record subset → resolving only one narrowly circumscribed concept • "Now show specimens identified to the TCL Cleistes divaricata sec. RAB 1968, yet translated into the more granular TCLs sec. Weakley 2015" • Returns (again) many records, yet represents and contrasts two treatments, as opposed to providing the ambiguous lineage view (above) • "Show all specimens with ambiguous 2010/2015 TCL identifications…" (etc.)
  • 46. Conclusions – designing trusted biodiversity data services • The Darwin Core standard for aggregating biodiversity data: (1) Has under-utilized options for better representing taxonomic expertise (2) Is part of a design paradigm that undermines the plurality of expertise
  • 47. • The Darwin Core standard for aggregating biodiversity data: (1) Has under-utilized options for better representing taxonomic expertise (2) Is part of a design paradigm that undermines the plurality of expertise • We are developing new solutions – including TCLs, PC relations, RCC–5, and scalable logic applications – that realize data aggregation via translational services, without disrupting the formation of expert-licensed, high-quality biodiversity data packages Conclusions – designing trusted biodiversity data services
  • 48. • The Darwin Core standard for aggregating biodiversity data: (1) Has under-utilized options for better representing taxonomic expertise (2) Is part of a design paradigm that undermines the plurality of expertise • We are developing new solutions – including TCLs, PC relations, RCC–5, and scalable logic applications – that realize data aggregation via translational services, without disrupting the formation of expert-licensed, high-quality biodiversity data packages • All of us – not just aggregators – "own" the responsibility of designing systems where the plurality of taxonomic expertise is fairly accommodated Conclusions – designing trusted biodiversity data services
  • 49. Acknowledgments & links to products • Cleistes use case: Alan Weakley (UNC) • Euler/X toolkit: Shizhuo Yu (UC Davis) • Data trajectories: Beckett Sterner (ASU) • OBKMS design: Viktor Senderov (Pensoft) • NSF DEB–1155984, DBI–1342595 (PI Franz) • NSF IIS–118088, DBI–1147273 (PI Ludäscher) • Euler/X code @ https://github.com/EulerProject/EulerX • Franz et al. 2016. Two influential primate classifications logically aligned. Systematic Biology 65(4): 561–582.
  • 50. Interested in exploring multi-taxonomy and/or -phylogeny alignments? Please contact me. nico.franz@asu.edu @taxonbytes https://biokic.asu.edu/
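
Steps 1, 4, 5 and 7 of the slides above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Euler/X toolkit and not the authors' implementation: the names TCL, parse_tcl, rcc5 and translate are hypothetical, regions are modeled as plain sets, and the record ID "EXAMPLE-0001" is a placeholder. The single articulation used is the one quoted on the step 4 slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TCL:
    """Taxonomic concept label: a name as used by ('sec.') one source."""
    name: str
    source: str

    def __str__(self) -> str:
        return f"{self.name} sec. {self.source}"


def parse_tcl(label: str) -> TCL:
    """Split 'name sec. source' (the step 1 syntax) into its two parts."""
    name, sep, source = label.partition(" sec. ")
    if not sep or not name.strip() or not source.strip():
        raise ValueError(f"not a taxonomic concept label: {label!r}")
    return TCL(name.strip(), source.strip())


def rcc5(n, m) -> str:
    """Return the RCC-5 symbol for two non-empty regions modeled as sets."""
    if not n or not m:
        raise ValueError("RCC-5 regions must be non-empty")
    if n == m:
        return "=="  # congruent
    if n < m:
        return "<"   # n is properly included in m
    if n > m:
        return ">"   # n properly includes m
    if n & m:
        return "><"  # overlapping
    return "!"       # exclusive of each other


def translate(identifications, articulations, target_source):
    """Re-express record identifications in the TCLs of target_source.

    identifications: {record_id: TCL}
    articulations:   {(TCL, TCL): RCC-5 symbol}, read left to right
    Returns {record_id: [(symbol, target TCL), ...]}, skipping '!' pairs.
    """
    result = {}
    for record_id, tcl in identifications.items():
        result[record_id] = [
            (symbol, right)
            for (left, right), symbol in articulations.items()
            if left == tcl and right.source == target_source and symbol != "!"
        ]
    return result


# Articulation quoted on the step 4 slide.
coastal = parse_tcl('Cleistes bifaria "Coastal Populations" sec. Smith et al. 2004')
oricamporum = parse_tcl("Cleistesiopsis oricamporum sec. Brown & Pans. 2009")
articulations = {(coastal, oricamporum): "=="}

# Hypothetical identification: the record ID is a placeholder, not real data.
identifications = {"EXAMPLE-0001": coastal}

translated = translate(identifications, articulations, "Brown & Pans. 2009")
for record_id, targets in translated.items():
    for symbol, target in targets:
        print(f"{record_id}: {identifications[record_id]} {symbol} {target}")
```

Keying both the identifications and the articulations on full TCLs, rather than on bare names, is what lets the same record be reported under whichever source a user prefers; a name-only key would merge the incongruent usages again.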

Editor's notes

  • Slides 1–31 and 34–45: The more one looks, the more complicated it gets. Notice also the node labeling, or lack thereof.
  • Slides 32–33: The simple semantics of RCC-5 makes this a rather generic vocabulary for representing advancement in phylogenetic knowledge. At the same time, the onus is on the phylogeneticists to apply the articulations in such ways that the desired query services are actually obtained.