
ASSESSMENTS-Taxonomic-Assessments-Javier


  1. TAXONOMIC ASSESSMENTS
       Data Cleaning and Data Publishing Workshop 2013, 18-22 February, Nairobi, Kenya
       Javier Otegui (@jotegui)
  2. TAXONOMY – WHAT IS IT?
       • What is taxonomy?
           - CBD: “Taxonomy is the science of naming, describing and classifying organisms and includes all plants, animals and microorganisms of the world”
           - Using morphological, behavioral, genetic and biochemical observations, taxonomists identify, describe and arrange species into classifications, including those that are new to science.
       • Taxonomy is related to:
           - The identification of an organism
           - Placing the organism in context with the rest of living organisms
  3. TAXONOMY – TAXONOMIC NAMES
       • Taxonomy is based on names
       • Humans have always given names
       • Binomial nomenclature
       • Define individuals and groups
       • Each name defines a taxon
  4. TAXONOMY - HIERARCHIES
       • Organization and classification of organisms
       • According to common features
       • Taxonomic classification
       Image source: http://wp.lps.org/jbenson2/blog/2012/01/18/january-18-taxonomy-chart-lab
  5. TAXONOMY – NAMES AND TAXONOMIES
       • Taxonomy has a strong subjective component
       • Classifications depend on the expertise and point of view of the specialist
       • Frequent cases of:
           - Name removals
           - Taxon splits
           - Taxon merges
           - Different organizations according to different features
       • Some cases…
  6. TAXONOMY - SYNONYMY
       • Two different names are applied to the same organism
       • An expert argues that two originally different taxa are the same
       • Generally one name remains; the other is considered a synonym and is no longer valid
       Example: Antilocapra americana Ord, 1815 and Antilocapra anteflexa Gray, 1855
       Photo: Arthur Chapman
  8. TAXONOMY - HOMONYMY
       • The same name is applied to two different organisms
       • A new description uses an “already taken” name
       • Generally, the oldest name prevails and the newest has to change
       Example: Echidna Forster, 1777 and Echidna Cuvier, 1797
       Photos: David R; Petr Baum
  10. TAXONOMY - HOMONYMY (example resolved)
       • The same name is applied to two different organisms
       • A new description uses an “already taken” name
       • Generally, the oldest name prevails and the newest has to change
       Example: the newer homonym Echidna Cuvier, 1797 is replaced by Tachyglossus Illiger, 1811
       Photo: Petr Baum
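The homonymy case above can be detected mechanically once authorship is stored next to the name. The following is a minimal sketch, assuming names are available as (name, authorship) pairs; the function name `find_homonyms` and the data layout are illustrative choices, not part of the original slides.

```python
from collections import defaultdict

# (name, authorship) pairs taken from the slide examples.
names = [
    ("Echidna", "Forster, 1777"),
    ("Echidna", "Cuvier, 1797"),
    ("Tachyglossus", "Illiger, 1811"),
]


def find_homonyms(name_records):
    """Group authorships by bare name and keep names that have more than one."""
    by_name = defaultdict(set)
    for name, authorship in name_records:
        by_name[name].add(authorship)
    return {name: sorted(auths) for name, auths in by_name.items() if len(auths) > 1}


homonyms = find_homonyms(names)
print(homonyms)
```

A bare name that maps to several authorships is only a candidate homonym; deciding which one a given record refers to still needs context such as higher taxa or the identifier's notes.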
  11. TAXONOMY – ALTERNATE CLASSIFICATIONS
       • Taxonomic classifications are subjective
       • Based on common features
       • Different experts select different features
       • Scientific names might remain the same
       • Higher level taxa or groups might differ
       • See example…
  12. TAXONOMY – ALTERNATE CLASSIFICATIONS (figure only; no slide text)
  13. TAXONOMY – NAME VS CONCEPT
       • Because of these issues, taxonomic names alone are not effective identifiers
       • New term: taxon concept
       • Name: a concatenation of characters
       • Concept: name + context
       • Even if the name is the same, the concept is different when it applies to different organisms
  14. TAXONOMY - STANDARDS (slides 14 to 16)
       • Taxonomic names: scientific name and all higher taxa
       • Taxon concept: taxonConceptID, nameAccordingTo, namePublishedIn…
           - Source in which the specific taxon concept circumscription is defined or implied
           - For taxa that result from identifications, a reference to the keys, monographs, experts and other sources should be given
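To make the name versus concept distinction concrete, here is a small sketch of two records that share a name but cite different sources. The keys are the term names listed on the slide; every value (the checklist titles and the concept identifiers) is made up for illustration, and the helper functions are hypothetical.

```python
# Two records with the same scientific name but different taxon concepts.
# All values below are invented for illustration only.
record_a = {
    "scientificName": "Antilocapra americana Ord, 1815",
    "nameAccordingTo": "Hypothetical checklist A, 2010",
    "taxonConceptID": "concept-A-001",
}
record_b = {
    "scientificName": "Antilocapra americana Ord, 1815",
    "nameAccordingTo": "Hypothetical checklist B, 1990",
    "taxonConceptID": "concept-B-042",
}


def same_name(a, b):
    """Names match when the character strings are identical."""
    return a["scientificName"] == b["scientificName"]


def same_concept(a, b):
    """Concepts match only when the name and its source both agree."""
    return same_name(a, b) and a["nameAccordingTo"] == b["nameAccordingTo"]


print(same_name(record_a, record_b))
print(same_concept(record_a, record_b))
```

Comparing on the source as well as the name is one simple way to keep two circumscriptions apart; a real dataset might compare concept identifiers instead.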
  17. NOISE - MISSPELLINGS
       • One of the most common issues
       • Random alteration of one or more characters in a name
       • Possibilities:
           - Purely accidental
           - Due to lack of knowledge
       • Tend to appear at the time of digitization
  18. NOISE - MISSPELLINGS
       Example variants of one name: Pipistrellus, Pipistrelus, Pippistrellus, Pipistrella, Pippistrela, …
       Photo: Barracuda1983
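Misspellings like the variants above can often be caught by approximate string matching against a reference list. This is a minimal sketch using `difflib.get_close_matches` from the Python standard library; the reference list (including the extra genera Myotis and Plecotus), the function name `suggest_correction` and the 0.75 cutoff are assumptions made for the example, not values from the slides.

```python
import difflib

# Hypothetical reference list; only Pipistrellus comes from the slide.
reference_names = ["Pipistrellus", "Myotis", "Plecotus"]


def suggest_correction(name, reference, cutoff=0.75):
    """Return the closest reference name, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(name, reference, n=1, cutoff=cutoff)
    return matches[0] if matches else None


variants = ["Pipistrelus", "Pippistrellus", "Pipistrella", "Pippistrela"]
suggestions = {variant: suggest_correction(variant, reference_names) for variant in variants}
for variant, suggestion in suggestions.items():
    print(variant, "->", suggestion)
```

Suggestions of this kind should be reviewed before they are applied: two valid names can differ by a single character, so a low cutoff may merge distinct taxa.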
  19. NOISE – MISIDENTIFICATIONS & EMPTINESS
       • Misidentification
           - A more obscure type of error
           - A taxon is wrongly identified
           - The only way to solve it is close examination by an expert taxonomist
           - Might not be resolvable at all
       • Emptiness
           - Seriousness depends on the missing level(s)
           - Importance decreases as taxonomic rank increases
           - Scientific name missing?
           - Special cases: homonymies, synonymies…
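Emptiness is straightforward to measure per record. The sketch below lists which levels of a record are blank; the rank list, the key names and the function `missing_ranks` are assumptions chosen for the example rather than anything prescribed by the slides.

```python
# Ranks checked, from highest to lowest; this particular list is an assumption.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "scientificName"]


def missing_ranks(record):
    """Return the ranks that are absent, None, or blank in a record."""
    return [rank for rank in RANKS if not (record.get(rank) or "").strip()]


record = {
    "kingdom": "Animalia",
    "phylum": "Chordata",
    "class": "Mammalia",
    "order": "",        # blank value
    "family": None,     # missing value
    "genus": "Antilocapra",
    "scientificName": "Antilocapra americana",
}

gaps = missing_ranks(record)
print(gaps)
```

Since the slide notes that importance decreases as rank increases, a missing `scientificName` would deserve more attention than a missing `kingdom`; the order of `RANKS` lets a caller rank the gaps accordingly.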
  20. NOISE – NATURE OF TAXONOMY
       • Not defining the taxonomy used
           - Can have the same effect as having only the scientific name
           - We might complete the hierarchy, but how reliable would it be?
           - Provide the taxonomy employed (taxonomic concept)
           - Use identification qualifiers: “Sensu Otegui, 2013”, or “Sensu Biologia Centrali Americana”
       • Synonymies and homonymies
           - Again, background information (metadata, taxonomic concept) is needed
           - Use identification qualifiers
  21. NOISE – NATURE OF TAXONOMY
       • Instability of taxonomic identifications
       • Background information greatly helps
       • So does keeping records of the source of each change
  22. ASSESSMENTS
       • Aims of taxonomic assessments
           - Correct issues
           - Reconcile taxonomies
           - Complete hierarchies
       • Basic general process: controlled name list
           - Take a name
           - Check whether it exists in a reliable list of names
           - Extract related information
           - Apply it to our dataset
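The four-step process above can be sketched as a dictionary lookup. This is an illustration under stated assumptions: the name list is a tiny hand-made dictionary built from the synonymy example in the slides, and the function `assess` and the keys `acceptedName`, `originalName` and `assessment` are hypothetical names, not fields of any particular database.

```python
# Hypothetical controlled name list: every known name (accepted or synonym)
# maps to its accepted name and higher taxa.
NAME_LIST = {
    "Antilocapra americana": {
        "acceptedName": "Antilocapra americana",
        "family": "Antilocapridae",
        "order": "Artiodactyla",
    },
    "Antilocapra anteflexa": {
        "acceptedName": "Antilocapra americana",
        "family": "Antilocapridae",
        "order": "Artiodactyla",
    },
}


def assess(record, name_list):
    """Return a copy of the record updated from the name list, or flagged if unknown."""
    name = (record.get("scientificName") or "").strip()  # 1. take a name
    entry = name_list.get(name)                          # 2. check the list
    if entry is None:
        return {**record, "assessment": "not found"}
    updated = dict(record)
    updated["originalName"] = name                       # keep what was recorded
    updated["scientificName"] = entry["acceptedName"]    # 3. extract, 4. apply
    updated["family"] = entry["family"]
    updated["order"] = entry["order"]
    is_synonym = name != entry["acceptedName"]
    updated["assessment"] = "synonym resolved" if is_synonym else "accepted"
    return updated


result = assess({"scientificName": "Antilocapra anteflexa"}, NAME_LIST)
print(result)
```

Keeping the originally recorded name alongside the accepted one means the change can be traced and reversed later, in line with the earlier point about recording the source of changes.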
  25. ASSESSMENTS – SOURCES OF DATA
       • General databases
           - Ideally, global high-quality information
           - Not complete
           - Rely on taxon-specific sources and their completeness
       • Thematic databases and regional checklists
           - Useful if our collection is taxon-specific or location-specific
           - Gather all available knowledge on their topic
           - Reliable authoritative sources
       • Taxonomic literature
           - Most specific source
           - Very high reliability
           - Relevant literature is hard to retrieve
           - Some processing needed
  26. ASSESSMENTS - REQUIREMENTS
       • Free of misspellings
           - Either from the start, or reduced to the minimum
           - Tools (Refine, Excel processing…) can help accomplish this
           - Taxonomic reconciliation depends on this requirement
       • Completeness
           - At least to a certain point
           - The minimum is the scientific name
           - But the scientific name alone might not be enough
       • Helpful metadata
           - Not related to the organism, but to the process of identification
           - The person who made the identification, the taxonomic classification used
  27. ASSESSMENTS - METHODS
       • Manual
           - Removing inconsistencies, updating wrong information
           - Taxonomy is an interpretation of explicit and implicit knowledge
           - Explicit knowledge: records
           - Implicit knowledge: human deduction
           - Machines are not good at interpreting implicit knowledge
           - Prone to errors; an automated approach is recommended
       • Automatic
           - Large amounts of data
           - Repetitive tasks
           - Removal of misspellings, checking against a source, updating
           - Only explicit knowledge; explicit metadata is mandatory
  28. ASSESSMENTS - SEQUENCE (figure only; no slide text)
  29. VALIDATION
       • After cleaning, validate the output
       • Check:
           - The data that has been corrected
           - The data that could not be corrected
           - The data that might have become worse
       • Taxonomic validation:
           - Requires expertise
           - A mixture of explicit and implicit knowledge
           - Not completely automatable
       • If assessments fail:
           - Our data: document and report reliability
           - Distributed data: flag and report
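The three checks listed under validation can be approximated by comparing names before and after cleaning against a set of names considered valid. This is a rough sketch only: the function `classify_changes`, the category labels and the sample data are assumptions for illustration, and as the slide says, real taxonomic validation also needs expert review.

```python
def classify_changes(before, after, valid_names):
    """Sort (old, new) name pairs into the validation categories."""
    report = {"corrected": [], "uncorrected": [], "possibly_worse": [], "unchanged": []}
    for old, new in zip(before, after):
        old_ok = old in valid_names
        new_ok = new in valid_names
        if old_ok and new != old:
            # A name that was already valid got altered: needs a second look.
            report["possibly_worse"].append((old, new))
        elif not old_ok and new_ok:
            report["corrected"].append((old, new))
        elif not old_ok:
            report["uncorrected"].append((old, new))
        else:
            report["unchanged"].append((old, new))
    return report


valid = {"Pipistrellus"}
before = ["Pipistrelus", "Pippistrela", "Pipistrellus", "Pipistrellus"]
after = ["Pipistrellus", "Pippistrela", "Pipistrella", "Pipistrellus"]

report = classify_changes(before, after, valid)
for category, pairs in report.items():
    print(category, pairs)
```

The `uncorrected` and `possibly_worse` lists are the ones to document, flag and report when an assessment fails.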
