Natural sciences collections around the world comprise more than one billion specimens, representing a vast source of information on the natural world.
Natural history museums and similar institutions hold and care for these collections on behalf of us all; they are an international public resource. Mobilising these data for research, conservation and public use is a formidable task, and one that is ideally suited to citizen science.
Using the power of the crowd to extract, transcribe, interpret and/or analyse data from handwritten labels brings the task within reach in our lifetimes.
My talk 'Digitising Dinosaurs' focused on the crowdsourcing of specimen label transcriptions as part of the Digital Collections Programme at the Natural History Museum, London.
The crowdsourcing symposium, in which my talk was one of four, brought together international examples of crowdsourcing platforms and highlighted practical tools and advice for setting up and running a crowdsourcing project.
We shared innovative ideas for engaging broad global audiences in this endeavour, and tips for supporting and nurturing an online community of citizen scientists, including the similarities to and differences from face-to-face engagement and training.
Crowdsourcing by its nature is a big data movement, and we demonstrated existing tools and new ones under development that can facilitate open data sharing and the onward use of data for education, conservation and ongoing research.
Finally, such a task doesn't come without significant challenges and opportunities. We shared our lessons learned, highlighted issues we are still facing and invited suggestions and collaborations from the audience to overcome these.
24. DCP highlights 2016-17
>1bn Portal records downloaded, in >6,000 events
>3.5m specimen records on portal*
>3,000 Fossil explorer app downloads
c.99,000 External portal users
>1,100 Twitter followers, up from c.250
>3,500 Reads of 12 blog posts
42 Digital Visiteers & 1,500 waiting
>30 Events & tours inside and outside NHM
>26,500 Slides digitised, at up to 1,000 per day
>2,500 Crowdsourced Transcriptions by >600 people
>1m Records onto EMu, and de-duplications etc
£27,000 External funding & more in discussion
*Approx 0.5m added during 2016-17
25. Miniature Lives Magnified on Notes from Nature
In our first tranche of mass slide digitisation, we imaged 100,000 microscope slides of tiny insects, barely visible to the naked eye.
Of these, we have selected 6,000 to form our first collection for crowdsourcing, focusing on a group of wasps called chalcids (pronounced 'kal-sids').
These tiny wasps are parasitoids, meaning they lay their eggs inside other insects. When chalcid eggs hatch, the emerging larvae eat the inside of their host. They then grow and pupate until mature enough to burst out as adults, finally killing the host.
This life cycle makes chalcids an important biological control agent, protecting crops and reducing invasive species.
26. Miniature Lives Magnified, on Notes from Nature
The Notes from Nature platform has been developed by a team from the University of Florida and SERNEC,
supported by funding from the NSF, using the Panoptes open project-building tool on the Zooniverse platform.
27. Progress to Date
The first batch of 2,000+ chalcid slides was launched on Notes from Nature in mid-August 2016 and completed in mid-January 2017.
The second batch of 2,000+ chalcid slides was launched at the end of January 2017 and is currently 52% complete, a marked rise in completion rate compared to the first set.
28. Participation Statistics
Our statistics show a classic long-tail pattern of engagement, whereby 14 out of 600+ participants were the most active contributors, and 2 volunteers in particular were ‘super-classifiers’.
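One way to quantify a long-tail pattern like this is to rank volunteers by number of classifications and ask what share of the total the most active few account for. The sketch below is illustrative only: the counts are invented and are not the project's statistics, and the function name `top_share` is hypothetical.

```python
def top_share(counts, top_n):
    """Return the fraction of all classifications made by the top_n most active volunteers."""
    total = sum(counts)
    if total == 0:
        return 0.0
    # Rank volunteers from most to least active, then sum the head of the list.
    ranked = sorted(counts, reverse=True)
    return sum(ranked[:top_n]) / total


# Invented example data: classifications per volunteer (NOT real project figures).
example_counts = [500, 300, 50, 40, 30, 20, 10, 10, 10, 5, 5, 5, 5, 5, 5]

share = top_share(example_counts, top_n=2)
print(f"Top 2 volunteers made {share:.0%} of classifications")
```

A share close to 1 for a small `top_n` indicates a pronounced long tail; a share close to `top_n / len(counts)` indicates evenly spread effort.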
29. The impact of Visiteering events on classifications
October 20th, 2016 - as part of WeDigBio
March 2nd, 2017
CKAN from the OKFN
Open Source on GitHub
MapQuest & OpenStreetMap
Built on APIs
- All this goes to the Data Portal
- Next step is to open to contributions from around the world – YOU are the scientist & observer
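Because the Data Portal is built on CKAN, its records can in principle be queried through CKAN's Action API. The sketch below only builds a `datastore_search` request URL and does not contact any server. It assumes the portal exposes CKAN's DataStore extension; the base URL and resource ID are placeholders that do not come from the talk, and the helper name `build_search_url` is hypothetical.

```python
from urllib.parse import urlencode


def build_search_url(base_url, resource_id, query=None, limit=10):
    """Build a CKAN Action API datastore_search URL (no request is sent)."""
    params = {"resource_id": resource_id, "limit": limit}
    if query:
        # 'q' is CKAN's full-text search parameter for datastore_search.
        params["q"] = query
    return f"{base_url.rstrip('/')}/api/3/action/datastore_search?{urlencode(params)}"


# Placeholder values (assumptions): substitute a real portal URL and resource ID.
url = build_search_url(
    "https://portal.example.org",
    "RESOURCE-ID-PLACEHOLDER",
    query="Chalcidoidea",
    limit=5,
)
print(url)
```

The response to such a request is JSON, so the same records can feed maps, teaching material or further analysis without manual downloads.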
DCP has an ‘iceberg budget’: the things people associate with DCP are only a small part of what we actually do. The museum would have to do a lot of DCP’s activities with or without the programme.
Underpins key museum infrastructure – Collections management, data portal & data architecture
Pioneers new technologies and approaches – from imaging workflows to agile project management
Influences museum culture and approaches, from data policies to digital skills
Lives by museum values - ‘We make our data, experience, knowledge and expertise openly available.’
Looking for impact, either way
A combination of DCP and donor funding will enable us to expand the blue whale scanning project to cover more cetacean skulls. This is relevant to the public programme on oceans, and also to research, e.g. changes in response to climate.
Labour intensive process
Old school – happening behind closed doors
A different kind of challenge – smaller scale but unique, and even more fragile than usual…
Equipment / imaging innovation
We’ve hacked together some pretty ingenious rigs to get 360 imagery and labels from all angles
Building on the digitisation of the British butterflies and macro-moths, we have received external funding to digitise Madagascan specimens, focusing on types. We will learn about the benefits and limitations of type-focused projects and how well types represent a wider group. Our preference remains to digitise wider groups, but sometimes this is not feasible owing to volumes, taxonomic uncertainty etc.
Madagascar is home to some of the richest insect biodiversity on the planet; some 4,600 species of Lepidoptera are described, but it is likely that most are as yet unknown. It is critical that we learn more about these species so that groups can be revised and new species understood in relation to one another.
In addition, digitisation of these type specimens and making this data available online will help catalyse taxonomic research worldwide at a time of unprecedented decline in Madagascan biodiversity.
Overall – programme established, within budget and meeting KPIs.
Events include NHM Now; Science Uncovered; international conferences such as SPNHC; and VIP tours e.g. for Matt Hancock MP
ALSO:
Electronic content management proof of concept work
Flux file orchestration
Sloane herbarium
ALICE
computer vision
numerous tours etc
improvements to portal (e.g. rebuild down from 8-9 hrs to 1-2 hrs)
data management plans
New intranet pages
Related projects - Dippy, Cockayne, Human Remains, Blue whale, insect pollinator initiative, Airless
Crowdsourcing specimen transcription
Notes from Nature
Parasitoid wasps
Natural enemy of the pests who invade our crops