1. To what extent are GLAMs ready for Open Data and Crowdsourcing?
Results of a Pilot Survey from Switzerland
Beat Estermann, 12 April 2013
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
2. Recent Trends in the GLAM sector…
Single-Point-of-Access
Source: http://www.europeana.eu/
Coordinated Digitization Efforts
EU: Lund Action Plan for Digitization (2001)
Image: Wikimedia Commons, User:Dvortygirl (CC-by-sa)
Increased cooperation and coordination among GLAMs:
- common catalogues
- virtual libraries
- coordination of digitization efforts
- long-term archiving
4. Crowdsourcing / Collaborative Content Creation
Source: https://commons.wikimedia.org/wiki/Commons:Bundesarchiv and http://www.flickr.com/groups/greatwararchive
Crowdsourcing
Approaches (see: Oomen / Aroyo 2011):
- Correction
- Classification
- Contextualisation
- Co-curation
- Complementing collections
- Crowdfunding
Free Licensing / Open Data
Source: http://www.creativecommons.org
Open Data:
- machine readable
- «freely» re-usable
Linked Open Data
Source: http://www.wikiarthistory.info (CC-by-sa)
«Web of Data» / Semantic Web:
- RDF triples
- unique URLs
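The «Web of Data» points (RDF triples, unique URLs) can be made concrete with a small sketch. The Python snippet below is an illustration only: it models triples as plain tuples and writes them out in N-Triples syntax. The `example.org` URLs and the described object are hypothetical; the property names come from the Dublin Core terms namespace.

```python
# Minimal sketch: RDF triples as (subject, predicate, object) tuples.
# The example.org URLs and the described object are hypothetical.

DCT = "http://purl.org/dc/terms/"      # Dublin Core terms namespace
OBJ = "http://example.org/object/42"   # hypothetical unique URL of a memory object

triples = [
    (OBJ, DCT + "title", '"View of Bern"'),                 # literal value
    (OBJ, DCT + "creator", "http://example.org/person/7"),  # link to another resource
    (OBJ, DCT + "license", "https://creativecommons.org/licenses/by-sa/3.0/"),
]

def to_ntriples(triples):
    """Serialise triples as N-Triples lines: URLs in <>, literals kept quoted."""
    lines = []
    for s, p, o in triples:
        obj = o if o.startswith('"') else f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)

print(to_ntriples(triples))
```

Each line states one machine-readable fact about the object, and because subjects and objects are URLs, statements published by different institutions can point at the same resource.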
5. Where do Swiss GLAMs stand today with regard to…?
…Digitization?
…Exchange of metadata in multilateral cooperations?
…Open Data?
…Crowdsourcing?
…Linked Open Data?
Innovation Diffusion Model (Everett Rogers, 1962):
Awareness → Interest → Evaluation → Trial → Adoption
What are the perceived risks and opportunities? (drivers vs. hindering factors)
What are the expected benefits? Who are the beneficiaries?
6. Pilot Study among Swiss GLAMs
GLAMs in Switzerland:
• ca. 600-700 independent GLAMs of national or regional significance
• ca. 1000 independent GLAMs organized in three umbrella organizations
Our sample: memory institutions of national significance in the German-speaking
part of Switzerland
• 197 organisations contacted (233 e-mail addresses)
• 72 questionnaires completed (34% of the contacted organisations)
Caveats:
• The sample is rather small (results are not very precise with regard to the
entire Swiss GLAM population, large confidence intervals apply)
• Archives are over-represented in the sample (higher response rate);
museums and «other institutions» are under-represented; libraries are about
average.
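To give a feel for how wide the confidence intervals mentioned in the caveats can be, here is a rough Python sketch using the Wilson score interval. It is an illustration under simplifying assumptions, not the method used in the study: it treats the 72 responses as a simple random sample and ignores both the finite population and the over-representation of archives. The function name and the 60% example share are chosen for illustration.

```python
import math

def wilson_interval(p_hat, n, z=1.96):
    """Approximate 95% Wilson score interval for a share p_hat observed in n responses."""
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Example: a share of 60% observed among 72 completed questionnaires.
low, high = wilson_interval(0.60, 72)
print(f"{low:.2f} to {high:.2f}")
```

With a sample of this size the interval around a 60% share spans more than 20 percentage points, which is why individual percentages in the following slides should be read as rough indications.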
7. Innovation Diffusion among Swiss GLAMs: The Overall Picture
A critical mass has been reached.
How about the laggards?
Will we see a higher rate of adoption for
Open Data than for Crowdsourcing?
Some institutions are starting to think
about Linked Data…
8. Digitization and Availability on the Internet
Availability on the Internet (in % of institutions, N=71)
Metadata: 42% "is the case", 17% "is partly the case"
Reproductions of memory objects: 23% "is the case", 37% "is partly the case"
Background information: 11% "is the case", 32% "is partly the case"
60% of institutions make metadata and reproductions at least partly
available on the Internet. 40% still don’t!
9. Exchange of Metadata / Cooperation in Networks
Do you exchange metadata with other institutions? (in % of institutions; N=72)
yes: 61%; no: 39%
[Chart: The exchange of metadata is important for us... (in % of institutions; N=72); stacked bars for "is the case" and "is partly the case"]
61% of the responding GLAMs exchange metadata with other institutions. 39% don’t.
30% do so in the context of bilateral cooperations; 43% in the context of multilateral
cooperations.
For 29% the exchange of metadata is part of their core mission. 17% say this is partly the
case.
10. Metadata: Need for Improvement
Metadata: Need for improvement? (in % of institutions; N=71)
Quality of metadata (accuracy, completeness, up-to-dateness, clarity, availability):
urgent need 11%; need in the medium term 42%; no need 21%; no answer 25%
Interoperability of metadata (availability in digital format, conformity with standards):
urgent need 10%; need in the medium term 43%; no need 23%; no answer 24%
A bit more than 50% of responding GLAMs perceive a need to improve
their metadata.
The need to improve metadata quality and the need to improve their
interoperability are highly correlated. – Does the envisioned exchange of
metadata lead to higher quality requirements?
25% of responding GLAMs couldn’t answer this question. What does
this mean?
11. Metadata: What needs to be improved?
Metadata: What needs to be improved? (in % of institutions; N=43)
[Chart: stacked bars for "is the case" and "is partly the case"; categories: accuracy, completeness, up-to-dateness, clarity, availability, digitization, conformity with current exchange formats]
The main challenges: completeness, availability, digitization
12. Open Data Readiness
The memory objects are available on the Internet... (in % of institutions; N=68)
[Chart: stacked bars with the segments «"freely" accessible», «accessible at no charge (but you are not allowed to modify them)» and «not accessible for free»; categories: «for charitable projects, such as Wikipedia, which also permit commercial use» and «for users who are intending to commercially exploit them»]
Between 1% and 7% of responding GLAMs make scans/photographs of their
memory objects «freely» available on the Internet. Over half of them make them
available on the Internet, but with restrictions. 40% don’t make them available at all.
Over 50% of the GLAMs which make their memory objects available on the Internet
do not understand that you cannot make works available for Wikipedia and
simultaneously prevent their modification and/or their commercial use!
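The incompatibility described above follows from the fact that content for Wikipedia must allow both modification and commercial use. A minimal Python sketch of that rule, with a hypothetical helper name and illustrative terms of use (not survey data, and not an official Wikimedia API):

```python
def usable_on_wikipedia(allows_modification: bool, allows_commercial_use: bool) -> bool:
    """Return True only if both freedoms required for «free» content are granted.

    Hypothetical helper for illustration; it encodes the rule stated on the slide.
    """
    return allows_modification and allows_commercial_use

# Illustrative terms of use: (modification allowed, commercial use allowed)
terms = {
    "CC BY-SA": (True, True),                     # both freedoms granted
    "CC BY-NC": (True, False),                    # non-commercial only
    "CC BY-ND": (False, True),                    # no derivatives
    "view only, free of charge": (False, False),  # access without re-use rights
}

for name, (modify, commercial) in terms.items():
    verdict = "ok" if usable_on_wikipedia(modify, commercial) else "not usable"
    print(f"{name}: {verdict}")
```

Only the first entry passes the check: making objects «accessible at no charge» while forbidding modification or commercial use is not enough for re-use on Wikipedia.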
13. Desirability and Importance of Open Data
[Chart: Desirability of Open Data (in % of institutions; N=71); histogram on a scale from -10 (risks prevail) to +10 (opportunities prevail)]
[Chart: Importance / Desirability of Open Data (in % of institutions; N=71); categories: very important, important, neither/nor, unimportant, no answer]
For over 80% of responding GLAMs the opportunities outweigh the risks of
Open Data.
Over 50% think Open Data is an important issue; almost all of these believe
that the opportunities outweigh the risks.
14. Open Data / “Free” Licensing of Content
Conditions under which they would make memory objects freely accessible on the Internet
(in % of institutions; N=70)
[Chart: stacked bars for "is the case" and "is partly the case"; categories:
- For private use
- For education and research
- For charitable projects
- For charitable projects, such as Wikipedia, which also permit commercial use
- For users who are intending to commercially exploit them
- Only if the name of the institution remains attached to the data
- Only if the work will be re-used in unmodified form]
Most GLAMs wouldn’t readily agree to «freely» license their content – even in
the absence of third party rights: they would like to prevent the commercial use
at no charge as well as the modification of works.
15. Crowdsourcing
Are any of your staff members engaging in projects which support open
data or collaborative projects on the Internet? (in % of institutions; N=71)
[Chart: stacked bars for "in their spare time" and "as part of their professional activity"; categories: Wikipedia, Wikimedia Commons, Flickr Commons, others]
11% of responding GLAMs have staff members who contribute to Wikipedia as
part of their professional activity.
10% of responding GLAMs say that online volunteering partly plays an
important role for them.
Interestingly, no correlation was found between the two variables.
16. Desirability and Importance of Crowdsourcing
[Chart: Desirability of Crowdsourcing (in % of institutions; N=69); histogram on a scale from -10 (risks prevail) to +10 (opportunities prevail)]
[Chart: Importance / Desirability of Crowdsourcing (in % of institutions; N=69); categories: very important, important, neither/nor, unimportant, no answer]
For over 90% of the responding GLAMs the risks of Crowdsourcing are at least
as great as the opportunities. For half of them the risks clearly prevail.
Among GLAMs which think that Crowdsourcing is an important issue, the risk
perception is equally high.
17. Linked Data / Semantic Web
Is "Linked Data" / "Semantic Web" an issue for your institution?
(in % of institutions; N=71)
Yes, it is an issue, but we haven't planned any projects yet: 23%
Yes, we have already planned projects in this area: 6%
29% of responding GLAMs say that Linked Data is an issue for them.
None of them has a running project.
18. Recapitulation
Metadata available on the Internet: 59%
Photos/scans of memory objects available on the Internet: 60%
(60% of responding GLAMs are technically ready for Open Data.)
Exchange of metadata takes place and is important: 43%
Open Data is important: 53%
Open Data is desirable: 81%
Readiness to make data available for Wikipedia: 7%
Readiness to make data available for commercial use: 1%
Crowdsourcing is important: 38%
Crowdsourcing is desirable: 7%
(Different dynamics for Open Data and Crowdsourcing.)
Importance of online-volunteer work: 10%
Professional engagement in Wikipedia: 11%
Linked Data is an issue: 29%
19. Open Data: Opportunities
Why do we need Open Data from the point of view of your institution? (in % of institutions; N=72)
[Chart: share of institutions answering "is the case" and "is partly the case" for each target group and opportunity]
Main target groups: research and education, private individuals, cultural institutions
Main opportunities: better visibility and accessibility of holdings; better visibility of
the institutions; better networking among GLAMs.
20. Open Data: Risks
What are the risks of open data for your institution? (in % of institutions; N=71)
Time effort and expense for making them available: 66% "is the case", 20% "is partly the case"
The use of the data cannot be controlled: 34% "is the case", 34% "is partly the case"
Copyright infringements: 32% "is the case", 34% "is partly the case"
Infringements of data protection regulations: 28% "is the case", 34% "is partly the case"
Divulgation of classified information: 25% "is the case", 23% "is partly the case"
Increased time effort in order to respond to enquiries: 18% "is the case", 17% "is partly the case"
Loss of revenues: 3% "is the case", 11% "is partly the case"
Major risk: extra time effort and expenses
Considerable risks: loss of control, copyright, data protection, secrecy infringements
Almost no risk: Loss of revenues
21. Crowdsourcing: Opportunities
What are the opportunities of crowdsourcing for your institution?
(in % of institutions; N=71)
[Chart: stacked bars for "is the case" and "is partly the case"; categories:
- Correction and transcription tasks
- Enhancement and expansion of texts
- Completion of collections (contribution / identification of additional objects)
- Classification / completion of metadata
- Co-curators
- Crowdfunding (fundraising)]
Crowdsourcing is most likely to be employed for classification tasks.
22. Crowdsourcing: Risks
What are the risks of crowdsourcing from your point of view? (in % of institutions; N=69)
[Chart: stacked bars for "is the case" and "is partly the case"; categories:
- Unforeseeable results
- Considerable time/effort needed for preparation and follow-up
- Difficulties in estimating the time-effort
- No guarantee concerning long-term data maintenance
- Low level of planning reliability
- Fears among employees (job loss, changing roles and tasks)]
All the enumerated risks are rated about the same, except for fears among
employees, which seem to play a minor role.
23. Economic Considerations
• Extra time effort and expenses are seen as the greatest
risks/shortcomings of Open Data and Crowdsourcing.
• Expected losses of revenue play virtually no role.
The revenues of the responding GLAMs are composed as follows:
71%: institutional funding (public funds)
8%: institutional funding (private funds)
7%: donations and sponsoring
6%: revenues from commercial activities
(entrance fees: 3%; lending fees: 1%; sale of image rights: < 0.5%; other: 1%)
2%: project funding (public or private)
6%: other revenues
• While the responding GLAMs may perceive at least some efficiency
gains related to Open Data, they do not (yet) perceive any potential
savings associated with Crowdsourcing.
24. Outlook / Next Steps
• Contact GLAMs that have indicated an interest in receiving further
information
• Promote the study among GLAMs and political actors in Switzerland
• Orient GLAM outreach activities in the light of the findings
• Evaluate the demand for follow-up studies:
- Study with a larger sample in Switzerland
- Longitudinal study in Switzerland
(e.g. similar survey in 2014 to measure the changes)
- International benchmark study
Please contact me if you are interested!
25. Contact Information and Affiliations
Beat Estermann
E-mail: beat.estermann@bfh.ch
Phone: +41 31 848 34 38
Affiliations:
Research Associate, Bern University of Applied Sciences
Member of opendata.ch (Swiss Chapter of the Open Knowledge Foundation)
Member of Digitale Allmend (Swiss Chapter of CreativeCommons)
Member of Wikimedia CH’s GLAM working group
Editor's notes
Coordinated Digitization Efforts (2000); Single-Point-of-Access Offers (2000..); Web 2.0, Personalization (2005..); Crowdsourcing (2006..); Open Data (2009..); Linked Open Data (2010..)
Q: There is a trend among memory institutions to make reproductions / content of their objects freely available on the internet. Under which conditions could you imagine making reproductions / content of your objects available on the internet free of charge, without earning any extra money? (Provided that the contents are already available in digital format and are free from third parties’ copyright claims or confidentiality restrictions.)