In search of the sweet spot: infrastructure at the
intersection of cultural heritage and data science?
Dr Mia Ridge, Digital Curator, British Library
@mia_out @BL_DigiSchol @LivingWMachines
Data Science & Digital Humanities: new collaborations, new
opportunities and new complexities panel, DH2019
www.bl.uk
Why should GLAMs and data scientists
collaborate?
GLAMs (galleries, libraries, archives and museums) have vast collections
• The British Library has up to 200 million items, including: 14 million books; 8
million stamps; 310,000 manuscript volumes; 4 million maps; pamphlets,
magazines, newspapers and sheet music; television and radio recordings;
websites, e-books and e-journals. Over 3 million new items are added every year.
But that scale means cataloguing is often minimal or focused on
particular uses, so items are not easily findable.
Data science methods could invent and
provide routes into collections
Image: WOMAT-AFR-BEA-227-2-2
Data scientists need (challenging) sources
(Are our collections too challenging?)
Image: Cook's Handbook for London, 1897
What stands in the way of
collaboration?
‘to get the most out of machine learning at your
organization, you need the right team and the right
mindset. The latter requires a cultural shift that
prioritizes and rewards experimentation,
measurement, and testing throughout your
organization’
- Google, ‘Everything a marketer needs to know about
machine learning’
Image from page 440 of "Bell telephone magazine" (1922)
‘And a lot of spare capacity across teams. Whose job
changes when you bring in data science?’
- me
What else stands in the way of collaboration?
• GLAM collections can add up to terabytes of data, so transfer,
storage and processing become expensive
• Copyright / licensing and data protection issues
• Are GLAM and academic data science outcomes
aligned? Novelty vs application, long-term, at scale?
• How do we integrate AI-generated metadata at scale
without flooding the catalogue with ‘mentions’?
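One possible reading of the last question, sketched in Python: rather than writing every raw machine-generated mention into the catalogue, collapse them to at most one entry per record and entity, and keep only pairs supported by several confident mentions. The `Mention` structure, the `summarise_mentions` function, the identifiers and both thresholds are hypothetical and chosen for illustration; nothing here comes from the talk or from a British Library system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One machine-generated entity mention (hypothetical structure)."""
    record_id: str     # catalogue record the mention was found in
    entity_id: str     # identifier the mention was linked to
    confidence: float  # model score between 0 and 1


def summarise_mentions(mentions, min_confidence=0.8, min_count=2):
    """Collapse raw mentions into at most one entry per (record, entity).

    A pair is kept only if at least `min_count` mentions score at or
    above `min_confidence`. Both thresholds are illustrative assumptions.
    """
    counts = defaultdict(int)
    for m in mentions:
        if m.confidence >= min_confidence:
            counts[(m.record_id, m.entity_id)] += 1

    summary = defaultdict(list)
    for (record_id, entity_id), n in sorted(counts.items()):
        if n >= min_count:
            summary[record_id].append(entity_id)
    return dict(summary)


# Made-up example data: four raw mentions for a single record.
mentions = [
    Mention("rec-1", "place:turkwel-river", 0.95),
    Mention("rec-1", "place:turkwel-river", 0.91),
    Mention("rec-1", "place:nakwai-hills", 0.55),
    Mention("rec-1", "place:ngabotok", 0.90),
]
print(summarise_mentions(mentions))
# → {'rec-1': ['place:turkwel-river']}
```

The trade-off is recall: a place mentioned once, or with a low score, never reaches the catalogue, so the raw mentions would probably need to be kept somewhere else for later review.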
Opportunities to shift GLAM infrastructure from ‘catalogue’
to ‘lake’ and provide platforms for collaborative work?
Image: https://www.flickr.com/photos/missouristatearchives/11653956994
Thank you!
Questions?

Editor's Notes

  1. I’ve been working in the field of open cultural data for a long time, which has led me to ask: how can GLAMs and data scientists collaborate to produce outcomes that are useful for both groups? I’m proposing that data mining with cultural heritage collections is the sweet spot.
  2. I’ve highlighted some strings that could be linked to spatial identifiers in Northern Uganda, but there are lots of other entities that could be noted. Catalogues are only one way into collections; examining the actual text, images, etc. provides lots of other ways in. We need data science methods to link to identifiers and to create structure from unstructured data.
     From Report on Northern Uganda (WOMAT-AFR-BEA-227-2-2):
     Title: '29.9.13. Extract from Report on Northern Uganda, compiled by Lieutenant G.P. Cosens, 1st Royal Dragoons.'
     Description: Provides a detailed description of the country, its climate, fauna and flora.
     Author: Cosens, Gordon Philip Lewes, d 1928, army officer, Author
     British Library Shelfmark: WOMAT/AFR/BEA/227/2/2
     Locations Depicted: Nakwai Hills, Uganda; Ngabotok, East Africa Protectorate; Turkwel River, East Africa Protectorate
  3. Since the Turing (the Alan Turing Institute) moved into the BL’s building, we’ve been working to convince them that our sources are interestingly messy and complicated.
  4. (Is this really true, or are our sources too challengingly messy and vast?)
  5. We’d like lots of lightbulb moments but are we set up for them?
  6. We’d like lots of lightbulb moments but are we set up for them?
  7. Is work with GLAM collections ‘significant’ or ‘novel’ enough for academic research? Creating metadata at scale isn’t sexy but it is necessary. ‘Crowdsourcing / machine learning as additional info’ stuff. New data storage paradigms, new UI/UX ideas.
  8. Had been thinking about data lakes as a way of dealing with the complexities of data structures for different uses, but the need for shared infrastructure is actually more profound. Integrating results of data science into discovery systems means those systems need to change: more ‘data lake’ than MARC? ‘Crowdsourcing / machine learning as additional info’ stuff. New data storage paradigms, new UI/UX ideas.
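
Note 2 above talks about linking strings in the text to spatial identifiers. A minimal Python sketch of the simplest version of that idea, exact-name lookup against a gazetteer, might look like the following. The place names come from the ‘Locations Depicted’ field in note 2; the identifiers, the `link_places` function and the sample sentence are invented for illustration and do not represent British Library data or any real API. Real sources would also need handling for spelling variants, OCR errors and ambiguous names, which this sketch ignores.

```python
import re

# Hypothetical gazetteer: place name -> identifier. The names are from the
# catalogue record quoted in note 2; the identifiers are made up.
GAZETTEER = {
    "Nakwai Hills": "place:0001",
    "Ngabotok": "place:0002",
    "Turkwel River": "place:0003",
}


def link_places(text, gazetteer=GAZETTEER):
    """Return (start, end, matched_text, identifier) for each gazetteer hit."""
    # Try longer names first so a long name wins over a shorter one it contains.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b",
        flags=re.IGNORECASE,
    )
    # Case-insensitive lookup from matched text back to the identifier.
    lookup = {name.lower(): ident for name, ident in gazetteer.items()}
    return [
        (m.start(), m.end(), m.group(0), lookup[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


# Invented sample sentence, not a quotation from the report.
sample = "The column crossed the Turkwel River and camped below the Nakwai hills."
for hit in link_places(sample):
    print(hit)
```

Returning character offsets alongside the identifier means the links can be stored as annotations next to the text instead of being written into the catalogue record itself.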