1
@BL_Labs #DHA2018 @BL_DigiSchol labs@bl.uk
http://www.bl.uk/projects/british-library-labs
Funded by the Andrew W. Mellon Foundation
Running since March 2013
A hands-on data exploration & challenge to become a derived data-set author
on the British Library’s open data-set platform (https://data.bl.uk)
Mahendra Mahey, Manager of BL Labs, British Library, London, UK.
1400 – 1530, Tuesday 25 September 2018
Workshop part of ‘Making Connections’, Digital Humanities Australasia, 2018
(#DHA2018), University of South Australia, City West campus, Adelaide, SA, Australia
2
Who do we work with?
Researchers
https://goo.gl/WutNyi
Artists
http://goo.gl/nNKhQ2
Librarians
Curators
https://goo.gl/9NWZUW
Software Developers
https://goo.gl/7QQ5Tf
Archivists
https://goo.gl/x7b4tg
Educators
https://goo.gl/qh01Mi
Working and Communicating
Entrepreneurs
https://goo.gl/Fx8RG7
3
How?
Competition: Tell us your ideas of what to do with our digital content (2013-16)
Awards: Show us what you have already done with our digital content in research, artistic, commercial, learning and teaching, staff categories
Projects: Talk to us about working on collaborative projects
Tell us your ideas of what to do with our digital content
Engagement
• Roadshows
• Events
• Meetings
• Conversations
New! Digital Research Support
4
Collections – not just books!
> 180* million items
> 0.8* m serial titles
> 8* m stamps
> 14* m books
> 6* m sound recordings
> 4* m maps
> 1.6* m musical scores
> 0.3* m manuscripts
> 60* m patents
King’s Library
*Estimates
5
Have you got X?
https://upload.wikimedia.org/wikipedia/commons/5/50/Real_wuerzburg.jpg
Looking for Physical Content in the British Library
6
#bldigital
3 %* digitised
* estimate
Digital
Partnerships
Commercial & Other
Organisations
Bias in digitisation
http://goo.gl/bR9UJL
Sample Generator
15 %* Openly Licensed – most online
85 %* Available onsite only at the moment
Digitisation / Curating Born Digital
costs money, time, resources
http://www.turing.ac.uk
Digital increasing
rapidly
Born Digital
http://www.webarchive.org.uk/ukwa/
7
Have you got X digitised / in digital form?
http://www.yorkmix.com/wp-content/uploads/2014/04/mr-simms-sweet-shoppe-york.jpg
Looking for Digitised / Digital Content in the BL
8
Our Audience and Collections
Audience research & digital interests
Digital collections we have
This is where Labs works
It starts with making connections!
The theme of DHA2018
9
Finding Open Cultural Heritage Datasets
Collection Guides (219 as of 25/09/2018)
https://www.bl.uk/collection-guides/
Datasets about our collections
Bibliographic datasets relating to our published and archival holdings
Datasets for content mining
Content suitable for use in text and data mining research
Datasets for image analysis
Image collections suitable for large-scale image-analysis-based research
Datasets from UK Web Archive
Data and API services available for accessing UK Web Archive
Digital mapping
Geospatial data, cartographic applications, digital aerial photography and
scanned historic map materials
https://data.bl.uk
Download collections as zips, no API
Each dataset has a Digital Object Identifier (DOI) and can be referenced for research
Not all discoverable via search engines!
10
Explore Our Data at http://data.bl.uk!
• CSV of Metadata
https://data.bl.uk/digbks/dig19cbooks-mdata-csv.csv
• 19th Century Books - Book Metadata - 01/09/2013.
https://data.bl.uk/digbks/db21.html
• Digitised Books - Flickr Tag History - Dec 2013 to March 2016. TSV
https://data.bl.uk/digbks/db15.html
• Digitised Hebrew Manuscripts - Metadata
https://data.bl.uk/hebrewmanuscripts/heb1.html
• Digitised Hebrew Manuscripts: Or 2210 - Or 2364
https://data.bl.uk/hebrewmanuscripts/heb8.html
• Theatrical playbills from Britain and Ireland (OCR text only)
https://data.bl.uk/playbills/pb2.html
• Portraits of actors, views of theatres and playbills (covering 1750 - 1821 in a single volume)
https://data.bl.uk/singlesheet/por1.html
• Volumes of Lysons Collectanea (Amusements), comprising broadsides, cuttings, advertisements on amusements, 1660-1840.
https://data.bl.uk/singlesheet/ad1.html
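A first look at a downloaded metadata CSV could be sketched in Python as follows. This is only an illustration: it counts how many rows have a value in each column, and the sample text and its column names (`title`, `date`, `place`) are invented, not the real headers of any file on data.bl.uk.

```python
import csv
import io
from collections import Counter


def column_fill_counts(csv_text: str) -> Counter:
    """Count, per column, how many rows have a non-empty value."""
    counts = Counter()
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        for column, value in row.items():
            # DictReader stores surplus fields under the key None; skip those.
            if column is not None and value and value.strip():
                counts[column] += 1
    return counts


# Hypothetical sample; the real file's headers and values will differ.
SAMPLE = "title,date,place\nA Tour,1801,London\nPoems,,Dublin\n"

if __name__ == "__main__":
    for column, filled in column_fill_counts(SAMPLE).items():
        print(column, filled)
```

Columns with many empty cells are a quick hint about gaps in the descriptive information before any deeper analysis.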
11
The Story of the Digital Collection…
Digital
Collection
Curator
Who paid for the digitisation?
Who did the digitisation?
Technology used
Born digital?
Published
Unpublished
Where is it?
Access / API?
Can it still be accessed?
Generates income
Reputational risk in using?
Legalities /
Ethics / Morality
Politics when digitised
Personalities involved
Surprises (e.g. gaps)
Descriptive information
Old format not supported
What media was the
digitisation done from?
Is there any background documentation?
No Descriptive information
Inconsistent descriptive information
Still there?
Good to know the background ‘story’ of a Digital Collection
if you want to use it for projects …
12
https://goo.gl/qpCLlk
https://goo.gl/wMTS3Z
• Dialogue typically:
– you are ‘lucky’ & we have the digital content
/ data relevant to your research
– we don’t have exactly what you’re looking for,
but is there anything of interest? Let’s talk…
– engagement is hard work and it’s constantly required to
maintain interest in our digital collections!
• Artists find this dialogue easier…
• We also tend to attract researchers with ‘fuzzier’ research
boundaries and possibly open to more
interdisciplinary / collaborative research
What engagement does the BL have with researchers
wanting to use our digital content?
13
Open Content vs Onsite Only Access
• Access easier for openly licensed content
• More challenging for on-site, in-copyright, non-print legal
deposit, data protected, old content media & contemporary material (post 1877)
https://goo.gl/Y5zCXg
©
14
How do we give access to
onsite-only
Digital Collections
(85% of our Digital Collections)?
15
READING
ROOM
ON
SITE
NOT
ONLINE
OPEN
British Library
£
Labs Residency Model
Challenges of access to Digital Collections at the BL
16
Accessing digital collections onsite
OPEN
£
• Have to be ‘onsite’ (interpretations vary)
• Need to be ‘security cleared’ / ‘trusted’ for some collections
– Hence ‘Researcher in Residence Model’
• Permission required (depending on ‘story’ of collection)
• Content could be on various media formats (not always online)
• 5 - 20% re-use of material for non-commercial research for some collections, depending on agreements in place
• We are learning ‘pathways’ so that providing onsite access to some digital collections becomes ‘everyday’ in the future
17
Phases of interaction at BL Labs
Submit idea for support
Ideas always change once people experience the data and culture of the organisation
18
eResearch SA Open Data Directory
http://www.data.sa.edu.au/
19
URLs to download sample files not on data.bl.uk
• https://www.data.sa.edu.au/dataset/newspapers-from-british-library/
• https://www.data.sa.edu.au/dataset/
• https://www.data.sa.edu.au/dataset/
20
Working with British Library Digitised Newspapers
• Digitised through public / private means
• Commercial products can be used to look for content manually; they have search interfaces but no APIs. This is still a useful starting point, as manual methods can translate into computational ones
• OCR quality is not great and metadata is OK, but there is plenty of hidden material; approaches need to take this into account, e.g. ‘Good, Bad and Ugly’ OCR
• You can purchase drives with content from Gale Cengage (dependent on subscription)
21
Good, Bad, Ugly Image Quality / OCR
• Original image capture of newspaper images can affect the quality of the OCR
• A poor image is very difficult to re-OCR
• Good image quality gives a much better chance for re-OCR
• Bi-tonal, grey scale or colour capture can affect the quality of the OCR
• A methodology for working with the collection at scale needs to acknowledge OCR and image quality
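One rough way to triage OCR text into ‘good’, ‘bad’ and ‘ugly’ is sketched below in Python. It is a crude heuristic, not a BL method: it measures the share of tokens that are purely alphabetic, and the thresholds (0.9 and 0.6) are arbitrary assumptions that would need tuning against real pages.

```python
def alphabetic_token_ratio(text: str) -> float:
    """Share of whitespace-separated tokens made up only of letters."""
    tokens = text.split()
    if not tokens:
        return 0.0
    # Ignore common punctuation at the edges of a token before testing it.
    clean = sum(1 for token in tokens if token.strip(".,;:!?'\"()").isalpha())
    return clean / len(tokens)


def ocr_bucket(text: str, good: float = 0.9, bad: float = 0.6) -> str:
    """Label text 'good', 'bad' or 'ugly' using hypothetical thresholds."""
    ratio = alphabetic_token_ratio(text)
    if ratio >= good:
        return "good"
    if ratio >= bad:
        return "bad"
    return "ugly"


if __name__ == "__main__":
    for sample in ("The quick brown fox.", "T#e q~1ck br0wn f0x"):
        print(ocr_bucket(sample), "-", sample)
```

A score like this says nothing about image quality itself, so it is best read alongside any OCR quality figure supplied in the metadata.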
22
Breaking Black Boxes – Melodee Beals
http://doi.org/cm3m
23
Burney Collection
• Gathered by the Reverend Charles Burney (1757-1817)
• 700 volumes of newspapers and news pamphlets: papers published in London, English provincial, Irish and Scottish papers, and a few examples from the American colonies
• 1271 titles
• Around 1 million page images, digitised from microfilm from around 2006
• OCR quality is mixed; a custom XML format was used
• Bi-tonal
24
Web Interface – Burney Collection
25
OCR quality can be very poor!
26
1268 Folders
27
burney_summary.xls
28
Breakdown of titles
Title                                               No. of Pages
PUBLIC ADVERTISER                                   60680
LONDON GAZETTE                                      44463
LONDON EVENING POST                                 38920
LONDON CHRONICLE                                    32030
GAZETTEER AND NEW DAILY ADVERTISER                  31250
LLOYD'S EVENING POST                                28941
ST. JAMES'S CHRONICLE OR THE BRITISH EVENING POST   28130
MORNING CHRONICLE AND LONDON ADVERTISER             27658
DAILY COURANT                                       25334
GENERAL EVENING POST                                23500
12 TITLES WITH 10,000+ PAGES                        188266
87 TITLES WITH 1,000+ PAGES                         289745
216 TITLES WITH 100+ PAGES                          79374
945 TITLES WITH 1 TO 100 PAGES                      16816
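The four summary bands can be turned into shares with a few lines of Python. The figures are copied from the slide as they stand; the band labels and the function name are illustrative only.

```python
# (number of titles, number of pages) per band, as given on the slide.
BANDS = {
    "10,000+ pages": (12, 188266),
    "1,000+ pages": (87, 289745),
    "100+ pages": (216, 79374),
    "1 to 100 pages": (945, 16816),
}


def band_shares(bands: dict) -> dict:
    """Return each band's share of titles and of pages as fractions."""
    total_titles = sum(titles for titles, _ in bands.values())
    total_pages = sum(pages for _, pages in bands.values())
    return {
        label: (titles / total_titles, pages / total_pages)
        for label, (titles, pages) in bands.items()
    }


if __name__ == "__main__":
    for label, (title_share, page_share) in band_shares(BANDS).items():
        print(f"{label}: {title_share:.1%} of titles, {page_share:.1%} of pages")
```

On these figures the smallest band holds three quarters of the titles but only about 3% of the pages, which matters when sampling by title rather than by page.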
29
Example Folders
B0001ORIWEEJO - APPLEBEE'S ORIGINAL WEEKLY JOURNAL - 1715 – 1720
B0018CONTPROC - PROCEEDINGS OF THE ARMY UNDER THE COMMAND OF SIR THOMAS FAIRFAX – 1645
B0054REPINFCH - REPORT OF THE STATE OF THE GENERAL INFIRMARY AT CHESTOR - 1754?-1779
B0101PROCPARL - EXACT RELATION OF THE PROCEEDINGS AND TRANSACTIONS OF THE LATE PARLIAMENT – 1654
B0277INSTRUCT - INSTRUCTOR – 1724
B1381SCOU1717 - SCOURGE (1717, REPRINT) - 1717?
30
Example files
‘service’ folder contains page level images and corresponding OCR XML
BurneyB0001ORIWEEJO17151119service
31
APPLEBEE'S ORIGINAL WEEKLY JOURNAL
FROM SATURDAY NOVEMBER 19 TO SATURDAY NOVEMBER 26 1715
WO2_B0001ORIWEEJO_1715_11_19-0001.tiff
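A page filename like the one above appears to encode the title folder, the issue date and a page number. The Python sketch below parses it on that assumption; the pattern is inferred from this single example, and the field names (`prefix`, `title_code`, `page`) are hypothetical labels rather than documented BL terms.

```python
import re
from datetime import date
from typing import NamedTuple


class BurneyPage(NamedTuple):
    prefix: str        # leading code, meaning not documented here
    title_code: str    # matches the title folder name, e.g. B0001ORIWEEJO
    issue_date: date   # date taken from the YYYY_MM_DD part
    page: int          # assumed to be the page number within the issue


# Pattern inferred from one example filename; real files may deviate.
PATTERN = re.compile(
    r"^(?P<prefix>[A-Z0-9]+)_(?P<code>B\d{4}[A-Z0-9]+)_"
    r"(?P<year>\d{4})_(?P<month>\d{2})_(?P<day>\d{2})-(?P<page>\d+)\.tiff?$"
)


def parse_page_filename(name: str) -> BurneyPage:
    """Split a page image filename into its assumed components."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognised filename: {name!r}")
    issue_date = date(
        int(match["year"]), int(match["month"]), int(match["day"])
    )
    return BurneyPage(match["prefix"], match["code"], issue_date, int(match["page"]))


if __name__ == "__main__":
    print(parse_page_filename("WO2_B0001ORIWEEJO_1715_11_19-0001.tiff"))
```

Raising an error on unmatched names, rather than guessing, makes irregular files visible early when walking a large folder tree.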
32
JISC 1 and JISC 2 Newspapers
33
Accessing digitised newspapers
through Gale Interface (subscription)
34
Private BL NAS
Accessible onsite, or remotely via Citrix if security cleared
35
Accessing digitised newspapers
onsite at the BL (JISC 1)
12 Volumes, 80TB of data
36
Accessing digitised newspapers
onsite at the BL
Accessing ‘service’ Copy (post processed)
and results of OCR available as XML
37
Accessing digitised newspapers
onsite at the BL
Accessing ‘service’
Copy (post processed)
38
Accessing digitised newspapers
onsite at the BL
Accessing OCR as XML
39
jisc_1.xls
79 Titles, 2 million pages
40
Metadata from BL (JISC 1 and 2)
• Title Metadata
– Title, as written
– Normalised title across all variants
– Standardised title abbreviation
– Variant titles, with associated dates
– Place of publication
– Dates of publication
– Genre, such as newspaper
– Sub-collection, such as Regional Daily
• Issue Metadata
– Volume Number
– Issue Number
– Date as printed
– Normalised date (YYYY.MM.DD)
– Number of pages
– The microfilm reel number
– The OCR quality
• Page image data
– The number of the image within that issue
– The filename
– The spatial coordinates for the page within the image
– The degree of page skew
41
Metadata from Gale (JISC 1 and 2)
• Standardised identifier
• Newspaper title
• Standardised title abbreviation
• Project codes
• Digitized collection name
• Issue number
• Date as printed
• Standardised date (Month, DD, YYYY)
• Standardised date (YYYYMMDD)
• Day of the week
• Number of Pages
• Copyright holder
• Language
• Unique ID for publication
• Holding Library
• Citation of the physical item
• Title metadata
– Title as recorded in the MARC Library Catalogue
– Dates of publication
– Genre, such as newspaper
– Conversion credit, usually a vendor
• Article
– Unique ID
– OCR quality
– SC, or standardized category of article
– Unique ID(s) of page(s)
– Unique ID(s) of individual column(s)
– Column number
– Headline
– Article type
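Because the BL and Gale metadata write issue dates differently (YYYY.MM.DD and YYYYMMDD), matching records across the two needs a small normalisation step. The Python sketch below assumes the values look exactly like `1871.11.14` and `18711114`; the function names are illustrative and the real fields may contain irregular values.

```python
from datetime import date


def parse_bl_date(text: str) -> date:
    """Parse a BL-style normalised date, assumed to look like '1871.11.14'."""
    year, month, day = (int(part) for part in text.strip().split("."))
    return date(year, month, day)


def parse_gale_compact_date(text: str) -> date:
    """Parse a Gale-style compact date, assumed to look like '18711114'."""
    text = text.strip()
    if len(text) != 8 or not text.isdigit():
        raise ValueError(f"expected YYYYMMDD, got {text!r}")
    return date(int(text[:4]), int(text[4:6]), int(text[6:]))


def same_issue_date(bl_value: str, gale_value: str) -> bool:
    """True when both metadata values describe the same calendar day."""
    return parse_bl_date(bl_value) == parse_gale_compact_date(gale_value)


if __name__ == "__main__":
    print(parse_bl_date("1871.11.14").isoformat())
    print(same_issue_date("1871.11.14", "18711114"))
```

Parsing by hand rather than with `strftime`-style formatting avoids platform differences when handling dates before 1900.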
42
Samples for JISC 1
‘master’ contains high res tiff
‘service’ contains post processed tiff and OCR XML
BNWL - The Belfast News-Letter - 1871 - November 14
BNWL - The Belfast News-Letter - 1885 - September 12
DNLN - Daily News - 21 Jan 1846 - 31 Dec 1900
43
JISC 2 Collection
• 22 Titles
• Regional titles
• 1,020,550 pages
44
jisc_2.xls
45
JISC 2
• 40 TB
• Stored differently locally
• 192,353 folders
46
Samples for JISC 2
• Organised differently
47
Samples for JISC 2
Lancaster Gazetter, And General Advertiser For Lancashire West
Southampton Herald
Berrows Worcester Journal
A - Contains post processed files
M - Contains JP2
O - Contains ALTO XML
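Plain text can be pulled out of an ALTO XML file with the Python standard library, as sketched below. The sample document is a tiny invented fragment, not a BL file; it relies only on the standard ALTO elements `TextLine` and `String` and the `CONTENT` and `WC` (word confidence) attributes, and real files may carry more structure.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix so different ALTO versions are handled."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(xml_text: str) -> List[str]:
    """Return one plain-text string per TextLine element."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        if local_name(element.tag) == "TextLine":
            words = [
                child.get("CONTENT", "")
                for child in element
                if local_name(child.tag) == "String"
            ]
            lines.append(" ".join(words))
    return lines


def mean_word_confidence(xml_text: str) -> Optional[float]:
    """Average the optional WC attribute; None if no word carries one."""
    root = ET.fromstring(xml_text)
    scores = [
        float(element.get("WC"))
        for element in root.iter()
        if local_name(element.tag) == "String" and element.get("WC") is not None
    ]
    return sum(scores) / len(scores) if scores else None


# Invented fragment for illustration only.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
<Layout><Page><PrintSpace><TextBlock>
<TextLine><String CONTENT="LONDON" WC="0.90"/><SP/><String CONTENT="GAZETTE" WC="0.50"/></TextLine>
<TextLine><String CONTENT="Price" WC="0.80"/></TextLine>
</TextBlock></PrintSpace></Page></Layout></alto>"""

if __name__ == "__main__":
    print(alto_lines(SAMPLE))
    print(mean_word_confidence(SAMPLE))
```

For files on disk, `ET.parse(path).getroot()` would replace `ET.fromstring`; the rest of the logic stays the same.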
48
Previous ideas for using the collection
• Bob Nicholson – Finding jokes
• Katrina Navickas – Political meetings
• Hannah Murray – Black abolitionist performances
• Jennifer Batt – Finding poetry
• Surendra Singh – Finding suicide articles
• Melodee Beals – Evidence of copy and paste
• Ryan Cordell – Viral Texts
• Paul Fyfe – Snipping out images
49
Useful resources
• http://oceanicexchanges.org/
• http://scissorsandpaste.net/
• http://viraltexts.org/
• https://repository.lib.ncsu.edu/bitstream/handle/1840.20/33457/fyfe.newspaper.archaeology.VPR.pdf?sequence=1
50
RE-OCR
Re-OCR with ABBYY FineReader?
https://www.abbyy.com/en-gb/
Use of Overproof OCR Correction?
http://overproof.projectcomputing.com/
51
Virtual Infrastructure for OCR text
OCR text ‘scraped’ from digitised newspapers and put in cloud
Jupyter notebook
Write Python code and see results in a web browser
http://jupyter.org
Access available for researchers ‘in residence’
https://www.docker.com/
52
65,000 digitised 19th Century books
Image: Artwork by Alicia Martin 2007 / 2008
Paid for by:
For a full list:
https://goo.gl/HqPQMS
Subjects include:
Philosophy
Poetry
History
Literature
1789 - 1876
53
Working with the MS Books Collection
• Metadata
• Page level images
• OCR Text
• Flickr Commons - images snipped out and user generated tags for images
• 19th Century Books Collection data
54
30 August 2012
55
Metadata
MicrosoftBooks.xls - Over 65,000 titles
56
MS Books – Finnish Titles
57
Fiction / Non Fiction
58
Latin American Studies
59
ALTO XML – Sample Files – 1800 - 1809
1502 Zip Files
60
OCR Text – JSON File
61
002819694
62
63
64
Optical Character Recognition (OCR) generated text
Scanned Page
Image on Flickr Commons
https://goo.gl/AC43vs
65
Worked better for female faces than men’s
Press
http://mechanicalcurator.tumblr.com
Posts image every 30 minutes
http://www.flickr.com/photos/britishlibrary/
1,020,418 images
need tagging!
Creative uses of images
Face recognition
Algorithms based on photos
Mechanical Curator
with an algorithmic brain
(Circles, Squares and Slanty etc)
http://goo.gl/qPPgxX
Wikimedia
Flickr Commons
Individual URL & API
Snipping out images
from 65,000 Digitised Books*
>1,000,000,000* views
>17,000,000* tags
https://goo.gl/FgZ4HM
Work @ BL by Ben O’Steen, Labs and Digital Research Team
*Matt Prior - http://goo.gl/j29Tnx
Since Dec 2013
Tumblr
*Estimates
>More demand to see physical items
66
British Library Flickr Commons
https://www.flickr.com/photos/britishlibrary/
Flickr Commons has items from
Galleries, Libraries, Archives and Museums (GLAM)
(Mostly Public Domain)
67
Flickr Commons (100 + GLAMs as of 25/09/18)
68
Getting an account on Flickr
• Get a Flickr / Yahoo account (https://login.yahoo.com/account/create)
• You can then tag, organise favourites, make your own albums and galleries from Flickr images online or uploaded
• You get 1TB for free!
69
British Library Flickr Commons
Why Flickr Commons?
• Free!
• Each image has its own unique web address, easy to share
• Can Tag images
• Has Application Programming Interface (API)
Late August 2013
70
Using British Library Flickr Commons
• How do we find things in this collection?
• Remember snipped out images from books with no description?
• Not straightforward…
71
How is Flickr Commons Organised?
• Photostream
• Albums
• Faves
• Galleries
• Tags
72
Flickr Photostream
https://www.flickr.com/photos/britishlibrary/
Kind of the home page for the collection!
Usually displays images with most recent activity!
73
Flickr Albums
Curated by the British Library – specifically Nora McGregor
She works with the public to add images or create new ones!
Over 450 Albums as of 25/09/18 – Mostly Maps!
74
Flickr Faves
Most favourited images first, in descending order
Favouriting an image requires an account
75
Flickr Galleries
More useful if you have an account
You can create a Gallery of Flickr images to share with everyone
Gallery is tied to your account
76
Flickr Groups
Community based – for sharing and discussing images
We might create a group for the competition – watch this space!
77
Adding Tags in Flickr
Be the next ‘Chico45’!
78
Get Tags!
79
Searching within the collection!
80
The Anatomy of a BL Flickr Record
Download high res 300dpi image
81
82
When you log in to Flickr Commons
83
84
Opportunities – increasing traffic to Library services
You can purchase a ‘High Res’ copy
View in the Library Item Viewer
Download .pdf
All illustrations in book
Other illustrations in books published in same year
View the item in the Library Catalogue
Tags auto generated
User generated tag
Grouping for image
85
Refers to the Physical Copy of the Item
86
87
88
Physical and Digital Copy
Number relates to Physical Copy
89
90
91
92
93
94
95
96
You can’t beat the Physical Copy!
97
Now for the Digital Copy!
98
99
100
101
Warning – can be large file!
It’s a PDF
You can use Ctrl+F to find text in it
But health warning about OCR!
102
103
Page numbers don’t always correspond!
Page 132 on Flickr? Is page number in PDF
In PDF of book
Page number in book
104
105
Plain Text from Books?
Not working
But can be obtained from https://data.bl.uk/digbks/db14.html
106
All illustrations in book / books in same year!
All the illustrations in this book
Other illustrations in books published in the same year
107
Views and Favourites
108
Galleries
• Personal Galleries which you can share.
109
Exchangeable Image File Information!
For Geeks only!
110
Tags!
111
Tagging a million images
Iterative Crowdsourcing
http://goo.gl/j6fxac
Cardiff University’s
Lost Visions Project
http://www.metadatagames.org/
Metadata Games
James Heald
Mario Klingemann
Chico 45
Use computational methods
Human Tagger
Top British Library Flickr Commons Taggers
18 hard core taggers
How to reward this ‘small group’ and keep it motivated?
Average for ‘crowd’ is 1 tag per person
What kind of ‘task’ can this ‘crowd’ do?
Mobile games for ‘Ships’, ‘Covers’ and ‘Portraits’ Interface for tagging
112
Adding Tags!
• You have to have an account to add tags!
• Could you be the next Chico 45?
113
Generated from book
Description
Generated from user
114
Generated by Flickr
115
Flickr Commons API
https://www.flickr.com/services/api/
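A request to the Flickr API is just a URL with query parameters. The Python sketch below builds one for the `flickr.photos.search` method without sending it; the API key and `user_id` values used in the example are placeholders, not real credentials or the Library's actual account id, and the function name is illustrative.

```python
from urllib.parse import urlencode

REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def photo_search_url(api_key: str, user_id: str, tags: list, per_page: int = 20) -> str:
    """Build a flickr.photos.search URL for one account's photos with given tags."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,           # obtained from Flickr; placeholder here
        "user_id": user_id,           # the account's id; placeholder here
        "tags": ",".join(tags),       # comma-separated list of tags
        "tag_mode": "all",            # require every tag, not just any of them
        "per_page": per_page,
        "format": "json",
        "nojsoncallback": 1,          # ask for plain JSON rather than JSONP
    }
    return f"{REST_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(photo_search_url("YOUR_API_KEY", "ACCOUNT_ID", ["map", "ship"]))
```

Fetching the URL (for example with `urllib.request.urlopen`) would return JSON describing matching photos, subject to Flickr's API terms and rate limits.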
116
Generated by SherlockNet!
bit.ly/sherlocknet
117
SherlockNet has a search interface!
118
SherlockNet Search for ‘people’
119
Advanced Search in SherlockNet!
Tags Available for Download
120
19th Century Books Metadata
• 1.9 million records of 19th Century Books
• Used for Sample generator project
121
Using the Wikimedia Synoptic Index
• Created to help find all the maps in the books
• Great resource if you want to find things by place!
https://goo.gl/zuxRnG
122
Google Fusion Table
• https://fusiontables.google.com/DataSource?docid=1BMm0FeSsEBa40zgs3C3vySKC0gnPk-pSvrDqqnA7&pli=1#rows:id=1
123
Geodata
flickr_geodata.csv
124
Alston Index
Internal Document
55-602 - Topical Index
603 - 925 - Pressmark Sequence
925 page document of BL / British Museum Pressmarks
125
Alston Index
• Internal document (not to be externally shared)
• Published in 1987 – dot matrix printed
• Refers to British Museum and British Library Pressmarks / Shelfmarks
• Shelfmarks are used internally to identify
126
Topical Index
OCR problems – Re-do? Manually correct?
127
Augment Library Catalogue?
128
Libcrowds – In the Spotlight
https://www.libcrowds.com/collection/playbills/projects
129
Libcrowds – Spotlight - Data
https://www.libcrowds.com/collection/playbills/data
130
Data Journey
• Choose one or two datasets maximum
• Explore the collection and make notes about any challenges and issues
• See if you can curate a smaller collection from the larger collection
• Tell us what you have done
• We will consider publishing it on http://data.bl.uk
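Curating a smaller collection from a larger one can be as simple as filtering a metadata file. The Python sketch below keeps only the rows whose chosen column contains a keyword; the sample data and its column names (`title`, `place`) are invented for illustration, and a real derived dataset would also need notes on how and why it was selected.

```python
import csv
import io


def curate_subset(csv_text: str, column: str, keyword: str) -> str:
    """Return CSV text with only the rows whose `column` contains `keyword`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(column)
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=reader.fieldnames,
        extrasaction="ignore",   # drop surplus fields from malformed rows
        lineterminator="\n",
    )
    writer.writeheader()
    needle = keyword.casefold()  # case-insensitive match
    for row in reader:
        if needle in (row.get(column) or "").casefold():
            writer.writerow(row)
    return out.getvalue()


# Hypothetical sample; real metadata files have different columns.
SAMPLE = "title,place\nA Tour in Wales,London\nPoems,Dublin\nTour of Italy,London\n"

if __name__ == "__main__":
    print(curate_subset(SAMPLE, "title", "tour"))
```

Keeping the original header row and column order makes it easier to trace the derived file back to its source.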

Contenu connexe

Tendances

BL Labs Presentation at the University of Wolverhampton
BL Labs Presentation at the University of WolverhamptonBL Labs Presentation at the University of Wolverhampton
BL Labs Presentation at the University of Wolverhamptonlabsbl
 
Experiences and lessons learned through British Library Labs How have we eng...
Experiences and lessons learned through British Library Labs  How have we eng...Experiences and lessons learned through British Library Labs  How have we eng...
Experiences and lessons learned through British Library Labs How have we eng...labsbl
 
British Library Labs - Bodleian - University of Oxford
British Library Labs - Bodleian - University of OxfordBritish Library Labs - Bodleian - University of Oxford
British Library Labs - Bodleian - University of Oxfordlabsbl
 
British Library Labs Leeds Roadshow 2018
British Library Labs Leeds Roadshow 2018British Library Labs Leeds Roadshow 2018
British Library Labs Leeds Roadshow 2018labsbl
 
A career in academic publishing
A career in academic publishingA career in academic publishing
A career in academic publishingMax Haring
 
Crowdsourcing and Cultural Heritage workshop
Crowdsourcing and Cultural Heritage workshopCrowdsourcing and Cultural Heritage workshop
Crowdsourcing and Cultural Heritage workshopMia
 
María Luisa Alvite Díez: Digital Collections: Bibliographic heritage in Spain
María Luisa Alvite Díez: Digital Collections: Bibliographic heritage in SpainMaría Luisa Alvite Díez: Digital Collections: Bibliographic heritage in Spain
María Luisa Alvite Díez: Digital Collections: Bibliographic heritage in SpainÚISK FF UK
 
BL Labs at Arts and Humanities event
BL Labs at Arts and Humanities eventBL Labs at Arts and Humanities event
BL Labs at Arts and Humanities eventlabsbl
 
VIII Encuentros de Centros de Documentación de Arte Contemporáneo en Artium -...
VIII Encuentros de Centros de Documentación de Arte Contemporáneo en Artium -...VIII Encuentros de Centros de Documentación de Arte Contemporáneo en Artium -...
VIII Encuentros de Centros de Documentación de Arte Contemporáneo en Artium -...Artium Vitoria
 
Planning for big data (lessons from cultural heritage)
Planning for big data (lessons from cultural heritage)Planning for big data (lessons from cultural heritage)
Planning for big data (lessons from cultural heritage)Mia
 

Tendances (14)

BL Labs Presentation at the University of Wolverhampton
BL Labs Presentation at the University of WolverhamptonBL Labs Presentation at the University of Wolverhampton
BL Labs Presentation at the University of Wolverhampton
 
Experiences and lessons learned through British Library Labs How have we eng...
Experiences and lessons learned through British Library Labs  How have we eng...Experiences and lessons learned through British Library Labs  How have we eng...
Experiences and lessons learned through British Library Labs How have we eng...
 
British Library Labs - Bodleian - University of Oxford
British Library Labs - Bodleian - University of OxfordBritish Library Labs - Bodleian - University of Oxford
British Library Labs - Bodleian - University of Oxford
 
Nilges Making The Metadata Work NISO Virtual Conference Ebooks
Nilges Making The Metadata Work NISO Virtual Conference EbooksNilges Making The Metadata Work NISO Virtual Conference Ebooks
Nilges Making The Metadata Work NISO Virtual Conference Ebooks
 
Dh2016 dstp
Dh2016 dstpDh2016 dstp
Dh2016 dstp
 
British Library Labs Leeds Roadshow 2018
British Library Labs Leeds Roadshow 2018British Library Labs Leeds Roadshow 2018
British Library Labs Leeds Roadshow 2018
 
A career in academic publishing
A career in academic publishingA career in academic publishing
A career in academic publishing
 
Crowdsourcing and Cultural Heritage workshop
Crowdsourcing and Cultural Heritage workshopCrowdsourcing and Cultural Heritage workshop
Crowdsourcing and Cultural Heritage workshop
 
María Luisa Alvite Díez: Digital Collections: Bibliographic heritage in Spain
María Luisa Alvite Díez: Digital Collections: Bibliographic heritage in SpainMaría Luisa Alvite Díez: Digital Collections: Bibliographic heritage in Spain
María Luisa Alvite Díez: Digital Collections: Bibliographic heritage in Spain
 
BL Labs at Arts and Humanities event
BL Labs at Arts and Humanities eventBL Labs at Arts and Humanities event
BL Labs at Arts and Humanities event
 
Archiving Interactive Narratives at the British Library by Lynda Clark, Giuli...
Archiving Interactive Narratives at the British Library by Lynda Clark, Giuli...Archiving Interactive Narratives at the British Library by Lynda Clark, Giuli...
Archiving Interactive Narratives at the British Library by Lynda Clark, Giuli...
 
VIII Encuentros de Centros de Documentación de Arte Contemporáneo en Artium -...
VIII Encuentros de Centros de Documentación de Arte Contemporáneo en Artium -...VIII Encuentros de Centros de Documentación de Arte Contemporáneo en Artium -...
VIII Encuentros de Centros de Documentación de Arte Contemporáneo en Artium -...
 
Collecting 80 days at The British Library, by Stella Wisdom and Giulia Carla ...
Collecting 80 days at The British Library, by Stella Wisdom and Giulia Carla ...Collecting 80 days at The British Library, by Stella Wisdom and Giulia Carla ...
Collecting 80 days at The British Library, by Stella Wisdom and Giulia Carla ...
 
Planning for big data (lessons from cultural heritage)
Planning for big data (lessons from cultural heritage)Planning for big data (lessons from cultural heritage)
Planning for big data (lessons from cultural heritage)
 

A hands-on data exploration & challenge to become a derived data-set author on the British Library’s open data-set platform (https://data.bl.uk)

  • 1. @BL_Labs #DHA2018 @BL_DigiSchol labs@bl.uk http://www.bl.uk/projects/british-library-labs Funded by the Andrew W. Mellon Foundation. Running since March 2013. A hands-on data exploration & challenge to become a derived data-set author on the British Library’s open data-set platform (https://data.bl.uk). Mahendra Mahey, Manager of BL Labs, British Library, London, UK. 1400 – 1530, Tuesday 25 September 2018. Workshop part of ‘Making Connections’, Digital Humanities Australasia, 2018 (#DHA2018), University of South Australia, City West campus, Adelaide, SA, Australia
  • 2. Who do we work with? Researchers https://goo.gl/WutNyi Artists http://goo.gl/nNKhQ2 Librarians Curators https://goo.gl/9NWZUW Software Developers https://goo.gl/7QQ5Tf Archivists https://goo.gl/x7b4tg Educators https://goo.gl/qh01Mi Working and Communicating Entrepreneurs https://goo.gl/Fx8RG7
  • 3. How?
      • Competition: Tell us your ideas of what to do with our digital content (2013-16)
      • Awards: Show us what you have already done with our digital content in research, artistic, commercial, learning and teaching, staff categories
      • Projects: Talk to us about working on collaborative projects
      • Tell us your ideas of what to do with our digital content
      • Engagement: Roadshows, Events, Meetings, Conversations
      • New! Digital Research Support
  • 4. Collections – not just books! > 180* million items: > 0.8* m serial titles; > 8* m stamps; > 14* m books; > 6* m sound recordings; > 4* m maps; > 1.6* m musical scores; > 0.3* m manuscripts; > 60* m patents. (Image: King’s Library.) *Estimates
  • 5. Have you got X? https://upload.wikimedia.org/wikipedia/commons/5/50/Real_wuerzburg.jpg Looking for Physical Content in the British Library
  • 6. #bldigital. 3%* digitised (*estimate). Digital Partnerships: Commercial & Other Organisations. Bias in digitisation http://goo.gl/bR9UJL Sample Generator. 15%* Openly Licensed – most online. 85%* Available onsite only at the moment. Digitisation / Curating Born Digital costs money, time, resources. http://www.turing.ac.uk Digital increasing rapidly. Born Digital http://www.webarchive.org.uk/ukwa/
  • 7. Have you got X digitised / in digital form? http://www.yorkmix.com/wp-content/uploads/2014/04/mr-simms-sweet-shoppe-york.jpg Looking for Digitised / Digital Content in the BL
  • 8. Our Audience and Collections. Audience research & digital interests. Digital collections we have. This is where Labs works. It starts with making connections! The theme of DHA2018
  • 9. Finding Open Cultural Heritage Datasets
      Collection Guides (219 as of 25/09/2018): https://www.bl.uk/collection-guides/
      • Datasets about our collections: bibliographic datasets relating to our published and archival holdings
      • Datasets for content mining: content suitable for use in text and data mining research
      • Datasets for image analysis: image collections suitable for large-scale image-analysis-based research
      • Datasets from UK Web Archive: data and API services available for accessing UK Web Archive
      • Digital mapping: geospatial data, cartographic applications, digital aerial photography and scanned historic map materials
      https://data.bl.uk: download collections as zips, no API. Each dataset has a Digital Object Identifier (DOI) that can be referenced for research. Not all discoverable via search engines!
  • 10. Explore Our Data at http://data.bl.uk!
      • CSV of Metadata https://data.bl.uk/digbks/dig19cbooks-mdata-csv.csv
      • 19th Century Books - Book Metadata - 01/09/2013. https://data.bl.uk/digbks/db21.html
      • Digitised Books - Flickr Tag History - Dec 2013 to March 2016. TSV https://data.bl.uk/digbks/db15.html
      • Digitised Hebrew Manuscripts - Metadata https://data.bl.uk/hebrewmanuscripts/heb1.html
      • Digitised Hebrew Manuscripts: Or 2210 - Or 2364 https://data.bl.uk/hebrewmanuscripts/heb8.html
      • Theatrical playbills from Britain and Ireland (OCR text only) https://data.bl.uk/playbills/pb2.html
      • Portraits of actors, views of theatres and playbills (covering 1750 - 1821 in a single volume) https://data.bl.uk/singlesheet/por1.html
      • Volumes of Lysons Collectanea (Amusements), comprising broadsides, cuttings, advertisements on amusements, 1660-1840. https://data.bl.uk/singlesheet/ad1.html
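A first step with any of the CSV files listed on this slide is usually to check how many rows there are and how completely each column is filled in. The following is a minimal Python sketch of that check, not a description of the files themselves: the column names in the sample (`Title`, `Date`) and the sample rows are invented for illustration, and the real headers should be read from the downloaded file.

```python
import csv
import io
from collections import Counter


def summarise_csv(text):
    """Return (row_count, {column: number of non-empty cells}) for CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    # Start every declared column at zero so empty columns still appear.
    filled = Counter({name: 0 for name in (reader.fieldnames or [])})
    rows = 0
    for row in reader:
        rows += 1
        for name, value in row.items():
            # name is None for surplus cells; value is None for missing cells.
            if name is not None and value and value.strip():
                filled[name] += 1
    return rows, dict(filled)


# Hypothetical sample: the real metadata file will have different columns.
SAMPLE = "Title,Date\nA Tour in Wales,1804\nPoems,\n"

if __name__ == "__main__":
    row_count, filled_cells = summarise_csv(SAMPLE)
    print(row_count, filled_cells)
```

To try it on a downloaded file, the file's contents can be read with `open(path, newline="", encoding="utf-8")` and passed to `summarise_csv`; the UTF-8 encoding is itself an assumption to verify.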
  • 11. The Story of the Digital Collection… Digital Collection. Curator. Who paid for the digitisation? Who did the digitisation? Technology used. Born digital? Published. Unpublished. Where is it? Access / API? Can it still be accessed? Generates income. Reputational risk in using? Legalities / Ethics / Morality. Politics when digitised. Personalities involved. Surprises (e.g. gaps). Descriptive information. Old format not supported. What media was the digitisation done from? Is there any background documentation? No descriptive information. Inconsistent descriptive information. Still there? Good to know the background ‘story’ of a Digital Collection if you want to use it for projects …
  • 12. What engagement does the BL have with researchers wanting to use our digital content? https://goo.gl/qpCLlk https://goo.gl/wMTS3Z
      • Dialogue typically:
          – you are ‘lucky’ & we have the digital content / data relevant to your research
          – we don’t have exactly what you’re looking for, but is there anything of interest? Let’s talk…
          – engagement is hard work and it’s constantly required to maintain interest in our digital collections!
      • Artists find this dialogue easier…
      • We also tend to attract researchers with ‘fuzzier’ research boundaries who are possibly open to more interdisciplinary / collaborative research
  • 13. Open Content vs Onsite Only Access
      • Access easier for openly licensed content
      • More challenging for on-site, in-copyright, non-print legal deposit, data protected, old content media & contemporary material (post 1877)
      https://goo.gl/Y5zCXg ©
  • 14. How do we give access to onsite-only Digital Collections (85% of our Digital Collections)?
  • 15. READING ROOM ON SITE NOT ONLINE OPEN British Library £ Labs Residency Model Challenges of access to Digital Collections at the BL
  • 16. Accessing digital collections onsite
      • Have to be ‘onsite’ (interpretations vary)
      • Need to be ‘security cleared’ / ‘trusted’ for some collections
          – Hence ‘Researcher in Residence Model’
      • Permission required (depending on ‘story’ of collection)
      • Content could be on various media formats (not always online)
      • 5 - 20% re-use of material for non-commercial research for some collections, depending on agreements in place
      • We are learning ‘pathways’ so that providing onsite access to some digital collections becomes ‘everyday’ in the future
  • 17. Phases of interaction at BL Labs: Submit idea for support. Ideas always change once people experience the data and culture of the organisation
  • 18. eResearch SA Open Data Directory http://www.data.sa.edu.au/
  • 19. URLs to download sample files not on data.bl.uk
      • https://www.data.sa.edu.au/dataset/newspapers-from-british-library/
      • https://www.data.sa.edu.au/dataset/
      • https://www.data.sa.edu.au/dataset/
  • 20. Working with British Library Digitised Newspapers
      • Digitised through public / private means
      • Commercial products let you look for content manually through search interfaces, but offer no APIs; they are still a useful starting point, because manual methods can translate into computational ones
      • OCR quality is not great and metadata is OK, but there is plenty of hidden material; approaches need to take this into account, e.g. ‘Good, Bad and Ugly’ OCR
      • You can purchase drives from GALE Cengage with content (dependent on subscription)
  • 21. 21 @BL_Labs #DHA2018 @BL_DigiSchol labs@bl.uk Good, Bad, Ugly Image Quality / OCR • Original image capture of newspaper images can effect the quality of the OCR • A poor image, very difficult to re-OCR • Good image quality much better chance for re-OCR • Bi-tonal, Grey Scale, Colour can effect the quality of the OCR • Methodology of working with collection at scale needs to acknowledge OCR and image quality
  • 22. Breaking Black Boxes – Melodee Beals http://doi.org/cm3m
  • 23. Burney Collection • Gathered by the Reverend Charles Burney (1757-1817) • 700 volumes of newspapers and news pamphlets published in London, plus English provincial, Irish and Scottish papers, and a few examples from the American colonies • 1271 titles • Around 1 million digitised page images, digitised from microfilm from around 2006 • OCR quality is mixed; a custom XML format is used • Bi-tonal
  • 24. Web Interface – Burney Collection
  • 25. OCR quality can be very poor!
  • 26. 1268 Folders
  • 27. burney_summary.xls
  • 28. Breakdown of titles (Title: No. of Pages) • PUBLIC ADVERTISER: 60,680 • LONDON GAZETTE: 44,463 • LONDON EVENING POST: 38,920 • LONDON CHRONICLE: 32,030 • GAZETTEER AND NEW DAILY ADVERTISER: 31,250 • LLOYD'S EVENING POST: 28,941 • ST. JAMES'S CHRONICLE OR THE BRITISH EVENING POST: 28,130 • MORNING CHRONICLE AND LONDON ADVERTISER: 27,658 • DAILY COURANT: 25,334 • GENERAL EVENING POST: 23,500 • 12 TITLES WITH 10,000+ PAGES: 188,266 • 87 TITLES WITH 1,000+ PAGES: 289,745 • 216 TITLES WITH 100+ PAGES: 79,374 • 945 TITLES WITH 1 TO 100 PAGES: 16,816
  • 29. Example Folders • B0001ORIWEEJO - APPLEBEE'S ORIGINAL WEEKLY JOURNAL - 1715 – 1720 • B0018CONTPROC - PROCEEDINGS OF THE ARMY UNDER THE COMMAND OF SIR THOMAS FAIRFAX – 1645 • B0054REPINFCH - REPORT OF THE STATE OF THE GENERAL INFIRMARY AT CHESTOR - 1754?-1779 • B0101PROCPARL - EXACT RELATION OF THE PROCEEDINGS AND TRANSACTIONS OF THE LATE PARLIAMENT – 1654 • B0277INSTRUCT - INSTRUCTOR – 1724 • B1381SCOU1717 - SCOURGE (1717, REPRINT) - 1717?
  • 30. Example files • The ‘service’ folder contains page-level images and the corresponding OCR XML • BurneyB0001ORIWEEJO17151119service
  • 31. APPLEBEE'S ORIGINAL WEEKLY JOURNAL, FROM SATURDAY NOVEMBER 19 TO SATURDAY NOVEMBER 26 1715 • WO2_B0001ORIWEEJO_1715_11_19-0001.tiff
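The file name on slide 31 appears to encode the folder code, the issue date and a page number. A minimal sketch of pulling those parts out is given here, assuming every Burney page image follows the pattern of that single example. The regular expression and the function name `parse_burney_filename` are illustrative assumptions, and the meaning of the `WO2` prefix is not stated on the slides, so it is captured without interpretation.

```python
import re
from datetime import date

# Pattern inferred from the one example file name on slide 31 (an assumption,
# not a documented naming scheme): PREFIX_FOLDER_YYYY_MM_DD-PAGE.tiff
BURNEY_NAME = re.compile(
    r"^(?P<prefix>[A-Z0-9]+)_(?P<folder>B\d{4}[A-Z0-9]+)_"
    r"(?P<year>\d{4})_(?P<month>\d{2})_(?P<day>\d{2})-(?P<page>\d{4})\.tiff$"
)

def parse_burney_filename(name):
    """Return the parts of a Burney page-image name, or None if it does not match."""
    m = BURNEY_NAME.match(name)
    if m is None:
        return None
    return {
        "prefix": m["prefix"],    # meaning not given on the slides
        "folder": m["folder"],    # e.g. B0001ORIWEEJO
        "issue_date": date(int(m["year"]), int(m["month"]), int(m["day"])),
        "page": int(m["page"]),   # page image number within the issue
    }

info = parse_burney_filename("WO2_B0001ORIWEEJO_1715_11_19-0001.tiff")
print(info)
```

Returning `None` for names that do not match makes it easy to list the exceptions while exploring the 1268 folders, rather than failing on the first unexpected file.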
  • 32. JISC 1 and JISC 2 Newspapers
  • 33. Accessing digitised newspapers through Gale Interface (subscription)
  • 34. Private BL NAS • Accessible onsite, or remotely via Citrix if security cleared
  • 35. Accessing digitised newspapers onsite at the BL (JISC 1) • 12 volumes, 80 TB of data
  • 36. Accessing digitised newspapers onsite at the BL • Accessing the ‘service’ copy (post processed), with the results of OCR available as XML
  • 37. Accessing digitised newspapers onsite at the BL • Accessing the ‘service’ copy (post processed)
  • 38. Accessing digitised newspapers onsite at the BL • Accessing OCR as XML
  • 39. jisc_1.xls • 79 titles, 2 million pages
  • 40. Metadata from BL (JISC 1 and 2) • Title metadata: title, as written; normalised title across all variants; standardised title abbreviation; variant titles, with associated dates; place of publication; dates of publication; genre, such as newspaper; sub-collection, such as Regional Daily • Issue metadata: volume number; issue number; date as printed; normalised date (YYYY.MM.DD); number of pages; the microfilm reel number; the OCR quality • Page image data: the number of the image within that issue; the filename; the spatial coordinates for the page within the image; the degree of page skew
  • 41. Metadata from Gale (JISC 1 and 2) • Standardised identifier • Newspaper title • Standardised title abbreviation • Project codes • Digitized collection name • Issue number • Date as printed • Standardised date (Month, DD, YYYY) • Standardised date (YYYYMMDD) • Day of the week • Number of pages • Copyright holder • Language • Unique ID for publication • Holding library • Citation of the physical item • Title metadata: title as recorded in the MARC library catalogue; dates of publication; genre, such as newspaper; conversion credit, usually a vendor • Article: unique ID; OCR quality; SC, or standardized category of article; unique ID(s) of page(s); unique ID(s) of individual column(s); column number; headline; article type
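Slides 40 and 41 list three date layouts: YYYY.MM.DD in the BL metadata, and both "Month, DD, YYYY" and YYYYMMDD in the Gale metadata. When combining the two sources it may help to bring them to one form. The sketch below is a minimal illustration, assuming the layouts are literal (including the comma after the month) and that month names are in English; the dictionary keys and the function name `to_iso` are made up for this example.

```python
from datetime import datetime

# Format strings are assumptions based on the layouts named on the slides.
FORMATS = {
    "bl_normalised": "%Y.%m.%d",  # YYYY.MM.DD
    "gale_long": "%B, %d, %Y",    # Month, DD, YYYY (%B follows the current locale)
    "gale_compact": "%Y%m%d",     # YYYYMMDD
}

def to_iso(value, kind):
    """Convert a date string of the given kind to ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(value, FORMATS[kind]).date().isoformat()

# The same issue date (Belfast News-Letter, 14 November 1871) in all three layouts.
print(to_iso("1871.11.14", "bl_normalised"))
print(to_iso("November, 14, 1871", "gale_long"))
print(to_iso("18711114", "gale_compact"))
```

`strptime` raises `ValueError` on anything that does not fit, which is a useful signal for the "date as printed" values that were never normalised.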
  • 42. Samples for JISC 1 • ‘master’ contains the high-res TIFF • ‘service’ contains the post-processed TIFF and OCR XML • BNWL - The Belfast News-Letter - 1871 - November 14 • BNWL - The Belfast News-Letter - 1885 - September 12 • DNLN - Daily News - 21 Jan 1846 - 31 Dec 1900
  • 43. JISC 2 Collection • 22 titles • Regional titles • 1,020,550 pages
  • 44. jisc_2.xls
  • 45. JISC 2 • 40 TB • Stored differently locally • 192,353 folders
  • 46. Samples for JISC 2 • Organised differently
  • 47. Samples for JISC 2 • Lancaster Gazetter, And General Advertiser For Lancashire West • Southampton Herald • Berrows Worcester Journal • A - Contains post processed files • M - Contains JP2 • O - Contains ALTO XML
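The ‘O’ folders on slide 47 hold ALTO XML. In the ALTO schema, each recognised word is a `String` element whose `CONTENT` attribute carries the text and whose optional `WC` attribute carries a word confidence between 0 and 1. The sketch below reads the text line by line and averages the confidence. The sample document is invented for illustration, and whether the JISC 2 files use this namespace version or include `WC` values is an assumption to check against a real file.

```python
import xml.etree.ElementTree as ET

# Invented ALTO fragment for illustration only; not taken from the collection.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="LANCASTER" WC="0.91"/><SP/>
      <String CONTENT="GAZETTER" WC="0.42"/>
    </TextLine>
    <TextLine>
      <String CONTENT="Saturday" WC="0.88"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""

def local(tag):
    """Strip the XML namespace so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]

def alto_lines(xml_text):
    """Return the OCR text as a list of lines, one per TextLine element."""
    root = ET.fromstring(xml_text)
    lines = []
    for el in root.iter():
        if local(el.tag) == "TextLine":
            words = [s.get("CONTENT", "") for s in el if local(s.tag) == "String"]
            lines.append(" ".join(words))
    return lines

def mean_word_confidence(xml_text):
    """Average the WC attribute over all words; None if no word has one."""
    root = ET.fromstring(xml_text)
    scores = [float(s.get("WC")) for s in root.iter()
              if local(s.tag) == "String" and s.get("WC") is not None]
    return sum(scores) / len(scores) if scores else None

print(alto_lines(SAMPLE_ALTO))
print(mean_word_confidence(SAMPLE_ALTO))
```

A per-page confidence average is one rough way to sort pages into ‘good, bad and ugly’ before deciding which ones are worth analysing or re-OCRing.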
  • 48. Previous ideas of using collection • Bob Nicholson – Finding jokes • Katrina Navickas – Political meetings • Hannah Murray – Black abolitionist performances • Jennifer Batt – Finding poetry • Surendra Singh – Finding suicide articles • Melodee Beals – Evidence of copy and paste • Ryan Cordell – Viral Texts • Paul Fyfe – Snipping out images
  • 49. Useful resources • http://oceanicexchanges.org/ • http://scissorsandpaste.net/ • http://viraltexts.org/ • https://repository.lib.ncsu.edu/bitstream/handle/1840.20/33457/fyfe.newspaper.archaeology.VPR.pdf?sequence=1
  • 50. Re-OCR • Use of overProof OCR correction? http://overproof.projectcomputing.com/ • Re-OCR with ABBYY FineReader? https://www.abbyy.com/en-gb/
  • 51. Virtual Infrastructure for OCR text • OCR text ‘scraped’ from digitised newspapers and put in the cloud • Jupyter notebook: write Python code and see the results in a web browser http://jupyter.org • Access available for researchers ‘in residence’ • https://www.docker.com/
  • 52. 65,000 digitised 19th Century books • Image: Artwork by Alicia Martin • 2007 / 2008 • Paid for by: • For a full list: https://goo.gl/HqPQMS • Subjects include: Philosophy, Poetry, History, Literature • 1789 - 1876
  • 53. Working with the MS Books Collection • Metadata • Page level images • OCR Text • Flickr Commons - images snipped out and user generated tags for images • 19th Century Books Collection data
  • 54. 30 August 2012
  • 55. Metadata • MicrosoftBooks.xls - Over 65,000 titles
  • 56. MS Books – Finnish Titles
  • 57. Fiction / Non Fiction
  • 58. Latin American Studies
  • 59. ALTO XML – Sample Files – 1800 - 1809 • 1502 Zip Files
  • 60. OCR Text – JSON File
  • 61. 002819694
  • 64. Optical Character Recognition (OCR) generated text • Scanned page • Image on Flickr Commons https://goo.gl/AC43vs
  • 65. Snipping out images from 65,000 Digitised Books* • Work @ BL by Ben O’Steen, Labs and Digital Research Team • Mechanical Curator with an algorithmic brain (Circles, Squares and Slanty etc.) http://goo.gl/qPPgxX • Tumblr: http://mechanicalcurator.tumblr.com (posts image every 30 minutes) • Flickr Commons: http://www.flickr.com/photos/britishlibrary/ (individual URL & API) • Wikimedia • Since Dec 2013 • 1,020,418 images need tagging! • >1,000,000,000* views • >17,000,000* tags https://goo.gl/FgZ4HM • Creative uses of images • Face recognition: algorithms based on photos, worked better for female faces than men’s • Matt Prior - http://goo.gl/j29Tnx • Press • More demand to see physical items • *Estimates
  • 66. British Library Flickr Commons https://www.flickr.com/photos/britishlibrary/ • Flickr Commons has items from Galleries, Libraries, Archives and Museums (GLAM) (Mostly Public Domain)
  • 67. Flickr Commons (100 + GLAMs as of 25/09/18)
  • 68. Getting an account on Flickr • Get a Flickr / Yahoo account (https://login.yahoo.com/account/create) • You can then tag, organise favourites, make your own albums and galleries from Flickr images online or uploaded • You get 1TB for free!
  • 69. British Library Flickr Commons • Why Flickr Commons? • Free! • Each image has its own unique web address, easy to share • Can tag images • Has an Application Programming Interface (API) • Late August 2013
  • 70. Using British Library Flickr Commons • How do we find things in this collection? • Remember snipped out images from books with no description? • Not straightforward…
  • 71. How is Flickr Commons Organised? • Photostream • Albums • Faves • Galleries • Tags
  • 72. Flickr Photostream https://www.flickr.com/photos/britishlibrary/ • Kind of the home page for the collection! • Usually displays images with most recent activity!
  • 73. Flickr Albums • Curated by the British Library, specifically Nora McGregor • She works with the public to add images or create new ones! • Over 450 Albums as of 25/09/18 – Mostly Maps!
  • 74. Flickr Faves • Most favourited images first, in descending order • Favouriting an image requires an account
  • 75. Flickr Galleries • More useful if you have an account • You can create a Gallery of Flickr images to share with everyone • Gallery is tied to your account
  • 76. Flickr Groups • Community based – for sharing and discussing images • We might create a group for the competition – watch this space!
  • 77. Adding Tags in Flickr • Be the next ‘Chico45’!
  • 78. Get Tags!
  • 79. Searching within the collection!
  • 80. The Anatomy of a BL Flickr Record • Download high res 300dpi image
  • 82. When you log in to Flickr Commons
  • 84. Opportunities – increasing traffic to Library services • You can purchase a ‘High Res’ copy • View in the Library Item Viewer • Download .pdf • All illustrations in book • Other illustrations in books published in same year • View the item in the Library Catalogue • Tags auto generated • User generated tag • Grouping for image
  • 85. Refers to the Physical Copy of the Item
  • 88. Physical and Digital Copy • Number relates to Physical Copy
  • 96. You can’t beat the Physical Copy!
  • 97. Now for the Digital Copy!
  • 101. Warning – can be a large file! • It’s a PDF • You can do Ctrl F in it to find text • But health warning about OCR!
  • 103. Page numbers don’t always correspond! • Page 132 on Flickr? • Is it the page number in the PDF of the book? • Or the page number in the book?
  • 105. Plain Text from Books? • Not working • But can be obtained from https://data.bl.uk/digbks/db14.html
  • 106. All illustrations in book / books in same year! • All the illustrations in this book • Other illustrations in books published in the same year
  • 107. Views and Favourites
  • 108. Galleries • Personal Galleries which you can share.
  • 109. Exchangeable Image File Information! • For Geeks only!
  • 111. Tagging a million images • Iterative Crowdsourcing http://goo.gl/j6fxac Cardiff University’s Lost Visions Project • http://www.metadatagames.org/ Metadata Games • James Heald • Mario Klingemann • Chico 45 • Use computational methods • Human Tagger • Top British Library Flickr Commons Taggers • 18 hard core taggers • How to reward and keep motivated this ‘small group’? • Average for ‘crowd’ is 1 tag per person • What kind of ‘task’ can this ‘crowd’ do? • Mobile games for ‘Ships’, ‘Covers’ and ‘Portraits’ • Interface for tagging
  • 112. Adding Tags! • You have to have an account to add tags! • Could you be the next Chico 45?
  • 113. Generated from book • Description • Generated from user
  • 114. Generated by Flickr
  • 115. Flickr Commons API https://www.flickr.com/services/api/
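As a starting point for the API mentioned on slide 115, the sketch below builds a request URL for Flickr's `flickr.photos.search` REST method, restricted to one account and a set of tags. It only constructs the URL and does not call the network. The API key and the account identifier are placeholders: you need your own key, and `user_id` expects the account's NSID, which is not given on the slides. The function name `photo_search_url` is made up for this example.

```python
from urllib.parse import urlencode

REST_ENDPOINT = "https://api.flickr.com/services/rest/"

def photo_search_url(api_key, user_id, tags, per_page=20):
    """Build a flickr.photos.search URL for photos from one account that
    carry all of the given tags. The response is requested as plain JSON."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,        # placeholder: use your own Flickr API key
        "user_id": user_id,        # placeholder: the account's NSID
        "tags": ",".join(tags),    # comma-separated list of tags
        "tag_mode": "all",         # "all" = must match every tag; "any" = at least one
        "per_page": per_page,
        "format": "json",
        "nojsoncallback": 1,       # return raw JSON rather than a JSONP wrapper
    }
    return REST_ENDPOINT + "?" + urlencode(params)

url = photo_search_url("YOUR_API_KEY", "ACCOUNT_NSID", ["map", "ship"])
print(url)
```

The resulting URL can be opened in a browser or fetched from a notebook once the placeholders are replaced; the tags used here are arbitrary examples.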
  • 116. Generated by SherlockNet! bit.ly/sherlocknet
  • 117. SherlockNet has a search interface!
  • 118. SherlockNet Search for ‘people’
  • 119. Advanced Search in SherlockNet! • Tags Available for Download
  • 120. 19th Century Books Metadata • 1.9 million records of 19th Century Books • Used for Sample generator project
  • 121. Using the Wikimedia Synoptic Index • Created to help find all the maps in the books • Great resource if you want to find things by place! https://goo.gl/zuxRnG
  • 122. Google Fusion Table • https://fusiontables.google.com/DataSource?docid=1BMm0FeSsEBa40zgs3C3vySKC0gnPk-pSvrDqqnA7&pli=1#rows:id=1
  • 123. Geodata • flickr_geodata.csv
  • 124. Alston Index • Internal document • 55-602 - Topical Index • 603-925 - Pressmark Sequence • 925-page document of BL / British Museum Pressmarks
  • 125. Alston Index • Internal document (not to be externally shared) • Published in 1987 – dot matrix printed • Refers to British Museum and British Library Pressmarks / Shelfmarks • Shelfmarks are used internally to identify
  • 126. Topical Index • OCR problems – Re-do? Manually correct?
  • 127. Augment Library Catalogue?
  • 128. Libcrowds – In the Spotlight https://www.libcrowds.com/collection/playbills/projects
  • 129. Libcrowds – Spotlight - Data https://www.libcrowds.com/collection/playbills/data
  • 130. Data Journey • Choose one or two datasets maximum • Explore the collection and make notes about any challenges and issues • See if you can curate a smaller collection from the larger collection • Tell us what you have done • We will consider publishing it on http://data.bl.uk
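The ‘curate a smaller collection’ step on slide 130 can be as simple as filtering a metadata table. The sketch below keeps records from a chosen range of years and sets aside records with no usable year, so they can be written up as one of the ‘challenges and issues’. The rows and the column names (`identifier`, `title`, `year`, `pages`) are invented for illustration; the real spreadsheets will have their own columns.

```python
import csv
import io

# Invented sample rows; column names are hypothetical, not those of the BL files.
SAMPLE = """identifier,title,year,pages
0001,A Tour in Lapland,1804,212
0002,Poems on Several Occasions,1851,96
0003,History of Adelaide,1876,340
0004,Undated Pamphlet,,12
"""

def curate(rows, first_year, last_year):
    """Split rows into those inside the year range and those with no usable year.
    Rows with a valid year outside the range are simply left out."""
    kept, undated = [], []
    for row in rows:
        year = (row.get("year") or "").strip()
        if not year.isdigit():
            undated.append(row)      # worth noting as a data-quality issue
        elif first_year <= int(year) <= last_year:
            kept.append(row)
    return kept, undated

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept, undated = curate(rows, 1800, 1860)
print([r["identifier"] for r in kept])
print([r["identifier"] for r in undated])
```

Keeping the undated records in a separate list, rather than silently dropping them, leaves a record of what the derived data-set does not cover.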

Editor's notes

  1. 23 seconds (71 words) Though the project focusses on working and communicating with Digital Humanities and Digital Scholarship researchers, we have also engaged with amazing Artists, Librarians, Curators, Educators, Entrepreneurs, Archivists, Software Developers and other innovators. Hopefully, I will show you<CLICK> some inspirational examples of work they have done which have used our digital collections.<CLICK> I will also reflect on our experiences, challenges and lessons we have learned working with some amazing and pioneering people.
  2. 85 seconds The picture you can see is inside the main building in London; it’s the King’s Library, King George the Third’s personal library! Sometimes known as the ‘stack’, I walk past this every day and I sometimes forget that the collections the British Library has are truly staggering! We currently estimate them to exceed <click>150 million items, representing every age of written civilisation and every known language. Our archives now contain the earliest surviving printed book in the world, the Diamond Sutra, written in Chinese and dating from 868 AD…. So some big numbers… Over …<click>14 million books <click>60 million patents <click>8 million stamps <click>4 million maps <click>3 million sound recordings <click>1.6 million music scores <click>over 0.3 million manuscripts <click>0.8 million serial titles (which are of course made up of many, many volumes/editions); this is where a lot of our content is, just in case you thought the numbers didn’t add up!
  3. 28 seconds (85 words) This is what I imagine it feels like for a researcher looking for our physical collections. <CLICK> Everything is on an industrial scale and it can feel overwhelming. It isn’t always straightforward to find our items, as there are many that are not on our digital library catalogue, e.g. still on card catalogues, and some items are in the secret and very secure parts of the Library, where you would need very special permission because, for example, the items are extremely valuable and fragile.
  4. 24 seconds (72 words) The BL are world renowned experts in digitising materials from our physical holdings. One common misconception that many people have is that much if not all of our collections are digitised. So, the actual proportion of our collections that are digitised surprises many<CLICK> The figure is around 3% of our physical collections.<CLICK> Much of our digitisation activity happens through partnerships with commercial, philanthropic, charitable and foundation partners<CLICK> What is for certain, is the amount we are digitising is increasing rapidly. Our new programme called Heritage Made Digital for example prioritises those collections for digitisation where there is a clear researcher demand.<CLICK> One important thing we have learned is that researchers need to take heed when doing research based on our digital collections, as they are rarely complete, having gaps and not necessarily being representative of our physical collections.
  5. 36 seconds (109 words) Our digital offering is perhaps like this.<CLICK> Imagine entering a boutique sweet shop. We have some lovely things to tempt you, but it’s much smaller than the hypermarket you just visited. The shopkeeper tells you there are some things behind the back door in a giant warehouse. However, you will need special access to enter that space. She also states that there are rooms in that warehouse where even she isn’t allowed to look. She isn’t even allowed to share the full list of stock, because there are items on there she may never be able to see because they were meant to be secret.
  6. 12 seconds (37 words). In another way, we are trying to match our audiences’ research needs and digital interests <CLICK> with the digital collections we have.<CLICK> It is at this intersection where Labs works best, and it usually starts with a conversation.
  7. 41 seconds (125 words) Our work in Labs has taught us that it always pays for researchers to know the back ‘story’ of a digital collection, especially if they want to use it for research and analysis.<CLICK> There are too many things to consider right now, but a few highlights are ‘are there gaps in the collection?’ and ‘can they still be accessed?’, but perhaps most important of all is whether the curator, or a human being who knows about the collection, is still around and could be asked about it. Our experience has told us that so much will probably be in their head that isn’t written down, information that could be vital, important and useful to know about before carrying out research or re-use.
  8. <click>The British Library faces many challenges of access to our Digital collections! <click> Sometimes digital content is only available onsite due to license restrictions, <click>or even only on a specific computer in a reading room! Technically there are very few reasons why digital content can’t be online, <click> though it might be too big or hasn’t been transferred from other digital storage media. <click>Sometimes access is through a paywall. Finally, <click>some content is in the happy sunny place: online, open and freely available. The real reasons why there are challenges to accessing digital content are of course human. They require different approaches from the Library and may often involve an honest, open dialogue and negotiation with the publishers. The Labs project has tried to address this problem by creating a ‘residency model’ for researchers to work intensively with a digital collection on-site, so as to not infringe access conditions. I will say more about this later.
  9. 24 seconds (72 words) Let’s look a little further at the types of interactions we have with our researchers. We have summarised these phases as ‘Exploration’ where people often ‘rethink’ their ideas of what they want to do with the data, ‘Query-Focused’ where they often have to iterate to come up with a realistic proposal of what they want to do and a ‘Wrap-up’ phase to end their project with us, if it is relevant.
  10. 970 files from a selection of 19th century newspaper titles from the BL corpus for us to correct using the overProof post-OCR correction software. The best way to measure the improvement made by the correction process is to compare the OCR'ed text and the automatically corrected text with a perfect correction made by a human (known as the "ground truth"). Hannah-Rose's 5 small human-corrected samples are shown as green dots. These are not only smaller than the other files, but their raw error rate is much lower at 13.3%. OverProof was measured as reducing this to 5.4%, a removal of almost 60% of errors. The red dotted-line indicates the correction "break-even" point: the further under the line, the better the quality of the document after correction. In the graph below, the grey line shows distribution of files across error rates before correction and the green line after correction.
  11. Posts small illustrations taken almost at random from the digitised book corpus to a Tumblr blog. This experiment with undirected engagement was a by-product of work to uncover the hidden wealth of illustrations within the digitised pages.
  12. 50 seconds Here is the anatomy of a Flickr record, importantly we have created links to many of the Library’s services <click>some of this lovely traffic is going back to the Library and hopefully generating more interest in our services, from downloading a pdf of the book to purchasing a high res scan of the image. <click>Tags are added from the original book record, including the approximate page number the image came from<click>users of Flickr can add their own tags, and I have mentioned they have already started doing it.
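The ‘almost 60%’ figure quoted in the overProof note above is the relative reduction in error rate: the share of the original errors that correction removed. A small worked check, using only the two percentages given in that note (the function name is made up for this example):

```python
def relative_error_reduction(before_pct, after_pct):
    """Share of the original errors removed by correction, as a percentage."""
    return (before_pct - after_pct) / before_pct * 100

# Raw OCR error rate 13.3%, error rate after overProof correction 5.4%.
reduction = relative_error_reduction(13.3, 5.4)
print(f"{reduction:.1f}% of errors removed")
```

Note that this is different from the absolute change, which is 13.3 - 5.4 = 7.9 percentage points.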