1. Bridging research and collections
Vyacheslav Tykhonov - Software Developer
vty@iisg.nl
http://www.linkedin.com/in/vyacheslavtikhonov
Jerry de Vries - Information Analyst
jvr@iisg.nl
http://nl.linkedin.com/pub/jerry-de-vries/13/751/537
2. Bridging research and collections
25/03/2013
This presentation
• Mission statement of IISH
• Adjusting the ICT strategy
• Requirements for software development
• Solutions
– Demos / Proofs of Concept (POC) of projects & tools
• Questions
3. Bridging research and collections
Mission statement
The IISH conducts historical research on labour relations at a
global scale and to this end collects data, which are made
available to other researchers as well
• Do research
• Search, use, visualize and update data
• Collect and preserve data
• Make data available to researchers and the public
4. Bridging research and collections
What is our data?
• Metadata (describing data and collections)
• Scans / Full-text
• Image, sound, movie, books & serials
• Datasets
• Aggregation (Metadata, full-text papers and datasets)
• Analogue, digitized and digital
5. Bridging research and collections
What are our target groups?
We are listening to our target groups:
• Researchers
• Collectors
• Public Audience
We are collecting all ideas and requirements from you!
6. Bridging research and collections
Historical Research Methodology
What is historical research at IISH?
1. Formulation of the research question
2. Data collection and/or literature review
3. Evaluation of materials
4. Data analysis
5. Write and publish articles
6. Sharing datasets
7. Bridging research and collections
Data Collecting Methodology
What is collecting at IISH?
• Collecting data
• Storing data
• Preservation
• Digitization/scanning
• OCR/full text
• Metadata/MARC21/Indexing
• Make data publicly available in a digital infrastructure
8. Bridging research and collections
Customer Development Methodology
What is software development at IISH?
Our target groups share with us:
• Requirements
• Experiments
• Insights and ideas
1. Create a prototype based on requirements and ideas
2. Present prototypes of software tools to our target groups
3. Improve the software tools based on feedback from our target
groups
9. Bridging research and collections
Where? Who?
[Diagram: IISH, CODI, Research, WE, General public]
10. Bridging research and collections
Mission statement
The IISH conducts historical research on labour relations at a
global scale and to this end collects data, which are made
available to other researchers as well
Let’s now look into collecting first!
11. Bridging research and collections
Typical collector requirements
• Describe, index and store Metadata in a digital library
system
• Improve Metadata
– Based on computer-based analysis and Natural Language Processing tools
• Link Metadata from IISH to other Metadata systems
• Search and discover digitized and born-digital materials
• Transform Metadata into research data (datasets)
12. Bridging research and collections
Indexing
Extract entities and store them as terms in Metadata
[Diagram: Collections and Scans feed Metadata; indexing is Manual (CODI) or Automatic (HiTIME)]
13. Bridging research and collections
Automatic indexing example
Input from scan:
Founded at the initiative of Vladimir I. Lenin in 1901 in Switzerland after
the Second Congress of the RSDRP in 1903 the League became the
main bulwark of Menshevism abroad until it disbanded in 1905.
Metadata linked with Evergreen Authorities:
Vladimir;;VladiMir;;566353;;Personal Name
Congress;;Video Congress;;316063;;Meeting Name
Lenin;;Lenin;;570134;;Uniform Title
Switzerland;;Switzerland;;350823;;Geographic Name
Second Congress of the RSDRP;;411162;;Meeting Name
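The output lines above look like ";;"-separated records. A minimal Python sketch for reading them follows. The field meanings (matched text, authority heading, authority id, entity type) are assumptions inferred from the example, not a documented format, and the three-field case is assumed to be a match without a separate heading.

```python
from typing import NamedTuple, Optional


class AuthorityMatch(NamedTuple):
    matched_text: str        # text found in the scan (assumed meaning)
    heading: Optional[str]   # authority heading, if the line carries one
    authority_id: int        # numeric id of the authority record
    entity_type: str         # e.g. "Personal Name", "Geographic Name"


def parse_match(line: str) -> AuthorityMatch:
    """Split one ';;'-separated line into named fields."""
    fields = line.strip().split(";;")
    if len(fields) == 4:
        text, heading, auth_id, kind = fields
    elif len(fields) == 3:
        # Assumed: the heading column is missing on this line.
        text, auth_id, kind = fields
        heading = None
    else:
        raise ValueError(f"unexpected number of fields: {line!r}")
    return AuthorityMatch(text, heading, int(auth_id), kind)


if __name__ == "__main__":
    print(parse_match("Switzerland;;Switzerland;;350823;;Geographic Name"))
```

Keeping the entity type next to each match makes it easy to spot questionable links, such as "Lenin" matched as a Uniform Title, before they are stored as Metadata.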
14. Bridging research and collections
Solutions for collectors
• Evergreen Library System (Product)
• Metadata management (Product)
• Metadata reports (Product)
• Evergreen OAI protocol (Product)
• Text analyzing tools for collectors & researchers (Prototype API)
15. Bridging research and collections
Project overview: Evergreen
Collectors Metadata Storage System
• Library solution for storing Metadata in the MARC21
standard
• Open-source license (free to use)
• Flexible and powerful solution, works with millions of
MARC records
• Export of all data via the OAI-PMH protocol to link data with
other systems
• Visualization tools to present data online
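The OAI-PMH export mentioned above can be illustrated with a short Python sketch that builds a ListRecords request and reads record identifiers from a response. The base URL is a placeholder, not the IISH endpoint, and "oai_dc" is used only because every OAI-PMH repository must support it; the prefix that Evergreen exposes for MARC data is not stated in the slides.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL.

    A resumptionToken is an exclusive argument in OAI-PMH, so it
    replaces metadataPrefix when a harvest is being continued.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def record_identifiers(xml_text):
    """Return the identifier of every record header in a response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]


if __name__ == "__main__":
    # Hypothetical endpoint, for illustration only.
    print(list_records_url("https://example.org/oai"))
```

A harvester would fetch the URL, store the records, and repeat with the resumptionToken from each response until none is returned.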
18. Bridging research and collections
Mission statement
Remember:
The IISH conducts historical research on labour relations at a
global scale and to this end collects data, which are made
available to other researchers as well
Let’s do some research!
19. Bridging research and collections
What is historical research?
The process of systematically examining past events to give
an account of what has happened in the past
Why do we conduct historical research?
• To uncover the unknown
• To answer questions
• To identify the relationship that the past has to the present
• To record and evaluate the accomplishments of
individuals, agencies, or institutions
• To assist in understanding the culture in which we live
And much, much, much more…
20. Bridging research and collections
Typical research requirements
Access to information
• Find digital materials relevant for research
• Search information stored in Metadata
• Poor quality of Metadata = Poor quality of research
• Search, filter, navigate and summarize data
• Analyze papers for research online
• Link materials relevant to research from other sources
• Collection descriptions are relevant to the topic of
research, but papers aren't
21. Bridging research and collections
Typical research requirements
Datasets
Store datasets in a digital infrastructure to answer research
questions
• Use best practices for visualizing datasets
• Generate custom datasets for new research
• Combine/compare datasets in time and/or place
• Share datasets with other researchers (collaboration and
crowdsourcing)
22. Bridging research and collections
General goal of research
All Data → Possibly relevant Data → Definitely relevant Data → Structured Knowledge
23. Bridging research and collections
Sharing your research
• Publish scientific articles on websites relevant to the topic
of research
• Share research datasets with other researchers
• Generate charts and maps in real-time in digital
infrastructure based on live data
• Publish charts and maps in articles and share them on Wikipedia and other popular websites
• Make biographies of famous people more attractive with
timelines of visual materials
24. Bridging research and collections
Indexing (keywords)
For researchers
• Researchers publish keywords at the beginning of
every research paper
• Keyword in research paper = Index term in Metadata
• Keywords from papers are stored as Metadata in the library
system
• Keywords are used in text analysis systems to create links
with other papers on the same topic
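The keyword-based linking described above can be sketched in Python as an inverted index from keyword to papers. This is only an illustration of the idea, not the logic of any IISH system; the paper ids and keywords are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def link_by_keywords(papers):
    """Link papers that share at least one keyword.

    papers: dict mapping a paper id to an iterable of keywords.
    Returns a dict mapping (paper_a, paper_b) to the set of shared
    keywords, compared case-insensitively.
    """
    # Inverted index: normalized keyword -> ids of papers using it.
    index = defaultdict(set)
    for paper_id, keywords in papers.items():
        for keyword in keywords:
            index[keyword.strip().lower()].add(paper_id)

    # Every pair of papers under the same keyword becomes a link.
    links = defaultdict(set)
    for keyword, paper_ids in index.items():
        for a, b in combinations(sorted(paper_ids), 2):
            links[(a, b)].add(keyword)
    return dict(links)


if __name__ == "__main__":
    # Hypothetical papers and keywords.
    print(link_by_keywords({
        "p1": ["Labour relations", "Strikes"],
        "p2": ["strikes", "Trade unions"],
        "p3": ["Migration"],
    }))
```

Because the same terms serve as index terms in Metadata, the same lookup could link a paper to catalogue records as well as to other papers.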
25. Bridging research and collections
Solutions for researchers
Data:
• Search engines (Prototype)
• Linked data (Product)
• Timelines (Prototype)
Datasets:
• Maps (Product)
• Charts (Prototype)
• Visual Library System (Prototype)
29. Bridging research and collections
Linked data for collectors
• 500,000+ authority records in the IISH collection
• Bibliographic records linked to authorities by collectors
• Link bibliographic records to authorities automatically in
real time with the Authority Linking Module
• Import Metadata from other sources (Google Books,
WorldCat, etc.) and link it with our authorities
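Automatic linking of this kind can be pictured as a lookup of normalized names against the authority file. The Python sketch below is an assumption about the general approach, not the actual logic of the Authority Linking Module; the two authority ids are taken from the indexing example earlier in this presentation.

```python
import unicodedata


def normalize(heading):
    """Fold case, strip accents and collapse whitespace."""
    text = unicodedata.normalize("NFKD", heading)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return " ".join(text.casefold().split())


def build_index(authorities):
    """authorities: dict of authority id -> heading."""
    return {normalize(heading): auth_id for auth_id, heading in authorities.items()}


def link_record(names, index):
    """Map each name in a bibliographic record to an authority id, or None."""
    return {name: index.get(normalize(name)) for name in names}


if __name__ == "__main__":
    index = build_index({350823: "Switzerland", 570134: "Lenin"})
    print(link_record(["switzerland", "Unknown Person"], index))
```

Names that return None are candidates for manual review by collectors rather than automatic linking.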
30. Bridging research and collections
Linked data for researchers
Metadata from IISH is available for harvesting:
• Search (search.socialhistory.org)
• OCLC's WorldCat
• Europeana
• Nederlab and other projects
Link authorities from Evergreen automatically to all other
systems to get more data for doing research
31. Bridging research and collections
Project overview: Clio Infra
• Datasets Storage System
• Online Visualization of Datasets:
– maps, charts, timelines
• Tools to compare data for different countries over time
• Export of custom datasets
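Comparing countries over time and exporting a custom dataset can be sketched in Python as follows. The (country, year, value) row layout and the CSV output are assumptions for illustration, not the Clio Infra data model, and the numbers in the usage example are invented placeholders.

```python
import csv
import io


def compare(rows, countries, start, end):
    """Select values for the given countries and year range.

    rows: iterable of (country, year, value) tuples.
    Returns {year: {country: value}} ordered by year.
    """
    table = {}
    for country, year, value in rows:
        if country in countries and start <= year <= end:
            table.setdefault(year, {})[country] = value
    return dict(sorted(table.items()))


def to_csv(table, countries):
    """Export the comparison with one row per year and one column per country."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["year", *countries])
    for year, values in table.items():
        # Missing observations become empty cells.
        writer.writerow([year, *(values.get(c, "") for c in countries)])
    return buf.getvalue()


if __name__ == "__main__":
    # Invented placeholder values, not real Clio Infra data.
    rows = [("NL", 1900, 1.0), ("NL", 1910, 2.0), ("BE", 1900, 3.0)]
    print(to_csv(compare(rows, ["NL", "BE"], 1900, 1910), ["NL", "BE"]))
```

The same year-by-country table can feed a chart or map directly, so the export and the visualization share one selection step.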
32. Bridging research and collections
Project overview: HiTIME
Text Analyzing System
• Matching/linking of authority records from other systems for Named Entities:
– Locations
– Persons
– Organizations
– Dates
• NLP tools to recognize unknown entities
• Export to library as Metadata
• Visualization of Metadata on timelines, maps, charts
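As a small illustration of the "Dates" entity type and its use on timelines, the Python sketch below pulls four-digit years out of free text with a regular expression. It is a deliberately simple stand-in, not the method HiTIME uses; the year range 1500 to 2099 is an arbitrary assumption.

```python
import re

# Four-digit years between 1500 and 2099 (assumed range).
YEAR = re.compile(r"\b(?:1[5-9]\d{2}|20\d{2})\b")


def years_in(text):
    """Return the distinct years mentioned in the text, in ascending order."""
    return sorted({int(match) for match in YEAR.findall(text)})


if __name__ == "__main__":
    sample = (
        "Founded at the initiative of Vladimir I. Lenin in 1901 in Switzerland "
        "after the Second Congress of the RSDRP in 1903 the League became the "
        "main bulwark of Menshevism abroad until it disbanded in 1905."
    )
    print(years_in(sample))
```

The extracted years can be stored as date terms in Metadata and used to place a record on a timeline.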
33. Bridging research and collections
Workflow
[Diagram: Presentation, Research, Metadata system, Storage]
34. Bridging research and collections
What have we seen today?
We are here for you and together we work on:
• Search & Discovery
Metadata searching and filtering, Full-Text Search engines,
Linked Data tools, Research Indexes (Controlled
Vocabularies)
• Visualization
Charts, graphs, timelines, network connection tools
• Analysis
Data Mining, Summarization, Topic Modeling, Tools for
Datasets
35. Bridging research and collections
Questions?
• Feel free to ask now
• Ideas and questions can be sent to us by email
Vyacheslav Tykhonov - Software Developer
vty@iisg.nl
http://www.linkedin.com/in/vyacheslavtikhonov
Jerry de Vries - Information Analyst
jvr@iisg.nl
http://nl.linkedin.com/pub/jerry-de-vries/13/751/537