SlideShare une entreprise Scribd logo
1  sur  20
Télécharger pour lire hors ligne
A Library Data Management Platform 
Based on Linked Open Data 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
25 November, 2014 
Jens Mittelbach | Robert Glaß
A Library Data Management Platform Based on Linked Open Data 
 Back in Those Days 
 The Age of Discovery 
 Library Data Management 
 Qualify, Link and Free Your Data: D:SWARM 
 Live Demo 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
D:SWARM 
25 November 2014 | Page 2 
Dr. Jens Mittelbach
Data Heterogeneity 
 Multiple individual data silos 
• ILS, document repositories, databases, … 
 Data saved in heterogeneous formats 
• MAB, MARC21, … 
 Each data silo gets processed individually 
• Multiple admin interfaces 
• Multiple search interfaces 
• Data unrelated to one another 
 Comprehensive view of resources almost 
impossible (for users and librarians) 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
Back in Those Days … 
09 December 2014 | Page 3 
Dr. Jens Mittelbach
Data Normalization 
 More comprehensive view of 
resources for users, but no real 
discovery/exploration 
 Data gets normalized into one 
storage but not integrated 
 Data available in record-oriented 
structures 
• External data (e.g. GND) has to 
be squeezed in the record 
• Metadata records are 
independent of each other 
• No explicit semantic quality of 
data 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
The Age of “Discovery” 
09 December 2014 | Page 4 
Dr. Jens Mittelbach
Library Data Management 
What Libraries Actually Need 
 Get rid of data silos 
• Open formats for exchange 
 Lossless data integration instead of reductive 
normalization 
 Data integration with entity level granularity 
• Get rid of pre-compiled data records 
 Focus on linking entities/objects: 
• Graph structures creating the knowledge 
graph 
 Stick to quality policy of libraries 
• Versioning and provenance of data 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
09 December 2014 | Page 5 
Dr. Jens Mittelbach 
Library Data
Library Data Management 
What Should Library Data Actually Look Like? 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
09 December 2014 | Page 6 
Dr. Jens Mittelbach
Library Data Management 
Whose Job Is Library Data Integration? 
 Data integration should be done by domain experts 
• Librarians, not IT staff (IT always understaffed) 
• Programming skills should not be a requirement 
• Good user experience is a prerequisite for adoption 
 Example driven modelling approach 
 Value created in the community should be reusable 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
09 December 2014 | Page 7 
Dr. Jens Mittelbach
Library Data Management 
What Tools Do We Need? 
Our Approach: An Open Source Data Management Platform 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
09 December 2014 | Page 8 
Dr. Jens Mittelbach
Library Data Management 
How Can Data Integration Be Done? 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
09 December 2014 | Page 9 
Dr. Jens Mittelbach
Qualify, Link and Free Your Data: D:SWARM 
Who’s behind this Project? 
 Collaborative development team of SLUB Dresden and Avantgarde Labs GmbH 
 Started work in June 2013 
 Funded from the European Regional Development Fund (ERDF) 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
09 December 2014 | Page 10 
Dr. Jens Mittelbach
Qualify, Link and Free Your Data: D:SWARM 
Our Challenge: Existing Data Formats: MAB, MARC 
• „selection of keywords“ 
• Relevant MAB fields are 902x, 907x, 912x, 917x, 922x. 
• These fields have subfields a, b, c, … coded with 
further information (type of keyword, person, time, 
place, concept...) 
• From field 902x to field 922x we have to check 
• If in subfield "a" there is one of these strings 
(800|801|820|830|845|850|860|870|880)? 
• If so, is there one of these strings (c|g|k|p|s|t|z) in 
subfield "b“? 
• If so, the value in subfield "c“ qualifies as a keyword 
• Keyword needs to be trimmed (which is the easiest 
part) 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
09 December 2014 | Page 11 
Dr. Jens Mittelbach
Qualify, Link and Free Your Data: D:SWARM 
Our Challenge: Existing Tools: Talend 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
09 December 2014 | Page 12 
Dr. Jens Mittelbach
Qualify, Link and Free Your Data: D:SWARM 
Our Challenge: Existing Tools: Open Refine 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
09 December 2014 | Page 13 
Dr. Jens Mittelbach
Qualify, Link and Free Your Data: D:SWARM 
What Is D:SWARM? 
 Graphical web based ETL modelling tool that serves to: 
• import data from heterogeneous sources with different formats 
• map input to output schemata and design transformation workflows 
• load transformed data into property graph database 
 With additional functionalities: 
• Exporting of data models as RDF 
• Sharing mappings and transformation workflows 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
09 December 2014 | Page 14 
Dr. Jens Mittelbach
Qualify, Link and Free Your Data: D:SWARM 
How Does D:SWARM Work? 
 Modelling GUI and job repository 
 Execution environment 
• Operational data from heterogeneous data sources (ILS, OAI-PMH, CSV …) 
get processed according to the transformation logics defined in modelling GUI 
 Admin centre 
• Scheduling & execution planning 
• Monitoring of system (data ingest, processing, errors) 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
09 December 2014 | Page 15 
Dr. Jens Mittelbach
Qualify, Link and Free Your Data: D:SWARM 
Why a Property Graph? 
 Node (S) – Edge (P) – Node (O) 
 Extension of RDF data model - each element can be 
endowed with additional information (key : value) 
• Version number 
• Provenance information 
• Type information 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
09 December 2014 | Page 16 
Dr. Jens Mittelbach
Qualify, Link and Free Your Data: D:SWARM 
Intermediate Results as of November 2014 
 Modelling GUI in 2nd version 
• Available file importer: XML, CSV, MABXML 
• Simple schema editor & graphic schema mapper 
• Transformation workflow designer & filter (Metafacture) 
 Execution of mappings and transformations in modelling GUI 
 Persistence in graph database (Neo4J) 
 Exporter: Turtle, N-Quads, N3, … 
 Publication under Open Source licence (Apache 2): https://github.com/dswarm 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
9 December 2014 | Page 17 
Dr. Jens Mittelbach
Qualify, Link and Free Your Data: D:SWARM 
Live Demo 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
09 December 2014 | Page 18 
Dr. Jens Mittelbach 
http://demo.dswarm.org
Qualify, Link and Free Your Data: D:SWARM 
Our Next Steps 
 Provision of URI templates for resource matching and linking 
 Scalable execution engine for production mode 
 Extension of transformation function set 
 Extension of importers 
 Implementation of an administration centre 
 Deduplication and FRBRization 
 Integration of SLUBsemantics Enrichtment Service 
 Implementation of sharing features 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
09 December 2014 | Page 19 
Dr. Jens Mittelbach
Qualify, Link and Free Your Data: D:SWARM 
Your Next Steps 
 Follow us on twitter.com/dswarm or www.dswarm.org or github.com/dswarm 
 Try it out and get in contact with us 
• http://demo.dswarm.org 
• https://github.com/dswarm/dswarm-documentation/wiki 
• team@dswarm.org 
 Help us prioritize our backlog 
• https://jira.slub-dresden.de/ 
 Fork us on github.com/dswarm 
SLUB Dresden slub-dresden.de 
CC BY-SA 4.0 
Avantgarde Labs 
Robert Glaß 
09 December 2014 | Page 20 
Dr. Jens Mittelbach

Contenu connexe

Tendances

Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...
Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...
Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...Peter Löwe
 
Linked Data for Architecture, Engineering and Construction (AEC)
Linked Data for Architecture, Engineering and Construction (AEC)Linked Data for Architecture, Engineering and Construction (AEC)
Linked Data for Architecture, Engineering and Construction (AEC)Stefan Dietze
 
Web at 25 - Ontos Linked Open Data
Web at 25 - Ontos Linked Open DataWeb at 25 - Ontos Linked Open Data
Web at 25 - Ontos Linked Open DataAI4BD GmbH
 
WG5: A data wrangling experiment
WG5: A data wrangling experimentWG5: A data wrangling experiment
WG5: A data wrangling experimentWARCnet
 
Open Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked DataOpen Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked DataPascal-Nicolas Becker
 
Réseaux de bibliothèques à l'ère du cloud : que partager ? comment travailler...
Réseaux de bibliothèques à l'ère du cloud : que partager ? comment travailler...Réseaux de bibliothèques à l'ère du cloud : que partager ? comment travailler...
Réseaux de bibliothèques à l'ère du cloud : que partager ? comment travailler...Michèle Furer-Benedetti
 
Session 1.6 slovak public metadata governance and management based on linke...
Session 1.6   slovak public metadata governance and management based on linke...Session 1.6   slovak public metadata governance and management based on linke...
Session 1.6 slovak public metadata governance and management based on linke...semanticsconference
 
Wednesday 6 May: Hand me the data! What you should know as a humanities resea...
Wednesday 6 May: Hand me the data! What you should know as a humanities resea...Wednesday 6 May: Hand me the data! What you should know as a humanities resea...
Wednesday 6 May: Hand me the data! What you should know as a humanities resea...WARCnet
 
Elasticsearch: a key element of Invenio 3
Elasticsearch: a key element of Invenio 3Elasticsearch: a key element of Invenio 3
Elasticsearch: a key element of Invenio 3Johnny Mariethoz
 
Open Data Masterclass - Europeana and LOD
Open Data Masterclass - Europeana and LODOpen Data Masterclass - Europeana and LOD
Open Data Masterclass - Europeana and LODAntoine Isaac
 
Making social science more reproducible by encapsulating access to linked data
Making social science more reproducible by encapsulating access to linked dataMaking social science more reproducible by encapsulating access to linked data
Making social science more reproducible by encapsulating access to linked dataAlbert Meroño-Peñuela
 
SWIB14 Weaving repository contents into the Semantic Web
SWIB14 Weaving repository contents into the Semantic WebSWIB14 Weaving repository contents into the Semantic Web
SWIB14 Weaving repository contents into the Semantic WebPascal-Nicolas Becker
 
Introduction to Annotation, Content Search, and IIIF Authentication from the ...
Introduction to Annotation, Content Search, and IIIF Authentication from the ...Introduction to Annotation, Content Search, and IIIF Authentication from the ...
Introduction to Annotation, Content Search, and IIIF Authentication from the ...Glen Robson
 
DBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of DataDBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of DataSebastian Hellmann
 
Mind the gap! Reflections on the state of repository data harvesting
Mind the gap! Reflections on the state of repository data harvestingMind the gap! Reflections on the state of repository data harvesting
Mind the gap! Reflections on the state of repository data harvestingSimeon Warner
 
EuropeanaTech 2018: A distributed network of digital heritage information
EuropeanaTech 2018: A distributed network of digital heritage informationEuropeanaTech 2018: A distributed network of digital heritage information
EuropeanaTech 2018: A distributed network of digital heritage informationEnno Meijers
 
Session 1.2 improving access to digital content by semantic enrichment
Session 1.2   improving access to digital content by semantic enrichmentSession 1.2   improving access to digital content by semantic enrichment
Session 1.2 improving access to digital content by semantic enrichmentsemanticsconference
 

Tendances (20)

Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...
Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...
Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...
 
Linked Data for Architecture, Engineering and Construction (AEC)
Linked Data for Architecture, Engineering and Construction (AEC)Linked Data for Architecture, Engineering and Construction (AEC)
Linked Data for Architecture, Engineering and Construction (AEC)
 
Web at 25 - Ontos Linked Open Data
Web at 25 - Ontos Linked Open DataWeb at 25 - Ontos Linked Open Data
Web at 25 - Ontos Linked Open Data
 
WG5: A data wrangling experiment
WG5: A data wrangling experimentWG5: A data wrangling experiment
WG5: A data wrangling experiment
 
Open Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked DataOpen Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked Data
 
Réseaux de bibliothèques à l'ère du cloud : que partager ? comment travailler...
Réseaux de bibliothèques à l'ère du cloud : que partager ? comment travailler...Réseaux de bibliothèques à l'ère du cloud : que partager ? comment travailler...
Réseaux de bibliothèques à l'ère du cloud : que partager ? comment travailler...
 
Session 1.6 slovak public metadata governance and management based on linke...
Session 1.6   slovak public metadata governance and management based on linke...Session 1.6   slovak public metadata governance and management based on linke...
Session 1.6 slovak public metadata governance and management based on linke...
 
Wednesday 6 May: Hand me the data! What you should know as a humanities resea...
Wednesday 6 May: Hand me the data! What you should know as a humanities resea...Wednesday 6 May: Hand me the data! What you should know as a humanities resea...
Wednesday 6 May: Hand me the data! What you should know as a humanities resea...
 
Elasticsearch: a key element of Invenio 3
Elasticsearch: a key element of Invenio 3Elasticsearch: a key element of Invenio 3
Elasticsearch: a key element of Invenio 3
 
Open Data Masterclass - Europeana and LOD
Open Data Masterclass - Europeana and LODOpen Data Masterclass - Europeana and LOD
Open Data Masterclass - Europeana and LOD
 
Making social science more reproducible by encapsulating access to linked data
Making social science more reproducible by encapsulating access to linked dataMaking social science more reproducible by encapsulating access to linked data
Making social science more reproducible by encapsulating access to linked data
 
SWIB14 Weaving repository contents into the Semantic Web
SWIB14 Weaving repository contents into the Semantic WebSWIB14 Weaving repository contents into the Semantic Web
SWIB14 Weaving repository contents into the Semantic Web
 
Linked Data
Linked DataLinked Data
Linked Data
 
Introduction to Annotation, Content Search, and IIIF Authentication from the ...
Introduction to Annotation, Content Search, and IIIF Authentication from the ...Introduction to Annotation, Content Search, and IIIF Authentication from the ...
Introduction to Annotation, Content Search, and IIIF Authentication from the ...
 
Wikidata
WikidataWikidata
Wikidata
 
DBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of DataDBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of Data
 
Mind the gap! Reflections on the state of repository data harvesting
Mind the gap! Reflections on the state of repository data harvestingMind the gap! Reflections on the state of repository data harvesting
Mind the gap! Reflections on the state of repository data harvesting
 
Finding Data Sets
Finding Data SetsFinding Data Sets
Finding Data Sets
 
EuropeanaTech 2018: A distributed network of digital heritage information
EuropeanaTech 2018: A distributed network of digital heritage informationEuropeanaTech 2018: A distributed network of digital heritage information
EuropeanaTech 2018: A distributed network of digital heritage information
 
Session 1.2 improving access to digital content by semantic enrichment
Session 1.2   improving access to digital content by semantic enrichmentSession 1.2   improving access to digital content by semantic enrichment
Session 1.2 improving access to digital content by semantic enrichment
 

En vedette

Schlanke Discovery-Lösung auf Basis von TYPO3. Der neue Bibliothekskatalog de...
Schlanke Discovery-Lösung auf Basis von TYPO3. Der neue Bibliothekskatalog de...Schlanke Discovery-Lösung auf Basis von TYPO3. Der neue Bibliothekskatalog de...
Schlanke Discovery-Lösung auf Basis von TYPO3. Der neue Bibliothekskatalog de...Felix Lohmeier
 
Open Source Bibliotheksmanagement (mit D:SWARM + AMSL)
Open Source Bibliotheksmanagement (mit D:SWARM + AMSL)Open Source Bibliotheksmanagement (mit D:SWARM + AMSL)
Open Source Bibliotheksmanagement (mit D:SWARM + AMSL)Felix Lohmeier
 
Setting Yourself up for Success: Building an Analytics Schema and Data Dictio...
Setting Yourself up for Success: Building an Analytics Schema and Data Dictio...Setting Yourself up for Success: Building an Analytics Schema and Data Dictio...
Setting Yourself up for Success: Building an Analytics Schema and Data Dictio...Kissmetrics on SlideShare
 
Atelier "Comment Epater votre direction avec votre projet DMP" avec TagComman...
Atelier "Comment Epater votre direction avec votre projet DMP" avec TagComman...Atelier "Comment Epater votre direction avec votre projet DMP" avec TagComman...
Atelier "Comment Epater votre direction avec votre projet DMP" avec TagComman...Antoine Gay
 
MarketView Marketing Database Platform | Data Services, Inc.
MarketView Marketing Database Platform | Data Services, Inc.MarketView Marketing Database Platform | Data Services, Inc.
MarketView Marketing Database Platform | Data Services, Inc.Data Services, Inc.
 
Falcon - Data Management Platform on Hadoop (Beyond ETL)
Falcon - Data Management Platform on Hadoop (Beyond ETL)Falcon - Data Management Platform on Hadoop (Beyond ETL)
Falcon - Data Management Platform on Hadoop (Beyond ETL)DataWorks Summit
 
Dmp essential
Dmp essentialDmp essential
Dmp essentialBe2See.
 
What Is a Data Management Platform and Why You Should Care?
What Is a Data Management Platform and Why You Should Care?What Is a Data Management Platform and Why You Should Care?
What Is a Data Management Platform and Why You Should Care?IgnitionOne
 
Bluekai: Data Management Platforms (dmp) for Publishers
Bluekai: Data Management Platforms (dmp) for PublishersBluekai: Data Management Platforms (dmp) for Publishers
Bluekai: Data Management Platforms (dmp) for PublishersBrian Crotty
 
DMP Data Management Platform
DMP Data Management PlatformDMP Data Management Platform
DMP Data Management PlatformAvinash Tiwary
 
The DMP 101 - Data Management Platforms Explained
The DMP 101 - Data Management Platforms ExplainedThe DMP 101 - Data Management Platforms Explained
The DMP 101 - Data Management Platforms ExplainedEddy Widerker
 

En vedette (11)

Schlanke Discovery-Lösung auf Basis von TYPO3. Der neue Bibliothekskatalog de...
Schlanke Discovery-Lösung auf Basis von TYPO3. Der neue Bibliothekskatalog de...Schlanke Discovery-Lösung auf Basis von TYPO3. Der neue Bibliothekskatalog de...
Schlanke Discovery-Lösung auf Basis von TYPO3. Der neue Bibliothekskatalog de...
 
Open Source Bibliotheksmanagement (mit D:SWARM + AMSL)
Open Source Bibliotheksmanagement (mit D:SWARM + AMSL)Open Source Bibliotheksmanagement (mit D:SWARM + AMSL)
Open Source Bibliotheksmanagement (mit D:SWARM + AMSL)
 
Setting Yourself up for Success: Building an Analytics Schema and Data Dictio...
Setting Yourself up for Success: Building an Analytics Schema and Data Dictio...Setting Yourself up for Success: Building an Analytics Schema and Data Dictio...
Setting Yourself up for Success: Building an Analytics Schema and Data Dictio...
 
Atelier "Comment Epater votre direction avec votre projet DMP" avec TagComman...
Atelier "Comment Epater votre direction avec votre projet DMP" avec TagComman...Atelier "Comment Epater votre direction avec votre projet DMP" avec TagComman...
Atelier "Comment Epater votre direction avec votre projet DMP" avec TagComman...
 
MarketView Marketing Database Platform | Data Services, Inc.
MarketView Marketing Database Platform | Data Services, Inc.MarketView Marketing Database Platform | Data Services, Inc.
MarketView Marketing Database Platform | Data Services, Inc.
 
Falcon - Data Management Platform on Hadoop (Beyond ETL)
Falcon - Data Management Platform on Hadoop (Beyond ETL)Falcon - Data Management Platform on Hadoop (Beyond ETL)
Falcon - Data Management Platform on Hadoop (Beyond ETL)
 
Dmp essential
Dmp essentialDmp essential
Dmp essential
 
What Is a Data Management Platform and Why You Should Care?
What Is a Data Management Platform and Why You Should Care?What Is a Data Management Platform and Why You Should Care?
What Is a Data Management Platform and Why You Should Care?
 
Bluekai: Data Management Platforms (dmp) for Publishers
Bluekai: Data Management Platforms (dmp) for PublishersBluekai: Data Management Platforms (dmp) for Publishers
Bluekai: Data Management Platforms (dmp) for Publishers
 
DMP Data Management Platform
DMP Data Management PlatformDMP Data Management Platform
DMP Data Management Platform
 
The DMP 101 - Data Management Platforms Explained
The DMP 101 - Data Management Platforms ExplainedThe DMP 101 - Data Management Platforms Explained
The DMP 101 - Data Management Platforms Explained
 

Similaire à d:swarm - A Library Data Management Platform Based on a Linked Open Data Approach

FOSDEM 2014: Social Network Benchmark (SNB) Graph Generator
FOSDEM 2014:  Social Network Benchmark (SNB) Graph GeneratorFOSDEM 2014:  Social Network Benchmark (SNB) Graph Generator
FOSDEM 2014: Social Network Benchmark (SNB) Graph GeneratorLDBC council
 
Neo4j GraphTalk Basel - Building intelligent Software with Graphs
Neo4j GraphTalk Basel - Building intelligent Software with GraphsNeo4j GraphTalk Basel - Building intelligent Software with Graphs
Neo4j GraphTalk Basel - Building intelligent Software with GraphsNeo4j
 
Data-as-a-Service: DataGraft
Data-as-a-Service: DataGraftData-as-a-Service: DataGraft
Data-as-a-Service: DataGraftdapaasproject
 
Semantische Technologien (nicht nur) für die verbesserte Suche in SharePoint
Semantische Technologien (nicht nur) für die verbesserte Suche in SharePointSemantische Technologien (nicht nur) für die verbesserte Suche in SharePoint
Semantische Technologien (nicht nur) für die verbesserte Suche in SharePointDIQA Projektmanagement GmbH
 
Neo4j GraphTalk Düsseldorf - Building intelligent solutions with Graphs
Neo4j GraphTalk Düsseldorf - Building intelligent solutions with GraphsNeo4j GraphTalk Düsseldorf - Building intelligent solutions with Graphs
Neo4j GraphTalk Düsseldorf - Building intelligent solutions with GraphsNeo4j
 
Introduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & BahrainIntroduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & BahrainNeo4j
 
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAlberto Diaz Martin
 
Neo4j GraphTalk Oslo - Introduction to Graphs
Neo4j GraphTalk Oslo - Introduction to GraphsNeo4j GraphTalk Oslo - Introduction to Graphs
Neo4j GraphTalk Oslo - Introduction to GraphsNeo4j
 
Graphs for Enterprise Architects
Graphs for Enterprise ArchitectsGraphs for Enterprise Architects
Graphs for Enterprise ArchitectsNeo4j
 
GraphTalk Wien - Intelligente Lösungen mit Graphen erstellen
GraphTalk Wien - Intelligente Lösungen mit Graphen erstellenGraphTalk Wien - Intelligente Lösungen mit Graphen erstellen
GraphTalk Wien - Intelligente Lösungen mit Graphen erstellenNeo4j
 
Hadoop meets Agile! - An Agile Big Data Model
Hadoop meets Agile! - An Agile Big Data ModelHadoop meets Agile! - An Agile Big Data Model
Hadoop meets Agile! - An Agile Big Data ModelUwe Printz
 
Southwickc lampert lodlam_training
Southwickc lampert lodlam_trainingSouthwickc lampert lodlam_training
Southwickc lampert lodlam_trainingssouthwick
 
IRMAC April 2015 - DMBOK2 DWBI New Content
IRMAC April 2015 - DMBOK2 DWBI New ContentIRMAC April 2015 - DMBOK2 DWBI New Content
IRMAC April 2015 - DMBOK2 DWBI New ContentMartin Sykora
 
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02BIWUG
 
How to build your own Delve: combining machine learning, big data and SharePoint
How to build your own Delve: combining machine learning, big data and SharePointHow to build your own Delve: combining machine learning, big data and SharePoint
How to build your own Delve: combining machine learning, big data and SharePointJoris Poelmans
 
Chocolate Flavoured Data Science
Chocolate Flavoured Data ScienceChocolate Flavoured Data Science
Chocolate Flavoured Data ScienceThilo Stadelmann
 
Data Visualisation Workshop (GovHack Brisbane 2014)
Data Visualisation Workshop (GovHack Brisbane 2014)Data Visualisation Workshop (GovHack Brisbane 2014)
Data Visualisation Workshop (GovHack Brisbane 2014)Anna Gerber
 
Big Data for Data Scientists - Info Session
Big Data for Data Scientists - Info SessionBig Data for Data Scientists - Info Session
Big Data for Data Scientists - Info SessionWeCloudData
 
Using graphs for recommendations
Using graphs for recommendationsUsing graphs for recommendations
Using graphs for recommendationsRik Van Bruggen
 
GraphTour 2020 - Neo4j: What's New?
GraphTour 2020 - Neo4j: What's New?GraphTour 2020 - Neo4j: What's New?
GraphTour 2020 - Neo4j: What's New?Neo4j
 

Similaire à d:swarm - A Library Data Management Platform Based on a Linked Open Data Approach (20)

FOSDEM 2014: Social Network Benchmark (SNB) Graph Generator
FOSDEM 2014:  Social Network Benchmark (SNB) Graph GeneratorFOSDEM 2014:  Social Network Benchmark (SNB) Graph Generator
FOSDEM 2014: Social Network Benchmark (SNB) Graph Generator
 
Neo4j GraphTalk Basel - Building intelligent Software with Graphs
Neo4j GraphTalk Basel - Building intelligent Software with GraphsNeo4j GraphTalk Basel - Building intelligent Software with Graphs
Neo4j GraphTalk Basel - Building intelligent Software with Graphs
 
Data-as-a-Service: DataGraft
Data-as-a-Service: DataGraftData-as-a-Service: DataGraft
Data-as-a-Service: DataGraft
 
Semantische Technologien (nicht nur) für die verbesserte Suche in SharePoint
Semantische Technologien (nicht nur) für die verbesserte Suche in SharePointSemantische Technologien (nicht nur) für die verbesserte Suche in SharePoint
Semantische Technologien (nicht nur) für die verbesserte Suche in SharePoint
 
Neo4j GraphTalk Düsseldorf - Building intelligent solutions with Graphs
Neo4j GraphTalk Düsseldorf - Building intelligent solutions with GraphsNeo4j GraphTalk Düsseldorf - Building intelligent solutions with Graphs
Neo4j GraphTalk Düsseldorf - Building intelligent solutions with Graphs
 
Introduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & BahrainIntroduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & Bahrain
 
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientist
 
Neo4j GraphTalk Oslo - Introduction to Graphs
Neo4j GraphTalk Oslo - Introduction to GraphsNeo4j GraphTalk Oslo - Introduction to Graphs
Neo4j GraphTalk Oslo - Introduction to Graphs
 
Graphs for Enterprise Architects
Graphs for Enterprise ArchitectsGraphs for Enterprise Architects
Graphs for Enterprise Architects
 
GraphTalk Wien - Intelligente Lösungen mit Graphen erstellen
GraphTalk Wien - Intelligente Lösungen mit Graphen erstellenGraphTalk Wien - Intelligente Lösungen mit Graphen erstellen
GraphTalk Wien - Intelligente Lösungen mit Graphen erstellen
 
Hadoop meets Agile! - An Agile Big Data Model
Hadoop meets Agile! - An Agile Big Data ModelHadoop meets Agile! - An Agile Big Data Model
Hadoop meets Agile! - An Agile Big Data Model
 
Southwickc lampert lodlam_training
Southwickc lampert lodlam_trainingSouthwickc lampert lodlam_training
Southwickc lampert lodlam_training
 
IRMAC April 2015 - DMBOK2 DWBI New Content
IRMAC April 2015 - DMBOK2 DWBI New ContentIRMAC April 2015 - DMBOK2 DWBI New Content
IRMAC April 2015 - DMBOK2 DWBI New Content
 
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
 
How to build your own Delve: combining machine learning, big data and SharePoint
How to build your own Delve: combining machine learning, big data and SharePointHow to build your own Delve: combining machine learning, big data and SharePoint
How to build your own Delve: combining machine learning, big data and SharePoint
 
Chocolate Flavoured Data Science
Chocolate Flavoured Data ScienceChocolate Flavoured Data Science
Chocolate Flavoured Data Science
 
Data Visualisation Workshop (GovHack Brisbane 2014)
Data Visualisation Workshop (GovHack Brisbane 2014)Data Visualisation Workshop (GovHack Brisbane 2014)
Data Visualisation Workshop (GovHack Brisbane 2014)
 
Big Data for Data Scientists - Info Session
Big Data for Data Scientists - Info SessionBig Data for Data Scientists - Info Session
Big Data for Data Scientists - Info Session
 
Using graphs for recommendations
Using graphs for recommendationsUsing graphs for recommendations
Using graphs for recommendations
 
GraphTour 2020 - Neo4j: What's New?
GraphTour 2020 - Neo4j: What's New?GraphTour 2020 - Neo4j: What's New?
GraphTour 2020 - Neo4j: What's New?
 

Plus de Jens Mittelbach

New work culture and the social intranet
New work culture and the social intranetNew work culture and the social intranet
New work culture and the social intranetJens Mittelbach
 
Prozessoptimierung in Bibliotheken — Transparenz durch Visualisierung
Prozessoptimierung in Bibliotheken — Transparenz durch VisualisierungProzessoptimierung in Bibliotheken — Transparenz durch Visualisierung
Prozessoptimierung in Bibliotheken — Transparenz durch VisualisierungJens Mittelbach
 
Wissenschaft und Wissen schaffen - Informationskompetenz als Metakompetenz
Wissenschaft und Wissen schaffen - Informationskompetenz als MetakompetenzWissenschaft und Wissen schaffen - Informationskompetenz als Metakompetenz
Wissenschaft und Wissen schaffen - Informationskompetenz als MetakompetenzJens Mittelbach
 
Modernes Datenmanagement: Linked Open Data und die offene Bibliothek
Modernes Datenmanagement: Linked Open Data und die offene BibliothekModernes Datenmanagement: Linked Open Data und die offene Bibliothek
Modernes Datenmanagement: Linked Open Data und die offene BibliothekJens Mittelbach
 
Ein rundes Service-Konzept: Vermittlung von Informationskompetenz und der For...
Ein rundes Service-Konzept: Vermittlung von Informationskompetenz und der For...Ein rundes Service-Konzept: Vermittlung von Informationskompetenz und der For...
Ein rundes Service-Konzept: Vermittlung von Informationskompetenz und der For...Jens Mittelbach
 
Resource Discovery neu definiert
Resource Discovery neu definiertResource Discovery neu definiert
Resource Discovery neu definiertJens Mittelbach
 
Eine Wissensbar für die SLUB Dresden
Eine Wissensbar für die SLUB DresdenEine Wissensbar für die SLUB Dresden
Eine Wissensbar für die SLUB DresdenJens Mittelbach
 
Projekt Integration DACHELA 2011 St. Gallen
Projekt Integration  DACHELA 2011 St. GallenProjekt Integration  DACHELA 2011 St. Gallen
Projekt Integration DACHELA 2011 St. GallenJens Mittelbach
 

Plus de Jens Mittelbach (9)

New work culture and the social intranet
New work culture and the social intranetNew work culture and the social intranet
New work culture and the social intranet
 
Prozessoptimierung in Bibliotheken — Transparenz durch Visualisierung
Prozessoptimierung in Bibliotheken — Transparenz durch VisualisierungProzessoptimierung in Bibliotheken — Transparenz durch Visualisierung
Prozessoptimierung in Bibliotheken — Transparenz durch Visualisierung
 
Wissenschaft und Wissen schaffen - Informationskompetenz als Metakompetenz
Wissenschaft und Wissen schaffen - Informationskompetenz als MetakompetenzWissenschaft und Wissen schaffen - Informationskompetenz als Metakompetenz
Wissenschaft und Wissen schaffen - Informationskompetenz als Metakompetenz
 
Modernes Datenmanagement: Linked Open Data und die offene Bibliothek
Modernes Datenmanagement: Linked Open Data und die offene BibliothekModernes Datenmanagement: Linked Open Data und die offene Bibliothek
Modernes Datenmanagement: Linked Open Data und die offene Bibliothek
 
Ein rundes Service-Konzept: Vermittlung von Informationskompetenz und der For...
Ein rundes Service-Konzept: Vermittlung von Informationskompetenz und der For...Ein rundes Service-Konzept: Vermittlung von Informationskompetenz und der For...
Ein rundes Service-Konzept: Vermittlung von Informationskompetenz und der For...
 
Resource Discovery neu definiert
Resource Discovery neu definiertResource Discovery neu definiert
Resource Discovery neu definiert
 
Eine Wissensbar für die SLUB Dresden
Eine Wissensbar für die SLUB DresdenEine Wissensbar für die SLUB Dresden
Eine Wissensbar für die SLUB Dresden
 
A Map to Go
A Map to GoA Map to Go
A Map to Go
 
Projekt Integration DACHELA 2011 St. Gallen
Projekt Integration  DACHELA 2011 St. GallenProjekt Integration  DACHELA 2011 St. Gallen
Projekt Integration DACHELA 2011 St. Gallen
 

Dernier

Extra-120324-Visite-Entreprise-icare.pdf
Extra-120324-Visite-Entreprise-icare.pdfExtra-120324-Visite-Entreprise-icare.pdf
Extra-120324-Visite-Entreprise-icare.pdfInfopole1
 
Design and Modeling for MySQL SCALE 21X Pasadena, CA Mar 2024
Design and Modeling for MySQL SCALE 21X Pasadena, CA Mar 2024Design and Modeling for MySQL SCALE 21X Pasadena, CA Mar 2024
Design and Modeling for MySQL SCALE 21X Pasadena, CA Mar 2024Alkin Tezuysal
 
Oracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptxOracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptxSatishbabu Gunukula
 
Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Muhammad Tiham Siddiqui
 
The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)IES VE
 
Keep Your Finger on the Pulse of Your Building's Performance with IES Live
Keep Your Finger on the Pulse of Your Building's Performance with IES LiveKeep Your Finger on the Pulse of Your Building's Performance with IES Live
Keep Your Finger on the Pulse of Your Building's Performance with IES LiveIES VE
 
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENTSIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENTxtailishbaloch
 
From the origin to the future of Open Source model and business
From the origin to the future of  Open Source model and businessFrom the origin to the future of  Open Source model and business
From the origin to the future of Open Source model and businessFrancesco Corti
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightSafe Software
 
The New Cloud World Order Is FinOps (Slideshow)
The New Cloud World Order Is FinOps (Slideshow)The New Cloud World Order Is FinOps (Slideshow)
The New Cloud World Order Is FinOps (Slideshow)codyslingerland1
 
Where developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingWhere developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingFrancesco Corti
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNeo4j
 
How to release an Open Source Dataweave Library
How to release an Open Source Dataweave LibraryHow to release an Open Source Dataweave Library
How to release an Open Source Dataweave Libraryshyamraj55
 
TrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc
 
Introduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its applicationIntroduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its applicationKnoldus Inc.
 
Automation Ops Series: Session 2 - Governance for UiPath projects
Automation Ops Series: Session 2 - Governance for UiPath projectsAutomation Ops Series: Session 2 - Governance for UiPath projects
Automation Ops Series: Session 2 - Governance for UiPath projectsDianaGray10
 
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdf
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdfQ4 2023 Quarterly Investor Presentation - FINAL - v1.pdf
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdfTejal81
 
UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3DianaGray10
 
Top 10 Squarespace Development Companies
Top 10 Squarespace Development CompaniesTop 10 Squarespace Development Companies
Top 10 Squarespace Development CompaniesTopCSSGallery
 
UiPath Studio Web workshop series - Day 4
UiPath Studio Web workshop series - Day 4UiPath Studio Web workshop series - Day 4
UiPath Studio Web workshop series - Day 4DianaGray10
 

Dernier (20)

Extra-120324-Visite-Entreprise-icare.pdf
Extra-120324-Visite-Entreprise-icare.pdfExtra-120324-Visite-Entreprise-icare.pdf
Extra-120324-Visite-Entreprise-icare.pdf
 
Design and Modeling for MySQL SCALE 21X Pasadena, CA Mar 2024
Design and Modeling for MySQL SCALE 21X Pasadena, CA Mar 2024Design and Modeling for MySQL SCALE 21X Pasadena, CA Mar 2024
Design and Modeling for MySQL SCALE 21X Pasadena, CA Mar 2024
 
Oracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptxOracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptx
 
Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)
 
The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)
 
Keep Your Finger on the Pulse of Your Building's Performance with IES Live
Keep Your Finger on the Pulse of Your Building's Performance with IES LiveKeep Your Finger on the Pulse of Your Building's Performance with IES Live
Keep Your Finger on the Pulse of Your Building's Performance with IES Live
 
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENTSIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
 
From the origin to the future of Open Source model and business
From the origin to the future of  Open Source model and businessFrom the origin to the future of  Open Source model and business
From the origin to the future of Open Source model and business
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
The New Cloud World Order Is FinOps (Slideshow)
The New Cloud World Order Is FinOps (Slideshow)The New Cloud World Order Is FinOps (Slideshow)
The New Cloud World Order Is FinOps (Slideshow)
 
Where developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingWhere developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is going
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4j
 
How to release an Open Source Dataweave Library
How to release an Open Source Dataweave LibraryHow to release an Open Source Dataweave Library
How to release an Open Source Dataweave Library
 
TrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie World
 
Introduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its applicationIntroduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its application
 
Automation Ops Series: Session 2 - Governance for UiPath projects
Automation Ops Series: Session 2 - Governance for UiPath projectsAutomation Ops Series: Session 2 - Governance for UiPath projects
Automation Ops Series: Session 2 - Governance for UiPath projects
 
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdf
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdfQ4 2023 Quarterly Investor Presentation - FINAL - v1.pdf
Q4 2023 Quarterly Investor Presentation - FINAL - v1.pdf
 
UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3
 
Top 10 Squarespace Development Companies
Top 10 Squarespace Development CompaniesTop 10 Squarespace Development Companies
Top 10 Squarespace Development Companies
 
UiPath Studio Web workshop series - Day 4
UiPath Studio Web workshop series - Day 4UiPath Studio Web workshop series - Day 4
UiPath Studio Web workshop series - Day 4
 

d:swarm - A Library Data Management Platform Based on a Linked Open Data Approach

  • 1. A Library Data Management Platform Based on Linked Open Data SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 25 November, 2014 Jens Mittelbach | Robert Glaß
  • 2. A Library Data Management Platform Based on Linked Open Data  Back in Those Days  The Age of Discovery  Library Data Management  Qualify, Link and Free Your Data: D:SWARM  Live Demo SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß D:SWARM 25 November 2014 | Page 2 Dr. Jens Mittelbach
  • 3. Data Heterogeneity  Multiple individual data silos • ILS, document repositories, databases, …  Data saved in heterogeneous formats • MAB, MARC21, …  Each data silo gets processed individually • Multiple admin interfaces • Multiple search interfaces • Data unrelated to one another  Comprehensive view of resources almost impossible (for users and librarians) SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß Back in Those Days … 09 December 2014 | Page 3 Dr. Jens Mittelbach
  • 4. Data Normalization  More comprehensive view of resources for users, but no real discovery/exploration  Data gets normalized into one storage but not integrated  Data available in record-oriented structures • External data (e.g. GND) has to be squeezed in the record • Metadata records are independent of each other • No explicit semantic quality of data SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß The Age of “Discovery” 09 December 2014 | Page 4 Dr. Jens Mittelbach
  • 5. Library Data Management What Libraries Actually Need  Get rid of data silos • Open formats for exchange  Lossless data integration instead of reductive normalization  Data integration with entity level granularity • Get rid of pre-compiled data records  Focus on linking entities/objects: • Graph structures creating the knowledge graph  Stick to quality policy of libraries • Versioning and provenance of data SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 09 December 2014 | Page 5 Dr. Jens Mittelbach Library Data
  • 6. Library Data Management What Should Library Data Actually Look Like? SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 09 December 2014 | Page 6 Dr. Jens Mittelbach
  • 7. Library Data Management Whose Job Is Library Data Integration?  Data integration should be done by domain experts • Librarians, not IT staff (IT always understaffed) • Programming skills should not be a requirement • Good user experience is a prerequisite for adoption  Example driven modelling approach  Value created in the community should be reusable SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 09 December 2014 | Page 7 Dr. Jens Mittelbach
  • 8. Library Data Management What Tools Do We Need? Our Approach: An Open Source Data Management Platform SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 09 December 2014 | Page 8 Dr. Jens Mittelbach
  • 9. Library Data Management How Can Data Integration Be Done? SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 09 December 2014 | Page 9 Dr. Jens Mittelbach
  • 10. Qualify, Link and Free Your Data: D:SWARM Who’s behind this Project?  Collaborative development team of SLUB Dresden and Avantgarde Labs GmbH  Started work in June 2013  Funded from the European Regional Development Fund (ERDF) SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 09 December 2014 | Page 10 Dr. Jens Mittelbach
  • 11. Qualify, Link and Free Your Data: D:SWARM Our Challenge: Existing Data Formats: MAB, MARC • „selection of keywords“ • Relevant MAB fields are 902x, 907x, 912x, 917x, 922x. • These fields have subfields a, b, c, … coded with further information (type of keyword, person, time, place, concept...) • From field 902x to field 922x we have to check • If in subfield "a" there is one of these strings (800|801|820|830|845|850|860|870|880)? • If so, is there one of these strings (c|g|k|p|s|t|z) in subfield "b“? • If so, the value in subfield "c“ qualifies as a keyword • Keyword needs to be trimmed (which is the easiest part) SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 09 December 2014 | Page 11 Dr. Jens Mittelbach
  • 12. Qualify, Link and Free Your Data: D:SWARM Our Challenge: Existing Tools: Talend SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 09 December 2014 | Page 12 Dr. Jens Mittelbach
  • 13. Qualify, Link and Free Your Data: D:SWARM Our Challenge: Existing Tools: Open Refine SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 09 December 2014 | Page 13 Dr. Jens Mittelbach
  • 14. Qualify, Link and Free Your Data: D:SWARM What Is D:SWARM?  Graphical web based ETL modelling tool that serves to: • import data from heterogeneous sources with different formats • map input to output schemata and design transformation workflows • load transformed data into property graph database  With additional functionalities: • Exporting of data models as RDF • Sharing mappings and transformation workflows SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 09 December 2014 | Page 14 Dr. Jens Mittelbach
  • 15. Qualify, Link and Free Your Data: D:SWARM How Does D:SWARM Work?  Modelling GUI and job repository  Execution environment • Operational data from heterogeneous data sources (ILS, OAI-PMH, CSV …) get processed according to the transformation logics defined in modelling GUI  Admin centre • Scheduling & execution planning • Monitoring of system (data ingest, processing, errors) SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 09 December 2014 | Page 15 Dr. Jens Mittelbach
  • 16. Qualify, Link and Free Your Data: D:SWARM Why a Property Graph?  Node (S) – Edge (P) – Node (O)  Extension of RDF data model - each element can be endowed with additional information (key : value) • Version number • Provenance information • Type information SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 09 December 2014 | Page 16 Dr. Jens Mittelbach
  • 17. Qualify, Link and Free Your Data: D:SWARM Intermediate Results as of November 2014  Modelling GUI in 2nd version • Available file importer: XML, CSV, MABXML • Simple schema editor & graphic schema mapper • Transformation workflow designer & filter (Metafacture)  Execution of mappings and transformations in modelling GUI  Persistence in graph database (Neo4J)  Exporter: Turtle, N-Quads, N3, …  Publication under Open Source licence (Apache 2): https://github.com/dswarm SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 9 December 2014 | Page 17 Dr. Jens Mittelbach
  • 18. Qualify, Link and Free Your Data: D:SWARM Live Demo SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 09 December 2014 | Page 18 Dr. Jens Mittelbach http://demo.dswarm.org
  • 19. Qualify, Link and Free Your Data: D:SWARM Our Next Steps  Provision of URI templates for resource matching and linking  Scalable execution engine for production mode  Extension of transformation function set  Extension of importers  Implementation of an administration centre  Deduplication and FRBRization  Integration of SLUBsemantics Enrichtment Service  Implementation of sharing features SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 09 December 2014 | Page 19 Dr. Jens Mittelbach
  • 20. Qualify, Link and Free Your Data: D:SWARM Your Next Steps  Follow us on twitter.com/dswarm or www.dswarm.org or github.com/dswarm  Try it out and get in contact with us • http://demo.dswarm.org • https://github.com/dswarm/dswarm-documentation/wiki • team@dswarm.org  Help us prioritize our backlog • https://jira.slub-dresden.de/  Fork us on github.com/dswarm SLUB Dresden slub-dresden.de CC BY-SA 4.0 Avantgarde Labs Robert Glaß 09 December 2014 | Page 20 Dr. Jens Mittelbach

Notes de l'éditeur

  1. Where we come from, what we have, what we want to have How we can achieve it with D:SWARM Live presentation of D:SWARM
  2. Data normalization poses quite a number of problems to librarians, admins and users normalisation: at the cost of information richness Deduplication questionable Reliable enrichment only for parts of data Linkage (especially with external resources) is technically not possible (at this stage) Metadata records are independent of each other (sit side by side in xml silos)
  3. Data integration instead of mere normalization: Harvesting (external) data from a variety of sources and integrate all available information into a knowledge structures Versioning and provenance: this is related to what Markus Krötzsch said in his talk about statements according to the Wikidata data model
  4. Example driven modelling approach: users can directly observe the concrete results of their abstract modelling work
  5. Robert
  6. http://hub.culturegraph.org/resources/static/mab2.pdf