A Decentralized Approach to Dissemination,
Retrieval, and Archiving of Data
Tobias Kuhn
http://www.tkuhn.org
@txkuhn
Department of Computer Science, VU University Amsterdam
Open Science for an Open Society Workshop
2016 Conference on Complex Systems
Amsterdam, Netherlands
20 September 2016
Increasing Importance of Scientific Data
https://www.google.com/trends/explore#q=%22data%20science%22
Tobias Kuhn, Department of Computer Science, VU University Amsterdam Decentralized Data Publishing 2 / 15
Scientific Data as Supplemental Material
...
http://www.nature.com/ni/journal/v16/n10/full/ni.3267.html#supplementary-information
Scientific Data in Open Repositories
We Need Better Data Publishing!
Published data should be:
• Verifiable (Is this really the data I am looking for?)
• Immutable (Can I be sure that it hasn’t been modified?)
• Permanent (Will it be available in 1, 5, 20 years from now?)
• Reliable (Can it be efficiently retrieved whenever needed?)
• Granular (Can I refer to individual data entries?)
• Semantic (Can it be automatically interpreted?)
• Linked (Does it use established identifiers and ontologies?)
• Trustworthy (Can I trust the source?)
Requirement: Automated Low-Level Operations
We need automated low-level operations to publish and retrieve data
entries and datasets:
publish <dataset-identifier>
get <dataset-identifier>
(like HTTP POST/GET but verifiable, immutable, permanent, reliable, ...)
Approach: Linked Data + Cryptography + Decentralization
Nanopublications: Linked Data Containers for
Provenance-Aware Semantic Publishing
[Figure: a nanopublication made up of three parts: assertion, provenance, and publication info]
http://nanopub.org / @nanopub_org
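
The three-part structure named on this slide can be pictured as a small data container. The following is only a minimal sketch: the class name, field names, `ex:` terms, and example triples are illustrative assumptions, not the nanopublication schema or the API of any library.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# A triple is (subject, predicate, object); plain strings stand in for RDF terms.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Nanopublication:
    """Hypothetical container mirroring the three parts named on the slide."""
    uri: str
    assertion: Tuple[Triple, ...]    # the actual statement being published
    provenance: Tuple[Triple, ...]   # where the assertion comes from
    pubinfo: Tuple[Triple, ...]      # metadata about the nanopublication itself

    def parts(self) -> Dict[str, Tuple[Triple, ...]]:
        return {
            "assertion": self.assertion,
            "provenance": self.provenance,
            "publication info": self.pubinfo,
        }


# All identifiers below are made up for illustration.
example = Nanopublication(
    uri="http://example.org/np1",
    assertion=(("ex:malaria", "ex:isTransmittedBy", "ex:mosquito"),),
    provenance=(("http://example.org/np1#assertion", "prov:wasDerivedFrom", "ex:some-study"),),
    pubinfo=(("http://example.org/np1", "dcterms:creator", "ex:some-author"),),
)

for name, triples in example.parts().items():
    print(name, len(triples))
```

Making the container frozen reflects the idea that a published nanopublication is not edited in place.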
Trusty URIs: Cryptographic Hash Values for
Verifiable and Immutable Web Identifiers
Nanopublications with Trusty URIs are verifiable, immutable, and permanent.
http://example.org/r1.RA5AbXdpz5DcaYXCh9l3eI9ruBosiL5XDU3rxBbBaUO70.trig
http://trustyuri.net/
Kuhn, Dumontier. Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linked Data. ESWC 2014.
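
The idea of putting a cryptographic hash into the identifier can be sketched in a few lines. This is a simplified illustration and not the Trusty URI algorithm: it hashes raw bytes with SHA-256, whereas the approach in the cited paper works on a normalized form of the RDF content for RDF data. The function names are hypothetical, and the `.RA` marker is borrowed from the example above only to make the result look familiar.

```python
import base64
import hashlib


def hash_suffix(content: bytes) -> str:
    """SHA-256 digest, base64url-encoded without padding (43 characters)."""
    digest = hashlib.sha256(content).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def make_identifier(base_uri: str, content: bytes) -> str:
    """Append the content hash to a base URI (simplified, raw bytes only)."""
    return f"{base_uri}.RA{hash_suffix(content)}"


def verify(identifier: str, content: bytes) -> bool:
    """Check that the hash embedded in the identifier matches the content."""
    _, sep, suffix = identifier.rpartition(".RA")
    return bool(sep) and suffix == hash_suffix(content)


data = b"some published data"
uri = make_identifier("http://example.org/r1", data)
print(uri)
print(verify(uri, data))                # the content matches its identifier
print(verify(uri, data + b" changed"))  # any modification is detected
```

Because the identifier depends on the content, anyone holding the identifier can check a retrieved copy without trusting the server it came from.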
Decentralized and Reliable Publishing with a
Nanopublication Server Network
[Figure: nanopublications with Trusty URIs in a server network, with arrows labeled publication, retrieval, and propagation / archiving]
http://npmonitor.inn.ac
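
The interplay of publication, propagation, and retrieval can be sketched with an in-memory toy model. Everything here is an assumption made for illustration: the `Server` class, the push-style propagation, and the `retrieve` helper are hypothetical and do not reflect the actual server protocol.

```python
import base64
import hashlib
from typing import Dict, List, Optional


def content_id(content: bytes) -> str:
    """Hash-based key standing in for a Trusty-URI-style identifier."""
    digest = hashlib.sha256(content).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


class Server:
    """Toy in-memory server that stores content and pushes it to its peers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}
        self.peers: List["Server"] = []

    def publish(self, content: bytes) -> str:
        key = content_id(content)
        self._accept(key, content)
        return key

    def _accept(self, key: str, content: bytes) -> None:
        if content_id(content) != key:
            raise ValueError("content does not match its identifier")
        if key in self.store:
            return  # already archived here; this also stops propagation loops
        self.store[key] = content
        for peer in self.peers:
            peer._accept(key, content)

    def get(self, key: str) -> Optional[bytes]:
        return self.store.get(key)


def retrieve(servers: List[Server], key: str) -> Optional[bytes]:
    """Ask servers in turn; accept only a copy whose hash matches the key."""
    for server in servers:
        content = server.get(key)
        if content is not None and content_id(content) == key:
            return content
    return None


a, b, c = Server("a"), Server("b"), Server("c")
a.peers, b.peers, c.peers = [b], [c], [a]

key = a.publish(b"one nanopublication")
b.store[key] = b"tampered copy"  # simulate a faulty or malicious server
print(retrieve([b, c], key))     # falls back to the intact copy on server c
```

The client never has to trust a single server: a copy is accepted only if it hashes to the requested key, so any server holding an intact copy will do.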
Defining Datasets with Nanopublication Indexes
(which are themselves Nanopublications)
[Figure: index nanopublications (a) to (f), connected by the relations "appends", "has sub-index", and "has element"]
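
How a dataset could be resolved from such indexes can be sketched as a recursive traversal. This is a minimal sketch based on one reading of the three relations in the figure: an index lists its own elements, may include sub-indexes, and may append to an earlier index whose elements it inherits. The `Index` class, the `resolve` function, and all `ex:` identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Index:
    """Hypothetical index record; in the talk, indexes are nanopublications."""
    uri: str
    elements: Tuple[str, ...] = ()     # "has element"
    sub_indexes: Tuple[str, ...] = ()  # "has sub-index"
    appends: Optional[str] = None      # "appends" (an earlier index)


def resolve(uri: str, indexes: Dict[str, Index],
            seen: Optional[Set[str]] = None) -> List[str]:
    """Collect all element URIs reachable from an index, without duplicates."""
    seen = set() if seen is None else seen
    if uri in seen:
        return []  # guard against visiting the same index twice
    seen.add(uri)
    index = indexes[uri]
    result: List[str] = []
    if index.appends is not None:
        result.extend(resolve(index.appends, indexes, seen))
    result.extend(index.elements)
    for sub in index.sub_indexes:
        result.extend(resolve(sub, indexes, seen))
    return list(dict.fromkeys(result))  # keep order, drop repeated elements


indexes = {
    "ex:a": Index("ex:a", elements=("ex:np1", "ex:np2")),
    "ex:b": Index("ex:b", elements=("ex:np3",), appends="ex:a"),
    "ex:d": Index("ex:d", elements=("ex:np4",)),
    "ex:c": Index("ex:c", sub_indexes=("ex:b", "ex:d")),
}

print(resolve("ex:c", indexes))
```

Since each index would itself be an immutable nanopublication, growing a dataset means publishing a new index that appends to the old one rather than editing it.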
Nanopublication Server Network is
Efficient and Scalable
Our servers can deliver nanopublications about 100 times faster than
when a triple store is used (and need far fewer resources):
[Figure: response time in seconds, plotted against time from start of test in seconds and against number of clients accessing the service in parallel; series: Virtuoso triple store with SPARQL endpoint, nanopublication server]
Nanopublication Datasets
Reliable Low-Level Publish/Retrieve Operations!
Operation to publish data:
$ np publish nanopubs.trig
156026 nanopubs published at http://np.inn.ac/
which can also be used to publish dataset definitions (indexes):
$ np publish index.trig
157 nanopubs published at http://np.inn.ac/
Operation to retrieve data entries:
$ np get http://np.inn.ac/RA7Kmmugi8OuCirfe5WKchnJhC3FuhQD
and to retrieve entire datasets:
$ np get -c http://np.inn.ac/RAY_lQruuagCYtAcKAPptkY7EpITw
https://github.com/Nanopublication/nanopub-java
Future Work
• Improve server protocol
• Develop services on top of the server network
• Establish best practices for versioning, retractions, reviews, etc.
• Connect it all to the scientific publishing workflow
Thank you for your attention!
Questions?
Further information:
• Paper on the approach:
https://peerj.com/articles/cs-78/
• Nanopublications: http://nanopub.org
• Trusty URIs: http://trustyuri.net
• Nanopublication Server Network: http://npmonitor.inn.ac
Multi-Layer Architecture
applications (analyze/use data)
advanced services (query/analyze data)
core services (find data)
decentralized server network (provide data)
