QB'er demonstration
1. Tool for converting and linking statistical datasets
to a cloud of interconnected historical datasets.
QB’er - Demonstration
Ashkan Ashkpour, IISH – CLARIAH WP4
07-10-2016
2. GOAL OF THIS PRESENTATION
From CSV files and structured statistical data to (harmonized) interlinked data on the Web
[Diagram labels: Data, Tooling, Interlinked datasets on the Web]
3. PROBLEM - Today’s Workflow
• Gather and enter own data
• Find data on multiple repositories
• Download
• Clean and reshape
• Merge
• Clean and reshape…
• Analyse
4. PROBLEM
Disconnected data and efforts
We keep repeating the same work, over and over, for the same datasets
Comparability across time and datasets
6. LOSS OF…
Provenance
Cleaning efforts (sometimes up to 60% of the work)
Valuable mappings (discarding time consuming prior work)
Expert decisions
Discoverability
8. HARMONIZATION AND RDF
What we want is harmonization by way of:
Standardization and Classification
Flexible approach while providing accountability
12. QB’ER
Empower individual researchers to:
Code and harmonize individual datasets according to best practices of the
community (e.g. HISCO, SDMX, World Bank) or against their colleagues’ datasets
Share their own code lists with fellow researchers
Align code lists across datasets
Publish their standards-compliant datasets on a Structured Data Hub
Collaboratively grow a graph of interconnected datasets
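To make the coding step concrete, here is a minimal Python sketch of mapping a variable's values to a shared code list while keeping the original values. It is an illustration only, not QB'er's implementation: the function name `code_rows`, the `OCCUPATION_CODES` list, and the `code:` identifiers are all hypothetical placeholders (they are not real HISCO codes).

```python
# Hypothetical code list: normalized original value -> shared code identifier.
# The identifiers are placeholders for illustration, not real HISCO codes.
OCCUPATION_CODES = {
    "farmer": "code:farmer",
    "boer": "code:farmer",      # Dutch spelling mapped to the same code
    "smith": "code:blacksmith",
}


def code_rows(rows, variable, code_list):
    """Return copies of the rows with an added '<variable>_code' column.

    The original value is never overwritten; unmapped values get None
    so that they remain visible for later expert review.
    """
    coded = []
    for row in rows:
        key = row[variable].strip().lower()   # light normalization only
        new_row = dict(row)                   # copy, keep the input intact
        new_row[variable + "_code"] = code_list.get(key)
        coded.append(new_row)
    return coded


rows = [
    {"year": "1889", "occupation": "Boer"},
    {"year": "1899", "occupation": "Farmer"},
    {"year": "1899", "occupation": "weaver"},
]
coded = code_rows(rows, "occupation", OCCUPATION_CODES)
for row in coded:
    print(row)
```

Two differently spelled values end up with the same code, which is what makes the datasets comparable, while the unmapped value stays flagged rather than silently dropped.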
22. TO CONCLUDE…
• Generic, domain-independent tool
• Uploading of a dataset and extraction of variables and value frequencies
• Mapping of variable values to codes (while preserving the originals!)
• Publishing of dataset structure as Linked Data
• Align codes and identifiers across datasets
• Provenance of all assertions to the SDH traceable to time and person
• Crowd-based production of code lists and mappings
• Sharing and reuse of other people’s work (standing on the shoulders of giants)
• No disposable research
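Two of the steps above, extracting variables with their value frequencies and publishing the dataset structure as Linked Data, can be sketched in a few lines of Python. This is a rough sketch under stated assumptions, not the tool's actual code: it assumes the W3C RDF Data Cube vocabulary (prefix `qb:`) for the structure, treats every column as a dimension for simplicity, and uses a hypothetical `http://example.org/` namespace, sample data, and function names.

```python
import csv
import io
from collections import Counter

# Namespace of the W3C RDF Data Cube vocabulary.
QB = "http://purl.org/linked-data/cube#"


def value_frequencies(csv_text):
    """Map each variable (column name) to a Counter of its values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    freqs = {name: Counter() for name in (reader.fieldnames or [])}
    for row in reader:
        for name in freqs:
            freqs[name][row.get(name)] += 1
    return freqs


def structure_as_turtle(dataset_id, variables, base="http://example.org/"):
    """Describe the dataset structure in Turtle.

    Assumption: every variable is modelled as a qb:DimensionProperty and
    its name is already a valid Turtle local name. `variables` must not
    be empty. The base namespace is a placeholder.
    """
    lines = [f"@prefix qb: <{QB}> .", f"@prefix ex: <{base}> .", ""]
    lines.append(f"ex:{dataset_id}-structure a qb:DataStructureDefinition ;")
    components = [f"    qb:component [ qb:dimension ex:{v} ]" for v in variables]
    lines.append(" ;\n".join(components) + " .")
    for v in variables:
        lines.append(f"ex:{v} a qb:DimensionProperty .")
    return "\n".join(lines)


# Hypothetical sample data standing in for an uploaded CSV file.
SAMPLE = "year,occupation\n1889,Boer\n1899,Farmer\n1899,Boer\n"

freqs = value_frequencies(SAMPLE)
turtle = structure_as_turtle("census", list(freqs))

print(freqs["occupation"].most_common())
print(turtle)
```

The frequency table is what a researcher would inspect before mapping values to codes; the Turtle output only describes the structure, so the original observations stay untouched.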