The Structured Data Hub
Today’s fiction, 2019’s reality
Status quo
Many datasets currently live in isolation. They are stored on people’s computers and are not findable. Moreover, little effort is made to link such datasets. When data is
linked, the datasets first have to be cleaned and harmonised, which is very time-intensive. More importantly, such linkage efforts are seldom shared, which makes them,
quite literally, ‘disposable research’.
What we envisage
We envisage selecting core micro, meso and macro datasets from the field of economic and social history and creating a structured data hub from them.
What we envisage
[Diagram labels: Structured Data Hub, Your data, Tooling, WWW]
Next, we want to let you connect your own data and build such connections yourself, while we ensure that your data is findable and linkable to other datasets on the
(semantic) world wide web.
The Structured Data Hub
A place to
store data
augment data
link data
find data
ask questions! (for data analysis and visualization)
So, the Structured Data Hub is a place to …. Now let’s go into more detail on some of these aspects.
Data augmentation
A first feature of the Structured Data Hub is augmentation. By augmentation we mean the process of enhancing your data with core variables from the social,
demographic and economic sciences.
For example, think of a dataset containing individual characteristics, including occupation and HISCO code. If we wanted to know whether these persons were
incumbents of high- or low-status occupations, we would need to add a stratification measure.
Here, we add the universal HISCAM scale, but any other HISCO-based stratification scale or class measure can be added.
We might also be interested in the area where people are working, here indicated by the place variable. If we wanted to map such values, or calculate distances between
these places, we would need their latitude and longitude.
Another type of data augmentation concerns the application of basic calculations to derive new variables. Income, for example, is seldom analysed in its raw form and is
often rescaled using a log transformation.
The Structured Data Hub facilitates the creation and documentation of such newly derived variables.
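The three kinds of augmentation described above (a stratification score looked up by HISCO code, coordinates looked up by place, and a log-transformed income) could be sketched roughly as follows. This is only an illustration, not the hub's actual tooling: the function names, the lookup tables, the occupation codes, the scores and the income values are all made-up placeholders, and the coordinates are approximate.

```python
import math

# Hypothetical lookup tables. The codes and scores are placeholders,
# NOT real HISCO codes or HISCAM values.
HISCAM_BY_HISCO = {"code-a": 55.0, "code-b": 80.0}
COORDS_BY_PLACE = {
    "Amsterdam": (52.3676, 4.9041),  # approximate (latitude, longitude)
    "Utrecht": (52.0907, 5.1214),
}


def augment(record):
    """Return a copy of `record` with a stratification score, coordinates and log income."""
    out = dict(record)
    # Stratification measure: looked up by occupation code (None if the code is unknown).
    out["hiscam"] = HISCAM_BY_HISCO.get(record["hisco"])
    # Geocoding: latitude and longitude looked up by place name.
    out["lat"], out["lon"] = COORDS_BY_PLACE.get(record["place"], (None, None))
    # Derived variable: natural log of income, defined only for positive values.
    income = record.get("income")
    out["log_income"] = math.log(income) if income is not None and income > 0 else None
    return out


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


people = [
    {"surname": "Fumes", "occupation": "cigar maker", "hisco": "code-a",
     "place": "Amsterdam", "income": 400},
    {"surname": "Bridges", "occupation": "civil engineer", "hisco": "code-b",
     "place": "Utrecht", "income": 1200},
]
augmented = [augment(person) for person in people]
distance_km = haversine_km(augmented[0]["lat"], augmented[0]["lon"],
                           augmented[1]["lat"], augmented[1]["lon"])
```

Returning a copy rather than modifying the record in place keeps the original data untouched, which fits the idea of documenting each newly derived variable as a separate step.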
Provenance tracking
A second feature of the Data Hub is traceable provenance. Currently, bigger datasets such as Clio-Infra consist of a core part derived from a large statistical agency,
combined with many smaller datasets as well as ‘corrections’ of the data by the researcher. After an iteration it is hard to track who contributed what, or which number
was changed by whom and for what reason. We therefore introduce provenance tracking.
version 1 + activity = version 2
The basic formula for provenance we use is that one version leads to the next as the result of an activity.
activity
who
when
what
how
For proper provenance it is crucial to describe this activity, at least in the terms of what the activity entailed, how the activity was performed, by whom and in which time
period.
Before (occupation for Bones missing):
surname   occupation
Fumes     cigar maker
Bridges   civil engineer
Moves     dancer
Bones

Activity:
- added occupation Bones
- from Gravediggers Vol II
- 2015-12-09A09:30:17
- dai:richard.zijdeman

After (new version):
surname   occupation
Fumes     cigar maker
Bridges   civil engineer
Moves     dancer
Bones     undertaker

PID: ab.123   PID: bc.789   New PID!
In this example, the occupation for ‘Bones’ is added, which leads to a new version of the data and hence a new PID. Moreover, the action of adding the value for occupation
is recorded as provenance.
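The rule "version 1 + activity = version 2" could be sketched as follows. This is a minimal illustration under several assumptions, not the hub's implementation: the `Activity` class, `make_pid` and `apply_activity` are hypothetical names, the PID is simply a short content hash (the slides do not say how PIDs are minted), and the timestamp is written in ISO 8601 form on the assumption that this is what the slide intends.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """The four descriptors of an activity: what, how, who and when."""
    what: str
    how: str
    who: str
    when: str


def make_pid(rows):
    """Hypothetical PID scheme: a short hash of the data, so any change yields a new PID."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:8]


def apply_activity(rows, surname, occupation, activity, log):
    """Create the next version by setting one occupation, and record the provenance."""
    new_rows = [
        {**row, "occupation": occupation} if row["surname"] == surname else dict(row)
        for row in rows
    ]
    log.append({"from": make_pid(rows), "to": make_pid(new_rows), "activity": activity})
    return new_rows


version_1 = [
    {"surname": "Fumes", "occupation": "cigar maker"},
    {"surname": "Bridges", "occupation": "civil engineer"},
    {"surname": "Moves", "occupation": "dancer"},
    {"surname": "Bones", "occupation": None},
]
provenance_log = []
version_2 = apply_activity(
    version_1,
    surname="Bones",
    occupation="undertaker",
    activity=Activity(
        what="added occupation for Bones",
        how="from Gravediggers Vol II",
        who="dai:richard.zijdeman",
        when="2015-12-09T09:30:17",  # assumed ISO 8601 reading of the slide's timestamp
    ),
    log=provenance_log,
)
```

Because the earlier version is never modified, both versions remain available, and the log entry links the old PID to the new one through the activity that produced it.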
Quality flags
An important aspect to consider when combining data is that datasets will come in various forms of quality.
Quality flags
Allow for quality flags of content
e.g. created by scientists
e.g. peer reviewed (by scientist)
created by public and peer reviewed
We will design a system in which datasets are accompanied by a ‘quality flag’, an indicator of the trustworthiness of the dataset. This might involve simple reputation
effects, but could also provide more advanced features, such as whether other data confirm the values in the dataset. Work together with sestet on this.
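One way to picture such a flag, purely as a sketch: the three flag values below are taken from the slide, while the `QualityFlag` enum and the `confirmation_rate` function are hypothetical names, and measuring confirmation as the share of agreeing values is an assumption rather than the hub's design.

```python
from enum import Enum


class QualityFlag(Enum):
    """The three example flags listed on the slide."""
    CREATED_BY_SCIENTISTS = "created by scientists"
    PEER_REVIEWED = "peer reviewed (by scientist)"
    PUBLIC_PEER_REVIEWED = "created by public and peer reviewed"


def confirmation_rate(values, other):
    """Share of keys present in both mappings whose values agree (None if no overlap)."""
    shared = values.keys() & other.keys()
    if not shared:
        return None
    agreeing = sum(1 for key in shared if values[key] == other[key])
    return agreeing / len(shared)


# Illustrative data only: occupations keyed by surname.
dataset = {"Fumes": "cigar maker", "Bones": "undertaker"}
independent_source = {"Fumes": "cigar maker", "Bones": "gravedigger"}

flag = QualityFlag.PEER_REVIEWED
rate = confirmation_rate(dataset, independent_source)
```

Returning `None` when the two sources share no keys keeps "no evidence" distinct from "fully contradicted", which would otherwise both show up as zero.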
Basic visualisation
Focus on visual exploration of data and results
‘Ask’ a question and get visual output:
e.g. bar, line graph etc.
get output on map or even as ‘movie’
A final feature that we want to highlight here is the ability to ask questions and receive a ‘visual’ answer. Data visualisations are increasingly present in all sorts of media, and our hub
will allow such visualisations to answer basic questions about historical patterns.
To society and back
From Science to Society and back
Provide data to public: ‘enthusiasts’, journalists
Have enthusiasts add data to the hub (creating linked data): e.g. stucadoors dataset, harbour datasets, railway datasets, etc.
And back: link scientific data to crowd-projects like DBpedia: enhance occupations with descriptions
The last point we want to make about the Structured Data Hub is that it is not just for academics: we provide our tools to a broader audience too. This means that we
assume only a modest level of historical knowledge and technical skill. However, we also believe that ‘the public’ is creating quite interesting datasets, from which we may
borrow and to which we may give back by enriching them with scientific knowledge.

Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 

The Structured Data Hub in 2019

  • 1. The Structured Data Hub: today’s fiction, 2019’s reality
  • 2. Status quo: Many datasets currently live in isolation. They are stored on people’s computers and are not findable. Moreover, little effort is made to link such datasets. When data is linked, the datasets first have to be cleaned and harmonised, which is very time-intensive. More importantly, such linkage efforts are seldom shared, effectively producing ‘disposable research’.
  • 3. What we envisage: to select core micro, meso and macro datasets from the field of economic and social history and create a structured data hub from those.
  • 4. What we envisage (diagram: Structured Data Hub, your data, tooling, WWW): Next, we want to allow you to connect your data and to build such connections yourself, while we ensure your data is findable and linkable to other datasets on the (semantic) world wide web.
  • 5. The Structured Data Hub: a place to store data, augment data, link data, find data, and ask questions (for data analysis and visualization). So, the Structured Data Hub is a place to …. Now let’s go into more detail on some of these aspects.
  • 6. Data augmentation: A first feature of the Structured Data Hub is augmentation. By augmentation we mean the process of enhancing your data with core variables from the social, demographic and economic sciences.
  • 7. For example, think of this dataset containing individual characteristics, including occupation and HISCO code. If we wanted to know whether these persons held high- or low-status occupations, we would need to add a stratification measure.
  • 8. Here, we add the universal HISCAM scale, but any other HISCO-based stratification scale or class measure can be added.
  • 9. We might also be interested in the area where people are working, here indicated by the place variable. If we wanted to map such values, or calculate distances between these places, we would need information on the latitude and longitude.
  • 10. Another type of data augmentation concerns the application of basic calculations to derive new variables. Income, for example, is seldom analysed in its raw form and is often rescaled using a log transformation.
  • 11. The Structured Data Hub facilitates the creation and documentation of such newly derived variables.
  • 12. Provenance tracking: A second feature of the Data Hub is traceable provenance. Currently, larger datasets such as Clio-Infra consist of a core part derived from a larger statistical agency, combined with many smaller datasets as well as ‘corrections’ of the data by the researcher. After a few iterations it is hard to track who contributed what, or which number was changed by whom and for what reason. We therefore present provenance tracking.
  • 13. version 1 + activity = version 2. The basic formula for provenance we use is that one version leads to the next as the result of an activity.
  • 14. Activity: who, when, what, how. For proper provenance it is crucial to describe this activity, at least in terms of what the activity entailed, how it was performed, by whom and in which time period.
  • 15. (Slide: two versions of a table with the columns surname and occupation. Rows: Fumes, cigar maker; Bridges, civil engineer; Moves, dancer; Bones, empty in one version and ‘undertaker’ in the other. PIDs shown: ab.123 and bc.789, marked ‘New PID!’. Provenance record: added occupation Bones; from Gravediggers Vol II; 2015-12-09A09:30:17; dai:richard.zijdeman.) In this example, the occupation for ‘Bones’ is added, which leads to a new version of the data and hence a new PID. Moreover, the action of adding the value for occupation is recorded as provenance.
  • 16. Quality flags: An important aspect to consider when combining data is that datasets vary in quality.
  • 17. Quality flags: allow for quality flags on content, e.g. created by scientists; peer reviewed (by scientists); created by the public and peer reviewed. We will design a system in which datasets are accompanied by a ‘quality flag’, an indicator of the trustworthiness of the dataset. This might involve simple reputation effects, but could also provide more advanced features, such as whether other data confirm the values in this dataset. Work together with sestet on this.
  • 18. Basic visualisation: focus on visual exploration of data and results. ‘Ask’ a question and get visual output, e.g. a bar or line graph, or get output on a map or even as a ‘movie’. A final feature that we want to highlight here is the ability to ask questions and receive a ‘visual’ answer. Data visualisations are increasingly present in all sorts of media, and our hub will allow such visualisations to answer basic questions on historical patterns.
  • 20. From Science to Society and back: Provide data to the public: ‘enthusiasts’, journalists. Have enthusiasts add data to the hub (creating linked data), e.g. stucadoors dataset, harbour datasets, railway datasets, etc. And back: link scientific data to crowd projects like DBpedia, e.g. enhance occupations with descriptions. The last point we want to make about the Structured Data Hub is that it is not just for academics: we provide our tools to a broader audience too. This means that we assume a fairly low level of historical knowledge and technical skill. However, we also believe that ‘the public’ is making quite interesting datasets, from which we may borrow and to which we may give back by enriching them with scientific knowledge.
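
The provenance formula from slides 13 to 15 (version 1 + activity = version 2) can be illustrated with a minimal Python sketch. This is not the hub's actual implementation: the `Activity` record, the `make_pid` and `apply_activity` helpers, and the hash-based identifier are all hypothetical names and choices made for illustration. The timestamp is written with an ISO 8601 `T` separator, which is an assumption about what the slide intended.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """Describes a change in the four terms named on slide 14."""
    what: str
    how: str
    who: str
    when: str


def make_pid(rows):
    """Hypothetical PID: a short content hash, so changed data gets a new identifier."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:8]


def apply_activity(rows, activity, change):
    """version 1 + activity = version 2; the input version is left untouched."""
    new_rows = [change(dict(row)) for row in rows]  # copy each row before changing it
    return {
        "pid": make_pid(new_rows),
        "derived_from": make_pid(rows),
        "activity": activity,
        "rows": new_rows,
    }


# Version 1: the occupation for 'Bones' is still missing.
version_1 = [
    {"surname": "Fumes", "occupation": "cigar maker"},
    {"surname": "Bridges", "occupation": "civil engineer"},
    {"surname": "Moves", "occupation": "dancer"},
    {"surname": "Bones", "occupation": None},
]


def add_bones_occupation(row):
    """The change itself: fill in the missing occupation."""
    if row["surname"] == "Bones":
        row["occupation"] = "undertaker"
    return row


activity = Activity(
    what="added occupation Bones",
    how="from Gravediggers Vol II",
    who="dai:richard.zijdeman",
    when="2015-12-09T09:30:17",  # assumed ISO 8601 form of the slide's timestamp
)

version_2 = apply_activity(version_1, activity, add_bones_occupation)
```

Because the identifier is derived from the content, any change to the data yields a new PID, while `derived_from` and `activity` keep the link back to the previous version and the reason for the change.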