The Expanding
Dataverse
Mercè Crosas, Director of Data Science, IQSS
@mercecrosas
January 21, 2015, Lamont Library, Harvard University
Data Publishing: A form of
Scholarly Communication
350 years
of scientific
publishing,
with words
and data
1665
Data, if any, were part of the printed publication
Now
Vast quantities of digital data (and code) cannot
be part of the printed publication
Pillars of Data Publishing
To make data discoverable, accessible and
reusable, we need:
1. Data Citation, to reference and find data
2. Data Repositories, to host and access data
3. Information about the data, to understand
and reuse them
Dataverse Software:
A Data Publishing framework
… for a wide range of repositories
Public, Generic
Repositories
Institutional
Repositories
Curated Data Archives
Repositories
http://dataverse.org
Dataverse 4.0: Enables and
Enhances Data Publishing
● A data citation compliant with the Data
Citation Principles
● Rich metadata to describe and find datasets
from multiple domains
● Support for public and restricted data,
open data license and terms of use
● Rigorous workflows to publish data, with
support for new versions of the data
Data Citation
A Brief History of Citing Data
1906
Chicago Manual of Style: author/creator, title, dates, publisher or distributor
1959
First scientific digital repositories (e.g., World Data Center, ICPSR)
1979
ASBR (“Data File” type)
MARC (machine readable catalog)
Domain Repositories (e.g., GenBank)
1999 - Now
Growth of Data Repositories (e.g., NESSTAR, Dataverse, Dryad, Figshare, Zenodo)
DOI services for Data (e.g., DataCite in 2009)
2014
Data Citation Principles
NISO-JATS revised to support data
Altman & Crosas, 2013, “The Evolution of Data Citation: From Principles to Implementation” IASSIST Quarterly
Joint Declaration of Data
Citation Principles
1 Importance
2 Credit and Attribution
3 Evidence
4 Unique Identification
5 Access
6 Persistence
7 Specificity and Verifiability
8 Interoperability and flexibility
https://www.force11.org/datacitation
Data Citation generated by Dataverse
Citation elements: Authors, Year, Dataset Title, DOI, Data Repository, UNF, version
● Principle 2: Credit and Attribution
● Principles 4, 5, 6: Unique Identification, Access, Persistence. Resolves to landing page with access to metadata, docs, and data
● Principle 7: Specificity and Verifiability
● Principle 8: Interoperability and flexibility. Repository exports citation metadata in XML, JSON formats
Altman & King, 2007. A Proposed Standard for the Scholarly Citation of Quantitative Data.
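As a rough illustration of how the citation elements listed on this slide could be assembled and exported, here is a minimal Python sketch. It is not Dataverse's own code: the `DataCitation` class, its field names, the separator style, and all sample values (including the DOI and UNF, which are placeholders) are assumptions made for the example.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class DataCitation:
    """Hypothetical container for the citation elements named on the slide."""
    authors: List[str]
    year: int
    title: str
    doi: str          # persistent identifier (placeholder value below)
    repository: str
    unf: str          # Universal Numerical Fingerprint (placeholder value below)
    version: str

    def format(self) -> str:
        # Element order follows the slide: Authors, Year, Dataset Title,
        # DOI, Data Repository, UNF, version. Punctuation is an assumption.
        names = "; ".join(self.authors)
        return (
            f'{names}, {self.year}, "{self.title}", '
            f"https://doi.org/{self.doi}, {self.repository}, "
            f"{self.unf}, {self.version}"
        )

    def to_json(self) -> str:
        # Machine-readable export of the same fields (one of the formats
        # mentioned under Principle 8); the key names are illustrative.
        return json.dumps(asdict(self), indent=2)


# All values below are made up for illustration.
citation = DataCitation(
    authors=["Doe, Jane", "Roe, Richard"],
    year=2015,
    title="Example Survey Data",
    doi="10.5072/FK2/EXAMPLE",
    repository="Example Dataverse",
    unf="UNF:6:PLACEHOLDER==",
    version="V1",
)

print(citation.format())
print(citation.to_json())
```

Keeping the human-readable string and the JSON export derived from the same record means the two cannot drift apart, which is the point of generating the citation in the repository rather than by hand.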
Metadata
Three Metadata Levels
● Generic Metadata: includes data citation metadata fields (examples: title, authors, persistent id, description)
● Domain Specific Metadata, examples:
o Social Science Metadata (DDI)
o Life Sciences (ISA-Tab)
o Astronomy (VO)
● File Metadata, examples (automatic):
o For Tabular Files: column information
o For FITS Files: header information
Life Science Metadata
Example: Life Sciences Metadata
Example: Astronomy Metadata
Public vs
Restricted
Terms, Licenses and
Restrictions
● Public Dataset
o CC0 License
o Metadata is public
o Files are public
● Dataset with Restricted Files
o CC0 License
o Metadata is public
o Files are restricted
o Access Terms are defined in dataset
● Dataset with Terms of Use
o Metadata is public
o Terms of Use are defined in dataset (CC0 can’t apply)
o Files might be public or restricted
Workflows
Draft, Published and
Versions
● Draft Dataset (Upload Data): dataset in review, can be shared with collaborators
● Published Dataset, v1: once published, dataset cannot be unpublished (only deaccessioned). Data Citation becomes public
● Published Dataset, v1.1: minor version for small changes to dataset description. Data Citation doesn’t change
● Published Dataset, v2: major version for new versions of data files. Data Citation changes
● Each new version (v1.1, v2) starts as a Draft
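The versioning rules on this slide can be sketched as a small state model. This is a hedged illustration only, not Dataverse's implementation: the `Dataset` class, its method names, and the `files_changed` flag are hypothetical, and the sketch assumes that any change to data files counts as a major version while description-only changes count as minor.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Dataset:
    """Hypothetical model of the draft/publish/version lifecycle."""
    version: Optional[Tuple[int, int]] = None  # None until first publication
    deaccessioned: bool = False

    def label(self) -> str:
        if self.version is None:
            return "draft"
        major, minor = self.version
        return f"v{major}" if minor == 0 else f"v{major}.{minor}"

    def publish(self, files_changed: bool) -> bool:
        """Publish the current draft.

        Returns True when the data citation becomes public or changes,
        False when it stays the same (minor version).
        """
        if self.deaccessioned:
            raise ValueError("a deaccessioned dataset cannot be published")
        if self.version is None:
            self.version = (1, 0)          # first publication: citation becomes public
            return True
        major, minor = self.version
        if files_changed:
            self.version = (major + 1, 0)  # major version: citation changes
            return True
        self.version = (major, minor + 1)  # minor version: citation unchanged
        return False

    def unpublish(self) -> None:
        # Per the slide, a published dataset cannot be unpublished.
        raise PermissionError("published datasets can only be deaccessioned")

    def deaccession(self) -> None:
        if self.version is None:
            raise ValueError("only published datasets can be deaccessioned")
        self.deaccessioned = True


ds = Dataset()
print(ds.label())                                    # draft
print(ds.publish(files_changed=True), ds.label())    # first publication
print(ds.publish(files_changed=False), ds.label())   # description-only change
print(ds.publish(files_changed=True), ds.label())    # new data files
```

Tying the citation change to the major version number keeps a cited version stable: a reader following an existing citation still reaches the exact data files that were cited.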
Multiple Roles for
Multiple Workflows
● Editor: Upload Data + Edit Metadata
● Manager: Upload Data + Edit Metadata + Set File Restrictions + License and Terms
● Curator: Upload Data + Edit Metadata + Set File Restrictions + License and Terms + Grant Access + Publish Dataset
● + Custom Roles
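One way to picture the roles on this slide is as cumulative permission sets. The sketch below assumes a reading of the slide in which Editor, Manager, and Curator each add permissions on top of the previous role; that reading, the permission names, and the `can` and `define_custom_role` helpers are all assumptions for illustration and do not reflect Dataverse's actual permission model or API.

```python
from typing import Dict, FrozenSet, Iterable

# Assumed cumulative reading of the slide: each role extends the one before it.
EDITOR: FrozenSet[str] = frozenset({"upload_data", "edit_metadata"})
MANAGER: FrozenSet[str] = EDITOR | {"set_file_restrictions", "set_license_and_terms"}
CURATOR: FrozenSet[str] = MANAGER | {"grant_access", "publish_dataset"}

ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "editor": EDITOR,
    "manager": MANAGER,
    "curator": CURATOR,
}


def define_custom_role(name: str, permissions: Iterable[str]) -> None:
    """Register a custom role (hypothetical helper) from known permissions."""
    requested = frozenset(permissions)
    unknown = requested - CURATOR
    if unknown:
        raise ValueError(f"unknown permissions: {sorted(unknown)}")
    ROLE_PERMISSIONS[name] = requested


def can(role: str, action: str) -> bool:
    """Return True if the role includes the action; unknown roles have none."""
    return action in ROLE_PERMISSIONS.get(role, frozenset())


# Example: a reviewer who may only publish, defined as a custom role.
define_custom_role("publisher_only", ["publish_dataset"])

print(can("editor", "publish_dataset"))          # False
print(can("curator", "publish_dataset"))         # True
print(can("publisher_only", "upload_data"))      # False
```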
Data Processing,
Analysis, and
Visualizations
Tabular Data: Converted to Preservation format
Download in Original format or
Preservation format (does not
depend on software package)
Tabular Data: Explore and Analyze with TwoRavens
Geospatial Data: Visualize in WorldMap
Demo acknowledgement: Dwayne Liburd, Sonia Barbosa
Not only Expanding in
Features, but also in Size
874 Dataverses
55,539 Datasets
1,173,733 Downloads
What’s coming
Beyond 4.0
● Integration with other Systems:
o DASH
o ORCID
o Journal Systems (in addition to OJS)
o Archivematica
o iRODS
● Support for Sensitive Data:
o Secure Storage
o DataTags
o Analysis with Privacy Preserving Algorithms
● Data Citation with Dataset Provenance
● Expanding APIs!
Thank You
mcrosas@iq.harvard.edu
@mercecrosas
http://datascience.iq.harvard.edu/team
