Researchers face the challenge of choosing from a diverse set of repositories to publish, preserve and share their data. Ideally, they have features available that seamlessly integrate the preservation step into their daily research routine. The PresQT (Preservation Quality Tool) project eases the use of repositories and serves as boilerplate between existing solutions while adding beneficial metadata and FAIR (Findability, Accessibility, Interoperability, and Reuse) tests. PresQT and its standards-based design with RESTful web services have been informed by user-centered design, and the project is a collaborative open-source implementation effort. The PresQT services extend the preservation tool landscape so that stakeholders can keep working in their chosen computational environment and receive additional features instead of having to switch to different software. PresQT services connect tools, workflows and databases to existing repositories. Current partners or implementations for open APIs include OSF, CurateND, EaaSI, GitHub, GitLab, Zenodo, FigShare, WholeTale, Jupyter and HUBzero. The diversity of partners contributes to understanding the needs of the stakeholders of PresQT services.
PresQT services are easy to integrate, and target systems can be added by extending JSON files and Python functions. Data is packaged in BagIt format for uploads, downloads and transfers. The current services include transfers with fixity checks supporting diverse hash algorithms, keyword enhancement via SciGraph, upload, download and connection to EaaSI services. The FAIR tests are under development and build on FairShake for initial tests. The next planned steps include creating indicators of how FAIR the data stored in the target repository is, with additional hints for improvement. To present the capabilities to interested developers of computational solutions, users of PresQT services and funding bodies, we have developed a demo user interface for demonstrating and testing the different features of PresQT services.
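As an illustration of a transfer fixity check, the following is a minimal sketch that uses only Python's standard `hashlib` module. It is not PresQT's implementation: the function names, the result dictionary, and the list of algorithms are assumptions made for this example, since the abstract only states that diverse hash algorithms are supported.

```python
import hashlib

# Illustrative list only; the abstract does not name the supported algorithms.
SUPPORTED_ALGORITHMS = ("md5", "sha1", "sha256", "sha512")


def compute_hashes(data: bytes, algorithms=SUPPORTED_ALGORITHMS) -> dict:
    """Return a hex digest of `data` for each requested algorithm."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}


def check_fixity(data: bytes, source_hashes: dict) -> dict:
    """Compare a digest recorded at the source with one computed on arrival.

    The first algorithm that both sides know is used for the comparison.
    """
    shared = [name for name in source_hashes if name in SUPPORTED_ALGORITHMS]
    if not shared:
        # No common algorithm: fixity cannot be determined.
        return {"fixity": None, "algorithm": None}
    algorithm = shared[0]
    received = hashlib.new(algorithm, data).hexdigest()
    return {"fixity": received == source_hashes[algorithm], "algorithm": algorithm}


# Hypothetical payload: hash it at the "source", then verify after "transfer".
original = b"temperature,depth\n4.1,10\n"
recorded = compute_hashes(original, ["sha256"])
ok_result = check_fixity(original, recorded)
bad_result = check_fixity(original + b"x", recorded)
```

Here `ok_result` reports a passed check, while `bad_result` reports a failure because the received bytes differ from what was hashed at the source.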
Improving the Publication and Re-Use of Data via PresQT Tools and Services
1. Hesburgh Libraries
Sandra Gesing, Natalie Meyers, Richard Johnson and John Wang
sandra.gesing@nd.edu
https://presqt.readthedocs.io/en/latest/
December 9, 2020
Session: Best Practices and Realities of Research Data Repositories: Which One Should I Choose to Publish My Data?
4. PresQT
An effort funded by an implementation grant and an earlier planning grant to address needs for preserving data and software. The goal is to collaboratively design, develop, and connect interoperable and repository-agnostic Data and Software Preservation Quality Tools.
- Implementation grant: https://www.imls.gov/grants/awarded/LG-70-18-0082-18
- Planning grant: https://www.imls.gov/grants/awarded/lg-72-16-0122-16
5. Collaborative Effort
Concept
• not standalone solutions
• partner systems and services easily integrable via RESTful APIs and services
• user-centered open design and collaborative development
16. Current Service Integrations
Service: Functionality
- EaaSI: Send resources from a PresQT server to EaaSI to generate an emulation proposal
- Keyword Enhancement: Suggest/add keywords to existing keywords
- FAIR Evaluator: Evaluate the FAIRness of a resource
17. PresQT - Characteristics
• Configure additional partner systems via JSON and Python functions
• Check fixity via hash algorithms
• Add metadata via JSON
• Transfer in BagIt format
• Enhance keywords
• Check FAIRness of research objects
• Check available tools via EaaSI
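To make the "Transfer in BagIt format" characteristic concrete, here is a minimal sketch of writing and validating a bag with the Python standard library. It follows the basic layout of the BagIt specification (a `bagit.txt` declaration, a `data/` payload directory, and a `manifest-sha256.txt` file) but is not PresQT's code; the function names `write_bag` and `validate_bag` and the sample file names are hypothetical, and only flat file names without subdirectories are handled.

```python
import hashlib
import tempfile
from pathlib import Path


def write_bag(bag_dir: Path, files: dict) -> None:
    """Write `files` (file name -> bytes) as a minimal BagIt bag."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)
    manifest_lines = []
    for name, content in sorted(files.items()):
        (data_dir / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        # Manifest line: checksum, whitespace, path relative to the bag root.
        manifest_lines.append(f"{digest}  data/{name}")
    # Bag declaration required by the BagIt specification.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )


def validate_bag(bag_dir: Path) -> bool:
    """Return True if every payload file matches its manifest checksum."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True


# Build a bag in a temporary directory, then corrupt one payload file.
with tempfile.TemporaryDirectory() as tmp:
    bag = Path(tmp) / "example_bag"
    write_bag(bag, {"readme.txt": b"hello", "results.csv": b"a,b\n1,2\n"})
    valid_before = validate_bag(bag)
    (bag / "data" / "readme.txt").write_bytes(b"tampered")
    valid_after = validate_bag(bag)
```

The manifest is what ties packaging to the fixity check: the bag validates right after it is written and stops validating once a payload file is modified.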