The panel will focus on a pilot project to ensure that all stakeholders understand the services and infrastructures to be included in the DMPs by the granting councils and CFI.
3. THE TREND TOWARDS SHARED SERVICES
• Other countries are developing shared services and
infrastructure to support research data
management services.
• Why? To reduce cost redundancies, pool
knowledge, and break down silos across disciplines.
• Common shared services are: discovery, data
registries, support and expertise, training, shared
repositories and preservation.
• General trend from domain-based services towards
generic RDM infrastructure. There are common
infrastructure and service requirements across
domains!
4. PROJECT ARC
• Initiated and supported by CARL
• December 2013 initial stakeholders meeting
• March 2014 working group launched, for 1 year
• Working group members represent
– CARL
– all four regional academic library associations:
CAUL, COPPUL, OCUL and Quebec
– CRKN
• Includes some of Canada’s top research data
management experts
5. PROJECT ARC
Builds on previous efforts by CARL to improve capacity
at Canadian universities in the area of research data
management:
• 2009 Research Data Management Toolkit: Unseen
Opportunities
• 2010 Library Roles in Management Research Data
• 2011-2012 Proposal: “Canadian National
Collaborative Data Infrastructure Project”
• 2013 RDM Course: Introduction to Research Data
Management Services
6. PROJECT ARC AIM AND VISION
• A future in which Canada capitalizes on the
trend towards data-intensive research and is
a world leader in research and innovation
• This future is achievable with comprehensive
support for research data management at a
national scale.
• Project ARC's aim is to improve our national
capacity for the management, preservation,
and re-use of research data.
7. PROJECT ARC – SCOPE
• Bring together existing library-based initiatives to
better coordinate activities and build capacity across
the country
• Lay the foundation for a library-based research data
management network
• Work closely with other stakeholders (e.g. CANARIE,
Compute Canada, Research Data Canada) to ensure
integration with and support for other infrastructures
and initiatives in Canada
8. PROJECT ARC – PRINCIPLES
• Data are a public good
• Intelligent access: openness, with respect for privacy
• Collaborative approaches: cost savings and sharing expertise
• Inclusiveness: aim to serve all researchers and create a more level
playing field
• Commitment to standards and interoperability
• International relationships: liaise internationally and ensure our work is
in keeping with international practices
• Respect for differences: flexibility to meet the needs of different regions,
institutions, and disciplines
• Open source: Tools will be contributed back to the community
• Stewardship: a sense of responsibility for managing research data over
the long term
9. OBJECTIVES OF PROJECT ARC
Liaising closely with all relevant stakeholders in this
arena,
1. Provide support for institutions to deliver data
management plans (DMPs)
2. Develop a plan for the implementation of a centre
of expertise for the curation of research data in
Canada
3. Undertake a pilot that will act as an exemplar for
a national preservation service for research data
4. Develop an organizational framework and
operational plan for a library-based research data
management network in Canada
10. PORTAGE
At Project ARC mid-point (September 2014), a
network name was proposed and concepts were
refined…
The Portage network will have two major
components:
• A distributed centre of expertise for research data
management, and
• A national preservation system for research data
that will evolve and expand over time
11. NETWORK CENTRE OF EXPERTISE
1) Comprehensive set of resources to support data management planning
– How-to guides, case studies, training materials
– Cooperation with UK Digital Curation Centre (DCC)
– Currently being collected on Project ARC website
2) National automated DMP tool to assist Canadian researchers in
developing data management plans
– DMPonline (originally developed by DCC) selected
3) Consulting services
– Draw on expertise of librarians and others from across the country
– Support data curation, training, DMPs, discovery, preservation,
privacy-security-ethics
– Build human capacity across the country
12. NATIONAL PRESERVATION SYSTEM
(more from Chuck and Walter)
Advice and support for researchers depend on viable technical
solutions!
• Continue a pilot in close collaboration with Compute Canada
and RDC, including some of the domain data centres
• Domain data centres currently involved are Canadian
Astronomy Data Centre and C-Brain, whose creation was
supported by the CANARIE research software program
• Goal is to enable all interested academic libraries to
participate, whether or not they have their own local
infrastructure
• Complements high-performance computing infrastructure and
domain repositories and contributes an integration layer
15. A COLLABORATIVE EFFORT AMONG RDC, CARL
Project ARC, Compute Canada, CANARIE,
Scholars Portal, SFU Libraries, CANFAR, C-Brain,
CPDN
Born at the DI Summit 2014
16. THE CONTEXT:
• Data are both a product and a resource for 21st
century discovery.
• The TC3+ are preparing to require Data
Management Plans as part of the funding
application process.
• The federal government has extended its
commitment to open government and open data
to cover federally funded research:
17. THE CONTEXT:
The Government of Canada will maximize access to
federally funded scientific research to encourage greater
collaboration and engagement with the scientific
community, the private sector, and the public.
Among the commitments for 2014 to 2016:
Launch of open access to publications and data resulting
from federally funded scientific activities
Canada's Action Plan on Open Government 2014-16
http://open.canada.ca/
18. THE PROBLEM:
• Data that cannot be discovered cannot be open!
• Data that are only on someone’s hard drive or
memory stick cannot be open!
• Data that are not curated cannot be open for
long!
19. THE PROBLEM:
• Currently in Canada, most researchers lack
access to the services and the infrastructure
that would permit them to be good stewards of
their research data and to make them accessible.
• Outside of some data-intensive disciplines, little
is in place to provide for the long-term curation
and preservation of data
20. THE OPPORTUNITY:
• Many of the elements for a national system of
data stewardship are in place – the networks
that are required with CANARIE and the ORANS;
storage systems at Compute Canada; data
expertise in research libraries and the ARC
Project of CARL; significant experience in
developing repositories with CANFAR, C-Brain,
and CPDN among others.
21. THE CHALLENGE:
• Can we integrate those elements at a pilot level?
• Can we work with a small set of researchers to
ingest their data into a storage and curation
environment easily and seamlessly in a manner
that provides for easy retrieval?
• Can we create this opportunity first at a local
level and then demonstrate integration among a
few local sites into a prototypical national
network that provides appropriate replication
and the basis for long-term preservation?
22. THE ANTICIPATED RESULT:
• If we meet the challenge, we hope to reach a set
of conclusions that will allow us to recommend
what would be required to grow such a
prototypical system into a truly national network
that would serve those parts of the research
community currently unserved and would provide
further support and backup for existing
repositories, some of which have concerns about
their long-term viability.
23. PROGRESS TO DATE:
• We have identified a model, about which Chuck
Humphrey will speak in a moment.
• Building on a local-scale project at SFU, we have
researcher data being moved into Compute
Canada storage resources by library staff
• We have a plan for a similar activity at the
University of Toronto to get underway in 2015
24. NEXT STEPS:
• We will shortly start having researchers do their own
ingest directly into the repository and archive
environment at SFU
• We will be looking at establishing a duplicate
installation at another university
• We will be looking to test replication services
• We will look to have researchers use the system at a
distance
• We will document the processes in detail
• We will begin to discuss what it would take to scale
26. THE PILOT
Working with existing digital technology and
expertise, the Pilot is to assemble a research
data management infrastructure
demonstrating interoperability among data
repositories and the archiving of research
data.
27. LEVELS OF DATA STEWARDSHIP
• Research data management infrastructure supports data
stewardship that occurs at different levels across the
research lifecycle.
– The researcher at the project level
– The data repository level
– The interoperability level, both regional and
national
28. EXCHANGES AMONG LEVELS
• The exchange of research data and metadata
among these three levels can encounter barriers
or gaps.
• A Data Management Plan is helpful in identifying
a pathway for data and metadata across levels,
bridging gaps and overcoming barriers.
29. KEY OBJECTIVES
Because no national RDM infrastructure
exists today, the pilot is building
pathways across levels and assembling an
operational system to demonstrate a
community response to providing RDM
infrastructure.