DATA MANAGEMENT PLANS & PLANNING:
MEETING THE NSF REQUIREMENT
June 13, 2012
WHO ARE WE?

Heather Coates
Digital Scholarship & Data Management Librarian
Liaison to the School of Public Health
University Library

Kristi Palmer
Digital Scholarship Team Leader
Liaison to the Department of History
and Programs of Women's and American Studies
University Library
LEARNING OBJECTIVES

After attending this workshop:

 You will understand the NSF data policies.
 You will be aware of the relevant data-related services at IUPUI.
 You will have resources to develop a data management plan
  (DMP) for your NSF proposal(s).
 You will be able to write a comprehensive DMP for your NSF
  proposal(s).
 You will send your DMP draft to the Data Services Program for
  review and assistance as needed.
OVERVIEW

 Context for the NSF data policies

 Meeting the NSF DMP requirement
   The requirement: 5 elements
   Developing a Data Management Plan
   Implementing your plan


 Workshop Evaluation (5 minutes)
CONTEXT: SCHOLARLY COMMUNICATIONS

 Funding, funding, funding

 Scholarly Impact
     Exposure leads to increased citation
     More equal access (especially for students)
     Facilitates reproducibility
     Facilitates new discoveries via secondary analysis/data re-use
     Fosters productive collaborations
     Leads to new computational techniques


 Planning for the future
   If we can’t find it, it doesn’t exist
   Persistent access
   Long-term preservation of scholarly records
CONTEXT: WHY THE LIBRARY?

           preservation, curation, access
 Trusted member of the institution
 Organizational structure lends itself to collaboration with
  researchers
 Interdisciplinary by nature
 Existing infrastructure for digital information
 Existing expertise in preserving and providing access to
  information
   Program of Digital Scholarship
   Archives
CONTEXT: DATA SERVICES PROGRAM

 Part of the Program of Digital Scholarship
 Mission
   Identifying data issues and connecting you to the solutions
 Services
   Workshops
   Individual consultations
   Data repository
 Resources
   Guide to NSF Data Management Plan Requirement
   Website
CONTEXT: TERMINOLOGY

 Cyberinfrastructure: computing resources & networks, services,
  & people (see Empowering People, 2009 for more)
 Data management: technical processing and preparation of data
  for analysis
 Data curation: selection of data for preservation and adding
  value for current and future use
 Data citation: mechanisms to enable easy reuse and verification,
  track impact of data, and create structures to recognize and
  reward researchers (DataCite)
 Data sharing: must take into account ethical and legal issues; a
  spectrum with many options
 Data stewardship:
CONTEXT: FEDERAL POLICIES

 Issues in scholarly communication
   Open access
   Open data & data citation
   Data management & curation


 Federal policies (incremental steps towards openness)
     National Research Council, 1985
     Office of Management & Budget, 1999: Circular A-110
     NIH Data Sharing Policy, 2003
     NIH Public Access Policy, 2008
     NSF DMP Requirement, 2011
     Other policies: NEH, NOAA, NASA, Howard Hughes Medical Institute,
      Wellcome Trust
CONTEXT: IU STRATEGIC PLAN

IU Empowering People Strategic Plan for IT (2009), Action
33:

  “IU should provision a data utility service for research data
  that affords abundant near- and long-term storage, ease of
  use, and preservation capabilities. This data utility will need
  to offer a range of services for securing data, providing
  authorized access within and beyond IU; ensuring metadata
  description, annotation, and provenance; and providing
  backup/recovery services.”
CONTEXT: OPEN ACCESS

 What is Open Access?
   Freely available, online, and free of most copyright restrictions
 Why should you care?
   Right thing to do?
   Increase your citations
   “We analysed 119,924 conference articles in computer science and related
    disciplines. The mean number of citations to offline articles is 2.74, and the
    mean number of citations to online articles is 7.03, an increase of 157%.”
    (Lawrence, 2001)
   Publisher functions need not reside in for-profit hands
   “Between 1975 and 2005 the average cost of journals in chemistry and
    physics rose from $76.84 to $1,879.56. In the same period, the cost of a
    gallon of unleaded regular gasoline rose from 55 cents to $1.82. If the gallon
    of gas had increased in price at the same rate as chemistry and physics
    journals over this period it would have reached $12.43 in 2005, and would
    be over $14.50 today.” (Lewis, 2008)
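
The 157% figure in the Lawrence quotation can be checked from the two means it reports. A minimal Python sketch of that arithmetic follows; the function name is ours and is used only for illustration.

```python
def percent_increase(before: float, after: float) -> float:
    """Relative increase from `before` to `after`, as a percentage."""
    return (after - before) / before * 100


# Mean citation counts quoted above: 2.74 (offline) and 7.03 (online).
increase = percent_increase(2.74, 7.03)
print(f"{increase:.1f}%")  # about 156.6%, which rounds to the quoted 157%
```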
CONTEXT: OPEN ACCESS @ IUPUI AND IU

 IUPUI University Library Program of Digital Scholarship
 http://www.ulib.iupui.edu/digitalscholarship
   Open Journals
   IUPUIScholarWorks-Faculty Scholarship
     Electronic Theses and Dissertations
   Cultural Heritage Collections
   Data
   eArchives
CONTEXT: RESEARCH LIFE CYCLE




Source: DDI Structural Reform Group. “DDI Version 3.0 Conceptual Model." DDI Alliance. 2004. Accessed on 11 August 2008.
<http://www.icpsr.umich.edu/DDI/committee-info/Concept-Model-WD.pdf>.
CONTEXT: BENEFITS OF PLANNING

 Saves time
   Less reorganization down the road
 Increases efficiency
   Gathers necessary information for analysis and writing
   Prevents problems in understanding data and metadata
 Prevents data loss
   If you have a plan, you are more likely to back up your data
 Makes it easier to preserve your data
   Documentation is more easily created throughout a project
   Metadata generation can be automated or incorporated into procedures
 Requirements of some funding agencies and institutions
DMP: INTERPRETING THE POLICY

 Why?
     Increased impact of research money
     Reduce redundant data collection
     Enhance use and value of existing data
     Further scientific research
     Data gathering tool
       What kinds of data are we collecting?
       How are researchers collecting, managing, and preserving data?
       What are community norms?
 Language is broad to allow input from research communities
 Implementation costs of the DMP CAN be included in direct costs
DMP: KEEP IN MIND

 The gist of it…
   Describe what you will do with your data during and after the proposed
    project
   Ensures data is safe now and in the future
 DMP should reflect…
   Awareness of data management and curation in your discipline
   Feasible plan to utilize available cyberinfrastructure
 Try to…
   Explain the rationale for your choices
   Identify roles for data management and curation activities
DMP: ELEMENTS

   Types of data
   Standards and metadata
   Access and sharing
   Re-use, re-distribution, and the production of derivatives
   Long-term preservation
   [Budget]
DMP: TYPES OF DATA [1]

      Use standards common in your research community

 Characterize the data
   Types of data
      experimental, observational, raw or derived, models, simulations, curriculum
       materials, software, images, audio, video, etc.
   File formats (e.g., text, spreadsheet, database, etc.)
   How much data? (# of files, total size)
   Will the data be reproducible?


 Relationship to existing data? (i.e., interoperability)
   Syntactic
   Semantic
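
One way to answer "how much data?" and "which file formats?" for an existing project folder is to count it directly. The sketch below is only an illustration: the function name `inventory` and the sample files are assumptions, not part of any IUPUI or NSF tool.

```python
import tempfile
from collections import Counter
from pathlib import Path


def inventory(data_dir):
    """Return (file count, total bytes, Counter of file extensions)."""
    files = [p for p in Path(data_dir).rglob("*") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    # Lowercase the extension so .CSV and .csv are counted together.
    formats = Counter(p.suffix.lower() or "(none)" for p in files)
    return len(files), total_bytes, formats


# Demonstration on a throwaway folder with made-up files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "readings.csv").write_text("a,b\n1,2\n")
    (Path(tmp) / "notes.txt").write_text("calibrated 2012-06-13\n")
    (Path(tmp) / "rerun.CSV").write_text("a,b\n3,4\n")
    count, total_bytes, formats = inventory(tmp)

print(count, total_bytes, dict(formats))
```

Pointing `inventory` at a pilot dataset gives a rough basis for the size estimate in the plan.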
DMP: TYPES OF DATA [2]

 How will data be collected?
   How? (tools, instruments, measurements, etc.)
   When? (timeframe, series)
   Where? (sites, settings)


 How will data be processed?
   Workflows (brief overview using flow chart)
   Software packages


 How will the data be stored and managed?
   File naming conventions
   Version control
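
A file naming convention with a version number is easier to follow when it can be generated and checked automatically. The sketch below assumes a hypothetical pattern, `<project>_<datatype>_<YYYYMMDD>_v<NN>.<ext>`; the pattern, function names, and sample values are illustrative and should be replaced with whatever your research community uses.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<datatype>_<YYYYMMDD>_v<NN>.<ext>
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_(?P<datatype>[a-z0-9]+)_"
    r"(?P<date>\d{8})_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def build_name(project, datatype, collected, version, ext):
    """Build a file name that follows the convention above."""
    return f"{project}_{datatype}_{collected:%Y%m%d}_v{version:02d}.{ext}"


def is_valid_name(name):
    """True if `name` matches the convention (lowercase, no spaces)."""
    return NAME_PATTERN.match(name) is not None


name = build_name("lakestudy", "temp", date(2012, 6, 13), 1, "csv")
print(name)  # lakestudy_temp_20120613_v01.csv
print(is_valid_name("Final data v2.csv"))  # False
```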
DMP: TYPES OF DATA [3]

 What QA & QC measures will be used?
   Identify steps during processing and analysis to eliminate missing data
    points, identify outliers, and provide statistical summaries (e.g., double
    data entry, histograms, scatterplots)
   Before data are collected, define and enforce standards and assign
    responsibility
   During project, document processes and any changes or deviations


 What is the backup and security plan?
   Identify particular security or confidentiality issues
   Describe location & frequency


 Roles & responsibilities
   Who will carry out data collection, processing, and backup activities?
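
The QA/QC steps listed above (counting missing data points, identifying outliers, producing statistical summaries) can be scripted so that they run the same way every time. This is a minimal sketch using only the Python standard library; the 1.5 × IQR outlier rule is one common choice rather than a requirement, and the readings are made-up values.

```python
import statistics


def qc_report(values):
    """Summarize one numeric column; None marks a missing data point."""
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "n_missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "stdev": statistics.stdev(present),
        # Values outside the 1.5 * IQR fences are flagged for review.
        "outliers": [v for v in present if v < low or v > high],
    }


# Made-up readings; two are missing and one looks like an entry error.
readings = [391.2, 390.8, None, 392.1, 391.5, 455.0, 391.9, None, 390.6]
report = qc_report(readings)
print(report["n_missing"], report["outliers"])  # 2 [455.0]
```

Flagged values should be reviewed and the decision documented, not silently dropped.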
EXAMPLE: TYPES OF DATA

Atmospheric Concentrations of CO2, Mauna Loa
Observatory, Hawaii, 2011-2013
https://www.dataone.org/sites/all/documents/DMP_MaunaLoa_Formatted.pdf

Arthropod responses to grassland nutrient limitation
https://www.dataone.org/sites/all/documents/DMP_NutNet_Formatted.pdf
DMP: STANDARDS & METADATA [1]

 Metadata describes the who, what, when, where, how, why of
  the data
   Include workflow: how you get from raw data to final products


 Purpose: enable finding, organization, interoperability,
  identification, archiving & preservation

 Standards are commonly agreed upon terms and definitions in a
  structured format
     Dublin Core (commonly used by libraries)
     Darwin Core (geographic occurrence of species)
     EML (ecology)
     Data Documentation Initiative (DDI; social sciences)
     IEEE LOM (learning objects metadata)
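
As an illustration of what a structured metadata record looks like, the sketch below writes a few Dublin Core elements as XML with Python's standard library. The `metadata` wrapper element, the function name, and all field values are placeholders chosen for this example; a repository may expect a different wrapper or additional elements.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a simple Dublin Core XML record from {element: value}."""
    root = ET.Element("metadata")  # wrapper name is an assumption
    for element, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


record = dublin_core_record({
    "title": "Example stream temperature readings",  # placeholder values
    "creator": "Doe, Jane",
    "date": "2012-06-13",
    "format": "text/csv",
    "rights": "CC0 1.0",
})
print(record)
```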
DMP: STANDARDS & METADATA [2]

 Ask yourself: will your datasets be self-explanatory or
  understandable in isolation?

 Decisions to make about metadata
   Relevant standard(s)
   Format
   Content
      What information is needed to use and interpret in 5 years, 25 years?


 How are metadata created?
   Automatically generated
   Manually created
EXAMPLE: STANDARDS & METADATA [1]

Atmospheric Concentrations of CO2, Mauna Loa Observatory,
Hawaii, 2011-2013
https://www.dataone.org/sites/all/documents/DMP_MaunaLoa_Formatted.pdf

Metadata will be comprised of two formats: contextual
information about the data in a text based document and ISO
19115 standard metadata in an xml file. These two formats for
metadata were chosen to provide a full explanation of the data
(text format) and to ensure compatibility with international
standards (xml format). The standard XML file will be more
complete; the document file will be a human-readable summary of
the XML file.
EXAMPLE: STANDARDS & METADATA [2]

Rio Grande Hydrologic Geodatabase Compendium
https://www.dataone.org/sites/all/documents/DMP_Hydrologic_Formatted.pdf

Microsoft Access Database format will be used since it is readily accessible and
it is compatible with ESRI ArcGIS (http://www.esri.com/software/arcgis/index.html), a
Geographic Information System software package used by the stakeholders. Naming
conventions will be consistent – no spaces will be used in table names or field names.
The file naming convention will consist of the datasource_datatype format for raw data
files. Data reporting functionality will be built into the VBA processing programs to
provide output in .txt file format for number of records per source when updatable data
sources are refreshed.

Every effort will be made to go back to the authoritative source for an
identified dataset. Quality control of the database will be performed using SQL
statements that capitalize on the database structure to ensure relational database
integrity. Appropriate primary keys will be assigned to manage possible data duplicates.
Potential duplicate site IDs will be handled through automated procedures and the
creation of alternate ID tables.

A data dictionary will be created that defines the table definition, table
fields, and table field data types. An entity-relationship diagram will be created that
defines the relational structure of the database.

A metadata record will be produced using the FGDC standard that describes the
entire geodatabase. The FGDC standard was chosen due to required Federal government
standards.
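
The SQL-based quality control described in this example can be pictured with a small query. The sketch below uses SQLite through Python's `sqlite3` module as a stand-in for the Microsoft Access database named in the plan; the table name, column names, and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No spaces in table or field names, as the plan's naming convention requires.
conn.execute("CREATE TABLE site_readings (site_id TEXT, source TEXT, flow REAL)")
conn.executemany(
    "INSERT INTO site_readings VALUES (?, ?, ?)",
    [
        ("RG001", "agency_a", 12.5),
        ("RG002", "agency_a", 8.1),
        ("RG001", "agency_b", 12.7),  # same site ID from a second source
    ],
)

# Site IDs that appear more than once are candidates for an alternate ID table.
duplicates = conn.execute(
    """
    SELECT site_id, COUNT(*) AS n
    FROM site_readings
    GROUP BY site_id
    HAVING COUNT(*) > 1
    ORDER BY site_id
    """
).fetchall()
print(duplicates)  # [('RG001', 2)]

# Number of records per source, as in the plan's refresh report.
records_per_source = conn.execute(
    "SELECT source, COUNT(*) FROM site_readings GROUP BY source ORDER BY source"
).fetchall()
print(records_per_source)  # [('agency_a', 2), ('agency_b', 1)]
```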
DMP: ACCESS & SHARING

 What are your obligations for sharing?
   Funding agency, institution, other organization, legal, etc.
 What are the ethical or legal issues? (e.g., privacy,
  confidentiality, security, intellectual property, or other rights)

 How will the data be made available?
 When will the data be made available?
   When will the data become available?
   For how long will the data be available?
 What is the process for gaining access?
 Who will have access to the data?
DMP: RE-USE, RE-DISTRIBUTION, ETC.

 What rights will you retain before data is made available?
 Will permission restrictions be necessary?
   Limits or conditions for political, commercial, or patent reasons?
 Is there an embargo period? Why?

 Future users and uses
   Who might be interested in the data?
   How might you anticipate this data being used?
   What value might the data have for these people?
EXAMPLE: ACCESS, SHARING, RE-USE

Development of a NanoKlein Calorimeter
http://libguides.unm.edu/content.php?pid=137795&sid=1422879

  We expect to apply for a patent for this instrument. All of the
  materials submitted as part of the patent process will be a matter
  of public record. We will also make technical drawings, test data
  and calibration data available through our institutional repository.

Cave Microbiology
http://libguides.unm.edu/content.php?pid=137795&sid=1422879
DMP: LONG-TERM PRESERVATION

      Project-based funding does not lend itself to long-term
                          preservation.

 What data will be preserved?
   What transformations are necessary to prepare the data?
 How long do you think the data will be useful? How long will the
  data be preserved?
 Contextual information needed to make the data reusable
   metadata, references, reports, manuscripts, grant proposal, etc.
 Where will it be preserved?
   Links to published materials and other outcomes? Use of persistent
    citation?
   Procedures for preservation and back-up?
 Who will be the contact for the dataset?
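
Preservation and back-up procedures usually include a way to confirm that stored files have not changed. One common approach, sketched below under our own assumptions, is to record a checksum for every file and compare it later; the function names and the sample file are hypothetical, and a repository may run its own fixity checks instead.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file's relative path to its checksum."""
    root = Path(data_dir)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir, manifest):
    """Names whose current checksum differs from the stored manifest."""
    current = build_manifest(data_dir)
    return sorted(n for n in manifest if current.get(n) != manifest[n])


# Demonstration with a throwaway folder and a made-up data file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "abundance.csv"
    sample.write_text("site,count\nA,3\n")
    manifest = build_manifest(tmp)
    unchanged = changed_files(tmp, manifest)
    sample.write_text("site,count\nA,4\n")  # simulate an unintended edit
    altered = changed_files(tmp, manifest)

print(unchanged, altered)  # [] ['abundance.csv']
```

Storing the manifest alongside the backup lets whoever is the dataset contact rerun the comparison at any time.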
EXAMPLE: LONG-TERM PRESERVATION [1]

Arthropod responses to grassland nutrient limitation
https://www.dataone.org/sites/all/documents/DMP_NutNet_Formatted.pdf

We will preserve both arthropod datasets generated during this
project (abundance and stoichiometry) for the long term in the
Digital Conservancy at the U of M. We will include the .csv
files, along with the associated metadata files. We will also submit
an abstract with the datasets that describe their original context
and any potentially relevant project information. Borer will be
responsible for preparing data for long-term preservation and for
updating contact information for investigators.
EXAMPLE: LONG-TERM PRESERVATION [2]

Improving the long-term preservability of HDF-formatted data by
creating maps to file contents
https://www.dataone.org/sites/all/documents/DMP_HDFMap_Formatted.pdf

The writer software will be preserved by the HDF Group for the life
of the HDF libraries. The HDF Group uses industry-standard best
practices to ensure the integrity of their software and systems.
Once the map writer has been used to generate maps for every
HDF file in existence, the continued existence of the writer
software is not required. The reader software will be preserved at
SourceForge.org for as long as there is community interest. The
collection of HDF files will be preserved at NSIDC as long as utility
is deemed high.
IUPUIDATAWORKS

 Institutional repository that can facilitate subject repositories
 Policies are being developed, informed by faculty needs
 Pilot projects
   More support at little/no cost
   Flexibility in what we are willing to do
   New tools to demonstrate impact of research
 The future
   Standardized levels of service
   Standardized policies, responsive to faculty needs
   Cost recovery for significant intellectual/time investment
IMPLEMENTING YOUR PLAN [1]

 The DMP is a working document

 NSF expects progress to be reported (progress reports, final
  reports, new grant proposals)

 Incorporate implementation into the project startup process
   C&G, IRB, IACUC all have to be in place before data collection can begin
   Review, revise, and set up your system during startup


 Good documentation ensures…
   A shared understanding of the data throughout a project
   That future researchers will be able to understand data within the
    relevant context
   That re-users of data are able to interpret the data appropriately
IMPLEMENTING YOUR PLAN [2]

Research File System: http://pti.iu.edu/storage/rfs
Scholarly Data Archive: http://pti.iu.edu/storage/sda
Research Technologies, UITS: http://uits.iu.edu/page/avel
Core Services, UITS: http://pti.iu.edu/cs
Scholarly Cyberinfrastructure, UITS: http://uits.iu.edu/page/amee
CTSI Tools: http://www.indianactsi.org/rct (Alfresco Share, REDCap)

Program of Digital Scholarship: http://ulib.iupui.edu/digitalscholarship
Center for Research & Learning: http://crl.iupui.edu/
OVCR: http://research.iupui.edu/development/
Office of Academic Affairs: http://www.academicaffairs.iupui.edu
Intellectual Property Policy: https://www.indiana.edu/~vpfaa/academicguide/index.php/Policy_I-11

IUWare: https://iuware.iu.edu
IUanyWare: https://iuanyware.iu.edu/vpn/index.html
StatMath: http://www.indiana.edu/~statmath/
Statistics Consulting Center: http://www.math.iupui.edu/asci/
PRACTICAL TOOLS

Lynda.com tutorials: http://ittraining.iu.edu/lynda/default.aspx
  Cleaning Up Your Excel Data (2010)
  Managing & Analyzing Data in Excel (2010)
  Data Validation in Depth (2010)
DMPTool: https://dmp.cdlib.org/
DMPOnline: https://dmponline.dcc.ac.uk/
UK Data Archive Costing Tool:
http://www.data-archive.ac.uk/media/257647/ukda_jiscdmcosting.pdf
Creative Commons Licenses & Data:
http://wiki.creativecommons.org/Data
Licensing Research Data, Digital Curation Centre:
http://www.dcc.ac.uk/resources/how-guides/license-research-data
CIC Author Addendum:
http://www.cic.net/authors
RECOMMENDED READING

UK Data Archive: Managing & Sharing Data Brochure:
http://www.data-archive.ac.uk/media/2894/managingsharing.pdf
MORE RESOURCES

 National Science Board, Digital Research Data Sharing &
  Management, 2012 (pre-publication):
  http://www.nsf.gov/nsb/publications/2011/nsb1124.pdf
 Committee on Science, Engineering, and Public Policy (U.S.).
  (2009). Ensuring the integrity, accessibility, and stewardship of
  research data in the digital age. Washington, D.C.: National
  Academies Press.
 National Science Board Committee on Strategy and Budget Task
  Force on Data Policies. (2011). Digital Research Data Sharing &
  Management. Washington, D.C.: National Science Board.
 America Creating Opportunities to Meaningfully Promote
  Excellence in Technology, Education, and Science Reauthorization
  Act of 2010, Pub. L. No. 111-358, 124 Stat. 3982 (2010).
  Retrieved from the Library of Congress Thomas database.
REFERENCES

1.   Higgins, S. (n.d.). What are metadata standards.
     http://www.dcc.ac.uk/resources/briefing-papers/standards-watch-papers/what-are-metadata-standards
2.   Digital Curation Centre. (n.d.). DCC Charter and Statement of Principles.
     Retrieved from http://www.dcc.ac.uk/about-us/dcc-charter.
3.   Indiana University. (2011). Indiana University’s Advanced
     Cyberinfrastructure. Retrieved from
     http://pti.iu.edu/cyberinfrastructure.pdf.
4.   Indiana University. (2009). Empowering People: Indiana University’s
     Strategic Plan for Information Technology. Retrieved from
     http://ovpit.iu.edu/strategic2/.
5.   National Science Foundation. (2011). Award and Administration Guide:
     Chapter VI.D.4., Dissemination and Sharing of Research Results. Retrieved
     from
     http://www.nsf.gov/pubs/policydocs/pappguide/nsf11001/aag_6.jsp#VID4.
6.   Lawrence, S. Free online availability substantially increases a paper’s
     impact. Nature, 31 May 2001.
     http://www.nature.com/nature/debates/e-access/Articles/lawrence.html
     (accessed November 5, 2008)
7.   Lewis, David W. "Library budgets, open access, and the future of scholarly
     communication: Transformations in academic publishing." C&RL News, May
     2008, Vol. 69, No. 5. [Available at:
     http://www.ala.org/ala/mgrps/divs/acrl/publications/crlnews/2008/may/ALA_print_layout_1_471139_471139.cfm]
COMPELLING CASES FOR OPEN DATA

SPARC, Research is more valuable when it’s shared:
http://www.arl.org/sparc/greaterreach/index.shtml

Tim Berners-Lee: http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html

Open-source cancer research: http://www.ted.com/talks/jay_bradner_open_source_cancer_research.html

Polymath problem blogs:
http://polymathprojects.org/about/
http://stevekochscience.blogspot.com/2011/02/open-data-success-story.html
http://eaves.ca/2011/09/07/the-economics-of-open-data-mini-case-transit-data-translink/
THANK YOU

Tell us what you think: take a brief survey.


Find us @
http://ulib.iupui.edu/digitalscholarship/dataservices
Heather Coates, hcoates@iupui.edu, 317-278-7125
Kristi Palmer, klpalmer@iupui.edu, 317-274-8230

IUB
Stacy Konkiel, skonkiel@indiana.edu, 812-856-5295
EXTRA: NIH DATA SHARING POLICY

 Applies to applications requesting $500,000 or more in direct costs in any
  year of the proposed research
 Covers final research data: not summary statistics or tables, and not
  underlying pathology reports or other clinical source documents; may include
  both raw data and derived variables
 If an application describes a data-sharing plan, NIH expects that plan
  to be enacted.
 NIH expects the timely release and sharing of data to be no later than
  the acceptance for publication of the main findings from the final
  dataset.
 It is the responsibility of the investigators, their Institutional Review
  Board (IRB), and their institution to protect the rights of subjects and
  the confidentiality of the data. Prior to sharing, data should be
  redacted to strip all identifiers, and effective strategies should be
  adopted to minimize risks of unauthorized disclosure of personal
  identifiers.
EXTRA: NIH DATA SHARING PLAN

   describe briefly the expected schedule for data sharing
   the format of the final dataset
   the documentation to be provided
   whether or not any analytic tools also will be provided
   whether or not a data-sharing agreement will be required
     if so, a brief description of such an agreement (including the criteria for
      deciding who can receive the data and whether or not any conditions
      will be placed on their use)
 mode of data sharing (e.g., under their own auspices by mailing
  a disk or posting data on their institutional or personal
  website, through a data archive or enclave)
 Applicants may request funds in their application for data
  sharing.
RESOURCES

National Institutes of Health, Data Sharing Policy
http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm
NIH Public Access Policy Implications
http://publicaccess.nih.gov/public_access_policy_implications_2012.pdf

IUPUI
 

Plus de IUPUI (20)

Altmetrics 101 - Altmetrics in Libraries
Altmetrics 101 - Altmetrics in LibrariesAltmetrics 101 - Altmetrics in Libraries
Altmetrics 101 - Altmetrics in Libraries
 
Gather evidence to demonstrate the impact of your research
Gather evidence to demonstrate the impact of your researchGather evidence to demonstrate the impact of your research
Gather evidence to demonstrate the impact of your research
 
Managing data responsibly to enable research interity
Managing data responsibly to enable research interityManaging data responsibly to enable research interity
Managing data responsibly to enable research interity
 
Case studies for open science
Case studies for open scienceCase studies for open science
Case studies for open science
 
Midwest Medical Library Association 2015 Big Data Panel
Midwest Medical Library Association 2015 Big Data PanelMidwest Medical Library Association 2015 Big Data Panel
Midwest Medical Library Association 2015 Big Data Panel
 
Gathering Evidence to Demonstrate Impact
Gathering Evidence to Demonstrate ImpactGathering Evidence to Demonstrate Impact
Gathering Evidence to Demonstrate Impact
 
Citation & altmetrics - a comparison
Citation & altmetrics - a comparisonCitation & altmetrics - a comparison
Citation & altmetrics - a comparison
 
Altmetrics for Team Science
Altmetrics for Team ScienceAltmetrics for Team Science
Altmetrics for Team Science
 
Ensuring data quality
Ensuring data qualityEnsuring data quality
Ensuring data quality
 
Preventing data loss
Preventing data lossPreventing data loss
Preventing data loss
 
Practical Data Management Plans
Practical Data Management PlansPractical Data Management Plans
Practical Data Management Plans
 
Teaching data management in a lab environment (IASSIST 2014)
Teaching data management in a lab environment (IASSIST 2014)Teaching data management in a lab environment (IASSIST 2014)
Teaching data management in a lab environment (IASSIST 2014)
 
Building the Future of Research Together
Building the Future of Research TogetherBuilding the Future of Research Together
Building the Future of Research Together
 
NIH Data Sharing Plan Workshop - Handout
NIH Data Sharing Plan Workshop - HandoutNIH Data Sharing Plan Workshop - Handout
NIH Data Sharing Plan Workshop - Handout
 
NIH Data Sharing Plan Workshop - Slides
NIH Data Sharing Plan Workshop - SlidesNIH Data Sharing Plan Workshop - Slides
NIH Data Sharing Plan Workshop - Slides
 
Data Management Lab: Session 4 Slides
Data Management Lab: Session 4 SlidesData Management Lab: Session 4 Slides
Data Management Lab: Session 4 Slides
 
Data Management Lab: Session 4 Review Outline
Data Management Lab: Session 4 Review OutlineData Management Lab: Session 4 Review Outline
Data Management Lab: Session 4 Review Outline
 
Data Management Lab: Session 3 Slides
Data Management Lab: Session 3 SlidesData Management Lab: Session 3 Slides
Data Management Lab: Session 3 Slides
 
Data Management Lab: Session 3 Data Review Checklist
Data Management Lab: Session 3 Data Review ChecklistData Management Lab: Session 3 Data Review Checklist
Data Management Lab: Session 3 Data Review Checklist
 
Data Management Lab: Session 3 Data Entry Best Practices
Data Management Lab: Session 3 Data Entry Best PracticesData Management Lab: Session 3 Data Entry Best Practices
Data Management Lab: Session 3 Data Entry Best Practices
 

Dernier

Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdfVishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdf
ssuserdda66b
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
KarakKing
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
heathfieldcps1
 

Dernier (20)

Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Dyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptxDyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptx
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdfVishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdf
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 

Meeting the NSF DMP Requirement June 13, 2012

  • 1. DATA MANAGEMENT PLANS & PLANNING: MEETING THE NSF REQUIREMENT June 13, 2012
  • 2. WHO ARE WE? Heather Coates Digital Scholarship & Data Management Librarian Liaison to the School of Public Health University Library Kristi Palmer Digital Scholarship Team Leader Liaison to the Department of History and Programs of Women's and American Studies University Library
  • 3. LEARNING OBJECTIVES After attending this workshop:  You will understand the NSF data policies.  You will be aware of the relevant data-related services at IUPUI.  You will have resources to develop a data management plan (DMP) for your NSF proposal(s).  You will be able to write a comprehensive DMP for your NSF proposal(s).  You will send your DMP draft to the Data Services Program for review and assistance as needed.
  • 4. OVERVIEW  Context for the NSF data policies  Meeting the NSF DMP requirement  The requirement: 5 elements  Developing a Data Management Plan  Implementing your plan  Workshop Evaluation (5 minutes)
  • 5. CONTEXT: SCHOLARLY COMMUNICATIONS  Funding, funding, funding  Scholarly Impact  Exposure → increased citation  More equal access (especially for students)  Facilitates reproducibility  Facilitates new discoveries via secondary analysis/data re-use  Fosters productive collaborations  Leads to new computational techniques  Planning for the future  If we can’t find it, it doesn’t exist  Persistent access  Long-term preservation of scholarly records
  • 6. CONTEXT: WHY THE LIBRARY? preservation, curation, access  Trusted member of the institution  Organizational structure lends itself to collaboration with researchers  Interdisciplinary by nature  Existing infrastructure for digital information  Existing expertise in preserving and providing access to information  Program of Digital Scholarship  Archives
  • 7. CONTEXT: DATA SERVICES PROGRAM  Part of the Program of Digital Scholarship  Mission  Identifying data issues and connecting you to the solutions  Services  Workshops  Individual consultations  Data repository  Resources  Guide to NSF Data Management Plan Requirement  Website
  • 8. CONTEXT: TERMINOLOGY  Cyberinfrastructure: computing resources & networks, services, & people (see Empowering People, 2009 for more)  Data management: technical processing and preparation of data for analysis  Data curation: selection of data for preservation and adding value for current and future use  Data citation: mechanisms to enable easy reuse and verification, track impact of data, and create structures to recognize and reward researchers (DataCite)  Data sharing: must take into account ethical and legal issues; a spectrum with many options  Data stewardship:
  • 9. CONTEXT: FEDERAL POLICIES  Issues in scholarly communication  Open access  Open data & data citation  Data management & curation  Federal policies (incremental steps towards openness)  National Research Council, 1985  Office of Management & Budget, 1999: Circular A-110  NIH Data Sharing Policy, 2003  NIH Public Access Policy, 2008  NSF DMP Requirement, 2011  Other policies: NEH, NOAA, NASA, Howard Hughes Medical Institute, Wellcome Trust
  • 10. CONTEXT: IU STRATEGIC PLAN IU Empowering People Strategic Plan for IT (2009), Action 33: “IU should provision a data utility service for research data that affords abundant near- and long-term storage, ease of use, and preservation capabilities. This data utility will need to offer a range of services for securing data, providing authorized access within and beyond IU; ensuring metadata description, annotation, and provenance; and providing backup/recovery services.”
  • 11. CONTEXT: OPEN ACCESS  What is Open Access?  Freely available, online, and free of most copyright restrictions  Why should you care?  Right thing to do?  Increase your citations  “We analysed 119,924 conference articles in computer science and related disciplines. The mean number of citations to offline articles is 2.74, and the mean number of citations to online articles is 7.03, an increase of 157%.” (Lawrence, 2001)  Publisher functions need not reside in for-profit hands  “Between 1975 and 2005 the average cost of journals in chemistry and physics rose from $76.84 to $1,879.56. In the same period, the cost of a gallon of unleaded regular gasoline rose from 55 cents to $1.82. If the gallon of gas had increased in price at the same rate as chemistry and physics journals over this period it would have reached $12.43 in 2005, and would be over $14.50 today.” (Lewis, 2008)
  • 12. CONTEXT: OPEN ACCESS @ IUPUI AND IU  IUPUI University Library Program of Digital Scholarship  http://www.ulib.iupui.edu/digitalscholarship  Open Journals  IUPUIScholarWorks-Faculty Scholarship  Electronic Theses and Dissertations  Cultural Heritage Collections  Data  eArchives
  • 13. CONTEXT: RESEARCH LIFE CYCLE Source: DDI Structural Reform Group. “DDI Version 3.0 Conceptual Model." DDI Alliance. 2004. Accessed on 11 August 2008. <http://www.icpsr.umich.edu/DDI/committee-info/Concept-Model-WD.pdf>.
  • 14. CONTEXT: BENEFITS OF PLANNING  Saves time  Less reorganization down the road  Increases efficiency  Gathers necessary information for analysis and writing  Prevents problems in understanding data and metadata  Prevents data loss  If you have a plan, you are more likely to back up your data  Makes it easier to preserve your data  Documentation is more easily created throughout a project  Metadata generation can be automated or incorporated into procedures  Requirements of some funding agencies and institutions
  • 15. DMP: INTERPRETING THE POLICY  Why?  Increased impact of research money  Reduce redundant data collection  Enhance use and value of existing data  Further scientific research  Data gathering tool  What kinds of data are we collecting?  How are researchers collecting, managing, and preserving data?  What are community norms?  Language is broad to allow input from research communities  Implementation costs of the DMP CAN be included in direct costs
  • 16. DMP: KEEP IN MIND  The gist of it…  Describe what you will do with your data during and after the proposed project  Ensures data is safe now and in the future  DMP should reflect…  Awareness of data management and curation in your discipline  Feasible plan to utilize available cyberinfrastructure  Try to…  Explain the rationale for your choices  Identify roles for data management and curation activities
  • 17. DMP: ELEMENTS  Types of data  Standards and metadata  Access and sharing  Re-use, re-distribution, and the production of derivatives  Long-term preservation  [Budget]
  • 18. DMP: TYPES OF DATA [1] Use standards common in your research community  Characterize the data  Types of data  experimental, observational, raw or derived, models, simulations, curriculum materials, software, images, audio, video, etc.  File formats (e.g., text, spreadsheet, database)  How much data? (# of files, total size)  Will the data be reproducible?  Relationship to existing data? (i.e., interoperability)  Syntactic  Semantic
  • 19. DMP: TYPES OF DATA [2]  How will data be collected?  How? (tools, instruments, measurements, etc.)  When? (timeframe, series)  Where? (sites, settings)  How will data be processed?  Workflows (brief overview using flow chart)  Software packages  How will the data be stored and managed?  File naming conventions  Version control
  • 20. DMP: TYPES OF DATA [3]  What QA & QC measures will be used?  Identify steps during processing and analysis to eliminate missing data points, identify outliers, and provide statistical summaries (e.g., double data entry, histograms, scatterplots)  Before data are collected, define and enforce standards and assign responsibility  During project, document processes and any changes or deviations  What is the backup and security plan?  Identify particular security or confidentiality issues  Describe location & frequency  Roles & responsibilities  Who will carry out data collection, processing, and backup activities?
  • 21. EXAMPLE: TYPES OF DATA Atmospheric Concentrations of CO2, Mauna Loa Observatory, Hawaii, 2011-2013 https://www.dataone.org/sites/all/documents/DMP_MaunaLoa_Formatted.pdf Arthropod responses to grassland nutrient limitation https://www.dataone.org/sites/all/documents/DMP_NutNet_Formatted.pdf
  • 22. DMP: STANDARDS & METADATA [1]  Metadata describes the who, what, when, where, how, why of the data  Include workflow: how you get from raw data to final products  Purpose: enable finding, organization, interoperability, identification, archiving & preservation  Standards are commonly agreed upon terms and definitions in a structured format  Dublin Core (commonly used by libraries)  Darwin Core (geographic occurrence of species)  EML (ecology)  Data Documentation Initiative (DDI; social sciences)  IEEE LOM (learning objects metadata)
  • 23. DMP: STANDARDS & METADATA [2]  Ask yourself: will your datasets be self-explanatory or understandable in isolation?  Decisions to make about metadata  Relevant standard(s)  Format  Content  What information is needed to use and interpret in 5 years, 25 years?  How are metadata created?  Automatically generated  Manually created
  • 24. EXAMPLE: STANDARDS & METADATA [1] Atmospheric Concentrations of CO2, Mauna Loa Observatory, Hawaii, 2011-2013 https://www.dataone.org/sites/all/documents/DMP_MaunaLoa_Formatted.pdf Metadata will be comprised of two formats: contextual information about the data in a text based document and ISO 19115 standard metadata in an xml file. These two formats for metadata were chosen to provide a full explanation of the data (text format) and to ensure compatibility with international standards (xml format). The standard XML file will be more complete; the document file will be a human-readable summary of the XML file.
  • 25. EXAMPLE: STANDARDS & METADATA [2] Rio Grande Hydrologic Geodatabase Compendium https://www.dataone.org/sites/all/documents/DMP_Hydrologic_Formatted.pdf Microsoft Access Database format will be used since it is readily accessible and it is compatible with ESRI ArcGIS (http://www.esri.com/software/arcgis/index.html), a Geographic Information System software package used by the stakeholders. Naming conventions will be consistent: no spaces will be used in table names or field names. The file naming convention will consist of the datasource_datatype format for raw data files. Data reporting functionality will be built into the VBA processing programs to provide output in .txt file format for number of records per source when updatable data sources are refreshed. Every effort will be made to go back to the authoritative source for an identified dataset. Quality control of the database will be performed using SQL statements that capitalize on the database structure to ensure relational database integrity. Appropriate primary keys will be assigned to manage possible data duplicates. Potential duplicate site IDs will be handled through automated procedures and the creation of alternate ID tables. A data dictionary will be created that defines the table definition, table fields, and table field data types. An entity-relationship diagram will be created that defines the relational structure of the database. A metadata record will be produced using the FGDC standard that describes the entire geodatabase. The FGDC standard was chosen due to required Federal government standards.
  • 26. DMP: ACCESS & SHARING  What are your obligations for sharing?  Funding agency, institution, other organization, legal, etc.  What are the ethical or legal issues? (e.g., privacy, confidentiality, security, intellectual property, or other rights)  How will the data be made available?  What is the process for gaining access?  When will the data be made available?  When will the data become available?  For how long will the data be available?  What is the process for gaining access?  Who will have access to the data?
  • 27. DMP: RE-USE, RE-DISTRIBUTION, ETC.  What rights will you retain before data is made available?  Will permission restrictions be necessary?  Limits or conditions for political, commercial, or patent reasons?  Is there an embargo period? Why?  Future users and uses  Who might be interested in the data?  How might you anticipate this data being used?  What value might the data have for these people?
  • 28. EXAMPLE: ACCESS, SHARING, RE-USE Development of a NanoKlein Calorimeter http://libguides.unm.edu/content.php?pid=137795&sid=1422879 We expect to apply for a patent for this instrument. All of the materials submitted as part of the patent process will be a matter of public record. We will also make technical drawings, test data and calibration data available through our institutional repository. Cave Microbiology http://libguides.unm.edu/content.php?pid=137795&sid=1422879
  • 29. DMP: LONG-TERM PRESERVATION Project-based funding does not lend itself to long-term preservation.  What data will be preserved?  What transformations are necessary to prepare the data?  How long do you think the data will be useful? How long will the data be preserved?  Contextual information needed to make the data reusable  metadata, references, reports, manuscripts, grant proposal, etc.  Where will it be preserved?  Links to published materials and other outcomes? Use of persistent citation?  Procedures for preservation and back-up?  Who will be the contact for the dataset?
  • 30. EXAMPLE: LONG-TERM PRESERVATION [1] Arthropod responses to grassland nutrient limitation https://www.dataone.org/sites/all/documents/DMP_NutNet_Formatted.pdf We will preserve both arthropod datasets generated during this project (abundance and stoichiometry) for the long term in the Digital Conservancy at the U of M. We will include the .csv files, along with the associated metadata files. We will also submit an abstract with the datasets that describe their original context and any potentially relevant project information. Borer will be responsible for preparing data for long-term preservation and for updating contact information for investigators.
  • 31. EXAMPLE: LONG-TERM PRESERVATION [2] Improving the long-term preservability of HDF-formatted data by creating maps to file contents https://www.dataone.org/sites/all/documents/DMP_HDFMap_Formatted.pdf The writer software will be preserved by the HDF Group for the life of the HDF libraries. The HDF Group uses industry-standard best practices to ensure the integrity of their software and systems. Once the map writer has been used to generate maps for every HDF file in existence, the continued existence of the writer software is not required. The reader software will be preserved at SourceForge.org for as long as there is community interest. The collection of HDF files will be preserved at NSIDC as long as utility is deemed high.
  • 32. IUPUIDATAWORKS  Institutional repository that can facilitate subject repositories  Policies are being developed, informed by faculty needs  Pilot projects  More support at little/no cost  Flexibility in what we are willing to do  New tools to demonstrate impact of research  The future  Standardized levels of service  Standardized policies, responsive to faculty needs  Cost recovery for significant intellectual/time investment
  • 33. IMPLEMENTING YOUR PLAN [1]  The DMP is a working document  NSF expects progress to be reported (progress reports, final reports, new grant proposals)  Incorporate implementation into the project startup process  C&G, IRB, IACUC all have to be in place before data collection can begin  Review, revise, and set up your system during startup  Good documentation ensures…  A shared understanding of the data throughout a project  That future researchers will be able to understand data within the relevant context  That re-users of data are able to interpret the data appropriately
  • 34. IMPLEMENTING YOUR PLAN [2] Research File System: http://pti.iu.edu/storage/rfs Scholarly Data Archive: http://pti.iu.edu/storage/sda Research Technologies, UITS: http://uits.iu.edu/page/avel Core Services, UITS: http://pti.iu.edu/cs Scholarly Cyberinfrastructure, UITS: http://uits.iu.edu/page/amee CTSI Tools: http://www.indianactsi.org/rct (Alfresco Share, REDCap) Program of Digital Scholarship: http://ulib.iupui.edu/digitalscholarship Center for Research & Learning: http://crl.iupui.edu/ OVCR: http://research.iupui.edu/development/ Office of Academic Affairs: http://www.academicaffairs.iupui.edu Intellectual Property Policy: https://www.indiana.edu/~vpfaa/academicguide/index.php/Policy_I-11 IUWare: https://iuware.iu.edu IUanyWare: https://iuanyware.iu.edu/vpn/index.html StatMath: http://www.indiana.edu/~statmath/ Statistics Consulting Center: http://www.math.iupui.edu/asci/
  • 35. PRACTICAL TOOLS Lynda.com tutorials: http://ittraining.iu.edu/lynda/default.aspx Cleaning Up Your Excel Data (2010) Managing & Analyzing Data in Excel (2010) Data Validation in Depth (2010) DMPTool: https://dmp.cdlib.org/ DMPOnline: https://dmponline.dcc.ac.uk/ UK Data Archive Costing Tool: http://www.data-archive.ac.uk/media/257647/ukda_jiscdmcosting.pdf Creative Commons Licenses & Data: http://wiki.creativecommons.org/Data Licensing Research Data, Digital Curation Centre http://www.dcc.ac.uk/resources/how-guides/license-research-data CIC Author Addendum http://www.cic.net/authors
  • 36. RECOMMENDED READING UK Data Archive: Managing & Sharing Data Brochure: http://www.data-archive.ac.uk/media/2894/managingsharing.pdf
  • 37. MORE RESOURCES  National Science Board, Digital Research Data Sharing & Management, 2012 (pre-publication): http://www.nsf.gov/nsb/publications/2011/nsb1124.pdf  Committee on Science, Engineering, and Public Policy (U.S.). (2009). Ensuring the integrity, accessibility, and stewardship of research data in the digital age. Washington, D.C.: National Academies Press.  National Science Board Committee on Strategy and Budget Task Force on Data Policies. (2011). Digital Research Data Sharing & Management. Washington, D.C.: National Science Board.  America Creating Opportunities to Meaningfully Promote Excellence in Technology, Education, and Science Reauthorization Act of 2010, Pub. L. No. 111-358. 124 Stat. 3982 (2010). Retrieved from the Library of Congress Thomas database.
  • 38. REFERENCES 1. Higgins, S. (n.d.). What are metadata standards. http://www.dcc.ac.uk/resources/briefing-papers/standards-watch-papers/what-are-metadata-standards 2. Digital Curation Centre. (n.d.). DCC Charter and Statement of Principles. Retrieved from http://www.dcc.ac.uk/about-us/dcc-charter. 3. Indiana University. (2011). Indiana University’s Advanced Cyberinfrastructure. Retrieved from http://pti.iu.edu/cyberinfrastructure.pdf. 4. Indiana University. (2009). Empowering People: Indiana University’s Strategic Plan for Information Technology. Retrieved from http://ovpit.iu.edu/strategic2/. 5. National Science Foundation. (2011). Award and Administration Guide: Chapter VI.D.4., Dissemination and Sharing of Research Results. Retrieved from http://www.nsf.gov/pubs/policydocs/pappguide/nsf11001/aag_6.jsp#VID4. 6. Lawrence, S., Free online availability substantially increases a paper’s impact, Nature, 31 May 2001. http://www.nature.com/nature/debates/e-access/Articles/lawrence.html (accessed November 5, 2008) 7. Lewis, David W. "Library budgets, open access, and the future of scholarly communication: Transformations in academic publishing." C&RL News, May 2008, Vol. 69, No. 5. [Available at: http://www.ala.org/ala/mgrps/divs/acrl/publications/crlnews/2008/may/ALA_print_layout_1_471139_471139.cfm]
  • 39. COMPELLING CASES FOR OPEN DATA SPARC, Research is more valuable when it’s shared: http://www.arl.org/sparc/greaterreach/index.shtml Tim Berners-Lee: http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html Open-source cancer research: http://www.ted.com/talks/jay_bradner_open_source_cancer_research.html Polymath problem blogs: http://polymathprojects.org/about/ http://stevekochscience.blogspot.com/2011/02/open-data-success-story.html http://eaves.ca/2011/09/07/the-economics-of-open-data-mini-case-transit-data-translink/
  • 40. THANK YOU Tell us what you think, take a brief survey. Find us @ http://ulib.iupui.edu/digitalscholarship/dataservices Heather Coates, hcoates@iupui.edu, 317-278-7125 Kristi Palmer, klpalmer@iupui.edu, 317-274-8230 IUB Stacy Konkiel, skonkiel@indiana.edu, 812-856-5295
  • 41. EXTRA: NIH DATA SHARING POLICY Applies to applications requesting $500,000 or more in direct costs in any year of the proposed research. Final research data: not summary statistics or tables, and not underlying pathology reports and other clinical source documents; might include both raw data and derived variables. If an application describes a data-sharing plan, NIH expects that plan to be enacted. NIH expects the timely release and sharing of data to be no later than the acceptance for publication of the main findings from the final dataset. It is the responsibility of the investigators, their Institutional Review Board (IRB), and their institution to protect the rights of subjects and the confidentiality of the data. Prior to sharing, data should be redacted to strip all identifiers, and effective strategies should be adopted to minimize risks of unauthorized disclosure of personal identifiers.
  • 42. EXTRA: NIH DATA SHARING PLAN Describe briefly: the expected schedule for data sharing; the format of the final dataset; the documentation to be provided; whether or not any analytic tools also will be provided; whether or not a data-sharing agreement will be required and, if so, a brief description of such an agreement (including the criteria for deciding who can receive the data and whether or not any conditions will be placed on their use); the mode of data sharing (e.g., under their own auspices by mailing a disk or posting data on their institutional or personal website, through a data archive or enclave). Applicants may request funds in their application for data sharing.
  • 43. RESOURCES National Institutes of Health, Data Sharing Policy: http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm NIH Public Access Policy Implications: http://publicaccess.nih.gov/public_access_policy_implications_2012.pdf

Editor's notes

  1. Housekeeping: hold questions until the end; make sure everyone has handouts. Resources: slides; DSP Guide to NSF DMP; NSF policy language handout; CIC Author Addendum.
  2. We’re going to spend the majority of our time today walking through practical tips and examples for each section of the DMP, but there is important background information you need to know first.
  3. The NSF data sharing policy and data management plan requirement came about within the context of broader discussions about how information is disseminated in the sciences, so we’ll quickly review that discussion before getting into the practical steps of developing a DMP. We want to accomplish two things. First, we want to prepare you to engage in discussions about the scholarly record and how research products are disseminated, specifically your rights and options. Second, we want to give you the information you need to make informed decisions regarding copyright, IP, patent, and other issues when it comes to choosing where to publish and preserve all things related to your research. Data is just one piece of this picture. In addition to funding, there are many compelling reasons to plan for preserving and sharing your data. The good news is that data sharing can boost the scholarly impact of your data and research in general, which is always good for promotion and tenure. On collaborations: funders are increasingly looking for interdisciplinary and multi-institution collaborations. The benefits of digital data come with costs. Unlike with paper-based data and records, we can’t assume that we’ll be able to access and use digital information in 5, 10, or 50 years. We need to plan for managing and preserving valuable digital data so that the scholarly record isn’t lost. If we can’t find something, it doesn’t exist. These issues of persistent access and long-term preservation are challenges that libraries have been solving for thousands of years.
  4. Some people wonder why the library is taking on this challenge of helping researchers to manage and preserve their data. There are several good reasons: every college or university has a library; our place within IUPUI facilitates collaboration, and we have existing relationships with each department; these collaborations are another way to build capacity for data management, sharing, preservation, and curation by making use of resources that are already available; libraries and librarians have been caring for information in many formats for thousands of years, and while the formats change more rapidly these days, our core principles remain the same. Other campus units can help you with your research, but have a different focus, such as compliance with human subjects or animal use guidelines, contracts and grants, bioethics, etc.
  5. The Data Services Program is part of the University Library’s Program of Digital Scholarship. The Data Services Program offers workshops and consultations for developing an NSF data management plan as well as data management and curation in general. In addition, we have established a data repository for IUPUI research. The repository is one of many tools available to you for preserving and, if appropriate, sharing your research data. On our website, we’ve provided links to: sample NSF DMPs from other institutions; various tools; guidance from institutions like the ICPSR and Digital Curation Centre (UK); significant publications discussing data management and curation.
  6. I want to clarify some terms so we’re all on the same page. Data management is largely seen as the purview of scientists and biostatisticians since it varies by research community and discipline. Data sharing is not an all-or-none proposition. It encompasses a wide spectrum of activities ranging from open data publishing on the internet without restriction to controlled access by pre-defined partners or collaborators. Data citation is a concept similar to citation of scholarly publications and refers to mechanisms that allow easy reuse and verification of data, allow the impact of data to be tracked, and create a scholarly structure that recognizes and rewards data producers (DataCite).
  7. These policies came about as a result of broader conversations about scholarly communication. In case you aren’t familiar with the term, it refers to the processes by which we produce and disseminate information relating to teaching, research, and other scholarly activities. Our goal is to provide you with the information necessary to engage in these discussions within IU and your research communities so that you are making informed decisions about how your research gets out there, who retains rights, and who can access it. The NSF data policies are not radical deviations; they are logical steps forward towards more formal guidelines for providing public access, data management best practices, minimal requirements for sharing data, and data stewardship. See http://www.nsf.gov/bfa/dias/policy/dmpfaqs.jsp
  8. IU as an institution has been engaged in discussions about scholarly communications for several years and has voiced its commitment to these issues by including data management and curation in the IT Strategic Plan.
  9. The purpose of diagrams like this is to provide a common ground for discussion of a complex and diverse process. This represents a rough map of the research process that we all engage in. The open access conversation is focused on the dissemination of research products like peer-reviewed articles and books at the end of the research life cycle, whereas data management planning is most effective when it’s initiated before data collection begins and implemented throughout the research life cycle.
  10. It is important to know that the language was crafted to allow the research community to shape the implementation and to give communities of practice a role in developing relevant best practices. The budget allocations and narrative should tell a cohesive story; if you identify big challenges in data storage and preservation, but do not allocate funds to address these challenges, it will likely raise a red flag for the review committee.
  11. Ultimately, this document should demonstrate that you are aware of data management and preservation issues in general, more specifically the relevant practices within your research community or discipline, and that you have thought through how these affect your proposed project. The plans proposed should be feasible – for you and for us.
  12. If you look at the guide I’ve provided, you’ll see these topics are broken down into a variety of specific questions to address. We’ll go through each section in more detail. It may be helpful to begin your DMP with a few sentences describing the research project in general, to provide context for the detailed information in each section. As you develop each section of your DMP, it’s important to do two things. First, explain your reasoning: it could just be that it’s a standard practice in your field or community. Second, identify roles for data management and curation activities: think about who on your team or in another campus unit will carry out the activities described; this section should identify who will be carrying out the major elements of your plan. This may include the PI, staff, students, external contractors, institutional IT, the library, and external data repositories.
  13. In this first section, you want to describe two things: the data you will generate or use and the documentation you will create to facilitate data management and curation. Syntactic interoperability ensures that the technical infrastructure (hardware, software, data formats) used to create and discover data can work together and communicate with each other. Semantic interoperability ensures that the data can be interpreted once exchanged, through use of common data and metadata structures and content.
  14. In addition to describing your plan in the DMP, these activities should be described in working documents throughout the life of the project. Creating data documentation is easiest and most efficient at the beginning of a project. Good documentation ensures three things: a shared understanding of the data throughout a project; that future researchers will be able to understand data within the context they were created; and that re-users of data are able to interpret the data appropriately. You don’t need to spend a lot of time or space describing the planned documentation, but it is worthwhile to mention what format it will take and who will be responsible for creating and maintaining it. This can relate to the second section describing metadata and standards.
  15. Data screening tests: histograms, boxplots, Z-scores, etc. Research methods, even within a single lab, change over time. Good documentation can facilitate efficient data collection and processing and preserve data integrity.
  16. CO2: site, frequency, and raw data file description; final data product; file format and size. Arthropod: process of data collection is well described, with referral to the proposal for further detail; pre-defined sample/note code (naming convention and unique ID); measurements of interest are described but not defined (common definition or practice?); file naming convention and formats; process for transfer of files is not described; relationship to existing data is described and process for integration is included.
  17. Who created the data? What is the content of the data? When were the data created? Where is it geographically? How were the data developed? Why were the data developed? Flow charts can be used for simple workflows.
  18. Ask yourself: are your data self-explanatory? Consider it from the perspective of a typical reader of a journal you publish in or a colleague who might be interested in collaborating. The solution is good documentation and metadata. More frequently, the people analyzing the data are not those who collected it. Metadata and good data documentation facilitate stronger understanding of the data and support high-quality, appropriate re-use. There are a lot of standards out there; you can ask what others in your discipline or research community are doing or contact us to see if we are aware of emerging standards. If you know you will be depositing your data in a particular repository, you can ask them what their requirements or recommendations are.
  19. Metadata formats included; formal ISO standard. Content of metadata not fully described. No rationale.
  20. Format described; software tools; file naming convention; quality control procedures described; data dictionary; metadata standard. No rationale.
  21. Let’s take a look at the handout with the NSF policy language. Again, the language is broad and allows for practices to vary by research community. As you can see from the policy, data dissemination and sharing does not refer to publishing in scholarly journals. In this section, you should define what you will share, how, and the procedures for access. If you plan to use a specific data repository, they can help you develop this section; likely, they will have standard processes in place. Acceptable practices for data sharing vary by discipline; some have very mature data repositories while others rely on informal channels. Best practices for persistent access indicate more permanent and secure mechanisms than a faculty or department website. The solution at IUPUI is our data repository (IUPUIDataWorks). In terms of the access procedures, you want to think about what mechanism will be used for requests, whether registration and authentication are necessary, and what information you want to keep for your own records about those who request and receive your data. This can be useful information to demonstrate the value and impact of your research. Data sharing encompasses a wide spectrum of activities; you can decide what, when, how, and with whom you will share your data. Even if you are part of a community in which data sharing is not common practice, I urge you to think about what data might be shared or re-used without compromising your intellectual property or competitiveness. Sharing your data broadly can increase the impact of your research and benefit both the institution and your research community. Often, the value of data is unknown or unrecognized until they have been examined by a wide audience.
  22. This section will relate to the access and sharing section, but should focus on policies and permissions for re-use, re-distribution, and production of derivative works as opposed to the mechanisms described in the previous section. You can protect your ability to use the data for ongoing analysis while sharing as much of it as possible with your research community and the general public. While you can’t plan for every case, it is useful to imagine who might be interested in the data and how it might be used, and to set up a process for handling those cases. Depending on where you decide to deposit your data, this could be very formalized or relatively informal. If you decide to share your data through a repository, often there are mechanisms built in for applying Creative Commons licenses. This is true for our data repository as well.
  23. Calorimeter: there are many avenues for sharing data and the results of a study; the patent process is one of them. Cave Microbiology: this is representative of a reference dataset; demonstrates a history of sharing their data; includes the selected license for their data; physical samples: not required to share, but they include this info; field notes: requirement to share as public documents; SEM images: subject repository; website for educational resources.
  24. Ultimately, the impetus for data management, preservation, sharing, and curation will need to come from someone other than funding agencies. Institutions and libraries are looking at sustainable ways to fulfill our commitments and make sure that we can be good data stewards. These systems are still being developed. Realistically, we can promise that your data will be available in 10 years; in 10 years, we hope to have solutions for many of these problems. Here, you should relate the strategies you’ve outlined in previous sections to your long-term preservation strategy. This is an opportunity for you to discuss the long-term plan for keeping your data safe with us or with an external data repository in your discipline. If you are completely unsure how to approach this, feel free to contact the DSP for support. We can help you develop a feasible and appropriate preservation strategy that relies on existing services and infrastructure, whether at IU or elsewhere. A key component of your plan is the description of the cyberinfrastructure available to you and how you will use it to carry out your plan as a responsible data steward. Although your lab may be equipped to store and maintain the data for a project while it’s active, you may not have the capacity to make sure the data is preserved once the project is complete and your lab resources are dedicated to new endeavors. Neither IU nor NSF wants to see scientific data lost, and both are investing significant effort and resources in maintaining the scientific record. These are activities that the Library specifically is invested in and equipped to do; our focus is on long-term preservation, curation, and access. What this means will likely vary by dataset, project, and lab; we’re happy to think this through with you to develop a plan that will meet your needs.
  25. Identified an institutional repository. Selected files to be archived; could be more specific. Type and format of project information is vague; abstract may not be sufficient.
  26. Example of software archival/preservation. What are “industry-standard best practices”? Continued existence of writer software is not required, but reader software will be preserved. Is SourceForge an institution that is likely to persist for 10, 25, or 50 years?
  27. There is a wealth of resources at IU to help you with your research. These are just a few of those relevant to data management and curation.
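
As an illustration of the data screening tests listed in note 15 (histograms, boxplots, Z-scores), a Z-score check might be sketched in Python as follows. This is a minimal sketch and not part of the workshop materials: the function name flag_outliers, the sample readings, and the threshold of 2.0 are all assumptions chosen for illustration, and the appropriate cutoff depends on your discipline and sample size.

```python
from statistics import mean, stdev


def flag_outliers(values, threshold=3.0):
    """Return (index, value, z) tuples for values whose |Z-score| exceeds threshold.

    Hypothetical helper for illustration; Z = (value - mean) / sample std dev.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (n - 1 denominator)
    if sigma == 0:
        # All values identical: no spread, so nothing can be flagged.
        return []
    flagged = []
    for i, v in enumerate(values):
        z = (v - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, v, z))
    return flagged


# Made-up instrument readings with one suspicious value at index 6.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0, 10.1]

for index, value, z in flag_outliers(readings, threshold=2.0):
    print(f"Check record {index}: value={value}, z={z:.2f}")
```

A flagged value is a prompt to check the record against the data documentation, not an instruction to delete it. Recording the screening rule and threshold in the project documentation lets future users of the data see how quality control was done.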