DATA MANAGEMENT PLANNING
FEBRUARY 21, 2013




        Lizzy Rolando, Research Data Librarian
Objectives



       Understand the current climate around data
        management and data sharing
       Learn about the basic elements of a data
        management plan
       Explore some of the best practices for data
        documentation, long-term preservation, and data
        sharing
       Work with the DMPTool to create a data
        management plan
What is Data Management?
Why Data Management?




            Good for You




        Photo taken by the U.S. Army Research Development and Engineering Command
Why Data Management?




         Good for Science




                  Image from http://xnat.org/
Van Noorden, R. (2011). Science publishing: The trouble with retractions. Nature 478, 26-28. doi:10.1038/478026a
Why Data Management?




    Required by Funding Agencies
Funding Agency Requirements


    Funding Agency   Requirement
    NSF*             • Must include a 2-page DMP in proposal
                     • Materials collected during research should be shared
    NIH              • Papers must be submitted to PubMed Central
                     • Projects with over $500,000 in funding must share data and include
                       a Data Sharing Plan in proposal
    USDA             • National Institute of Food and Agriculture requires all data to be
                       submitted to public domain without restriction
    NOAA             • Soon requiring that all grants include a data sharing plan, which
                       must also be shared
                     • All data should be made visible, accessible and independently
                       understandable to users, within 2 years of end of grant
    NASA             • Data should be made freely and widely available
                     • A data sharing plan and evidence of any past sharing activities
                       should be included as part of the technical proposal
    CDC              • All data should be released and/or shared as soon as feasible
Exciting News!




        Beginning January 14, 2013, the Biographical
         Sketch(es) for an NSF grant proposal will include
         a section on “Products” instead of
         “Publications.” This way, applicants can list not
         just publications, but also datasets, software,
         patents, and copyrights.
Basic DMP Components



     The NSF requires a 2-page data management
     plan with every grant proposal.
      Data Description
      Data and metadata standards

      Data access and sharing policies

      Data re-use and re-distribution

      Data preservation and archiving
     Depending on the funding source and the directorate/division/program, data
     management plan requirements may differ.
Data Description


        What kinds of data will you produce?
          Numerical data, simulations, text sequences, etc.
          Experimental, observational, simulation

          Raw, derived

        How will you acquire the data?
        How will you process the data?
        How much data will you collect?
        Are you using any existing data?
        What QA/QC procedures will you use?
Recommendations


        A short description of your project helps to give
         context to why you are collecting the data.
        Survey existing data sources.
        Can be a narrative paragraph, table, or list.
        Keep all raw data separate from analyzed data,
         and maintain versions of data during analysis.
        Implement QA/QC procedures.
          Ex. Two people independently record data
          Ex. Tools to audit spreadsheets
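
The double-entry check suggested above (two people independently record the same data) could be automated along these lines. This is a minimal sketch, assuming each person's record is available as a list of rows keyed by a shared `sample_id` column; the function name `find_mismatches`, the column names, and the sample values are all hypothetical.

```python
def find_mismatches(entry_a, entry_b, key="sample_id"):
    """Compare two independently recorded datasets and report disagreements.

    entry_a, entry_b: lists of dicts, one dict per recorded row.
    Returns a list of tuples, ordered by key value:
      (key value, field, value from A, value from B) for differing fields, or
      (key value, "<missing row>", row present in A?, row present in B?).
    """
    rows_a = {row[key]: row for row in entry_a}
    rows_b = {row[key]: row for row in entry_b}
    mismatches = []
    for sample in sorted(set(rows_a) | set(rows_b)):
        a, b = rows_a.get(sample), rows_b.get(sample)
        if a is None or b is None:
            # One recorder has no row at all for this sample.
            mismatches.append((sample, "<missing row>", a is not None, b is not None))
            continue
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                mismatches.append((sample, field, a.get(field), b.get(field)))
    return mismatches


# Hypothetical records typed in by two different people.
recorder_1 = [{"sample_id": "S1", "temp_c": "12.4"}, {"sample_id": "S2", "temp_c": "13.1"}]
recorder_2 = [{"sample_id": "S1", "temp_c": "12.4"}, {"sample_id": "S2", "temp_c": "31.1"}]
for item in find_mismatches(recorder_1, recorder_2):
    print(item)
```

Every reported tuple is a value to look up again in the original notebook or instrument log; the script only finds disagreements and does not decide which recorder was right.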
Example (taken from Oceanography DMP)



        The project will collect and analyze the following
         data:
          Conductivity and temperature from moorings and
           shipboard CTD surveys
          Horizontal currents from Lowered ADCP and moorings.
          Horizontal currents from shipboard sonar
          Fine and micro-scale velocity from the WHOI High
           Resolution Profiler
          Fine and micro-scale temperature from fast-response
           thermistors (pods)
Data and Metadata Formats



        What metadata will you create/include with data?
          i.e., what does someone else need to know about your
           data in order to reuse them?
          Where will this be recorded? How? What format?

        Will you use a community metadata standard?
        Will you conform to community terminology?
Recommendations


        Use metadata standards common in your discipline.
        Include a “readme.txt” file that describes the who, what,
         where, when and why of the data, at a bare minimum.
        Make sure you have recorded the information that you would
         need if you were trying to use someone else’s data.
        Check with the data repository where you hope to store your
         data – sometimes they require a particular metadata
         standard.
        Use file names that are understandable to humans.
        Make sure you record units and have headers for rows and
         columns in your tables.
        Notes about the data should be recorded alongside the data
         by the data collectors.
        Use thesauri to keep your terminology consistent with your community.
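
The “readme.txt” recommendation above can be turned into a small habit-forming script. This is a minimal sketch, assuming a plain "FIELD: value" layout; the function name `write_readme`, the layout, and every value in the usage example are placeholders, and a repository or community metadata standard may call for a different structure.

```python
import tempfile
from pathlib import Path

REQUIRED_FIELDS = ("who", "what", "where", "when", "why")


def write_readme(folder, **fields):
    """Write a plain-text readme.txt covering who, what, where, when and why."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError("readme is missing: " + ", ".join(missing))
    lines = [f"{name.upper()}: {fields[name]}" for name in REQUIRED_FIELDS]
    # Extra notes (units, column headers, instruments) follow the core five.
    lines += [f"{name.upper()}: {value}" for name, value in fields.items()
              if name not in REQUIRED_FIELDS]
    path = Path(folder) / "readme.txt"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


# Usage with placeholder values, written to a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    readme = write_readme(
        tmp,
        who="Example Lab (placeholder contact)",
        what="Hourly temperature readings",
        where="Placeholder field site",
        when="2013-01-01 to 2013-01-31",
        why="Collected for a hypothetical sensor calibration study",
        units="temperature in degrees Celsius",
    )
    print(readme.read_text(encoding="utf-8"))
```

Refusing to write the file when one of the five answers is missing is a deliberate choice here: it is easier to fill in the gap while the data collectors still remember the answer.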
Example




From NEES (Network for Earthquake Engineering Simulation)
Example




     From NCAR (National Center for Atmospheric Research)
Example (from NASA SEAC4RS DMP)
Appendix A SEAC4RS data file naming convention:
dataID_locationID_YYYYMMDD_R#.extension
The only allowed characters are: a-z A-Z 0-9_.- (that is, upper case and lower case alphanumeric, underscore, period, and hyphen). Fields are
described as follows:
    dataID: an identifier of measured parameter/species, instrument, or model (e.g., O3; NxOy; and PTRMS). For DC3 and SEAC4RS data files, the PIs
     are required to use “DC3-” or “SEAC4RS-” as prefixes for their DataIDs, i.e., DC3-O3 and SEAC4RS-NxOy.
    locationID: an identifier of airborne platform or ground station, e.g., GV, DC8. Specific locationIDs for each deployment will be provided on the data
     website.
    R#: data revision number. For field data, revision number will start from letter “A”, e.g., RA, RB, … etc. Numerical values will be used for the
     preliminary and final data, e.g., R1, R2, R3 … etc.
    Extension: “ict” for ICARTT files, “h4” for HDF 4 files and “h5” for HDF 5 files.
    For example, the filename for the DC-8 Diode Laser Spectrometer H2O measurement made on the June 1, 2012 flight may be:
     DC3-DLH-H2O_DC8_20120601_RA.ict (for field data) or
     DC3-DLH-H2O_DC8_20120601_R1.ict (for final data)
Appendix B Summary of ICARTT format metadata requirements (also required for HDF 5 files):
    Platform and associated location data: Geographic location and altitude will be embedded as part of the data file or provided via a link to the
     archival location of the aircraft navigational data.
    Data Source Contact Information: phone number, mailing information, and e-mail address shall be given for the measurement Co-I and one alternate
     contact.
    Data Information: Clear definition of measured quantities will be given in plain English, avoiding the use of undefined acronyms, along with reporting
     units and limitation of data use if applicable.
    Measurement Description: A simple description of the measurement technique with reference to readme file and relevant journal publication.
    Measurement Uncertainty: Overall uncertainty will need to be given as a minimum. Ideally, precision and accuracy will be provided explicitly. The
     confidence level associated with the reported uncertainties will also need to be specified for the reported uncertainties if it is applicable. The
     measurement uncertainty can be reported as constants for entire flights or as separate variables. Measurement uncertainty is required by the ICARTT
     data file format.
    Data Quality Flags: definition of flag codes for missing data (not reported due to instrument malfunction or calibration) and detection limits.
    Data Revision Comments: Provide sufficient discussion about the rationale for data revision. The discussions should focus on highlighting issues, solutions,
     assumptions, and impact.
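
A naming convention like the one in Appendix A above is easiest to keep when a script checks it. The following is a minimal sketch, not part of the SEAC4RS plan: it assumes that dataID and locationID never contain an underscore (the underscore is treated purely as the field separator) and that a field-data revision is a single upper-case letter. The names `FILENAME_PATTERN` and `parse_filename` are hypothetical.

```python
import re
from datetime import datetime

# dataID_locationID_YYYYMMDD_R#.extension
FILENAME_PATTERN = re.compile(
    r"(?P<data_id>[A-Za-z0-9.-]+)"
    r"_(?P<location_id>[A-Za-z0-9.-]+)"
    r"_(?P<date>[0-9]{8})"
    r"_R(?P<revision>[A-Z]|[0-9]+)"
    r"\.(?P<extension>ict|h4|h5)"
)


def parse_filename(name):
    """Split a file name into its fields, or raise ValueError if it does not fit."""
    match = FILENAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"does not follow the naming convention: {name!r}")
    fields = match.groupdict()
    # strptime raises ValueError for impossible dates such as 20121301.
    datetime.strptime(fields["date"], "%Y%m%d")
    # Letters mark field data; numbers mark preliminary and final data.
    fields["stage"] = "field" if fields["revision"].isalpha() else "preliminary/final"
    return fields


print(parse_filename("DC3-DLH-H2O_DC8_20120601_RA.ict"))
print(parse_filename("DC3-DLH-H2O_DC8_20120601_R1.ict"))
```

Running a check like this before upload catches spaces, unsupported extensions, and malformed dates while the file is still on the collector's machine.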
Policies for Access and Sharing



        Are your data sensitive, so access by others needs
         to be restricted?
        What license or publishing model will you use for
         your data?
        How will you make your data accessible to others?
        What data will you make available and at what
         stage of your research?
        Do you have protocols, such as IRB, that you need to
         comply with? If so, how will you do so?
Recommendations


        Apply an open license to data that you will share.
        Explain why you cannot share data, if that is the
         case.
          For example, the data used in your research are
           proprietary.
        Anonymize any sensitive data.
          Use a repository that can mediate data sharing if data
           cannot be sufficiently anonymized
        Comply with IRB restrictions.
          That should be obvious, but we’ll say it anyway.
        Be aware of Georgia Tech Policy…
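
As a first step toward the anonymization recommended above, direct identifiers can be dropped from a table before it is shared. This is a minimal sketch under loud assumptions: the column names in `DIRECT_IDENTIFIERS` and the sample row are hypothetical, and removing these columns alone may not be enough, since combinations of the remaining columns can still identify a person. Your IRB protocol decides what actually has to go.

```python
import csv
import io

# Hypothetical column names; the identifiers in a real study depend on its IRB protocol.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def strip_identifiers(csv_text, identifiers=DIRECT_IDENTIFIERS):
    """Return the CSV text with the identifier columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [col for col in reader.fieldnames if col.lower() not in identifiers]
    out = io.StringIO()
    # extrasaction="ignore" silently skips the columns that were not kept.
    writer = csv.DictWriter(out, fieldnames=kept, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


raw = "participant_id,name,email,age_group\nP001,Ada Example,ada@example.org,30-39\n"
print(strip_identifiers(raw))
```

The sketch works on a copy of the text and leaves the input untouched, so the raw file can stay in restricted storage while the stripped version is prepared for sharing.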
Example (from ICPSR)

     “ICPSR will make the research data from this project available to the broader social
     science research community.
        Public-use data files: These files, in which direct and indirect identifiers have been
         removed to minimize disclosure risk, may be accessed directly through the ICPSR
         Web site. After agreeing to Terms of Use, users with an ICPSR MyData account
         and an authorized IP address from a member institution may download the data,
         and non-members may purchase the files.
        Restricted-use data files: These files are distributed in those cases when removing
         potentially identifying information would significantly impair the analytic
         potential of the data. Users (and their institutions) must apply for these files,
         create data security plans, and agree to other access controls.
        Timeliness: The research data from this project will be supplied to ICPSR before
         the end of the project so that any issues surrounding the usability of the data can
         be resolved. Delayed dissemination may be possible. The Delayed Dissemination
         Policy allows for data to be deposited but not disseminated for an agreed-upon
         period of time (typically one year).”
Policies and Provisions for Re-use



        Who do you expect will want to or can reuse your
         data?
        Should there be restrictions on who can reuse your
         data, or on how they can reuse them?
        How should others indicate that they have used your
         data?
        How long will your data be available to others for
         reuse?
        Does your institution have rules about data?
Recommendations


        Imagine the broadest possible audience for your
         data.
        Place as few restrictions on your data as you can.
        Link your published articles to the data underlying
         those articles.
        Use a repository that can make your data available
         far into the future.
          Funding Agency                Suggested Length of Time for Private Data Retention
          NIH                           No later than the acceptance for publication of main findings from final data set
          NOAA                          2 years after data collection
          NSF-Engineering Directorate   3 years after the end of the project or public release, whichever comes first

          NSF-Earth Sciences Division   2 years after data collection
          NSF-Ocean Sciences Division   2 years after data collection
Example (from USC)


        “USC’s policy is to encourage, wherever appropriate,
         research data to be shared with the general public
         through internet access. This public access will be
         regulated by the university in order to protect privacy
         and confidentiality concerns, as well to respect any
         proprietary or intellectual property
         rights. Administrators will consult with the university’s
         legal office to address any concerns on a case-by-case
         basis, if necessary. Terms of use will include
         requirements of attribution along with disclaimers of
         liability in connection with any use or distribution of the
         research data, which may be conditioned under some
         circumstances.”
Archiving and Preservation



        What formats for your data will you use? Are they
         preservation friendly?
        What repository or data archive can take your
         data when you are finished?
          How do they preserve/share your data?
          What are their access policies?

          Is any extra work needed to prepare data for the
           repository?
        Who will be responsible for final preservation?
Recommendations


        Appraise your data, selecting those with long-term
         value, and document your choices.
        Use preservation friendly digital formats.
          Non-proprietary, commonly used
          You may need to transform data into new format.

        Find a repository that will take your data, and plan
         to comply with their policies early on.
        Look into using SMARTech!
        PIs should ultimately be responsible for the
         final disposition of the data.
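
To plan the format transformations mentioned above, it can help to list which files in a project would need converting. This is a minimal sketch; the `SUGGESTED_FORMAT` mapping is illustrative and hypothetical, the file names are made up, and the list of preferred formats published by your chosen repository takes precedence.

```python
from pathlib import PurePath

# Illustrative, hypothetical mapping from a proprietary format to an open one.
SUGGESTED_FORMAT = {".xls": ".csv", ".sav": ".csv", ".doc": ".txt"}


def conversion_plan(filenames):
    """Map each file in a listed proprietary format to a suggested new name."""
    plan = {}
    for name in filenames:
        path = PurePath(name)
        # Compare extensions case-insensitively so that .XLS is caught too.
        target = SUGGESTED_FORMAT.get(path.suffix.lower())
        if target is not None:
            plan[name] = str(path.with_suffix(target))
    return plan


print(conversion_plan(["ctd_survey.XLS", "ctd_survey_raw.csv", "field_notes.doc"]))
```

The sketch only proposes new names. The conversion itself still has to be done with a tool that understands the original format, and the result should be checked against the original.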
Example (from DataONE)

     Short Term:
        The data product will be updated monthly reflecting updates to the record, revisions due to
         recalibration of standard gases, and identification and flagging of any errors. The date of the update
         will be included in the data file and will be part of the data file name. Versions of the data product
         that have been revised due to errors/updates (other than new data) will be retained in an archive
         system. A revision history document will describe the revisions made. Daily and monthly backups of the
         data files will be retained at the Keeling Group Lab (http://scrippsco2.ucsd.edu, accessed 05/2011),
         at the Scripps Institution of Oceanography Computer Center, and at the Woods Hole Oceanographic
         Institution’s Computer Center.
     Long Term:
        Our intent is that the long term high quality final data product generated by this project will be
         available for use by the research and policy communities in perpetuity. The raw supporting data will be
         available in perpetuity as well, for use by researchers to confirm the quality of the Mauna Loa Record.
         The investigators have made arrangements for long term stewardship and curation at the Carbon
         Dioxide Information and Analysis Center (CDIAC), Oak Ridge National Laboratory (see letter of
         support). The standardized metadata record for the Mauna Loa CO2 data will be added to the
         metadata record database at CDIAC, so that interested users can discover the Mauna Loa CO2 record
         along with other related Earth science data. CDIAC has a standardized data product citation including
         DOI, that indicates the version of the Mauna Loa Data Product and how to obtain a copy of that
         product.
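
The short-term practice described above (putting the update date in the data file name and keeping a revision history document) can be sketched as follows. This is an assumption-laden illustration, not the Keeling Group's actual procedure: the file stem `mauna_loa_co2`, the date, the tab-separated history layout, and both function names are hypothetical.

```python
from datetime import date


def versioned_name(stem, update_date, extension="csv"):
    """Build a data file name that carries the date of the update."""
    return f"{stem}_{update_date:%Y%m%d}.{extension}"


def revision_entry(update_date, filename, reason):
    """Format one tab-separated line for a plain-text revision history document."""
    return f"{update_date.isoformat()}\t{filename}\t{reason}"


updated = date(2011, 5, 1)  # hypothetical update date
name = versioned_name("mauna_loa_co2", updated)
print(name)
print(revision_entry(updated, name, "recalibration of standard gases"))
```

Writing the date as YYYYMMDD means that sorting the file names alphabetically also sorts the versions chronologically.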
Never Fear!
DMPTool


        Developed by a number of universities in
         response to funding agency mandates
        https://dmp.cdlib.org/
Step 1: Sign In




        Choose Georgia Tech
Shibboleth…
Step 2: Create a Plan




     Select a Funding Agency.
     An email is sent to the Georgia Tech Library.
Creating and Naming your Plan




     We strongly recommend naming the plan “[Insert Proposal Title Here] Data Management Plan”.
Step 3: One Section at a Time




     Sections are different depending on funding source.
     Enter your answers here.
     Georgia Tech and DataONE have resources available for every section.
Some Sections Have Extra Advice




     Georgia Tech-specific help text
Almost There




     You should save after every section, but definitely save at the very end.
     You’re so close to the end!
Step 4: Export




     Now that you have the content, you can export your plan.
Step 5: Share plan




      Send your plan to the Research Data
       Librarian (me!) to look it over.
      Have your colleagues look at your plan.

      Do you know your grant officer?
Step 6: Finish and Start Research!




      Add plan to proposal or distribute among
       research team
      Begin your newly funded research!
Other Data Management Plan Resources



         Digital Curation Centre -
          http://www.dcc.ac.uk/resources/data-management-plans
         ICPSR – while made for Social Science data, it has great
          resources for anyone:
          http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/plan.html
         UK Data Archive -
          http://www.data-archive.ac.uk/media/2894/managingsharing.pdf
Questions?




       Lizzy Rolando
       Research Data Librarian
       lizzy.rolando@library.gatech.edu
       404.385.3706
       http://libguides.gatech.edu/research-data

Contenu connexe

Tendances

What is Data Commons and How Can Your Organization Build One?
What is Data Commons and How Can Your Organization Build One?What is Data Commons and How Can Your Organization Build One?
What is Data Commons and How Can Your Organization Build One?Robert Grossman
 
Good (enough) research data management practices
Good (enough) research data management practicesGood (enough) research data management practices
Good (enough) research data management practicesLeon Osinski
 
Introduction to Metadata
Introduction to MetadataIntroduction to Metadata
Introduction to MetadataEUDAT
 
A Gen3 Perspective of Disparate Data
A Gen3 Perspective of Disparate DataA Gen3 Perspective of Disparate Data
A Gen3 Perspective of Disparate DataRobert Grossman
 
A basic course on Research data management: part 1 - part 4
A basic course on Research data management: part 1 - part 4A basic course on Research data management: part 1 - part 4
A basic course on Research data management: part 1 - part 4Leon Osinski
 
2012 Fall Data Management Planning Workshop
2012 Fall Data Management Planning Workshop2012 Fall Data Management Planning Workshop
2012 Fall Data Management Planning WorkshopLizzy_Rolando
 
Adding valuethroughdatacuration
Adding valuethroughdatacurationAdding valuethroughdatacuration
Adding valuethroughdatacurationAPLICwebmaster
 
Some Proposed Principles for Interoperating Cloud Based Data Platforms
Some Proposed Principles for Interoperating Cloud Based Data PlatformsSome Proposed Principles for Interoperating Cloud Based Data Platforms
Some Proposed Principles for Interoperating Cloud Based Data PlatformsRobert Grossman
 
DataONE Education Module 07: Metadata
DataONE Education Module 07: MetadataDataONE Education Module 07: Metadata
DataONE Education Module 07: MetadataDataONE
 
DataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data SharingDataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data SharingDataONE
 
David Shotton - Research Integrity: Integrity of the published record
David Shotton - Research Integrity: Integrity of the published recordDavid Shotton - Research Integrity: Integrity of the published record
David Shotton - Research Integrity: Integrity of the published recordJisc
 
Research Data Management
Research Data ManagementResearch Data Management
Research Data ManagementDaniel JACOB
 
Crossing the Analytics Chasm and Getting the Models You Developed Deployed
Crossing the Analytics Chasm and Getting the Models You Developed DeployedCrossing the Analytics Chasm and Getting the Models You Developed Deployed
Crossing the Analytics Chasm and Getting the Models You Developed DeployedRobert Grossman
 
DataONE Education Module 10: Legal and Policy Issues
DataONE Education Module 10: Legal and Policy IssuesDataONE Education Module 10: Legal and Policy Issues
DataONE Education Module 10: Legal and Policy IssuesDataONE
 
What is a Data Commons and Why Should You Care?
What is a Data Commons and Why Should You Care? What is a Data Commons and Why Should You Care?
What is a Data Commons and Why Should You Care? Robert Grossman
 
Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014
Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014
Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014Robert Grossman
 
Where is the opportunity for libraries in the collaborative data infrastructure?
Where is the opportunity for libraries in the collaborative data infrastructure?Where is the opportunity for libraries in the collaborative data infrastructure?
Where is the opportunity for libraries in the collaborative data infrastructure?LIBER Europe
 
Donders neuroimage toolkit - open science and good practices
Donders neuroimage toolkit -  open science and good practicesDonders neuroimage toolkit -  open science and good practices
Donders neuroimage toolkit - open science and good practicesRobert Oostenveld
 
110- Freyman Knowledge flows Linking big dataset
110- Freyman Knowledge flows Linking big dataset110- Freyman Knowledge flows Linking big dataset
110- Freyman Knowledge flows Linking big datasetinnovationoecd
 

Tendances (20)

What is Data Commons and How Can Your Organization Build One?
What is Data Commons and How Can Your Organization Build One?What is Data Commons and How Can Your Organization Build One?
What is Data Commons and How Can Your Organization Build One?
 
Good (enough) research data management practices
Good (enough) research data management practicesGood (enough) research data management practices
Good (enough) research data management practices
 
Introduction to Metadata
Introduction to MetadataIntroduction to Metadata
Introduction to Metadata
 
A Gen3 Perspective of Disparate Data
A Gen3 Perspective of Disparate DataA Gen3 Perspective of Disparate Data
A Gen3 Perspective of Disparate Data
 
A basic course on Research data management: part 1 - part 4
A basic course on Research data management: part 1 - part 4A basic course on Research data management: part 1 - part 4
A basic course on Research data management: part 1 - part 4
 
2012 Fall Data Management Planning Workshop
2012 Fall Data Management Planning Workshop2012 Fall Data Management Planning Workshop
2012 Fall Data Management Planning Workshop
 
Adding valuethroughdatacuration
Adding valuethroughdatacurationAdding valuethroughdatacuration
Adding valuethroughdatacuration
 
Some Proposed Principles for Interoperating Cloud Based Data Platforms
Some Proposed Principles for Interoperating Cloud Based Data PlatformsSome Proposed Principles for Interoperating Cloud Based Data Platforms
Some Proposed Principles for Interoperating Cloud Based Data Platforms
 
DataONE Education Module 07: Metadata
DataONE Education Module 07: MetadataDataONE Education Module 07: Metadata
DataONE Education Module 07: Metadata
 
DataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data SharingDataONE Education Module 02: Data Sharing
DataONE Education Module 02: Data Sharing
 
David Shotton - Research Integrity: Integrity of the published record
David Shotton - Research Integrity: Integrity of the published recordDavid Shotton - Research Integrity: Integrity of the published record
David Shotton - Research Integrity: Integrity of the published record
 
Research Data Management
Research Data ManagementResearch Data Management
Research Data Management
 
Crossing the Analytics Chasm and Getting the Models You Developed Deployed
Crossing the Analytics Chasm and Getting the Models You Developed DeployedCrossing the Analytics Chasm and Getting the Models You Developed Deployed
Crossing the Analytics Chasm and Getting the Models You Developed Deployed
 
The Donders Repository
The Donders RepositoryThe Donders Repository
The Donders Repository
 
DataONE Education Module 10: Legal and Policy Issues
DataONE Education Module 10: Legal and Policy IssuesDataONE Education Module 10: Legal and Policy Issues
DataONE Education Module 10: Legal and Policy Issues
 
What is a Data Commons and Why Should You Care?
What is a Data Commons and Why Should You Care? What is a Data Commons and Why Should You Care?
What is a Data Commons and Why Should You Care?
 
Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014
Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014
Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014
 
Where is the opportunity for libraries in the collaborative data infrastructure?
Where is the opportunity for libraries in the collaborative data infrastructure?Where is the opportunity for libraries in the collaborative data infrastructure?
Where is the opportunity for libraries in the collaborative data infrastructure?
 
Donders neuroimage toolkit - open science and good practices
Donders neuroimage toolkit -  open science and good practicesDonders neuroimage toolkit -  open science and good practices
Donders neuroimage toolkit - open science and good practices
 
110- Freyman Knowledge flows Linking big dataset
110- Freyman Knowledge flows Linking big dataset110- Freyman Knowledge flows Linking big dataset
110- Freyman Knowledge flows Linking big dataset
 

En vedette

8th Annual Collateral Management Forum
8th Annual Collateral Management Forum8th Annual Collateral Management Forum
8th Annual Collateral Management ForumFleming.
 
Open Source Creativity
Open Source CreativityOpen Source Creativity
Open Source CreativitySara Cannon
 
The impact of innovation on travel and tourism industries (World Travel Marke...
The impact of innovation on travel and tourism industries (World Travel Marke...The impact of innovation on travel and tourism industries (World Travel Marke...
The impact of innovation on travel and tourism industries (World Travel Marke...Brian Solis
 
Reuters: Pictures of the Year 2016 (Part 2)
Reuters: Pictures of the Year 2016 (Part 2)Reuters: Pictures of the Year 2016 (Part 2)
Reuters: Pictures of the Year 2016 (Part 2)maditabalnco
 
The Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post FormatsThe Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post FormatsBarry Feldman
 
The Outcome Economy
The Outcome EconomyThe Outcome Economy
The Outcome EconomyHelge Tennø
 

En vedette (6)

8th Annual Collateral Management Forum
8th Annual Collateral Management Forum8th Annual Collateral Management Forum
8th Annual Collateral Management Forum
 
Open Source Creativity
Open Source CreativityOpen Source Creativity
Open Source Creativity
 
The impact of innovation on travel and tourism industries (World Travel Marke...
The impact of innovation on travel and tourism industries (World Travel Marke...The impact of innovation on travel and tourism industries (World Travel Marke...
The impact of innovation on travel and tourism industries (World Travel Marke...
 
Reuters: Pictures of the Year 2016 (Part 2)
Reuters: Pictures of the Year 2016 (Part 2)Reuters: Pictures of the Year 2016 (Part 2)
Reuters: Pictures of the Year 2016 (Part 2)
 
The Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post FormatsThe Six Highest Performing B2B Blog Post Formats
The Six Highest Performing B2B Blog Post Formats
 
The Outcome Economy
The Outcome EconomyThe Outcome Economy
The Outcome Economy
 

Similaire à Data Management Planning - 02/21/13

DMP health sciences
DMP health sciencesDMP health sciences
DMP health sciencesSarah Jones
 
Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...
Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...
Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...BigData_Europe
 
OpenAIRE webinar on Open Research Data in H2020 (OAW2016)
OpenAIRE webinar on Open Research Data in H2020 (OAW2016)OpenAIRE webinar on Open Research Data in H2020 (OAW2016)
OpenAIRE webinar on Open Research Data in H2020 (OAW2016)OpenAIRE
 
Data management plans
Data management plansData management plans
Data management plansBrad Houston
 
The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...Projeto RCAAP
 
Linking HPC to Data Management - EUDAT Summer School (Giuseppe Fiameni, CINECA)
Linking HPC to Data Management - EUDAT Summer School (Giuseppe Fiameni, CINECA)Linking HPC to Data Management - EUDAT Summer School (Giuseppe Fiameni, CINECA)




Data Management Planning - 02/21/13

  • 1. Data Management Planning. February 21, 2013. Lizzy Rolando, Research Data Librarian
  • 2. Objectives
      • Understand the current climate around data management and data sharing
      • Learn about the basic elements of a data management plan
      • Explore some of the best practices for data documentation, long-term preservation, and data sharing
      • Work with the DMPTool to create a data management plan
  • 3. What is Data Management?
  • 4.
  • 5. Why Data Management? Good for You (photo taken by the U.S. Army Research Development and Engineering Command)
  • 6. Why Data Management? Good for Science (image from http://xnat.org/)
  • 7. Van Noorden, R. (2011). Science publishing: The trouble with retractions. Nature 478, 26-28. doi:10.1038/478026a
  • 8. Why Data Management? Required by Funding Agencies
  • 9. Funding Agency Requirements

      Funding Agency   Requirement
      NSF*             • Must include a 2-page DMP in proposal
                       • Materials collected during research should be shared
      NIH              • Papers must be submitted to PubMed
                       • Projects with over $500,000 funding must share data and include a Data Sharing Plan in proposal
      USDA             • National Institute of Food and Agriculture requires all data to be submitted to public domain without restriction
      NOAA             • Soon requiring that all grants include a data sharing plan, which must also be shared
                       • All data should be made visible, accessible and independently understandable to users, within 2 years of end of grant
      NASA             • Data should be made freely and widely available
                       • A data sharing plan and evidence of any past sharing activities should be included as part of the technical proposal
      CDC              • All data should be released and/or shared as soon as feasible
  • 10. Exciting News!
      • Beginning January 14, 2013, the Biographical Sketch(es) for an NSF grant proposal will include a section on “Products,” and no longer “Publications.” This way, applicants can include not just publications, but also datasets, software, patents and copyrights.
  • 11. Basic DMP Components
      The NSF requires a 2-page data management plan with every grant proposal.
      • Data description
      • Data and metadata standards
      • Data access and sharing policies
      • Data re-use and re-distribution
      • Data preservation and archiving
      Depending on the funding source and the directorate/division/program, data management plan requirements may differ.
  • 12. Data Description
      • What kinds of data will you produce?
          - Numerical data, simulations, text sequences, etc.
          - Experimental, observational, simulation
          - Raw, derived
      • How will you acquire the data?
      • How will you process the data?
      • How much data will you collect?
      • Are you using any existing data?
      • What QA/QC procedures will you use?
  • 13. Recommendations
      • A short description of your project helps to give context to why you are collecting the data.
      • Survey existing data sources.
      • Can be a narrative paragraph, table, or list.
      • Keep all raw data separate from analyzed data, and maintain versions of data during analysis.
      • Implement QA/QC procedures.
          - Ex. Two people independently record data
          - Ex. Tools to audit spreadsheets
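The "two people independently record data" check on slide 13 can be sketched as a simple comparison of the two records. This is only an illustration under stated assumptions: the function name `compare_entries`, the dictionary layout (sample ID mapped to recorded value), and the sample values are all hypothetical and do not come from the slides.

```python
# Minimal sketch of a double-entry QA check (hypothetical helper, not from the slides).
_MISSING = object()  # sentinel so a missing key is never equal to a recorded value


def compare_entries(first: dict, second: dict) -> list:
    """Return the sorted keys whose values differ, or that only one recorder entered."""
    mismatches = []
    for key in sorted(set(first) | set(second)):
        if first.get(key, _MISSING) != second.get(key, _MISSING):
            mismatches.append(key)
    return mismatches


# Hypothetical example: two recorders transcribe the same field sheet.
recorder_a = {"sample-01": 4.2, "sample-02": 3.9}
recorder_b = {"sample-01": 4.2, "sample-02": 3.6, "sample-03": 1.0}
print(compare_entries(recorder_a, recorder_b))
```

Any key the function returns would then be checked against the original field sheet before the data are analyzed.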
  • 14. Example (taken from Oceanography DMP)
      The project will collect and analyze the following data:
      • Conductivity and temperature from moorings and shipboard CTD surveys
      • Horizontal currents from Lowered ADCP and moorings
      • Horizontal currents from shipboard sonar
      • Fine and micro-scale velocity from the WHOI High Resolution Profiler
      • Fine and micro-scale temperature from fast-response thermistors (pods)
  • 15. Data and Metadata Formats
      • What metadata will you create/include with data?
          - i.e. What does someone else need to know about your data in order to reuse them?
      • Where will this be recorded? How? What format?
      • Will you use a community metadata standard?
      • Will you conform to community terminology?
  • 16. Recommendations
      • Use metadata standards common in your discipline.
      • Include a “readme.txt” file that describes the who, what, where, when and why of the data, at a bare minimum.
      • Make sure you have recorded the information that you would need if you were trying to use someone else’s data.
      • Check with the data repository where you hope to store your data; sometimes they require a particular metadata standard.
      • Use file names that are understandable to humans.
      • Make sure you record units and have headers for rows and columns in your tables.
      • Notes about the data should be recorded alongside the data by the data collectors.
      • Thesauri
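As a rough illustration of the “readme.txt” recommendation on slide 16, the sketch below assembles the who, what, where, when and why of a dataset, plus column units, into plain text. The function name `make_readme`, the field layout, and every example value are assumptions made for illustration; a discipline or repository may well require a specific metadata standard instead.

```python
# Minimal sketch of a readme.txt generator (hypothetical layout, not a metadata standard).
def make_readme(who: str, what: str, where: str, when: str, why: str, units: dict = None) -> str:
    """Return readme text covering the five basic questions and optional column units."""
    lines = [
        f"Who: {who}",
        f"What: {what}",
        f"Where: {where}",
        f"When: {when}",
        f"Why: {why}",
    ]
    if units:
        lines.append("Units:")
        for column, unit in units.items():
            lines.append(f"  {column}: {unit}")
    return "\n".join(lines) + "\n"


# Hypothetical example values; replace them with your own project details.
text = make_readme(
    who="Example Lab (contact: data manager)",
    what="Hourly water temperature readings",
    where="Example field site",
    when="2012-06-01 to 2012-08-31",
    why="Collected for a summer stream-temperature study",
    units={"temp_c": "degrees Celsius", "depth_m": "meters"},
)
print(text)
```

To save the result next to the data, the returned text can be written to a file named readme.txt in the dataset folder.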
  • 17. Example: from NEES (Network for Earthquake Engineering Simulation)
  • 18. Example: from NCAR (National Center for Atmospheric Research)
  • 19. Example (from NASA SEAC4RS DMP)
      Appendix A. SEAC4RS data file naming convention: dataID_locationID_YYYYMMDD_R#.extension
      The only allowed characters are: a-z A-Z 0-9 _ . - (that is, upper case and lower case alphanumeric, underscore, period, and hyphen). Fields are described as follows:
      • dataID: an identifier of measured parameter/species, instrument, or model (e.g., O3; NxOy; and PTRMS). For DC3 and SEAC4RS data files, the PIs are required to use “DC3-” or “SEAC4RS-” as prefixes for their DataIDs, i.e., DC3-O3 and SEAC4RS-NxOy.
      • locationID: an identifier of airborne platform or ground station, e.g., GV, DC8. Specific locationIDs for each deployment will be provided on the data website.
      • R#: data revision number. For field data, revision number will start from letter “A”, e.g., RA, RB, … etc. Numerical values will be used for the preliminary and final data, e.g., R1, R2, R3 … etc.
      • Extension: “ict” for ICARTT files, “h4” for HDF 4 files and “h5” for HDF 5 files.
      • For example, the filename for the DC-8 Diode Laser Spectrometer H2O measurement made on the June 1, 2012 flight may be DC3-DLH-H2O_DC8_20120601_RA.ict (for field data) or DC3-DLH-H2O_DC8_20120601_R1.ict (for final data).
      Appendix B. Summary of ICARTT format metadata requirements (also required for HDF 5 files):
      • Platform and associated location data: Geographic location and altitude will be embedded as part of the data file or provided via a link to the archival location of the aircraft navigational data.
      • Data Source Contact Information: phone number, mailing information, and e-mail address shall be given for the measurement Co-I and one alternate contact.
      • Data Information: Clear definition of measured quantities will be given in plain English, avoiding the use of undefined acronyms, along with reporting units and limitation of data use if applicable.
      • Measurement Description: A simple description of the measurement technique with reference to readme file and relevant journal publication.
      • Measurement Uncertainty: Overall uncertainty will need to be given as a minimum. Ideally, precision and accuracy will be provided explicitly. The confidence level associated with the reported uncertainties will also need to be specified for the reported uncertainties if it is applicable. The measurement uncertainty can be reported as constants for entire flights or as separate variables. Measurement uncertainty is required by the ICARTT data file format.
      • Data Quality Flags: definition of flag codes for missing data (not reported due to instrument malfunction or calibration) and detection limits.
      • Data Revision Comments: Provide sufficient discussion about the rationale for data revision. The discussions should focus on highlighting issues, solutions, assumptions, and impact.
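The naming convention in Appendix A (dataID_locationID_YYYYMMDD_R#.extension) lends itself to a small builder and validator. The sketch below is one possible reading of the quoted rules, not NASA's own tooling: the function names `build_filename` and `parse_filename` are hypothetical, and two details are assumptions rather than stated rules, namely that revision letters are upper case and that an underscore may not appear inside a field because it separates the fields.

```python
import re
from datetime import date, datetime

# Allowed characters per the convention: a-z A-Z 0-9 _ . -
ALLOWED = re.compile(r"^[A-Za-z0-9_.-]+$")
# "R" plus one upper-case letter (field data) or a number (preliminary/final data).
# Upper case only is an assumption based on the examples RA, RB.
REVISION = re.compile(r"^R([A-Z]|[0-9]+)$")
EXTENSIONS = {"ict", "h4", "h5"}  # ICARTT, HDF 4, HDF 5


def build_filename(data_id: str, location_id: str, day: date, revision: str, extension: str) -> str:
    """Assemble dataID_locationID_YYYYMMDD_R#.extension after checking each field."""
    for field in (data_id, location_id):
        # Assumption: "_" separates fields, so it is rejected inside a field.
        if "_" in field or not ALLOWED.match(field):
            raise ValueError(f"invalid field: {field!r}")
    if not REVISION.match(revision):
        raise ValueError(f"invalid revision: {revision!r}")
    if extension not in EXTENSIONS:
        raise ValueError(f"invalid extension: {extension!r}")
    return f"{data_id}_{location_id}_{day:%Y%m%d}_{revision}.{extension}"


def parse_filename(name: str) -> dict:
    """Split a filename into its fields, raising ValueError if it breaks the convention."""
    stem, dot, extension = name.rpartition(".")
    parts = stem.split("_")
    if not dot or len(parts) != 4:
        raise ValueError(f"expected dataID_locationID_YYYYMMDD_R#.extension, got {name!r}")
    data_id, location_id, ymd, revision = parts
    day = datetime.strptime(ymd, "%Y%m%d").date()  # raises ValueError on a bad date
    # Rebuilding validates every field and rejects non-canonical dates such as 7 digits.
    if build_filename(data_id, location_id, day, revision, extension) != name:
        raise ValueError(f"non-canonical filename: {name!r}")
    return {"data_id": data_id, "location_id": location_id, "day": day,
            "revision": revision, "extension": extension}


# The field-data example from the slide:
print(build_filename("DC3-DLH-H2O", "DC8", date(2012, 6, 1), "RA", "ict"))
```

Checking names at the moment files are written tends to be cheaper than renaming them later, since downstream scripts and archives often rely on the filename to locate the date and revision.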
  • 20. Policies for Access and Sharing
      • Are your data sensitive, so access by others needs to be restricted?
      • What license or publishing model will you use for your data?
      • How will you make your data accessible to others?
      • What data will you make available and at what stage of your research?
      • Do you have protocols, such as IRB, that you need to comply with? If so, how will you do so?
  • 21. Recommendations
      • Apply an open license to data that you will share.
      • Explain why you cannot share data, if that is the case.
          - For example, the data used in your research are proprietary.
      • Anonymize any sensitive data.
          - Use a repository that can mediate data sharing if data cannot be sufficiently anonymized.
      • Comply with IRB restrictions.
          - That should be obvious, but we’ll say it anyway.
      • Be aware of Georgia Tech Policy…
  • 22. Example (from ICPSR)
      “ICPSR will make the research data from this project available to the broader social science research community.
      • Public-use data files: These files, in which direct and indirect identifiers have been removed to minimize disclosure risk, may be accessed directly through the ICPSR Web site. After agreeing to Terms of Use, users with an ICPSR MyData account and an authorized IP address from a member institution may download the data, and non-members may purchase the files.
      • Restricted-use data files: These files are distributed in those cases when removing potentially identifying information would significantly impair the analytic potential of the data. Users (and their institutions) must apply for these files, create data security plans, and agree to other access controls.
      • Timeliness: The research data from this project will be supplied to ICPSR before the end of the project so that any issues surrounding the usability of the data can be resolved. Delayed dissemination may be possible. The Delayed Dissemination Policy allows for data to be deposited but not disseminated for an agreed-upon period of time (typically one year).”
  • 23. Policies and Provisions for Re-use
      • Who do you expect will want to or can reuse your data?
      • Should there be restrictions on who can reuse your data, or how?
      • How should others indicate that they have used your data?
      • How long will your data be available to others for reuse?
      • Does your institution have rules about data?
  • 24. Recommendations
      • Imagine the broadest possible audience for your data.
      • Place as few restrictions on your data as you can.
      • Link your published articles to the data underlying those articles.
      • Use a repository that can make your data available far into the future.

      Funding Agency                 Suggested Length of Time for Private Data Retention
      NIH                            No later than the acceptance for publication of main findings from final data set
      NOAA                           2 years after data collection
      NSF-Engineering Directorate    3 years after the end of the project or public release, whichever comes first
      NSF-Earth Sciences Division    2 years after data collection
      NSF-Ocean Sciences Division    2 years after data collection
  • 25. Example (from USC)
      “USC’s policy is to encourage, wherever appropriate, research data to be shared with the general public through internet access. This public access will be regulated by the university in order to protect privacy and confidentiality concerns, as well to respect any proprietary or intellectual property rights. Administrators will consult with the university’s legal office to address any concerns on a case-by-case basis, if necessary. Terms of use will include requirements of attribution along with disclaimers of liability in connection with any use or distribution of the research data, which may be conditioned under some circumstances.”
  • 26. Archiving and Preservation
      • What formats for your data will you use? Are they preservation friendly?
      • What repository or data archive can take your data when you are finished?
          - How do they preserve/share your data?
          - What are their access policies?
      • Is any extra work needed to prepare data for the repository?
      • Who will be responsible for final preservation?
  • 27. Recommendations
      • Appraise your data, selecting those with long-term value, and document your choices.
      • Use preservation-friendly digital formats.
          - Non-proprietary, commonly used
          - You may need to transform data into a new format.
      • Find a repository that will take your data, and plan to comply with their policies early on.
          - Look into using SMARTech!
      • PIs should ultimately be responsible for dealing with the final disposition of the data.
  • 28. Example (from DataONE)
      Short Term:
      • The data product will be updated monthly reflecting updates to the record, revisions due to recalibration of standard gases, and identification and flagging of any errors. The date of the update will be included in the data file and will be part of the data file name. Versions of the data product that have been revised due to errors/updates (other than new data) will be retained in an archive system. A revision history document will describe the revisions made. Daily and monthly backups of the data files will be retained at the Keeling Group Lab (http://scrippsco2.ucsd.edu, accessed 05/2011), at the Scripps Institution of Oceanography Computer Center, and at the Woods Hole Oceanographic Institution’s Computer Center.
      Long Term:
      • Our intent is that the long term high quality final data product generated by this project will be available for use by the research and policy communities in perpetuity. The raw supporting data will be available in perpetuity as well, for use by researchers to confirm the quality of the Mauna Loa Record. The investigators have made arrangements for long term stewardship and curation at the Carbon Dioxide Information and Analysis Center (CDIAC), Oak Ridge National Laboratory (see letter of support). The standardized metadata record for the Mauna Loa CO2 data will be added to the metadata record database at CDIAC, so that interested users can discover the Mauna Loa CO2 record along with other related Earth science data. CDIAC has a standardized data product citation including DOI, that indicates the version of the Mauna Loa Data Product and how to obtain a copy of that product.
  • 30. DMPTool
      • Developed by a number of universities in response to funding agency mandates
      • https://dmp.cdlib.org/
  • 31. Step 1: Sign In. Choose Georgia Tech.
  • 33. Step 2: Create a Plan. Select a funding agency. An email is sent to the Georgia Tech Library.
  • 34. Creating and Naming Your Plan. We strongly recommend naming the plan “[Insert Proposal Title Here] Data Management Plan”.
  • 35. Step 3: One Section at a Time. Sections are different depending on funding source. Enter your answers here. Georgia Tech and DataONE have resources available for every section.
  • 36. Some Sections Have Extra Advice: Georgia Tech specific help text
  • 37. Almost There. You’re so close to the end! You should save after every section, but definitely save at the very end.
  • 38. Step 4: Export. Now that you have the content, you can export your plan.
  • 39. Step 5: Share Plan
      • Send your plan to the Research Data Librarian (Me!) to look over your plan.
      • Have your colleagues look at your plan.
      • Do you know your grant officer?
  • 40. Step 6: Finish and Start Research!
      • Add plan to proposal or distribute among research team
      • Begin your newly funded research!
  • 41. Other Data Management Plan Resources
      • Digital Curation Centre: http://www.dcc.ac.uk/resources/data-management-plans
      • ICPSR (while made for social science data, it has great resources for anyone): http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/plan.html
      • UK Data Archive: http://www.data-archive.ac.uk/media/2894/managingsharing.pdf
  • 42. Questions? Lizzy Rolando, Research Data Librarian, lizzy.rolando@library.gatech.edu, 404.385.3706, http://libguides.gatech.edu/research-data