Research Data Management 
Data documentation, organization, storage and sharing 
Aaron Collie 
Digital Curation Librarian 
collie@msu.edu
Data Management. Isn’t that… trivial? 
 Not so much. Data is a primary output of research; it is very 
expensive to produce high quality data. Data may be collected 
in nanoseconds, but it takes the expert application of 
research protocol and design to generate quality data. 
CC-BY-SA-3.0 Rob Lavinsky 
 To put that into perspective, consider data as the 
product of an industry. Data is the output of a 
process that generates higher orders of 
understanding. 
Wisdom 
Knowledge 
Information 
Data 
Understanding is hierarchical!
Russell Ackoff
Data Industries 
 In the academic sector that industry is called scholarly 
communication. 
Data → Research Article
 In the private sector that industry is called research & 
development. 
Data → New Product
Industry is changing 
Multiauthor Papers: Onward and Upward - ScienceWatch Newsletter. (n.d.). Retrieved October 4, 2013, from http://archive.sciencewatch.com/newsletter/2012/201207/multiauthor_papers/
The demise of the lone author : Article : History of the Journal Nature. (n.d.). Retrieved October 4, 2013, from http://www.nature.com/nature/history/full/nature06243.html
Science is always changing 
• Thousand years ago: 
science was empirical 
describing natural phenomena 
• Last few hundred years: 
theoretical branch 
using models, generalizations 
• Last few decades: 
a computational branch 
simulating complex phenomena 
• Today: 
data exploration (eScience) 
unify theory, experiment, and simulation 
– Data captured by instruments 
or generated by simulator 
– Processed by software 
– Information/Knowledge stored in computer 
– Scientist analyzes database / files 
using data management and statistics 
Slide credit: Gray, J. & Szalay, A. (11 January 2007). eScience Talk at NRC-CSTB meeting. http://research.microsoft.com/en-us/um/people/gray/talks/NRC-CSTB_eScience.ppt
Research is now a team sport 
(cc) SpoiltCat
This has been noticed. 
“…must include a supplementary document of no more 
than two pages labeled ‘Data Management Plan’.” 
“…expects the timely release and sharing of final research 
data" 
“…should describe how the project team will manage and 
disseminate data generated by the project” 
“…requires that data…be submitted to and archived by 
designated national data centers.” 
NASA “promotes the full and open sharing of all data” 
"IMLS encourages sharing of research data."
But why are we really here? 
 Impetus: NSF has mandated that all grant applications 
submitted after January 18th, 2011 must include a 
supplemental “Data Management Plan” 
 Effect: The original NSF mandate has had a domino effect, and 
many funders now require or state guidelines for data 
management of grant funded research 
 Response: Data management has not traditionally received a 
full treatment in (many) graduate and doctoral curricula; 
intervention is necessary
Positive reinforcement…. 
 National Science Foundation Data Management 
Plan mandate (January 18, 2011) 
 Presidential Memorandum on Managing 
Government Records (August 24, 2012) 
 Managing Government Records Directive: All permanent 
electronic records in Federal agencies will be managed 
electronically to the fullest extent possible for eventual 
transfer and accessioning by NARA in an electronic format.
Positive reinforcement… (cont.) 
 White House policy memo (February 22, 2013) 
 Increasing Access to the Results of Federally Funded Scientific 
Research: Federal agencies with more than $100M in R&D 
expenditures must develop plans to make the published results of 
federally funded research freely available to the public within one year 
of publication. 
 OSTP policy memo (March 20, 2014) 
 Improving the Management of and Access to Scientific Collections: 
directs each Federal agency that owns, maintains, or otherwise 
financially supports permanent scientific collections to develop a draft 
scientific-collections management and access policy within six months.
How does this apply to you? 
 Data management is now an expected job skill.
 Especially in the research fields (“RDM”). 
 Studies show that data management is not typically a
significant part of undergraduate or graduate curricula.
 We have a causality dilemma!
What’s in it for you? 
 Better organization for your classes 
 Course Management: Angel / Desire2Learn 
 Bibliographic Management: Zotero / EndNote / Mendeley
 File Management: Google Drive / Git / File-system 
 Direct application to your career 
 Data management is an “unnamed practice” 
 Start now so you can list this skill on your résumé or CV
 Academia is changing: big data is here
Course Management 
http://help.d2l.msu.edu/
Bibliographic Management 
http://classes.lib.msu.edu/
File Management 
http://tech.msu.edu/storage/
Storage @ MSU 
http://www.egr.msu.edu/decs/ 
• DECS provides MANY services, specifically designed for the College of 
Engineering: storage, equipment, software, hosting… 
http://afs.msu.edu 
• AFS space provides 1GB of networked storage. 
https://wiki.hpcc.msu.edu 
• HPCC provides 50GB (home) and 1TB (research group) networked storage. 
http://www.cstat.msu.edu/ 
• CSTAT offers statistical consulting and training among other services
RDM Systems 
File Storage 
File Content 
File Format 
File System 
 File Systems 
 Hierarchical 
 Database Systems 
 Hierarchical, Relational, or 
Object Oriented 
 Asset Management 
Systems 
 Combination of Database 
and File System
Data Management
Storage Architecture
o Storage Options
o Single points of failure
o Backup Strategy
File Management
o File Organization
o File Naming
o File Formats
Documentation Practices
o Project Documentation
o Process Documentation
o Data Documentation
Access Management
o Sharing Data
o Publishing Data
o Archiving Data
(cc) Alan Cleaver (cc) Will Scullin
o Storage Options  
o Single points of failure 
o Backup Strategy 
Storage 
Architecture 
Optical Storage 
• CD-ROM 
• DVD-ROM 
• Blu-ray Discs 
Solid-State Storage 
• USB Flash Drives 
• Memory Cards 
• “Internal Device Storage” 
Magnetic Storage 
• Internal Hard Drives 
• External Hard Drives 
• Tape Drives 
Networked Storage 
• Server and Web Storage 
• Managed Networked Storage 
• “Cloud Storage” 
• Tape Libraries
o Storage Options 
o Single points of failure  
o Backup Strategy 
Storage 
Architecture 
Good practices for avoiding single points of failure:
 Use managed networked storage whenever possible 
 Move data off of portable media 
 Never rely on one copy of data 
 Do not rely on CD or DVD copies to be readable 
 Be wary of software lifespans (e.g. Angel) 
Limited “Task” Term
• Optical Media
  • CD, DVD, Blu-ray
• Portable Flash Media
  • USB Flash Drives
  • Memory Cards
  • Internal Memory
Short “Project” Term
• Magnetic Storage
  • Internal HD
  • External HD
• Networked Storage
  • Server/Web Space
  • Cloud Storage
Long “Life” Term
• Networked Storage
  • Managed Network
• Magnetic Storage
  • Tape Drives
o Storage Options 
o Single points of failure 
o Backup Strategy  
Storage 
Architecture 
Good practices for creating a backup strategy: 
 Make 3 copies 
 E.g. original + external/local + external/remote 
 E.g. original + 2 formats on 2 drives in 2 locations 
 Geographically distribute and secure 
 Local vs. remote, depending on needed recovery time 
 Know what resources are available to you: personal 
computer, external hard drives, departmental, or 
university servers may be used
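A minimal sketch of the "make 3 copies" practice, assuming only the Python standard library. The function names `sha256_of` and `back_up` are hypothetical, and the destination folders stand in for whatever local and remote storage is available to you. Each copy is checked against the original's checksum so that a silently corrupted copy is not mistaken for a backup.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(original: Path, destinations: list[Path]) -> list[Path]:
    """Copy `original` into each destination folder and verify every copy.

    Raises IOError if a copy does not match the original's checksum.
    """
    expected = sha256_of(original)
    copies = []
    for directory in destinations:
        directory.mkdir(parents=True, exist_ok=True)
        # copy2 also preserves timestamps where the platform allows it
        copy = Path(shutil.copy2(original, directory))
        if sha256_of(copy) != expected:
            raise IOError(f"Checksum mismatch for {copy}")
        copies.append(copy)
    return copies
```

Calling `back_up` with two destinations, for example an external drive and a mounted university share (both paths would be your own), gives original + local + remote.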
Create a file plan 
 Better chance you will use a standard method when the time comes 
 Simple organization is intuitive to team members and colleagues 
 Reduces unsynchronized copies in personal drives and email 
attachments 
o File Organization  
o File Naming 
o File Formats 
File 
Management
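One way to make a file plan concrete is to create the same folder skeleton at the start of every project. The sketch below assumes Python's standard library; the folder names in `FILE_PLAN` and the function name `create_file_plan` are illustrative placeholders, not a layout prescribed by these slides.

```python
from pathlib import Path

# Hypothetical file plan: folder names are examples only.
FILE_PLAN = {
    "docs": ["readme.txt"],
    "data/raw": [],
    "data/processed": [],
    "scripts": [],
    "results": [],
}


def create_file_plan(project_root: Path, plan: dict[str, list[str]] = FILE_PLAN) -> list[Path]:
    """Create the planned folders (and empty placeholder files) under project_root."""
    created = []
    for folder, files in plan.items():
        directory = project_root / folder
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
        for name in files:
            # touch() leaves an existing file untouched
            (directory / name).touch(exist_ok=True)
    return created
```

Because the function is safe to run repeatedly, every team member can run it against a shared location and end up with the same structure.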
o File Organization 
o File Naming  
o File Formats 
File 
Management 
Utilize a file naming convention 
 Create logical sequences for sorting through many files and versions 
 Identify what you’re searching for by filename by using a primary term 
 If not using a version control system, implement simple versioning 
 It’s sort of like a tweet 
 Should not exceed 255 characters for most modern operating systems 
Example file names using simple version control (primary term in parentheses):
lakeLansing_waltM_fieldNotes_20091012_v002.doc (location)
OrgChart2009_petersK_20090101_d001.svg (content)
20110117_sharpeW_krillMicrograph_backscatter3_v002.tif (date)
borgesJ_collocation_20080414.xml (person)
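A naming convention is easier to follow when a small helper builds the names. The sketch below is one possible reading of the pattern in the first example (primary term, person, content, date, version); the function names `build_name` and `next_version` are hypothetical, and the three-digit `_vNNN` counter is an assumption taken from that example.

```python
import re
from datetime import date

# Matches a three-digit version marker directly before the file extension.
VERSION_PATTERN = re.compile(r"_v(\d{3})(?=\.[^.]+$)")


def build_name(primary: str, person: str, content: str,
               day: date, version: int, ext: str) -> str:
    """Join the parts with underscores, primary term first, date as YYYYMMDD."""
    name = f"{primary}_{person}_{content}_{day:%Y%m%d}_v{version:03d}.{ext}"
    if len(name) > 255:
        raise ValueError("File name exceeds 255 characters")
    return name


def next_version(filename: str) -> str:
    """Return the file name with its _vNNN counter incremented by one."""
    match = VERSION_PATTERN.search(filename)
    if match is None:
        raise ValueError(f"No _vNNN version marker in {filename!r}")
    bumped = int(match.group(1)) + 1
    return filename[:match.start()] + f"_v{bumped:03d}" + filename[match.end():]
```

Writing the date as YYYYMMDD and zero-padding the version means that an ordinary alphabetical sort also orders the files chronologically and by version.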
o File Organization 
o File Naming 
o File Formats  
File 
Management 
Make an informed decision in selecting file formats 
 It is important to choose platform and vendor-independent file 
formats to ensure the best chance for future compatibility 
 “Open” formats are often (but not always) supported broadly by a 
community rather than individually by a company or vendor 
Format Genre Great Not Bad Avoid 
TEXT .txt; .odt; .xml; .html .pdf; .rtf; .docx .doc 
AUDIO .flac; .wav .ogg; .mp3 .wma; .ra; .ram; compression
VIDEO .mp2/.mp4, MKV .wmv; .mov; .avi; compression 
IMAGE .tif; .png; .svg; .jpg .gif; .psd; compression 
DATA .sql; .csv; .xml .xlsx .xls; proprietary DB formats
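The format guidance can be turned into a quick check over a project folder. This sketch covers only the TEXT, AUDIO and DATA rows, read as Great / Not Bad / Avoid columns; that column split is an assumption about the flattened table, and the VIDEO and IMAGE rows are left out because their split is less certain. The name `rate_format` is hypothetical.

```python
from pathlib import Path

# Assumed reading of the TEXT, AUDIO and DATA rows; other rows are not included.
RATINGS = {
    "great": {".txt", ".odt", ".xml", ".html", ".flac", ".wav", ".sql", ".csv"},
    "not bad": {".pdf", ".rtf", ".docx", ".ogg", ".mp3", ".xlsx"},
    "avoid": {".doc", ".wma", ".ra", ".ram", ".xls"},
}


def rate_format(filename: str) -> str:
    """Return 'great', 'not bad', 'avoid', or 'unrated' for a file's extension."""
    suffix = Path(filename).suffix.lower()
    for rating, extensions in RATINGS.items():
        if suffix in extensions:
            return rating
    return "unrated"
```

Running `rate_format` over every file in a project before archiving gives a short list of files worth converting to a more durable format.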
o Project Documentation  
o Process Documentation 
o Data Documentation 
Documentation 
Practices 
Good practice for documenting project information: 
 Oftentimes a team effort 
 At minimum, store documentation in readme.txt file 
 Include name of project, people, roles & contact information 
 Include executive summary or abstract for basic context 
 Include an inventory of servers, directories, data, lab 
equipment, and other resources 
 A great start for project documentation is a project charter
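The readme.txt items above can be captured in a small template so that every project records the same fields. This is a sketch only: the function name `write_readme`, its parameters, and the section labels are hypothetical choices, not a required format.

```python
from pathlib import Path


def write_readme(folder: Path, project: str, people: dict[str, str],
                 summary: str, inventory: list[str]) -> Path:
    """Write a plain-text readme.txt with project, people, summary and inventory.

    `people` maps "Name (role)" to contact information.
    """
    lines = [f"Project: {project}", "", "People, roles and contact information:"]
    lines += [f"  - {who}: {contact}" for who, contact in people.items()]
    lines += ["", "Summary:", f"  {summary}", "", "Inventory of resources:"]
    lines += [f"  - {item}" for item in inventory]
    readme = folder / "readme.txt"
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme
```

Plain text is used deliberately: a readme.txt can be opened on any platform without special software.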
o Project Documentation 
o Process Documentation  
o Data Documentation 
Documentation 
Practices 
Good practices for documenting processes: 
 Sometimes an individual effort, sometimes collaborative 
 Protocols, software or code settings, code commentary 
 Workflow descriptions (text) or diagrams (image) 
 Include example scripts, inputs, outputs if applicable 
 A great start for process documentation is a lab notebook 
Example of R code commentary 
# Cumulative normal distribution function evaluated at -1.96, 0 and 1.96
pnorm(c(-1.96, 0, 1.96))
o Project Documentation 
o Process Documentation 
o Data Documentation  
Good practices for documenting data: 
 Use standard methods of documentation where 
they exist 
 Metrics/Measurements 
 Code Book 
 Metadata Standard 
~1.57×10⁷ K = Temperature of the sun (center)
• measure/metric: ~1.57×10⁷
• unit: K
• metadata: Temperature of the sun (center)
Documentation 
Practices
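A code book can be as simple as a CSV file with one row per variable, recording what was measured and in which unit. The sketch below assumes that layout; the class name `Variable`, the column names, and the variable name `core_temperature` are hypothetical, with the sun-temperature example from the slide used as the single entry.

```python
import csv
import io
from dataclasses import dataclass, asdict


@dataclass
class Variable:
    """One code book entry: variable name, unit, and a human-readable description."""
    name: str
    unit: str
    description: str


def write_code_book(variables: list[Variable]) -> str:
    """Serialize the code book as CSV text, one row per variable."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["name", "unit", "description"], lineterminator="\n"
    )
    writer.writeheader()
    for variable in variables:
        writer.writerow(asdict(variable))
    return buffer.getvalue()


code_book = write_code_book([
    Variable("core_temperature", "K", "Temperature of the sun (center)"),
])
```

Saving the resulting text next to the data file keeps the unit and meaning of each column with the numbers they describe. Where a disciplinary metadata standard exists, it would take the place of these ad hoc columns.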
o Sharing Data  
o Publishing Data 
o Archiving Data 
Access 
Management 
Good practices for sharing or distributing data: 
 Basics 
• Synchronization, Versioning, Access Restrictions (and logs) 
• Collaborative tools can save time and effort (and help with scale) 
 Intellectual property 
• Data itself not protected by copyright law in U.S. 
• Expressions of data (forms, reports, visuals) can be copyrightable 
• Data can be licensed similarly to software 
 Ethics 
• Human subjects (e.g. IRB restrictions) 
• Private/sensitive information
o Sharing Data 
o Publishing Data  
o Archiving Data 
Access 
Management 
Good practices for publishing data: 
 Not Publishing 
 Self Publishing (Web Site) 
 Create and add data citations to personal websites 
 Journal (Supplementary Material) 
 Publish data with a journal that will provide a persistent link to your 
dataset (e.g. DOI, handle) 
 Archive/Repository 
 Institutional (see above example) 
 Disciplinary (e.g. article & data)
o Sharing Data 
o Publishing Data 
o Archiving Data  
Access 
Management 
Good practices for archiving research data: 
 LOCKSS! 
 Archive documentation with data 
 Write costs for data management and archiving into your 
research budgets (and in some cases, proposals) 
 Define access policies including restrictions or embargos 
 Understand requirements for submission of data prior to 
project completion
Resources at the Library 
Research Data Management Guidance 
• Face-to-face consulting on RDM strategies 
• researchdata@mail.lib.msu.edu 
Tom Volkening 
• Engineering Librarian 
• volkenin@msu.edu 
Aaron Collie 
• Digital Curation Librarian 
• collie@msu.edu
Questions? 
 Store – Three Copies on Three Disks in Three Locations 
 Organize – If you make a plan, you just might follow it. 
 Document – What would my colleagues need to know to 
understand this data? 
 Share – Data makes an impact 
 Slides are HERE: http://tiny.cc/yvdpqw 
Aaron Collie 
Digital Curation Librarian 
collie@msu.edu

VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
 
Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICECall Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
 
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
 
Enjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort Service
Enjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort ServiceEnjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort Service
Enjoy Night⚡Call Girls Dlf City Phase 3 Gurgaon >༒8448380779 Escort Service
 
Russian Call Girls in Kolkata Samaira 🤌 8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Samaira 🤌  8250192130 🚀 Vip Call Girls KolkataRussian Call Girls in Kolkata Samaira 🤌  8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Samaira 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)
 
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
 
VIP Kolkata Call Girl Dum Dum 👉 8250192130 Available With Room
VIP Kolkata Call Girl Dum Dum 👉 8250192130  Available With RoomVIP Kolkata Call Girl Dum Dum 👉 8250192130  Available With Room
VIP Kolkata Call Girl Dum Dum 👉 8250192130 Available With Room
 
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
 
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
 
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
 
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With RoomVIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
 
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
 

Research Data Management Fundamentals for MSU Engineering Students

  • 1. Research Data Management Data documentation, organization, storage and sharing Aaron Collie Digital Curation Librarian collie@msu.edu
  • 2. Data Management. Isn’t that… trivial?  Not so much. Data is a primary output of research; it is very expensive to produce high quality data. Data may be collected in nanoseconds, but it takes the expert application of research protocol and design to generate quality data. CC-BY-SA-3.0 Rob Lavinsky CC-BY-SA-3.0 Rob
  • 3.  To put that into perspective, consider data as the product of an industry. Data is the output of a process that generates higher orders of understanding. Wisdom Knowledge Information Data Understanding is hierarchical! Russell Ackoff
  • 4. Data Industries  In the academic sector that industry is called scholarly communication. Data Research Article  In the private sector that industry is called research & development. Data New Product
  • 5. Industry is changing Multiauthor Papers: Onward and Upward - ScienceWatch Newsletter. (n.d.). Retrieved October 4, 2013, from http://archive.sciencewatch.com/newsletter/2012/201207/multiauthor_papers/ The demise of the lone author : Article : History of the Journal Nature. (n.d.). Retrieved October 4, 2013, from http://www.nature.com/nature/history/full/nature06243.html
  • 6. Science is always changing • Thousand years ago: science was empirical describing natural phenomena • Last few hundred years: theoretical branch using models, generalizations • Last few decades: a computational branch simulating complex phenomena • Today: data exploration (eScience) unify theory, experiment, and simulation – Data captured by instruments or generated by simulator – Processed by software – Information/Knowledge stored in computer – Scientist analyzes database / files using data management and statistics Slide credit: Gray, J. & Szalay, A. (11 January 2007). eScience Talk at NRC-CSTB meeting. http://research.microsoft.com/en-us/um/people/gray/talks/NRC-CSTB_eScience.ppt
  • 7. Research is now a team sport (cc) SpoiltCat
  • 8. This has been noticed. “…must include a supplementary document of no more than two pages labeled ‘Data Management Plan’.” “…expects the timely release and sharing of final research data" “…should describe how the project team will manage and disseminate data generated by the project” “…requires that data…be submitted to and archived by designated national data centers.” NASA “promotes the full and open sharing of all data” "IMLS encourages sharing of research data."
  • 9. But why are we really here?  Impetus: NSF has mandated that all grant applications submitted after January 18th, 2011 must include a supplemental “Data Management Plan”  Effect: The original NSF mandate has had a domino effect, and many funders now require or state guidelines for data management of grant funded research  Response: Data management has not traditionally received a full treatment in (many) graduate and doctoral curricula; intervention is necessary
  • 10. Positive reinforcement….  National Science Foundation Data Management Plan mandate (January 18, 2011)  Presidential Memorandum on Managing Government Records (August 24, 2012)  Managing Government Records Directive: All permanent electronic records in Federal agencies will be managed electronically to the fullest extent possible for eventual transfer and accessioning by NARA in an electronic format.
  • 11. Positive reinforcement… (cont.)  White House policy memo (February 22, 2013)  Increasing Access to the Results of Federally Funded Scientific Research: Federal agencies with more than $100M in R&D expenditures must develop plans to make the published results of federally funded research freely available to the public within one year of publication.  OSTP policy memo (March 20, 2014)  Improving the Management of and Access to Scientific Collections: directs each Federal agency that owns, maintains, or otherwise financially supports permanent scientific collections to develop a draft scientific-collections management and access policy within six months.
  • 12. How does this apply to you?  Data management is now an expected job skill.  Especially in the research fields (“RDM”).  Studies show that data management is not typically a significant part of undergraduate or graduate curricula.  We have a causality dilemma!
  • 13. What’s in it for you?  Better organization for your classes  Course Management: Angel / Desire2Learn  Bibliographic Management: Zotero / EndNote / Mendeley  File Management: Google Drive / Git / File-system  Direct application to your career  Data management is an “unnamed practice”  Start now so you can list this skill on your Resume or CV  Academia is changing: big data is here
  • 17. Storage @ MSU http://www.egr.msu.edu/decs/ • DECS provides MANY services, specifically designed for the College of Engineering: storage, equipment, software, hosting… http://afs.msu.edu • AFS space provides 1GB of networked storage. https://wiki.hpcc.msu.edu • HPCC provides 50GB (home) and 1TB (research group) networked storage. http://www.cstat.msu.edu/ • CSTAT offers statistical consulting and training among other services
  • 18. RDM Systems File Storage File Content File Format File System  File Systems  Hierarchical  Database Systems  Hierarchical, Relational, or Object Oriented  Asset Management Systems  Combination of Database and File System
  • 19. o Storage Options o Single points of failure o Backup Strategy o Project Documentation o Process Documentation o Data Documentation o Sharing Data o Publishing Data o Archiving Data Data Management Storage Architecture File Management Documentation Practices Access Management (cc) Alan Cleaver (cc) Will Scullin o File Organization o File Naming o File Formats
  • 20. o Storage Options o Single points of failure o Backup Strategy Storage Architecture File Storage File Content File Format File System
  • 21. o Storage Options  o Single points of failure o Backup Strategy Storage Architecture Optical Storage • CD-ROM • DVD-ROM • Blu-ray Discs Solid-State Storage • USB Flash Drives • Memory Cards • “Internal Device Storage” Magnetic Storage • Internal Hard Drives • External Hard Drives • Tape Drives Networked Storage • Server and Web Storage • Managed Networked Storage • “Cloud Storage” • Tape Libraries
  • 22. o Storage Options o Single points of failure  o Backup Strategy Storage Architecture Good practices for avoiding single points of error:  Use managed networked storage whenever possible  Move data off of portable media  Never rely on one copy of data  Do not rely on CD or DVD copies to be readable  Be wary of software lifespans (e.g. Angel) Limited “Task” Term Short “Project” Term Long “Life” Term • Optical Media • CD, DVD, Blu-ray • Portable Flash Media • USB Flash Drives • Memory Cards • Internal Memory • Magnetic Storage • Internal HD • External HD • Networked Storage • Server/Web Space • Cloud Storage • Networked Storage • Managed Network • Magnetic Storage • Tape Drives
  • 23. o Storage Options o Single points of failure o Backup Strategy  Storage Architecture Good practices for creating a backup strategy:  Make 3 copies  E.g. original + external/local + external/remote  E.g. original + 2 formats on 2 drives in 2 locations  Geographically distribute and secure  Local vs. remote, depending on needed recovery time  Know what resources are available to you: personal computer, external hard drives, departmental, or university servers may be used
  • 24. o Storage Options o Single points of failure o Backup Strategy o Project Documentation o Process Documentation o Data Documentation o Sharing Data o Publishing Data o Archiving Data Data Management Storage Architecture File Management Documentation Practices Access Management (cc) Alan Cleaver (cc) Will Scullin o File Organization o File Naming o File Formats
  • 25. o File Organization o File Naming o File Formats File Management File Storage File Content File Format File System
  • 26. Create a file plan  Better chance you will use a standard method when the time comes  Simple organization is intuitive to team members and colleagues  Reduces unsynchronized copies in personal drives and email attachments o File Organization  o File Naming o File Formats File Management
  • 27. o File Organization o File Naming  o File Formats File Management Utilize a file naming convention  Create logical sequences for sorting through many files and versions  Identify what you’re searching for by filename by using a primary term  If not using a version control system, implement simple versioning  It’s sort of like a tweet  Should not exceed 255 characters for most modern operating systems Example file names using simple version control: Primary term: lakeLansing_waltM_fieldNotes_20091012_v002.doc location OrgChart2009_petersK_20090101_d001.svg content 20110117_sharpeW_krillMicrograph_backscatter3_v002.tif date borgesJ_collocation_20080414.xml person
  • 28. o File Organization o File Naming o File Formats  File Management Make an informed decision in selecting file formats  It is important to choose platform and vendor-independent file formats to ensure the best chance for future compatibility  “Open” formats are often (but not always) supported broadly by a community rather than individually by a company or vendor Format Genre Great Not Bad Avoid TEXT .txt; .odt; .xml; .html .pdf; .rtf; .docx .doc AUDIO .flac; .wav .ogg; .mp3 .wma; .ra; .ram; compression VIDEO .mp2/.mp4, MKV .wmv; .mov; .avi; compression IMAGE .tif; .png; .svg; .jpg .gif; .psd; compression DATA .sql; .csv; .xml .xlsx .xls; proprietary DB formats
  • 29. o Storage Options o Single points of failure o Backup Strategy o Project Documentation o Process Documentation o Data Documentation o Sharing Data o Publishing Data o Archiving Data Data Management Storage Architecture File Management Documentation Practices Access Management (cc) Alan Cleaver (cc) Will Scullin o File Organization o File Naming o File Formats
  • 30. o Project Documentation o Process Documentation o Data Documentation Documentation Practices File Storage File Content File Format File System
  • 31. o Project Documentation  o Process Documentation o Data Documentation Documentation Practices Good practice for documenting project information:  Oftentimes a team effort  At minimum, store documentation in readme.txt file  Include name of project, people, roles & contact information  Include executive summary or abstract for basic context  Include an inventory of servers, directories, data, lab equipment, and other resources  A great start for project documentation is a project charter
  • 32. o Project Documentation o Process Documentation  o Data Documentation Documentation Practices Good practices for documenting processes:  Sometimes an individual effort, sometimes collaborative  Protocols, software or code settings, code commentary  Workflow descriptions (text) or diagrams (image)  Include example scripts, inputs, outputs if applicable  A great start for process documentation is a lab notebook Example of R code commentary # Cumulative normal density pnorm(c(-1.96,0,1.96))
  • 33. o Project Documentation o Process Documentation o Data Documentation  Good practices for documenting data:  Use standard methods of documentation where they exist  Metrics/Measurements  Code Book  Metadata Standard unit ~1.57×10⁷ K = Temperature of the sun (center) measure/metric metadata Documentation Practices
  • 34. o Storage Options o Single points of failure o Backup Strategy o Project Documentation o Process Documentation o Data Documentation o Sharing Data o Publishing Data o Archiving Data Data Management Storage Architecture File Management Documentation Practices Access Management (cc) Alan Cleaver o File Organization o File Naming o File Formats
  • 35. o Sharing Data o Publishing Data o Archiving Data Access Management File Storage File Content File Format File System
  • 36. o Sharing Data  o Publishing Data o Archiving Data Access Management Good practices for sharing or distributing data:  Basics • Synchronization, Versioning, Access Restrictions (and logs) • Collaborative tools can save time and effort (and help with scale)  Intellectual property • Data itself not protected by copyright law in U.S. • Expressions of data (forms, reports, visuals) can be copyrightable • Data can be licensed similarly to software  Ethics • Human subjects (e.g. IRB restrictions) • Private/sensitive information
  • 37. o Sharing Data o Publishing Data  o Archiving Data Access Management Good practices for publishing data:  Not Publishing  Self Publishing (Web Site)  Create and add data citations to personal websites  Journal (Supplementary Material)  Publish data with a journal that will provide a persistent link to your dataset (e.g. DOI, handle)  Archive/Repository  Institutional (see above example)  Disciplinary (e.g. article & data)
  • 38. o Sharing Data o Publishing Data o Archiving Data  Access Management Good practices for archiving research data:  LOCKSS!  Archive documentation with data  Write costs for data management and archiving into your research budgets (and in some cases, proposals)  Define access policies including restrictions or embargos  Understand requirements for submission of data prior to project completion
  • 39. o Storage Options o Single points of failure o Backup Strategy o Project Documentation o Process Documentation o Data Documentation o Sharing Data o Publishing Data o Archiving Data Data Management Storage Architecture File Management Documentation Practices Access Management o File Organization o File Naming o File Formats
  • 40. Resources at the Library Research Data Management Guidance • Face-to-face consulting on RDM strategies • researchdata@mail.lib.msu.edu Tom Volkening • Engineering Librarian • volkenin@msu.edu Aaron Collie • Digital Curation Librarian • collie@msu.edu
  • 41. Questions?  Store – Three Copies on Three Disks in Three Locations  Organize – If you make a plan, you just might follow it.  Document – What would my colleagues need to know to understand this data?  Share – Data makes an impact  Slides are HERE: http://tiny.cc/yvdpqw Aaron Collie Digital Curation Librarian collie@msu.edu
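
The "Store" advice above (three copies in three locations, from the Backup Strategy slide) can be sketched in a few lines of Python. This is only a minimal illustration, not a tool from the presentation: the function names `sha256_of` and `back_up` and the directory names in the demo are assumptions made for the example, and a real backup would point at physically separate devices or managed network storage rather than folders on one disk.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(original: Path, backup_dirs: list[Path]) -> list[Path]:
    """Copy `original` into each backup directory and verify every copy."""
    expected = sha256_of(original)
    copies = []
    for directory in backup_dirs:
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / original.name
        shutil.copy2(original, target)  # copy2 also tries to preserve timestamps
        if sha256_of(target) != expected:
            raise IOError(f"checksum mismatch for {target}")
        copies.append(target)
    return copies


if __name__ == "__main__":
    # Demo inside a temporary directory; the folder names are hypothetical
    # stand-ins for an external drive and a remote server.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        data = root / "working" / "fieldNotes_v001.csv"
        data.parent.mkdir(parents=True)
        data.write_text("site,depth_m\nlakeLansing,3.2\n")
        copies = back_up(data, [root / "external_local", root / "external_remote"])
        print(f"original + {len(copies)} verified copies")
```

Comparing checksums after each copy is one simple way to confirm that a backup is readable and identical to the original, which speaks to the earlier warning not to rely on a copy simply because it exists.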

Editor's Notes

  1. National Oceanic and Atmospheric Administration (NOAA) IMLS encourages sharing of research data. Applications that develop digital products must fill out an additional form with ten questions focused on “Developing Data Management Plans for Research Projects.” The federal government has the right to obtain, reproduce, publish or otherwise use the data first produced under an award and authorize others to do so for government purposes. Ex: Digging Into Data
  2. HANDOUT: DMP (blue)
  3. Interpretation Content Carrier/computer file Network/file system Hard drive walknboston
  4. A single point of failure occurs when it would only take one event to destroy all data on a device (e.g. dropped hard drive)
  5. Simple File Plan Advanced Directory Manifest Git, Subversion Content Management Systems (CMS) Expert Data management systems (DMS)
  6. Choose a meaningful directory hierarchy Primary subject, Secondary subject, Tertiary subject Investigator, Process, Date Instrument, Date, Sample
  7. Good Practices for file naming: Meaningful & descriptive Capital letters or underscores differentiate between words Surname first followed by initials of first name Decide on a simple “versioning” method (e.g. file_v001) Use alphanumeric characters (e.g. abc123) Meaningful but short (255 character limit) Descriptive while still making sense More on handout NameOfStudy_Location_Date_FG#_transcribedby_NameOfTranscriber_v###.DOCX
  8. Good choices for file formats: Non-proprietary Open, documented standard Common usage by research community Standard representation (ASCII, Unicode) Unencrypted Uncompressed
  9. Simple README.txt Advanced Wikis Workflow diagrams Expert Project Management Metadata Standards Ontologies
  10. Shouldn’t I have already documented basic project information in an abstract or introduction in a paper or thesis? Yes, but this information is meant to be contextual information that can be used to better understand the data. It would accompany the data if shared. Sometimes called a project charter. Wikis, Git, or other version control systems can really turn this simple charter into an authoritative record of the research
  11. Why do I need to document the way I process and analyze data? Researchers will need detailed information to reuse or verify your data. Again, Methodology sections are not comprehensive
  12. Simple Email Website Collaboration Tools Advanced Networked Storage Expert Data Repository
  13. Scoop, not IRB approved, etc
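
The file naming practices in note 7 (alphanumeric parts joined by underscores, a simple `v###` version suffix, a 255-character limit) could be enforced with a small helper along these lines. This is a hedged sketch rather than anything from the presentation: the function name `build_file_name`, its parameters, and the part order (primary term, person, content, date, version) are assumptions chosen to reproduce the first example file name on the File Naming slide.

```python
import re
from datetime import date

MAX_NAME_LENGTH = 255  # limit cited in the notes for most modern operating systems


def build_file_name(primary, person, content, when, version, extension):
    """Assemble primary_person_content_YYYYMMDD_vNNN.ext from its parts."""
    parts = [primary, person, content, when.strftime("%Y%m%d"), f"v{version:03d}"]
    for part in parts:
        # Only letters and digits inside a part; underscores are separators.
        if not re.fullmatch(r"[A-Za-z0-9]+", part):
            raise ValueError(f"use alphanumeric characters only: {part!r}")
    name = "_".join(parts) + "." + extension
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError("file name exceeds 255 characters")
    return name


print(build_file_name("lakeLansing", "waltM", "fieldNotes", date(2009, 10, 12), 2, "doc"))
# prints: lakeLansing_waltM_fieldNotes_20091012_v002.doc
```

Putting the date in `YYYYMMDD` form and zero-padding the version number means that an ordinary alphabetical sort also orders files chronologically and by version.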