Data Management
LIS 653
Starr Hoffman
Data
What is (are) Data?
• Observations
  • Sensor data, telemetry, survey data, sample data
• Experiments
  • Gene sequences, chromatograms
• Simulations
  • Economic models
• Derivations/Compilations
  • Text mining, data from public documents
  • Documents & texts themselves = data
• Research Process
  • Observational conditions, experimental procedure, instrumentation, label descriptions, units, metadata
Librarian Roles & Data
Advisory
• Original data:
  • Consult on creating a data management plan (DMP)
  • Consult on data organization, methodology, etc.
  • Consult on metadata practices
  • Consult on archiving
  • Help disseminate research
    • Journal publication, open access (OA) resources, blogs, etc.
    • Deposit into repository (institutional repository, 3rd party, etc.)
• Secondary data:
  • Consult on methodology / analysis
  • Discovery…
Curatorial
• Manage IR (institutional repository)
• Create metadata for datasets
• Purchase / catalog / discovery for secondary data
What is Data Management?
Planning for the short-term and long-term:
• care of and
• access to
…your data.
Or: What are you going to do with that data?
• How will you describe it?
• How are you organizing it?
• After you’re done, where will you put it?
• How will you/others be able to access it? For how long?
Data Management: Why Does It Matter?
• Grant requirements
• Public access to funded research
• Validation
• Replication
• Re-use, continue research
• Teaching
• Natural disasters
• Computer failure or theft
• USB/hard drive failure or loss
• File corruption
Funding Requirements
• NSF:
  Proposals must include a supplementary document of no more than two pages labeled “Data Management Plan” …describe how the proposal will conform to NSF policy on the dissemination and sharing of research results.
• NIH:
  The NIH expects and supports the timely release and sharing of final research data… for use by other researchers. …expected to include a plan for data sharing or state why data sharing is not possible.
• NEH Office of Digital Humanities
• NOAA
• IMLS
• NIJ
DMP Considerations
• What data types, from what sources, in what formats will this project produce? How much of it will there be?
• How will you describe or document your data? Are there standards you will be using for this?
• Will you be sharing your data? Do you have the rights to share the data? What did you tell the IRB (institutional review board)?
• How often do you need to back up your files? How do you need to be able to access your files? How many backups will you have?
• How much storage space do you need? What is your budget for your storage?
• Where are you going to archive or store the data? And how will it be accessed?
• What are the roles and responsibilities around all of these things? I.e., who's going to be doing all this?
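The storage and backup questions above come down to simple arithmetic. The following is a minimal sketch, not a budgeting tool: the file count, average file size, number of backup copies, and per-gigabyte price are all invented figures, and the function names are hypothetical.

```python
def storage_needed_gb(n_files, avg_file_mb, backup_copies):
    """Space for the working copy plus every backup copy, in GB."""
    total_copies = 1 + backup_copies  # working copy + backups
    return n_files * avg_file_mb * total_copies / 1000  # 1 GB = 1000 MB


def yearly_cost(total_gb, price_per_gb_year):
    """Storage budget for one year at a flat (hypothetical) per-GB price."""
    return total_gb * price_per_gb_year


# Hypothetical project: 200 interview recordings of about 50 MB each,
# kept as one working copy and two backups.
total = storage_needed_gb(n_files=200, avg_file_mb=50, backup_copies=2)
print(total)                    # 30.0
print(yearly_cost(total, 0.5))  # 15.0
```

Running the numbers before writing the plan makes it easier to answer the budget question with a figure rather than a guess.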
DMP Examples
Planning the Data Lifecycle
Consider…
• Files:
  • Size, format, organization
• Security
• Storage/Backup system
• Retention
• Access/Transparency
Data Lifecycle: Create / Analyze / Edit
• File Management
  • Consistency, brevity, description
  • Versioning (v01, v02, FINAL)
  • Avoid spaces
Directory structure
/[Project]/[Grant Number]/[Event]/[Date]
File naming
[description]_[instrument]_[location]_[YYYYMMDD].[ext]
• Transparency/Sharing
  • Document data: codebook, metadata
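The versioning labels above (v01, v02) can be assigned automatically rather than by hand. This is a minimal sketch under two assumptions that the slide does not state: the label is appended to the file stem as `_vNN`, and the helper name `next_version_name` is hypothetical.

```python
import re

# Matches a trailing numeric version label such as "_v02" right before the extension.
VERSION_LABEL = re.compile(r"_v(\d+)\.[^.]+$")


def next_version_name(existing_names, stem, ext):
    """Return the next free 'stem_vNN.ext' name given the names already in use."""
    highest = 0
    for name in existing_names:
        match = VERSION_LABEL.search(name)
        # Only count files that belong to this stem and carry a numeric label.
        if match and name.startswith(stem + "_v"):
            highest = max(highest, int(match.group(1)))
    return f"{stem}_v{highest + 1:02d}.{ext}"


existing = [
    "Leadership_Survey_v01.doc",
    "Leadership_Survey_v02.doc",
    "Leadership_Survey_FINAL.doc",  # no number, so the counter ignores it
]
print(next_version_name(existing, "Leadership_Survey", "doc"))
# Leadership_Survey_v03.doc
```

Zero-padding the number to two digits keeps the files in version order when a folder is sorted by name, at least up to v99.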
File Structure & Naming Examples
Directory Structure
/[Project]/[Grant Number]/[Event]/[Date]
• /NYCPhysicalActivity/NOT-MH-14-033/Interview/20141109
• /Dissertation/LitReview/LibraryLeadership/
File Naming
[description]_[instrument]_[location]_[YYYYMMDD].[ext]
• PhysicalActivity_InterviewQs_PS193_20141109.doc
• PhysicalActivity_InterviewResponses_20141022.xls
• LibraryLeadershipHenson_Article_2011.pdf
• Leadership_Survey_20130917.doc
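Names that follow the two patterns above can be generated rather than typed, which helps keep them consistent. The sketch below is one possible approach, not a prescribed tool: the helper names (`compact`, `stamp`, `directory`, `file_name`) are hypothetical, and removing spaces by CamelCasing is an assumption drawn from the examples.

```python
from datetime import date
from pathlib import PurePosixPath


def compact(text):
    """Remove spaces by CamelCasing words, leaving '_' as the field separator."""
    words = text.replace("_", " ").split()
    return "".join(word[0].upper() + word[1:] for word in words)


def stamp(day):
    """Format a date as YYYYMMDD so names sort chronologically."""
    return day.strftime("%Y%m%d")


def directory(project, grant_number, event, day):
    """Build /[Project]/[Grant Number]/[Event]/[Date]."""
    return PurePosixPath("/", compact(project), grant_number, compact(event), stamp(day))


def file_name(description, instrument, location, day, ext):
    """Build [description]_[instrument]_[location]_[YYYYMMDD].[ext]."""
    fields = [compact(description), compact(instrument), compact(location), stamp(day)]
    return "_".join(fields) + "." + ext.lstrip(".")


day = date(2014, 11, 9)
print(directory("NYC physical activity", "NOT-MH-14-033", "interview", day))
# /NYCPhysicalActivity/NOT-MH-14-033/Interview/20141109
print(file_name("physical activity", "interview Qs", "PS193", day, ".doc"))
# PhysicalActivity_InterviewQs_PS193_20141109.doc
```

Both calls reproduce the first directory and file examples listed above.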
Metadata & Description
• Variables: labels, meaning, how they were measured, units, codes
• Survey questions
• Experimental procedures
• Research methodology
• Statistical analyses performed
• Preferred data citation
  • Pew Hispanic Center. (2008). 2007 Hispanic Healthcare Survey [Data file and code book]. Retrieved from http://pewhispanic.org/datasets/
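A preferred citation is easier to keep consistent when it is generated from the dataset's descriptive fields. The sketch below follows the shape of the Pew example above. The class and field names are hypothetical, and it makes no claim to cover every case of a formal citation style.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    creator: str
    year: int
    title: str
    url: str
    description: str = "Data file and code book"

    def citation(self):
        """Creator. (Year). Title [Description]. Retrieved from URL"""
        return (
            f"{self.creator}. ({self.year}). {self.title} "
            f"[{self.description}]. Retrieved from {self.url}"
        )


survey = Dataset(
    creator="Pew Hispanic Center",
    year=2008,
    title="2007 Hispanic Healthcare Survey",
    url="http://pewhispanic.org/datasets/",
)
print(survey.citation())
# Pew Hispanic Center. (2008). 2007 Hispanic Healthcare Survey [Data file and code book]. Retrieved from http://pewhispanic.org/datasets/
```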
Codebook Examples
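A codebook can also be kept in machine-readable form alongside the data. This is a minimal sketch using an invented three-variable survey: every variable name, label, and code below is hypothetical and not taken from the slides. It records each variable's label, how it was measured, its units, and its codes, then uses those entries to turn raw values into readable ones.

```python
# Hypothetical codebook: one entry per variable in the data file.
codebook = {
    "age": {
        "label": "Age of respondent",
        "measured": "self-reported",
        "units": "years",
        "codes": None,
    },
    "activity": {
        "label": "Days of exercise in the past week",
        "measured": "survey question 4",
        "units": "days",
        "codes": {99: "no answer"},
    },
    "borough": {
        "label": "Borough of residence",
        "measured": "survey question 1",
        "units": None,
        "codes": {1: "Bronx", 2: "Brooklyn", 3: "Manhattan", 4: "Queens", 5: "Staten Island"},
    },
}


def describe(variable, value):
    """Translate a raw value into a labelled, human-readable statement."""
    entry = codebook[variable]
    codes = entry["codes"] or {}
    if value in codes:
        text = codes[value]                  # coded category or missing-value code
    elif entry["units"]:
        text = f"{value} {entry['units']}"   # plain measurement with units
    else:
        text = str(value)
    return f"{entry['label']}: {text}"


print(describe("borough", 2))    # Borough of residence: Brooklyn
print(describe("activity", 99))  # Days of exercise in the past week: no answer
print(describe("age", 34))       # Age of respondent: 34 years
```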
Data Lifecycle: Publish, Store, Access, Reuse
• File size & format
  • Open vs. proprietary
• Security
  • Anonymize or encrypt?
  • Levels may vary by access (organization vs. third party)
• Data Citation
• Sharing
  • Upload data & metadata
  • Institutional repository, data center, etc.
  • Persistent identifier
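As one illustration of the "anonymize" option above, direct identifiers can be dropped and participant IDs replaced with keyed hashes before a file is shared. This is a sketch under stated assumptions, not a complete de-identification procedure: the field names and the secret key are hypothetical. Strictly speaking, the technique is pseudonymization, since remaining fields may still allow re-identification.

```python
import hashlib
import hmac


def pseudonym(participant_id, secret_key, length=12):
    """Derive a stable pseudonym from an ID using HMAC-SHA256 and a secret key."""
    digest = hmac.new(secret_key, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]


def pseudonymize(rows, secret_key, id_field="participant_id", drop=("name", "email")):
    """Return copies of the rows without direct identifiers and with hashed IDs."""
    cleaned = []
    for row in rows:
        kept = {key: value for key, value in row.items() if key not in drop}
        kept[id_field] = pseudonym(str(row[id_field]), secret_key)
        cleaned.append(kept)
    return cleaned


# Hypothetical records; the key would be stored separately from the shared file.
key = b"keep-this-key-out-of-the-repository"
rows = [
    {"participant_id": "P001", "name": "A. Example", "email": "a@example.org", "score": 4},
    {"participant_id": "P002", "name": "B. Example", "email": "b@example.org", "score": 2},
]
for row in pseudonymize(rows, key):
    print(row)
```

The same ID always maps to the same pseudonym under a given key, so records can still be linked across files without exposing the original identifier.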
Institutional Repositories
Dataset Record in IR
Other places data can live…
• Figshare
• ICPSR
• GitHub
• DataUp
• Dropbox (or other cloud storage), IF you use proper encryption
Lists of data repositories:
• DataCite
• DataBib
Data Discovery
• Data Depositories (previous slide)
  • ICPSR
  • Figshare
• Institutional Repositories
  • OpenDOAR (directory)
  • Specific institutions
• Data Catalogs
  • Numeric Data Catalog (Columbia)
  • GeoData (Columbia, others)
• Gov & Public Sources (data producers)
  • NYC OpenData
  • Data.gov
  • Census Bureau
  • Bureau of Labor Statistics
  • IMLS (Institute of Museum & Library Services)
Replicated Data 
And Finally… 
Geeky puns.

 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Role Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxRole Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptx
 

LIS 653, Session 11: Data Management & Curation

  • 1. Data Management LIS 653 Starr Hoffman
  • 3. What is (are) Data?
      • Observations: sensor data, telemetry, survey data, sample data
      • Experiments: gene sequences, chromatograms
      • Simulations: economic models
      • Derivations/Compilations: text mining, data from public documents
      • Documents & texts themselves = data
      • Research Process: observational conditions, experimental procedure, instrumentation, label descriptions, units, metadata
  • 4. Librarian Roles & Data
      • Advisory
          • Original data: consult on creating DMP; consult on data organization, methodology, etc.; consult on metadata practices; consult on archiving
          • Help disseminate research: journal publication, OA resources, blogs, etc.; deposit into repository (IR, 3rd party, etc.)
          • Secondary data: consult on methodology / analysis; discovery…
      • Curatorial
          • Manage IR (institutional repository)
          • Create metadata for datasets
          • Purchase / catalog / discovery for secondary data
  • 5. What is Data Management? Planning for the short-term and long-term care of and access to your data. Or: What are you going to do with that data?
      • How will you describe it?
      • How are you organizing it?
      • After you’re done, where will you put it?
      • How will you/others be able to access it? For how long?
  • 6. Data Management: Why Does it Matter?
      • Grant requirements
      • Public access to funded research
      • Validation
      • Replication
      • Re-use, continue research
      • Teaching
      • Natural disasters
      • Computer failure/stolen
      • USB/hard drive failure/lost
      • Files corrupted
  • 7. Funding Requirements
      • NSF: Proposals must include a supplementary document of no more than two pages labeled “Data Management Plan” …describe how the proposal will conform to NSF policy on the dissemination and sharing of research results.
      • NIH: The NIH expects and supports the timely release and sharing of final research data… for use by other researchers. …expected to include a plan for data sharing or state why data sharing is not possible.
      • NEH Office of Digital Humanities
      • NOAA
      • IMLS
      • NIJ
  • 8. DMP Considerations
      • What data types, from what sources, in what formats will this project produce? How much of it will there be?
      • How will you describe or document your data? Are there standards you will be using for this?
      • Will you be sharing your data? Do you have the rights to share the data? What did you tell the IRB?
      • How often do you need to back up your files? How do you need to be able to access your files? How many backups will you have?
      • How much storage space do you need? What is your budget for your storage?
      • Where are you going to archive or store the data, and how will it be accessed?
      • What are the roles and responsibilities around all of these things? I.e., who's going to be doing all this?
  • 10. Planning the Data Life-Cycle. Consider…
      • Files: size, format, organization
      • Security
      • Storage/Backup system
      • Retention
      • Access/Transparency
  • 11. Data Lifecycle: Create / Analyze / Edit
      • File Management
          • Consistency, brevity, description
          • Versioning (v01, v02, FINAL)
          • Avoid spaces
          • Directory structure: /[Project]/[Grant Number]/[Event]/[Date]
          • File naming: [description]_[instrument]_[location]_[YYYYMMDD].[ext]
      • Transparency/Sharing
          • Document data: codebook, metadata
  • 12. File Structure & Naming Examples
      • Directory structure: /[Project]/[Grant Number]/[Event]/[Date]
          • /NYCPhysicalActivity/NOT-MH-14-033/Interview/20141109
          • /Dissertation/LitReview/LibraryLeadership/
      • File naming: [description]_[instrument]_[location]_[YYYYMMDD].[ext]
          • PhysicalActivity_InterviewQs_PS193_20141109.doc
          • PhysicalActivity_InterviewResponses_20141022.xls
          • LibraryLeadershipHenson_Article_2011.pdf
          • Leadership_Survey_20130917.doc
  • 13. Metadata & Description
      • Variables: labels, meaning, how they were measured, units, codes
      • Survey questions
      • Experimental procedures
      • Research methodology
      • Statistical analyses performed
      • Preferred data citation
          • Pew Hispanic Center. (2008). 2007 Hispanic Healthcare Survey [Data file and code book]. Retrieved from http://pewhispanic.org/datasets/
  • 16. Data Lifecycle: Publish, Store, Access, Reuse
      • File size & format
          • Open vs. proprietary
      • Security
          • Anonymize or encrypt?
          • Levels may vary by access (org. vs. 3rd party)
      • Data Citation
      • Sharing
          • Upload data & metadata
          • Institutional repository, data center, etc.
          • Persistent identifier
  • 22. Other places data can live…
      • Figshare
      • ICPSR
      • GitHub
      • DataUp
      • Dropbox (or other cloud storage), IF you use proper encryption
      • Lists of data repositories: DataCite, DataBib
  • 23. Data Discovery
      • Data Depositories (previous slide): ICPSR, Figshare
      • Institutional Repositories: OpenDOAR (directory), specific institutions
      • Data Catalogs: Numeric Data Catalog (Columbia), GeoData (Columbia, others)
      • Gov & Public Sources (data producers): NYC OpenData, Data.gov, Census Bureau, Bureau of Labor Statistics, IMLS (Institute of Museum & Library Services)
  • 24. Replicated Data And Finally… Geeky puns.
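
The directory and file naming patterns from slides 11 and 12 can be applied consistently with a small helper. The following is only a minimal sketch in Python: the function names `build_file_name` and `build_directory` are hypothetical, and treating `instrument` and `location` as optional is an assumption based on the slide examples, some of which omit those parts.

```python
from datetime import date
from pathlib import PurePosixPath


def build_file_name(description, instrument=None, location=None, when=None, ext="txt"):
    """Build [description]_[instrument]_[location]_[YYYYMMDD].[ext].

    Missing parts are skipped, and spaces are dropped from each part
    (the slides advise avoiding spaces in file names).
    """
    parts = [description, instrument, location]
    if when is not None:
        parts.append(when.strftime("%Y%m%d"))
    cleaned = [part.replace(" ", "") for part in parts if part]
    return "_".join(cleaned) + "." + ext.lstrip(".")


def build_directory(project, grant_number, event, when):
    """Build /[Project]/[Grant Number]/[Event]/[Date] as a POSIX-style path."""
    return PurePosixPath("/", project, grant_number, event, when.strftime("%Y%m%d"))


if __name__ == "__main__":
    interview_day = date(2014, 11, 9)
    print(build_directory("NYCPhysicalActivity", "NOT-MH-14-033", "Interview", interview_day))
    print(build_file_name("PhysicalActivity", "InterviewQs", "PS193", interview_day, "doc"))
```

Generating names from one function, rather than typing them by hand, is one way to follow the "pick a system, write it down, and stick with it" advice.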

Editor's Notes

  1. Data isn’t just for science! Lots of fields collect lots of kinds of data…
      -- Quantitative data: usually numbers, but anything that’s quantifiable (survey data, maybe)
      -- Qualitative data: interviews, some surveys
      -- Ethnographic: observations of space use
      -- Maps
      -- Photographs
      -- Sound recordings, video
  2. Kinds of data:
      -- Observations, e.g.: sensor data, telemetry, survey data, sample data, neuroimages
      -- Experiments, e.g.: gene sequences, chromatograms, toroid magnetic field data
      -- Simulations, e.g.: climate models, economic models
      -- Derivations / Compilations, e.g.: text and data mining, compiled database, 3D models, data gathered from public documents
      -- Research process, e.g.: observational conditions, experimental procedure, instrumentation, label descriptions, units
     *http://www.nytimes.com/2011/10/06/science/06nobel.html?_r=0
     “when you get to see everyone’s mistakes you can tell the mistake from a pattern” (Micah Altman at RDS2013)
      -- Observational Data/Media: real time capture; usually irreplaceable
      -- Derivation/Compilation Data: reproducible; expensive
      -- Research Process Data: data documentation & description (aka ‘metadata’); analysis algorithms & codes
      -- Simulation Data: model & inputs are usually the important elements
  3. So what are librarian roles, related to data?
      -- advisory
      -- curatorial
     Crossover of public services & technical services…
      -- metadata (TS)
      -- subject expertise (either; often PS)
      -- methodology/analysis (may be PS)
      -- data mgmt (TS, PS, or Scholarly Communication dept.)
      -- IRs… (digital services, schol comm; could be outside the libraries!)
      -- secondary data (usually PS)
  4. Some question prompts – librarian might work through this checklist with a researcher -- help shape their DMP
  5. DMPs include things like: -- kinds of data -- data collection methods -- hardware, software -- archival plans…
  6. These are things that the librarian may consult with a researcher on… Consider:
      -- Files: size, format, organization; open format vs. proprietary
      -- Security: Be vigilant and protective of data in your custody. Observe precautions when transporting paper files with patient or individually identifiable information. Do not leave documents or USB drives in unsecured locations (cars, lockers). Encrypt information (email, cloud storage). Who needs to access this data: institutional, 3rd party access? Before sharing data, anonymize individually identifying information and be aware of geo-location tags.
      -- Storage/backup system: checksum validation; test to be sure backups are working properly
      -- Retention: How long does this data need to be accessible/stored? (Grant requirements, your own plans, others.) Is it reproducible or not?
      -- Access/Transparency: Do you (or others) need to access it frequently, or only periodically? Document the data (processes, labels, etc.). Deposit in repository (for public access).
  7. Consistency: Pick a system, write it down, & stick with it
      -- Identify necessary elements
      -- Create brief, understandable names
      -- Date: YYYY-MM-DD
      -- Version: v01, v02,…FINAL
      -- In general, try to stay away from spaces in filenames as well as the following characters: . \ / : * ? " < > | [ ] & $
  8. Metadata is important: -- to help the researcher understand and remember their data (variables!) -- to help others find that data (for validation, replication, reuse) -- replication in particular = GOOD metadata!
  9. Codebook is an essential part of metadata for data… Tells what each variable is, what it’s called, how it was measured, what units it’s in, how it’s coded…
  10. Codebooks can also include survey questions asked to obtain the variables. This is also important in Data Reference work w/ secondary data…
  11. Both funders and some journals require data sharing at time of publication / w/in 12 months So, when it’s all done… where do we PUT this stuff?
  12. More places to store original data….
  13. So from the other end —as a researcher wanting to use secondary data (already collected) --or a student wanting to do data analysis… Where do we FIND data?
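
The notes on storage and backup mention checksum validation as a way to test that backups are working. As a rough illustration only, the Python sketch below compares SHA-256 checksums of an original file and its backup copy. The function names `sha256_of` and `backup_matches` are hypothetical, and SHA-256 is an assumed choice; the notes do not name a specific algorithm or tool.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original_path, backup_path):
    """True when the backup copy has the same checksum as the original file."""
    return sha256_of(original_path) == sha256_of(backup_path)
```

A researcher might record the checksum of each file at deposit time and re-run the comparison periodically; a mismatch suggests the copy has been corrupted or altered.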