The document discusses data management principles and best practices. It covers topics such as data quality, security, organization, and the data lifecycle. Effective data management requires following principles like ensuring data is accurate, consistent, complete, up-to-date, and unambiguous. It also requires proper data backup, storage, capturing, transfer between systems, analytics, and presentation to end users. Planning is important to integrate data management across all stages from initial collection through final analysis and sharing.
Introduction to data management, terminologies and use of data management platforms
1. www.iita.org | A member of CGIAR consortium
Workshop on Management and Analyses of ISFM Data
Monday, May 25, 2015
2. Data management
"Data management is the development,
execution and supervision of plans, policies,
programs and practices that control, protect,
deliver and enhance the value of data and
information assets."
(DAMA, Data Management Association International)
3. Data management
Objective:
• To maximize the potential of data while integrating them into business processes
Topics:
• Data quality
• Data security
• Data organization
4. Data management principles: Data quality
• Data are correct
• Data are consistent (uniform in content, content structure, notation, units, methods used, meaning, language)
• Data are complete
• Data are up to date
• Data are relevant
• Data are precise enough
• Datasets are free of redundancies
• Data are reliable and comprehensible
• Data are understandable by all involved users and processible by machines
• Data are unambiguous/explicit
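A minimal sketch of how two of the principles above (completeness and freedom from redundancy) could be checked automatically. The field names `plot` and `yield_kg` and the sample rows are hypothetical; real checks would follow the project's own data protocol.

```python
from typing import Any


def check_quality(rows: list[dict[str, Any]], required: list[str]) -> dict[str, list[int]]:
    """Return the indexes of rows that fail simple quality checks."""
    problems: dict[str, list[int]] = {"incomplete": [], "duplicate": []}
    seen: set[tuple] = set()
    for index, row in enumerate(rows):
        # Completeness: every required field is present and non-empty.
        if any(row.get(field) in (None, "") for field in required):
            problems["incomplete"].append(index)
        # Redundancy: an identical record has already been seen.
        key = tuple(sorted(row.items()))
        if key in seen:
            problems["duplicate"].append(index)
        seen.add(key)
    return problems


# Hypothetical sample data: row 1 is incomplete, row 2 repeats row 0.
rows = [
    {"plot": "A1", "yield_kg": 12.5},
    {"plot": "A2", "yield_kg": None},
    {"plot": "A1", "yield_kg": 12.5},
]
report = check_quality(rows, required=["plot", "yield_kg"])
```

The other principles (relevance, precision, comprehensibility) generally need human judgement and are not covered by this sketch.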
5. Data management principles: Data security
• All data need frequent backups
• No data without access permission control
• Treatment of data of different ownership (private) is clarified
6. Data management principles: Data organization
• There is no data without a person responsible for it (clear roles & responsibilities)
• There is no data without one clearly defined, easy-to-find and communicated location for it
7. Main roles in data management
• Data Editor: The person who validates, creates and edits the data
• Data Steward: The person who holds the data and usually takes care of it, ensuring that data consumers obtain exactly the data approved by the data owner
• Data Owner: The person who approves data before it is published for the eventual audience
• Data Consumer: A person who uses the data without editing, correcting or modifying it
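One possible way to express these roles as an access permission table. This is a sketch only: the action names (`create`, `approve`, `deliver`, ...) are hypothetical labels derived from the role descriptions above, not part of any particular platform.

```python
from enum import Enum


class Role(Enum):
    EDITOR = "data editor"
    STEWARD = "data steward"
    OWNER = "data owner"
    CONSUMER = "data consumer"


# Hypothetical action names; the mapping mirrors the role descriptions.
PERMISSIONS: dict[Role, set[str]] = {
    Role.EDITOR: {"read", "create", "edit", "validate"},
    Role.STEWARD: {"read", "hold", "deliver"},
    Role.OWNER: {"read", "approve"},
    Role.CONSUMER: {"read"},
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if the given role may perform the given action."""
    return action in PERMISSIONS[role]
```

For example, `is_allowed(Role.CONSUMER, "edit")` is false, which matches the definition of a data consumer as someone who uses data without modifying it.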
8. Operational levels
• Individual
(Execution of data activities, self-organizing)
• Project/working group
(Plans & deliveries, rules & responsibilities, workflow & steering, communication, access/permission control, data organizing (content mgt./file order, file naming strategies, templates, project data…))
• Organization
(Policies, infrastructure & repositories, resources, …)
• Global
(Metadata standards, data exchange protocols, vocabularies/ontologies, legal issues, Open Access, …)
9. Data lifecycle
Data lifecycle activities (diagram; source: Boston University Libraries):
• design research; plan data management (formats, storage etc.); plan consent for sharing; locate existing data; collect data (experiment, observe, measure, simulate)
• enter data, digitize, transcribe, translate; check, validate, clean data; anonymize data where necessary; describe data
• interpret data; derive data (apply statistical and analytical methods); produce research outputs; author publications
• migrate data to best format; migrate data to suitable medium; back-up and store data; create metadata and documentation; archive data
• distribute data; share data; control access; establish copyright; promote data
• follow-up research; undertake research reviews; scrutinize findings; teach and learn
Additional labels on the diagram: Identify (tracking); Categorize; Locate, explore and understand data; Exposing metadata through a searchable interface
10. Data intervention areas
Data capturing and preprocessing
Data transfer
Data flow/content mgt.
Data storage
Data analytics
Data delivery
12. Plan data management
Find answers to
• ensure all data mgt. principles are respected
• in and across all intervention areas
• at all operational levels
Start planning from the desired outcomes!
(Diagram axes: Data management principles; Data lifecycle / intervention areas)
13. Data presentation/publication
• Who are the end users of which data?
• Mode of presentation per information product
• Ease of extraction of the right data in the right format for the right (authorized) people
• Automated? Real-time data? Personalized data?
• Consumers' conditions (file formats? communication tools?)
• Ability to search & browse (metadata, tags)
• Presentation mode and conditions (including visualization)
• Licensing
14. Data transfer
• Transfer format and requirements (data transformation needed?)
• Transfer initiative (receiver or sender?)
• Transfer mode and instructions
• Transfer compression needs (zip, tar…), limited internet availability?
• Transfer channels (email, phone, Skype, RSS etc.)
• Transfer check (e.g. by email)
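The compression and transfer-check points above could be sketched as follows: the sender zips a file and records a SHA-256 checksum, and the receiver recomputes the checksum to confirm the archive arrived intact. The file names and sample content are hypothetical, and the checksum would be communicated over a separate channel (for instance the confirmation email).

```python
import hashlib
import tempfile
import zipfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def pack_for_transfer(source: Path, archive: Path) -> str:
    """Sender side: compress the file and return the archive checksum."""
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as bundle:
        bundle.write(source, arcname=source.name)
    return sha256_of(archive)


def verify_transfer(archive: Path, expected_checksum: str) -> bool:
    """Receiver side: confirm the received archive matches the checksum."""
    return sha256_of(archive) == expected_checksum


# Hypothetical demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    data_file = Path(workdir) / "field_data.csv"
    data_file.write_text("plot,yield_kg\nA1,12.5\n", encoding="utf-8")
    archive = Path(workdir) / "field_data.zip"
    checksum = pack_for_transfer(data_file, archive)
    transfer_ok = verify_transfer(archive, checksum)
```

A checksum only detects accidental corruption; it does not replace the authorization and encryption measures needed for transfer security.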
15. Data transfer
• Transfer security
• Platform Openness
• Authorization Controls (user credentials)
• Encryption Standards (SSL, S/MIME etc.)
• Transfer scheduling
• Use of APIs?
16. Data storage
• Suitable end repository
(server folder, SharePoint, MySQL database, cloud-based solution, PC, external repository)
• Suitable data infrastructure hardware
(servers, network(s), bandwidth, databases, security facilities, PCs, external hard drives, USB sticks, smartphones/tablets, scanners, field or laboratory sensors with digital data capturing, etc.)
• Data categorization, file order, filing order criteria
• Data deletion policy and archiving for evidence/documentation purposes
• Data disposal/sharing/access control + administration
17. Data analytics and data search
• Goal and mode of analysis
• Frequency of data analysis
• Participating units and data integration (business intelligence)
• Storage and backup of analysis results
• Speed of search
• Eventual transition or termination of the data?
18. Data backup
• Risk assessment: loss/theft/damage/overload/hacker attack…
• Backup mode and regulations
• Backup frequency/scheduling and discipline
• Suitable backup repository (server folder, SharePoint, MySQL database, cloud, PC, external repository, external hard drive, USB stick etc.)
• Backup tool/software/opportunities to automate
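As an illustration of automating backups, the sketch below copies a file into a backup folder under a timestamped name and keeps only the most recent copies. The function name, the retention count and the file names are assumptions made for the example; scheduling (cron, Task Scheduler) is left out.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def backup_file(source: Path, backup_dir: Path, when: datetime, keep: int = 3) -> Path:
    """Copy source into backup_dir with a timestamp prefix; keep the newest copies."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = when.strftime("%Y-%m-%dT%H%M%S")
    target = backup_dir / f"{stamp}_{source.name}"
    shutil.copy2(source, target)
    # Timestamps in this format sort lexicographically, oldest first.
    copies = sorted(backup_dir.glob(f"*_{source.name}"))
    for old in copies[:-keep]:
        old.unlink()
    return target


# Hypothetical demonstration: four daily backups, three retained.
with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "trial_data.csv"
    source.write_text("plot,yield_kg\nA1,12.5\n", encoding="utf-8")
    backups = Path(workdir) / "backups"
    for day in (1, 2, 3, 4):
        backup_file(source, backups, datetime(2015, 5, day, 18, 0, 0), keep=3)
    remaining = sorted(path.name for path in backups.iterdir())
```

In practice the backup folder should sit on a different device or service than the original, in line with the risk assessment above.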
19. Data capturing and preprocessing
• Capturing location and its conditions
• Capturing mode (manual typing, crowdsourcing, data mining, etc.)
• Capturing tools/hardware (PC, smartphones, tablets, GPS, mobile phone, scanners etc.)
• Capturing software and requirements (field data capturing tools, scanning & OCR software, etc.)
• Capturing instructions (metadata, data protocols, additional data descriptions, methodological correctness)
• Data validation rules + data checks: ensuring data quality
• Referencing captured data in time & space
• Data structure at capturing
• Intermediate storage of captured data
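A small sketch of validation rules applied at capture time, including the reference in time (timestamp) and space (coordinates). The record fields `captured_at`, `latitude`, `longitude` and `yield_kg` are hypothetical; actual rules depend on the capture protocol.

```python
from datetime import datetime


def validate_record(record: dict) -> list[str]:
    """Return a list of validation errors for one captured record."""
    errors = []
    # Reference in time: an ISO 8601 timestamp is expected.
    try:
        datetime.fromisoformat(record.get("captured_at", ""))
    except (TypeError, ValueError):
        errors.append("captured_at must be an ISO 8601 timestamp")
    # Reference in space: coordinates must be numeric and within range.
    latitude = record.get("latitude")
    longitude = record.get("longitude")
    if not isinstance(latitude, (int, float)) or not -90 <= latitude <= 90:
        errors.append("latitude must be between -90 and 90")
    if not isinstance(longitude, (int, float)) or not -180 <= longitude <= 180:
        errors.append("longitude must be between -180 and 180")
    # Plausibility rule for the measured value (hypothetical field).
    value = record.get("yield_kg")
    if not isinstance(value, (int, float)) or value < 0:
        errors.append("yield_kg must be a non-negative number")
    return errors


good = {"captured_at": "2015-05-25T10:30:00", "latitude": 7.5,
        "longitude": 3.9, "yield_kg": 12.5}
bad = {"captured_at": "25/05/2015", "latitude": 123.4,
       "longitude": 3.9, "yield_kg": -1}
good_errors = validate_record(good)
bad_errors = validate_record(bad)
```

Running such checks on the capture device gives the data editor immediate feedback, before records reach intermediate storage.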
20. Platforms
• MS SharePoint
• CKAN
• aWhere
• Collaboration tools
• File sharing services (Google Drive, Dropbox, FTP server, etc.)
21. Data mgt. platforms (1)
MS SharePoint
• Fits into an existing Microsoft environment
(MS Office (especially Outlook, Excel, Access, Visio, Project), MS Server databases, Exchange server, Skype)
• With proper permission settings, allows creating as many pages, apps or subsites as necessary
• Useful features for data mgt.
(metadata tagging, version control, templates (MS Office only), validation rules, linking data lists, workflows (approvals etc.), many predefined apps come with customizable metadata sets)
• Weak: issues linking open repositories
22. Data mgt. platforms (2)
CKAN – “Meta-repository”
• Functional emphasis: de facto standard software for publishing open data
(started as a catalogue for harvesting published data spread of knowledge)
• Python based (DKAN in PHP)
• Strength: customizable, data organization, harvesting multiple repositories
• Weak: no workflow or bulk operations (processing needs to be done before cataloguing); no collaboration tools; no upload of multiple resources at a time and no batch editing of metadata
• Example: http://data.ilri.org/portal/
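CKAN sites normally expose an Action API, and the sketch below builds a `package_search` request against it. It assumes that the example portal serves the API under its base URL, which may not hold for that site; the search term `maize` is hypothetical, and the network call is defined but not executed here.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen


def package_search_url(site_url: str, query: str, rows: int = 10) -> str:
    """Build the URL of a CKAN Action API dataset search."""
    endpoint = urljoin(site_url.rstrip("/") + "/", "api/3/action/package_search")
    return endpoint + "?" + urlencode({"q": query, "rows": rows})


def search_datasets(site_url: str, query: str, rows: int = 10) -> list[str]:
    """Query the catalogue and return the titles of matching datasets."""
    with urlopen(package_search_url(site_url, query, rows), timeout=30) as response:
        payload = json.load(response)
    return [dataset["title"] for dataset in payload["result"]["results"]]


# Only the URL is built here; calling search_datasets() needs network access.
url = package_search_url("http://data.ilri.org/portal/", "maize", rows=5)
```

Harvesting several repositories amounts to running such queries against each catalogue and merging the returned metadata.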
23. Data mgt. platforms (3a)
ILRI dataset portal based on CKAN
24. Data mgt. platforms (3b)
ILRI dataset portal based on CKAN
25. Data mgt. platforms (4)
aWhere
• Functional emphasis: (geo)data exploration
• Strength: easy-to-use platform to explore data from xls or ODK as tables, diagrams or maps and in connection with data from other users, the library and the weather module
• Weak: xls only; collaboration functionality
• More by Hannah and Courtney
26. Data mgt. platforms (5)
Collaboration tools - Basecamp
• Functional emphasis: collaboration with many different partners in projects
• Strength: easy-to-use platform with typical collaboration tools (file sharing + tagging, calendar, wiki, task tracking)
• Weak: not customizable, no data linkage to databases
27. Data mgt. platforms (6)
File sharing services – Google Drive
• Functional emphasis: synchronized working on office apps in the cloud
• Strength: data sharing and synchronizing, widely known, easy to use
• Weak: not customizable, no data linkage to databases, Google account necessary; adverts
29. File naming strategies
Order by date:
2013-04-12_interview-recording_THD.mp3
2013-04-12_interview-transcript_THD.docx
2012-12-15_interview-recording_MBD.mp3
2012-12-15_interview-transcript_MBD.docx
Order by subject:
MBD_interview-recording_2012-12-15.mp3
MBD_interview-transcript_2012-12-15.docx
THD_interview-recording_2013-04-12.mp3
THD_interview-transcript_2013-04-12.docx
Order by type:
Interview-recording_MBD_2012-12-15.mp3
Interview-recording_THD_2013-04-12.mp3
Interview-transcript_MBD_2012-12-15.docx
Interview-transcript_THD_2013-04-12.docx
Forced order with numbering:
01_THD_interview-recording_2013-04-12.mp3
02_THD_interview-transcript_2013-04-12.docx
03_MBD_interview-recording_2012-12-15.mp3
04_MBD_interview-transcript_2012-12-15.docx
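The naming patterns above can be generated rather than typed by hand. The sketch below is one way to do this; the function name and its parameters are hypothetical. Because the date is written as YYYY-MM-DD, a plain alphabetical sort of date-first names is also a chronological sort.

```python
from datetime import date


def build_name(day: date, subject: str, kind: str, extension: str,
               order: str = "date") -> str:
    """Assemble a file name whose leading element determines the sort order."""
    parts = {"date": day.isoformat(), "subject": subject, "type": kind}
    sequence = {
        "date": ("date", "type", "subject"),
        "subject": ("subject", "type", "date"),
        "type": ("type", "subject", "date"),
    }[order]
    return "_".join(parts[key] for key in sequence) + "." + extension


names = [
    build_name(date(2013, 4, 12), "THD", "interview-recording", "mp3"),
    build_name(date(2012, 12, 15), "MBD", "interview-transcript", "docx"),
    build_name(date(2012, 12, 15), "MBD", "interview-recording", "mp3"),
]
# Alphabetical sorting puts the 2012 files before the 2013 file.
sorted_names = sorted(names)
by_subject = build_name(date(2012, 12, 15), "MBD", "interview-recording", "mp3",
                        order="subject")
```

Generating names from one function keeps separators, date format and element order consistent across a project.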
30. Supporting documentation (1)
Supporting documentation is information in separate files that accompanies data in order to provide
• context,
• explanation, or
• instructions on confidentiality and data use or reuse
Source: Dublin UCD Library
31. Supporting documentation (2)
Examples of supporting documentation include:
• Information about the project and data creators
• Working papers or laboratory notebooks
• Questionnaires or interview guides
• Codebooks
• Details on how the data were created, analysed, anonymised etc.
• Final project reports and publications
Source: Dublin UCD Library
32. Metadata
There are three broad categories of metadata:
• Descriptive: common fields such as title, author, abstract and keywords, which help users to discover online sources through searching and browsing.
• Administrative: preservation, rights management, and technical metadata about formats.
• Structural: how different components of a set of associated data relate to one another, such as a schema describing relations between tables in a database.
Source: Dublin UCD Library
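To make the three categories concrete, here is a hypothetical metadata record for a single dataset, stored as a JSON-serializable structure. All field names and values are invented for illustration and do not follow a specific metadata standard.

```python
import json

# Hypothetical record; every value below is illustrative only.
metadata = {
    "descriptive": {
        "title": "Maize yield trial",
        "author": "Example Researcher",
        "abstract": "Plot-level yields from a fertilizer response trial.",
        "keywords": ["maize", "yield", "ISFM"],
    },
    "administrative": {
        "rights": "CC BY 4.0",
        "format": "text/csv",
        "preservation": "archive copy retained for ten years",
    },
    "structural": {
        "tables": ["plots", "harvests"],
        "relations": [{"from": "harvests.plot_id", "to": "plots.id"}],
    },
}


def matches_keyword(record: dict, keyword: str) -> bool:
    """Discovery via descriptive metadata: case-insensitive keyword match."""
    keywords = record["descriptive"]["keywords"]
    return keyword.lower() in (item.lower() for item in keywords)


serialized = json.dumps(metadata, indent=2)
found = matches_keyword(metadata, "isfm")
```

Keeping such a record next to the data files lets a searchable interface expose the descriptive part while the administrative and structural parts support preservation and reuse.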
Editor's notes
identifying, classifying, prioritizing, storing, securing, archiving, preserving, retrieving, tracking and destroying of records