Data Quality Control


Learning Objectives
• To know the steps necessary for ensuring quality assurance and control of data at various stages of a study
• To understand the difference between pilot testing and pretesting
• To understand the importance of designing data collection instruments
• To understand how data can be managed using an audit trail, and the various techniques that can be used to inspect your dataset after it has been entered

Performance Objectives
• Know the difference between quality assurance and quality control and ways to ensure them
• Know the objectives of a pilot test and a pretest
• Understand how data collection instruments should be designed and coded
• Be able to manage data using an audit trail
• Be able to inspect datasets for errors and rectify them

Data Quality Control
• Quality Assurance
– Activities to ensure quality of data before data collection
• Quality Control
– Monitoring and maintaining the quality of data during the conduct of the study
• Data Management
– Handling and processing of data throughout the study

Steps in Quality Assurance
1. Specify the study hypothesis
2. Specify the general design to test the study hypothesis ⇒ Develop an overall study protocol
3. Choose or prepare specific instruments
4. Develop procedures for data collection and processing ⇒ Develop operation manuals
5. Train staff ⇒ Certify staff
6. Using certified staff, pretest and pilot-test the data collection and processing instruments and procedures

Quality Assurance: Standardization of Procedures
• Why is standardization important?
– To achieve the highest possible level of uniformity in data collection procedures across the entire study population
• Preparation of a written manual of operations
– Detailed descriptions of exactly how the procedures specific to each data collection instrument are to be carried out (BP example)
– Q-by-Q (question-by-question) instructions for interviews

Quality Assurance: Training of Staff
• Aim to make each staff member thoroughly familiar with the procedures under his/her responsibility
• Training leads to certification of the staff member to perform a specific procedure

Quality Assurance: Pretesting and Pilot Testing
• Pretesting
– Involves assessing specific procedures on a sample in order to detect major flaws
• Pilot Testing
– Formal rehearsal of study procedures
– Attempts to reproduce the whole flow of operations in a sample as similar as possible to study participants

Pretesting and Pilot Testing Results
• Pretesting of the questionnaire is used to assess:
– flow of questions
– presence of sensitive questions
– appropriateness of categorization of variables
– clarity of the Q-by-Q instructions to the interviewer
• Pilot testing
– In addition to the above, assesses the flow of the whole process

Quality Assurance: Data Management
• Designing data collection instruments
– Layout, questions to ask, sequence of questions, phrasing of questions, response categories, skip patterns
– Collect and record “raw”, not processed, information (e.g. age)
– Codebook: the link between the questionnaire and the data entered in the computer

Code Book Example
Variable  QNo  Meaning            Codes              Format
Q1Id      Q1   Quest. No          1-750              C 3
Q2Sex     Q2   Respondent’s sex   1 male             N 1.0
                                  2 female
Q3Child   Q3   No of children     99 no response     N 2.0
Q4Wt      Q4   Weight in kg       999 not recorded   N 3.1
Q5roof    Q5   Roof type          1 RCC              N 2.0
                                  2 Cement sheet
                                  3 Tin sheet
                                  4 Thatched
                                  Other (specify)
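
A codebook like the example above can also be kept in machine-readable form so that entered records can be checked against it. The following is only a minimal Python sketch of that idea. The dictionary layout, the function name check_record, and the upper bound of 20 for Q3Child are assumptions made for illustration and are not part of the original codebook.

```python
# Hypothetical machine-readable version of part of the example codebook.
# "valid" is either an inclusive (min, max) range or a set of allowed codes.
# "missing" is the declared missing-value code, if the codebook gives one.
CODEBOOK = {
    "Q1Id":    {"meaning": "Questionnaire number", "valid": (1, 750), "missing": None},
    "Q2Sex":   {"meaning": "Respondent's sex",     "valid": {1, 2},   "missing": None},
    # The upper bound of 20 is an assumption for illustration only.
    "Q3Child": {"meaning": "Number of children",   "valid": (0, 20),  "missing": 99},
}


def check_record(record, codebook=CODEBOOK):
    """Return (variable, value) pairs in `record` that violate the codebook.

    Values are assumed to be numeric codes, as the codebook recommends.
    """
    problems = []
    for name, spec in codebook.items():
        value = record.get(name)
        # A declared missing-value code is always acceptable.
        if spec["missing"] is not None and value == spec["missing"]:
            continue
        valid = spec["valid"]
        if isinstance(valid, tuple):
            ok = value is not None and valid[0] <= value <= valid[1]
        else:
            ok = value in valid
        if not ok:
            problems.append((name, value))
    return problems


good = {"Q1Id": 12, "Q2Sex": 2, "Q3Child": 99}
bad = {"Q1Id": 751, "Q2Sex": 3, "Q3Child": 2}
print(check_record(good))  # no violations expected
print(check_record(bad))   # Q1Id and Q2Sex are outside the codebook
```

Keeping the valid values in one place means the same definitions can drive both data entry checks and later inspection of the dataset.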

Quality Assurance: Use of a Code Book
• Variable names
– Up to 8 characters from a-z and 0-9; must start with a letter
– Combine question number and description (e.g. q3age)
• Meaning
– Short text describing the meaning of the variable
– SPSS can incorporate this information as variable labels and display it in the output

Quality Assurance: Use of a Code Book
• Codes
– Try to use numerical codes
• Decide in advance on codes for no response and missing values
– Question could not be asked or was not applicable (e.g. pregnancy outcome)
– Question was asked but the respondent did not reply (e.g. salary)
– Respondent replied “don’t know”
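
The three kinds of missing value listed above are easier to handle in analysis if each has its own predecided code. The Python sketch below shows one possible way to separate such codes from real values before computing a statistic. The codes 97 and 98 and the function name split_missing are hypothetical; only 99 for "no response" appears in the codebook example.

```python
# Hypothetical missing-value codes. Only 99 ("no response") comes from the
# codebook example; 97 and 98 are assumptions for illustration.
MISSING_CODES = {97: "not applicable", 98: "don't know", 99: "no response"}


def split_missing(values, missing_codes=MISSING_CODES):
    """Separate real values from missing-value codes.

    Returns (valid_values, counts_by_reason).
    """
    valid = [v for v in values if v not in missing_codes]
    reasons = {}
    for v in values:
        if v in missing_codes:
            label = missing_codes[v]
            reasons[label] = reasons.get(label, 0) + 1
    return valid, reasons


children = [2, 0, 99, 3, 97, 99, 1]
valid, reasons = split_missing(children)
mean_children = sum(valid) / len(valid)  # computed on real values only
print(valid, reasons, mean_children)
```

Without this step, a code such as 99 would be treated as 99 children and would distort any mean or maximum.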

Quality Control
• Observation of procedures and performance of staff members to identify obvious protocol deviations
• Strategies include:
– Over-the-shoulder observation of staff
– Taping all interviews and reviewing a random sample
– Ongoing field supervision
– Field editing by the interviewer as well as the field supervisor
– Office editing, which includes coding
– Log book maintenance
– Statistical assessment of trends over time in the performance of each observer/interviewer/technician

Data Management: Audit Trail
• The researcher should be able to trace each piece of information back to the original document:
– ID included in the original documents and in the dataset
– All corrections must be documented and explained
– All modifications to the dataset must be documented by command files
– Each analysis must be documented by a command file
• The purpose of the audit trail is to
– protect yourself against mistakes, errors, waste of time and loss of information
– enable external audit (revision)

Data Management: Handling of Data
• Entering data
– Use a professional data entry program such as EpiData
• Preparations
– Complete the codebook
– Examine questionnaires for obvious inconsistencies and skip patterns

Data Management: Handling of Data
• Error prevention:
– Set up a data entry form resembling your questionnaire
– Define valid values before entering data
– Double data entry by two different operators
  • compare the contents of the two files to get a list of discrepancies (e.g. with EpiInfo)
  • correct errors in both files and run a new comparison
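
The comparison step in double data entry can be sketched as follows. This is an illustrative Python version, not the EpiInfo procedure itself. The function name compare_entries and the record layout (a list of dictionaries keyed by the variable names from the codebook example) are assumptions.

```python
def compare_entries(first, second, id_field="Q1Id"):
    """List discrepancies between two independently entered files.

    Each argument is a list of dict records. Returns tuples of
    (id, variable, value_in_first, value_in_second), ordered by id.
    """
    by_id_a = {rec[id_field]: rec for rec in first}
    by_id_b = {rec[id_field]: rec for rec in second}
    discrepancies = []
    for rec_id in sorted(set(by_id_a) | set(by_id_b)):
        a, b = by_id_a.get(rec_id), by_id_b.get(rec_id)
        if a is None or b is None:
            # Record was entered in only one of the two files.
            discrepancies.append((rec_id, "<record>", a is not None, b is not None))
            continue
        for var in sorted(set(a) | set(b)):
            if a.get(var) != b.get(var):
                discrepancies.append((rec_id, var, a.get(var), b.get(var)))
    return discrepancies


entry_1 = [{"Q1Id": 1, "Q2Sex": 1, "Q3Child": 2},
           {"Q1Id": 2, "Q2Sex": 2, "Q3Child": 0}]
entry_2 = [{"Q1Id": 1, "Q2Sex": 1, "Q3Child": 3},
           {"Q1Id": 2, "Q2Sex": 2, "Q3Child": 0}]

for item in compare_entries(entry_1, entry_2):
    print(item)
```

Each listed discrepancy is then resolved against the paper questionnaire, both files are corrected, and the comparison is rerun until the list is empty.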

First Inspection of Data: Error Finding
• Add variable and value labels to your data using a syntax command
• Searching for errors
– Make printouts of the codebook generated from the data, an overview of variables, and simple frequency tables of appropriate variables
– Compare the generated codebook with the original codebook and check that the label information is correct
– Inspect the generated summary/frequency tables for illegal or improbable minimum and maximum values and for inconsistencies (e.g. age of 250 years; pregnant male; 23-year-old woman with a 19-year-old son)
• Calculate the error rate
– Randomly select 10% of your questionnaires, or at least 40, and re-enter them into a new file
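
The inspection steps above can be illustrated with a short Python sketch. The variable names (id, age, sex, pregnant), the plausible age range of 0 to 110, and all function names are hypothetical. The error rate is computed here as the share of re-entered fields that differ from the original entry, which is one reasonable definition but is an assumption, since the slides do not give a formula.

```python
import random
from collections import Counter


def frequency_table(records, var):
    """Simple frequency table of one variable."""
    return Counter(rec.get(var) for rec in records)


def find_suspect_records(records):
    """Flag illegal or improbable values and cross-field inconsistencies."""
    suspects = []
    for rec in records:
        if not 0 <= rec["age"] <= 110:  # assumed plausible range
            suspects.append((rec["id"], "improbable age"))
        if rec["sex"] == 1 and rec["pregnant"] == 1:  # sex 1 = male, pregnant 1 = yes
            suspects.append((rec["id"], "pregnant male"))
    return suspects


def audit_sample_size(n_questionnaires):
    """10% of the questionnaires or 40, whichever is larger, capped at the total."""
    ten_percent = (n_questionnaires + 9) // 10  # integer ceiling of n / 10
    return min(n_questionnaires, max(40, ten_percent))


def select_audit_sample(ids, seed=None):
    """Randomly choose the questionnaire IDs to re-enter."""
    ids = sorted(ids)
    return random.Random(seed).sample(ids, audit_sample_size(len(ids)))


def error_rate(original, reentered, variables):
    """Share of re-entered fields that differ from the original entry.

    Both arguments map questionnaire ID to a dict of values.
    """
    total = errors = 0
    for rec_id, new_rec in reentered.items():
        for var in variables:
            total += 1
            if original[rec_id][var] != new_rec[var]:
                errors += 1
    return errors / total if total else 0.0


records = [
    {"id": 1, "age": 34, "sex": 2, "pregnant": 1},
    {"id": 2, "age": 250, "sex": 1, "pregnant": 0},
    {"id": 3, "age": 28, "sex": 1, "pregnant": 1},
]
print(frequency_table(records, "sex"))
print(find_suspect_records(records))
print(audit_sample_size(750), audit_sample_size(200))
```

Flagged records are not corrected automatically. Each one is checked against the original questionnaire before any change is made.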

Correction of Errors - Documentation
• If errors are discovered
– Make corrections in a command file (SPSS syntax file); this will provide full documentation of changes made to the dataset
• If errors are discovered when comparing files after double data entry
– You can make corrections directly in the data entered, provided you end this step with a comparison of the two files entered and corrected
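
The slides recommend an SPSS syntax file for corrections. The sketch below is a Python analogue of the same idea, not SPSS syntax: corrections live in a script, each with a stated reason, and are applied to a copy so the entered data stay untouched. The names CORRECTIONS and apply_corrections, the record layout, and the example correction are all hypothetical.

```python
import copy

# Each correction records what was changed and why (hypothetical example).
CORRECTIONS = [
    {"id": 2, "var": "age", "old": 250, "new": 25,
     "reason": "Keying error; questionnaire shows 25"},
]


def apply_corrections(records, corrections):
    """Apply documented corrections to a copy of the data.

    Returns (corrected_records, log_lines). Raises ValueError if the value
    found does not match the documented old value, so that a correction is
    never applied twice or to the wrong record.
    """
    cleaned = copy.deepcopy(records)  # never modify the entered data in place
    by_id = {rec["id"]: rec for rec in cleaned}
    log = []
    for c in corrections:
        rec = by_id[c["id"]]
        found = rec[c["var"]]
        if found != c["old"]:
            raise ValueError(
                f"id {c['id']}: expected {c['var']}={c['old']!r}, found {found!r}"
            )
        rec[c["var"]] = c["new"]
        log.append(
            f"id {c['id']}: {c['var']} {c['old']!r} -> {c['new']!r} ({c['reason']})"
        )
    return cleaned, log


raw = [{"id": 1, "age": 34}, {"id": 2, "age": 250}]
cleaned, log = apply_corrections(raw, CORRECTIONS)
print("\n".join(log))
```

Because the script can be rerun from the entered data at any time, it serves as both the correction and its documentation in the audit trail.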

Correction of Errors - Documentation
• Split the process into distinct and well-defined steps, and make sure that your documentation from one step to another is consistent
• Archive
– Once you have a “clean” documented version of your primary data, save one copy in a safe place and do your work with another copy

Analysis
• Make sure you use the right dataset
– It is recommended to create command files for analysis that start with the command reading the dataset
• Late discovery of errors and inconsistencies

Backing Up vs Archiving
• Backing up
– An everyday activity
– Its purpose is to enable you to restore your data and documents in case of destruction or loss of data
– Covers not only datasets, but also command files modifying your data and written documents such as the protocol, log book and other documenting information
• Archiving
– Takes place once or a few times during the life of the project
– Its purpose is to preserve your data and documents for a more distant future, perhaps even to allow other researchers access to the information
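
One way to make an archived copy verifiable in the distant future is to store a checksum next to it and mark the copy read-only. The Python sketch below illustrates this. The function names, the choice of SHA-256, and the ".sha256" sidecar file are assumptions for illustration and not something the slides prescribe.

```python
import hashlib
import shutil
import stat
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_file(source, archive_dir):
    """Copy `source` into `archive_dir`, verify the copy, and lock it.

    Returns (path_of_archived_copy, checksum).
    """
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / Path(source).name
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    checksum = sha256_of(target)
    if checksum != sha256_of(source):
        raise OSError("archived copy differs from the source file")
    # Record the checksum beside the copy, then make the copy read-only.
    sidecar = archive_dir / (target.name + ".sha256")
    sidecar.write_text(f"{checksum}  {target.name}\n")
    target.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return target, checksum
```

A call such as archive_file("clean_data.csv", "archive") would be run once the clean, documented dataset exists. Routine backups remain a separate, everyday task that also covers command files and written documents.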
