2. Data
• Data is a collection of facts, such as values or
measurements.
OR
• Data is information that has been translated into
a form that is more convenient to move or
process.
OR
• Data are any facts, numbers, or text that can be
processed by a computer.
3/3/2012 Dr. Riaz A. Bhutto 2
3. Statistics
Statistics is the study of the
collection, summarizing, organization, analysis,
and interpretation of data.
4. Vital statistics
Vital statistics is the
collection, summarizing, organization, analysis,
presentation, and interpretation of data
related to the vital events of life, such as births, deaths,
marriages, divorces,
health, and disease.
5. Biostatistics
Biostatistics is the application of statistical
techniques to scientific research in health-
related fields, including
medicine, biology, and public health.
6. Descriptive Statistics
The term descriptive statistics refers to
statistics that are used to describe and summarize
a set of data. When descriptive statistics are used
alone, every member of the group or population of
interest is measured. A good example of descriptive
statistics is the census, in which all members of a
population are counted.
7. Inferential or Analytical Statistics
Inferential statistics are used to draw
conclusions and make predictions based on the
analysis of numeric data.
8. Primary & Secondary Data
• Raw or Primary data: data as collected, still
containing a lot of unnecessary, irrelevant, and
unwanted information
• Treated or Secondary data: data from which this
unnecessary, irrelevant, and unwanted information
has been removed
• Cooked data: data that was not genuinely
collected and is false or fictitious
9. Ungrouped & Grouped Data
• Ungrouped data: data presented or observed individually. For
example, the number of children observed in 6 families:
2, 4, 6, 4, 6, 4
• Grouped data: identical values grouped together with their frequency.
For example, the above data on children in 6 families can be grouped as:
No. of children    Families
2                  1
4                  3
6                  2
or alternatively we can make classes:
No. of children    Frequency
2-4                4
5-7                2
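The grouping above can be reproduced with a short sketch. The names `children`, `frequency`, and `count_in_classes` are illustrative and do not come from the slides; the class limits are assumed to be inclusive at both ends, as in the table.

```python
from collections import Counter

# Ungrouped data: number of children in each of the 6 families.
children = [2, 4, 6, 4, 6, 4]

# Grouped by identical value: how many families have each number of children.
frequency = Counter(children)


def count_in_classes(values, classes):
    """Count how many values fall in each (low, high) class, limits inclusive."""
    return {
        (low, high): sum(low <= v <= high for v in values)
        for low, high in classes
    }


class_counts = count_in_classes(children, [(2, 4), (5, 7)])

for value in sorted(frequency):
    print(value, frequency[value])
for (low, high), count in class_counts.items():
    print(f"{low}-{high}", count)
```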
10. Variable
A variable is a characteristic or value that can
change or differ from one individual to another. For
example: age, height, weight, blood pressure,
etc.
11. Types of Variable
Independent variable: typically the
variable representing the value being
manipulated or changed. For example,
smoking.
Dependent variable: the observed result of
the independent variable being manipulated.
For example, cancer (ca) of the lung.
Confounding variable: a variable associated with both
the exposure and the disease. For example, age is
a factor in many events.
13. Quantitative or Numerical data
This data is used to describe a type of
information that can be counted or expressed
numerically (numbers)
2, 4 , 6, 8.5, 10.5
9/3/2012 Dr. Riaz A. Bhutto 13
14. Quantitative or Numerical data (cont.)
This data is of two types:
1. Discrete Data: takes whole-number values only and
has no fractions. For example
Number of children in a family = 4
Number of patients in hospital = 320
2. Continuous Data (infinitely many possible values): measured on a
continuous scale. It can take fractional values. For example
Height of a person = 5 feet 6 inches (5′6″)
Temperature = 92.3 °F
15. Qualitative or Categorical data
This is non-numerical data, such as
Male/Female, Short/Tall
It is of two types:
1. Nominal Data: has a series of unordered categories
(only one category can be ticked at a time). For example
Sex = Male/Female    Blood group = O/A/B/AB
2. Ordinal or Ranked Data: has distinct ordered (ranked)
categories. For example
Measurement of height can be = Short / Medium / Tall
Degree of pain can be = None / Mild / Moderate / Severe
16. Measures of Central Tendency &
Variation (Dispersion)
17. Measures of Central Tendency
These are quantitative indices that describe the
center of a distribution of data. They are the three Ms:
• Mean
• Median
• Mode
18. Mean
Mean or arithmetic mean is also called the AVERAGE and
is calculated only for numerical data. For example
• What is the average age of the children in years?
Child   1  2  3  4  5  6  7
Age     6  4  4  3  2  4  5
Formula: $\bar{X} = \frac{\sum X}{n}$
Mean = (6 + 4 + 4 + 3 + 2 + 4 + 5)/7 = 28/7 = 4 years
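As a minimal sketch, the same calculation in Python, using the seven ages from the worked calculation (6, 4, 4, 3, 2, 4, 5). The function name `mean` and the variable names are illustrative only.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


ages = [6, 4, 4, 3, 2, 4, 5]  # ages in years of the 7 children
mean_age = mean(ages)
print(mean_age)
```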
19. Median
• It is the central-most value when the data are arranged in
order. For example, what is the central value in the data
2, 3, 4, 4, 4, 5, 6?
• If we divide the data into two equal groups,
2, 3, 4 | 4 | 4, 5, 6, then 4 is the central-most value
• The formula for the position of the central value is:
Median position = (n + 1)/2 (here n is the total no. of values)
Median position = (n + 1)/2 = (7 + 1)/2 = 8/2 = 4, so the median is the 4th value in the ordered data, which is 4
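A short sketch of the median calculation follows. The slide only covers an odd number of values; averaging the two middle values when n is even is the usual convention and is an assumption added here. The function and variable names are illustrative.

```python
def median(values):
    """Middle value of the sorted data (mean of the two middle values if n is even)."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: position (n + 1)/2 counted from 1 is index n // 2 counted from 0.
        return ordered[middle]
    # Even n (assumed convention): average the two central values.
    return (ordered[middle - 1] + ordered[middle]) / 2


ages = [6, 4, 4, 3, 2, 4, 5]
median_age = median(ages)
print(median_age)
```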
20. Mode
• The mode is the most frequently occurring (repeated)
value in a set of observations. Examples:
• No mode
Raw data: 10.3 4.9 8.9 11.7 6.3 7.7
• One mode
Raw data: 2 3 4 4 4 5 6
• More than 1 mode
Raw data: 21 28 28 41 43 43
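The three cases above can be checked with a small sketch. The function name `modes` is illustrative; returning an empty list when no value repeats is a choice made here to mirror the "no mode" case on the slide.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs most often, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once: no mode
    return [value for value, count in counts.items() if count == highest]


no_mode = modes([10.3, 4.9, 8.9, 11.7, 6.3, 7.7])
one_mode = modes([2, 3, 4, 4, 4, 5, 6])
two_modes = modes([21, 28, 28, 41, 43, 43])
print(no_mode, one_mode, two_modes)
```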
21. Measures of Dispersion
These are quantitative indices that describe the spread of
a data set. They are
• Range
• Mean deviation
• Variance
• Standard deviation
• Coefficient of variation
• Percentile
22. Range
It is the difference between the highest and lowest
values in a data series. For example:
the ages (in years) of 10 children are
2, 6, 8, 10, 11, 14, 1, 6, 9, 15
here the range of age will be 15 - 1 = 14 years
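The same range, as a brief sketch (variable names are illustrative):

```python
ages = [2, 6, 8, 10, 11, 14, 1, 6, 9, 15]  # ages in years of the 10 children

# Range = highest value minus lowest value.
age_range = max(ages) - min(ages)
print(age_range)
```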
23. Mean Deviation
This is the average absolute deviation of all observations
from the mean
Mean Deviation = $\frac{\sum |X - \bar{X}|}{n}$
here X = Value, $\bar{X}$ = Mean
n = Total no. of values
24. Mean Deviation Example
A student took 5 exams in a class and had scores of
92, 75, 95, 90, and 98. Find the mean deviation for her
test scores.
• First step: find the mean.
$\bar{X} = \frac{\sum X}{n} = \frac{92 + 75 + 95 + 90 + 98}{5} = \frac{450}{5} = 90$
25. Mean Deviation Example (cont.)
• Second step: find the mean deviation
Values (X)    Mean ($\bar{X}$)    Deviation from mean ($X - \bar{X}$)    Absolute deviation ($|X - \bar{X}|$, ignoring the sign)
92            90                  2                                      2
75            90                  -15                                    15
95            90                  5                                      5
90            90                  0                                      0
98            90                  8                                      8
Total = 450                                                              $\sum |X - \bar{X}|$ = 30
n = 5
Mean Deviation = $\frac{\sum |X - \bar{X}|}{n}$ = 30/5 = 6
The average deviation from the mean is 6
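The two steps of this example can be sketched in Python as follows. The function name `mean_deviation` and the variable names are illustrative, not part of the slides.

```python
def mean_deviation(values):
    """Average of the absolute deviations of the values from their mean."""
    if not values:
        raise ValueError("mean deviation requires at least one value")
    n = len(values)
    mean = sum(values) / n                               # step 1: the mean
    absolute_deviations = [abs(x - mean) for x in values]  # step 2: |X - mean|
    return sum(absolute_deviations) / n


scores = [92, 75, 95, 90, 98]
score_mean_deviation = mean_deviation(scores)
print(score_mean_deviation)
```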
26. Variance
• It is a measure of variability that takes into
account the difference between each
observation and the mean.
• The sample variance is the sum of the squared
deviations from the mean divided by the
number of values in the series minus 1.
• Sample variance is denoted s² and population variance
is denoted σ²
27. Variance (cont.)
• The Variance is defined as the average of the
squared differences from the Mean.
• To calculate the variance follow these steps:
• Work out the Mean (the simple average of the
numbers)
• Then for each number: subtract the Mean and
square the result (the squared difference)
• Then work out the average of those squared
differences (divide by n for a population, or by
n - 1 for a sample).
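The steps above can be sketched as a small function. As an illustration it is applied to the five exam scores from the mean deviation example; the function name `variance_by_steps` and the `sample` flag are assumptions made for this sketch, with `sample=True` switching the divisor from n to n - 1.

```python
def variance_by_steps(values, sample=False):
    """Variance computed step by step; divides by n, or by n - 1 if sample=True."""
    n = len(values)
    if n < 2:
        raise ValueError("variance requires at least two values")
    mean = sum(values) / n                                   # work out the mean
    squared_differences = [(x - mean) ** 2 for x in values]  # subtract and square
    divisor = n - 1 if sample else n
    return sum(squared_differences) / divisor                # average the squares


scores = [92, 75, 95, 90, 98]
population_variance = variance_by_steps(scores)
sample_variance = variance_by_steps(scores, sample=True)
print(population_variance, sample_variance)
```

Dividing by n - 1 always gives a slightly larger value than dividing by n, and the gap shrinks as n grows.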
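28. Example: The household sizes of 5 families were recorded as follows:
2, 5, 4, 6, 3. Calculate the variance for these data.
Step 1        Step 2              Step 3                                 Step 4
Values (X)    Mean ($\bar{X}$)    Deviation from mean ($X - \bar{X}$)    $(X - \bar{X})^2$
2             4                   -2                                     4
5             4                   1                                      1
4             4                   0                                      0
6             4                   2                                      4
3             4                   -1                                     1
Step 5: $\sum (X - \bar{X})^2$ = 10
Step 6: dividing by n, variance = $\frac{\sum (X - \bar{X})^2}{n}$ = 10/5 = 2 persons²
If the 5 families are treated as a sample and the divisor n - 1 is used instead, s² = 10/4 = 2.5 persons²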
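As a cross-check, Python's standard `statistics` module provides both divisors: `pvariance` divides by n and `variance` divides by n - 1. This is a minimal sketch; the variable names are illustrative.

```python
import statistics

household_sizes = [2, 5, 4, 6, 3]

# Divisor n (population variance), as in the worked example.
variance_n = statistics.pvariance(household_sizes)

# Divisor n - 1 (sample variance).
variance_n_minus_1 = statistics.variance(household_sizes)

print(variance_n, variance_n_minus_1)
```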
29. Standard Deviation
• The Standard Deviation is a measure of how
spread out numbers are.
• Its symbol is σ (the Greek letter sigma) for a population and s for a sample
• The formula is easy: it is the square root of
the Variance, i.e. $s = \sqrt{s^2}$
• SD is the most useful measure of dispersion
$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$ (if n > 30)
$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$ (if n < 30)
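The rule of thumb above can be sketched as follows. The slide does not say which divisor applies when n is exactly 30; this sketch assumes n - 1 in that case. The function name `standard_deviation` is illustrative, and the exam scores from the mean deviation example are reused as data.

```python
import math


def standard_deviation(values):
    """Square root of the variance, using n if n > 30 and n - 1 otherwise."""
    n = len(values)
    if n < 2:
        raise ValueError("standard deviation requires at least two values")
    mean = sum(values) / n
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    # Assumption: n == 30 is treated like the small-sample case (divisor n - 1).
    divisor = n if n > 30 else n - 1
    return math.sqrt(sum_of_squares / divisor)


scores = [92, 75, 95, 90, 98]  # n = 5, so the divisor is n - 1 = 4
score_sd = standard_deviation(scores)
print(round(score_sd, 2))
```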
30. Example
You and your friends have just measured the heights of your
dogs (in millimeters):
• The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm and
300mm.
• Find out the Mean, the Variance, and the Standard Deviation.
31. Your first step is to find the Mean:
Answer:
Mean = (600 + 470 + 170 + 430 + 300)/5 = 1970/5 = 394
so the mean (average) height is 394 mm.
[Figure: chart of the dog heights with the mean marked]
32. Now, we calculate each dog's difference from the Mean:
206, 76, -224, 36, -94
To calculate the Variance, take each difference, square it, and then average
the result:
Variance = (206² + 76² + (-224)² + 36² + (-94)²)/5 = 108,520/5 = 21,704
So, the Variance is 21,704.
33. And the Standard Deviation is just the square root of Variance, so:
Standard Deviation: σ = √21,704 = 147.32... = 147 (to the nearest mm)
And the good thing about the Standard Deviation is that it is useful. Now we can
show which heights are within one Standard Deviation (147 mm) of the Mean, that is,
between 247 mm and 541 mm: 470 mm, 430 mm, and 300 mm are within it, while
600 mm and 170 mm are not.
• So, using the Standard Deviation we have a "standard" way of knowing
what is normal, and what is extra large or extra small.
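The whole dog-height example can be reproduced with a short sketch using Python's standard `statistics` module. Following the example, the variance is divided by n (`pvariance`, `pstdev`); the variable names are illustrative.

```python
import statistics

heights_mm = [600, 470, 170, 430, 300]  # shoulder heights of the 5 dogs

mean_height = statistics.mean(heights_mm)
variance = statistics.pvariance(heights_mm)  # divisor n, as in the example
sd = statistics.pstdev(heights_mm)           # square root of that variance

# Heights lying within one standard deviation of the mean.
lower, upper = mean_height - sd, mean_height + sd
within_one_sd = [h for h in heights_mm if lower <= h <= upper]

print(mean_height, variance, round(sd))
print(within_one_sd)
```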