Central tendency of data is defined as the tendency of data to concentrate around some central value. Here all the measures of central tendency are explained with examples: the arithmetic mean, geometric mean, harmonic mean, median, and mode.
2. Central tendency
The central tendency of a data set is defined as the tendency of the data to concentrate around some central value.
This central value is also called the average, and using this average we can easily compare different sets of data.
Need of central tendency
We know that one of the important functions of statistics is to compare different sets of data.
For the comparison we want to represent the whole data set by a single value; therefore we study various characteristics of the data, such as central tendency and measures of dispersion.
3. Different Measures Of Central Tendency
There are different formulas for finding the central value, which are as follows:
1. Means
a) Arithmetic mean or Average
b) Geometric mean
c) Harmonic mean
2. Median
3. Mode
4. Characteristics for ideal Measure of Central Tendency
There are several measures of central tendency, and the difficulty lies in choosing the best of them, as no hard and fast rule has been made for selecting one.
However, there are certain guidelines for choosing a particular measure of central tendency. A measure of central tendency is good or satisfactory if it possesses the following characteristics:
1. It should be easy to calculate.
2. It should be easy to understand.
3. It should be based on all the observations.
4. It should not be affected by extreme values.
5. It should be as close to the maximum number of observed values as possible.
5. 6. It should be rigidly defined, i.e. there should be no ambiguity in its formula.
For example, take three sets of observations:
I: 12, 15, 12, 12, 13 : mode is 12
II: 15, 15, 12, 13, 12 : modes are 12 and 15 (bimodal)
III: 12, 11, 13, 10, 9 : mode does not exist
Comment: the mode is not rigidly defined.
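7. It should be capable of algebraic treatment.
The geometric mean, for example, does not lend itself to simple algebraic treatment; it has to be computed through logarithms.
Suppose $X_1, X_2, X_3, \ldots, X_N$ are N variate values.
Then,
$$G = \sqrt[N]{X_1 \cdot X_2 \cdot X_3 \cdots X_N}$$
$$\log_{10} G = \frac{1}{N} \sum \log_{10} X$$
$$G = \operatorname{antilog}\left\{ \frac{1}{N} \sum \log_{10} X \right\}$$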
6. 8. It should be least affected by fluctuations of sampling.
Suppose we want to estimate the population mean on the basis of different random samples drawn from the population. If we use the AM of the sample observations, i.e. we calculate the sample mean of each sample drawn, then there will be the least variation in the values of the sample mean obtained from the different samples.
7. Mean
a) Arithmetic mean or average
• It is popularly known as the average.
• It is the sum of the observed values of a set divided by the number of observations in the set.
For ungrouped data
If $X_1, X_2, X_3, \ldots, X_N$ are N observed values, the mean or average is given as
$$\mu \text{ or } \bar{X} = \frac{X_1 + X_2 + X_3 + \cdots + X_N}{N} = \frac{\sum X}{N}$$
For example: in a class of 10 students, the following marks were scored in Mathematics out of 50.
X: 40, 50, 49, 30, 30, 25, 20, 40, 45, 49
$$\bar{X} = \frac{\sum X}{N} = \frac{40+50+49+30+30+25+20+40+45+49}{10} = \frac{378}{10} = 37.8$$
So the average marks of the class are 37.8.
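The worked example above can be checked with a few lines of Python. This is only a minimal sketch; the function name `arithmetic_mean` is chosen here for illustration and is not part of the slides.

```python
# Marks of the 10 students from the example (out of 50).
marks = [40, 50, 49, 30, 30, 25, 20, 40, 45, 49]


def arithmetic_mean(values):
    """Sum of the observed values divided by the number of observations."""
    return sum(values) / len(values)


total = sum(marks)
mean_marks = arithmetic_mean(marks)
print(total, len(marks), mean_marks)
```

Adding the ten marks gives 378, so the script reports a mean of 37.8.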
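8. For grouped data
Class interval        Frequency
$X_1$ - $X_2$         $f_1$
$X_2$ - $X_3$         $f_2$
$X_3$ - $X_4$         $f_3$
...                   ...
$X_k$ - $X_{k+1}$     $f_k$
The arithmetic mean is given by
$$\bar{X} = \frac{\sum f X}{N}$$
where N is the sum of the frequencies and X is the mid-point of the class interval $X_i$ - $X_{i+1}$, given by
$$X = \frac{X_i + X_{i+1}}{2}, \quad i = 1, 2, \ldots, k$$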
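As a rough illustration of the grouped-data formula, the sketch below uses a made-up frequency table (the class intervals and frequencies are assumptions for the example, not data from the slides). Each class is represented by its mid-point, and the mid-points are averaged with the frequencies as weights.

```python
# Hypothetical frequency table: ((lower limit, upper limit), frequency).
classes = [((0, 10), 5), ((10, 20), 8), ((20, 30), 12), ((30, 40), 10), ((40, 50), 5)]


def grouped_mean(table):
    """Mean of grouped data: sum(f * midpoint) / sum(f)."""
    total_frequency = sum(f for _, f in table)
    weighted_sum = sum(f * (lower + upper) / 2 for (lower, upper), f in table)
    return weighted_sum / total_frequency


mean_value = grouped_mean(classes)
print(mean_value)  # 1020 / 40 = 25.5
```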
9. Some algebraic properties of arithmetic mean
1. The sum of the deviations taken from the arithmetic mean is zero.
For ungrouped data
We know $\bar{X} = \frac{\sum X}{n}$, so
$$(X_1 - \bar{X}) + (X_2 - \bar{X}) + (X_3 - \bar{X}) + \cdots + (X_n - \bar{X}) = \sum (X - \bar{X}) = 0$$
For grouped data
We know $\bar{X} = \frac{\sum f X}{N}$, so
$$\sum f (X - \bar{X}) = 0$$
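The property follows in one line from the definition of the mean. For ungrouped data, using $\sum X = n\bar{X}$:

$$\sum_{i=1}^{n} (X_i - \bar{X}) = \sum_{i=1}^{n} X_i - n\bar{X} = n\bar{X} - n\bar{X} = 0$$

The grouped case works the same way, with each deviation weighted by its frequency and $\sum f X = N\bar{X}$, $\sum f = N$:

$$\sum f (X - \bar{X}) = \sum f X - \bar{X} \sum f = N\bar{X} - N\bar{X} = 0$$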
10. 2. Combined mean: if there are two groups with $n_1$ and $n_2$ observations and means $\bar{X}_1$ and $\bar{X}_2$, then the arithmetic mean of the combined group can be calculated as:
$$\bar{X} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2}$$
Let us say there are two sections of a class, A and B, with 30 and 40 students respectively. The average marks of the two sections are 13 and 15 respectively. Now calculate the combined mean of the class.
We simply have to put the values into the above formula.
Given that $n_1 = 30$, $n_2 = 40$, $\bar{X}_1 = 13$ and $\bar{X}_2 = 15$.
Now put in the values and try to find the combined mean of the class.
Note: the above formula can be generalized to 3 or more groups.
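For readers who want to check their answer to the exercise, here is a small sketch. The helper `combined_mean` is a name introduced for this example; it takes any number of (size, mean) pairs, which also covers the case of 3 or more groups.

```python
def combined_mean(groups):
    """groups is a list of (number of observations, group mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


# Sections A and B: 30 students with mean 13, 40 students with mean 15.
sections = [(30, 13), (40, 15)]
class_mean = combined_mean(sections)
print(round(class_mean, 2))  # (30*13 + 40*15) / 70 = 990 / 70
```

The combined mean comes out at about 14.14, closer to 15 than to 13 because section B has more students.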
11. Weighted mean
When weights are assigned to the observations according to some pre-decided criterion, we calculate the weighted mean.
Suppose two candidates appear in an interview where weightage is given to their class XII marks as follows:
Subject        Weight    Candidate 1    Candidate 2
Physics        3         85             80
Chemistry      4         82             85
Mathematics    5         86             86
English        2         80             87
Hindi          1         81             77
$$\text{Weighted mean} = \frac{W_1 X_1 + W_2 X_2 + W_3 X_3 + \cdots + W_n X_n}{W_1 + W_2 + W_3 + \cdots + W_n}$$
Remark: the weighted mean reduces to the arithmetic mean if equal weightage is given to all observations.
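The slide does not work the interview example through, so the sketch below does it with the marks and weights from the table. The function name `weighted_mean` and the dictionary layout are choices made for this illustration.

```python
weights = {"Physics": 3, "Chemistry": 4, "Mathematics": 5, "English": 2, "Hindi": 1}
candidate_1 = {"Physics": 85, "Chemistry": 82, "Mathematics": 86, "English": 80, "Hindi": 81}
candidate_2 = {"Physics": 80, "Chemistry": 85, "Mathematics": 86, "English": 87, "Hindi": 77}


def weighted_mean(marks, subject_weights):
    """sum(W * X) / sum(W) over all subjects."""
    weighted_sum = sum(subject_weights[s] * marks[s] for s in subject_weights)
    return weighted_sum / sum(subject_weights.values())


wm_1 = weighted_mean(candidate_1, weights)  # 1254 / 15
wm_2 = weighted_mean(candidate_2, weights)  # 1261 / 15
print(round(wm_1, 2), round(wm_2, 2))
```

Candidate 1 gets about 83.6 and candidate 2 about 84.07, so candidate 2 is slightly ahead once the weights are applied.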
12. Merits and demerits of arithmetic mean
Merits
Easy to calculate.
Easy to understand.
Demerits
1. Highly affected by extreme values.
Example: marks of 10 students: 70, 75, 77, 60, 64, 60, 90, 10, 99, 97.
$\bar{X} = 70.2$
The single student scoring 10 pulls the mean down noticeably: without that score, the mean of the remaining nine marks is 692/9 ≈ 76.9.
2. Cannot be calculated for open-ended class intervals.
Example:
C.I.      f
<60       10
60-70     20
70-80     30
80-90     40
>90       50
13. 3. Sometimes the value of the arithmetic mean is not physically possible, particularly when we have a discrete variable.
Example: the average number of plants comes out as 17.75, but a number of plants cannot be a fraction, so we round it off to 18 plants.
4. If a constant is added to, subtracted from, multiplied with or divided into each observation, then the mean is respectively increased, decreased, multiplied or divided by the same constant.
14. Geometric mean
The geometric mean of N variate values is the Nth root of their product.
Suppose $X_1, X_2, X_3, X_4, \ldots, X_N$ are N variate values; then the geometric mean is given as
$$G = \sqrt[N]{X_1 \cdot X_2 \cdot X_3 \cdot X_4 \cdots X_N}$$
$$G = \operatorname{antilog}\left[ \frac{\sum \log_{10} X}{N} \right]$$
In case $X_1, X_2, X_3, X_4, \ldots, X_N$ have corresponding frequencies $f_1, f_2, f_3, \ldots, f_N$, then
$$G = \operatorname{antilog}\left\{ \frac{\sum f \log_{10} X}{\sum f} \right\}$$
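The logarithm route to the geometric mean can be sketched as follows. The sample values are made up for the illustration, and the optional `freqs` argument is an assumption of this sketch for handling the frequency case.

```python
import math


def geometric_mean(values, freqs=None):
    """Antilog of the (frequency-weighted) mean of log10 of the values."""
    if freqs is None:
        freqs = [1] * len(values)
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    mean_log = sum(f * math.log10(v) for v, f in zip(values, freqs)) / sum(freqs)
    return 10 ** mean_log  # taking the antilog


g_plain = geometric_mean([2, 4, 8])          # cube root of 64
g_freq = geometric_mean([2, 8], freqs=[2, 1])  # same as the GM of 2, 2, 8
print(round(g_plain, 6), round(g_freq, 4))
```

Zero or negative values are rejected up front, since the logarithm is not defined for them.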
15. Merits
1. Based on all observations.
2. Rigidly defined.
Demerits
1. Difficult to calculate.
2. If some value is zero, the GM cannot be used.
3. If some observation is negative, the GM is not defined.
Note: the GM is used in very limited situations.
For example: for finding the average percentage increase in population over the years, or the average increase in production over the years, i.e. the growth rate.
16. Harmonic mean
In algebra, the harmonic mean is found in the case of a harmonic progression only. But in statistics the harmonic mean is a suitable measure of central tendency when the data pertain to speeds, rates and times.
The harmonic mean is the inverse of the arithmetic mean of the reciprocals of the observations of a set.
Suppose $X_1, X_2, X_3, \ldots, X_N$ are N variate values of a set; then the harmonic mean is given by:
For ungrouped data
$$H = \frac{N}{\frac{1}{X_1} + \frac{1}{X_2} + \frac{1}{X_3} + \cdots + \frac{1}{X_N}}$$
For grouped data
$$H = \frac{N}{\sum \frac{f}{X}}$$
where N is the sum of the frequencies.
17. Merits
Based on all observations.
Rigidly defined.
Tips and tricks:
Consider the unit of the variable under study. Suppose it is the ratio of two factors, a numerator factor and a denominator factor (for a speed in km/h, distance and time). If the numerator factor is constant, we use the harmonic mean. If the denominator factor is constant and the numerator factor is changing, then the arithmetic mean should be used.
Demerits
1. Difficult to calculate.
2. If some value is zero, then it is not defined.
3. Used in some limited situations.
For example: for finding the average speed, or the average price of a commodity in a particular situation.
Such as: a car goes from point A to B at a speed of 6 km/h and comes back from B to A at a speed of 7 km/h. Then the average speed will be
$$\text{Average speed} = \frac{2}{\frac{1}{6} + \frac{1}{7}} = \frac{84}{13} \approx 6.46 \text{ km/h}$$
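The round-trip example can be verified numerically. In the sketch below the one-way distance of 42 km is an assumed figure used only for the cross-check; the average speed does not depend on it.

```python
def harmonic_mean(values):
    """N divided by the sum of the reciprocals of the observations."""
    if any(v == 0 for v in values):
        raise ValueError("harmonic mean is not defined when a value is zero")
    return len(values) / sum(1 / v for v in values)


speeds = [6, 7]  # km/h: A to B, then B to A
average_speed = harmonic_mean(speeds)

# Cross-check: total distance divided by total time for an assumed 42 km each way.
distance = 42
total_time = distance / 6 + distance / 7  # 7 h out + 6 h back
check = 2 * distance / total_time

print(round(average_speed, 2), round(check, 2))
```

Both routes give 84/13, about 6.46 km/h, which is slightly below the arithmetic mean of 6.5 km/h because more time is spent at the slower speed.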
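18. Let us try a problem on the basis of our understanding so far:
A farmer purchased seeds at the rates of Rs. 30/kg, Rs. 35/kg and Rs. 40/kg over 3 consecutive years. Find the average price of the seeds in the following cases:
1. If he spent the same amount of money every year, say Rs. 600.
2. If he used the same amount of seed every year, say 6 kg.
Solution:
1. The money spent (numerator of Rs/kg) is constant, so we use the harmonic mean:
$$\text{Average price} = \frac{3}{\frac{1}{30} + \frac{1}{35} + \frac{1}{40}} = \frac{2520}{73} \approx 34.52 \text{ Rs/kg}$$
2. The quantity bought (denominator of Rs/kg) is constant, so we use the arithmetic mean:
$$\text{Average price} = \frac{30 + 35 + 40}{3} = 35 \text{ Rs/kg}$$
Remark: AM ≥ GM ≥ HM
The equality sign holds when all the variate values in the sample are the same; in that case all three means are equal.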
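A quick way to see why each case calls for a different mean is to compute the average price directly as total money spent divided by total seed bought. The sketch below does this for both cases and also checks the AM ≥ GM ≥ HM remark for the three prices; the variable names are chosen for this illustration.

```python
import math

prices = [30, 35, 40]  # Rs/kg in the three years

# Case 1: Rs. 600 spent every year.
money_per_year = 600
kg_bought = [money_per_year / p for p in prices]
avg_price_same_money = money_per_year * len(prices) / sum(kg_bought)

# Case 2: 6 kg bought every year.
kg_per_year = 6
money_spent = [kg_per_year * p for p in prices]
avg_price_same_quantity = sum(money_spent) / (kg_per_year * len(prices))

# The three means of the prices themselves.
am = sum(prices) / len(prices)
gm = math.prod(prices) ** (1 / len(prices))
hm = len(prices) / sum(1 / p for p in prices)

print(round(avg_price_same_money, 2), avg_price_same_quantity)
print(round(am, 2), round(gm, 2), round(hm, 2))
```

The first case reproduces the harmonic mean of the prices and the second the arithmetic mean, with the geometric mean falling between the two.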
19. Median
It has been pointed out that the mean cannot be calculated whenever there is a frequency distribution with open-ended intervals.
Also, the mean is affected to a great extent by the extreme values of the set of observations.
Suppose eight people are getting salaries of Rs. 150, 225, 240, 260, 275, 290, 300 and 1500. The mean salary of these eight people is Rs. 405. This value is not a good measure of central tendency because, out of the 8 people, 7 get Rs. 300 or less.
Hence the median is preferable.
• If the observations are arranged in increasing or decreasing order, the value of the middle term is known as the median.
• If there are two middle terms, then the median is the average of the two central values.
For ungrouped data:
Marks: 10, 20, 30, 40, 50, 60 (two middle terms)
$$\text{Median} = \frac{30 + 40}{2} = 35$$
If marks: 10, 20, 30, 40, 50, 60, 70 (one middle term)
Median = 40
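The two rules for the middle term translate directly into code. This is a minimal sketch with a hand-written `median` function (Python's standard `statistics.median` does the same job); it is applied to the salary example and to the two lists of marks.

```python
def median(values):
    """Middle value of the ordered data, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


salaries = [150, 225, 240, 260, 275, 290, 300, 1500]
mean_salary = sum(salaries) / len(salaries)
median_salary = median(salaries)

median_six = median([10, 20, 30, 40, 50, 60])
median_seven = median([10, 20, 30, 40, 50, 60, 70])
print(mean_salary, median_salary, median_six, median_seven)
```

For the salaries the mean is 405 while the median is 267.5, which sits among the seven ordinary salaries instead of being dragged up by the single value of 1500.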
20. For frequency distribution of ungrouped data
Suppose $X_1, X_2, X_3, X_4, \ldots, X_N$ have corresponding frequencies $f_1, f_2, f_3, \ldots, f_N$. The median can be worked out as follows:
Step 1: find the cumulative frequencies. {refer to the presentation #2 graphical representation and tabulation}
Step 2: find N/2, where N is the sum of the frequencies.
Step 3: search for the smallest cumulative frequency which contains this value N/2, i.e. which is greater than or equal to N/2. The variate value corresponding to this cumulative frequency is the median.
For grouped data
$$M_d = L + \frac{\frac{N}{2} - c}{f} \times h$$
Where L = lower limit of the median class
N = sum of frequencies
f = frequency of the median class
c = cumulative frequency of the class previous to the median class
h = width of the median class
Remark: the less-than type cumulative frequency is related to the upper limit of the C.I., whereas the more-than type is related to the lower limit of the C.I.
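The grouped-data formula can be sketched in code as follows. The frequency table is hypothetical, and h is taken to be the width of the median class, f its frequency and c the cumulative frequency before it, which is the usual reading of this formula. The median class is found as the first class whose cumulative frequency reaches N/2.

```python
# Hypothetical frequency table: ((lower limit, upper limit), frequency).
classes = [((0, 10), 5), ((10, 20), 8), ((20, 30), 12), ((30, 40), 10), ((40, 50), 5)]


def grouped_median(table):
    """Md = L + ((N/2 - c) / f) * h, evaluated on the median class."""
    n = sum(f for _, f in table)
    half = n / 2
    cumulative = 0  # c: cumulative frequency before the current class
    for (lower, upper), f in table:
        if f > 0 and cumulative + f >= half:
            h = upper - lower  # class width
            return lower + (half - cumulative) / f * h
        cumulative += f
    raise ValueError("frequency table is empty")


median_value = grouped_median(classes)
print(round(median_value, 2))  # 20 + (20 - 13) / 12 * 10
```

Here N = 40, so N/2 = 20; the cumulative frequencies are 5, 13, 25, 35, 40, which makes 20-30 the median class.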
21. Merits
The median is not affected by extreme values.
It can be found graphically by preparing the cumulative frequency curves in the following manner:
[Figure: cumulative frequency curves. The x-coordinate of the point where the curves cut each other is the median.]
Demerits
Not based on all observations.
Not capable of algebraic treatment.
22. Mode
The mode is the variate value which occurs most frequently in a set of values, or, we can say, the value for which the frequency is maximum.
Variate (X)      3    4    7    8    9    11   12
Frequency (f)    2    6    5    14   10   6    3
Clearly X = 8 has the maximum frequency, i.e. 14. Hence 8 is the modal value.
I: 12, 15, 12, 12, 13 : mode is 12
II: 15, 15, 12, 13, 12 : modes are 12 and 15 (bimodal)
III: 12, 11, 13, 10, 9 : mode does not exist
Comment: the mode is not rigidly defined.
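The three cases above (one mode, two modes, no mode) can be reproduced with a short sketch. The function `modes` is introduced here for illustration, and the rule that a set has no mode when every distinct value occurs equally often is the convention assumed in this sketch.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s); an empty list means no mode exists."""
    if not values:
        return []
    counts = Counter(values)
    if len(counts) > 1 and len(set(counts.values())) == 1:
        return []  # all distinct values equally frequent: no mode
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


set_1 = modes([12, 15, 12, 12, 13])
set_2 = modes([15, 15, 12, 13, 12])
set_3 = modes([12, 11, 13, 10, 9])

# Frequency table from the slide: variate value -> frequency.
frequency = {3: 2, 4: 6, 7: 5, 8: 14, 9: 10, 11: 6, 12: 3}
modal_value = max(frequency, key=frequency.get)

print(set_1, set_2, set_3, modal_value)
```

The different shapes of the three results are exactly why the mode is said not to be rigidly defined.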
23. For grouped data
$$\text{Mode} = L + \frac{f - f_1}{2f - f_1 - f_2} \times h$$
Where,
L = lower limit of the modal class
f = frequency of the modal class
$f_1$ = frequency of the class just before the modal class
$f_2$ = frequency of the class just after the modal class
h = width of the modal class
Modal class = the class for which the frequency is maximum
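A sketch of the grouped-data calculation is given below on a hypothetical frequency table. It multiplies the fraction by the class width h, as in the usual textbook form of this formula, and it assumes a frequency of 0 for a missing neighbour when the modal class is the first or the last class.

```python
# Hypothetical frequency table: ((lower limit, upper limit), frequency).
classes = [((0, 10), 5), ((10, 20), 8), ((20, 30), 12), ((30, 40), 10), ((40, 50), 5)]


def grouped_mode(table):
    """Mode = L + (f - f1) / (2f - f1 - f2) * h, evaluated on the modal class."""
    freqs = [f for _, f in table]
    i = freqs.index(max(freqs))  # first class with the maximum frequency
    (lower, upper), f = table[i]
    f1 = freqs[i - 1] if i > 0 else 0               # class just before
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # class just after
    denominator = 2 * f - f1 - f2
    if denominator == 0:
        raise ValueError("modal class and both neighbours have equal frequency")
    return lower + (f - f1) / denominator * (upper - lower)


mode_value = grouped_mode(classes)
print(round(mode_value, 2))  # 20 + (12 - 8) / (24 - 8 - 10) * 10
```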
25. Merits
Easy to calculate.
Not affected by extreme values.
We can find the mode graphically by preparing the histogram in the following manner.
[Figure: histogram. Point K on the x-axis is the mode.]
26. Demerits
Not rigidly defined.
Not based on all the observations.
Not capable of algebraic treatment.
Application
To find the average in the case of qualitative characters, e.g. the average color of a flower.
Average size of shoe.
Commonly used in marketing and business management to find the most popular items.
Note: we can also find the mode approximately using the following empirical relation:
Mode = 3 median - 2 arithmetic mean