Measures of central tendency

  1. Measures of Central Tendency

     The three important measures of central tendency are:
       1. The mean
       2. The median
       3. The mode

     Measures of Central Tendency Definition

     A measure of central tendency is a single value that describes the centre of a data set. The three most common measures are the mean, the median and the mode.

     Central Tendency of Data

     Mean: The mean of a set of numerical values is the arithmetic average of the values in the set. It is found by adding all the values in the data set and dividing the sum by the total number of values in the set.

     $\text{Mean} = \dfrac{\text{Sum of the data values}}{\text{Total number of data values}}$

     Median: For an ordered data set, the median is the value in the middle of the data distribution. If the set has an even number of data values, there are two middle values and the median is the average of these two.

     Mode: The mode is the most frequently occurring value in the data set.

     In addition to these three measures, another measure is also defined.

     Midrange: The midrange is a rough estimate of the average. It is the average of the lowest and highest values in the data set.

     $\text{Midrange} = \dfrac{\text{Lowest value} + \text{Highest value}}{2}$

     Because it uses only the lowest and highest values of the data set, the midrange is strongly affected when either of them is extreme.
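The four definitions above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: it assumes Python 3.8 or later (for `statistics.multimode`), and the helper name `midrange` and the sample list `data` are made up for the example.

```python
import statistics


def midrange(values):
    """Average of the lowest and highest values (a rough estimate of the centre)."""
    return (min(values) + max(values)) / 2


# Hypothetical sample data, chosen only for illustration.
data = [2, 4, 4, 5, 10]

mean_value = statistics.mean(data)      # sum of the values / number of values
median_value = statistics.median(data)  # middle value of the ordered data
modes = statistics.multimode(data)      # list of the most frequent value(s)
midrange_value = midrange(data)         # (lowest + highest) / 2

print(mean_value, median_value, modes, midrange_value)
```

`multimode` is used rather than `mode` because a data set can have more than one mode; it returns every value that shares the highest frequency.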
  2. Central Tendency Definition

     The term central tendency refers to the middle value of the data and is measured using the mean, median or mode. It is the tendency of the values of a random variable to cluster around a central value. A measure of central tendency summarizes a data distribution by indicating where its centre lies.

     Mean

     The mean of sample data is denoted by $\bar{x}$ and the population mean by $\mu$. The mean of a small data set can be found by adding all the data values and dividing the sum by the total number of values.

     Characteristics of Mean
       1. The mean is computed using all the values in the data set.
       2. The mean varies less across samples taken from the same population than the median or mode does.
       3. The mean is unique for a data set. It need not be one of the data values in the distribution.
       4. Other statistics, such as the variance, are computed using the mean.
       5. Of the three measures, the mean is affected the most by outliers in the data set. Hence the mean is a poor choice of central value for data sets containing outliers.

     The mean for grouped data is computed in the same way, with the midpoint of each class used as $x$.

     Solved Examples

     Question 1: The following data set gives the worth (in billions of dollars) of 10 hypothetical wealthy men. Find the mean worth of these 10 men.
     12.6, 13.7, 18.0, 18.0, 18.0, 20.0, 20.0, 41.2, 48.0, 60.0

     Solution: Given data: 12.6, 13.7, 18.0, 18.0, 18.0, 20.0, 20.0, 41.2, 48.0, 60.0
     Mean of the data set,
     $\bar{x} = \dfrac{12.6 + 13.7 + 18 + 18 + 18 + 20 + 20 + 41.2 + 48 + 60}{10}$
  3. $= \dfrac{269.5}{10} = 26.95$

     Question 2: Compute the mean for the distribution given below.

       Value x   Frequency f
       20        2
       29        4
       30        4
       39        3
       44        2

     Solution: The frequency table is rewritten with one more column, f * x.

       Value x   Frequency f   f * x
       20        2             40
       29        4             116
       30        4             120
       39        3             117
       44        2             88
       Total     Σf = 15       Σfx = 481

     Mean of the distribution: $\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{481}{15} \approx 32.1$ (rounded to the nearest tenth).

     Median
  4. When we say the median value of earnings of actuarial experts is 60,000 dollars, we mean that 50% of these experts earn less than 60,000 dollars and 50% earn more. Thus the median is the balancing point in an ordered data set. As the median marks the 50% point of a distribution, it is a measure of position as well. The median is often easier to find than the mean.

     Uses of Median
       1. The median is used if the analysis requires the middle value of the distribution.
       2. The median is used to determine whether given data values fall in the upper or lower half of the distribution.
       3. The median can be used even if the classes in the frequency distribution are open ended.
       4. The median is generally used as the central value when the data is likely to contain outliers.

     Solved Examples

     Question 1: The number of rooms in 11 hotels in a city is as follows: 380, 220, 555, 678, 756, 823, 432, 367, 546, 402, 347. Find the median number of rooms.

     Solution: The data is first arranged in ascending order:
     220, 347, 367, 380, 402, 432, 546, 555, 678, 756, 823.
     As the number of data elements, 11, is odd, there is only one middle value in the data array, which is the 6th.
     => The value in the 6th position = 432.
     Hence the median number of hotel rooms in the city = 432.

     Question 2: Find the median of the given data.

       Value x   Frequency f
       20        2
       29        4
       30        4
       39        3
       44        2

     Solution: Add a cumulative frequency column to the table, so that its columns are Value x, Frequency f and Cumulative frequency.
  5.   Value x   Frequency f   Cumulative frequency
       20        2             2
       29        4             2 + 4 = 6
       30        4             6 + 4 = 10
       39        3             10 + 3 = 13
       44        2             13 + 2 = 15
       Total     Σf = 15

     => Σf = 15 items, so the 8th item in the ordered data array is the median. The 8th item falls within cumulative frequency 10, which covers the 7th to 10th items. Hence the median of the distribution is the x value corresponding to cumulative frequency 10, which is 30.
     => Median of the data = 30.

     Mode

     The mode is the value or category that occurs most often in a data set.
       • If all the elements in the data set have the same frequency of occurrence, the distribution does not have a mode.
       • In a unimodal distribution, one value occurs more frequently than any other value.
       • In a bimodal distribution, two values share the highest frequency of occurrence.

     Characteristics of Mode:
       1. The mode is the easiest average to determine, and it is used when the most typical value is required as the central value.
       2. The mode can be found for nominal data sets as well.
       3. The mode need not be unique. A distribution can have more than one mode or no mode at all.

     Solved Example

     Question: Find the mode of the numerical data set
     109, 112, 109, 110, 109, 107, 104, 104, 104, 111, 111, 109, 109, 104, 104

     Solution:
  6. Given data: 109, 112, 109, 110, 109, 107, 104, 104, 104, 111, 111, 109, 109, 104, 104
     Total number of elements = 15
     Among the 15 data elements, the values 104 and 109 each occur five times, so both are modes of the data set.

     Effect of Transformations on Central Tendency

     If all the values in a data distribution are subjected to a common transformation, what is the effect on the measures of central tendency?
       • If each element in a data set is increased by a constant, the mean, median and mode of the resulting data set are obtained by adding the same constant to the corresponding values of the original data set.
       • If each element of a data set is multiplied by a constant, the mean, median and mode of the new data set are obtained by multiplying the corresponding values of the original data set by the same constant.

     Central Tendency and Dispersion

     Two kinds of statistics are frequently used to describe data: measures of central tendency and measures of dispersion (variability). They are often called descriptive statistics because they help us describe our data.

     Measures of Central Tendency and Dispersion

     Mean, median and mode are measures of central tendency, whereas range, variance and standard deviation are measures of dispersion.

     Different sets of numbers can have the same mean. Two measures of dispersion give an idea of how much the numbers in a set differ from the mean of the set: the variance of the set and the standard deviation of the set.

     Formula for Variance and Standard Deviation:
  7. For the set of numbers $\{x_1, x_2, \ldots, x_n\}$ with mean $\bar{x}$, the variance of the set is
     $V = \dfrac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}$
     and the standard deviation is
     $\sigma = \sqrt{V}$.
     The standard deviation can also be written as
     $\sigma = \sqrt{\dfrac{x_1^2 + x_2^2 + \cdots + x_n^2}{n} - \bar{x}^2}$

     Resistant Measures of Central Tendency

     A resistant measure is one that is less influenced by extreme data values (outliers). The mean is less resistant than the median, that is, the mean is more influenced by extreme data values. The effect of an outlier is shown in the following example.

     Solved Example

     Question: Consider the data set 5, 19, 19, 20, 21, 23, 23, 23, 24, 25. Compare the measures of central tendency with and without the outlier.

     Solution: The value 5 is an outlier, as it is much smaller than the other values in the distribution. Let us calculate the central values for the data set both excluding and including 5.

     Step 1: The data set excluding 5 is 19, 19, 20, 21, 23, 23, 23, 24, 25.
     Mean = $\bar{x} = \dfrac{19 + 19 + 20 + 21 + 23 + 23 + 23 + 24 + 25}{9} = \dfrac{197}{9} \approx 21.89$
     => Mean = 21.89
     Median = 23
     Mode = 23

     Step 2: For the data including the outlier: 5, 19, 19, 20, 21, 23, 23, 23, 24, 25
  8. Mean = $\bar{x} = \dfrac{5 + 19 + 19 + 20 + 21 + 23 + 23 + 23 + 24 + 25}{10} = \dfrac{202}{10} = 20.2$
     => Mean = 20.2
     Median = $\dfrac{21 + 23}{2} = 22$
     Mode = 23

     Step 3: Comparing the values found in step 1 and step 2, the mean falls from 21.89 to 20.2, the median falls from 23 to 22, and the mode stays at 23. The mean is the most affected and the mode the least affected by the inclusion of the outlier value 5.

     Central Tendency and Variability

     Central tendency is a statistical measure that represents a central entry of a data set. The problem is that no single measure always produces a central, representative value in every situation. There are three main measures of central tendency: mean, median and mode.

     Variability is an important feature of a frequency distribution. Range, variance and standard deviation are all measures of variability.

     Range: The simplest measure of variability is the range, which is the difference between the highest and the lowest scores.
     Standard deviation: The standard deviation describes the typical amount by which the scores differ from the mean. It is the square root of the variance.
     Variance: The variance is the mean of the squared differences from the mean, before the square root is taken to get the standard deviation.

     Central Limit Theorem

     A more formal, mathematical statement of the Central Limit Theorem is as follows. Suppose that $X_1, X_2, X_3, \ldots, X_n$ are independent and identically distributed with mean $\mu$ and finite variance $\sigma^2$. Define the random variable $U_n$ as
     $U_n = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$, where $\bar{X} = \dfrac{1}{n} \sum_{i=1}^{n} X_i$.
     Then the distribution function of $U_n$ converges to the standard normal distribution function as $n$ increases without bound.
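The convergence described in the theorem can be checked empirically with a small simulation. The sketch below is an illustration under stated assumptions, not part of the original slides: it draws the X_i from a Uniform(0, 1) distribution (mean 1/2, variance 1/12), and the function name `standardized_mean`, the seed and the sizes `n` and `trials` are arbitrary choices.

```python
import math
import random
import statistics


def standardized_mean(n, rng):
    """Draw n Uniform(0, 1) values and return U_n = (mean - mu) / (sigma / sqrt(n))."""
    mu = 0.5                    # mean of Uniform(0, 1)
    sigma = math.sqrt(1 / 12)   # standard deviation of Uniform(0, 1)
    sample = [rng.random() for _ in range(n)]
    return (statistics.fmean(sample) - mu) / (sigma / math.sqrt(n))


rng = random.Random(0)          # fixed seed so the run is repeatable
n, trials = 50, 2000            # arbitrary illustrative sizes
u_values = [standardized_mean(n, rng) for _ in range(trials)]

# For a standard normal variable, about 95% of values lie within +/-1.96.
share_within = sum(abs(u) <= 1.96 for u in u_values) / trials

print(f"mean of U_n:          {statistics.fmean(u_values):.3f}")
print(f"std deviation of U_n: {statistics.pstdev(u_values):.3f}")
print(f"share within +/-1.96: {share_within:.3f}")
```

If the theorem holds, the printed mean should be close to 0, the standard deviation close to 1, and the share close to 0.95, even though the underlying uniform distribution is not normal.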
  9. Central Tendency Examples

     A measure of central tendency is a value that represents a central entry of a data set. The central tendency of the data can be described by its mean, median and mode. Some examples follow.

     Solved Examples

     Question 1: Find the mean, median and mode of the given data.
     10, 12, 34, 34, 45, 23, 42, 36, 34, 22, 20, 27, 33

     Solution: Given data, X = 10, 12, 34, 34, 45, 23, 42, 36, 34, 22, 20, 27, 33
     ΣX = 10 + 12 + 34 + 34 + 45 + 23 + 42 + 36 + 34 + 22 + 20 + 27 + 33 = 372
     => ΣX = 372

     Step 1: Mean = $\dfrac{\sum X}{n} = \dfrac{372}{13} \approx 28.6$, where $n$ is the total number of terms.
     => Mean = 28.6

     Step 2: For the median, arrange the data in ascending order.
  10. 10, 12, 20, 22, 23, 27, 33, 34, 34, 34, 36, 42, 45
      The median is the 7th of the 13 ordered values, which is 33. Half of the remaining values fall above this number and half fall below.
      => Median = 33

      Step 3: Mode = 34, because 34 occurs the most times (three).

      Question 2: The following table shows the sport activities of 2400 students. Find the mode.

        Sport         Frequency
        Swimming      423
        Tennis        368
        Gymnastics    125
        Basketball    452
        Baseball      380
        Athletics     275
        None          377

      Solution: For data grouped into classes or categories, the one with the highest frequency is called the modal class. In a bar graph of the data, the category with the tallest bar is the mode. Basketball has the highest frequency, 452. Hence basketball is the mode of the sport activities.
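The two solved questions above can be reproduced with Python's standard library. This is a minimal sketch rather than part of the original slides; the variable names are illustrative, and the frequency table is stored in a `collections.Counter` purely for convenience.

```python
import statistics
from collections import Counter

# Question 1 data from the text above.
values = [10, 12, 34, 34, 45, 23, 42, 36, 34, 22, 20, 27, 33]

mean_value = statistics.mean(values)      # 372 / 13
median_value = statistics.median(values)  # 7th of the 13 sorted values
mode_value = statistics.mode(values)      # most frequent value

# Question 2 frequency table: for categorical data, the mode is the
# category with the highest frequency.
sport_counts = Counter({
    "Swimming": 423,
    "Tennis": 368,
    "Gymnastics": 125,
    "Basketball": 452,
    "Baseball": 380,
    "Athletics": 275,
    "None": 377,
})
modal_sport, modal_frequency = sport_counts.most_common(1)[0]

print(round(mean_value, 1), median_value, mode_value)
print(modal_sport, modal_frequency)
```

Note that only the mode is defined for the sports table: the categories have no numeric value or natural order, so a mean or median would be meaningless there.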
  11. Question 3: Find the median of the distribution 223, 227, 240, 211, 212, 209, 211, 213, 240, 229.

      Solution: The ordered data array is:
      209, 211, 211, 212, 213, 223, 227, 229, 240, 240
      The number of data values, 10, is even. Hence the two central values are those in the 5th and 6th positions.
      Median = $\dfrac{213 + 223}{2} = \dfrac{436}{2} = 218$
      => Median = 218.

      http://math.tutorvista.com/statistics/central-tendency.html
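The odd and even cases of the median rule used in the solved examples can be written out explicitly. The function below is a hand-rolled sketch for illustration (the standard library's `statistics.median` does the same job); the name `median` and the variable names are choices made for this example, and the two data sets are the ones from the median questions in these slides.

```python
def median(values):
    """Return the middle value, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value.
        return ordered[mid]
    # Even count: average the two central values.
    return (ordered[mid - 1] + ordered[mid]) / 2


question_3 = [223, 227, 240, 211, 212, 209, 211, 213, 240, 229]
hotel_rooms = [380, 220, 555, 678, 756, 823, 432, 367, 546, 402, 347]

print(median(question_3))   # even count: average of the 5th and 6th values
print(median(hotel_rooms))  # odd count: the 6th value
```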