Unit 1 Introduction

Rai University
18 Mar 2015

  1. Unit-1 Statistics
     Definition 1: Statistics is the study of the collection, analysis, interpretation, presentation, and organization of data. In applying statistics to a scientific, industrial, or societal problem, it is necessary to begin with a population or process to be studied. Populations can be as diverse as "all persons living in a country" or "every atom composing a crystal". Statistics deals with all aspects of data, including the planning of data collection in terms of the design of surveys and experiments.
     Definition 2: Statistics is a science of facts and figures and nothing beyond that. It is the measurement of data and the expression of the same in numerical form.
     Uses of statistics:
     1. It is more quantitative than qualitative
     2. Statistical method deals with two fundamental principles
     3. Statistical unit
     4. Statistical data must be manipulated
     5. Presentation of statistical data with the help of line-diagrams
     1. It is more quantitative than qualitative: Social statistics that describe an area must be numerical in nature, so that the tendency of a project can be measured. A figure such as a percentage is also understood quickly by everyone who hears it. Numerical data is therefore easy to record and easy to understand.
     2. Statistical method deals with two fundamental principles:
     - Fundamental regularity based on mathematical probability
     - It speaks to the capacity of the researcher
     Fundamental regularity based on mathematical probability: It states that every social phenomenon is influenced by a large number of
  2. variables, which are correlated and interrelated, and statistics is used to study this correlation. Therefore the theory of probability, linear programming and shadow prices are used to uncover the underlying reality.
     It speaks to the capacity of the researcher: To substantiate findings and conclusions, statistical terminology is necessary, and it protects the researcher or scholar from challenges to the work. It is the data, facts and figures that show the capacity of the researcher. The skills and resources used by the researcher must be reflected in the research findings.
     3. Statistical units: A statistical unit has four characteristics:
     - Appropriateness
     - Clarity
     - Measurability
     - Comparability
     4. Statistical data must be manipulated: Statistical data must be manipulated, divided and totaled in order to formulate conclusions.
     5. Presentation of statistical data with the help of line-diagrams: Statistical data is presented with the help of line-diagrams, graphs, charts, histograms, frequency distributions, pie-diagrams, etc.
     Limitations of statistics: Statistics is indispensable to almost all sciences: social, physical and natural. It is used in most spheres of human activity. In spite of the wide scope of the subject, it has certain limitations. Some important limitations of statistics are the following:
     1. Statistics does not study qualitative phenomena: Statistics deals with facts and figures, so the quality aspect of a variable, or a subjective phenomenon, falls outside its scope. For example, qualities like beauty, honesty and intelligence cannot be expressed numerically, so these characteristics cannot be examined statistically. This limits the scope of the subject.
     2. Statistical laws are not exact:
  3. Statistical laws are not exact in the way the laws of the natural sciences are. They are true only on average and hold under certain conditions, so they cannot be applied universally. This reduces the practical utility of statistics.
     3. Statistics does not study individuals: Statistics deals with aggregates of facts. Single or isolated figures are not statistics. This is considered a major handicap of statistics.
     4. Statistics can be misused: Statistics is mostly a tool of analysis. Statistical techniques are used to analyze and interpret the information collected in an enquiry. By itself, statistics does not prove or disprove anything; it is just a means to an end. Statements supported by statistics are more appealing and are commonly believed, and for this reason statistics is often misused. Statistical methods rightly used are beneficial, but if misused they become harmful. Statistical methods in less expert hands will lead to inaccurate results. Here the fault lies not with the subject of statistics but with the person who makes wrong use of it.
     Frequency Distribution
     Frequency: Frequency is how often something occurs.
     Example: Sam played football on Saturday morning, Saturday afternoon and Thursday afternoon. The frequency was 2 on Saturday, 1 on Thursday and 3 for the whole week.
     Frequency distribution: By counting frequencies we can make a frequency distribution table.
     Example: Goals. Sam's team has scored the following numbers of goals in recent games:
     2, 3, 1, 2, 1, 3, 2, 3, 4, 5, 4, 2, 2, 3
     Sam put the numbers in order, then added up:
     - how often 1 occurs (2 times),
     - how often 2 occurs (5 times),
     - etc.,
     and wrote them down as a frequency distribution table:
     Score   Frequency
     1       2
     2       5
     3       4
     4       2
     5       1
  4. From the table we can see interesting things, such as:
     - getting 2 goals happens most often
     - only once did they get 5 goals
     Frequency distribution: values and their frequency (how often each value occurs).
     Example: Newspapers. These are the numbers of newspapers sold at a local shop over the last 10 days:
     22, 20, 18, 23, 20, 25, 22, 20, 18, 20
     Let us count how many of each number there are:
     Papers sold   Frequency
     18            2
     19            0
     20            4
     21            0
     22            2
     23            1
     24            0
     25            1
     It is also possible to group the values. Here they are grouped in 5s:
     Papers sold   Frequency
     15-19         2
     20-24         7
     25-29         1
     Frequency curve: A frequency curve is the smooth curve obtained as the limiting case of a histogram of a frequency distribution of a continuous variable, as the number of data points becomes very large.
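     The counting in the newspaper example can be sketched in Python. This is only an illustration: the function names `frequency_table` and `grouped_frequency_table` are made up for this sketch, and the grouping assumes integer values that are not smaller than the chosen starting value.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, frequency) pairs for every integer from min to max."""
    counts = Counter(values)
    return [(v, counts.get(v, 0)) for v in range(min(values), max(values) + 1)]


def grouped_frequency_table(values, start, width):
    """Count values in consecutive classes of the given width, beginning at start.

    Assumes integer values with every value >= start (hypothetical helper).
    """
    groups = Counter((v - start) // width for v in values)
    last = max(groups)
    return [
        (f"{start + k * width}-{start + (k + 1) * width - 1}", groups.get(k, 0))
        for k in range(last + 1)
    ]


papers_sold = [22, 20, 18, 23, 20, 25, 22, 20, 18, 20]

# Ungrouped table: one row per possible value, including zero counts.
for value, freq in frequency_table(papers_sold):
    print(value, freq)

# Grouped table: classes 15-19, 20-24, 25-29.
for label, freq in grouped_frequency_table(papers_sold, start=15, width=5):
    print(label, freq)
```

     Listing every value between the minimum and maximum keeps the zero-frequency rows (19, 21, 24) visible, as in the table from the slide.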
  5. Measures of Central Tendency
     Introduction: A measure of central tendency is a single value that attempts to describe a set of data by identifying the central position within that set of data. Measures of central tendency are sometimes called measures of central location. They are also classed as summary statistics. The mean (often called the average) is most likely the measure of central tendency that you are most familiar with, but there are others, such as the median and the mode. The mean, median and mode are all valid measures of central tendency.
     Mean (Arithmetic): The mean (or average) is the most popular and well-known measure of central tendency. It can be used with both discrete and continuous data, although it is most often used with continuous data. The mean is equal to the sum of all the values in the data set divided by the number of values in the data set. So, if we have $n$ values in a data set and they have values $x_1, x_2, x_3, \dots, x_n$, the sample mean, usually denoted by $\bar{x}$ (pronounced "x bar"), is:
     $\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$
     This formula is usually written in a slightly different manner using the Greek capital letter $\sum$, pronounced "sigma", which means "sum of...":
     $\bar{x} = \frac{\sum x}{n}$
     When not to use the mean: The mean has one main disadvantage: it is particularly susceptible to the influence of outliers. These are values that are unusual compared to the rest of the data set by being especially small or large in numerical value. For example, consider the wages of staff at a factory below:
  6. Staff    1     2     3     4     5     6     7     8     9     10
     Salary   15k   18k   16k   14k   15k   15k   12k   17k   90k   95k
     The mean salary for these ten staff is $30.7k. However, inspecting the raw data suggests that this mean value might not be the best way to reflect the typical salary of a worker, as most workers have salaries in the $12k to $18k range. The mean is being skewed by the two large salaries. Therefore, in this situation, we would like to have a better measure of central tendency.
     Median: The median is the middle score for a set of data that has been arranged in order of magnitude. If the number of values is even, the average of the two middle values is taken. In cases like the salary data above, the median is better for describing the typical value.
     Example: In order to calculate the median, suppose we have the data below:
     65 55 89 56 35 14 56 55 87 45 92
     We first need to rearrange that data into order of magnitude (smallest first):
     14 35 45 55 55 56 56 65 87 89 92
     Our median mark is the middle mark, the sixth of the eleven values: in this case, 56.
     Mode: The mode is the most frequent score in our data set.
     What will happen to the measures of central tendency if we add the same amount to all data values, or multiply each data value by the same amount?
  7. Data                                                                    Mean   Mode   Median
     Original data set: 6, 7, 8, 10, 12, 14, 14, 15, 16, 20                  12.2   14     13
     Add 3 to each data value: 9, 10, 11, 13, 15, 17, 17, 18, 19, 23         15.2   17     16
     Multiply each data value by 2: 12, 14, 16, 20, 24, 28, 28, 30, 32, 40   24.4   28     26
     When added: Since all values are shifted by the same amount, the measures of central tendency are all shifted by the same amount. If you add 3 to each data value, you will add 3 to the mean, mode and median.
     When multiplied: Since all values are affected by the same multiplicative factor, the measures of central tendency feel the same effect. If you multiply each data value by 2, you will multiply the mean, mode and median by 2.
     Example 1: Find the mean, median and mode for the following data: 5, 15, 10, 15, 5, 10, 10, 20, 25, 15.
     Answer: (You will need to organize the data.) 5, 5, 10, 10, 10, 15, 15, 15, 20, 25
     Mean: $\frac{\text{sum of data}}{\text{number of data}} = \frac{130}{10} = 13$
     Median: 5, 5, 10, 10, 10, 15, 15, 15, 20, 25. Listing the data in order is the easiest way to find the median. The numbers 10 and 15 both fall in the middle. Average these two numbers to get the median: $\frac{10 + 15}{2} = 12.5$
     Mode: Two numbers appear most often: 10 and 15. There are three 10's and three 15's. In this example there are two answers for the mode.
     Example 2: For what value of x will 8 and x have the same mean (average) as 27 and 5?
     Answer:
  8. First, find the mean of 27 and 5: $\frac{27 + 5}{2} = 16$
     Now find the value of x, knowing that the average of x and 8 must be 16:
     $\frac{x + 8}{2} = 16 \Rightarrow 32 = x + 8$ (multiply both sides by 2) $\Rightarrow x = 32 - 8 = 24$
     Example 3: On his first 5 biology tests, Bob received the following scores: 72, 86, 92, 63, and 77. What test score must Bob earn on his sixth test so that his average (mean score) for all six tests will be 80? Show how you arrived at your answer.
     Answer: Possible solution: Set up an equation to represent the situation. Remember to use all 6 test scores:
     $\frac{72 + 86 + 92 + 63 + 77 + x}{6} = 80$
     Multiply both sides by 6 and solve: $(80)(6) = 390 + x \Rightarrow 480 = 390 + x \Rightarrow x = 480 - 390 = 90$
     Example 4: The mean (average) weight of three dogs is 38 pounds. One of the dogs, Sparky, weighs 46 pounds. The other two dogs, Eddie and Sandy, have the same weight. Find Eddie's weight.
     Answer: Let x = Eddie's weight and let x = Sandy's weight (they weigh the same, so both are represented by "x").
     Average: sum of the data divided by the number of data.
     $\frac{x + x + 46}{3} = 38$ (3 dogs). Multiply both sides by 3 and solve:
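  9. $(38)(3) = 2x + 46 \Rightarrow 114 = 2x + 46 \Rightarrow 2x = 114 - 46 \Rightarrow x = \frac{68}{2} = 34$
     $\therefore$ Eddie weighs 34 pounds.
     For class intervals:
     $\text{Median} = L + \left( \frac{\frac{N}{2} - cf}{f} \right) \times i$
     where
     $L$ = lower limit of the median class
     $N$ = total number of data items
     $cf$ = cumulative frequency of the classes before the median class
     $f$ = frequency of the median class
     $i$ = the class interval (width) of the median class
     $\text{Mode} = a + \frac{C (f_i - f_{i-1})}{2 f_i - f_{i-1} - f_{i+1}}$
     where
     $a$ = lower limit of the modal class
     $f_i$ = maximum frequency (the frequency of the modal class), with $f_{i-1}$ and $f_{i+1}$ the frequencies of the classes just before and just after it
     $C$ = constant difference (width) for each class
     Question: Find the median of the following data.
     Cost               10-20   20-30   30-40   40-50   50-60
     Items in a group   4       5       3       6       3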
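  10. Solution:
     Cost    Number of items in the group   Cumulative frequency
     10-20   4                              4
     20-30   5                              9
     30-40   3                              12
     40-50   6                              18
     50-60   3                              21
     Here $N = 21 \Rightarrow \frac{N}{2} = 10.5$. The cumulative frequency first reaches 10.5 in the class 30-40, so the median class is 30-40. From the formula,
     $\text{Median} = L + \left( \frac{\frac{N}{2} - cf}{f} \right) \times i$
     with $L = 30$, $i = 10$, $cf = 9$ and $f = 3$ (the frequency of the median class, not its cumulative frequency):
     $\text{Median} = 30 + \frac{10.5 - 9}{3} \times 10 = 30 + 5 = 35$
     Question: Find the mode of the following distribution:
     Class interval   0-10   10-20   20-30   30-40   40-50   50-60   60-70   70-80
     Frequency        5      9       8       12      28      20      12      11
     Solution: Maximum frequency = 28, modal class = 40-50. From the formula,
     $\text{Mode} = a + \frac{C (f_i - f_{i-1})}{2 f_i - f_{i-1} - f_{i+1}}$
     with $a = 40$, $C = 10$, $f_i = 28$, $f_{i-1} = 12$, $f_{i+1} = 20$:
     $\text{Mode} = 40 + \frac{10 (28 - 12)}{(2 \times 28) - 12 - 20} = 40 + 6.667 = 46.667$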
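     The two class-interval formulas can be tried out with a minimal Python sketch. The helper names `grouped_median` and `grouped_mode` are hypothetical, and the sketch assumes equal-width classes listed in ascending order, with a single modal class whose frequency differs from at least one neighbour.

```python
def grouped_median(lower_limits, width, freqs):
    """Median = L + ((N/2 - cf) / f) * i for equal-width classes."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("empty distribution")
    half = n / 2
    cumulative = 0  # cf: total frequency of the classes before the current one
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= half:
            # This is the median class: f is its own (not cumulative) frequency.
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("median class not found")


def grouped_mode(lower_limits, width, freqs):
    """Mode = a + C (f_i - f_{i-1}) / (2 f_i - f_{i-1} - f_{i+1})."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])  # first class with the highest frequency
    f_i = freqs[i]
    f_prev = freqs[i - 1] if i > 0 else 0
    f_next = freqs[i + 1] if i + 1 < len(freqs) else 0
    return lower_limits[i] + width * (f_i - f_prev) / (2 * f_i - f_prev - f_next)


# Cost data from the median question.
cost_lowers = [10, 20, 30, 40, 50]
cost_freqs = [4, 5, 3, 6, 3]
print(grouped_median(cost_lowers, 10, cost_freqs))

# Class-interval data from the mode question.
mode_lowers = [0, 10, 20, 30, 40, 50, 60, 70]
mode_freqs = [5, 9, 8, 12, 28, 20, 12, 11]
print(grouped_mode(mode_lowers, 10, mode_freqs))
```

     Keeping the running total separate from the class's own frequency makes it harder to confuse $cf$ with $f$ when substituting into the median formula.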
  11. FAQs - Measures of Central Tendency
     What is the best measure of central tendency? There can often be a "best" measure of central tendency with regard to the data you are analyzing, but there is no one "best" measure of central tendency. This is because whether you use the median, mean or mode will depend on the type of data you have, such as nominal or continuous data; whether your data has outliers and/or is skewed; and what you are trying to show from your data.
     In a strongly skewed distribution, what is the best indicator of central tendency? It is usually inappropriate to use the mean in situations where your data is skewed. You would normally choose the median or mode, with the median usually preferred. This is discussed above under "When not to use the mean".
     Does all data have a median, mode and mean? Yes and no. All continuous data has a median, mode and mean. However, strictly speaking, ordinal data has a median and mode only, and nominal data has only a mode. A consensus has not been reached among statisticians about whether the mean can be used with ordinal data, and you can often see a mean reported for Likert data in research.
     When is the mean the best measure of central tendency? The mean is usually the best measure of central tendency to use when your data distribution is continuous and symmetrical, such as when your data is normally distributed. However, it all depends on what you are trying to show from your data.
     When is the mode the best measure of central tendency? The mode is the least used of the measures of central tendency and is the only one that can be used with nominal data. For this reason, the mode will be the best measure of central tendency (as it is the only one appropriate to use) when
  12. dealing with nominal data. The mean and/or median are usually preferred when dealing with all other types of data, but this does not mean the mode is never used with these data types.
     When is the median the best measure of central tendency? The median is usually preferred to other measures of central tendency when your data set is skewed (i.e., forms a skewed distribution) or you are dealing with ordinal data. The mode can also be appropriate in these situations, but it is not as commonly used as the median.
     What is the most appropriate measure of central tendency when the data has outliers? The median is usually preferred in these situations because the value of the mean can be distorted by the outliers. However, it will depend on how influential the outliers are. If they do not significantly distort the mean, using the mean as the measure of central tendency will usually be preferred.
     In a normally distributed data set, which is greatest: mode, median or mean? If the data set is perfectly normal, the mean, median and mode are equal to each other (i.e., the same value).
     For any data set, which measures of central tendency have only one value? The median and mean can only have one value for a given data set. The mode can have more than one value.
     MERITS AND DEMERITS OF MEAN, MEDIAN AND MODE
     MEAN: The arithmetic mean (or simply "mean") of a sample is the sum of the sampled values divided by the number of items in the sample.
     MERITS OF ARITHMETIC MEAN
     1. The arithmetic mean is rigidly defined by an algebraic formula.
     2. It is easy to calculate and simple to understand.
  13. 3. It is based on all observations and can be regarded as representative of the given data.
     4. It is capable of being treated mathematically and hence is widely used in statistical analysis.
     5. The arithmetic mean can be computed even if the detailed distribution is not known, provided the sum of the observations and the number of observations are known.
     6. It is least affected by fluctuations of sampling.
     DEMERITS OF ARITHMETIC MEAN
     1. It can be determined neither by inspection nor by graphical location.
     2. The arithmetic mean cannot be computed for qualitative data, such as data on intelligence, honesty or smoking habit.
     3. It is affected too much by extreme observations, and hence it does not adequately represent data containing some extreme points.
     4. The arithmetic mean cannot be computed when class intervals have open ends.
     MEDIAN: The median is that value of the series which divides the group into two equal parts, one part comprising all values greater than the median value and the other part comprising all values smaller than the median value.
     MERITS OF MEDIAN
     (1) Simplicity: It is a very simple measure of the central tendency of the series. In the case of a simple statistical series, just a glance at the data is enough to locate the median value.
     (2) Free from the effect of extreme values: Unlike the arithmetic mean, the median value is not distorted by the extreme values of the series.
  14. (3) Certainty: Certainty is another merit of the median. The median is always a certain, specific value in the series.
     (4) Real value: The median is a real value and is a better representative value of the series than the arithmetic mean, the value of which may not exist in the series at all.
     (5) Graphic presentation: Besides the algebraic approach, the median value can also be estimated through the graphic presentation of data.
     (6) Possible even when data is incomplete: The median can be estimated even in the case of certain incomplete series. It is enough if one knows the number of items and the middle item of the series.
     DEMERITS OF MEDIAN
     Following are the various demerits of the median:
     (1) Lack of representative character: The median fails to be a representative measure for series whose values are wide apart from each other. The median is also of limited representative character because it is not based on all the items in the series.
     (2) Unrealistic: When the median is located somewhere between the two middle values, it remains only an approximate measure, not a precise value.
     (3) Lack of algebraic treatment: The arithmetic mean is capable of further algebraic treatment, but the median is not. For example, multiplying the median by the number of items in the series will not give the sum total of the values of the series.
     However, the median is quite a simple method of finding an average of a series. It is commonly used for series that relate to qualitative observations, such as the health of students.
  15. MODE: The value of the variable which occurs most frequently in a distribution is called the mode.
     MERITS OF MODE
     Following are the various merits of the mode:
     (1) Simple and popular: The mode is a very simple measure of central tendency. Sometimes just a look at the series is enough to locate the modal value. Because of its simplicity, it is a very popular measure of central tendency.
     (2) Less effect of marginal values: Compared to the mean, the mode is less affected by marginal values in the series. The mode is determined only by the value with the highest frequency.
     (3) Graphic presentation: The mode can be located graphically, with the help of a histogram.
     (4) Best representative: The mode is the value which occurs most frequently in the series. Accordingly, the mode is the best representative value of the series.
     (5) No need of knowing all the items or frequencies: The calculation of the mode does not require knowledge of all the items and frequencies of a distribution. In a simple series, it is enough if one knows the items with the highest frequencies in the distribution.
     DEMERITS OF MODE
     Following are the various demerits of the mode:
     (1) Uncertain and vague: The mode is an uncertain and vague measure of central tendency.
     (2) Not capable of algebraic treatment: Unlike the mean, the mode is not capable of further algebraic treatment.
     (3) Difficult: When the frequencies of all items are identical, it is difficult to identify
  16. the modal value.
     (4) Complex procedure of grouping: Calculation of the mode involves a cumbersome procedure of grouping the data. If the extent of grouping changes, there will be a change in the modal value.
     (5) Ignores extreme marginal frequencies: It ignores extreme marginal frequencies. To that extent, the modal value is not a representative value of all the items in a series.
     Besides, one can question the representative character of the modal value, as its calculation does not involve all items of the series.
     Dispersion
     In statistics, dispersion (also called variability, scatter, or spread) denotes how stretched or squeezed a distribution is (theoretical or that underlying a statistical sample). Common examples of measures of statistical dispersion are the variance, standard deviation and interquartile range. Dispersion is contrasted with location or central tendency, and together they are the most used properties of distributions.
     Measures of dispersion: The set of constants which explain, in a concise way, the "variability" or "scatter" in data are called "measures of dispersion or variability". The averages of two groups with the same number of measurements may be equal, but one group may be more variable than the other. For example, the set of five values 5, 6, 7, 8, 9 has mean 7, while the other set of five values 1, 6, 4, 10, 14 also has mean 7. The second set has more variability than the first. Usually four measures of dispersion or variability are defined.
     Range: The range is the difference between the two extreme values. In a frequency distribution,
     $R = (\text{largest } x \text{ value}) - (\text{smallest } x \text{ value})$
  17. Example: In {4, 6, 9, 3, 7} the lowest value is 3 and the highest is 9, so the range is 9 - 3 = 6.
     Quartile deviation: The median bisects the distribution. If the distribution is divided into four parts, quartiles are obtained. The first quartile is $Q_1$ and the third quartile is $Q_3$.
     $Q_1 = l + \frac{\frac{N}{4} - cf}{f} \times C$
     $Q_3 = l + \frac{\frac{3N}{4} - cf}{f} \times C$
     where $l$ = lower limit of the quartile class, $cf$ = cumulative frequency of the classes before the quartile class, $f$ = frequency of the quartile class, and $C$ = class width (common factor).
     The quartile deviation is defined as $Q.D. = \frac{1}{2} (Q_3 - Q_1)$
     Average deviation: If the chosen average is $A$, then the average of the absolute deviations about $A$ is the average deviation.
     $A.D.(A) = \frac{1}{n} \sum |x_i - A|$ for discrete data
     $A.D.(A) = \frac{1}{N} \sum f_i |x_i - A|$ for a frequency distribution
     Standard deviation:
     $\sigma = \sqrt{\frac{1}{n} \sum (x_i - \bar{x})^2}$ for discrete data
     $\sigma = \sqrt{\frac{1}{N} \sum f_i (x_i - \bar{x})^2}$ for a frequency distribution
     The square of the standard deviation, $\sigma^2$, is defined as the variance ($V$):
     $V = \sigma^2 = \frac{1}{N} \sum f_i (x_i - \bar{x})^2$
     Coefficient of variation: In probability theory and statistics, the coefficient of variation (CV) is a standardized measure of dispersion of a probability distribution or frequency distribution. It is defined as the ratio of the standard deviation $\sigma$ to the mean $\mu$. It is also known as unitized risk or the variation coefficient. The absolute value of the CV is sometimes known as the relative standard deviation (RSD), which is expressed as a percentage.
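     As a rough illustration of the dispersion measures defined above, the following Python sketch computes the range, average deviation, variance, standard deviation and coefficient of variation for the two five-value sets used earlier (both with mean 7). The function names are hypothetical, and the sketch assumes raw (ungrouped) data with every sum divided by the number of values $n$.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def data_range(xs):
    """Largest value minus smallest value."""
    return max(xs) - min(xs)


def average_deviation(xs, about):
    """Mean of the absolute deviations about a chosen average."""
    return sum(abs(x - about) for x in xs) / len(xs)


def variance(xs):
    """Mean squared deviation from the mean (dividing by n, an assumption of this sketch)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def standard_deviation(xs):
    return math.sqrt(variance(xs))


def coefficient_of_variation(xs):
    """Standard deviation divided by the mean."""
    return standard_deviation(xs) / mean(xs)


first = [5, 6, 7, 8, 9]
second = [1, 6, 4, 10, 14]

for name, xs in (("first", first), ("second", second)):
    print(
        name,
        data_range(xs),
        average_deviation(xs, mean(xs)),
        variance(xs),
        round(standard_deviation(xs), 4),
        round(coefficient_of_variation(xs), 4),
    )
```

     Every measure comes out larger for the second set, which matches the observation that it is more variable even though the two means are equal.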
  18. Definition: The coefficient of variation (CV) is defined as the ratio of the standard deviation $\sigma$ to the mean $\mu$:
     $C_v = \frac{\sigma}{\mu}$
     It shows the extent of variability in relation to the mean of the population.
     Example: The owner of a restaurant is interested in how much people spend at the restaurant. He examines 10 randomly selected receipts for parties of four and writes down the following data:
     44, 50, 38, 96, 42, 47, 40, 39, 46, 50
     Find the mean, standard deviation and variance.
     Solution: The mean is calculated by adding the values and dividing by 10: $\bar{x} = \frac{492}{10} = 49.2$
     The following table is used to find the standard deviation:
     x       x - 49.2   (x - 49.2)^2
     44      -5.2       27.04
     50      0.8        0.64
     38      -11.2      125.44
     96      46.8       2190.24
     42      -7.2       51.84
     47      -2.2       4.84
     40      -9.2       84.64
     39      -10.2      104.04
     46      -3.2       10.24
     50      0.8        0.64
     Total              2599.60
  19. Because the ten receipts are a sample, the sum of squared deviations is divided by $n - 1$ rather than $n$:
     $\sigma = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2} = \sqrt{\frac{2599.6}{10 - 1}} = \sqrt{\frac{2599.6}{9}} = \sqrt{288.84} \approx 16.995 \approx 17$
     Variance $= \sigma^2 = 288.84$
     Coefficient of variation (C.V.): $C_v = \frac{\sigma}{\mu} = \frac{16.995}{49.2} \approx 0.3454$
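     The restaurant example can be recomputed from the raw receipts with Python's standard `statistics` module. This is a sketch under one stated assumption: the receipts are treated as a sample, so `statistics.variance` and `statistics.stdev` (which divide by $n - 1$, as the worked arithmetic does) are used rather than the population versions `pvariance` and `pstdev`.

```python
import statistics

receipts = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]

mean = statistics.mean(receipts)

# Sum of squared deviations from the mean (the "Total" row of the table).
sum_of_squares = sum((x - mean) ** 2 for x in receipts)

# Sample variance and standard deviation: both divide by n - 1.
sample_variance = statistics.variance(receipts)
sample_sd = statistics.stdev(receipts)

# Coefficient of variation: standard deviation relative to the mean.
cv = sample_sd / mean

print("mean:", mean)
print("sum of squared deviations:", round(sum_of_squares, 2))
print("variance:", round(sample_variance, 2))
print("standard deviation:", round(sample_sd, 3))
print("coefficient of variation:", round(cv, 4))
```

     Recomputing the total from the raw data is a useful check on hand-built deviation tables, where a single sign or rounding slip carries through to every later figure.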