Business Statistics & Research Methods For Assistant Professor Exam

  1. Hypothesis – Concept, Example, Meaning, Definition, Formulation & Types ◦ Critical Regions – One-tailed & two-tailed ◦ Types of errors – Type I & Type II ◦ Parametric statistical tests – Meaning, Example & Types ◦ Non-parametric statistical tests – Meaning, Example & Types ◦ Differences between Parametric & Non-parametric tests
  2. COMMON STATEMENTS WE COME ACROSS (i) A refrigerator of a certain brand saves up to 20% on the electricity bill, (ii) a motorcycle of a certain brand gives 60 km/liter mileage, (iii) a detergent of a certain brand produces the cleanest wash, (iv) ninety-nine out of a hundred dentists recommend brand A toothpaste to their patients to protect their teeth against cavities, etc. The technique of testing such claims, statements or assumptions is known as testing of hypothesis. The truth or falsity of a claim or statement is never known unless we examine the entire population.
  3. CONCEPT OF HYPOTHESIS TESTING For example, (i) a motorcycle customer wants to test whether the claim that a motorcycle of a certain brand gives an average mileage of 60 km/liter is true or false, (ii) a banana merchant wants to test whether the average weight of a banana from Kerala is more than 200 gm, (iii) a doctor wants to test whether a new medicine is really more effective for controlling blood pressure than the old medicine, (iv) an economist wants to test whether the variability in incomes differs between two populations, (v) a psychologist wants to test whether the proportion of literates is the same in two groups of people, etc.
  4. Hypothesis(es) Meaning ◦ A hypothesis is a possible answer to a research question. ◦ It is a presumption or a hunch on the basis of which a study has to be conducted. ◦ This hypothesis is tested for possible rejection or approval. ◦ A hypothesis is an assumption made about a population parameter (e.g., mean, median, variance, proportion, etc.). ◦ A research hypothesis is a predictive statement, capable of being tested by scientific methods, that relates an independent variable (IV) to some dependent variable (DV).
  5. Definition of Hypothesis Goode and Hatt defined it as “a proposition which can be put to a test to determine its validity”. Rummel: “a hypothesis is a statement capable of being tested and thereby verified or rejected”. Grinnell: a hypothesis is written in such a way that it can be proven or disproven by valid and reliable data, and it is in order to obtain these data that we perform our study.
  6. How Does a Hypothesis Work? If you want to conduct a study on the effects of ‘Parental Depression on the Academic Performance of Children’, you may like to conduct it without any hypothesis, but then you will have many dimensions to think upon and will be more likely to get distracted. If you formulate the hypothesis that ‘Parental depression results in depression in children too, and this depression leads to low grades’, your research gets a direction and you will not think about the broader effects of depression; everything is well defined in terms of the grades of the children. Now you have to test the impact of parental depression on the children’s depression as well as on the children’s grades. You do not need to test impacts on extracurricular activities, class conduct and other such things.
  7. Types of hypothesis Null Hypothesis (Hₒ) ◦ According to Prof. R. A. Fisher, “A null hypothesis is a hypothesis which is tested for possible rejection under the assumption that it is true.” ◦ The hypothesis which we wish to test is called the null hypothesis. ◦ Ex: there is no relationship between “student attendance” and “student result”. Alternative Hypothesis (H₁) ◦ The hypothesis which is complementary to the null hypothesis is called the alternative hypothesis. ◦ Ex: there is a relationship between “student attendance” and “student result”. We state the null and alternative hypotheses in such a way that together they cover all possible values of the population parameter.
  8. Formulation of claims for Hypothesis(es) For example, (i) a motorcycle customer wants to test whether the claim that a motorcycle of a certain brand gives an average mileage of 60 km/liter is true or false, (ii) a banana merchant wants to test whether the average weight of a banana from Kerala is more than 200 gm, (iii) a doctor wants to test whether a new medicine is really more effective for controlling blood pressure than the old medicine, (iv) an economist wants to test whether the variability in incomes differs between two populations, (v) a psychologist wants to test whether the proportion of literates is the same in two groups of people, etc. Symbolic form of the claims: (i) µ = 60 km/liter, (ii) µ > 200 gm, (iii) µ₁ > µ₂, (iv) 𝝈₁² ≠ 𝝈₂², (v) P₁ = P₂ or P₁ − P₂ = 0.
  9. Formulation of Null Hₒ & Alternative H₁ Hypothesis(es) ◦ “The motorcycle of a certain brand gives an average mileage of 60 km/liter”, symbolically (µ = 60 km/liter). ◦ The claim is µ = 60 km/liter; ◦ its complement is µ ≠ 60 km/liter. ◦ Null Hypothesis Hₒ: µ = 60 km/liter. ◦ Alternative Hypothesis H₁: µ ≠ 60 km/liter. The thumb rule is that the statement containing equality is the null hypothesis.
  10. Formulation of Null Hₒ & Alternative H₁ Hypothesis(es) ◦ “The average weight of a banana from Kerala is greater than 200 gm”, symbolically (µ > 200 gm). ◦ The claim is µ > 200 gm; ◦ its complement is µ ≤ 200 gm. ◦ Null Hypothesis Hₒ: µ ≤ 200 gm. ◦ Alternative Hypothesis H₁: µ > 200 gm.
  11. General Procedure for Hypothesis Testing: (1) set up the null Hₒ and alternative H₁ hypotheses; (2) decide the level of significance (α); (3) choose an appropriate test statistic; (4) calculate the value of the test statistic; (5) obtain the critical (or cut-off) values; (6) compare the calculated value of the test statistic with the critical value; (7) derive the conclusion.
  12. Set up null Hₒ and Alternative H₁ Hypothesis
  13. Decide the level of significance (α) ◦ After setting the null and alternative hypotheses, we establish a criterion for rejection or non-rejection of the null hypothesis, that is, we decide the level of significance (𝜶) at which we want to test our hypothesis. ◦ Generally, it is taken as 5% or 1% (α = 0.05 or 0.01).
  14. Choose an appropriate test statistic Test statistic = (statistic − value of the parameter under Hₒ) / standard error of the statistic. Its standard form is Z (standard normal), chi-square, t, F or any other form well known in the literature.
  15. Derive the conclusion Do not Reject Hₒ ◦ If the calculated value of the test statistic lies in the non-rejection region at the α level of significance, then we do not reject the null hypothesis. ◦ Equivalently, if the calculated value (CV) is less than the critical (table) value (TV), we do not reject the null hypothesis: CV < TV. Reject Hₒ ◦ If the calculated value of the test statistic lies in the rejection region at the α level of significance, then we reject the null hypothesis. ◦ Equivalently, if the calculated value is greater than the critical (table) value, we reject the null hypothesis: CV > TV.
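As a worked sketch of the seven steps above, the following Python snippet runs a two-tailed one-sample z-test of the mileage claim (Hₒ: µ = 60 km/liter) at α = 0.05. It assumes scipy is available; the sample size, sample mean and population standard deviation are illustrative figures, not values from the slides.

```python
# Worked sketch of the general testing procedure (illustrative numbers only).
from math import sqrt
from scipy.stats import norm

# Step 1: H0: mu = 60, H1: mu != 60 (two-tailed test)
mu0 = 60.0

# Step 2: level of significance
alpha = 0.05

# Assumed sample summary (large sample, population sd treated as known):
n = 50          # sample size
x_bar = 58.4    # sample mean mileage (km/liter)
sigma = 4.0     # known population standard deviation

# Steps 3-4: test statistic Z = (statistic - parameter under H0) / standard error
z = (x_bar - mu0) / (sigma / sqrt(n))

# Step 5: critical (cut-off) value for a two-tailed test
z_crit = norm.ppf(1 - alpha / 2)   # approximately 1.96

# Steps 6-7: compare and conclude
if abs(z) > z_crit:
    print(f"z = {z:.2f} lies in the rejection region: reject H0")
else:
    print(f"z = {z:.2f} lies in the non-rejection region: do not reject H0")
```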
  16. Critical Regions
  17. Null and Alternative Hypotheses and Corresponding One-tailed and Two-tailed Tests
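For reference, the cut-off points that bound the one-tailed and two-tailed critical regions can be read off the standard normal distribution. A minimal Python sketch, assuming scipy and α = 0.05:

```python
# Critical values of the standard normal distribution at alpha = 0.05.
from scipy.stats import norm

alpha = 0.05
# Two-tailed test: rejection region split equally between the two tails.
print("two-tailed  : |Z| >", round(norm.ppf(1 - alpha / 2), 3))   # 1.96
# Right-tailed test: entire rejection region in the upper tail.
print("right-tailed:  Z  >", round(norm.ppf(1 - alpha), 3))       # 1.645
# Left-tailed test: entire rejection region in the lower tail.
print("left-tailed :  Z  <", round(norm.ppf(alpha), 3))           # -1.645
```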
  18. TYPE-I AND TYPE-II ERRORS ◦ A faulty sample misleads the inference (or conclusion) relating to the null hypothesis. ◦ For example, an engineer may infer that a packet of screws is substandard when actually it is not; this is an error caused by a poor or inappropriate (faulty) sample. ◦ Similarly, a packet of screws may be inferred to be good when actually it is substandard. ◦ So we can commit two kinds of errors while testing a hypothesis: rejecting Hₒ when it is actually true (Type I error, with probability α) and failing to reject Hₒ when it is actually false (Type II error, with probability β).
  19. TYPE-I AND TYPE-II ERRORS
  21. Parametric Statistical test Parametric statistics is a branch of statistics which assumes that sample data come from a population that follows a known probability distribution, typically the normal distribution. When the assumptions are correct, parametric methods produce more accurate and precise estimates. Assumptions ◦ The scores must be independent ◦ The observations must be drawn from normally distributed populations ◦ The selected population is representative of the general population ◦ The data are on an interval or ratio scale ◦ The populations (if comparing two or more groups) must have the same variances
  22. Types of Parametric tests: Z-test, t-test, ANOVA, F-test, Chi-square test
  23. Z-test ◦ The Z-test was given by Fisher. ◦ A Z-test is a type of hypothesis test or statistical test. ◦ It is used for testing the mean of a population against a standard, or for comparing the means of two populations. ◦ The Z-test is used to analyse whether two population means are different or not when the variances are known and the sample size is large.
  24. Assumptions of the Z-test ◦ The sample size is greater than 30. ◦ Data points should be independent of each other. ◦ Data should be randomly selected from a population where each item has an equal chance of being selected. ◦ Data should follow a normal distribution. ◦ The standard deviation of the population is known.
  25. One-sample z-test In a one-sample z-test we compare the mean calculated from a single set of scores (one sample) with a known value, when the population standard deviation is known. Ex: the manager of a candy manufacturer wants to know whether the mean weight of a batch of candy boxes is equal to the target value of 10 pounds known from historical data.
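A minimal sketch of the candy-box example in Python, this time using the p-value rather than the critical value; scipy is assumed and the sample figures are invented for illustration.

```python
# One-sample z-test for the candy-box example (assumed sample figures).
from math import sqrt
from scipy.stats import norm

mu0 = 10.0      # target mean weight in pounds (H0: mu = 10)
n = 40          # number of boxes sampled
x_bar = 10.12   # sample mean weight
sigma = 0.35    # population sd, assumed known from historical data

z = (x_bar - mu0) / (sigma / sqrt(n))
p_value = 2 * norm.sf(abs(z))   # two-tailed p-value
print(f"z = {z:.2f}, p = {p_value:.3f}")
# If p < alpha (say 0.05) we reject H0 and conclude the batch mean differs from 10 lb.
```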
  26. Two-sample z-test When testing for a difference between two groups we can imagine two separate situations, such as comparing the means or the proportions of two populations. In a two-sample z-test both populations are independent. Ex: 1. Comparing the average engineering salaries of men versus women. 2. Comparing the fraction of defectives from two production lines.
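A sketch of the second example (fraction of defectives from two production lines) as a two-sample z-test for proportions; scipy is assumed and the counts are invented.

```python
# Two-sample z-test for proportions: fraction defective on two production lines.
from math import sqrt
from scipy.stats import norm

x1, n1 = 18, 300   # defectives / items inspected on line 1 (assumed)
x2, n2 = 32, 350   # defectives / items inspected on line 2 (assumed)

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)                       # pooled proportion under H0: P1 = P2
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2)) # standard error of p1 - p2 under H0
z = (p1 - p2) / se
p_value = 2 * norm.sf(abs(z))                        # two-tailed p-value
print(f"z = {z:.2f}, p = {p_value:.3f}")
```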
  27. t-test ◦ It was developed by Prof. W. S. Gosset in 1908 and is also called Student's t-test. ◦ A t-test indicates whether or not the difference between the means of two groups is statistically significant. Assumptions ◦ Samples must be random and independent. ◦ Samples are small (n < 30). ◦ The population standard deviation is not known. ◦ The population is normally distributed. There are three types of t-test: a. One-sample t-test, b. Unpaired t-test (independent two-sample t-test), c. Paired t-test.
  28. One-sample t-test ◦ In a one-sample t-test, we compare the average (or mean) of one group against a set average (or mean). ◦ This set average can be any theoretical value (or it can be the population mean). ◦ The test statistic is t = (m − µ) / (s / √n), where t = t-statistic, m = mean of the group, µ = theoretical value or population mean, s = standard deviation of the group, n = group size or sample size.
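A minimal scipy sketch of a one-sample t-test; the data values and the hypothesised mean are made up for illustration.

```python
# One-sample t-test: compare a small sample mean with a theoretical value.
from scipy import stats

weights = [198, 205, 212, 195, 210, 221, 203, 199]  # assumed sample (n < 30)
mu0 = 200                                            # value under H0

t_stat, p_value = stats.ttest_1samp(weights, popmean=mu0)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```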
  29. Unpaired t-test: If there is no link between the data, use the unpaired t-test. It applies when two separate sets of independent samples are obtained, one from each of the two populations being compared. Ex: 1. Comparing the heights of girls and boys. 2. Comparing two stress-reduction interventions, where one group practised mindfulness meditation while the other learned yoga.
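A minimal scipy sketch of the unpaired (independent two-sample) t-test, using invented height data for the first example above.

```python
# Unpaired (independent two-sample) t-test: two unrelated groups.
from scipy import stats

girls = [152, 158, 149, 160, 155, 151]   # assumed heights (cm)
boys  = [161, 166, 158, 170, 163, 159]   # assumed heights (cm)

t_stat, p_value = stats.ttest_ind(girls, boys)   # assumes equal variances by default
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```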
  30. Paired t-test A paired t-test uses a sample of matched pairs of similar units, or one group of units that has been tested twice. If there is some link between the data (e.g., before and after), use the paired t-test. Ex: 1. Subjects are tested prior to a treatment, say for high blood pressure, and the same subjects are tested again after treatment with a blood-pressure-lowering medication. 2. A test on a person or a group before and after training.
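A minimal scipy sketch of the paired t-test for the before/after blood-pressure example; the readings are invented.

```python
# Paired t-test: the same subjects measured before and after treatment.
from scipy import stats

before = [150, 142, 165, 158, 147, 160]   # assumed systolic BP before medication
after  = [141, 138, 155, 150, 142, 151]   # same subjects after medication

t_stat, p_value = stats.ttest_rel(before, after)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```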
  31. ANOVA (Analysis of Variance) The statistical technique known as “Analysis of Variance”, commonly referred to by the acronym ANOVA, was developed by Prof. R. A. Fisher in the 1920s. The analysis of variance focuses on variability. Variation is inherent in nature, so analysis of variance means examining the variation present in data or parts of data. In other words, analysis of variance means finding out the cause of variation in the data. According to Prof. R. A. Fisher, Analysis of Variance (ANOVA) is the "Separation of variance ascribable to one group of causes from the variance ascribable to other group".
  32. Types of ANOVA One-way ANOVA: only one independent variable is considered, which affects the dependent variable. n-way ANOVA: the independent (explanatory) variables are more than one, say n; if n equals two, the ANOVA is called two-way classified ANOVA. Factorial ANOVA: used when the experimenter wants to study the interaction effects among the explanatory variables. Repeated-measures ANOVA: used when the same subjects (experimental units) are used for each treatment (levels of the explanatory variable). Multivariate analysis of variance (MANOVA): used when there is more than one response variable.
  33. One-way ANOVA ◦ One-way analysis of variance is a technique in which only one independent variable, at different levels, is considered as affecting the response variable. ◦ Ex: you might be studying the effect of tea on weight loss with three groups: green tea, black tea, no tea. Assumptions 1) The dependent variable is measured on an interval scale; 2) the k samples are independently and randomly drawn from the population; 3) the population can reasonably be assumed to have a normal distribution; 4) homogeneity of sample variances.
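A minimal scipy sketch of the tea example as a one-way ANOVA; the weight-loss figures are invented.

```python
# One-way ANOVA: weight loss (kg) under three levels of one factor (type of tea).
from scipy import stats

green_tea = [2.1, 3.0, 2.6, 3.4, 2.8]   # assumed data
black_tea = [1.8, 2.2, 2.5, 1.9, 2.4]
no_tea    = [0.9, 1.4, 1.1, 1.6, 1.2]

f_stat, p_value = stats.f_oneway(green_tea, black_tea, no_tea)
# F is the ratio of the between-group mean square to the within-group mean square.
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```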
  34. Two-way ANOVA The two-way ANOVA technique is used when the data are classified on the basis of two factors; it analyses two independent variables and one dependent variable. Ex: agricultural output may be classified on the basis of different varieties of seeds and also on the basis of different varieties of fertilizer used.
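A sketch of the seed-by-fertilizer example as a two-way ANOVA. This assumes pandas and statsmodels are available; the data frame, column names and yield figures are all invented for illustration.

```python
# Two-way ANOVA sketch: output classified by seed variety and fertilizer type.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Assumed data: 2 seed varieties x 3 fertilizers, 2 observations per combination.
df = pd.DataFrame({
    "seed":   ["A", "A", "A", "B", "B", "B", "A", "A", "A", "B", "B", "B"],
    "fert":   ["X", "Y", "Z", "X", "Y", "Z", "X", "Y", "Z", "X", "Y", "Z"],
    "output": [20, 24, 22, 26, 30, 28, 21, 25, 23, 27, 29, 27],
})

# Model with both main effects and their interaction.
model = ols("output ~ C(seed) + C(fert) + C(seed):C(fert)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))   # ANOVA table: seed, fert, interaction, residual
```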
  35. ANOVA Table for One-way Classified Data
  36. Chi-Square test ◦ It was given by Karl Pearson. As a parametric test, the chi-square test is used for testing a hypothesis about a population variance.
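A minimal sketch of the parametric use of chi-square, testing a single population variance against a hypothesised value; scipy is assumed and the sample figures are invented.

```python
# Chi-square test for a single population variance (parametric use of chi-square).
# H0: sigma^2 = sigma0^2
from scipy.stats import chi2

n = 25            # sample size (assumed)
s2 = 9.8          # sample variance (assumed)
sigma0_sq = 6.0   # hypothesised population variance

chi_sq = (n - 1) * s2 / sigma0_sq      # test statistic with n - 1 degrees of freedom
p_value = chi2.sf(chi_sq, df=n - 1)    # right-tailed p-value for H1: sigma^2 > sigma0^2
print(f"chi-square = {chi_sq:.2f}, p = {p_value:.3f}")
```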
  37. Non-parametric statistical tests Non-parametric statistics is the branch of statistics comprising methods in which the data are not required to fit a normal distribution. Non-parametric statistics often uses ordinal data, meaning it does not rely on the numbers themselves but rather on a ranking or order of sorts. For example, a survey recording consumer preferences ranging from like to dislike would be considered ordinal data. This type of statistics can be used without the mean, sample size, standard deviation or estimation of any other parameters. Non-parametric tests are called “distribution-free” tests since they make no assumptions regarding the population distribution.
  38. Types of Non-parametric tests ◦ Chi-square test ◦ Rank correlation test ◦ Rank sum test
  39. Chi-square test ◦ The chi-square (chi, the Greek letter χ, pronounced "kye") test ◦ is officially known as the Pearson chi-square in homage to its inventor, Karl Pearson. ◦ As a non-parametric test, the chi-square test is used for three types of statistical tests: ▫ goodness of fit, ▫ test of independence, and ▫ test of homogeneity.
  40. Chi-square test Goodness of fit: refers to whether a significant difference exists between an observed number and an expected number of responses, people or other objects. For example, suppose that we flip a coin 20 times and record the frequency of occurrence of heads and tails; then we should expect 10 heads and 10 tails. Test of independence: tests for a difference between the frequencies of occurrence in two or more categories across two or more groups.
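A minimal scipy sketch of both uses: goodness of fit for the 20-coin-flip example from the slide, and a test of independence on an invented 2×2 contingency table (e.g. result by attendance group).

```python
# Chi-square goodness of fit: 20 coin flips, expected 10 heads and 10 tails.
from scipy.stats import chisquare, chi2_contingency

observed = [14, 6]            # assumed observed heads and tails
expected = [10, 10]
stat, p = chisquare(f_obs=observed, f_exp=expected)
print(f"goodness of fit: chi-square = {stat:.2f}, p = {p:.3f}")

# Chi-square test of independence on a 2x2 contingency table (invented counts).
table = [[30, 10],
         [18, 22]]
stat, p, dof, expected_freq = chi2_contingency(table)
print(f"independence: chi-square = {stat:.2f}, df = {dof}, p = {p:.3f}")
```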
  41. Spearman's rank correlation test ◦ This method is a measure of association based on the ranks of the observations and not on the numerical values of the data. ◦ It was developed by Charles Spearman in the early 1900s, and as such it is also known as Spearman's rank correlation coefficient.
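A minimal scipy sketch of Spearman's rank correlation; the two sets of ranks (e.g. rankings given by two judges) are invented.

```python
# Spearman's rank correlation between two ranked variables (assumed data).
from scipy.stats import spearmanr

judge_1 = [1, 2, 3, 4, 5, 6, 7, 8]   # ranks given by judge 1
judge_2 = [2, 1, 4, 3, 6, 5, 8, 7]   # ranks given by judge 2

rho, p_value = spearmanr(judge_1, judge_2)
print(f"rho = {rho:.2f}, p = {p_value:.3f}")
```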
  42. Mann-Whitney U-test
  43. Mann-Whitney U-test Assumptions (i) The two samples are randomly and independently drawn from their respective populations. (ii) The variable under study is continuous. (iii) The measurement scale is at least ordinal. (iv) The distributions of the two populations differ only with respect to the location parameter. ◦ Formula: U = n₁n₂ + n₁(n₁ + 1)/2 − R₁, where n₁ and n₂ are the two sample sizes and R₁ is the sum of the ranks assigned to the first sample.
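A minimal scipy sketch of the Mann-Whitney U-test for two independent samples; the scores are invented.

```python
# Mann-Whitney U-test: two independent samples on at least an ordinal scale.
from scipy.stats import mannwhitneyu

group_a = [12, 15, 11, 19, 14, 13]   # assumed scores
group_b = [18, 22, 17, 25, 20, 21]   # assumed scores

u_stat, p_value = mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_value:.3f}")
```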
  44. Kruskal-Wallis test ◦ Tests whether the medians of several (more than two) populations are the same or not. ◦ It is the non-parametric analogue of one-way ANOVA, based on rank-transformed data. ◦ The Kruskal-Wallis one-way analysis of variance by ranks (Kruskal, 1952; Kruskal and Wallis, 1952) is employed with ordinal (rank-order) data in a hypothesis-testing situation involving a design with two or more independent samples.
  45. Assumptions a) Each sample has been randomly selected from the population it represents; b) the k samples are independent of one another; c) the variable under study is continuous; d) the measurement scale is at least ordinal. Kruskal-Wallis test statistic: H = [12 / (n(n + 1))] Σᵢ₌₁ᵏ (Rᵢ² / nᵢ) − 3(n + 1), where Rᵢ is the sum of the ranks in the i-th sample, nᵢ is the size of the i-th sample and n is the total sample size. Adjustment factor for ties: C = 1 − [Σᵢ₌₁ʳ (tᵢ³ − tᵢ)] / (n³ − n), where tᵢ is the number of tied observations in the i-th group of ties. Adjusted K-W test statistic: H′ = H / C.
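A minimal scipy sketch of the Kruskal-Wallis H test on three independent samples; the data are invented, and scipy's implementation applies the tie correction automatically.

```python
# Kruskal-Wallis H test: three independent samples compared via ranks (assumed data).
from scipy.stats import kruskal

sample_1 = [27, 31, 25, 35, 30]
sample_2 = [22, 24, 28, 21, 26]
sample_3 = [18, 20, 17, 23, 19]

h_stat, p_value = kruskal(sample_1, sample_2, sample_3)   # H is tie-corrected
print(f"H = {h_stat:.2f}, p = {p_value:.3f}")
```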
  46. Differences between Parametric & Non-parametric tests