Non Parametric Tests
Mean / Median
• The mean is a good measure of center when the data is bell-shaped,
but it is sensitive to outliers and extreme values.
• When the data is skewed, however, a better measure of center would
be the median.
• The median is a resistant measure.
• In other words, we may want to consider a test for the median and
not the mean.
• In a skewed distribution, the population median, typically denoted
as η, is a better typical value than the population mean, μ.
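To illustrate why the median is called resistant, here is a minimal sketch using made-up scores (the numbers are hypothetical, not from these slides). Adding one extreme value moves the mean a lot and the median hardly at all.

```python
from statistics import mean, median

# Hypothetical, made-up scores: a roughly symmetric sample.
scores = [52, 55, 57, 58, 60, 61, 63]
# The same sample with a single extreme value appended.
scores_with_outlier = scores + [250]

print("mean / median without outlier:", mean(scores), median(scores))
print("mean / median with outlier:   ",
      mean(scores_with_outlier), median(scores_with_outlier))
```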
Sign test
• It is a non-parametric or “distribution free” test, which means the test
doesn’t assume the data comes from a particular distribution.
• The sign test compares the sizes of two groups.
• The sign test is an alternative to a one sample t test or a paired t test.
• It can also be used for ordered (ranked) categorical data.
• The null hypothesis for the sign test is that the difference
between medians is zero.
• This test is used when we are interested in testing the population
median and not the mean.
One sample median test
• The one sample median test checks whether there is a significant
difference between a hypothesized median and the median of the
population the sample was drawn from.
• We learned how to use a t-test for the difference between means of
dependent samples. That test required both populations to be
normally distributed.
• If the condition of normality cannot be satisfied, we can use the
paired-sample sign test to test the difference between two
population medians. The following conditions must be met:
• 1. A sample must be randomly selected from each population.
• 2. The samples must be dependent (paired).
• We find the difference between corresponding data entries by
subtracting the entry representing the second variable from the entry
representing the first variable, and record the sign of the difference.
• Then compare the number of + and – signs (the 0s are ignored).
Steps:-
• State the hypothesis
• Specify alpha
• Specify sample size
• Find critical value – from the sign-test (binomial) table when n ≤ 25, or from the z-table when n > 25
• Find test statistic
• Make decision
• Interpret
Test statistic
• When n ≤ 25, the test statistic is the smaller of the number of positive and negative signs.
• When n > 25, the test statistic is calculated from the formula:
• $z = \dfrac{(x + 0.5) - 0.5n}{\sqrt{n}/2}$
• where x = the smaller number of signs and n = the total number of positive and negative
signs.
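The rule above can be sketched as a small helper. This is only an illustration: the function name `sign_test_statistic` is hypothetical, and the large-sample branch assumes the usual continuity-corrected normal approximation z = ((x + 0.5) − 0.5n) / (√n / 2).

```python
import math

def sign_test_statistic(n_plus, n_minus):
    """Return (x, n, z) for the sign test; z is None when n <= 25."""
    n = n_plus + n_minus        # zeros are assumed to be excluded already
    x = min(n_plus, n_minus)    # smaller of the two sign counts
    if n <= 25:
        return x, n, None       # compare x with a sign-test table value
    z = ((x + 0.5) - 0.5 * n) / (math.sqrt(n) / 2)
    return x, n, z

print(sign_test_statistic(7, 2))     # small sample: only x and n are used
print(sign_test_statistic(10, 26))   # hypothetical large sample: z is computed
```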
Example :- S and C represent two tasks: S the spelling of 25 words presented separately, and C the
spelling of 25 words of equal difficulty presented as an integral part of a sentence (i.e., in context). A
teacher wants to know which condition is favorable to higher scores. Test the hypothesis that C is better
than S.
• Of the 10 differences, 7 are plus (C higher than S), 2 are minus (S
higher than C) and one is zero. Excluding the 0 as being neither + nor
–, we have 9 differences, of which 7 are plus.
• Let alpha = 0.05 and n = 9. It is a one-tailed test. Critical value = 1
(from the sign-test table of binomial critical values, since n ≤ 25).
• Test statistic = 2 (the smaller number of signs).
• Since the test statistic is greater than the critical value, we fail to reject the
null hypothesis.
Example
• A college statistics professor claims that the median test score for his
students is 58. The scores of 18 randomly selected tests are listed
below. At alpha = 0.01, can you reject the professor's claim?
• 58 62 55 55 53 52 52 59 55 55 60 56 57 61 58 63 63 55
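One way to work this exercise is sketched below. It counts the signs relative to the claimed median and, instead of a table lookup, computes an exact two-sided binomial p-value under H0 (each sign equally likely). The variable names are illustrative, and using the exact p-value rather than a critical value is an assumption of this sketch.

```python
from math import comb

scores = [58, 62, 55, 55, 53, 52, 52, 59, 55,
          55, 60, 56, 57, 61, 58, 63, 63, 55]
claimed_median = 58
alpha = 0.01

n_plus = sum(s > claimed_median for s in scores)    # scores above 58
n_minus = sum(s < claimed_median for s in scores)   # scores below 58
n = n_plus + n_minus                                # ties with 58 are dropped
x = min(n_plus, n_minus)                            # sign-test statistic

# Exact two-sided p-value: signs ~ Binomial(n, 0.5) when H0 is true.
p_lower = sum(comb(n, k) for k in range(x + 1)) / 2 ** n
p_two_sided = min(1.0, 2 * p_lower)

print("plus:", n_plus, "minus:", n_minus, "n:", n, "x:", x)
print("two-sided p-value:", round(p_two_sided, 4))
print("reject H0" if p_two_sided <= alpha else "fail to reject H0")
```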
Paired/Matched sample Sign test
• Assumptions for the test (your data should meet these requirements
before running the test) are:
• The data should be from two samples.
• The two dependent samples should be paired or matched. For example,
depression scores from before a medical procedure and after.
• Example:-
• This set of data represents test scores at the end of Spring and the
beginning of the Fall semesters.
• The hypothesis is that the summer break means a significant drop in test
scores.
• H0: No difference in median of the signed differences.
• H1: Median of the signed differences is less than zero.
• Count the number of positives and negatives.
• 4 positives.
• 12 negatives.
• Add up the number of items in your sample and subtract any that have
a difference of zero (in column 3). The sample size in this question
was 17, with one zero, so n = 16.
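• Let alpha = 0.05 and n = 16. Critical value = 4 (from the sign-test table of binomial critical values, one-tailed).
• Test statistic = 4 (the smaller number of signs).
• Since the test statistic is less than or equal to the critical value, we reject the
null hypothesis: the exact one-tailed probability of 4 or fewer positive signs out of 16 is about 0.038, which is below 0.05.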
Example:
A new chemotherapy treatment is proposed for patients with breast cancer. Investigators are
concerned with patients' ability to tolerate the treatment and assess their quality of life both before
and after receiving the new chemotherapy treatment. Quality of life (QOL) is measured on an
ordinal scale and, for analysis purposes, numbers are assigned to each response category as follows:
1 = Poor, 2 = Fair, 3 = Good, 4 = Very Good, 5 = Excellent. The data are shown below.
QOL before and after chemotherapy treatment (Difference = Before – After)
Patient   QOL Before   QOL After   Difference   Sign
1         3            2            1           +
2         2            3           -1           -
3         3            4           -1           -
4         2            4           -2           -
5         1            1            0           NA
6         3            4           -1           -
7         2            4           -2           -
8         3            3            0           NA
9         2            1            1           +
10        1            3           -2           -
11        3            4           -1           -
12        2            3           -1           -
H0: there is no difference between the median QOL before and after treatment.
Ha: there is a difference between the median QOL before and after
treatment.
Number of + signs = 2
Number of – signs = 8
n = 10
Alpha = 0.05
Test statistic = 2
Critical value = 1 (from the sign-test table of binomial critical values, two-tailed)
Conclusion: test statistic > critical value.
We fail to reject the hypothesis that there is no difference in the
median of both sets of data values.
There was no significant change in quality of life between before
and after the chemotherapy treatment.
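The sign counts in this example can be reproduced with a few lines of code. The sketch below transcribes the before and after scores from the table and, as an extra check that the slides do not perform, computes an exact two-sided binomial p-value; the variable names are illustrative.

```python
from math import comb

# QOL scores for patients 1..12, transcribed from the table.
before = [3, 2, 3, 2, 1, 3, 2, 3, 2, 1, 3, 2]
after = [2, 3, 4, 4, 1, 4, 4, 3, 1, 3, 4, 3]

diffs = [b - a for b, a in zip(before, after)]   # Before minus After
n_plus = sum(d > 0 for d in diffs)
n_minus = sum(d < 0 for d in diffs)
n = n_plus + n_minus                             # zero differences are ignored
x = min(n_plus, n_minus)                         # sign-test statistic

# Exact two-sided p-value: signs ~ Binomial(n, 0.5) when H0 is true.
p_two_sided = min(1.0, 2 * sum(comb(n, k) for k in range(x + 1)) / 2 ** n)

print("plus:", n_plus, "minus:", n_minus, "n:", n, "x:", x)
print("two-sided p-value:", round(p_two_sided, 4))
```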
Mood’s Median Test
• Mood’s median test is used to compare the medians for two samples to
find out if they are different.
• For example, you might want to compare the median number
of positive calls to a hotline vs. the median number of negative comment
calls to find out if you’re getting significantly more negative comments than
positive comments (or vice versa).
• This test is the nonparametric alternative to a one way ANOVA.
Nonparametric means that you don’t have to know what distribution your
sample came from (e.g. a normal distribution) before running the test. That
said, your samples should have been drawn from distributions with the
same shape.
• Use this test instead of the sign test when you have two independent
samples. The test is a particular case of the chi-square test of independence.
• The null hypothesis for this test is that the medians are the same for both
groups.
• The alternate hypothesis for the test is that the medians are different for
both groups.
• Step 1: Make a 2 x k contingency table, where k is the number of
samples.
• Step 2: Find M, the overall median for all the data in your samples. To
do this, list all of your data (from all samples) in a single set. Sort in
ascending order and then find the middle number.
• Step 3: List each individual sample’s data in ascending order. For each
sample, count how many data points are greater than M (from Step 2) and
how many are smaller than or equal to M. Enter the counts greater than M
in the first row of the contingency table and the counts smaller than or
equal to M in the second row.
• Step 4: Perform a chi-square test on the completed contingency table.
• Step 5: Compare the chi-square statistic to the table value
with: degrees of freedom = (number of rows – 1) * (number of
columns – 1).
Example
• Nonparametric test: Mood's median test for the following two sets of
data:
Sample A: (11, 15, 9, 4, 34, 17, 18, 14, 12, 13, 26, 31)
Sample B: (34, 31, 35, 29, 28, 12, 18, 30, 14, 22, 10, 29)
• Significance level α = 0.05, one-tailed test
• Sol:- Step-1: Calculate the overall median of the two samples combined.
Sorted combined samples:
4, 9, 10, 11, 12, 12, 13, 14, 14, 15, 17, 18, 18, 22, 26, 28, 29, 29, 30, 31, 31, 34, 34, 35
n = 24
Median = (12th term + 13th term)/2 = (18 + 18)/2 = 18
• Step-2: Create a 2×2 contingency table whose first row contains the number of elements in each
sample that are greater than the median, and whose second row contains the number of elements in each
sample that are less than or equal to the median.
             Sample A   Sample B   Total
> Median     3          8          11
<= Median    9          4          13
Total        12         12         24
Step-3: Perform a chi-square test of independence.
State the hypotheses:
H0: the two categorical variables are independent.
H1: the two categorical variables are not independent.
• Observed frequencies
        B1    B2    Total
A1      3     8     11
A2      9     4     13
Total   12    12    24
• Expected frequencies (row total × column total / grand total)
        B1    B2    Total
A1      5.5   5.5   11
A2      6.5   6.5   13
Total   12    12    24
• Compute chi-square
• $\chi^2 = \sum_{i,j} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$
= (3 − 5.5)²/5.5 + (8 − 5.5)²/5.5 + (9 − 6.5)²/6.5 + (4 − 6.5)²/6.5
= 6.25/5.5 + 6.25/5.5 + 6.25/6.5 + 6.25/6.5
= 1.1364 + 1.1364 + 0.9615 + 0.9615
= 4.1958
• Compute the degrees of freedom (df).
df = (2 − 1)·(2 − 1) = 1
• For 1 df, p(χ² ≥ 4.1958) = 0.0405. Test statistic = 4.1958. Critical value = 3.841 (chi-square table, α = 0.05, df = 1).
Since the test statistic > critical value, we reject the null hypothesis H0.
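The worked example can be retraced in code. The sketch below is a plain standard-library illustration, not a library routine: it builds the 2×2 table from the two samples, computes the chi-square statistic, and uses the closed-form upper tail of the chi-square distribution for 1 degree of freedom, erfc(√(χ²/2)), to get the p-value. Variable names are illustrative.

```python
import math
from statistics import median

sample_a = [11, 15, 9, 4, 34, 17, 18, 14, 12, 13, 26, 31]
sample_b = [34, 31, 35, 29, 28, 12, 18, 30, 14, 22, 10, 29]

grand_median = median(sample_a + sample_b)

# Row 0: counts above the overall median; row 1: counts at or below it.
observed = [
    [sum(v > grand_median for v in s) for s in (sample_a, sample_b)],
    [sum(v <= grand_median for v in s) for s in (sample_a, sample_b)],
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (obs - expected) ** 2 / expected

# Upper-tail probability of chi-square with 1 df (closed form).
p_value = math.erfc(math.sqrt(chi2 / 2))

print("overall median:", grand_median)
print("observed table:", observed)
print("chi-square:", round(chi2, 4), "p-value:", round(p_value, 4))
```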
Example
• A major wheat supplier from Texas is analyzing the yields of various
crop methods. He randomly assigned two different wheat crop
methods to a very large number of acres of farmland and
recorded the production rate (yield per acre) for each plot. We need
to find out whether there is a difference between the two wheat crop methods.
Kruskal Wallis Test
• The Kruskal Wallis test is the non parametric alternative to the One Way ANOVA.
• The test determines whether the medians of two or more groups are different. Like most
statistical tests, you calculate a test statistic and compare it to a distribution cut-off point.
The test statistic used in this test is called the H statistic.
• The hypotheses for the test are:
• H0: population medians are equal.
• H1: population medians are not equal.
• The Kruskal Wallis test will tell you if there is a significant difference between groups.
However, it won’t tell you which groups are different.
• Example: You want to find out how test anxiety affects actual test scores. The independent
variable “test anxiety” has three levels: no anxiety, low-medium anxiety and high anxiety.
The dependent variable is the exam score, rated from 0 to 100%.
• Example: You want to find out how socioeconomic status affects attitude towards sales tax
increases. Your independent variable is “socioeconomic status” with three levels:
working class, middle class and wealthy. The dependent variable is measured on a 5-
point scale from strongly agree to strongly disagree.
• The H test is used when the assumptions for ANOVA aren’t met (like
the assumption of normality). It is sometimes called the one-way
ANOVA on ranks, as the ranks of the data values are used in the test
rather than the actual data points.
• Assumptions:-
• One independent variable with two or more levels (independent
groups). The test is more commonly used when you have three or
more levels.
• The dependent variable is measured on an ordinal, interval or ratio scale.
• Your observations should be independent. In other words, there
should be no relationship between the members in each group or
between groups.
• All groups should have the same shape distributions.
• It is used for comparing two or more independent samples of equal
or different sample sizes.
• The Kruskal-Wallis H Test is a nonparametric procedure that can be
used to compare more than two populations in a completely
randomized design.
• All n = n1 + n2 + … + nk measurements are jointly ranked (i.e. treated as one large sample).
• We use the sums of the ranks of the k samples to compare the
distributions.
• Rank the total measurements in all k samples from 1 to n. Tied
observations are assigned the average of the ranks they would have gotten if
not tied.
• Calculate
$T_i$ = rank sum for the ith sample, i = 1, 2, …, k
and the test statistic
$$H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(n+1)$$
H0: the k distributions are identical, versus
Ha: at least one distribution is different.
Test statistic: Kruskal-Wallis H.
When H0 is true, the test statistic H has an approximate chi-square distribution with df = k − 1.
Use a right-tailed rejection region or p-value based on the chi-square distribution.
Example
• A shoe company wants to know if three groups of workers have different salaries:
Women: 23K, 41K, 54K, 66K, 90K.
Men: 45K, 55K, 60K, 70K, 72K.
Minorities: 20K, 30K, 34K, 40K, 44K.
• Sol:- Null Hypothesis H0 : All groups are equal
Alternative Hypothesis H1 : At least one group is not equal
• Step 1: Sort the data for all groups/samples into ascending order in one combined set.
20K, 23K, 30K, 34K, 40K, 41K, 44K, 45K, 54K, 55K, 60K, 66K, 70K, 72K, 90K
• Step 2: Assign ranks to the sorted data points. Give tied values the average
rank.
20K 1
23K 2
30K 3
34K 4
40K 5
41K 6
44K 7
45K 8
54K 9
55K 10
60K 11
66K 12
70K 13
72K 14
90K 15
• Step 3: Add up the ranks for each group/sample.
Women: 23K, 41K, 54K, 66K, 90K have ranks 2 + 6 + 9 + 12 + 15 = 44.
Men: 45K, 55K, 60K, 70K, 72K have ranks 8 + 10 + 11 + 13 + 14 = 56.
Minorities: 20K, 30K, 34K, 40K, 44K have ranks 1 + 3 + 4 + 5 + 7 = 20.
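• Step 4: Calculate the H statistic:
$$H = \frac{12}{n(n+1)} \sum_{j=1}^{c} \frac{T_j^2}{n_j} - 3(n+1)$$
Where:
• n = sum of sample sizes for all samples,
• c = number of samples,
• Tj = sum of ranks in the jth sample,
• nj = size of the jth sample.
$$H = \frac{12}{15 \cdot 16}\left(\frac{44^2}{5} + \frac{56^2}{5} + \frac{20^2}{5}\right) - 3(16) = 54.72 - 48 = 6.72$$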
Step 5: Find the critical chi-square value, with c-1 degrees of freedom. For 3 – 1
degrees of freedom and an alpha level of .05, the critical chi square value is
5.9915.
Step 6: Compare the H value from Step 4 to the critical chi-square value from
Step 5.
If the critical chi-square value is less than the H statistic, reject the null
hypothesis that the medians are equal.
If the chi-square value is not less than the H statistic, there is not enough
evidence to suggest that the medians are unequal.
In this case, 5.9915 is less than 6.72, so we can reject the null hypothesis.
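Steps 1 to 6 can be sketched in code as follows. This is an illustration under stated assumptions: the helper names `average_ranks` and `kruskal_wallis_h` are hypothetical, the salaries are entered in thousands as they appear in the worked steps, and the p-value uses the fact that for 2 degrees of freedom the chi-square upper tail is exactly exp(−H/2).

```python
import math

# Salaries in thousands, as listed in the worked steps.
groups = {
    "Women": [23, 41, 54, 66, 90],
    "Men": [45, 55, 60, 70, 72],
    "Minorities": [20, 30, 34, 40, 44],
}

def average_ranks(values):
    """Map each distinct value to its rank, averaging ranks over ties."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return {v: sum(p) / len(p) for v, p in positions.items()}

def kruskal_wallis_h(samples):
    """Return (rank sums, H) without a tie correction."""
    pooled = [v for s in samples for v in s]
    n = len(pooled)
    rank_of = average_ranks(pooled)
    rank_sums = [sum(rank_of[v] for v in s) for s in samples]
    h = (12 / (n * (n + 1))
         * sum(t ** 2 / len(s) for t, s in zip(rank_sums, samples))
         - 3 * (n + 1))
    return rank_sums, h

rank_sums, h = kruskal_wallis_h(list(groups.values()))
p_value = math.exp(-h / 2)   # chi-square upper tail, valid for df = 2 only

print("rank sums:", dict(zip(groups, rank_sums)))
print("H:", round(h, 2), "p-value:", round(p_value, 4))
```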
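• Perform the Kruskal-Wallis test for the following data (the rank sum of each sample is shown after the dash):
Sample 1: 8, 5, 7, 11, 9, 6 – rank sum 25.5
Sample 2: 10, 12, 11, 9, 13, 12 – rank sum 64
Sample 3: 11, 14, 10, 16, 17, 12 – rank sum 87.5
Sample 4: 18, 20, 16, 15, 14, 22 – rank sum 123
• Significance level α = 0.05, one-tailed test.
• $H = \dfrac{12}{24 \cdot 25}\left[\dfrac{25.5^2 + 64^2 + 87.5^2 + 123^2}{6}\right] - 3(24 + 1)$
• H ≈ 16.77
• Critical value = 7.815 (chi-square, df = 4 − 1 = 3). Since H exceeds the critical value, we reject the null hypothesis.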
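Because this data set contains ties, a short script is a convenient way to check the rank sums. The sketch below assigns average ranks with `bisect`, computes H from the formula, and also shows the standard tie correction (dividing H by 1 − Σ(t³ − t)/(n³ − n)), which the slides do not apply. A hand calculation may differ slightly depending on rounding and on how ties are handled.

```python
from bisect import bisect_left, bisect_right
from collections import Counter

samples = [
    [8, 5, 7, 11, 9, 6],
    [10, 12, 11, 9, 13, 12],
    [11, 14, 10, 16, 17, 12],
    [18, 20, 16, 15, 14, 22],
]
pooled = sorted(v for s in samples for v in s)
n = len(pooled)

def rank(v):
    """Average of the first and last 1-based positions of v in the pool."""
    return (bisect_left(pooled, v) + 1 + bisect_right(pooled, v)) / 2

rank_sums = [sum(rank(v) for v in s) for s in samples]
h = (12 / (n * (n + 1))
     * sum(t ** 2 / len(s) for t, s in zip(rank_sums, samples))
     - 3 * (n + 1))

# Tie correction: t is the size of each group of tied values.
tie_term = sum(c ** 3 - c for c in Counter(pooled).values())
h_corrected = h / (1 - tie_term / (n ** 3 - n))

critical_value = 7.815   # chi-square table value quoted in the example
print("rank sums:", rank_sums)
print("H:", round(h, 4), "tie-corrected H:", round(h_corrected, 4))
print("reject H0" if h > critical_value else "fail to reject H0")
```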
Mann Whitney U Test
• The Mann-Whitney U test is the nonparametric equivalent of the two
sample t-test.
• The Mann Whitney U test is sometimes called the Mann Whitney Wilcoxon
Test or the Wilcoxon Rank Sum Test.
• While the t-test makes an assumption about the distribution of
a population , the Mann Whitney U Test makes no such assumption.
• The test compares two populations.
• The null hypothesis is that the two samples come from the same
population (i.e. that they both have the same median).
• This test is often performed as a two-sided test and, thus, the research
hypothesis indicates that the populations are not equal as opposed to
specifying directionality.
• A one-sided research hypothesis is used if interest lies in detecting a
positive or negative shift in one population as compared to the other.
• Assumptions for the Mann Whitney U Test
• The dependent variable should be measured on an ordinal scale or a
continuous scale.
• The independent variable should be two independent, categorical
groups.
• Observations should be independent. In other words, there should be
no relationship between the two groups or within each group.
• Observations are not normally distributed. However, the two distributions should
have the same shape (e.g. both are bell-shaped, or both are skewed left).
• The result of performing a Mann Whitney U Test is a U Statistic.
• For small samples, use the direct method (see below) to find the U
statistic;
• For larger samples, a formula is necessary.
Formula
$$U_1 = R_1 - \frac{n_1(n_1+1)}{2}, \qquad U_2 = R_2 - \frac{n_2(n_2+1)}{2}$$
Either of these two formulas is valid for the Mann
Whitney U Test.
R is the sum of ranks in the sample, and n is the number
of items in the sample.
Consider a Phase II clinical trial designed to investigate the effectiveness of a new drug to reduce symptoms
of asthma in children. A total of n=10 participants are randomized to receive either the new drug or a
placebo. Participants are asked to record the number of episodes of shortness of breath over a 1 week
period following receipt of the assigned treatment. The data are shown below.
Placebo: 7, 5, 6, 4, 12
New Drug: 3, 6, 4, 2, 1
Is there a difference in the number of episodes of shortness of breath over a 1 week period in
participants receiving the new drug as compared to those receiving the placebo?
SOL:- In this example, the outcome is a count and in this sample the data do not follow a normal
distribution. In addition, the sample size is small (n1=n2=5), so a nonparametric test is appropriate. The
hypothesis is given below, and we run the test at the 5% level of significance (i.e., α=0.05).
H0: The two populations are equal versus
H1: The two populations are not equal.
The first step is to assign ranks and to do so we order the data from smallest to largest. This is done on the
combined or total sample (i.e., pooling the data from the two treatment groups (n=10)), and assigning ranks
from 1 to 10, as follows.
Observed data          Total sample (ordered          Ranks
                       smallest to largest)
Placebo   New Drug     Placebo   New Drug             Placebo   New Drug
7         3                      1                              1
5         6                      2                              2
6         4                      3                              3
4         2            4         4                    4.5       4.5
12        1            5                              6
                       6         6                    7.5       7.5
                       7                              9
                       12                             10
• We produce a test statistic based on the ranks.
• First, we sum the ranks in each group. In the placebo group, the sum
of the ranks is 37; in the new drug group, the sum of the ranks is 18.
Recall that the sum of the ranks will always equal n(n+1)/2. As a check
on our assignment of ranks, we have n(n+1)/2 = 10(11)/2=55 which is
equal to 37+18 = 55.
• For the test, we call the placebo group 1 and the new drug group 2.
• We let R1 denote the sum of the ranks in group 1 (i.e., R1 = 37), and
R2 denote the sum of the ranks in group 2 (i.e., R2 = 18).
• The test statistic for the Mann Whitney U Test is denoted U and is
the smaller of U1 and U2. Here U1 = R1 − n1(n1+1)/2 = 37 − 15 = 22 and
U2 = R2 − n2(n2+1)/2 = 18 − 15 = 3, so U = 3.
• In every test, we must determine whether the observed U supports
the null or research hypothesis.
• We determine a critical value of U such that if the observed value of U
is less than or equal to the critical value, we reject H0 in favor of
H1 and if the observed value of U exceeds the critical value we do not
reject H0.
• To determine the appropriate critical value we need sample sizes (for
Example: n1=n2=5) and our two-sided level of significance (α=0.05)
• The critical value is 2, and the decision rule is to reject H0 if U ≤ 2. We
do not reject H0 because 3 > 2. We do not have statistically significant
evidence at α = 0.05 to show that the two populations of numbers of
episodes of shortness of breath are not equal.
• To be significant, our obtained U has to be equal to or LESS than this
critical value.
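The rank sums and the U statistic for the asthma example can be checked with a short script. This sketch assumes the convention U = R − n(n+1)/2 for each group and takes the smaller value; the `rank` helper is illustrative, and the critical value 2 is the one quoted in the example.

```python
placebo = [7, 5, 6, 4, 12]
new_drug = [3, 6, 4, 2, 1]

pooled = sorted(placebo + new_drug)

def rank(v):
    """Average 1-based rank of v in the pooled sample (ties share ranks)."""
    positions = [i for i, x in enumerate(pooled, start=1) if x == v]
    return sum(positions) / len(positions)

n1, n2 = len(placebo), len(new_drug)
r1 = sum(rank(v) for v in placebo)     # rank sum, group 1
r2 = sum(rank(v) for v in new_drug)    # rank sum, group 2

u1 = r1 - n1 * (n1 + 1) / 2
u2 = r2 - n2 * (n2 + 1) / 2
u = min(u1, u2)

critical_value = 2   # table value quoted in the example
print("R1:", r1, "R2:", r2, "U1:", u1, "U2:", u2, "U:", u)
print("reject H0" if u <= critical_value else "fail to reject H0")
```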
• A new approach to prenatal care is proposed for pregnant women living in a rural
community. The new program involves in-home visits during the course of
pregnancy in addition to the usual or regularly scheduled visits. A pilot
randomized trial with 15 pregnant women is designed to evaluate whether
women who participate in the program deliver healthier babies than women
receiving usual care. The outcome is the APGAR score measured 5 minutes after
birth. Recall that APGAR scores range from 0 to 10 with scores of 7 or higher
considered normal (healthy), 4-6 low and 0-3 critically low. The data are shown
below.
Usual Care: 8, 7, 6, 2, 5, 8, 7, 3
New Program: 9, 9, 7, 8, 10, 9, 6
Is there statistical evidence of a difference in APGAR scores in women receiving
the new and enhanced versus usual prenatal care?
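As a starting point for this exercise, U can be obtained with the direct method mentioned earlier: count, over all pairs, how often a value from one group exceeds a value from the other, scoring ties as one half. The sketch below is an illustration with a hypothetical helper `pair_count`; the resulting U still has to be compared with the critical value from a Mann Whitney table for n1 = 8 and n2 = 7.

```python
usual_care = [8, 7, 6, 2, 5, 8, 7, 3]
new_program = [9, 9, 7, 8, 10, 9, 6]

def pair_count(first, second):
    """Count pairs where the value from `first` is larger; ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in first for b in second)

u_usual = pair_count(usual_care, new_program)
u_new = pair_count(new_program, usual_care)
u = min(u_usual, u_new)

# The two counts always add up to n1 * n2, which is a useful check.
print("U (usual care):", u_usual, "U (new program):", u_new, "U:", u)
```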

Harnessing the Power of GenAI for BI and Reporting.pptx
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
SR-101-01012024-EN.docx Federal Constitution of the Swiss Confederation
SR-101-01012024-EN.docx  Federal Constitution  of the Swiss ConfederationSR-101-01012024-EN.docx  Federal Constitution  of the Swiss Confederation
SR-101-01012024-EN.docx Federal Constitution of the Swiss Confederation
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schs
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 

Non parametric-tests

  • 2. Mean / Median • The mean is a good measure of center when the data are bell-shaped, but it is sensitive to outliers and extreme values. • When the data are skewed, however, the median is a better measure of center. • The median is a resistant measure: extreme values do not pull it far. • In other words, for skewed data we may want to test the median rather than the mean. • In a skewed distribution, the population median, typically denoted η, is a better typical value than the population mean, μ.
  • 3. Sign test • It is a non-parametric or “distribution free” test, which means the test doesn’t assume the data come from a particular distribution. • The sign test compares the sizes of two groups. • The sign test is an alternative to a one sample t test or a paired t test. • It can also be used for ordered (ranked) categorical data. • The null hypothesis for the sign test is that the difference between medians is zero. • This test is used when we are interested in testing the population median and not the mean.
  • 4. One sample median test • The one sample median test checks whether there is a significant difference between a hypothesized median and the median observed in a sample. • We learned how to use a t-test for the difference between means of dependent samples. That test required both populations to be normally distributed. • If the condition of normality cannot be satisfied, we can use the paired-sample sign test to test the difference between two population medians. The following conditions must be met: • 1. A sample must be randomly selected from each population. • 2. The samples must be dependent (paired).
  • 5. • We find the difference between corresponding data entries by subtracting the entry representing the second variable from the entry representing the first variable, and record the sign of the difference. • Then compare the number of + and – signs. (the 0s are ignored.)
  • 6. Steps:- • State the hypotheses • Specify alpha • Specify sample size • Find the critical value: from the sign-test (binomial) table when n ≤ 25, or from the z-table when n > 25 • Find the test statistic • Make the decision • Interpret
  • 7. Test statistic • When n ≤ 25, the test statistic x is the smaller of the number of positive signs and the number of negative signs. • When n > 25, the test statistic is calculated from the formula $z = \dfrac{(x + 0.5) - 0.5n}{\sqrt{n}/2}$ • where x = the smaller number of signs and n = the total number of positive and negative signs.
  • 8. Example :- S and C represent two tasks: S is the spelling of 25 words presented separately, and C is the spelling of 25 words of equal difficulty presented as an integral part of a sentence (i.e., in context). A teacher wants to know which condition is favorable to higher scores. Test the hypothesis that C is better than S.
  • 9. • Of the 10 differences, 7 are plus (C higher than S), 2 are minus (S higher than C) and one is zero. Excluding the 0 as being neither + nor −, we have 9 differences, of which 7 are plus. • Let alpha = 0.05 and n = 9. It is a left-tailed test. Critical value = 1 (sign-test table: under H0 the number of minus signs is Binomial(9, 0.5), so P(X ≤ 1) = 10/512 ≈ 0.020 ≤ 0.05, while P(X ≤ 2) = 46/512 ≈ 0.090 > 0.05). • Test statistic = 2 (the smaller count of signs). • Since the test statistic is greater than the critical value, we fail to reject the null hypothesis.
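The rule on this slide can be sketched in Python. This is only an illustrative sketch, not part of the original deck: the function name `sign_test_statistic` is an assumption, and the large-sample branch assumes the usual continuity-corrected form z = ((x + 0.5) − 0.5n) / (√n / 2).

```python
import math


def sign_test_statistic(n_plus, n_minus):
    """Return the sign-test statistic for the given sign counts.

    n_plus / n_minus are the numbers of positive and negative differences
    (zeros are assumed to have been discarded already).
    """
    n = n_plus + n_minus          # total number of non-zero signs
    x = min(n_plus, n_minus)      # smaller of the two counts
    if n <= 25:
        # Small sample: the statistic is simply the smaller count.
        return x
    # Large sample: normal approximation with continuity correction.
    return ((x + 0.5) - 0.5 * n) / (math.sqrt(n) / 2)


print(sign_test_statistic(7, 2))    # small-sample case
print(sign_test_statistic(10, 20))  # large-sample case (n = 30)
```

For small samples the returned count is compared with a tabulated critical value; for large samples the returned z is compared with a standard normal critical value.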
  • 10. Ex • A college statistics professor claims that the median test score for his students is 58. The scores of 18 randomly selected tests are listed below. At alpha = 0.01, can you reject the professor's claim? • 58 62 55 55 53 52 52 59 55 55 60 56 57 61 58 63 63 55
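One way to work this exercise is to count the signs relative to 58 and compute an exact binomial p-value instead of reading a table. The sketch below assumes a two-sided test (the claim is "median = 58"); the variable names are illustrative, and the exact-binomial approach is a substitute for the table lookup described in the slides.

```python
from math import comb

scores = [58, 62, 55, 55, 53, 52, 52, 59, 55,
          55, 60, 56, 57, 61, 58, 63, 63, 55]
hypothesized_median = 58
alpha = 0.01

# Scores equal to the hypothesized median carry no sign and are dropped.
n_plus = sum(s > hypothesized_median for s in scores)
n_minus = sum(s < hypothesized_median for s in scores)
n = n_plus + n_minus
x = min(n_plus, n_minus)

# Under H0 each sign is + or - with probability 0.5, so X ~ Binomial(n, 0.5).
p_one_sided = sum(comb(n, k) for k in range(x + 1)) / 2 ** n
p_two_sided = min(1.0, 2 * p_one_sided)

print(n_plus, n_minus, n, x)
print(round(p_two_sided, 4), "reject H0" if p_two_sided <= alpha else "fail to reject H0")
```

With these scores there are 6 plus signs and 10 minus signs (two scores equal 58), and the two-sided p-value is about 0.45, so the professor's claim is not rejected at α = 0.01.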
  • 11. Paired/Matched sample Sign test • Assumptions for the test (your data should meet these requirements before running the test) are: • The data should be from two samples. • The two dependent samples should be paired or matched. For example, depression scores from before a medical procedure and after. • Example:- • This set of data represents test scores at the end of Spring and the beginning of the Fall semesters. • The hypothesis is that the summer break means a significant drop in test scores.
  • 12. • H0: No difference in median of the signed differences. • H1: Median of the signed differences is less than zero.
  • 13. • H0: No difference in median of the signed differences. • H1: Median of the signed differences is less than zero. • Count the number of positives and negatives. • 4 positives. • 12 negatives. • Add up the number of items in your sample and subtract any with a difference of zero (in column 3). The sample size in this question was 17, with one zero, so n = 16. • Let alpha = 0.05 and n = 16. Critical value = 4 (sign-test table, one-tailed: under H0 the number of positive signs is Binomial(16, 0.5), so P(X ≤ 4) = 2517/65536 ≈ 0.038 ≤ 0.05, while P(X ≤ 5) ≈ 0.105 > 0.05). • Test statistic = 4 (the smaller count of signs). • Since the test statistic is less than or equal to the critical value, we reject the null hypothesis: the data support a drop in test scores over the summer break.
  • 14. Example: A new chemotherapy treatment is proposed for patients with breast cancer. Investigators are concerned with patients' ability to tolerate the treatment and assess their quality of life both before and after receiving the new chemotherapy treatment. Quality of life (QOL) is measured on an ordinal scale and, for analysis purposes, numbers are assigned to each response category as follows: 1 = Poor, 2 = Fair, 3 = Good, 4 = Very Good, 5 = Excellent. The data are shown below.
      Patient   QOL Before   QOL After   Difference (Before − After)   Sign
      1         3            2            1                            +
      2         2            3           -1                            -
      3         3            4           -1                            -
      4         2            4           -2                            -
      5         1            1            0                            NA
      6         3            4           -1                            -
      7         2            4           -2                            -
      8         3            3            0                            NA
      9         2            1            1                            +
      10        1            3           -2                            -
      11        3            4           -1                            -
      12        2            3           -1                            -
    • H0: there is no difference in the median QOL before and after treatment. • Ha: there is a difference in the median QOL before and after treatment. • Number of + signs = 2, number of − signs = 8, n = 10 (the two zero differences are dropped), alpha = 0.05. • Test statistic = 2 (the smaller count of signs). • Critical value = 1 (sign-test table, two-tailed, n = 10). • Conclusion: test statistic > critical value, so we fail to reject H0. There is no statistically significant change in quality of life between before and after the chemotherapy treatment.
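The quality-of-life example can be checked with an exact paired sign test. This is a minimal sketch under stated assumptions: the helper `paired_sign_test` is a hypothetical name, and it reports a two-sided exact binomial p-value rather than using a printed table.

```python
from math import comb


def paired_sign_test(first, second):
    """Two-sided exact sign test on paired data (differences = first - second)."""
    diffs = [a - b for a, b in zip(first, second)]
    n_plus = sum(d > 0 for d in diffs)
    n_minus = sum(d < 0 for d in diffs)   # zero differences are ignored
    n = n_plus + n_minus
    x = min(n_plus, n_minus)
    # X ~ Binomial(n, 0.5) under H0; double the lower tail for two sides.
    p_value = min(1.0, 2 * sum(comb(n, k) for k in range(x + 1)) / 2 ** n)
    return n_plus, n_minus, x, p_value


qol_before = [3, 2, 3, 2, 1, 3, 2, 3, 2, 1, 3, 2]
qol_after = [2, 3, 4, 4, 1, 4, 4, 3, 1, 3, 4, 3]

n_plus, n_minus, x, p_value = paired_sign_test(qol_before, qol_after)
print(n_plus, n_minus, x, round(p_value, 4))
```

For this table the counts are 2 plus and 8 minus, and the two-sided p-value is 112/1024 ≈ 0.109, which is above 0.05 and so agrees with not rejecting H0.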
  • 15. Mood’s Median Test • Mood’s median test is used to compare the medians of two (or more) samples to find out if they are different. • For example, you might want to compare the median number of positive calls to a hotline with the median number of negative comment calls to find out if you’re getting significantly more negative comments than positive comments (or vice versa). • This test is the nonparametric alternative to a one way ANOVA. Nonparametric means that you don’t have to know what distribution your sample came from (e.g. a normal distribution) before running the test. That said, your samples should have been drawn from distributions with the same shape. • Use this test instead of the sign test when you have two independent samples. The test is a particular case of the chi-square test of independence. • The null hypothesis for this test is that the medians are the same for both groups. • The alternate hypothesis for the test is that the medians are different for both groups.
  • 16. • Step 1: Make a 2 x k contingency table, where k is the number of samples. • Step 2: Find M, the overall median for all the data in your samples. To do this, list all of your data (from all samples) in a single set. Sort in ascending order and then find the middle number. • Step 3: List each individual sample’s data in ascending order. For each sample, count how many data points are greater than M (from Step 2) and how many are smaller than or equal to M. Enter the counts greater than M in the first row of the contingency table and the counts smaller than or equal to M in the second row. • Step 4: Perform a chi-square test on the completed contingency table. • Step 5: Compare the chi-square statistic to the table value with: degrees of freedom = (number of rows – 1) * (number of columns – 1).
  • 17. Example • Apply Mood's median test to the following two samples: Sample A: (11, 15, 9, 4, 34, 17, 18, 14, 12, 13, 26, 31); Sample B: (34, 31, 35, 29, 28, 12, 18, 30, 14, 22, 10, 29). • Significance level α = 0.05, one-tailed test. • Sol:- Step 1: Calculate the overall median of the two samples combined. Sorted combined sample: 4, 9, 10, 11, 12, 12, 13, 14, 14, 15, 17, 18, 18, 22, 26, 28, 29, 29, 30, 31, 31, 34, 34, 35; n = 24. Median = (12th term + 13th term)/2 = (18 + 18)/2 = 18.
  • 18. • Step 2: Create a 2×2 contingency table whose first row holds the number of elements in each sample that are greater than the median and whose second row holds the number of elements in each sample that are less than or equal to the median.
                   Sample A   Sample B   Total
      > Median     3          8          11
      <= Median    9          4          13
      Total        12         12         24
    • Step 3: Perform a chi-square test of independence. State the hypotheses. H0: the two categorical variables are independent. H1: the two categorical variables are not independent. • Observed frequencies:
               B1    B2    Total
      A1       3     8     11
      A2       9     4     13
      Total    12    12    24
  • 19. • Expected frequencies (row total × column total / grand total):
               B1    B2    Total
      A1       5.5   5.5   11
      A2       6.5   6.5   13
      Total    12    12    24
    • Compute chi-square: $\chi^2 = \sum (O_{ij} - E_{ij})^2 / E_{ij}$ = (3 − 5.5)²/5.5 + (8 − 5.5)²/5.5 + (9 − 6.5)²/6.5 + (4 − 6.5)²/6.5 = 6.25/5.5 + 6.25/5.5 + 6.25/6.5 + 6.25/6.5 = 1.1364 + 1.1364 + 0.9615 + 0.9615 = 4.1958 • Compute the degrees of freedom: df = (2 − 1)·(2 − 1) = 1 • For 1 df, P(χ² ≥ 4.1958) = 0.0405. • Test statistic = 4.1958. Critical value = 3.841 (chi-square table, df = 1, α = 0.05). Since the test statistic is greater than the critical value (equivalently, 0.0405 < 0.05), we reject the null hypothesis H0.
  • 20. Example • A major wheat supplier from Texas is analyzing the yields of various crop methods. He randomly assigned two different wheat crop methods to a very large number of acres of farm land and recorded the production rate (yield per acre) for each plot. We need to find out whether there is a difference between the two wheat crop methods.
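The whole Mood's median calculation for this example can be reproduced in a few lines. The sketch below is illustrative only: variable names are assumptions, and the p-value uses the identity P(χ² ≥ x) = erfc(√(x/2)), which holds only for 1 degree of freedom. The comparison value 3.841 is the standard chi-square critical value for df = 1 at α = 0.05.

```python
import math
from statistics import median

sample_a = [11, 15, 9, 4, 34, 17, 18, 14, 12, 13, 26, 31]
sample_b = [34, 31, 35, 29, 28, 12, 18, 30, 14, 22, 10, 29]
samples = [sample_a, sample_b]

# Step 1: overall median of the pooled data.
grand_median = median(sample_a + sample_b)

# Step 2: 2 x k table (row 0: > median, row 1: <= median).
observed = [
    [sum(v > grand_median for v in s) for s in samples],
    [sum(v <= grand_median for v in s) for s in samples],
]

# Step 3: chi-square statistic from observed vs expected counts.
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)
chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (o - expected) ** 2 / expected

df = (len(observed) - 1) * (len(samples) - 1)
p_value = math.erfc(math.sqrt(chi2 / 2))  # valid for df == 1 only

print(grand_median, observed, round(chi2, 4), df, round(p_value, 4))
print("reject H0" if chi2 > 3.841 else "fail to reject H0")
```

The script gives the same table (3, 8 / 9, 4) and statistic (about 4.196) as the hand calculation.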
  • 21. Kruskal Wallis Test • The Kruskal Wallis test is the non parametric alternative to the One Way ANOVA. • The test determines whether the medians of two or more groups are different. Like most statistical tests, you calculate a test statistic and compare it to a distribution cut-off point. The test statistic used in this test is called the H statistic. • The hypotheses for the test are: • H0: population medians are equal. • H1: population medians are not equal. • The Kruskal Wallis test will tell you if there is a significant difference between groups. However, it won’t tell you which groups are different. • Example 1: You want to find out how test anxiety affects actual test scores. The independent variable “test anxiety” has three levels: no anxiety, low-medium anxiety and high anxiety. The dependent variable is the exam score, rated from 0 to 100%. • Example 2: You want to find out how socioeconomic status affects attitude towards sales tax increases. Your independent variable is “socioeconomic status” with three levels: working class, middle class and wealthy. The dependent variable is measured on a 5-point scale from strongly agree to strongly disagree.
  • 22. • The H test is used when the assumptions for ANOVA aren’t met (like the assumption of normality). It is sometimes called the one-way ANOVA on ranks, as the ranks of the data values are used in the test rather than the actual data points. • Assumptions:- • One independent variable with two or more levels (independent groups). The test is more commonly used when you have three or more levels. • Ordinal scale, Ratio Scale or Interval scale dependent variables. • Your observations should be independent. In other words, there should be no relationship between the members in each group or between groups. • All groups should have the same shape distributions. • It is used for comparing two or more independent samples of equal or different sample sizes.
  • 23. • The Kruskal-Wallis H Test is a nonparametric procedure that can be used to compare more than two populations in a completely randomized design. • All n = n1+n2+…+nk measurements are jointly ranked (i.e. treated as one large sample). • We use the sums of the ranks of the k samples to compare the distributions.
  • 24. • Rank the total measurements in all k samples from 1 to n. Tied observations are assigned the average of the ranks they would have received if not tied. • Calculate $T_i$ = rank sum for the i-th sample, i = 1, 2, …, k, and the test statistic $H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(n+1)$
  • 25. H0: the k distributions are identical versus Ha: at least one distribution is different. Test statistic: Kruskal-Wallis H. When H0 is true, the test statistic H has an approximate chi-square distribution with df = k-1. Use a right-tailed rejection region or p-value based on the chi-square distribution.
  • 26. Example • A shoe company wants to know if three groups of workers have different salaries: Women: 23K, 41K, 54K, 66K, 90K. Men: 45K, 55K, 60K, 70K, 72K. Minorities: 20K, 30K, 34K, 40K, 44K. • Sol:- Null hypothesis H0: all groups are equal. Alternative hypothesis H1: at least one group is not equal. • Step 1: Sort the data for all groups/samples into ascending order in one combined set: 20K, 23K, 30K, 34K, 40K, 41K, 44K, 45K, 54K, 55K, 60K, 66K, 70K, 72K, 90K.
  • 27. • Step 2: Assign ranks to the sorted data points. Give tied values the average rank. 20K: 1, 23K: 2, 30K: 3, 34K: 4, 40K: 5, 41K: 6, 44K: 7, 45K: 8, 54K: 9, 55K: 10, 60K: 11, 66K: 12, 70K: 13, 72K: 14, 90K: 15.
  • 28. • Step 3: Add up the ranks for each group/sample. Women: 23K, 41K, 54K, 66K, 90K = 2 + 6 + 9 + 12 + 15 = 44. Men: 45K, 55K, 60K, 70K, 72K = 8 + 10 + 11 + 13 + 14 = 56. Minorities: 20K, 30K, 34K, 40K, 44K = 1 + 3 + 4 + 5 + 7 = 20. • Step 4: Calculate the H statistic: $H = \left[\frac{12}{n(n+1)} \sum_{j=1}^{c} \frac{T_j^2}{n_j}\right] - 3(n+1)$, where: • n = sum of sample sizes for all samples, • c = number of samples, • Tj = sum of ranks in the jth sample, • nj = size of the jth sample.
  • 29. H = 12/(15·16) × (44²/5 + 56²/5 + 20²/5) − 3(16) = 54.72 − 48 = 6.72 • Step 5: Find the critical chi-square value, with c-1 degrees of freedom. For 3 – 1 = 2 degrees of freedom and an alpha level of .05, the critical chi-square value is 5.9915. • Step 6: Compare the H value from Step 4 to the critical chi-square value from Step 5. If the critical chi-square value is less than the H statistic, reject the null hypothesis that the medians are equal. If the critical chi-square value is not less than the H statistic, there is not enough evidence to suggest that the medians are unequal. In this case, 5.9915 is less than 6.72, so we can reject the null hypothesis.
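The six steps above can be sketched as a small Python routine. This is an illustrative sketch, not the deck's own code: `average_ranks` and `kruskal_wallis_h` are hypothetical helper names, the salaries are entered in thousands using the values that appear in the worked steps (20K and 90K), and no tie correction is applied.

```python
def average_ranks(values):
    """Map each distinct value to its rank, averaging ranks over ties."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # 0-based positions i..j-1 hold ranks i+1..j; store their mean.
        ranks[ordered[i]] = (i + 1 + j) / 2
        i = j
    return ranks


def kruskal_wallis_h(groups):
    """Return (H, rank sums) for a list of samples, without tie correction."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    rank_sums = [sum(ranks[v] for v in g) for g in groups]
    h = 12 / (n * (n + 1)) * sum(
        t ** 2 / len(g) for t, g in zip(rank_sums, groups)
    ) - 3 * (n + 1)
    return h, rank_sums


women = [23, 41, 54, 66, 90]
men = [45, 55, 60, 70, 72]
minorities = [20, 30, 34, 40, 44]

h, rank_sums = kruskal_wallis_h([women, men, minorities])
critical = 5.9915  # chi-square critical value, df = 2, alpha = 0.05 (from the slide)
print(rank_sums, round(h, 2), "reject H0" if h > critical else "fail to reject H0")
```

The rank sums come out as 44, 56 and 20 and H as 6.72, matching the hand calculation.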
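  • 30. • Perform the Kruskal Wallis test for the following data (each sample is followed by its rank sum): Sample 1: 8, 5, 7, 11, 9, 6 (rank sum 25.5); Sample 2: 10, 12, 11, 9, 13, 12 (rank sum 64); Sample 3: 11, 14, 10, 16, 17, 12 (rank sum 87.5); Sample 4: 18, 20, 16, 15, 14, 22 (rank sum 123). • Significance level α = 0.05, one-tailed (right-tailed) test. • H = 12/(24·25) × [(25.5² + 64² + 87.5² + 123²)/6] − 3(24 + 1) = 0.02 × 4588.58 − 75 • H ≈ 16.77 • Critical value = 7.815 (chi-square table, df = 4 − 1 = 3) • Since H is greater than the critical value, we reject the null hypothesis.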
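This data set contains tied values, so as a supplementary check one can also apply the standard tie correction, which divides H by 1 − Σ(t³ − t)/(n³ − n), where t is the size of each group of tied values. The correction is not mentioned in the slides; the sketch below adds it as an assumption and takes the rank sums as given on the slide.

```python
from collections import Counter

groups = [
    [8, 5, 7, 11, 9, 6],
    [10, 12, 11, 9, 13, 12],
    [11, 14, 10, 16, 17, 12],
    [18, 20, 16, 15, 14, 22],
]
rank_sums = [25.5, 64, 87.5, 123]  # rank sums as listed on the slide

n = sum(len(g) for g in groups)

# Uncorrected H from the rank-sum formula.
h = 12 / (n * (n + 1)) * sum(
    t ** 2 / len(g) for t, g in zip(rank_sums, groups)
) - 3 * (n + 1)

# Tie correction: t is how many times each distinct value occurs in the pooled data.
tie_sizes = Counter(v for g in groups for v in g).values()
correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
h_corrected = h / correction

print(round(h, 4), round(correction, 6), round(h_corrected, 4))
```

Both versions of H (about 16.77 uncorrected and about 16.86 corrected) are well above 7.815, so the decision does not depend on the correction.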
  • 31. Mann Whitney U Test • The Mann-Whitney U test is the nonparametric equivalent of the two sample t-test. • The Mann Whitney U test is sometimes called the Mann Whitney Wilcoxon Test or the Wilcoxon Rank Sum Test. • While the t-test makes an assumption about the distribution of a population, the Mann Whitney U Test makes no such assumption. • The test compares two populations. • The null hypothesis is that the two samples come from the same population (i.e. that they both have the same median). • This test is often performed as a two-sided test and, thus, the research hypothesis indicates that the populations are not equal as opposed to specifying directionality. • A one-sided research hypothesis is used if interest lies in detecting a positive or negative shift in one population as compared to the other.
  • 32. • Assumptions for the Mann Whitney U Test • The dependent variable should be measured on an ordinal scale or a continuous scale. • The independent variable should be two independent, categorical groups. • Observations should be independent. In other words, there should be no relationship between the two groups or within each group. • Observations are not normally distributed. However, the two groups should follow the same shape (e.g. both are skewed left). • The result of performing a Mann Whitney U Test is a U Statistic. • For small samples, use the direct method to find the U statistic; • For larger samples, a formula is necessary.
  • 33. Formula: $U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1$ and $U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2$. Either of these two formulas is valid for the Mann Whitney U Test. R is the sum of ranks in the sample, and n is the number of items in the sample.
  • 34. Consider a Phase II clinical trial designed to investigate the effectiveness of a new drug to reduce symptoms of asthma in children. A total of n=10 participants are randomized to receive either the new drug or a placebo. Participants are asked to record the number of episodes of shortness of breath over a 1 week period following receipt of the assigned treatment. The data are shown below. Placebo: 7, 5, 6, 4, 12. New Drug: 3, 6, 4, 2, 1. Is there a difference in the number of episodes of shortness of breath over a 1 week period in participants receiving the new drug as compared to those receiving the placebo? SOL:- In this example, the outcome is a count and in this sample the data do not follow a normal distribution. In addition, the sample size is small (n1=n2=5), so a nonparametric test is appropriate. The hypothesis is given below, and we run the test at the 5% level of significance (i.e., α=0.05). H0: The two populations are equal versus H1: The two populations are not equal. The first step is to assign ranks and to do so we order the data from smallest to largest. This is done on the combined or total sample (i.e., pooling the data from the two treatment groups (n=10)), and assigning ranks from 1 to 10, as follows.
  • 35. Raw data, total sample ordered smallest to largest, and ranks ("-" marks an empty cell):
      Raw data              Ordered total sample    Ranks
      Placebo   New Drug    Placebo   New Drug      Placebo   New Drug
      7         3           -         1             -         1
      5         6           -         2             -         2
      6         4           -         3             -         3
      4         2           4         4             4.5       4.5
      12        1           5         -             6         -
      -         -           6         6             7.5       7.5
      -         -           7         -             9         -
      -         -           12        -             10        -
  • 36. • We produce a test statistic based on the ranks. • First, we sum the ranks in each group. In the placebo group, the sum of the ranks is 37; in the new drug group, the sum of the ranks is 18. Recall that the sum of the ranks will always equal n(n+1)/2. As a check on our assignment of ranks, we have n(n+1)/2 = 10(11)/2 = 55, which is equal to 37 + 18 = 55. • For the test, we call the placebo group 1 and the new drug group 2. • We let R1 denote the sum of the ranks in group 1 (i.e., R1=37), and R2 denote the sum of the ranks in group 2 (i.e., R2=18). • The test statistic for the Mann Whitney U Test is denoted U and is the smaller of U1 and U2.
  • 37. • In every test, we must determine whether the observed U supports the null or research hypothesis. • We determine a critical value of U such that if the observed value of U is less than or equal to the critical value, we reject H0 in favor of H1, and if the observed value of U exceeds the critical value, we do not reject H0. • To determine the appropriate critical value we need the sample sizes (here n1=n2=5) and our two-sided level of significance (α=0.05). • The critical value is 2, and the decision rule is to reject H0 if U ≤ 2. Here U1 = 5(5) + 5(6)/2 − 37 = 3 and U2 = 5(5) + 5(6)/2 − 18 = 22, so U = 3. We do not reject H0 because 3 > 2. We do not have statistically significant evidence at α = 0.05 to show that the two populations of numbers of episodes of shortness of breath are not equal. • To be significant, our obtained U has to be equal to or LESS than this critical value.
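The asthma example can be reproduced with a short rank-based routine. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and U1 and U2 are computed as n1·n2 + n(n+1)/2 − R for each group, with U taken as the smaller of the two.

```python
def average_ranks(values):
    """Map each distinct value to its rank, averaging ranks over ties."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        ranks[ordered[i]] = (i + 1 + j) / 2  # mean of ranks i+1..j
        i = j
    return ranks


def mann_whitney_u(sample1, sample2):
    """Return (U, U1, U2, R1, R2) using rank sums on the pooled sample."""
    ranks = average_ranks(sample1 + sample2)
    n1, n2 = len(sample1), len(sample2)
    r1 = sum(ranks[v] for v in sample1)
    r2 = sum(ranks[v] for v in sample2)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return min(u1, u2), u1, u2, r1, r2


placebo = [7, 5, 6, 4, 12]
new_drug = [3, 6, 4, 2, 1]

u, u1, u2, r1, r2 = mann_whitney_u(placebo, new_drug)
critical = 2  # two-sided critical value for n1 = n2 = 5, alpha = 0.05 (from the slide)
print(r1, r2, u1, u2, u, "reject H0" if u <= critical else "do not reject H0")
```

The rank sums are 37 and 18, giving U1 = 3, U2 = 22 and U = 3. A useful sanity check is that U1 + U2 always equals n1·n2 (here 25).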
  • 38. • A new approach to prenatal care is proposed for pregnant women living in a rural community. The new program involves in-home visits during the course of pregnancy in addition to the usual or regularly scheduled visits. A pilot randomized trial with 15 pregnant women is designed to evaluate whether women who participate in the program deliver healthier babies than women receiving usual care. The outcome is the APGAR score measured 5 minutes after birth. Recall that APGAR scores range from 0 to 10 with scores of 7 or higher considered normal (healthy), 4-6 low and 0-3 critically low. The data are shown below. Usual Care: 8, 7, 6, 2, 5, 8, 7, 3. New Program: 9, 9, 7, 8, 10, 9, 6. Is there statistical evidence of a difference in APGAR scores in women receiving the new and enhanced versus usual prenatal care?
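For a sample this small, U can also be found by the direct (pair-counting) method instead of ranks. The sketch below is an illustration under stated assumptions: `u_by_direct_count` is a hypothetical helper, ties are counted as half a pair, and U1 is defined so that it agrees with the rank-sum formula for the first sample.

```python
def u_by_direct_count(sample1, sample2):
    """Mann Whitney U by counting pairs.

    U1 counts the pairs in which the sample2 value exceeds the sample1
    value (a tied pair counts 0.5); U2 is the complementary count.
    """
    u1 = 0.0
    for a in sample1:
        for b in sample2:
            if b > a:
                u1 += 1.0
            elif b == a:
                u1 += 0.5
    u2 = len(sample1) * len(sample2) - u1
    return min(u1, u2), u1, u2


usual_care = [8, 7, 6, 2, 5, 8, 7, 3]
new_program = [9, 9, 7, 8, 10, 9, 6]

u, u1, u2 = u_by_direct_count(usual_care, new_program)
print(u1, u2, u)
```

For the data as listed on this slide the counts are U1 = 47.5 and U2 = 8.5, so U = 8.5. That value is then compared with the tabulated two-sided critical value for n1 = 8 and n2 = 7, rejecting H0 if U is less than or equal to it.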