Hypothesis Testing
1. 1. T-test: Difference in means
to test the statistical significance of the difference in
means
e.g., income by gender; number of years at work by gender
2. T-test: Difference in proportions
to test the statistical significance of the difference in
proportions
e.g., the proportion employed in government jobs by gender
3. Contingency Table/Chi-Square Analysis
to test whether all categories contain the same proportion
of values by comparing expected and actual values
e.g., the proportion employed in government jobs by gender
Hypothesis Testing
2. 1. A Research Question
2. The Null Hypothesis
usually assumes NO difference (two-tailed test)
3. Select Cases
4. T-test or Contingency/Chi-Square Analysis
5. Interpret Test Results
t-score, significance level, confidence interval,
likelihood ratio (for Chi-Square Analysis)
6. “Reject” or “Not reject” the null hypothesis
Hypothesis Testing Procedure
3. • Research Question: Are there differences in income
between male and female graduates and if so, what factors
might explain this difference?
1. Is there a difference in average income between male
and female graduates?
2. Is there a significant difference in average length of
time on the job, between male and female graduates?
3. Is there a difference in the proportion employed in
government jobs between males and females?
Hypothesis Testing
4. Research Question:
Is there a difference in average income
between male and female graduates?
H0: There is NO difference in average income
between male and female graduates
Note: Limit the data to full-time employees or self-
employed with income more than $20,000 and
less than $400,000.
1. T-test: Difference in Means
6. Data/Select Cases
• In a Select Cases dialogue
box, you specify logical
expressions to select
cases.
– Select the “If condition is
satisfied” option
– Click on the If… button
7. Data/Select Cases
Specifying fullself and income range
Type logical expression:
fullself = 1 & income > 20000 & income < 400000
to limit cases to alumni who work full-time or are self-employed and make
more than $20,000 and less than $400,000.
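The Select Cases condition above is a row filter. A minimal Python sketch of the same logic follows; the records are hypothetical toy data made up for illustration, and only the field names (fullself, income) and the thresholds come from the slide.

```python
# Hypothetical toy records (not from the alumni data set); field names
# mirror the variables used in the Select Cases expression.
records = [
    {"fullself": 1, "income": 55000},
    {"fullself": 0, "income": 60000},   # not full-time or self-employed
    {"fullself": 1, "income": 18000},   # income not above $20,000
    {"fullself": 1, "income": 450000},  # income not below $400,000
    {"fullself": 1, "income": 120000},
]


def is_selected(case):
    """Mirror of: fullself = 1 & income > 20000 & income < 400000."""
    return case["fullself"] == 1 and 20000 < case["income"] < 400000


# Keep only the cases that satisfy the condition.
selected = [case for case in records if is_selected(case)]
print(len(selected), "of", len(records), "cases selected")
```

Both income bounds are strict inequalities, as in the slide's expression, so a case with income exactly equal to 20000 or 400000 would be excluded.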
14. T-test: Results
Using the Unequal Variance model, we REJECT H0 and
conclude that there is a significant difference in average income
between male and female graduates.
Group Statistics: Income
Gender    N      Mean        Std. Deviation    Std. Error Mean
Female    128    79868.22    35165.875         3108.254
Male      137    98606.49    47980.995         4099.293

Independent Samples Test: Income
Levene's Test for Equality of Variances: F = 10.443, Sig. = .001
t-test for Equality of Means
                               t         df         Sig. (2-tailed)    Mean Difference    Std. Error Difference    95% CI Lower    95% CI Upper
Equal variances assumed        -3.605    263        .000               -18738.270         5197.537                 -28972.4        -8504.190
Equal variances not assumed    -3.642    249.145    .000               -18738.270         5144.458                 -28870.4        -8606.100

Reading the output: t < -1.96 (|t| > 1.96); Sig. < 0.05; mean difference = -18,738; the 95% confidence interval does not include 0.
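The "Equal variances not assumed" row can be checked by hand from the Group Statistics alone. The sketch below assumes that row is the standard Welch unequal-variance t-test with the Welch-Satterthwaite degrees of freedom; the function name welch_t is chosen here for illustration and is not part of SPSS.

```python
import math


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unequal-variance (Welch) t statistic from group summary statistics."""
    v1 = sd1 ** 2 / n1          # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2          # squared standard error of group 2 mean
    se_diff = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se_diff


# Female (group 1) and male (group 2) income statistics from the slide
t, df, se_diff = welch_t(128, 79868.22, 35165.875, 137, 98606.49, 47980.995)
print(round(t, 3), round(df, 3), round(se_diff, 3))
```

The printed values should agree closely with the t, df, and Std. Error Difference reported in the second row of the Independent Samples Test.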
15. Possible explanation for the difference in income:
Male income is higher because men have been
on the job longer than women.
Research Question:
Is there a difference in average length of time on
the job (YEARS) between male and female
graduates?
H0: There is NO difference in length of time on the
job between male and female graduates
1-2. T-test: Difference in Means
18. T-test: Results
Using the Unequal Variance model, we REJECT H0 and conclude
that there is a significant difference in average length of time on
the job between male and female graduates.
Group Statistics: Years at Current Position
Gender    N      Mean    Std. Deviation    Std. Error Mean
Female    128    4.15    4.315             .381
Male      137    5.90    5.764             .492

Independent Samples Test: Years at Current Position
Levene's Test for Equality of Variances: F = 13.386, Sig. = .000
t-test for Equality of Means
                               t         df         Sig. (2-tailed)    Mean Difference    Std. Error Difference    95% CI Lower    95% CI Upper
Equal variances assumed        -2.786    263        .006               -1.752             .629                     -2.991          -.514
Equal variances not assumed    -2.813    251.276    .005               -1.752             .623                     -2.979          -.525

Reading the output: t < -1.96 (|t| > 1.96); Sig. < 0.05; mean difference = -1.752; the 95% confidence interval does not include 0.
19. Possible explanation for the difference in income:
Male income is higher because more females work
for government than males.
Research Question:
Is there a difference in the proportion employed in
government jobs between male and female
graduates?
H0: There is NO difference in the proportion
employed in government jobs between male
and female graduates
2. T-test: Difference in Proportions
20. • Create a new variable GOV that
– has the value 1 if the EMPLOYER (1-6) indicates the
alumnus works for a government organization.
– has the value 0 if the EMPLOYER is not 1-6.
Method 1. Use Transform/Compute to convert the EMPLOYER
variable into a new categorical variable GOV.
Method 2. Use Transform/Recode/Into Different Variables to
create a new categorical variable GOV.
Step 1: Create a new variable (GOV)
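The recoding rule for GOV can be written as a small function. This is only an illustrative Python sketch of the rule stated above, not SPSS syntax; keeping a missing EMPLOYER value missing is an assumption made here, since the slide does not say how missing values are recoded.

```python
def recode_gov(employer):
    """Return 1 for government employers (codes 1-6), 0 for any other code.

    Assumption: a missing EMPLOYER value (None) stays missing rather
    than being counted as non-government.
    """
    if employer is None:
        return None
    return 1 if 1 <= employer <= 6 else 0


# Example: codes 1 and 6 are government, 7 and 15 are not
print([recode_gov(code) for code in (1, 6, 7, 15, None)])
```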
21. OUTPUT:
Analyze/Descriptive Statistics/Frequencies
Employer
                                     Frequency    Percent    Valid Percent    Cumulative Percent
Valid      Gov: Federal              8            2.9        2.9              2.9
           Gov: State                12           4.3        4.3              7.2
           Gov: County               13           4.7        4.7              11.9
           Gov: City                 45           16.2       16.2             28.2
           Gov: Special Agency       17           6.1        6.1              34.3
           Gov: Non U.S.             3            1.1        1.1              35.4
           Private: Single Person    5            1.8        1.8              37.2
           Private: 2-4 Persons      5            1.8        1.8              39.0
           Private: 5-19 Persons     20           7.2        7.2              46.2
           Private: 20-49 Persons    11           4.0        4.0              50.2
           Private: >= 50 Persons    47           16.9       17.0             67.1
           Non-Profit (U.S.)         51           18.3       18.4             85.6
           International Org.        4            1.4        1.4              87.0
           Educational Inst.         25           9.0        9.0              96.0
           Other                     11           4.0        4.0              100.0
           Total                     277          99.6       100.0
Missing    System                    1            .4
Total                                278          100.0
EMPLOYER 1-6: Government
EMPLOYER 7-11: Private
The Missing/System row holds the missing values.
29. • Analyze/Descriptive Statistics/Frequencies
Step 2: Create a frequency table for GOV
Thirty-five percent of the graduates who are employed full time or
self-employed and earn more than $20,000 and less than $400,000
work in government jobs.
Government Job
                     Frequency    Percent    Valid Percent    Cumulative Percent
Valid      No        179          64.4       64.6             64.6
           Yes       98           35.3       35.4             100.0
           Total     277          99.6       100.0
Missing    System    1            .4
Total                278          100.0
32. T-test: Results
Using the Unequal Variance model, we CANNOT REJECT H0
and cannot conclude that there is a significant difference
between male and female graduates with respect to the
proportion working in the government sector.
Group Statistics: Government Job
Gender    N      Mean     Std. Deviation    Std. Error Mean
Female    127    .3543    .48020            .04261
Male      137    .3650    .48319            .04128

Independent Samples Test: Government Job
Levene's Test for Equality of Variances: F = .129, Sig. = .720
t-test for Equality of Means
                               t        df         Sig. (2-tailed)    Mean Difference    Std. Error Difference    95% CI Lower    95% CI Upper
Equal variances assumed        -.179    262        .858               -.01063            .05934                   -.12748         .10622
Equal variances not assumed    -.179    260.726    .858               -.01063            .05933                   -.12746         .10619

Reading the output: |t| < 1.96; Sig. > 0.05; the 95% confidence interval includes 0.
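Because GOV is coded 0/1, its group mean is simply the proportion working in government, which is why a t-test on GOV tests a difference in proportions. The sketch below rebuilds the unequal-variance row from counts alone; the counts (45 of 127 females, 50 of 137 males) are the ones implied by the group means above, and the function names are illustrative.

```python
import math


def binary_mean_and_variance(successes, n):
    """Mean and sample variance (n - 1 denominator) of a 0/1 variable."""
    p = successes / n
    variance = p * (1 - p) * n / (n - 1)
    return p, variance


def welch_t_binary(x1, n1, x2, n2):
    """Unequal-variance t statistic for the difference of two proportions."""
    p1, var1 = binary_mean_and_variance(x1, n1)
    p2, var2 = binary_mean_and_variance(x2, n2)
    se_diff = math.sqrt(var1 / n1 + var2 / n2)
    return (p1 - p2) / se_diff, p1, p2, se_diff


# Female: 45 of 127 in government; male: 50 of 137 in government
t, p_female, p_male, se_diff = welch_t_binary(45, 127, 50, 137)
print(round(p_female, 4), round(p_male, 4), round(t, 3), round(se_diff, 5))
```

With |t| far below 1.96, the sketch leads to the same decision as the SPSS output: H0 cannot be rejected.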
33. 3. Contingency Table/
Chi-Square Analysis
The same question can be analyzed by a contingency
table with GOV and GENDER and testing using
the Chi-Square statistic.
H0: There is NO relationship between employment
sector and gender.
36. Contingency table
Analyze/Descriptive statistics/Crosstabs
Government Job * Gender Crosstabulation
                                             Gender
Government Job                               Female    Male      Total
No       Count                               82        87        169
         % within Government Job             48.5%     51.5%     100.0%
         % within Gender                     64.6%     63.5%     64.0%
Yes      Count                               45        50        95
         % within Government Job             47.4%     52.6%     100.0%
         % within Gender                     35.4%     36.5%     36.0%
Total    Count                               127       137       264
         % within Government Job             48.1%     51.9%     100.0%
         % within Gender                     100.0%    100.0%    100.0%
37. Contingency table
Analyze/Descriptive statistics/Crosstabs
Chi-Square value = 0.032 < 3.84 (= 1.96^2, the cutoff value at the 95%
confidence level with 1 df).
We CANNOT REJECT the null hypothesis and cannot conclude
there is a statistically significant relationship between gender and
whether or not a person works for the government.
Chi-Square Tests
                            Value      df    Asymp. Sig. (2-sided)    Exact Sig. (2-sided)    Exact Sig. (1-sided)
Pearson Chi-Square          .032(b)    1     .857
Continuity Correction(a)    .003       1     .959
Likelihood Ratio            .032       1     .857
Fisher's Exact Test                                                   .898                    .480
N of Valid Cases            264
a. Computed only for a 2x2 table.
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 45.70.

Reading the output: Pearson Chi-Square value < 3.84; Asymp. Sig. > 0.05.
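The Pearson chi-square value compares each observed count with the count expected under H0 (row total * column total / N). A minimal sketch using the counts from the Government Job * Gender crosstabulation follows; the function names are illustrative, and the p-value shortcut is valid only for 1 df, where the chi-square variable is the square of a standard normal variable.

```python
import math


def pearson_chi_square(table):
    """Pearson chi-square, degrees of freedom, and minimum expected count."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables are unrelated
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, min_expected


def p_value_1df(chi2):
    """Upper-tail p-value for a chi-square statistic with exactly 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))


# Rows: Government Job No / Yes; columns: Female / Male
observed = [[82, 87], [45, 50]]
chi2, df, min_expected = pearson_chi_square(observed)
p_value = p_value_1df(chi2)
print(round(chi2, 3), df, round(min_expected, 2), round(p_value, 3))
```

The result should agree with the Pearson Chi-Square row and with footnote b: a value well below 3.84 and a significance level well above 0.05.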
38. OUTPUT:
Analyze/Descriptive Statistics/Frequencies
Employer frequency table (identical to the table on slide 21)
EMPLOYER 7-11: Private
39. 3-2. Contingency Table/
Chi-Square Analysis
How about analyzing the difference in the proportion
of males and females in the private sector with a
contingency table of PRIVATE and GENDER?
H0: There is NO relationship between employment
sector and gender.
40. • Create a new variable PRIVATE that
– has the value 1 if the EMPLOYER (7-11) indicates the
alumnus works for a private organization.
– has the value 0 if the EMPLOYER is not 7-11 (else).
Method 2.
Use Transform/Recode/Into Different Variables to
create a new categorical variable PRIVATE.
Step 1: Create a new variable (PRIVATE)
42. Contingency table
Analyze/Descriptive statistics/Crosstabs
Private Sector Job * Gender Crosstabulation
                                             Gender
Private Sector Job                           Female    Male      Total
.00      Count                               92        87        179
         % within Private Sector Job         51.4%     48.6%     100.0%
         % within Gender                     72.4%     63.5%     67.8%
1.00     Count                               35        50        85
         % within Private Sector Job         41.2%     58.8%     100.0%
         % within Gender                     27.6%     36.5%     32.2%
Total    Count                               127       137       264
         % within Private Sector Job         48.1%     51.9%     100.0%
         % within Gender                     100.0%    100.0%    100.0%
43. Contingency table
Analyze/Descriptive statistics/Crosstabs
Chi-Square value = 2.411 < 3.84 (= 1.96^2).
We CANNOT REJECT the null hypothesis and cannot conclude
that the difference in the proportion of males and females in the
private sector is statistically significant.
Chi-Square Tests
                            Value       df    Asymp. Sig. (2-sided)    Exact Sig. (2-sided)    Exact Sig. (1-sided)
Pearson Chi-Square          2.411(b)    1     .120
Continuity Correction(a)    2.019       1     .155
Likelihood Ratio            2.422       1     .120
Fisher's Exact Test                                                    .147                    .077
N of Valid Cases            264
a. Computed only for a 2x2 table.
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 40.89.

Reading the output: Pearson Chi-Square value < 3.84; Asymp. Sig. > 0.05.
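The "Continuity Correction" row, reported only for 2x2 tables, appears to be the Yates-corrected chi-square, which shrinks each |observed - expected| by 0.5 before squaring. The sketch below is written under that assumption; the function name and the yates flag are illustrative, and the counts come from the Private Sector Job * Gender crosstabulation.

```python
def chi_square_2x2(table, yates=False):
    """Chi-square for a 2x2 table, optionally with Yates' correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # Yates' continuity correction: reduce each deviation by 0.5
                diff = max(diff - 0.5, 0.0)
            chi2 += diff ** 2 / expected
    return chi2


# Rows: Private Sector Job .00 / 1.00; columns: Female / Male
private_by_gender = [[92, 87], [35, 50]]
pearson = chi_square_2x2(private_by_gender)
corrected = chi_square_2x2(private_by_gender, yates=True)
print(round(pearson, 3), round(corrected, 3))
```

Both values stay below the 3.84 cutoff, so the correction does not change the decision in this example.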
44. The degrees of freedom in the chi-square test of a contingency table:
d.o.f = (r-1)*(c-1)
where
r & c are the number of rows and columns (or the number of
categories of two variables) in a table.
The number of d.o.f is the number of comparisons between actual and
expected frequencies minus the number of restrictions imposed on
these frequencies.
Since the number of cells in a contingency table is r*c, there are r*c
actual frequencies to be compared with the corresponding expected
frequencies. Because the sum (total) of the frequencies in each row
and each column are given, there are r+c-1 restrictions.
Therefore, the number of d.o.f is: r*c - (r+c-1) = (r-1)*(c-1).
The degrees of freedom
in the chi-square test
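The counting argument on this slide can be checked numerically. In this small sketch the function name contingency_dof is chosen for illustration; it computes the d.o.f as cells minus restrictions and confirms that this equals (r-1)*(c-1) for a range of table sizes.

```python
def contingency_dof(r, c):
    """Degrees of freedom as (number of cells) - (number of restrictions)."""
    cells = r * c
    restrictions = r + c - 1   # row and column totals, counting N only once
    return cells - restrictions


# The 2x2 tables on the previous slides have 1 degree of freedom
print(contingency_dof(2, 2))

# The counting form agrees with (r-1)*(c-1) for every table size tried
for r in range(2, 7):
    for c in range(2, 7):
        assert contingency_dof(r, c) == (r - 1) * (c - 1)
```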
45. • What other factors may influence income?
• Control for job sector (government, private, non-profit),
and examine a difference in average income between
males and females within each sector.
– Select cases: Data/Select Cases
if STATUS = 1 & INCOME > 20000 & INCOME < 400000 & GOV = 1
if STATUS = 1 & INCOME > 20000 & INCOME < 400000 & PRIVATE = 1
– Compare means/Independent Sample T-test
• If we see differences within each sector, other factors
besides job sector are influencing income.
Extensions to the Analysis