Unit3

MOMENTS, SKEWNESS, AND
KURTOSIS

Moment Ratios
2
3 4
1 23 2
2 2
,
 
 
 
 
2
3 4
1 23 2
2 2
,
m m
b b
m m
 

Skewness
A distribution in which the values equidistant from the mean have equal
frequencies and is called Symmetric Distribution.
Any departure from symmetry is called skewness.
In a perfectly symmetric distribution, Mean=Median=Mode and the two
tails of the distribution are equal in length from the mean. These values are
pulled apart when the distribution departs from symmetry and consequently
one tail become longer than the other.
If right tail is longer than the left tail then the distribution is said to have
positive skewness. In this case, Mean>Median>Mode
If left tail is longer than the right tail then the distribution is said to have
negative skewness. In this case, Mean<Median<Mode

Kurtosis
For a normal distribution, kurtosis is equal to 3.
When is greater than 3, the curve is more sharply peaked and has narrower
tails than the normal curve and is said to be leptokurtic.
When it is less than 3, the curve has a flatter top and relatively wider tails
than the normal curve and is said to be platykurtic.
4
4
2
1 1
4
4
2
1 1
1 1
,
1 1
,
n n
i
i i
n n
i
i i
x
kurt z for populationdata
n n
x x
kurt b z for sampledata
n n s


 
 
 
    
 
 
    
 
 
 

Curve Fitting and Correlation
This will be concerned primarily with two
separate but closely interrelated processes:
(1) the fitting of experimental data to
mathematical forms that describe their
behavior and
(2) the correlation between different
experimental data to assess how closely
different variables are interdependent.

•The fitting of experimental data to a
mathematical equation is called regression.
Regression may be characterized by different
adjectives according to the mathematical form
being used for the fit and the number of
variables. For example, linear regression
involves using a straight-line or linear equation
for the fit. As another example, Multiple
regression involves a function of more than one
independent variable.

Linear Regression
•Assume n points, with each point having values
of both an independent variable x and a
dependent variable y.
1 2 3The values of are , , ,...., .nx x x x x
1 2 3The values of are , , ,...., .ny y y y y
A best-fitting straight line equation
will have the form
1 0y a x a 

Preliminary Computations
0
1
sample mean of the values
n
k
k
x x x
n 
  
0
1
sample mean of the values
n
k
k
y y y
n 
  
2 2
1
1
sample mean-square of the values
n
k
k
x x x
n 
  
1
1
sample mean of the product
n
k k
k
xy xy x y
n 
  

Best-Fitting Straight Line
    
   
1 22
xy x y
a
x x



     
   
2
0 22
x y x xy
a
x x



0 1Alternately, a y a x 
1 0y a x a 

Example-1. Find best fitting straight
line equation for the data shown
below.
x 0 1 2 3 4 5 6 7 8 9
y 4.00 6.10 8.30 9.90 12.40 14.30 15.70 17.40 19.80 22.30
10
1
1 0 1 2 3 4 5 6 7 8 9 45
4.50
10 10 10
k
k
x x

        
   
10
1
1 4 6.1 8.3 9.9 12.4 14.3 15.7 17.4 19.8 22.3
10 10
130.2
13.02
10
k
k
y y

        
 
 


Multiple Linear Regression
0 1 1 2 2 ..... m my a a x a x a x    
Assume independent variablesm
1 2, ,..... mx x x
Assume a dependent variable that
is to be considered as a linear function
of the independent variables.
y
m

Multiple Regression
(Continuation)
1
Assume that there are values of each
of the variables. For , we have
k
m x
11 12 13 1, , ,....., kx x x x
Similar terms apply for all other variables.
For the th variable, we havem
1 2 3, , ,.....,m m m mkx x x x

Correlation
corr( , ) ( )x y E xy xy 
Cross-Correlation
 cov( , ) ( )( )
corr( , ) ( )( )
( )( )
x y E x x y y
x y x y
xy x y
  
 
 
Covariance

Correlation Coefficient
 ( )( )
( , )
cov( , )
cov( , )cov( , )
x y
E x x y y
C x y
x y
x x y y
 
 



Implications of Correlation Coefficient
• 1. If C(x, y) = 1, the two variables are totally
correlated in a positive sense.
• 2. If C(x, y) = -1 , the two variables are totally
correlated in a negative sense.
• 3. If C(x, y) = 0, the two variables are said to
be uncorrelated.

Binomial Distribution and
Applications

Binomial Probability Distribution
A binomial random variable X is defined to the number
of “successes” in n independent trials where the
P(“success”) = p is constant.
Notation: X ~ BIN(n,p)
In the definition above notice the following conditions
need to be satisfied for a binomial experiment:
1. There is a fixed number of n trials carried out.
2. The outcome of a given trial is either a “success”
or “failure”.
3. The probability of success (p) remains constant
from trial to trial.
4. The trials are independent, the outcome of a trial is
not affected by the outcome of any other trial.

Binomial Distribution
• If X ~ BIN(n, p), then
• where
.,...,1,0)1(
)!(!
!
)1()( nxpp
xnx
n
pp
x
n
xXP xnxxnx








 
psuccessP
nx
nnnn








)"("
trials.insuccesses""
obtaintowaysofnumberthex"choosen"
x
n
11!and10!also,1...)2()1(!

• E.g. when n = 3 and p = .50 there are 8 possible
equally likely outcomes (e.g. flipping a coin)
SSS SSF SFS FSS SFF FSF FFS FFF
X=3 X=2 X=2 X=2 X=1 X=1 X=1 X=0
P(X=3)=1/8, P(X=2)=3/8, P(X=1)=3/8, P(X=0)=1/8
• Now let’s use binomial probability formula instead…
.,...,1,0)1(
)!(!
!
)1()( nxpp
xnx
n
pp
x
n
xXP xnxxnx








 

• E.g. when n = 3, p = .50 find P(X = 2)
.,...,1,0)1(
)!(!
!
)1()( nxpp
xnx
n
pp
x
n
xXP xnxxnx








 
8
3or375.)5)(.5(.3)5(.5.
2
3
)2(
ways3
1)12(
123
!1!2
!3
)!23(!2
!3
2
3
12232



















XP
SSF
SFS
FSS

The Poisson Distribution
The Poisson distribution is defined by:
!
)(
x
e
xf
x 
 

Where f(x) is the probability of x occurrences in an interval
 is the expected value or mean value of occurrences within
an interval
e is the natural logarithm. e = 2.71828

Properties of the Poisson Distribution
1. The probability of occurrences is the same for any
two intervals of equal length.
2. The occurrence or nonoccurrence of an event in one
interval is independent of an occurrence on
nonoccurrence of an event in any other interval

Problem
a. Write the appropriate Poisson distribution
b. What is the average number of occurrences in three time periods?
c. Write the appropriate Poisson function to determine the probability
of x occurrences in three time periods.
d. Compute the probability of two occurrences in one time period.
e. Compute the probability of six occurrences in three time periods.
f. Compute the probability of five occurrences in two time periods.
Consider a Poisson probability distribution with an average
number of occurrences of two per period.

Problem
!
2
)(
2
X
e
xf
x 

6
!
6
)(
6
X
e
xf
x 

27067.
2
5413.
!2
2
)2(
22


e
f
(a)
(b)
(c)
(d)

Hypergeometric Distribution
rx
n
N
xn
rN
x
r
xf 




















 0allfor)(
Where
n = the number of trials.
N = number of elements in the population
r = number of elements in the population labeled a success

Parametric and Nonparametric Tests
It introduces two non-parametric hypothesis
tests using the chi-square statistic: the chi-
square test for goodness of fit and the chi-
square test for independence.

(cont.)
• The term "non-parametric" refers to the fact that the
chi-square tests do not require assumptions about
population parameters nor do they test hypotheses
about population parameters.
• Previous examples of hypothesis tests, such as the t
tests and analysis of variance, are parametric tests
and they do include assumptions about parameters
and hypotheses about parameters.

(cont.)
• The most obvious difference between the
chi-square tests and the other hypothesis
tests we have considered (t and ANOVA) is the
nature of the data.
• For chi-square, the data are frequencies rather
than numerical scores.

The Chi-Square Test for Goodness-of-Fit
• The chi-square test for goodness-of-fit uses
frequency data from a sample to test hypotheses
about the shape or proportions of a population.
• Each individual in the sample is classified into one
category on the scale of measurement.
• The data, called observed frequencies, simply count
how many individuals from the sample are in each
category.

The Chi-Square Test for Goodness-of-Fit
(cont.)
• The null hypothesis specifies the proportion of
the population that should be in each
category.
• The proportions from the null hypothesis are
used to compute expected frequencies that
describe how the sample would appear if it
were in perfect agreement with the null
hypothesis.

The Chi-Square Test for Independence
• The second chi-square test, the chi-square
test for independence, can be used and
interpreted in two different ways:
1. Testing hypotheses about the
relationship between two variables in a
population, or
2. Testing hypotheses about
differences between proportions for two
or more populations.

(cont.)
• Although the two versions of the test for
independence appear to be different, they are
equivalent and they are interchangeable.
• The first version of the test emphasizes the
relationship between chi-square and a
correlation, because both procedures examine
the relationship between two variables.

(cont.)
• The second version of the test emphasizes the
relationship between chi-square and an
independent-measures t test (or ANOVA)
because both tests use data from two (or
more) samples to test hypotheses about the
difference between two (or more)
populations.

(cont.)
• The first version of the chi-square test for
independence views the data as one sample in
which each individual is classified on two
different variables.
• The data are usually presented in a matrix
with the categories for one variable defining
the rows and the categories of the second
variable defining the columns.

(cont.)
• The data, called observed frequencies, simply
show how many individuals from the sample
are in each cell of the matrix.
• The null hypothesis for this test states that
there is no relationship between the two
variables; that is, the two variables are
independent.

(cont.)
• The second version of the test for independence
views the data as two (or more) separate samples
representing the different populations being
compared.
• The same variable is measured for each sample by
classifying individual subjects into categories of the
variable.
• The data are presented in a matrix with the different
samples defining the rows and the categories of the
variable defining the columns..

(cont.)
• The data, again called observed frequencies,
show how many individuals are in each cell of
the matrix.
• The null hypothesis for this test states that the
proportions (the distribution across
categories) are the same for all of the
populations

(cont.)
• Both chi-square tests use the same statistic. The
calculation of the chi-square statistic requires two
steps:
1. The null hypothesis is used to construct an idealized
sample distribution of expected frequencies that
describes how the sample would look if the data
were in perfect agreement with the null hypothesis.

(cont.)
For the goodness of fit test, the expected frequency for each
category is obtained by
expected frequency = fe = pn
(p is the proportion from the null hypothesis and n is the size
of the sample)
For the test for independence, the expected frequency for each
cell in the matrix is obtained by
(row total)(column total)
expected frequency = fe = ─────────────────
n

(cont.)
2. A chi-square statistic is computed to measure the
amount of discrepancy between the ideal sample
(expected frequencies from H0) and the actual
sample data (the observed frequencies = fo).
A large discrepancy results in a large value for chi-
square and indicates that the data do not fit the null
hypothesis and the hypothesis should be rejected.

(cont.)
The calculation of chi-square is the same for all chi-
square tests:
(fo – fe)2
chi-square = χ2 = Σ ─────
fe
The fact that chi-square tests do not require scores
from an interval or ratio scale makes these tests a
valuable alternative to the t tests, ANOVA, or
correlation, because they can be used with data
measured on a nominal or an ordinal scale.

Measuring Effect Size for the Chi-Square
Test for Independence
• When both variables in the chi-square test for
independence consist of exactly two
categories (the data form a 2x2 matrix), it is
possible to re-code the categories as 0 and 1
for each variable and then compute a
correlation known as a phi-coefficient that
measures the strength of the relationship.

Measuring Effect Size for the Chi-Square Test
for Independence (cont.)
• The value of the phi-coefficient, or the
squared value which is equivalent to an r2, is
used to measure the effect size.
• When there are more than two categories for
one (or both) of the variables, then you can
measure effect size using a modified version
of the phi-coefficient known as Cramér=s V.
• The value of V is evaluated much the same as
a correlation.

The t-test
Inferences about Population Means

Questions
• What is the main use of the t-test?
• How is the distribution of t related to the unit
normal?
• When would we use a t-test instead of a z-test? Why
might we prefer one to the other?
• What are the chief varieties or forms of the t-test?
• What is the standard error of the difference between
means? What are the factors that influence its size?

Background
• The t-test is used to test hypotheses about
means when the population variance is
unknown (the usual case). Closely related to
z, the unit normal.
• Developed by Gossett for the quality control
of beer.
• Comes in 3 varieties:
• Single sample, independent samples, and
dependent samples.

What kind of t is it?
• Single sample t – we have only 1 group; want to test
against a hypothetical mean.
• Independent samples t – we have 2 means, 2 groups;
no relation between groups, e.g., people randomly
assigned to a single group.
• Dependent t – we have two means. Either same
people in both groups, or people are related, e.g.,
husband-wife, left hand-right hand, hospital patient
and visitor.

Single-sample z test
• For large samples (N>100) can use z to test
hypotheses about means.
• Suppose
• Then
• If
M
M
est
X
z


.
)( 

N
N
XX
N
s
est X
M
1
)(
.
2





200;5;10:;10: 10  NsHH X
35.
14.14
5
200
5
. 
N
s
est X
M
05.96.183.2;83.2
35.
)1011(
11 

 pzX

The t Distribution
We use t when the population variance is unknown (the usual case) and
sample size is small (N<100, the usual case). If you use a stat package for
testing hypotheses about means, you will use t.
The t distribution is a short, fat relative of the normal. The shape of t depends
on its df. As N becomes infinitely large, t becomes normal.

Degrees of Freedom
For the t distribution, degrees of freedom are always a simple function of the
sample size, e.g., (N-1).
One way of explaining df is that if we know the total or mean, and all but one
score, the last (N-1) score is not free to vary. It is fixed by the other scores.
4+3+2+X = 10. X=1.

Single-sample t-test
With a small sample size, we compute the same numbers as we did for z,
but we compare them to the t distribution instead of the z distribution.
25;5;10:;10: 10  NsHH X
1
25
5
. 
N
s
est X
M 1
1
)1011(
11 

 tX
064.2)24,05(. t 1<2.064, n.s.
Interval =
]064.13,936.8[)1(064.211
ˆ

 MtX 
Interval is about 9 to 13 and contains 10, so n.s.
(c.f. z=1.96)

Difference Between Means (1)
• Most studies have at least 2 groups (e.g., M
vs. F, Exp vs. Control)
• If we want to know diff in population means,
best guess is diff in sample means.
• Unbiased:
• Variance of the Difference:
• Standard Error:
2
2
2
121 )var( MMyy  
212121 )()()(   yEyEyyE
2
2
2
1 MMdiff  

Difference Between Means (2)
• We can estimate the standard error of the
difference between means.
• For large samples, can use z
2
2
2
1 ... MMdiff estestest  
diffest
XX
diffz 
 )()( 2121 

3;100;12
2;100;10
0:;0:
222
111
211210



SDNX
SDNX
HH 
36.
100
13
100
9
100
4
. diffest 
05.;56.5
36.
2
36.
0)1210(


 pzdiff

Independent Samples t (1)
• Looks just like z:
• df=N1-1+N2-1=N1+N2-2
• If SDs are equal, estimate is:
diffest
yy
difft 
 )()( 2121 








21
2
2
2
1
2
11
NNNN
diff 


Pooled variance estimate is weighted average:
)]2/(1/[])1()1[( 21
2
22
2
11
2
 NNsNsN
Pooled Standard Error of the Difference (computed):





 



21
21
21
2
22
2
11
2
)1()1(
.
NN
NN
NN
sNsN
est diff

Independent Samples t (2)





 



21
21
21
2
22
2
11
2
)1()1(
.
NN
NN
NN
sNsN
est diff
diffest
yy
difft 
 )()( 2121 

7;83.5;20
5;7;18
0:;0:
2
2
22
1
2
11
211210



Nsy
Nsy
HH 
47.1
35
12
275
)83.5(6)7(4
. 





diffest 
..;36.1
47.1
2
47.1
0)2018(
sntdiff 




tcrit = t(.05,10)=2.23

Dependent t (1)
Observations come in pairs. Brother, sister, repeated measure.
),cov(2 21
2
2
2
1
2
yyMMdiff  
Problem solved by finding diffs between pairs Di=yi1-yi2.
1
)( 2
2




N
DD
s i
D
N
s
est D
MD .N
D
D i
)(
MDest
DED
t
.
)(
 df=N(pairs)-1

Dependent t (2)
Brother Sister
5 7
7 8
3 3
5y 6y
Diff
2 1
1 0
0 1
1D
58.3/1. MDest 
72.1
58.
1
.
)(



MDest
DED
t

1
1
)( 2





N
DD
sD
2
)( DD 

Unit3

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

Similaire à Unit3

Similaire à Unit3 (20)

Dernier

Dernier (20)

Unit3