RESEARCH METHODOLOGY
INFERENTIAL STATISTICS
BY
ONDABU IBRAHIM TIRIMBA
Overview
 In this lesson, we will briefly cover the main concepts
used in inferential statistics, such as estimating a
population parameter, hypothesis testing, t-tests, linear
regression and analysis of variance (ANOVA).
 After completing this section you should be able to do
the following:
 Recognize common inferential statistical tests
 Identify and compute basic point estimates of population
parameters
 Describe the basics of hypothesis testing
 Understand and identify the use of regression modeling
Introduction
• Inferential statistics are mathematical
tools that permit the researcher to
generalize to a population of individuals
based upon information obtained from a
limited number of research participants
(the sample).
Example
• For instance, consider an experiment in which sales
of 10 advertised products increased by 25%
compared to sales of 10 products that were not
advertised. Inferential statistics allows us to decide
whether the increased sales are due to chance or to
the effect of advertising.
• There are primarily two ways to use inferential
statistics:
• Parameter Estimation
• Test of Hypothesis
Parameter Estimation
• A Parameter is a numerical characteristic
of a population, such as the population
mean or variance.
• Parameter estimation falls into two
Categories:
• Point estimation
• Confidence interval (CI) estimation
Point Estimation
• Point estimation: The Estimate or Prediction
of a population parameter is often referred to as
a Point estimate.
• That is to say, the estimate is a single value
based on a sample, a statistic, which is then
used to estimate the corresponding value in the
population (a parameter).
• The average (mean = a parameter) of our
sample can be used as an estimator of the
population mean.
Sampling Error
• Sampling Error: the difference between the
population value of interest (e.g. mean), and
the sample value. Our sample value is often
referred to as an estimate of our population
value.
• If the sample is randomly drawn from the
population, then sampling error will be random
and, for sufficiently large samples, approximately
normally distributed.
Confidence Interval (CI)
• Confidence Interval: a range of numbers
calculated so that the true population mean
lies within this range with a particular degree of
certainty.
• The certainty with which the population mean lies
within the range is typically expressed as a 95%
confidence interval or a 99% confidence interval.
As you demand more certainty, the width of the
interval increases.
• A confidence interval gives an estimated range of
values which is likely to include an unknown
population parameter, the estimated range being
calculated from a given set of sample data.
Confidence Interval cont.
• The confidence interval for the mean is given by the
formula:
CI = x̄ ± Zα·s/√n
where x̄ is the sample mean, s is the sample standard
deviation, n is the sample size, and Zα = 1.96 for a
95% CI (2.58 for a 99% CI). For a 95% CI:
x̄ – 1.96·s/√n < µ < x̄ + 1.96·s/√n
Confidence Interval cont.
So if for the selected sample the sample size is
36 (= n) with a mean of 5 (= x̄) and standard
deviation of 2 (= s), then the 95% confidence
interval (CI) of the population mean is given
by:
5 – 1.96 × 2/√36 < µ < 5 + 1.96 × 2/√36
Since 1.96 × 2/√36 = 0.65,
the CI ranges between 4.35 < µ < 5.65.
Confidence Interval cont.
• So the 95% confidence interval for the mean
using this formula is between 4.35 and 5.65.
Notice that if we select another random sample
of size 36, its mean and standard deviation
would be different, so we would obtain a
different confidence interval.
Exercise: Use the same data given above to
calculate the 99% confidence interval of the
population mean
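The calculation above can be checked with a short script (a sketch; 2.58 is the conventional rounding of the 99% z-value 2.576):

```python
import math

def confidence_interval(mean, s, n, z):
    """CI for the population mean: mean ± z * s / sqrt(n)."""
    margin = z * s / math.sqrt(n)
    return mean - margin, mean + margin

# Worked example from the slides: n = 36, mean = 5, s = 2.
lo95, hi95 = confidence_interval(5, 2, 36, 1.96)  # 95% CI
lo99, hi99 = confidence_interval(5, 2, 36, 2.58)  # 99% CI (the exercise)
print(round(lo95, 2), round(hi95, 2))  # 4.35 5.65
print(round(lo99, 2), round(hi99, 2))  # 4.14 5.86
```

As expected, demanding 99% certainty widens the interval.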
Confidence Interval cont.
• If independent samples are taken repeatedly
from the same population, and a confidence
interval calculated for each sample, then a
certain percentage (confidence level) of the
intervals will include the unknown population
parameter.
• Confidence intervals are usually calculated
so that this percentage is 95%, but we can
produce 90%, 99%, 99.9% (or whatever)
confidence intervals for the unknown
parameter.
Confidence Interval cont.
• The width of the confidence interval gives us
some idea of how uncertain we are about the
unknown parameter.
• A very wide interval may indicate that more
data should be collected before anything very
definite can be said about the parameter.
• Confidence intervals are more informative than
the simple results of hypothesis tests (where
we decide “reject Ho” or “don’t reject Ho”) since
they provide a range of plausible values for the
unknown parameter.
Confidence Interval cont.
• Confidence limits are the lower and the upper
boundaries/values of a confidence interval, that
is, the values which define the range of a
confidence interval.
• The upper and lower bounds of a 95%
confidence interval are the 95% confidence
limits. Such limits may be taken for other
confidence levels, for example, 90%, 99%,
99.9%.
Hypothesis Testing
• The second type of inferential statistics is
hypothesis testing. This is sometimes called
statistical testing as well.
• In point estimation and in constructing
confidence interval, we had no expectations
about the values we calculated, whereas in
hypothesis testing we have formed some
expectation about the population parameter.
Hypothesis Testing cont.
Example
• Our hypothesis is that “tree mortality after a particular
forest fire will be greater than 60%”, in other words average
tree mortality > 60%.
• Once our notion of the population parameter has been
developed, we can write two contradictory hypotheses:
 The first is the research (or alternative) hypothesis, which in
our case is that “the mean tree mortality > 60%”.
 The second hypothesis is called the null hypothesis, and is
the opposite of our research hypothesis. In our example, the
null hypothesis would be stated as “the mean tree mortality is
less than or equal to 60%”.
Hypothesis Testing cont.
Basic Concepts in Test of Hypothesis
• Def.: A Hypothesis is a tentative explanation
for an observation, phenomenon, or scientific
problem that can be tested by further
investigation.
Null and Alternative Hypothesis
• Null Hypothesis: The null hypothesis, (Ho),
represents a theory that has been put forward, either
because it is believed to be true or because it is to be
used as a basis for argument, but has not been
proved.
• For example, in a clinical trial of a new drug, the null
hypothesis might be that “the new drug is no better,
on average, than the current drug”.
We would write
Ho: there is no difference between the two drugs on
average.
Null and Alternative Hypothesis
• Alternative Hypothesis: The alternative
hypothesis, H1, is a statement of what a
statistical hypothesis test is set to establish.
• For example, in a clinical trial of a new drug,
the alternative hypothesis might be that “the
new drug has a different effect, on average,
compared to that of the current drug”.
We would write:
• H1: the two drugs have different effects, on
average.
Null and Alternative Hypothesis
• The alternative hypothesis might also be
that the new drug is better, on average,
than the current drug.
In this case we would write:
• H1: the new drug is better than the current
drug, on average.
Null and Alternative Hypothesis
• We give special consideration to the null hypothesis. This
is because the null hypothesis is the statement being
tested, whereas the alternative hypothesis is the
statement to be accepted if/when the null is rejected.
• The final conclusion once the test has been carried out is
always given in terms of the null hypothesis. We either
reject Ho in favor of H1 or do not reject Ho.
We never conclude, Reject H1 or even Accept H1.
• When we conclude “Do not reject Ho”, this does not
necessarily mean that the null hypothesis is true; it only
suggests that there is not sufficient evidence against Ho in
favor of H1.
Rejecting the null hypothesis then, suggests that the
alternative hypothesis may be true.
One and Two Tailed Tests
One Tailed Tests
Example
• Our hypothesis is that tree mortality after a
particular forest fire will be greater than 60%.
In other words, average tree mortality > 60%.
• In this example, it is a one-tailed test.
Here we were simply considering the idea
that the population mean was larger than
some number. So we would reject the null
hypothesis if we had large values of tree
mortality.
Two Tailed Tests cont.
A two-tailed test is used when the hypotheses are
stated as follows:
Example
• Our null hypothesis is “tree mortality following fire
is equal to 60%”,
whereas
• our research hypothesis reads “tree mortality
following fire is not equal to 60%”.
• Under this scenario, we would reject the null
hypothesis if tree mortality was much larger than 60%
or much smaller than 60%.
• This is a two-tailed test.
Significance
• The p-value is the probability of obtaining the
observed outcome (or one more extreme) given
that the null hypothesis is true.
• A low p-value indicates that the null hypothesis
should be rejected.
• Typically: reject Ho if the p-value ≤ 0.05 (for a
test at the 5% significance level) or ≤ 0.01 (for
a test at the 1% significance level).
“Statistically significant” means the effect is
unlikely to be due to chance alone.
Type I and II Errors
• We define a Type I error as the event of rejecting
the null hypothesis when the null hypothesis is
true. The probability of a Type I error (α) is called
the significance level.
• We define a Type II error (with probability β) as the
event of failing to reject the null hypothesis when
the null hypothesis is false.
• The Type I risk is the chance of deciding that a
significant effect is present when it isn’t.
• The Type II risk is the chance of not detecting a
significant effect when one exists.
Test of Hypothesis
Steps in Test of Hypothesis
The usual process of hypothesis testing
consists of four steps:
• Formulate the null hypothesis Ho
(commonly, that the observations are the
result of pure chance)
• and the alternative hypothesis H1
(commonly, that the observations show a
real effect combined with a component of
chance variation).
• Identify a test statistic that can be used to
assess the truth of the null hypothesis.
Test of Hypothesis cont.
• Compute the P-value, which is the probability
that a test statistic at least as significant as
the one observed would be obtained
assuming that the null hypothesis were true.
The smaller the P-value, the stronger the
evidence against the null hypothesis.
• Compare the P-value to an acceptable
significance value α (sometimes called an
alpha value). If P ≤ α, the observed effect
is statistically significant: the null
hypothesis is rejected in favor of the
alternative hypothesis.
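The four steps can be sketched for a simple one-tailed z-test; the sample summary below (n, mean, standard deviation) is invented for illustration:

```python
from statistics import NormalDist

# Step 1: Ho: µ = 60 vs H1: µ > 60 (one-tailed).
# Hypothetical sample summary: n = 40, x̄ = 64, s = 10.
n, xbar, s, mu0 = 40, 64.0, 10.0, 60.0

# Step 2: the test statistic is z = (x̄ − µ0) / (s/√n).
z = (xbar - mu0) / (s / n ** 0.5)

# Step 3: one-tailed P-value, assuming the null hypothesis is true.
p_value = 1 - NormalDist().cdf(z)

# Step 4: compare the P-value to the significance level α.
alpha = 0.05
reject_h0 = p_value <= alpha  # True here: the effect is significant
```

With these (made-up) numbers, z ≈ 2.53 and the P-value falls well below 0.05, so Ho is rejected.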
Statistical Tests
Statistical tests include:
• Linear Regression
• T-test
• ANOVA
Regression Models and
Correlation
• The use of regression models is very common, and
serves a very specific purpose for us as managers.
• Regression models allow us to predict the outcome of
one variable from another variable.
• When two variables are related, it is possible to
predict a person’s score on one variable from their
score on the second variable with better than chance
accuracy.
• This section describes how these predictions are
made and what can be learned about the relationship
between the variables by developing a prediction
equation.
Regression Models and Correlation
• It will be assumed that the relationship
between the two variables is linear.
• Given that the relationship is linear,
the prediction problem becomes one of
finding the straight line that best fits
the data.
• Because this line is used for
prediction, it is called the regression
line.
Regression line
The mathematical form of the regression line
predicting Y from X is:
Y = Bo + B1X
• Where:
- X is the variable represented on the X-
axis (Independent variable)
- B1 is the slope of the line,
- Bo is the Y-intercept and
- Y is the predicted value (dependent variable)
for the various values of X.
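Under the least-squares criterion, Bo and B1 have closed-form solutions; a minimal sketch with made-up data:

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of Y = b0 + b1*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # b1 = Σ(x−x̄)(y−ȳ) / Σ(x−x̄)²,  b0 = ȳ − b1·x̄
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical data lying exactly on Y = 1 + 2X:
b0, b1 = fit_line([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
print(b0, b1)  # 1.0 2.0
```

For real (noisy) data the fitted line will not pass through every point, which is where the coefficient of determination below comes in.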
The Coefficient of Correlation
• The correlation between two variables reflects the
degree to which the variables are related. The most
common measure of correlation is the Pearson
Product Moment Correlation (called Pearson’s
correlation in short).
• When measured in a population, the Pearson
Product Moment correlation is designated by the
Greek letter rho (ρ).
• When computed in a sample, it is designated by the
letter r and is sometimes called “Pearson’s r”.
• Pearson’s correlation reflects the degree of linear
relationship between two variables. It ranges from
+1 to -1.
The Coefficient of Correlation
• A correlation of +1 means that there is a
perfect positive linear relationship.
• In a positive relationship, high scores
on the X-axis are associated with high
scores on the Y-axis.
• A correlation of -1 means that there is a perfect
negative linear relationship between variables.
• In a negative relationship, high scores on
the X-axis are associated with low scores on
the Y-axis.
The Coefficient of Correlation
• A correlation of 0 means there is no linear
relationship between the two variables.
Coefficient of Determination
• The coefficient of determination r2 gives the proportion
of the variance (fluctuation) of one variable that is
predictable from the other variable.
• It is a measure that allows us to determine how certain
one can be in making predictions from a certain
model/graph.
• The coefficient of determination is a measure of how
well the regression line represents the data.
• If the regression line passes exactly through every
point on the scatter plot, it would be able to explain all
of the variation.
Coefficient of Determination
• The further the line is from the points, the
less variation it is able to explain.
For example, if r = 0.922, then r2 = 0.850, which
means that 85% of the total variation in Y can
be explained by the linear relationship between
X and Y. The other 15% of the total variation in
Y remains unexplained.
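Pearson's r and r2 can be computed directly from their definitions; the data below are invented for illustration:

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations:
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
r2 = r ** 2  # coefficient of determination: r² = 0.6 here,
             # i.e. 60% of the variation in Y is explained by X
```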
T-test
• The T-test gives an indication of the separateness of two sets
of measurements, and is thus used to check whether two sets
of measures are essentially different.
• In many situations, we will want to compare two
population parameters. To compare two populations, we
can compare the difference between the two sample means.
• The t-test looks for a significant difference in means between
two samples or between a population and a sample.
There are 3 types of T-tests;
- One sample T-test
- Independent 2 samples T-test
- Paired sample T-test
One Sample T-test
• One sample t-test: a statistical procedure used
to test whether the mean of a sample differs
from a known or hypothesized population
mean.
• In a one sample t-test, we know the population mean.
We draw a random sample from the population,
compare the sample mean with the population
mean, and make a statistical decision as to whether
or not the sample mean is different from the
population mean.
Assumptions in One Sample
t-test
• In one sample t-test, dependent variables should be
normally distributed.
• In one sample t-test, samples drawn from the
population should be random.
• In one sample t-test, cases of the samples should be
independent
• The data is measurement data-interval/ratio
• In one sample t-test, we should know the
population mean.
Formula
t = (x̄ – µ)/sx̄
Where: x̄ = sample mean
µ = population mean
sx̄ = standard error of the mean (= s/√n)
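A sketch of this formula in code, using a small invented sample tested against a hypothesized population mean of 60:

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, mu):
    """One-sample t = (x̄ − µ) / (s/√n), with n − 1 degrees of freedom."""
    n = len(sample)
    se = stdev(sample) / math.sqrt(n)  # standard error of the mean
    return (mean(sample) - mu) / se, n - 1

# Hypothetical sample of six measurements:
t, df = one_sample_t([62, 65, 58, 70, 64, 61], 60)
print(round(t, 3), df)  # 2.0 5
```

The resulting t (with its degrees of freedom) would then be compared against the t-distribution to obtain a P-value.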
Independent t-test
Independent t-test: the independent-
measures t-test (or independent t-test) is
used when measures from the two samples
being compared do not come in matched
pairs. It is used when groups are
independent.
Related Formula
t = (x̄1 – x̄2)/√{s²p (1/n1 + 1/n2)}
where s²p is the pooled variance of the two samples.
For an independent 2-sample t-test, it is
important to know whether the 2 samples have
similar variances as we interpret the data. The
variance homogeneity requirement may be
checked with Levene’s test. Results for this can
be given in SPSS along with the t-test results.
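A sketch of the pooled-variance computation (the data are made up; a real analysis would first check variance homogeneity, e.g. with Levene's test):

```python
import math
from statistics import mean, variance

def independent_t(a, b):
    """Pooled two-sample t = (x̄1 − x̄2)/√(s²p(1/n1 + 1/n2))."""
    n1, n2 = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances.
    s2p = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / math.sqrt(s2p * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2  # t statistic and degrees of freedom

# Two hypothetical independent samples:
t, df = independent_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(round(t, 3), df)  # -2.0 8
```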
Assumption in 2 sample
independence T-test
1.0 Normality: Assumes that the population distributions are
normal. The t-test is quite robust over moderate violations
of this assumption. It is especially robust if a two tailed test
is used and if the sample sizes are not especially small.
Check for normality by creating a histogram.
2.0 Independent Observations: The
observations within each treatment condition
must be independent.
Assumption in 2 sample independence
t-test cont.
3.0 Equal Variances: Assume that the
population distributions have the same
variance. This assumption is quite important
(If it is violated, it makes the test’s averaging
of the 2 variances meaningless).
If it is violated, then use a modification of the t-
test procedures as needed. See
“Understanding the Output” in this section for
how to check this with Levene’s Test for Equality
of Variances.
Paired Sample T test
The matched-pair t-test (or paired t-test or
paired samples t-test or dependent t-test) is
used when the data from the two groups can
be presented in pairs,
For example, where the same people are
measured in a before-and-after
comparison, or when the group is given two
different tests at different times (e.g. the
pleasantness of two different types of
chocolate).
Assumptions in paired
sample t-test
1. The first assumption in the paired sample t-test is that only the
matched pair can be used to perform the paired sample t-test.
2. In the paired sample t-test, normal distributions are
assumed.
3. Variance in paired sample t-test: in a paired sample t-
test, the variances of the two samples are assumed to
be equal.
4. The data is measurement data-interval/ratio
5. Independence of observation in paired sample t-test:
in a paired sample t-test, observations must be
independent of each other.
Formula:
t = d̄ / √(s²d/n)
Where:
d̄ is the mean difference between the two
samples,
s²d is the variance of the differences,
n is the sample size (number of pairs), and
t follows a t-distribution with n – 1 degrees of
freedom.
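The paired formula works on the pairwise differences; a sketch with invented before-and-after scores for the same four people:

```python
import math
from statistics import mean, variance

def paired_t(before, after):
    """Paired t = d̄ / √(s²d/n), df = n − 1, on pairwise differences d."""
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    t = mean(d) / math.sqrt(variance(d) / n)
    return t, n - 1

# Hypothetical before-and-after measurements on the same subjects:
t, df = paired_t([10, 12, 9, 11], [12, 14, 10, 14])
print(round(t, 3), df)  # 4.899 3
```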
ANOVA or Analysis of
Variance
So far we have discussed comparing the means of two
populations to each other and comparing the population
mean to another number. However, we often want to
compare many populations to each other.
ANOVA or Analysis of Variance
Example:
We may want to compare regeneration rates for three
different tree species in northern Idaho. We would begin by
taking samples from each population and then calculate the
means from the three samples and make an inference about
the population means from this.
These three sample mean regeneration rates would
almost certainly all be different numbers; however, this
does not by itself mean that there is a difference between
the population means for the three tree types.
To answer that question we can use a statistical test called
an analysis of variance or ANOVA. This test is widely used in
natural resources, and you are bound to come across it when
reading scientific literature.
The ANOVA Assumptions
The use of an ANOVA assumes that:
• All the populations are normally distributed (follow a bell
shaped curve)
• All the population variances are equal,
• And all the samples were taken independently of each other
and are randomly collected from their population.
Generally, our null hypothesis when conducting an ANOVA is
that all the population means are equal and our research
(alternative) hypothesis will be that at least one of the
population means is not equal.
The ANOVA Assumptions
Although an ANOVA is widely used and can
indicate that one population mean differs
from the others, it does not tell us which one
is different.
Analysis of variance tests the null hypothesis
that all the population means are equal:
Formula:
Ho: µ1 = µ2 = µ3 = … = µa
You can read more in the textbooks.
The ANOVA cont.
• ANOVA works by comparing two estimates of the variance σ², where σ² is
the variance within each of the “a” treatment populations. One estimate (called
the mean square error, or MSE for short) is based on the variances within
the samples. The MSE is an estimate of σ² whether or not the null
hypothesis is true. The second estimate (mean square between, or MSB for
short) is based on the variance of the sample means. The MSB is only an
estimate of σ² if the null hypothesis is true. If the null hypothesis is
false, then MSB estimates something larger than σ². The logic by which
analysis of variance tests the null hypothesis is as follows: if the null
hypothesis is true, then MSE and MSB should be about the same, since they
are both estimates of the same quantity (σ²); however, if the null hypothesis
is false, then MSB can be expected to be larger than MSE, since MSB is
estimating a quantity larger than σ².
• Therefore, if MSB is sufficiently larger than MSE, the null hypothesis can be
rejected; if MSB is not sufficiently larger than MSE, the null hypothesis
cannot be rejected. How much larger is “sufficiently larger” is judged by
comparing the ratio F = MSB/MSE against the F distribution.
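The MSB/MSE comparison can be sketched directly from the sums of squares; the three groups below are invented for illustration:

```python
from statistics import mean

def one_way_anova_f(groups):
    """F = MSB / MSE for a one-way ANOVA."""
    k = len(groups)                          # number of treatments
    n = sum(len(g) for g in groups)          # total observations
    grand = mean(x for g in groups for x in g)
    # Between-groups: based on the variance of the sample means.
    msb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / (k - 1)
    # Within-groups: pooled variance within each sample.
    mse = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n - k)
    return msb / mse

# Three hypothetical samples (e.g. regeneration rates for three species):
f = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f)  # 3.0
```

A large F (relative to the F distribution with k − 1 and n − k degrees of freedom) is evidence against the null hypothesis of equal means.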
END
•Questions
•Next Class
•Assignments
•AOB
Prof. Joseph M. Keriko
Principal, JKUAT - Nairobi Campus
Professor of Organic Chemistry and
EIA/EA Leader Expert
P.O. Box 39125 – 00623 Nairobi
Tel. 0722-915026
Email: kerikojm@yahoo.co.uk
kauryashika82
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 

Dernier (20)

Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
An Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdfAn Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdf
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 

RESEARCH METHODS LESSON 3

• 2. Overview • In this lesson, we will briefly cover a few main concepts used in inferential statistics, such as estimating a population parameter, hypothesis testing, t-tests, linear regression and analysis of variance (ANOVA). • After completing this section you should be able to do the following: • Recognize common inferential statistical tests • Identify and compute basic point estimates of population parameters • Describe the basics of hypothesis testing • Understand and identify the use of regression modeling
• 3. Introduction • Inferential statistics are mathematical tools that permit the researcher to generalize to a population of individuals based upon information obtained from a limited number of research participants (the sample).
• 4. Example • For instance, consider an experiment where sales increased by 25% following a media advertisement on 10 products, compared to sales of 10 products which were not advertised. Inferential statistics allow us to decide whether the increased sales are due to chance or to the effect of advertising. • There are primarily two ways to use inferential statistics: • Parameter Estimation • Test of Hypothesis
• 5. Parameter Estimation • In statistics, a parameter is a numerical characteristic of a population, such as the population mean or variance. • Parameter estimation falls into two categories: • Point estimation • Confidence interval (CI) estimation
• 6. Point Estimation • Point estimation: the estimate or prediction of a population parameter is often referred to as a point estimate. • That is to say, the estimate is a single value based on a sample, a statistic, which is then used to estimate the corresponding value in the population (a parameter). • The average (mean) of our sample, a statistic, can be used as an estimator of the population mean (a parameter).
• 7. Sampling Error • Sampling Error: the difference between the population value of interest (e.g. mean) and the sample value. Our sample value is often referred to as an estimate of our population value. • If the sample is randomly drawn from the population, then sampling error will be random and, for reasonably large samples, approximately normally distributed.
• 8. Confidence Interval (CI) • Confidence Interval: a range of numbers calculated so that the true population mean lies within this range with a particular degree of certainty. • The certainty with which a population mean lies within the range is typically expressed as a 95% confidence interval, or a 99% confidence interval. As you require more certainty, the width of the interval increases. • A confidence interval gives an estimated range of values which is likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data.
• 9. Confidence Interval cont. • The confidence interval for the mean is given by the formula: CI = x̄ ± Zα · s/√n, where x̄ = sample mean, s = sample standard deviation, n = sample size, and Zα = 1.96 for a 95% CI (2.58 for a 99% CI). Thus: x̄ − 1.96 · s/√n < µ < x̄ + 1.96 · s/√n
• 10. Confidence Interval cont. So if for the selected sample the sample size is 36 (= n) with a mean of 5 (= x̄) and a standard deviation of 2 (= s), then the 95% confidence interval (CI) for the population mean is given by: 5 − 1.96 × 2/√36 < µ < 5 + 1.96 × 2/√36. Since 1.96 × 2/√36 ≈ 0.65, the CI ranges between 4.35 < µ < 5.65
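The arithmetic on this slide can be checked with a short Python sketch (standard library only; the function name is illustrative, not part of any statistics package):

```python
import math

def confidence_interval(mean, sd, n, z=1.96):
    """Return (lower, upper) bounds of the CI for the population mean."""
    margin = z * sd / math.sqrt(n)  # z * s / sqrt(n), as in the slide formula
    return mean - margin, mean + margin

# The slide's numbers: n = 36, x-bar = 5, s = 2, 95% CI
low, high = confidence_interval(mean=5, sd=2, n=36)
print(round(low, 2), round(high, 2))  # 4.35 5.65
```

Passing z=2.58 instead reproduces the 99% interval asked for in the exercise on the next slide.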
• 11. Confidence Interval cont. • So the 95% confidence interval for the mean using this formula is between 4.35 and 5.65. Notice that if we select another random sample of size 36, its mean and standard deviation would be different, so we would obtain a different confidence interval. Exercise: Use the same data given above to calculate the 99% confidence interval of the population mean
  • 12. Confidence Interval cont. • If independent samples are taken repeatedly from the same population, and a confidence interval calculated for each sample, then a certain percentage (confidence level) of the intervals will include the unknown population parameter. • Confidence intervals are usually calculated so that this percentage is 95%, but we can produce 90%, 99%, 99.9% (or whatever) confidence intervals for the unknown parameter.
• 13. Confidence Interval cont. • The width of the confidence interval gives us some idea of how uncertain we are about the unknown parameter. • A very wide interval may indicate that more data should be collected before anything very definite can be said about the parameter. • Confidence intervals are more informative than the simple results of hypothesis tests (where we decide “reject Ho” or “don’t reject Ho”) since they provide a range of plausible values for the unknown parameter.
  • 14. Confidence Interval cont. • Confidence limits are the lower and the upper boundaries/values of a confidence interval, that is, the values which define the range of a confidence interval. • The upper and lower bounds of a 95% confidence interval are the 95% confidence limits. Such limits may be taken for other confidence levels, for example, 90%, 99%, 99.9%.
• 15. Hypothesis Testing • The second type of inferential statistics is hypothesis testing. This is sometimes called statistical testing as well. • In point estimation and in constructing confidence intervals, we had no expectations about the values we calculated, whereas in hypothesis testing we have formed some expectation about the population parameter.
  • 16. HYPOTHESIS TESTING cont. Example • Our hypothesis is that “tree mortality after a particular forest fire will be greater than 60%”, in other words average tree mortality > 60%. • Once our notion of the population parameter has been developed, we can write two contradictory hypotheses:  The first is research (or alternative) hypothesis, which in our case is that “the mean tree mortality > 60%”.  The second hypothesis is called the null hypothesis, and is the opposite of our research hypothesis. In our example, the null hypothesis would be stated as “the mean tree mortality is less than or equal to 60%”.
  • 17. Hypothesis Testing cont. Basic Concepts in Test of Hypothesis • Def.: A Hypothesis is a tentative explanation for an observation, phenomenon, or scientific problem that can be tested by further investigation.
  • 18. Null and Alternative Hypothesis • Null Hypothesis: The null hypothesis, (Ho), represents a theory that has been put forward, either because it is believed to be true or because it is to be used as a basis for argument, but has not been proved. • For example, in a clinical trial of a new drug, the null hypothesis might be that “the new drug is no better, on average, than the current drug”. We would write Ho: there is no difference between the two drugs on average.
• 19. Null and Alternative Hypothesis • Alternative Hypothesis: The alternative hypothesis, H1, is a statement of what a statistical hypothesis test is set up to establish. • For example, in a clinical trial of a new drug, the alternative hypothesis might be that “the new drug has a different effect, on average, compared to that of the current drug”. We would write: • H1: the two drugs have different effects, on average.
  • 20. Null and Alternative Hypothesis • The alternative hypothesis might also be that the new drug is better, on average, than the current drug. In this case we would write: • H1: the new drug is better than the current drug, on average.
• 21. Null and Alternative Hypothesis • We give special consideration to the null hypothesis. This is because the null hypothesis relates to the statement being tested, whereas the alternative hypothesis relates to the statement to be accepted if/when the null is rejected. • The final conclusion, once the test has been carried out, is always given in terms of the null hypothesis. We either reject Ho in favor of H1 or do not reject Ho. We never conclude “reject H1” or even “accept H1”. • When we conclude “do not reject Ho”, this does not necessarily mean that the null hypothesis is true; it only suggests that there is not sufficient evidence against Ho in favor of H1. Rejecting the null hypothesis then suggests that the alternative hypothesis may be true.
• 22. One and Two Tailed Tests One-Tailed Tests Example • Our hypothesis is that tree mortality after a particular forest fire will be greater than 60%. In other words, average tree mortality > 60%. • This is a one-tailed test: we are only considering the possibility that the population mean is larger than some number, so we would reject the null hypothesis only for large values of tree mortality.
• 23. Two Tailed Tests cont. A two-tailed test is used when the research hypothesis does not specify a direction: Example • Our research hypothesis is that “tree mortality following fire will not be equal to 60%”, whereas • our null hypothesis would read “tree mortality following fire is equal to 60%”. • Under this scenario, we could reject our null hypothesis if tree mortality was much larger than 60% or much smaller than 60%. • This is a two-tailed test
• 24. Significance Significance • The p-value is the probability of obtaining an outcome at least as extreme as the one observed, assuming the null hypothesis is true. • A low p-value indicates rejection of the null hypothesis. • Typically: reject Ho if p-value ≤ 0.05 (a test at the 5% significance level) or ≤ 0.01 (at the 1% significance level). Statistically significant means the effect is unlikely to be due to chance.
• 25. Type I and II Errors Type I and II Errors • We define a type I error as the event of rejecting the null hypothesis when the null hypothesis was true. The probability of a type I error (α) is called the significance level. • We define a type II error (with probability β) as the event of failing to reject the null hypothesis when the null hypothesis was false. • The type I risk is the chance of deciding that a significant effect is present when it isn’t. • The type II risk is the chance of not detecting a significant effect when one exists.
  • 26. Test of Hypothesis Steps in Test of Hypothesis The usual process of hypothesis testing consists of four steps: • Formulate the null hypothesis Ho (commonly, that the observations are the result of pure chance) • and the alternative hypothesis H1 (commonly, that the observations show a real effect combined with a component of chance variation). • Identify a test statistic that can be used to assess the truth of the null hypothesis.
• 27. Test of Hypothesis cont. • Compute the P-value, which is the probability that a test statistic at least as extreme as the one observed would be obtained, assuming that the null hypothesis were true. The smaller the P-value, the stronger the evidence against the null hypothesis. • Compare the P-value to an acceptable significance value α (sometimes called an alpha value). If P ≤ α, the observed effect is statistically significant: the null hypothesis is rejected in favor of the alternative hypothesis.
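As an illustration of these steps, a one-sample z-test (the simple case where the population standard deviation is treated as known) can be sketched in Python. The data and function name here are hypothetical:

```python
import math

def z_test_p_value(sample_mean, mu0, sd, n):
    """Two-tailed p-value for H0: mu = mu0, with sd treated as known."""
    z = (sample_mean - mu0) / (sd / math.sqrt(n))  # test statistic
    # Two-tailed p-value from the standard normal CDF, via math.erf
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

# Hypothetical sample: mean 5.8 from n = 36, against H0: mu = 5.0, sd = 2.0
z, p = z_test_p_value(sample_mean=5.8, mu0=5.0, sd=2.0, n=36)
alpha = 0.05
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {p <= alpha}")
```

Here p ≈ 0.016 ≤ 0.05, so at the 5% significance level we would reject Ho, exactly the decision rule stated on this slide.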
  • 28. Statistical Tests Statistical tests include: • Linear Regression • T-test • ANOVA
• 29. Regression Models and Correlation • The use of regression models is very common, and serves a very practical purpose for managers. • Regression models allow us to predict the outcome of one variable from another variable. • When two variables are related, it is possible to predict a person’s score on one variable from their score on the second variable with better than chance accuracy. • This section describes how these predictions are made and what can be learned about the relationship between the variables by developing a prediction equation.
• 30. Regression Models and Correlation • It will be assumed that the relationship between the two variables is linear. • Given that the relationship is linear, the prediction problem becomes one of finding the straight line that best fits the data. • Since regression here is used for prediction, this line is called the regression line.
• 31. Regression line The mathematical form of the regression line predicting Y from X is: Y = Bo + B1X • Where: - X is the variable represented on the X-axis (independent variable), - B1 is the slope of the line, - Bo is the Y-intercept, and - Y consists of the predicted values (dependent variable) for the various values of X.
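A least-squares fit of this line can be sketched in a few lines of pure Python (the data and the helper name `fit_line` are illustrative):

```python
def fit_line(xs, ys):
    """Least-squares estimates of Bo (intercept) and B1 (slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Slope: covariance of X and Y divided by variance of X
    b1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
          / sum((x - mx) ** 2 for x in xs))
    b0 = my - b1 * mx  # the line passes through (x-bar, y-bar)
    return b0, b1

# Made-up data lying exactly on Y = 1 + 2X
b0, b1 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)  # 1.0 2.0
```

With real data the points will not lie exactly on the line; the fitted Bo and B1 are then the values that minimize the squared prediction errors.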
• 32. The Coefficient of Correlation • The correlation between two variables reflects the degree to which the variables are related. The most common measure of correlation is the Pearson Product Moment Correlation (called Pearson’s correlation for short). • When measured in a population, the Pearson Product Moment correlation is designated by the Greek letter rho (ρ). • When computed in a sample, it is designated by the letter r and is sometimes called “Pearson’s r”. • Pearson’s correlation reflects the degree of linear relationship between two variables. It ranges from −1 to +1.
  • 33. The Coefficient of Correlation • A correlation of +1 means that there is a perfect positive linear relationship. • A positive relationship shows high scores on the X axis that are associated with high scores on the Y-axis. • A correlation of -1 means that there is a perfect negative linear relationship between variables. • A negative relationship shows high scores on the X-axis that are associated with low scores on the Y-axis.
  • 34. The Coefficient of Correlation • A correlation of 0 means there is no linear relationship between the two variables.
  • 35. Coefficient of Determination • The coefficient of determination r2 gives the proportion of the variance (fluctuation) of one variable that is predictable from the other variables. • It is a measure that allows us to determine how certain one can be in making predictions from a certain model/graph. • The coefficient of determination is a measure of how well the regression line represents the data. • If the regression line passes exactly through every point on the scatter plot, it would be able to explain all of the variation.
• 36. Coefficient of Determination • The further the line is away from the points, the less variation it is able to explain. For example, if r = 0.922, then r2 = 0.850, which means that 85% of the total variation in Y can be explained by the linear relationship between X and Y. The other 15% of the total variation in Y remains unexplained (or is due to chance).
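Pearson’s r and the coefficient of determination r2 can be computed directly from their definitions; a minimal Python sketch with made-up data (the function name is illustrative):

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical paired observations
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 3), round(r ** 2, 3))  # r ≈ 0.775, r² = 0.6
```

For this sample, r2 = 0.6: 60% of the variation in Y is explained by its linear relationship with X, and the remaining 40% is unexplained.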
• 37. T-test • The t-test gives an indication of the separateness of two sets of measurements, and is thus used to check whether two sets of measures are essentially different. • In many situations, we will want to compare two population parameters. To compare these two populations, we can compare the differences between the two sample means. • The t-test looks for a significant difference in means between two samples or between a population and a sample. There are 3 types of t-tests: - One sample t-test - Independent 2 samples t-test - Paired sample t-test
• 38. One Sample T-test • One sample t-test: a statistical procedure used to test whether the sample mean differs from a known value of the population mean. • In a one sample t-test, we know the population mean. We draw a random sample from the population, compare the sample mean with the population mean, and make a statistical decision as to whether or not the sample mean is different from the population mean.
  • 39. Assumptions in One Sample t-test • In one sample t-test, dependent variables should be normally distributed. • In one sample t-test, samples drawn from the population should be random. • In one sample t-test, cases of the samples should be independent • The data is measurement data-interval/ratio • In one sample t-test, we should know the population mean.
• 40. Formula t = (x̄ − µ)/sx̄ Where: x̄ = Sample mean µ = Population mean sx̄ = Standard error of the mean (s/√n)
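This formula can be sketched in Python with the standard library (the sample data are illustrative):

```python
import math
import statistics

def one_sample_t(sample, mu):
    """t statistic for H0: population mean = mu."""
    n = len(sample)
    xbar = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    return (xbar - mu) / se

# Hypothetical sample tested against a known population mean of 5.0
t = one_sample_t([5.1, 4.9, 5.6, 5.2, 4.8, 5.4], mu=5.0)
print(round(t, 3))  # ≈ 1.356
```

The resulting t would then be compared against the t distribution with n − 1 degrees of freedom to obtain a p-value.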
  • 41. Independent t-test Independent t-test: the independent- measures t-test (or independent t-test) is used when measures from the two samples being compared do not come in matched pairs. It is used when groups are independent.
• 42. Related Formula t = (x̄1 − x̄2)/√{s2 (1/n1 + 1/n2)} For an independent 2 sample t-test, it is important to know whether the 2 samples have similar variances as we interpret the data. The variance homogeneity requirement may be tested with Levene’s test. Results for this can be given in SPSS along with the t-test results.
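The pooled-variance version of this statistic (which assumes equal variances, as discussed in the assumptions that follow) can be sketched as follows; the two samples are made up for illustration:

```python
import math
import statistics

def independent_t(a, b):
    """Pooled-variance t statistic for two independent samples."""
    na, nb = len(a), len(b)
    # Pooled variance s2: both sample variances combined, weighted by df
    s2 = ((na - 1) * statistics.variance(a)
          + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(s2 * (1 / na + 1 / nb))

# Hypothetical scores from two independent groups
t = independent_t([12, 14, 11, 13], [9, 10, 8, 11])
print(round(t, 3))  # ≈ 3.286
```

When Levene’s test suggests unequal variances, a modified (Welch) form of the test is used instead, as the next slides note.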
  • 43. Assumption in 2 sample independence T-test 1.0 Normality: Assumes that the population distributions are normal. The t-test is quite robust over moderate violations of this assumption. It is especially robust if a two tailed test is used and if the sample sizes are not especially small. Check for normality by creating a histogram. 2.0 Independent Observations: The observations within each treatment condition must be independent.
• 44. Assumption in 2 sample independence t-test cont. 3.0 Equal Variances: Assume that the population distributions have the same variance. This assumption is quite important (if it is violated, it makes the test’s averaging of the 2 variances meaningless). If it is violated, then use a modification of the t-test procedures as needed. See “Understanding the Output” in this section for how to check this with Levene’s Test for Equality of Variances.
• 45. Paired Sample T test The matched-pair t-test (or paired t-test, paired samples t-test or dependent t-test) is used when the data from the two groups can be presented in pairs, for example where the same people are being measured in a before-and-after comparison, or when the group is given two different tests at different times (e.g. pleasantness of two different types of chocolate).
• 46. Assumptions in paired sample t-test 1. The first assumption in the paired sample t-test is that only matched pairs can be used to perform the paired sample t-test. 2. In the paired sample t-test, normal distributions are assumed. 3. Variance in paired sample t-test: in a paired sample t-test, it is assumed that the variances of the two samples are the same. 4. The data is measurement data (interval/ratio). 5. Independence of observations in paired sample t-test: in a paired sample t-test, observations must be independent of each other.
• 47. Formula: t = d̄ / √(s2/n) Where: d̄ is the mean difference between the two samples; s2 is the variance of the differences, n is the sample size, and t is a paired sample t statistic with n − 1 degrees of freedom
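A minimal Python sketch of this formula on made-up before-and-after data (the function name is illustrative):

```python
import math
import statistics

def paired_t(before, after):
    """Paired-samples t statistic computed on the per-pair differences."""
    d = [b - a for b, a in zip(after, before)]  # one difference per matched pair
    dbar = statistics.mean(d)
    sd = statistics.stdev(d)                    # std. deviation of the differences
    return dbar / (sd / math.sqrt(len(d)))

# Hypothetical scores for the same five people before and after a treatment
t = paired_t(before=[6, 7, 5, 8, 6], after=[8, 8, 7, 9, 8])
print(round(t, 3))  # ≈ 6.532
```

Note that the test is computed on the differences, not on the two raw samples, which is why only matched pairs can be used.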
  • 48. ANOVA or Analysis of Variance So far we have discussed comparing the means of two populations to each other and comparing the population mean to another number. However, we often want to compare many populations to each other.
• 49. ANOVA or Analysis of Variance Example: We may want to compare regeneration rates for three different tree species in northern Idaho. We would begin by taking samples from each population, then calculate the means of the three samples and make an inference about the population means from these. These three sample mean regeneration rates would almost certainly be different numbers; however, this does not by itself mean that there is a difference between the population means for the three tree species. To answer that question we can use a statistical test called an analysis of variance, or ANOVA. This test is widely used in natural resources, and you are bound to come across it when reading scientific literature.
  • 50. The ANOVA Assumptions The use of an ANOVA assumes that: • All the populations are normally distributed (follow a bell shaped curve) • All the population variances are equal, • And all the samples were taken independently of each other and are randomly collected from their population. Generally, our null hypothesis when conducting an ANOVA is that all the population means are equal and our research (alternative) hypothesis will be that at least one of the population means is not equal.
• 51. The ANOVA Assumptions Although an ANOVA is widely used and does indicate that a population mean differs from the others, it does not tell us which one is different. Analysis of variance tests the null hypothesis that all the population means are equal: Formula: Ho: µ1 = µ2 = µ3 … = µa You can read more in textbooks
• 52. The ANOVA cont. • Analysis of variance tests the null hypothesis by comparing two estimates of the variance σ2 (recall that σ2 is the variance within each of the “a” treatment populations). One estimate (called the mean square error, or MSE for short) is based on the variances within the samples. The MSE is an estimate of σ2 whether or not the null hypothesis is true. The second estimate (mean square between, or MSB for short) is based on the variance of the sample means. The MSB is only an estimate of σ2 if the null hypothesis is true; if the null hypothesis is false, then MSB estimates something larger than σ2. The logic by which analysis of variance tests the null hypothesis is as follows: if the null hypothesis is true, then MSE and MSB should be about the same, since they are both estimates of the same quantity (σ2); however, if the null hypothesis is false, then MSB can be expected to be larger than MSE, since MSB is estimating a quantity larger than σ2. • Therefore, if MSB is sufficiently larger than MSE, the null hypothesis can be rejected; if MSB is not sufficiently larger than MSE, the null hypothesis cannot be rejected. How much larger is “sufficiently larger”? This is judged with the ratio F = MSB/MSE.
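The MSB/MSE comparison can be sketched in a few lines of Python for the simple case of equal group sizes (the data and function name are illustrative):

```python
import statistics

def one_way_anova_f(groups):
    """F = MSB / MSE for a one-way ANOVA with equal-size groups."""
    k = len(groups)                  # number of treatment groups ("a")
    n = len(groups[0])               # observations per group
    grand = statistics.mean(m for g in groups for m in g)
    means = [statistics.mean(g) for g in groups]
    # MSB: based on the variance of the sample means (df = k - 1)
    msb = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    # MSE: average of the within-group sample variances
    mse = sum(statistics.variance(g) for g in groups) / k
    return msb / mse

# Three hypothetical treatment groups with clearly different means
F = one_way_anova_f([[4, 5, 6], [7, 8, 9], [10, 11, 12]])
print(F)  # 27.0
```

Here MSB is far larger than MSE (F = 27), so the null hypothesis of equal population means would be rejected once F is compared against the F distribution with (k − 1, k(n − 1)) degrees of freedom.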
  • 53. END •Questions •Next Class •Assignments •AOB Prof. Joseph M. Keriko Principal, JKUAT - Nairobi Campus Professor of Organic Chemistry and EIA/EA Leader Expert P.O. Box 39125 – 00623 Nairobi Tel. 0722-915026 Email: kerikojm@yahoo.co.uk