SAMPLING THEORY
Two ways of collecting statistical data:
      1. Complete Enumeration (or Census)
      2. Sample Survey
 Population (or Universe):
  The totality of statistical data forming the
   subject of investigation.
  Sample:
  A portion of the population which is examined
   with a view to estimating the
   characteristics of the population.
Methods of Sampling

1. Simple Random Sampling
   a) Simple Random Sampling without replacement
   b) Simple Random Sampling with replacement

2. Systematic Sampling

3. Stratified Sampling

4. Cluster Sampling

5. Quota Sampling

6. Purposive Sampling (or Judgment Sampling)
Some Important terms associated with sampling
Parameter : A characteristic of a population based on
  all the units of the population.
Statistic: A statistical measure computed from the sample
  observations; as such, it is a function of the sample
  observations.
Statistical inferences about population values (i.e.
  parameters) are drawn from the sample observations (i.e.
  statistics). The following notation is usually used:
       Measure             Parameter        Statistic
         Mean                   μ               x̄
        Proportion              P                p
    Standard deviation          σ               s
Sampling Distribution:

Starting with a population of N units, we can draw
  many samples of a fixed size n. In the case of
  sampling with replacement, the total number of
  samples that can be drawn is N^n, and when
  sampling is without replacement, the total
  number of samples that can be drawn is NCn = N!/[n!(N−n)!].
If it is possible to obtain the values of a statistic
  (t) from all possible samples of a fixed sample
  size, along with the corresponding probabilities, then
  we can arrange these values of the statistic
  (treating it as a random variable) in the form
  of a probability distribution. Such a probability
  distribution is called the Sampling Distribution of the statistic.
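As a rough illustration of the definition above, the sketch below enumerates every possible sample from a tiny population and tabulates the sampling distribution of the sample mean. The population values and the names `population`, `sampling_distribution`, `dist_wr` and `dist_wor` are assumptions made for this example only.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product
from math import comb

population = [2, 4, 6, 8]   # hypothetical population, N = 4
n = 2                       # fixed sample size

# With replacement: ordered samples, N^n of them.
with_repl = list(product(population, repeat=n))
# Without replacement: unordered samples, NCn of them.
without_repl = list(combinations(population, n))

def sampling_distribution(samples):
    """Map each value of the sample mean to its probability,
    treating all listed samples as equally likely."""
    counts = Counter(Fraction(sum(s), len(s)) for s in samples)
    total = len(samples)
    return {mean: Fraction(c, total) for mean, c in sorted(counts.items())}

dist_wr = sampling_distribution(with_repl)      # 16 samples
dist_wor = sampling_distribution(without_repl)  # 6 samples

for mean, prob in dist_wr.items():
    print(f"mean = {mean}, probability = {prob}")
```

In both cases the expectation of the sample mean equals the population mean (5 for these values); only the spread of the distribution differs.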
Basic Statistical Laws:
1. Law of Statistical Regularity:- It states that a
  reasonably large number of items selected at random
  from a large group of items, will on the average represent
  the characteristics of the group.

2. Law of Inertia of Large Numbers: It states that large
  groups of data show a high degree of stability because
  there is a greater possibility that extremes on one side are
  compensated by the extremes on the other side.

3. Central Limit Theorem: If x1, x2, x3, ..., xn is a random
  sample of size n drawn from any population (having mean µ
  and finite variance σ²), then the sample mean (x̄) is
  approximately normally distributed with mean µ and variance σ²/n,
  provided n is sufficiently large; the approximation becomes exact
  as n→∞. Here µ and σ² are the population mean and variance respectively.
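The Central Limit Theorem can be checked informally by simulation. The sketch below is only an illustration under assumed settings: a uniform population on [0, 1), sample size 30 and 2000 repeated samples are arbitrary choices, and the seed is fixed so that the run is repeatable.

```python
import random
from statistics import fmean, pvariance

random.seed(0)          # fixed seed so the run is repeatable

n = 30                  # sample size (assumed)
num_samples = 2000      # number of repeated samples (assumed)

# Population: uniform on [0, 1), with mean 1/2 and variance 1/12.
mu, sigma2 = 0.5, 1 / 12

# Draw many samples and record the mean of each one.
sample_means = [fmean(random.random() for _ in range(n))
                for _ in range(num_samples)]

mean_of_means = fmean(sample_means)     # should be close to mu
var_of_means = pvariance(sample_means)  # should be close to sigma2 / n

print(f"mean of sample means     = {mean_of_means:.4f} (mu = {mu})")
print(f"variance of sample means = {var_of_means:.5f} (sigma^2/n = {sigma2 / n:.5f})")
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is flat.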
The mean of the sampling distribution of a statistic t is called its
  'Expectation', and the standard deviation of t is called its Standard Error.

Standard Errors (S.E.) of common statistics:

             Statistic                     Standard Error (S.E.)
1. Single Mean (x̄):                            σ/√n
2. Difference of Means (x̄ − ȳ):         √[σ'²/n' + σ"²/n"]
3. Single Proportion (p):                     √[PQ/n]
4. Difference of Proportions (p' − p"):  √[PQ(1/n' + 1/n")]
   (where Q = 1 − P)

The factor √[(N−n)/(N−1)] is known as the finite population
  correction factor (fpc). The S.E. is multiplied by it when the sample
  is drawn without replacement from a finite population of N units.
It is ignored for a large population and is used when n/N is
  greater than 0.05.
Examples :
1. A simple random sample of size 36 is drawn from a
   finite population consisting of 101 units. If the
   population S.D. is 12.6, find the standard error
   of the sample mean when the sample is drawn
   (a) with replacement   (b) without replacement.
                                   [Ans: a) 2.1 b) 1.69]

2. A random sample of 500 oranges was taken
  from a large consignment and 65 were found to
  be defective. Show that the S.E. of the
  proportion of bad ones in a sample of this size is
  0.015.
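The two examples can be checked with a few lines of code. This is a minimal sketch; the function names `se_mean` and `se_proportion` are invented for the illustration, and in Example 2 the sample proportion 65/500 is used as the estimate of the unknown P.

```python
from math import sqrt

def se_mean(sigma, n, N=None):
    """S.E. of the sample mean. If the population size N is given,
    the finite population correction is applied (without replacement)."""
    se = sigma / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))
    return se

def se_proportion(p, n):
    """S.E. of a sample proportion, sqrt(PQ/n) with Q = 1 - P."""
    return sqrt(p * (1 - p) / n)

# Example 1: sigma = 12.6, n = 36, N = 101
se_a = se_mean(12.6, 36)          # (a) with replacement
se_b = se_mean(12.6, 36, N=101)   # (b) without replacement

# Example 2: 65 defective oranges out of 500
se_p = se_proportion(65 / 500, 500)

print(round(se_a, 2), round(se_b, 2), round(se_p, 3))
```

Here n/N = 36/101 is well above 0.05, which is why the correction changes the answer noticeably in part (b).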
Theory of Estimation :

Point Estimation: When a single sample value (t) is used to
  estimate the parameter (θ), it is called point estimation.

Interval Estimation: Instead of estimating the parameter θ by a
   single value, an interval of values is defined. It specifies two
   values between which the unknown parameter is expected to lie,
i.e. P(t' ≤ θ ≤ t") = 1 − α. Then [t', t"] is called the confidence
   interval.
     α is called the level of significance (l.o.s.), e.g. 5% or 1%.
     1 − α is called the confidence level, e.g. 95% or 99%.
Confidence Level
The confidence level is the probability value (1 − α) associated with a
   confidence interval.
It is often expressed as a percentage. For example, say α = 0.05; then
   the confidence level is equal to (1 − 0.05) = 0.95, i.e. a 95%
   confidence level.
Determination of sample size for Mean :

The following factors must be known:
i) The desired confidence level.
ii) The permissible sampling error E = |x̄ − µ|.
iii) The standard deviation σ.

The sample size n is given by

            n = (σZ / E)²,
  where Z is the critical value for the desired confidence level.
Determination of sample size for
   Proportion:

The following factors must be known:
i) The desired confidence level.
ii) The permissible sampling error E = |P − p|.
iii) The estimated true proportion of success, p.

The sample size n is given by

            n = Z²pq / E²,   where q = 1 − p.
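Both sample-size formulas follow from setting the permissible error E equal to Z times the standard error and solving for n. The derivation below is a short sketch using the standard errors σ/√n and √(pq/n).

```latex
\[
E = Z\,\frac{\sigma}{\sqrt{n}}
\;\Longrightarrow\;
\sqrt{n} = \frac{Z\sigma}{E}
\;\Longrightarrow\;
n = \left(\frac{Z\sigma}{E}\right)^{2}
\]
\[
E = Z\,\sqrt{\frac{pq}{n}}
\;\Longrightarrow\;
E^{2} = \frac{Z^{2}pq}{n}
\;\Longrightarrow\;
n = \frac{Z^{2}pq}{E^{2}}, \qquad q = 1 - p
\]
```

In practice n is rounded up to the next whole number so that the error does not exceed E.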
Problems:
1. It is known that the population standard deviation of the waiting time
    for an L.P.G. gas cylinder in Delhi is 15 days. How large a sample
    should be chosen to be 95% confident that the estimated waiting time is
    within 7 days of the true average?                      [Ans: 18]

2. A manufacturing concern wants to estimate the average amount
   of purchase of its product in a month by its customers, for which the
   standard deviation is Rs.10. Find the sample size if the maximum
   error is not to exceed Rs.3 with a probability of 0.99.  [Ans: 74]

3. The business manager of a large company wants to check the
   inventory records against the physical inventories by a sample
   survey. He wants to be almost certain (Z = 3) that the maximum
   sampling error is not more than 5% above or below the true proportion of
   accurate records. The proportion of accurate records is
   estimated as 35% from past experience. Determine the sample
   size.                                                    [Ans: 819]
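The answers to the three problems can be reproduced as follows. This is a sketch with invented function names; the Z values 1.96 (95%), 2.58 (99%) and 3 ("almost certain") are the ones that reproduce the stated answers.

```python
from math import ceil

def sample_size_mean(sigma, z, E):
    """n = (sigma * Z / E)^2, rounded up to a whole number."""
    # round() first to absorb floating-point noise before taking the ceiling
    return ceil(round((sigma * z / E) ** 2, 9))

def sample_size_proportion(p, z, E):
    """n = Z^2 * p * q / E^2 with q = 1 - p, rounded up."""
    return ceil(round(z ** 2 * p * (1 - p) / E ** 2, 9))

n1 = sample_size_mean(sigma=15, z=1.96, E=7)          # Problem 1
n2 = sample_size_mean(sigma=10, z=2.58, E=3)          # Problem 2
n3 = sample_size_proportion(p=0.35, z=3, E=0.05)      # Problem 3

print(n1, n2, n3)
```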
                           ************
Standard deviation and confidence
  intervals
If t is a statistic, then

   95% confidence interval is given by
                  [ t ± 1.96 S.E.of t]
   99% confidence interval is given by
                  [ t ± 2.58 S.E.of t]
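As an illustration of these intervals, the sketch below reuses the oranges example (65 defective out of 500) and computes 95% and 99% confidence intervals for the proportion of bad oranges. The function name `confidence_interval` is an assumption made for the example.

```python
from math import sqrt

def confidence_interval(t, se, z=1.96):
    """Return the interval [t - z*S.E., t + z*S.E.] as a tuple."""
    return (t - z * se, t + z * se)

n = 500
p = 65 / n                       # sample proportion of defective oranges
se = sqrt(p * (1 - p) / n)       # S.E. of the proportion, about 0.015

ci95 = confidence_interval(p, se)            # z = 1.96
ci99 = confidence_interval(p, se, z=2.58)    # z = 2.58

print(f"95% CI: {ci95[0]:.3f} to {ci95[1]:.3f}")
print(f"99% CI: {ci99[0]:.3f} to {ci99[1]:.3f}")
```

The 99% interval is wider than the 95% interval: a higher confidence level is paid for with a less precise estimate.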
There are five ingredients to any statistical
  test :

(a) Null Hypothesis (Ho)

(b) Alternate Hypothesis

(c) Test Statistic

(d) Rejection/Critical Region or Acceptance of Ho

(e) Conclusion
Null Hypothesis
H0: there is no significant difference between the
 two values (i. e. statistic and parameter or two
 sample values)



Alternative hypothesis
H1: The above difference is significant
[the statement to be accepted if the null is
  rejected ]
Type I Error
In a hypothesis test, a type I error occurs when
the null hypothesis is rejected when in fact it is
true; that is, Ho is wrongly rejected.
P(type I error) = significance level = α.
Type I error = (Reject Ho | Ho is true)

Type II Error
In a hypothesis test, a type II error occurs when
the null hypothesis Ho is not rejected when in fact
  it is false.
Type II error = (Accept Ho | Ho is not true)
                         Decision
                Reject Ho          Accept Ho
Truth   Ho    Type I Error      Right decision
        H1    Right decision    Type II Error

P(Reject Ho | Ho is true) = P(Type I Error)
                          = α, the level of significance
                            (Producer's risk)
P(Accept Ho | Ho is not true) = P(Type II Error) = β
                            (Consumer's risk)
A type I error is often considered to be more serious, and
  therefore more important to avoid, than a type II error.
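To make the two error probabilities concrete, the sketch below computes the probability of a type II error for a two-tailed z-test of a mean. All numbers (hypothesised mean 50, true mean 52, σ = 6, n = 36) are hypothetical, and the function name `type_ii_error` is invented for this illustration.

```python
from statistics import NormalDist

def type_ii_error(mu0, mu_true, sigma, n, alpha=0.05):
    """P(do not reject Ho | true mean is mu_true) for a two-tailed z-test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # e.g. 1.96 for alpha = 5%
    se = sigma / n ** 0.5
    shift = (mu_true - mu0) / se                 # mean of z when mu_true holds
    # Ho is not rejected when -z_crit < z < z_crit
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)

beta = type_ii_error(mu0=50, mu_true=52, sigma=6, n=36)
# When Ho is actually true, the same probability is 1 - alpha.
no_reject_under_h0 = type_ii_error(mu0=50, mu_true=50, sigma=6, n=36)

print(f"P(type II error) = {beta:.3f}")
print(f"P(type I error)  = {1 - no_reject_under_h0:.3f}")
```

Lowering α makes a type I error less likely but, for a fixed sample size, makes a type II error more likely.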
• One tailed test : Here the alternate hypothesis H1 is one-sided,
  and we test whether the test statistic falls in the critical
  region on only one side of the distribution.

• Two tailed test : Here the alternate hypothesis H1 is
  formulated to test for a difference in either direction.
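Common test statistics

        Name                            Formula

1. One-sample z-test          z = (x̄ − µ) / (σ/√n)

2. Two-sample z-test          z = (x̄ − ȳ) / √[σ'²/n' + σ"²/n"]

3. One-proportion z-test      z = (p − P) / √[PQ/n]

4. Two-proportion z-test      z = (p' − p") / √[PQ(1/n' + 1/n")]

Each statistic is (observed value − value under Ho) / S.E., with Q = 1 − P.
  In the two-proportion test, P is the pooled estimate
  P = (n'p' + n"p") / (n' + n").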
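Each of these test statistics has the form (observed value − value under Ho) / S.E., using the standard errors listed earlier. The sketch below writes the four of them as small functions; the function and argument names are assumptions made for the illustration, and the two-proportion version uses the usual pooled estimate of P.

```python
from math import sqrt

def z_one_sample(xbar, mu0, sigma, n):
    """One-sample z-test for a mean with known sigma."""
    return (xbar - mu0) / (sigma / sqrt(n))

def z_two_sample(xbar1, xbar2, sigma1, sigma2, n1, n2):
    """Two-sample z-test for the difference of two means."""
    return (xbar1 - xbar2) / sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)

def z_one_proportion(p, P0, n):
    """One-proportion z-test; P0 is the proportion stated in Ho."""
    return (p - P0) / sqrt(P0 * (1 - P0) / n)

def z_two_proportion(x1, n1, x2, n2):
    """Two-proportion z-test; x1, x2 are the counts of successes."""
    p1, p2 = x1 / n1, x2 / n2
    P = (x1 + x2) / (n1 + n2)   # pooled estimate of the common proportion
    return (p1 - p2) / sqrt(P * (1 - P) * (1 / n1 + 1 / n2))

# Hypothetical usage: sample mean 52 against a hypothesised mean of 50
print(z_one_sample(xbar=52, mu0=50, sigma=6, n=36))
```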
Critical Value(s)

The critical value(s) for a hypothesis test is a
  threshold to which the value of the test statistic
  in a sample is compared to determine whether or
  not the null hypothesis is rejected.
For Normal Tests:
Critical value (Z table)     Level of Significance
                               1%          5%
Two tailed test               2.58        1.96
One tailed test               2.33        1.645
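The tabulated critical values can be reproduced from the standard normal distribution. This is a small sketch using Python's standard library; the function name `critical_z` is invented for the example.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=True):
    """Critical value of Z for a given level of significance alpha."""
    # A two-tailed test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.01, 0.05):
    print(f"alpha = {alpha:.0%}: "
          f"two tailed = {critical_z(alpha):.3f}, "
          f"one tailed = {critical_z(alpha, two_tailed=False):.3f}")
```

The table values 2.58 and 2.33 are the two-decimal roundings of the computed 2.576 and 2.326.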
Decision:

*If modulus of the computed value of Z is less
  than table value of Z, then Accept Null
  Hypothesis Ho.
 i.e. Calculated |z| < Table z then Accept Ho

*If modulus of the computed value of Z is greater
  than table value of Z, then Reject Null
  Hypothesis Ho.
  i.e. Calculated |z| > Table z then Reject Ho
Steps in Hypothesis Testing
1. Identify the null hypothesis Ho and the alternate hypothesis H1.

2. Choose α (the level of significance). The value should be small, usually
   less than 10%. It is important to consider the consequences of both
   types of errors.

3. Select the test statistic and determine its value from the sample
   data. This value is called the observed value of the test statistic.

4. Compare the observed value of the statistic to the critical value
   obtained for the chosen l.o.s..

5. Make a decision. :
  -If the test statistic falls in the critical region:
   Reject Ho in favour of H1.
  -If the test statistic does not fall in the critical region:
   Conclude that there is not enough evidence to reject Ho.
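The five steps can be sketched in code for a two-tailed one-sample z-test of a mean. The data are hypothetical (sample mean 52.5, hypothesised mean 50, σ = 6, n = 36), and the function name `z_test_mean` is an assumption made for the illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(xbar, mu0, sigma, n, alpha=0.05):
    """Two-tailed z-test of Ho: mu = mu0 against H1: mu != mu0."""
    # Step 3: observed value of the test statistic
    z = (xbar - mu0) / (sigma / sqrt(n))
    # Step 4: critical value for the chosen level of significance
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 5: decision (True means z falls in the critical region)
    reject = abs(z) > z_crit
    return z, z_crit, reject

# Steps 1 and 2: Ho: mu = 50, H1: mu != 50, tested at two levels
z, z_crit_5, reject_5 = z_test_mean(52.5, 50, 6, 36, alpha=0.05)
_, z_crit_1, reject_1 = z_test_mean(52.5, 50, 6, 36, alpha=0.01)

print(f"z = {z:.2f}")
print(f"5% l.o.s.: critical value {z_crit_5:.2f}, reject Ho: {reject_5}")
print(f"1% l.o.s.: critical value {z_crit_1:.2f}, reject Ho: {reject_1}")
```

With these numbers Ho is rejected at the 5% level but not at the 1% level, which shows why the level of significance has to be fixed before looking at the data.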
Chi Square Goodness of Fit
(One Sample Test)
This test allows us to compare a collection of categorical
  data with some theoretical expected distribution.
Ho: There is no considerable difference between observed
  value and theoretical value.
H1: The difference is significant

Chi Square Test of Independence
For a contingency table that has r rows and c columns, the
  chi square test can be thought of as a test of
  independence. In a test of independence the null and
  alternative hypotheses are:
Ho: The two categorical variables are independent.
H1: The two categorical variables are related.
Calculate the chi square statistic χ² by completing
  the following steps:
1. For each observed number in the table, subtract
  the corresponding expected number (O − E).

2. Square the difference [ (O − E)² ].

3. Divide the squares obtained for each cell in the
  table by the expected number for that cell
 [ (O − E)² / E ].

4. Sum all the values for (O − E)² / E. This is the chi
  square statistic.
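The four steps translate directly into a short function. The sketch below applies it to a goodness-of-fit check with invented data (60 rolls of a die, so a fair die would give an expected count of 10 per face); the name `chi_square` and the counts are assumptions made for the example.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells (steps 1 to 4)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: observed counts of the faces 1..6 in 60 rolls
observed = [8, 12, 9, 11, 6, 14]
expected = [10] * 6                 # fair die: 60 / 6 = 10 per face

stat = chi_square(observed, expected)
dof = len(observed) - 1             # categories minus one

print(f"chi square = {stat:.2f} with {dof} degrees of freedom")
```

The table value of chi square for 5 degrees of freedom at the 5% level is 11.07, so with these counts Ho (the die is fair) would not be rejected.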
Example. Incidence of three types of malaria in three tropical regions.
               Asia     Africa     South America     Totals
   Malaria A    31        14            45              90
   Malaria B     2         5            53              60
   Malaria C    53        45             2             100
   Totals       86        64           100             250

Solution: Each expected number is E = (row total × column total) / grand total,
  e.g. 90 × 86 / 250 = 30.96. We now set up the following table:
   Observed (O)   Expected (E)   |O − E|    (O − E)²    (O − E)²/E
        31           30.96         0.04       0.0016     0.0000516
        14           23.04         9.04      81.72       3.546
        45           36.00         9.00      81.00       2.25
         2           20.64        18.64     347.45      16.83
         5           15.36        10.36     107.33       6.99
        53           24.00        29.00     841.00      35.04
        53           34.40        18.60     345.96      10.06
        45           25.60        19.40     376.36      14.70
         2           40.00        38.00    1444.00      36.10
Test Statistic:

χ² = Σ (O − E)² / E

Chi Square = 125.516 (calculated value)
Degrees of Freedom = (c − 1)(r − 1) = (2)(2) = 4

Reject Ho because 125.516 is greater than
 9.488 (table value for 4 d.o.f. at the 5% l.o.s.).
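The malaria example can be recomputed from the observed table alone. The sketch below derives the expected numbers from the row and column totals and then sums (O − E)²/E over the nine cells; variable names are chosen for the illustration, and 9.488 is the table value quoted in the example.

```python
# Observed counts: rows = Malaria A, B, C; columns = Asia, Africa, South America
observed = [
    [31, 14, 45],
    [2, 5, 53],
    [53, 45, 2],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = row total * column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

dof = (len(observed) - 1) * (len(observed[0]) - 1)
table_value = 9.488                 # 4 d.o.f., 5% level of significance
reject_h0 = chi2 > table_value

print(f"chi square = {chi2:.3f}, d.o.f. = {dof}, reject Ho: {reject_h0}")
```

Without intermediate rounding the statistic comes to about 125.52; the small difference from 125.516 comes from rounding the individual terms in the worked table.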
One-way analysis of variance



If the variances in the groups (treatments) are
  similar, we can divide the variation of the
  observations into
the variation between the groups (variation of the means)
  and
the variation within the groups. The variation is
  measured with the sum of squares.
Analysis of Variance (By Coding Method)
Steps in Short Cut Method
1. Set the null hypothesis Ho and the alternate hypothesis H1.
2. Steps of computing the test statistic:
i] Find the sum of all the values of all the items of all the
   samples (T).
ii] Compute the correction factor C = T² / N, where
    N is the total number of observations in all the samples.
iii] Find the sum of squares of all the items of all the samples.
iv] Find the total sum of squares SST = [total in (iii)] − C.
v] Find the sum of squares between the samples, SSC:
    square each sample total, divide by the number of
   elements in that sample, add these quotients and subtract C.
vi] Find the sum of squares within the samples, SSE = SST − SSC.
vii] Set up the ANOVA table and calculate F, which is the test
   statistic.
viii] If the calculated F is less than the table F, accept Ho; otherwise
   reject Ho.
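The short cut method can be sketched as a single function. The three samples below are invented for the illustration, and the name `one_way_anova` is an assumption; the within-samples sum of squares is obtained as SST − SSC.

```python
def one_way_anova(samples):
    """One-way ANOVA by the short cut (correction factor) method."""
    N = sum(len(s) for s in samples)                   # total observations
    T = sum(sum(s) for s in samples)                   # grand total
    C = T ** 2 / N                                     # correction factor
    sum_sq = sum(x ** 2 for s in samples for x in s)   # sum of squares of items
    SST = sum_sq - C                                   # total sum of squares
    SSC = sum(sum(s) ** 2 / len(s) for s in samples) - C   # between samples
    SSE = SST - SSC                                    # within samples
    df_between = len(samples) - 1
    df_within = N - len(samples)
    MSC = SSC / df_between
    MSE = SSE / df_within
    return {"SST": SST, "SSC": SSC, "SSE": SSE,
            "df": (df_between, df_within), "F": MSC / MSE}

# Hypothetical data: three samples of three observations each
samples = [[8, 10, 12], [7, 9, 11], [12, 14, 16]]
result = one_way_anova(samples)
print(result)
```

For these data F = 21/4 = 5.25 with (2, 6) degrees of freedom; the 5% table value for (2, 6) is 5.14, so Ho (equal means) would be rejected, though only narrowly.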
ANOVA Table
  Source of   Sum of    d.o.f.   Mean Squares             F
  variation   squares

 Between      SSC        c−1     MSC = SSC/(c−1)
 Samples
                                                     F = MSC/MSE
 Within       SSE       c(r−1)   MSE = SSE/[c(r−1)]   [or MSE/MSC, so that
 Samples                                              the F ratio is
                                                      greater than 1]
 Total        SST        cr−1        -

 (c = number of samples, r = number of observations in each sample)