Nora Galambos, PhD
Office of Institutional Research
        Stony Brook University
» What hypotheses are being tested?
» What types of analyses are planned to test the
  hypotheses?
» Look over the instrument and create a map or
  outline of possible analysis methods
» What is the magnitude of the differences you
  would like to detect?
» The most obvious reason for pilot testing is to
  be able to estimate the sample size.
» Find potential sources of bias
» Assists in power calculations
» Discover possible distribution problems prior to
  surveying the entire sample
» A Type I error occurs when a true null
  hypothesis is rejected. The probability of a
  Type I error is denoted by α, and is the
  significance level of the hypothesis test, with
  0.05 being a common value for α.

» On the other hand, a Type II error occurs when
  the null hypothesis is false and it is not
  rejected. The probability of a Type II error is
  denoted by β and is often set to 0.20.
                                          True Results

Experimental Results        Ho is true                   Ho is false

Reject Ho              α (Type I error rate)            Power = 1 - β

Fail to reject Ho      1 - α (correct decision)     β (Type II error rate)
» Statistical Power Analysis for the Behavioral
  Sciences—Jacob Cohen
» The power of a significance test is the probability of
  rejecting a false null hypothesis, and is equal to 1 -
   β. If β is 0.20, the power = 0.80.
» 0.80 is generally considered to be an adequate
  level for the power
» Since sample size and power are related, a small
  sample size results in less power, or reduced
  probability of rejecting a false null hypothesis.
Power by effect size d = 0.2, 0.5, 0.8 (small, medium, and large effects)

n (for each group)     d = 0.2              d = 0.5              d = 0.8
        30               0.03                 0.24                 0.66
        40               0.04                 0.35                 0.82
        50               0.06                 0.45                 0.91
        60               0.07                 0.55               >0.995
        80               0.12                 0.82               >0.995
       100               0.29                 0.99               >0.995
       200               0.29               >0.995               >0.995
       500               0.72               >0.995               >0.995
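
The link between sample size, effect size, and power can be sketched with a normal approximation to a two-sided, two-sample test of means. This is an illustration only, not the method behind the table above: the function name `two_sample_power` is hypothetical, and the significance level is a parameter because the slide does not state the α used for its values.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test of means.

    d            -- standardized effect size (Cohen's d)
    n_per_group  -- number of subjects in each of the two groups
    alpha        -- significance level (Type I error rate), an assumed default

    Uses the normal approximation; exact t-based power is slightly lower
    for small samples.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = d * sqrt(n_per_group / 2)   # expected value of the test statistic
    # Probability that the statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


for n in (30, 50, 100, 500):
    row = [round(two_sample_power(d, n), 2) for d in (0.2, 0.5, 0.8)]
    print(n, row)
```

Passing a smaller `alpha` lowers every value, which is why the significance level has to be fixed before reading a power table.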
» Missing Completely at Random (MCAR)
  ˃ Given two variables X and Y, the missingness is unrelated to either.
    The missing values in X are independent of Y and vice versa.
  ˃ If the data are MCAR, then listwise deletion is appropriate
» Missing at Random (MAR)
  ˃ Given two variables X and Y, the missingness is related to or
    dependent upon X, but not Y. Suppose X = age and Y = income. If
    income is more often missing in certain age groups, but within each
    age group no income group is missing more often than any
    other, then the data are MAR.
» Nonignorable
  ˃ Given two variables X and Y, the missingness is related to X, but may
    also be related to Y. In our age-income example, certain income
    groups within an age group may be less likely to respond.
» Select items with a missing percentage greater
  than 1% or 2%.
» Recode them into binary variables with
  1 = missing and 0 = non-missing.
» Analyze these variables by the demographic
  variables using t-tests or chi-square, as
  appropriate.
» Significant results indicate that missingness is
  associated with one or more of the
  demographic variables.
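
The recode-and-test steps above can be sketched as follows. Everything here is an assumption for illustration: the responses are made up, `None` stands for a skipped item, and `missing_flag` and `chi_square_2x2` are hypothetical helper names. The sample is far too small for a real chi-square test; it only shows the mechanics.

```python
from math import erfc, sqrt


def missing_flag(value):
    """Recode a response: 1 = missing, 0 = non-missing."""
    return 1 if value is None else 0


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    table is [[a, b], [c, d]]; returns (statistic, p_value) with 1 degree
    of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2))


# Made-up responses: (gender, income), where None marks a skipped item.
responses = [("F", 52000), ("F", None), ("F", 61000), ("F", 48000),
             ("M", None), ("M", None), ("M", 57000), ("M", None)]

# Cross-tabulate the demographic variable against the missingness flag.
# Each list holds [count non-missing, count missing].
counts = {"F": [0, 0], "M": [0, 0]}
for gender, income in responses:
    counts[gender][missing_flag(income)] += 1

stat, p_value = chi_square_2x2([counts["F"], counts["M"]])
print(counts, round(stat, 3), round(p_value, 3))
```

A small p-value would suggest that missingness on the income item differs by gender, so the item is unlikely to be missing completely at random.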
» Used to uncover relationship patterns among a
  group of variables with the goal of reducing the
  variables to a smaller group
» Two types of data reduction methods:
  confirmatory and exploratory
» Exploratory factor analysis does not assume any
  particular structure prior to the analysis and is used
  to “explore” relationships between variables
» Confirmatory factor analysis is used to test
  hypotheses regarding the underlying structure of a
  group of variables
» Traditional factor analysis and principal
  components analysis are exploratory data
  reduction methods
» Principal components analysis is a method often
  used for reducing the number of variables
» Principal components analysis is part of the
  factor analysis procedures in SAS and SPSS
» Although factor analysis (FA) and principal
  components analysis (PCA) have mathematical
  differences, the results are often similar
» Many authors loosely use the term “factor
  analysis” to refer to data reduction methods, in
  general
» Finds groups of variables that are correlated
  with each other, possibly measuring the same
  construct.
» Reduces the variables in the data to a
  smaller number of items that account for
  most of the variance of all of the variables in
  the data
» The first component accounts for the
  greatest amount of variance. The second
  one accounts for the greatest amount not
  accounted for by the first component and is
  uncorrelated with the first component.
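
The way successive components pick up the largest remaining share of variance can be sketched with power iteration and deflation on a correlation matrix. This is one simple way to obtain components, not necessarily what SAS or SPSS do internally; the 3 x 3 correlation matrix is made up and the function names are hypothetical.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def largest_component(matrix, iterations=500):
    """Power iteration: (eigenvalue, unit eigenvector) for the largest eigenvalue."""
    vector = [float(i + 1) for i in range(len(matrix))]  # arbitrary non-zero start
    for _ in range(iterations):
        product = mat_vec(matrix, vector)
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    eigenvalue = sum(v * p for v, p in zip(vector, mat_vec(matrix, vector)))
    return eigenvalue, vector


def principal_components(corr, k):
    """Extract the first k components of a correlation matrix by deflation."""
    work = [row[:] for row in corr]
    size = len(work)
    components = []
    for _ in range(k):
        eigenvalue, vector = largest_component(work)
        components.append((eigenvalue, vector))
        # Remove the variance already explained so the next pass finds the
        # direction of greatest remaining variance.
        work = [[work[i][j] - eigenvalue * vector[i] * vector[j]
                 for j in range(size)] for i in range(size)]
    return components


# Made-up correlations: items 1 and 2 are strongly related, item 3 is not.
corr = [[1.0, 0.8, 0.1],
        [0.8, 1.0, 0.1],
        [0.1, 0.1, 1.0]]

components = principal_components(corr, 3)
eigenvalues = [ev for ev, _ in components]
for ev in eigenvalues:
    print(round(ev, 3), "=", round(100 * ev / len(corr), 1), "% of total variance")
```

The eigenvalues come out in decreasing order and sum to the number of variables, so each one divided by that number is the proportion of variance its component accounts for.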
» Suggested sample size: at least 100 subjects
  and 10 observations per variable
» A correlation analysis of the variables should
  result in most correlations greater than 0.3
» Bartlett’s test of sphericity is significant (p <
  0.05)
» Kaiser-Meyer-Olkin (KMO) test of sampling
  adequacy ≥ 0.6
» Determinant of the correlation matrix > 0.00001,
  which indicates that multicollinearity is not a
  problem
» In SPSS select principal components
  under “extraction method”
» Select varimax rotation.
  ˃A rotation uses a transformation to aid in the
   interpretation of the factor solution
  ˃A varimax rotation is orthogonal, so the components
   remain uncorrelated; it maximizes the variance of the
   squared loadings within each column
» Kaiser criterion—choose components with
  eigenvalues greater than one.
» Scree plot—plot of eigenvalues
  ˃ Retain the components before the point where the plot levels off.

» Want the proportion of variance accounted
  for by each factor (or component) to be 5%
  to 10%
» Cumulative variance accounted for should
  be 70% to 80%
Total Variance Explained

           Initial Eigenvalues                  Extraction Sums of Squared Loadings  Rotation Sums of Squared Loadings
Component   Total  % of Variance  Cumulative %   Total  % of Variance  Cumulative %   Total  % of Variance  Cumulative %
        1   14.26          47.53         47.53   14.26          47.53         47.53    7.22          24.06         24.06
        2    2.55           8.49         56.02    2.55           8.49         56.02    5.79          19.31         43.37
        3    1.37           4.56         60.58    1.37           4.56         60.58    4.41          14.70         58.07
        4    1.09           3.64         64.22    1.09           3.64         64.22    1.84           6.15         64.22
        5    0.98           3.26         67.48
        6    0.86           2.86         70.33
        7    0.80           2.67         73.00
        8    0.75           2.51         75.51
        9    0.68           2.25         77.76
       10    0.62           2.06         79.82
       11    0.58           1.93         81.75
       12    0.56           1.88         83.63
       13    0.49           1.64         85.27
       14    0.48           1.59         86.85
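
As a rough check, the Kaiser criterion and the variance percentages can be recomputed from the initial eigenvalues in the table above. The item count of 30 is an assumption inferred from the first row (14.26 / 0.4753 is about 30); it is not stated on the slide, and recomputed percentages can differ from the table in the last digit because the eigenvalues are rounded.

```python
# Initial eigenvalues for components 1-14, copied from the table above.
eigenvalues = [14.26, 2.55, 1.37, 1.09, 0.98, 0.86, 0.80,
               0.75, 0.68, 0.62, 0.58, 0.56, 0.49, 0.48]

# Assumption: 30 items were analyzed (inferred from 14.26 / 0.4753).
N_ITEMS = 30

# Each eigenvalue divided by the number of items is that component's
# share of the total variance.
percent = [100 * ev / N_ITEMS for ev in eigenvalues]

cumulative = []
running = 0.0
for p in percent:
    running += p
    cumulative.append(running)

# Kaiser criterion: keep components whose eigenvalue exceeds 1.
n_retained = sum(1 for ev in eigenvalues if ev > 1)

print("components retained:", n_retained)
print("cumulative % for retained:", round(cumulative[n_retained - 1], 2))
```

In the table, four components have eigenvalues above one and together account for 64.22% of the variance, which falls short of the 70% to 80% guideline.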
» There should be at least three items with
  significant loadings on each component
» Check the conceptualization of the component
  items
» With an orthogonal rotation, each factor loading
  equals the correlation between the variable and
  the component
» A communality is the proportion of variance in
  a variable that is accounted for by the retained
  components or factors. A variable's communality is
  large if the variable loads heavily on at least one
  component.
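
For an orthogonal solution, a variable's communality is the sum of its squared loadings on the retained components. A minimal sketch, using made-up loadings and hypothetical item names:

```python
# Hypothetical rotated loadings: one row per item, one value per retained component.
loadings = {
    "item_1": [0.80, 0.30],
    "item_2": [0.75, 0.10],
    "item_3": [0.20, 0.85],
    "item_4": [0.25, 0.15],
}


def communality(row):
    """Sum of squared loadings across the retained components."""
    return sum(loading * loading for loading in row)


for item, row in loadings.items():
    print(item, round(communality(row), 3))
```

Here the made-up `item_4` loads weakly on both components, so little of its variance is captured by the retained solution.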
» Factor score
  ˃Save the regression scores as variables
  ˃Standardize the survey responses
  ˃For each subject, multiply each standardized
   survey response by the corresponding regression
   weight and add the results
» Factor-based score
  ˃Average the responses of the items in the
   component
  ˃Check for reverse codings and missing data.
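
The two scoring approaches can be sketched as below. All names, weights, and responses are hypothetical, a 1 to 5 response scale is assumed, and skipping missing items when averaging is one possible handling choice rather than a rule from the slides.

```python
from statistics import mean, stdev


def standardize(values):
    """Convert one item's raw responses to z-scores (sample standard deviation)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def factor_score(z_responses, weights):
    """Weighted sum of one subject's standardized responses."""
    return sum(z * w for z, w in zip(z_responses, weights))


def factor_based_score(responses, reverse_coded=(), scale_min=1, scale_max=5):
    """Mean of the answered items after flipping reverse-coded items.

    None marks a missing response; missing items are skipped here
    (an assumption, other handling may be more appropriate).
    """
    adjusted = []
    for index, value in enumerate(responses):
        if value is None:
            continue
        if index in reverse_coded:
            value = scale_min + scale_max - value  # e.g. 2 becomes 4 on a 1-5 scale
        adjusted.append(value)
    return mean(adjusted) if adjusted else None


# One subject's standardized responses and made-up regression weights.
print(factor_score([1.0, -0.5], [0.4, 0.2]))

# One subject's raw responses; the third item is reverse coded, the fourth is missing.
print(factor_based_score([4, 5, 2, None], reverse_coded={2}))
```

The factor-based score stays on the original response scale, which makes it easier to interpret than the weighted factor score.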
» Cronbach’s Alpha is used to measure the
  reliability or the internal consistency of the
  factors or components.
» The variables in a scale are all entered into the
  calculation to obtain the alpha score.
» A Cronbach’s alpha > 0.7 is considered to be
  sufficient for demonstrating internal
  consistency for most social science research,
  while values > 0.6 are marginally acceptable
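
Cronbach's alpha for a scale of k items is k / (k - 1) times one minus the ratio of the summed item variances to the variance of the total score. A minimal sketch with made-up responses, assuming complete cases and items already reverse coded where needed; `cronbach_alpha` is a hypothetical helper name.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items -- list of item columns; each column holds the responses of the
             same subjects in the same order (no missing values).
    """
    k = len(items)
    totals = [sum(subject) for subject in zip(*items)]  # each subject's scale total
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Made-up responses from five subjects to three items of one component.
item_1 = [4, 3, 5, 2, 4]
item_2 = [5, 3, 4, 2, 4]
item_3 = [4, 2, 5, 3, 5]

alpha = cronbach_alpha([item_1, item_2, item_3])
print(round(alpha, 3))
```

For these made-up responses alpha is about 0.89, above the 0.7 threshold.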

Galambos N Analysis Of Survey Results
