DATA SCREENING
Wei-Jiun, Shen Ph. D.
Anything that can
go wrong will go
wrong
Why do we need to
screen data?
Purpose
 Detect and correct data errors
 Detect and treat missing data
 Detect and handle insufficiently sampled variables
 Conduct transformations and standardizations
 Detect and handle outliers
First concern
 Accuracy of data file
 Descriptive statistics
 Graphic representations
 Honest correlations
 Missing data
 Pattern or amount
 Random or not
 Outliers
MISSING DATA
“blank” part in data set
Why is missing data a problem?
 Systematic problem
 Biased sampling
 Demographic variables
 Inappropriate measuring procedure
 Behavioral items
 Insufficient amount for analysis
 Small sample
 Misleading research results
 Biased data in, _______ out
Probability distribution of missingness
 Consider the probability of missingness
 Are certain groups more likely to have missing values?
 Are female respondents less likely to report age?
 Are certain responses more likely to be missing?
 Are respondents with high SPA less likely to report anxiety?
 Certain analysis methods assume a certain probability
distribution
Missing completely at random (MCAR)
 Missing data is independent of any other
measured variable (y2) and independent of the
variable itself (y1)
 E.g., SES = y2; depression = y1.
 If participants dropped out across the full range of SES levels, then missingness on depression would be independent of SES.
 Little’s MCAR test in Missing Value Analysis (MVA) indicates whether the data are MCAR (a nonsignificant result is wanted).
Missing at random (MAR)
 Missing data may be dependent on another
measured variable (y2), but is independent of the
variable itself (y1).
 E.g., SES = y2; depression = y1.
 If only participants with high SES dropped out, then missingness on depression would depend on SES.
 MAR can be inferred if Little’s test is significant but missingness is predictable from other variables (other than the variable itself), as tested by the Separate Variance Test. MNAR is indicated if this test reveals missingness related to the DV.
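The check described above can be sketched as a comparison of SES between cases that are missing on depression and cases that are not. This is a minimal illustration, assuming the standard separate-variance (Welch) t formula; the records and the helper name `welch_t` are hypothetical and not taken from the slides.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Separate-variance (Welch) t statistic and its approximate df."""
    na, nb = len(group_a), len(group_b)
    va = variance(group_a) / na   # squared standard error of mean A
    vb = variance(group_b) / nb   # squared standard error of mean B
    t = (mean(group_a) - mean(group_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical records: (SES, depression); None marks a missing depression score.
records = [(2, 14), (3, 12), (3, 15), (4, 11), (4, 13), (5, 10),
           (7, None), (8, None), (8, 9), (9, None), (9, None), (6, 12)]

ses_missing = [ses for ses, dep in records if dep is None]
ses_observed = [ses for ses, dep in records if dep is not None]

t, df = welch_t(ses_missing, ses_observed)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A large |t| suggests that missingness on depression depends on SES, which is inconsistent with MCAR. It cannot by itself separate MAR from MNAR, because the unobserved depression scores are not available to test.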
Treatment for missing data
 Deleting cases or variables
 Descriptive statistics
 Estimating missing data
 Using missing data correlation matrix
 Treating missing data as data
 Repeating analyses with and without missing data
 Choosing among methods for dealing with
missing data
 Pattern or amount
Deletion or preservation?
 Deletion
 <5%
 MCAR/MAR
 Preservation
 MNAR
 Small sample
 Replacement
 Mean (grand or group)
 Regression (predict missing value by other IVs)
 Expectation Maximization (form missing data r matrix by
assumed distribution)
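Mean replacement, the simplest of the options above, can be sketched as follows. The function name `impute_mean`, the scores, and the group labels are hypothetical; regression and Expectation Maximization are not shown.

```python
from statistics import mean

def impute_mean(values, groups=None):
    """Replace None with the grand mean, or with the group mean if groups are given."""
    if groups is None:
        grand = mean(v for v in values if v is not None)
        return [grand if v is None else v for v in values]
    group_means = {}
    for g in set(groups):
        observed = [v for v, gg in zip(values, groups) if gg == g and v is not None]
        group_means[g] = mean(observed)
    return [group_means[g] if v is None else v for v, g in zip(values, groups)]

scores = [10, 12, None, 20, None, 22]
group = ["a", "a", "a", "b", "b", "b"]

print(impute_mean(scores))         # both gaps filled with the grand mean
print(impute_mean(scores, group))  # each gap filled with its own group's mean
```

Either form of mean replacement reduces the variance of the variable, since every imputed case sits exactly at a mean; group means preserve between-group differences that the grand mean would flatten.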
OUTLIER
Cases with extreme value on variables
Why are outliers a problem?
 Systematic problem
 Biased sampling
 Wrong population
 Statistical problem
 ↑error variance
 ↓statistical power
 ↑ Type Ⅰ and Type Ⅱ error
 ↓normality
 Misleading research results
 Biased data in, _______ out
Influence of outlier
 Leverage × discrepancy
Treatment for outlier
 Detecting outliers
 Standardized score (|z| > 2, 2.5, or 3)
 Graphical methods (p-p, q-q plot)
 Mahalanobis distance (χ2 test)
 Deletion or transformation
 Critical to analysis or not
 Preservation
 Transformation
 Score alteration
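The two numeric detection methods above can be sketched for a two-variable case. This is an illustration under stated assumptions: the data are synthetic, the cutoffs |z| > 3 and p < .001 are choices made here, and the closed-form critical value -2 ln(alpha) holds only for the χ2 distribution with 2 degrees of freedom.

```python
import math
from statistics import mean, stdev

def z_scores(xs):
    """Standardized scores using the sample standard deviation."""
    m, s = mean(xs), stdev(xs)
    return [(x - m) / s for x in xs]

def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) point from the centroid."""
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2   # determinant of the 2x2 covariance matrix
    out = []
    for x, y in points:
        dx, dy = x - mx, y - my
        out.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return out

# Synthetic data: 30 cases close to the line y = x, plus one case that is
# unremarkable on x and on y separately but breaks the x-y pattern.
points = [(i, i + (0.5 if i % 2 == 0 else -0.5)) for i in range(1, 31)]
points.append((15, 30))

zx = z_scores([p[0] for p in points])
zy = z_scores([p[1] for p in points])
d2 = mahalanobis_sq_2d(points)

critical = -2 * math.log(0.001)   # chi-square critical value, df = 2, p < .001
univariate = [i for i in range(len(points)) if abs(zx[i]) > 3 or abs(zy[i]) > 3]
multivariate = [i for i, d in enumerate(d2) if d > critical]

print("univariate outliers (|z| > 3):", univariate)
print("multivariate outliers:", multivariate)
```

The added case is missed by the z-score screen on each variable but flagged by Mahalanobis distance, which is why the checklist treats single-variable and multiple-variable outliers separately.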
NORMALITY,
LINEARITY &
HOMOSCEDASTICITY
Basic assumption
Key assumptions in GLM
 Normality
 Linearity
 Homogeneity of variance
 Interval level data
 Independence of observations
Normality
 Normal distribution
Test for normality
 Skewness & Kurtosis
Test for normality
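 z-test for skewness & kurtosis scores
 Kolmogorov-Smirnov test & Shapiro-Wilk test
$$z = \frac{s - 0}{s_{s/k}}$$
where $s$ is the skewness or kurtosis statistic and $s_{s/k}$ is its standard error.
$$W = \frac{\left(\sum_{i=1}^{n} a_i x_{(i)}\right)^2}{\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}$$
where $x_{(i)}$ are the ordered sample values, $a_i$ are the Shapiro-Wilk weights, and $\bar{x}$ is the sample mean.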
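The skewness and kurtosis test can be sketched as below. This assumes moment-based estimates and the common large-sample standard errors sqrt(6/N) for skewness and sqrt(24/N) for kurtosis; statistical packages typically use small-sample formulas, so their values will differ somewhat. The function name and the scores are hypothetical.

```python
import math

def skew_kurtosis_z(xs):
    """Return (skewness, excess kurtosis, z for skewness, z for kurtosis)."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n   # second central moment
    m3 = sum((x - m) ** 3 for x in xs) / n   # third central moment
    m4 = sum((x - m) ** 4 for x in xs) / n   # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3                  # 0 for a normal distribution
    z_skew = (skew - 0) / math.sqrt(6 / n)   # assumed large-sample SE
    z_kurt = (kurt - 0) / math.sqrt(24 / n)  # assumed large-sample SE
    return skew, kurt, z_skew, z_kurt

scores = [1, 1, 1, 2, 2, 3, 4, 6, 9, 15]    # hypothetical right-skewed scores
skew, kurt, z_skew, z_kurt = skew_kurtosis_z(scores)
print(f"skewness = {skew:.2f} (z = {z_skew:.2f})")
print(f"kurtosis = {kurt:.2f} (z = {z_kurt:.2f})")
```

Each z is compared with a normal-distribution critical value; a positive skewness z points to a right tail, a negative one to a left tail.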
Test for normality
 Plotting cumulative distribution function
Test for normality
 P-P plot (probability) & Q-Q plot (quantile)
Linearity
 Straight-line relationship between 2 variables
Homoscedasticity
 Homogeneity of variance
 Homogeneity of variance-covariance matrix
Homoscedasticity
 Residual
COMMON DATA
TRANSFORMATIONS
Data transformations
Direction  Skewness               Treatment
+          Moderate               New X = SQRT(X)
+          Substantial            New X = LG10(X)
+          Substantial with zero  New X = LG10(X + C)
+          Severe                 New X = 1/X
+          L-shaped with zero     New X = 1/(X + C)
-          Moderate               New X = SQRT(K - X)
-          Substantial            New X = LG10(K - X)
-          J-shaped               New X = 1/(K - X)
C = a constant added to each score so that the smallest score is 1.
K = a constant from which each score is subtracted so that the smallest score is 1; usually equal to the largest score + 1.
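The transformations in the table can be sketched as one helper. The names `shift_constant`, `reflect_constant`, and `transform`, and the example scores, are hypothetical; C and K follow the definitions given with the table.

```python
import math

FUNCS = {
    "sqrt": math.sqrt,            # moderate skew
    "log10": math.log10,          # substantial skew
    "inverse": lambda v: 1 / v,   # severe, L-shaped, or J-shaped
}

def shift_constant(xs):
    """C such that the smallest value of X + C is 1."""
    return 1 - min(xs)

def reflect_constant(xs):
    """K such that the smallest value of K - X is 1, i.e. largest score + 1."""
    return max(xs) + 1

def transform(xs, kind, c=0, k=None):
    """Apply FUNCS[kind] to X + C, or to K - X when k is given (negative skew)."""
    f = FUNCS[kind]
    if k is not None:
        return [f(k - x) for x in xs]
    return [f(x + c) for x in xs]

# Substantial positive skew with a zero score: New X = LG10(X + C)
pos = [0, 1, 3, 8, 99]
pos_new = transform(pos, "log10", c=shift_constant(pos))

# Moderate negative skew: New X = SQRT(K - X)
neg = [1, 7, 9, 10, 10]
neg_new = transform(neg, "sqrt", k=reflect_constant(neg))

print(pos_new)
print(neg_new)
```

Reflecting with K - X reverses the order of the scores, so high values of the new variable correspond to low values of the original; interpretation of any resulting coefficient has to be flipped accordingly.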
PRACTICE
Check list
 Descriptive statistics
 Range
 Mean & SD
 Skewness & kurtosis
 Missing data (missing value analysis)
 Normal distribution
 Kolmogorov-Smirnov test (n>50)
 Shapiro-Wilk test (n<50)
 Skewness & kurtosis
 PP plot
 Outlier (single/multiple: z-score/Mahalanobis distance)
 Linearity
 Homoscedasticity
 Multicollinearity
Report
 Try