Overview of the statistical analysis
Jonas Ranstam, PhD,
National Musculoskeletal Competence Centre, Lund, Sweden
Explanations and points of reference

1.   Methodological background
2.   International guidelines
3.   Multiplicity issues
4.   Study population definitions
5.   Statistical models
1. Methodological background
Clinical research
Before 1948

Unclear validity, unknown statistical precision

- Prof A's patients better than Prof B's
- Small series of patients or even single cases
Streptomycin in Tuberculosis Trials Committee.
Streptomycin treatment of pulmonary tuberculosis.
BMJ 1948;2:769-83.


The Control Scheme

Determination of whether a patient would be treated by streptomycin
and bed-rest (S case) or by bed-rest alone (C case) was made by
reference to a statistical series based on random sampling numbers
drawn up for each sex at each centre by Professor Bradford Hill; the
details of the series were unknown to any of the investigators or to the
co-ordinator and were contained in a set of sealed envelopes, each
bearing on the outside only the name of the hospital and a number.
Clinical research
From 1948

Elimination/reduction of bias, assessment of
statistical precision

- Randomization and blinding (intervention studies)
- Effect modeling (observational studies)
- P-values and confidence intervals
Quantitative principles I

Randomized allocation of patients to treatment groups
(and blinding when possible) guarantees that:

1. All differences between treatment groups at
   baseline are random (not systematic).

   Complete absence of baseline imbalance is not
   the aim. Stratification on prognostic factors is
   used to make the groups less imbalanced.

2. Treatment effect estimates are unaffected by
   selection and confounding bias (and with
   blinding, differential misclassification bias).
Quantitative principles II

1. Individual effects vary between subjects.
   Different samples of subjects will yield
   different observed mean effects.

2. The subject variation can be estimated
   using the observations in a random sample.

3. A universal mean effect can be estimated,
   and the reliability of this estimate can be
   described with p-values and confidence
   intervals.
P-values are often misunderstood
They do

- describe the reliability of findings. P < 0.05 is usually
  considered reliable.

They do not

- describe clinical relevance (they depend on sample
  size).

- show that a difference “does not exist” (“n.s.” is
  absence of evidence, not evidence of absence).
2. International guidelines
ICMJE – the Vancouver group
Results

“Avoid relying solely on statistical hypothesis testing,
such as the use of P values, which fails to convey
important information about effect size.”

“When possible, quantify findings and present them
with appropriate indicators of measurement error or
uncertainty (such as confidence intervals).”
Example: FREE SF36-PCS

Estimated treatment effect difference at baseline

        Difference (95% CI)      p-value
        0.4 (-1.7 to 2.6)        0.7

Estimated treatment effect difference at 1 month

        Difference (95% CI)      p-value
        5.9 (3.7 to 8.2)         <0.0001
P-values vs. confidence intervals

A p-value allows 2 possible outcomes (p < 0.05 or n.s.); a confidence
interval, read against zero and the smallest clinically significant
effect, allows 5:

- Statistically and clinically significant effect          (p < 0.05)
- Statistically, but not necessarily clinically,
  significant effect                                       (p < 0.05)
- Inconclusive                                             (n.s.)
- Neither statistically nor clinically significant effect  (n.s.)
- Statistically significant reversed effect                (p < 0.05)

[Figure: the five confidence intervals plotted on an effect scale from
bad to good, with reference lines at 0 and at the clinically
significant effect]
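The same reading of a confidence interval can be written out explicitly. A minimal Python sketch, assuming a two-sided 95% interval for a treatment effect where positive values mean benefit and a prespecified smallest clinically significant effect; the function name and the mcid parameter are illustrative, not part of any trial protocol:

def classify_ci(lower, upper, mcid):
    # lower, upper: limits of a two-sided 95% CI for the treatment effect
    # mcid: smallest clinically significant effect (assumed > 0)
    if lower > 0:                      # CI excludes zero: statistically significant
        if lower >= mcid:
            return "statistically and clinically significant effect"
        return "statistically, but not necessarily clinically, significant effect"
    if upper < 0:                      # CI entirely below zero
        return "statistically significant reversed effect"
    if upper < mcid:                   # CI includes zero, excludes clinically relevant effects
        return "neither statistically nor clinically significant effect"
    return "inconclusive"              # CI includes zero and clinically relevant effects

# 1-month interval from the example above, with an arbitrary illustrative mcid:
print(classify_ci(3.7, 8.2, mcid=3.0))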
Clinical trials
International regulatory guidelines
ICH Topic E9 - Statistical Principles for Clinical Trials

EMEA Points to consider: - baseline covariates
                         - missing data
                         - multiplicity issues
                         - etc.

and similar documents from the FDA

These guidelines can all be found on the internet.
3. Multiplicity issues
Multiplicity
Multiplicity of inferences is present in almost all trials.
If not properly handled, unsubstantiated claims for
effectiveness may be made as a consequence of an
inflated rate of false positive conclusions.
Multiplicity
The chance of at least one
false positive finding (FPR) = 1 - (1 - α)^k

where k is the number of performed comparisons and
α the significance level (usually 0.05).

k = 1  => FPR = 0.05
k = 2  => FPR = 0.0975
k = 10 => FPR = 0.4013

Bonferroni method: divide the significance level by the
number of comparisons. This is bad for statistical
power and should be avoided.
Endpoints
Primary     The variable capable of providing the
            most clinically relevant evidence
            directly related to the primary objective
            of the trial

Secondary   Either measurements supporting the
            primary endpoint or effects related to
            secondary objectives
Statistical analyses
Confirmatory   The result concerns a primary endpoint
               and the p-value or confidence interval
               accounts for potential multiplicity.

               The result can support a claim of
               superiority, equivalence or non-
               inferiority.

Exploratory    All other analyses.

               The result is either supporting or
               explanatory, or simply a new
               hypothesis.
4. Study population definitions
Study populations
Intention-to-treat   Analyze all randomized subjects
(ITT) principle      according to the planned treatment
                     regimen.

Full analysis set    The set of subjects that is as close
(FAS)                as possible to the ideal implied by
                     the ITT-principle.

Per protocol         The set of subjects who complied
(PP) set             with the protocol sufficiently to ensure
                     that they are likely to exhibit the
                     effects of treatment according to the
                     underlying scientific model.
FAS vs. PP-set
FAS       + no selection bias
          - misclassification problem (effect dilution)

PP-set    + no contamination problem
          - possible selection bias (confounding)


When the FAS and PP-set lead to essentially the same
conclusions, confidence in the trial is supported.
5. Statistical models
Fixed and random effects
Fixed effects      when the levels of an effect
                   constitute the entire population
                   of interest.

Random effects     when the levels in your experiment
                   represent only a sample from that
                   population.


Random effects models can be used to analyze data with
multiple observations per patient.
Mixed effects model
If all the effects in a statistical model (ANOVA) are
considered random effects, then the model is called a
random effects model; likewise, a model with only
fixed effects is called a fixed effects model. When
some factors are fixed and others are random, the
model is called a mixed model.


(R.A. Fisher 1926: Type-1 and type-2 ANOVA)
Data from 3 subjects:
Messrs. Green, Blue and Red
[Plot: effect vs. time (baseline, 1st visit, 2nd visit) for each subject]
Analysis requirement: FAS

[Plot: effect vs. time (baseline, 1st visit, 2nd visit)]
1. Assume independence between subjects'
   repeated observations and use ANOVA
[Plot: effect vs. time (baseline, 1st visit, 2nd visit)]
1. Assume independence between subjects'
   repeated observations and use ANOVA
[Plot: effect vs. time (baseline, 1st visit, 2nd visit)]

Bad idea: Within-subject variation is confused with
between-subject variation. Statistical precision will be
incorrectly calculated.
2. Repeated fixed effects comparisons
   e.g. Student's t-tests
[Plot: effect vs. time (baseline, 1st visit, 2nd visit)]
2. Repeated fixed effects comparisons
   e.g. Student's t-tests (no FAS)
[Plot: effect vs. time (baseline, 1st visit, 2nd visit)]
3. Fixed effects RM-model

[Plot: effect vs. time (baseline, 1st visit, 2nd visit)]
3. Fixed effects RM-model
   (no FAS)
[Plot: effect vs. time (baseline, 1st visit, 2nd visit)]
4. Fixed effects RM-model with LOCF

[Plot: effect vs. time (baseline, 1st visit, 2nd visit)]
4. Fixed effects RM-model with LOCF

[Plot: effect vs. time (baseline, 1st visit, 2nd visit) with LOCF-imputed values]

LOCF imputation is not necessarily conservative, and it
underestimates variability.

Not the best alternative!
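For concreteness, LOCF simply carries each subject's last available measurement forward to later missed visits. A minimal pandas sketch of the mechanics; the column names and values are hypothetical, not data from the trial:

import pandas as pd

# Long format: one row per subject and visit; NaN marks a missed visit.
df = pd.DataFrame({
    "subject": ["green", "green", "green", "blue", "blue", "blue"],
    "visit":   ["baseline", "1st", "2nd", "baseline", "1st", "2nd"],
    "effect":  [10.0, 12.0, None, 9.0, None, None],
})

# Last observation carried forward within each subject
# (rows must be sorted by visit within subject).
df["effect_locf"] = df.groupby("subject")["effect"].ffill()
print(df)

The imputed series is flat after dropout, which is why LOCF understates variability and is not necessarily conservative.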
5. Mixed effects (subject random) ANOVA

[Plot: effect vs. time (baseline, 1st visit, 2nd visit)]
5. Mixed effects (subject random) ANOVA

[Plot: effect vs. time (baseline, 1st visit, 2nd visit)]

Within- and between-subject variation are separated in
the model. Statistical precision is correctly calculated.

A number of publications reporting Monte Carlo simulation
studies show that this is the best alternative, both in
terms of precision and validity!
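A minimal sketch of how such a subject-random mixed model could be fitted in Python with statsmodels; the file name and column names (subject, treatment, visit, effect) are placeholders, and the exact fixed-effects structure would follow the trial's analysis plan:

import pandas as pd
import statsmodels.formula.api as smf

# Long-format data: one row per subject and visit.
df = pd.read_csv("trial_long.csv")   # columns: subject, treatment, visit, effect

# Fixed effects for treatment, visit and their interaction;
# a random intercept per subject separates within- from between-subject variation.
model = smf.mixedlm("effect ~ treatment * visit", data=df, groups=df["subject"])
result = model.fit()
print(result.summary())

Because the model uses all available observations from every randomized subject, it fits the FAS requirement illustrated above.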
Example: FREE SF36-PCS

Estimated treatment effect difference at 1 month

Method              Difference   p-value

ITT-analysis
ME ANOVA            5.5          <0.0001

PP-analysis
FE ANOVA Compl.     5.2          <0.0001
FE ANOVA LOCF       4.9          <0.0001
Thank you for your attention!
