A Statistical Analysis on the Nutritional Intakes of Secondary School Children
An assessment of the impact of revised school food standards
Adverse outcomes of obesity include cardiovascular disease,
many cancers, type II diabetes, strokes, high blood pressure,
osteoarthritis, fertility problems, reduced life expectancy,
depression, anxiety and low self-esteem.
In 2002, 21.8% of boys and 27.5% of girls aged 2-15 years
were overweight or obese. Furthermore, the direct cost of
obesity to the NHS was estimated at £46-49 million per year.
In response to Jamie Oliver’s Feed Me Better campaign in
2005, the Department for Education and Skills revised the
national school food standards.
513 schoolchildren, surveyed at 2 time-points (2000 and 2009),
completed 'food diaries'. From these, nutritionists derived
each child's mean daily intake and mean lunchtime intake
for each nutrient (energy, protein, fat etc.).
Aim: Assess impact of standards
Variables that affect food/nutrient intake are:
• YEAR: 2000 or 2009: since changes were made to school
food regulations in this time period.
• LUNCH TYPE: School lunch (SL) or packed lunch (PL): since
regulations applied to school lunches only.
• SEX: Male or Female: since boys eat more than girls.
However, the difference between sexes does not depend
on the new standards and so this effect is not of interest.
The mean lunchtime energy intake in kcal:
      2000     2009     Difference: 2009-2000
SL    711.9    495.9    -216.0
PL    612.3    574.2    -38.2
Inference: Average energy intake decreased substantially for
school lunches, but only slightly for packed lunches.
Problem: The 4 groups do not contain equal proportions of boys
and girls. Since sex affects energy intake, the year/lunch
effects are confounded with the sex effect, which is not of
interest. Therefore the groups are not directly comparable.
Solution: Adjusted means.
Method:
1. Fit a linear regression model to the data:
$$Y_{ijk\ell} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \gamma_k + \epsilon_{ijk\ell},$$
where
• $Y_{ijk\ell}$ is the response for the $\ell$th subject, who was from the $i$th year, $j$th lunch type and $k$th sex;
• $\mu$ is the overall mean;
• $\alpha_i$ is the effect of the $i$th year;
• $\beta_j$ is the effect of the $j$th lunch type;
• $(\alpha\beta)_{ij}$ is the effect of the $(ij)$th combination of year and lunch type*;
• $\gamma_k$ is the effect of the $k$th sex (to be corrected for);
• $\epsilon_{ijk\ell}$ is the error of the $\ell$th individual.
* If the p-value for the interaction is significant, year affects intake differently for each lunch type, so a two-way table is needed to
present means. If however it is not significant, the interaction complicates the presentation without adding anything worthwhile.
Therefore, if the interaction is not significant, the model is re-fitted without it and one-way tables are used.
† The choice of the fixed value of Sex is arbitrary as it does not affect the differences, but one that produces plausible mean values is preferable, so that
practitioners without statistics backgrounds are not disconcerted.
2. Estimate regression coefficients & obtain equation for the fitted mean of each group:
$$Y_{ij} = 734.5 - 219.7\, I[\text{Year} = 2009] - 106.4\, I[\text{Lunch} = \text{PL}] - 39.9\, Sex_{ij} + 186.1\, I[\text{Year} = 2009 \ \&\ \text{Lunch} = \text{PL}],$$
where $I[A]$ is an indicator variable that equals 1 if the event A is true and 0 otherwise,
and $Sex_{ij}$ is the proportion of females in the group.
3. Fix the sex variable at a constant arbitrary† value, say the mean sex value of the sample:
$Sex = 0.5185$
4. Compute the mean for each group at this uniform sex value, instead of using $Sex_{ij}$:
2000 school lunch: $Y_{0,0} = 734.5 - 219.7 \times 0 - 106.4 \times 0 - 39.9\, Sex + 186.1 \times 0 \times 0 = 713.8$
2000 packed lunch: $Y_{0,1} = 734.5 - 219.7 \times 0 - 106.4 \times 1 - 39.9\, Sex + 186.1 \times 0 \times 1 = 607.5$
2009 school lunch: $Y_{1,0} = 734.5 - 219.7 \times 1 - 106.4 \times 0 - 39.9\, Sex + 186.1 \times 1 \times 0 = 494.1$
2009 packed lunch: $Y_{1,1} = 734.5 - 219.7 \times 1 - 106.4 \times 1 - 39.9\, Sex + 186.1 \times 1 \times 1 = 573.9$
5. The group means have now been adjusted for the sex imbalance, so they are comparable. Inference can be made
on the differences (estimable quantities): when one adjusted mean is subtracted from another, the Sex term
cancels, so the differences do not depend on the value chosen for Sex and are unique.
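The cancellation in step 5 can be checked numerically. The following is a minimal sketch, not the poster's own code: it assumes the rounded coefficients as they appear in the group calculations of step 4 (39.9 for sex, with the lunch indicator switched on for packed lunches), and the function names are illustrative. Because the coefficients are rounded, the adjusted means agree with the poster's values only to within about 0.1 kcal.

```python
# Rounded coefficients taken from the group calculations in step 4 (assumed).
COEF = {
    "intercept": 734.5,
    "year_2009": -219.7,
    "packed": -106.4,
    "sex": -39.9,
    "year_2009_packed": 186.1,
}


def adjusted_mean(year_2009, packed, sex_value):
    """Fitted mean for a year/lunch group at a fixed value of the sex variable."""
    return (
        COEF["intercept"]
        + COEF["year_2009"] * year_2009
        + COEF["packed"] * packed
        + COEF["sex"] * sex_value
        + COEF["year_2009_packed"] * year_2009 * packed
    )


def year_difference(packed, sex_value):
    """2009 minus 2000 adjusted mean for one lunch type."""
    return adjusted_mean(1, packed, sex_value) - adjusted_mean(0, packed, sex_value)


# The adjusted means move with the chosen sex value; the differences do not.
for sex_value in (0.0, 0.5, 0.5185, 1.0):
    print(
        f"sex={sex_value:<6} "
        f"SL 2000={adjusted_mean(0, 0, sex_value):6.1f} "
        f"SL diff={year_difference(0, sex_value):7.1f} "
        f"PL diff={year_difference(1, sex_value):6.1f}"
    )
```

Changing the fixed sex value from 0.5185 to 0.5 (the value the poster attributes to lsmeans) shifts every adjusted mean by the same amount, which is why the differences are unaffected.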
A package called lsmeans can be downloaded in R, allowing efficient calculation of adjusted group means, for
lunchtime and daily intakes of all nutrients. This package, by default, uses 0.5 for the arbitrary fixed value of 𝑆𝑒𝑥.
Diagnostic checks must be performed for each model, to check for homoscedasticity (constant variance, by residual
plots) and Normality (by Normal probability plots) of the estimated residuals.
For most models, the plots are satisfactory. However, the Normal probability plots for lunchtime and daily
vitamin C intake are concerning: their obvious curvature means that Normality cannot be assumed.
[Figure: Normal probability plots for lunchtime and daily vitamin C intake]
Problem: Significance tests and confidence intervals are invalidated.
Solution: Data transformation. A transformation must not change the order of values, but can alter
the distance between successive points to modify the overall shape of the distribution and achieve a
'bell curve'.
The Box-Cox power transformation (1964) is the most commonly used tool to remedy the
breakdown of the Normality assumption. For some positive data 𝑌1,…, 𝑌𝑛, it is given by
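$$Y_i^{(\lambda)} = \begin{cases} \dfrac{Y_i^{\lambda} - 1}{\lambda}, & \text{if } \lambda \neq 0, \\ \log Y_i, & \text{if } \lambda = 0, \end{cases}$$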
where the transformation parameter 𝜆 requires estimation.
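For data that include zero or negative values, there is a two-parameter version, which allows for a shift before
transformation, given by
$$Y_i^{(\lambda)} = \begin{cases} \dfrac{(Y_i + \lambda_2)^{\lambda_1} - 1}{\lambda_1}, & \text{if } \lambda_1 \neq 0, \\ \log(Y_i + \lambda_2), & \text{if } \lambda_1 = 0, \end{cases}$$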
where the transformation parameter 𝜆1 and the shift parameter 𝜆2 both require estimation.
The Box-Cox parameters are usually estimated by maximum likelihood and then rounded to
resemble a practical transformation (e.g. square root, cube root, inverse).
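The two versions of the transformation can be written as a single function. This is a small sketch for illustration only: the function name `box_cox` and its `shift` argument are assumptions, and it applies the transformation for a given parameter value rather than estimating the parameters by maximum likelihood.

```python
import math


def box_cox(y, lam, shift=0.0):
    """Box-Cox transform of one value; shift plays the role of lambda_2."""
    shifted = y + shift
    if shifted <= 0:
        raise ValueError("y + shift must be strictly positive")
    if lam == 0:
        return math.log(shifted)
    return (shifted ** lam - 1.0) / lam


values = [1.0, 4.0, 9.0, 16.0]
print([box_cox(v, 0.5) for v in values])  # rescaled square roots
print([box_cox(v, 0.0) for v in values])  # natural logs
print(box_cox(0.0, 0.0, shift=1.0))       # a zero needs a shift before logging
```

With `lam = 0.5` the result is a linear rescaling of the square root, so rounding an estimate such as 0.4022 to 0.5 amounts in practice to taking square roots.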
The lunchtime vitamin C data contain some zeroes, so are not strictly positive, and the two-parameter
version is used, giving $\lambda_1 = 0.4022$ and $\lambda_2 = 0.0015$ to 4 d.p. Rounding ($\lambda_1 \approx 0.5$, $\lambda_2 \approx 0$) gives a square-root
transformation (preceded by a shift of size zero). The daily vitamin C data are positive. The
standard version is therefore used, giving $\lambda = 0.1997$. Rounding ($\lambda \approx 0$) gives a log transformation.
Problem: Transformed data will typically be on a scale that is unfamiliar to practitioners.
Solution: Use the inverse transformation to back-transform the results, so that they are put on
the original scale and made accessible to practitioners.
Fitting the model to the square-rooted data gives an acceptable fit to Normality. The adjusted
means for the square-rooted data are calculated, then squared to convert them back to the
original scale:

                    2000            2009            Difference                     95% CI of difference
Square-root scale   $Y_{2000}$      $Y_{2009}$      $d = Y_{2009} - Y_{2000}$      $d \pm \text{s.e.}(d)$
Original scale      $Y_{2000}^2$    $Y_{2009}^2$    $Y_{2009}^2 - Y_{2000}^2$      not available

                    SL              PL              Difference                     95% CI of difference
Square-root scale   $Y_{SL}$        $Y_{PL}$        $d = Y_{PL} - Y_{SL}$          $d \pm \text{s.e.}(d)$
Original scale      $Y_{SL}^2$      $Y_{PL}^2$      $Y_{PL}^2 - Y_{SL}^2$          not available

Problem: The data on the original scale violate the Normality assumption, which is the reason a
transformation was sought in the first place. Moreover, the difference of the squared means is not a
function of $d$ alone, so the interval for $d$ cannot simply be squared. Confidence intervals for the
difference on the original scale therefore cannot be found. Valid conclusions can only be
drawn from data on the square-root scale, which makes the back-transformation redundant.
Solution: Use log-transformation.
A useful quality of the log-transformation is that an intuitive interpretation is possible upon
back-transformation. This is owing to the relationship between the geometric and arithmetic means of some
general data $Y_1, Y_2, \ldots, Y_n$:
$$GM(Y_i) = \left(\prod_{i=1}^{n} Y_i\right)^{1/n} = \exp\left(\frac{1}{n}\sum_{i=1}^{n} \log Y_i\right) = \exp\left(AM(\log Y_i)\right),$$
where $GM(\cdot)$ and $AM(\cdot)$ denote the geometric and arithmetic means respectively. Therefore,
$$AM(\log Y_i) = \log GM(Y_i).$$
Hence, the difference between two group (arithmetic) means (of logged data) is given by
$$\log GM(Y_{GROUP\,A}) - \log GM(Y_{GROUP\,B}) = \log\left(\frac{GM(Y_{GROUP\,A})}{GM(Y_{GROUP\,B})}\right).$$
Upon exponentiation, the 'difference' simply becomes the ratio of the geometric means. Because
exponentiation preserves order, the confidence interval of this ratio can be found directly by
anti-logging the end-points of the confidence interval of the difference; the resulting interval is generally
not symmetric about the ratio.
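The identity between the difference of log-scale means and the ratio of geometric means can be verified on any positive data. The sketch below uses two small made-up groups (the values are hypothetical and not taken from the study).

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    return math.prod(values) ** (1.0 / len(values))


# Hypothetical positive intakes for two groups (illustration only).
group_a = [12.0, 30.0, 45.0, 80.0]
group_b = [10.0, 25.0, 40.0, 60.0]

# Difference between the arithmetic means of the logged data ...
log_diff = arithmetic_mean([math.log(v) for v in group_a]) - arithmetic_mean(
    [math.log(v) for v in group_b]
)
# ... and the ratio of the geometric means of the raw data.
gm_ratio = geometric_mean(group_a) / geometric_mean(group_b)

print(math.exp(log_diff), gm_ratio)  # the two numbers agree
```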
Unlike the square-root transformation however, a log-transformation cannot be applied to the
zero observations. This is resolved by shifting the data before taking logs, but a sensible
shift constant must be determined.
The constant that minimizes the residual skewness is a logical choice, since Normality corresponds to
zero residual skewness. Minimal residual skewness is achieved with a shift of approximately 15. The
log-transformation can then be performed on the shifted intakes.
After log-transformation, there is still an acceptable fit to Normality. So although the Box-Cox method
indicated square-root, the log-transformation also manages to Normalize the data quite well.
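The search for a shift constant can be sketched as a simple grid search. This is a simplified illustration, not the poster's procedure: it minimizes the skewness of the log-transformed values themselves rather than of the residuals from the fitted regression model, the intakes are made up, and the function names and the candidate grid are assumptions.

```python
import math


def skewness(values):
    """Moment-based sample skewness (third central moment over variance^1.5)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def log_skewness(values, shift):
    """Skewness of log(value + shift)."""
    return skewness([math.log(v + shift) for v in values])


def best_log_shift(values, candidate_shifts):
    """Candidate shift giving skewness closest to zero after logging."""
    return min(candidate_shifts, key=lambda c: abs(log_skewness(values, c)))


# Hypothetical right-skewed intakes containing zeroes (illustration only).
intakes = [0.0, 0.0, 5.0, 12.0, 20.0, 28.0, 35.0, 41.0, 50.0, 64.0, 80.0, 110.0, 160.0]
candidates = [0.5 * k for k in range(1, 101)]  # 0.5, 1.0, ..., 50.0

shift = best_log_shift(intakes, candidates)
print(shift, log_skewness(intakes, shift))
```

To follow the poster's approach more closely, the skewness would be computed on the residuals of the regression model re-fitted at each candidate shift.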
Inference: Vitamin C intake in 2009 was 1.12 times that in 2000 (95% CI 1.02 to 1.24), and packed lunches
on average contained 1.07 times as much as school lunches (95% CI 0.97 to 1.18; this interval includes 1,
so the lunch-type difference is not significant at the 5% level).

                 2000     2009     Difference/Ratio        95% CI of difference/ratio
Log scale        3.638    3.756    2009 - 2000: 0.118      (0.023, 0.212)
Original scale   38.0     42.8     2009 / 2000: 1.12       (1.02, 1.24)

                 SL       PL       Difference/Ratio        95% CI of difference/ratio
Log scale        3.664    3.730    PL - SL: 0.0664         (-0.0299, 0.1626)
Original scale   39.0     41.7     PL / SL: 1.07           (0.97, 1.18)
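The original-scale rows can be checked by exponentiating the log-scale entries. The sketch below uses the rounded log-scale values printed in the tables, so small discrepancies in the last digit are expected (for example, exp(0.118) is slightly above the reported 1.12). The helper name `back_transform` is illustrative.

```python
import math


def back_transform(diff, ci_low, ci_high):
    """Turn a log-scale difference and its CI into a ratio and its CI."""
    return math.exp(diff), (math.exp(ci_low), math.exp(ci_high))


# Log-scale differences and 95% CIs as printed in the tables.
year_ratio, year_ci = back_transform(0.118, 0.023, 0.212)
lunch_ratio, lunch_ci = back_transform(0.0664, -0.0299, 0.1626)

print(f"2009 / 2000: {year_ratio:.3f} ({year_ci[0]:.3f}, {year_ci[1]:.3f})")
print(f"PL / SL:     {lunch_ratio:.3f} ({lunch_ci[0]:.3f}, {lunch_ci[1]:.3f})")

# A ratio CI that contains 1 corresponds to a log-scale CI that contains 0.
print("Lunch-type CI contains 1:", lunch_ci[0] < 1.0 < lunch_ci[1])
```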
Lunchtime intakes: Consumption of energy, sodium and saturated fat declined significantly in school lunch
children, but not in packed lunch children. Vitamin C intake increased between 2000 and 2009 (ratio 1.12),
and the change was the same in both lunch types.
Daily intakes: Daily consumption of all nutrients did not differ between school and packed lunch children.
Consumption of energy and sodium fell significantly, but there was no evidence to suggest the same for
saturated fat. Vitamin C intake increased.
Problem: Energy intake is a proxy for amount eaten. Energy intake decreased over the years, meaning that
children ate less in 2009 than in 2000. What if the decrease in sodium is simply due to the fact that
they ate less food overall?
Solution: Investigate sodium-density.
To investigate how heavily sodium depends on energy, energy is included as an explanatory variable:
$$Na = \alpha + \beta E + \epsilon,$$
where $Na$ = daily sodium intake, $E$ = daily energy intake, $\epsilon$ = error, with $\epsilon \sim N(0, \sigma^2)$, and $\alpha$ incorporates the
effects of all other covariates as well as the general mean.
This time, means are adjusted not only for sex, but also for energy intake:

                       2000      2009      Difference    95% CI of difference
Daily Na intake (mg)   2497.5    2323.8    -173.7        (-254.2, -93.2)

Hence, even at the same energy intake in each year, average daily sodium intake is estimated to have
decreased by about 174 mg (95% CI 93 to 254 mg). This is a relatively large amount, suggesting that a
substantial part of the Na reduction is not attributable to reduced energy intake. So there has been a
reduction in sodium-density.
Overall conclusions: standards have had a positive impact on school children's diets, particularly in terms of
energy, sodium and sodium density.
1. The childhood obesity crisis
2. Revised school food standards: a response to the rising obesity levels
3. Project objective: were the standards successful?
4. Factors affecting food intake
5. Simple analysis of lunchtime energy intake
6. Adjusted means
7. Lsmeans
8. Diagnostic checks
9. Box-Cox power transformation
10. Square-root transformation of lunchtime Vit C intake
11. Shifting the lunchtime Vit C intakes
12. Unique property of log transformation
13. Log transformation of shifted Vit C intake
14. Summary of results
