Sampling based approximation of confidence intervals for functions of genetic covariance matrices


Approximate lower bound sampling errors of maximum likelihood estimates of covariance components and their linear functions can be obtained from the inverse of the information matrix. For non-linear functions, sampling variances are commonly determined as the variance of their first order Taylor series expansions. This is used to obtain sampling errors for estimates of heritabilities and correlations, and these quantities can be computed with most software performing such analyses. In other instances, however, more complicated functions are of interest or the linear approximation is difficult or inadequate. A pragmatic alternative then is to evaluate sampling characteristics by repeated sampling of parameters from their asymptotic, multivariate normal distribution, calculating the function(s) of interest for each sample and inspecting the distribution across replicates. This paper demonstrates the use of this approach and examines the quality of approximation obtained.



  1. Sampling based approximation of confidence intervals for functions of genetic covariance matrices
     Karin Meyer (1), David Houle (2)
     (1) Animal Genetics and Breeding Unit, University of New England, Armidale NSW 2351
     (2) Department of Biological Science, Florida State University, Tallahassee, FL 32306-4295
     AAABG 2013
  2. Sampling standard errors | Introduction
     REML sampling variances
     REML estimates of covariance components
       - multivariate normal distribution: $\hat{\boldsymbol{\theta}} \sim N(\boldsymbol{\theta}, \mathbf{I}(\boldsymbol{\theta})^{-1})$
       - inverse of information matrix → sampling errors
       - large sample theory; asymptotic lower bounds
     Linear functions of estimates
       - sampling variances readily obtained
     Non-linear functions ("Delta method")
       - obtain 1st order Taylor series expansion
       - evaluate sampling variance of linear approximation
       - needs partial derivatives w.r.t. all variables
         → can be complicated / tedious
         → options for evaluating in REML software limited
     Confidence intervals: $\pm z_{\alpha}\,\mathrm{s.e.}$
       - misleading at boundary of parameter space?
  4. Sampling standard errors | Introduction
     Alternatives
     Dealing with boundary conditions
       - Derive confidence intervals from profile likelihood
       - Bayesian estimation
     General procedure
       - Sample data, repeat analysis → distribution over replicates
       - slow & laborious!
     Objectives
       1. Propose new scheme
          - sample from (theoretical) distribution of estimates
          - simple & fast
       2. Examine quality of approximation of sampling errors
  6. Sampling standard errors | Method
     Sampling scheme
     Large sample theory
       - (RE)ML estimates have MVN distribution
       - Sampling covariance ∝ inverse of information matrix
     Sample from this distribution: $\tilde{\boldsymbol{\theta}} \sim N(\hat{\boldsymbol{\theta}}, \mathbf{H}(\hat{\boldsymbol{\theta}})^{-1})$, where $\mathbf{H}(\hat{\boldsymbol{\theta}})$ is the information matrix
       - Use same parameterisation as REML analysis → eliminate linear approx., account for constraints
       - Evaluate function(s) of interest for $\tilde{\boldsymbol{\theta}}$
       - Examine distribution over replicates
     Mandel, M. (2013) Simulation-based confidence intervals for functions with complicated derivatives. American Statistician 67, 76–81.
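The sampling scheme on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `theta_hat`, `info` and `heritability` are hypothetical stand-ins for the REML estimates, the information matrix and a function of interest, and all numbers are invented.

```python
import numpy as np

def sample_function(theta_hat, info, g, n_rep=10000, seed=0):
    """Draw theta ~ N(theta_hat, info^-1) and evaluate g for each draw."""
    rng = np.random.default_rng(seed)
    cov = np.linalg.inv(info)  # large-sample covariance of the estimates
    draws = rng.multivariate_normal(theta_hat, cov, size=n_rep)
    return np.array([g(t) for t in draws])

def heritability(t):
    """Non-linear function of interest: variance ratio t[0] / (t[0] + t[1])."""
    return t[0] / (t[0] + t[1])

# Hypothetical two-parameter example (values invented for illustration)
theta_hat = np.array([40.0, 60.0])        # e.g. genetic and residual variance
info = np.array([[0.05, 0.01],
                 [0.01, 0.04]])           # assumed information matrix

values = sample_function(theta_hat, info, heritability)
print("mean:", values.mean(), "s.e.:", values.std(ddof=1))
```

The standard deviation of `values` across replicates plays the role of the approximate sampling standard error; no derivatives of the function are needed.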
  7. Sampling standard errors | Simulation
     Does it work?
     Simulate two data sets
       - 4000 animals, 6 traits
       - $h^2 = 2 \times (0.2, 0.3, 0.4)$
       - $\sigma^2_P = 100$
       - $r_E = 0.3$
       - a) $r_G = 0.5$, b) $r_G = |0.7|^{|i-j|}$
     REML analysis
       - AI algorithm
       - Cholesky factor
     Estimates
       - $\hat{\boldsymbol{\theta}}$
       - $\mathbf{H}(\hat{\boldsymbol{\theta}})$
     Compare estimates of sampling variances
       REML        Based on $\mathbf{H}(\hat{\boldsymbol{\theta}})$, "Delta" method
       Empirical   Re-sample data using estimates as population values, repeat analysis; 10000 replicates
       Approx.     Sample from MVN distribution, $N(\hat{\boldsymbol{\theta}}, \mathbf{H}(\hat{\boldsymbol{\theta}})^{-1})$; 200000 replicates
  8. Sampling standard errors | Results
     Sampling covariances for $\hat{\boldsymbol{\Sigma}}_G$, case a*
     [Figure: three scatter plots, axis ticks 0 to 15. Panels: Empirical vs. REML, Approximate vs. REML, Approximate vs. Empirical. Filled circles: variances; open circles: covariances.]
     6 traits, 21 (co)variance components, 231 sampling (co)variances
     *Case a: all genetic eigenvalues > 0
  9. Sampling standard errors | Results
     Sampling covariances for $\hat{\boldsymbol{\Sigma}}_G$, case b†
     [Figure: scatter plots for analyses fitting Rank 6 and Rank 5, axis ticks 0 to 15. Axes: Empirical and Approximate.]
     Approximation unreliable if model is over-parameterised
     †Case b: one genetic eigenvalue ≈ 0
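  10. Sampling standard errors | Results
      Delta method for $\hat{r}_{ij}$
      Estimate elements of Cholesky factor $\mathbf{L}$ of $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}'$
        - $\mathbf{H}(\hat{\boldsymbol{\theta}})^{-1}$ gives $\mathrm{Cov}(\hat{l}_{ij}, \hat{l}_{mn})$
        - covariances between $\sigma_{ij}$:

      $$\mathrm{Cov}(\hat{\sigma}_{ij}, \hat{\sigma}_{km}) \approx \sum_{t=1}^{f(i,j)} \sum_{s=1}^{f(k,m)} \Big[ \hat{l}_{jt}\hat{l}_{ms}\,\mathrm{Cov}(\hat{l}_{it}, \hat{l}_{ks}) + \hat{l}_{jt}\hat{l}_{ks}\,\mathrm{Cov}(\hat{l}_{it}, \hat{l}_{ms}) + \hat{l}_{it}\hat{l}_{ms}\,\mathrm{Cov}(\hat{l}_{jt}, \hat{l}_{ks}) + \hat{l}_{it}\hat{l}_{ks}\,\mathrm{Cov}(\hat{l}_{jt}, \hat{l}_{ms}) \Big]$$

      For $\hat{r}_{ij} = \hat{\sigma}_{ij} / \sqrt{\hat{\sigma}^2_i \hat{\sigma}^2_j}$:

      $$\mathrm{Var}(\hat{r}_{ij}) \approx \Big[ 4\hat{\sigma}^4_i \hat{\sigma}^4_j\,\mathrm{Var}(\hat{\sigma}_{ij}) + \hat{\sigma}^2_{ij} \hat{\sigma}^4_j\,\mathrm{Var}(\hat{\sigma}^2_i) + \hat{\sigma}^2_{ij} \hat{\sigma}^4_i\,\mathrm{Var}(\hat{\sigma}^2_j) - 4\hat{\sigma}_{ij} \hat{\sigma}^2_i \hat{\sigma}^4_j\,\mathrm{Cov}(\hat{\sigma}_{ij}, \hat{\sigma}^2_i) - 4\hat{\sigma}_{ij} \hat{\sigma}^4_i \hat{\sigma}^2_j\,\mathrm{Cov}(\hat{\sigma}_{ij}, \hat{\sigma}^2_j) + 2\hat{\sigma}^2_{ij} \hat{\sigma}^2_i \hat{\sigma}^2_j\,\mathrm{Cov}(\hat{\sigma}^2_i, \hat{\sigma}^2_j) \Big] \Big/ \big( 4\hat{\sigma}^6_i \hat{\sigma}^6_j \big)$$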
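The Delta-method variance of a correlation translates directly into code. The function below is a sketch only: its name and argument names are hypothetical, with `s_ij`, `s_ii`, `s_jj` standing for the covariance and the two variances, and the `v_*` and `c_*` arguments for their sampling variances and covariances. The input values in the example are invented.

```python
import math

def delta_var_corr(s_ij, s_ii, s_jj, v_ij, v_ii, v_jj, c_ij_ii, c_ij_jj, c_ii_jj):
    """First-order sampling variance of r = s_ij / sqrt(s_ii * s_jj).

    s_ii and s_jj are variances, so s_ii**2 is the fourth power of the
    standard deviation, and so on.
    """
    num = (4 * s_ii**2 * s_jj**2 * v_ij
           + s_ij**2 * s_jj**2 * v_ii
           + s_ij**2 * s_ii**2 * v_jj
           - 4 * s_ij * s_ii * s_jj**2 * c_ij_ii
           - 4 * s_ij * s_ii**2 * s_jj * c_ij_jj
           + 2 * s_ij**2 * s_ii * s_jj * c_ii_jj)
    return num / (4 * s_ii**3 * s_jj**3)

# Invented inputs, for illustration only
var_r = delta_var_corr(3.0, 4.0, 9.0, 0.5, 0.8, 1.2, 0.2, 0.3, 0.1)
print("approximate s.e. of r:", math.sqrt(var_r))
```

Even for this simple ratio, six sampling (co)variances and three partial derivatives are involved, which illustrates why the derivation becomes tedious for more complicated functions.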
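  11. Sampling standard errors | Results
      Approximation for $\hat{r}_{ij}$
      Let $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}'$ and $\boldsymbol{\theta} = \mathrm{vech}(\mathbf{L})$
      For many replicates
        - Sample $\tilde{\boldsymbol{\theta}} \sim N(\hat{\boldsymbol{\theta}}, \mathbf{H}(\hat{\boldsymbol{\theta}})^{-1})$
        - Construct $\tilde{\mathbf{L}}$ from $\tilde{\boldsymbol{\theta}}$
        - Calculate $\tilde{\boldsymbol{\Sigma}} = \tilde{\mathbf{L}}\tilde{\mathbf{L}}'$
        - Calculate correlation $\tilde{r}_{ij} = \tilde{\sigma}_{ij} / \sqrt{\tilde{\sigma}^2_i \tilde{\sigma}^2_j}$
      Evaluate $\mathrm{Var}(\hat{r}_{ij})$ as empirical variance of $\tilde{r}_{ij}$ across replicates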
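The steps on this slide can be sketched as follows. The helper names `vech_to_chol` and `sample_corr_var` are hypothetical, the Cholesky elements are invented, and `cov_theta` is an assumed stand-in for the inverse information matrix that a REML analysis would supply.

```python
import numpy as np

def vech_to_chol(theta, q):
    """Rebuild the lower-triangular q x q factor L from theta = vech(L)."""
    L = np.zeros((q, q))
    # triu_indices yields (row, col) of the upper triangle row by row;
    # swapping them gives the lower triangle column by column, as in vech.
    cols, rows = np.triu_indices(q)
    L[rows, cols] = theta
    return L

def sample_corr_var(theta_hat, cov_theta, q, i, j, n_rep=20000, seed=1):
    """Empirical variance of the sampled correlation between traits i and j."""
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(theta_hat, cov_theta, size=n_rep)
    r = np.empty(n_rep)
    for k, theta in enumerate(draws):
        L = vech_to_chol(theta, q)
        S = L @ L.T                          # sampled covariance matrix
        r[k] = S[i, j] / np.sqrt(S[i, i] * S[j, j])
    return r.var(ddof=1), r

# Invented two-trait example: vech(L) = (l11, l21, l22)
theta_hat = np.array([2.0, 1.5, 1.0])
cov_theta = 0.01 * np.eye(3)                 # assumed, not from a real analysis
var_r, r = sample_corr_var(theta_hat, cov_theta, q=2, i=0, j=1)
print("approximate s.e. of r:", np.sqrt(var_r))
```

Because each sampled matrix is formed as a product of a factor with its transpose, it is positive semi-definite by construction, so every sampled correlation stays within [-1, 1].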
  12. Sampling standard errors | Results
      Distribution of $\hat{r}_{G12}$, case b
      [Figure: two histograms, Empirical and Approximate; x-axis: Correlation, 0.5 to 1.0.]

                          REML     Empirical   Approxim.
      $\hat{r}_{G12}$     0.897    0.873       0.866
      s.e.                0.059    0.066       0.063
  13. Sampling standard errors | Results
      Distribution of second eigenvalue
      [Figure: two histograms, Empirical and Approximate; x-axis: Eigenvalue, 20 to 40.]

                            REML     Empirical   Approxim.
      $\hat{\lambda}_2$     32.93    33.25       33.84
      s.e.                  –        3.27        3.30
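Eigenvalues are a case where analytical derivatives are awkward but sampling is straightforward. The sketch below assumes a hypothetical 3-trait Cholesky factor `L_hat` and an invented covariance matrix `cov_theta` for its elements; neither comes from the analysis reported on the slide.

```python
import numpy as np

def sample_eigenvalues(L_hat, cov_theta, n_rep=5000, seed=2):
    """Sample Cholesky elements and return eigenvalues of each sampled L L'."""
    q = L_hat.shape[0]
    # Free elements of L in row-wise order; cov_theta must use the same order.
    rows, cols = np.tril_indices(q)
    theta_hat = L_hat[rows, cols]
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(theta_hat, cov_theta, size=n_rep)
    eig = np.empty((n_rep, q))
    for k, theta in enumerate(draws):
        L = np.zeros((q, q))
        L[rows, cols] = theta
        eig[k] = np.linalg.eigvalsh(L @ L.T)[::-1]   # descending order
    return eig

# Invented example values
L_hat = np.array([[6.0, 0.0, 0.0],
                  [2.0, 5.0, 0.0],
                  [1.0, 1.0, 3.0]])
cov_theta = 0.05 * np.eye(6)                 # assumed sampling covariance

eig = sample_eigenvalues(L_hat, cov_theta)
second = eig[:, 1]
print("second eigenvalue: mean", second.mean(), "s.e.", second.std(ddof=1))
```

A histogram of `second` would be the counterpart of the "Approximate" panel on the slide.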
  14. Sampling standard errors | Results | Conclusions
      Conclusions
      Sampling from MVN distribution
        - accommodates arbitrary functions
        - yields good approximation of sampling variances
        - easier than Delta method for complicated derivatives
        - more appropriate confidence interval at boundary of parameter space
        - but:
          → relies on large sample theory
          → information matrix needs to be safely p.d.
          → assumes $\hat{\boldsymbol{\theta}} \approx \boldsymbol{\theta}$
      Simple but useful addition to our toolkit
        - implemented in WOMBAT
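One way to read the point about confidence intervals near a boundary is to compare a symmetric interval with one taken from quantiles of the sampled values. The slides do not state how intervals are formed, so the percentile interval below is an assumption, and the sampled Cholesky elements are invented.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical sampled Cholesky elements for a two-trait case (values invented)
l21 = rng.normal(1.5, 0.15, size=50000)
l22 = rng.normal(0.5, 0.15, size=50000)
r = l21 / np.sqrt(l21**2 + l22**2)           # sampled correlations, |r| <= 1

est, se = r.mean(), r.std(ddof=1)
symmetric = (est - 1.96 * se, est + 1.96 * se)            # +/- z * s.e.
percentile = tuple(np.quantile(r, [0.025, 0.975]))        # from the samples

print("symmetric :", symmetric)
print("percentile:", percentile)
```

The percentile interval cannot extend beyond 1 because every sampled correlation lies within [-1, 1]; the symmetric interval carries no such guarantee.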
