Improving accuracy by using information from relatives—The animal model

Raphael Mrode, ILRI
Australian Africa Universities (AAUN) Workshop, Mauritius University,
Mauritius, 29-31 January 2018

Why improve accuracy
• Accuracy affects the rate of response to selection:
• see the breeder's equation
• More information and higher accuracies give a greater spread of evaluations,
• making it easier to identify the best candidates
• Accuracy can be used to compute a confidence interval (CI) on evaluations:
• $CI = EBV \pm t_{\alpha}\sqrt{PEV}$, with $PEV = (1 - R^2)\sigma_a^2$

Basic steps – Single records on each candidate
• Initial assumptions:
• All environmental effects are known
• Genetic and phenotypic parameters are known
exactly

Single records on each plant
• Assume a linear relationship between phenotype and genotype
• $\hat{a} = b(y - \text{pop})$
• b = regression of genotype on phenotype
• pop = the population mean
$b = \dfrac{cov(a, y)}{var(y)} = \dfrac{cov(a, a + e)}{var(y)} = \dfrac{\sigma_a^2}{\sigma_y^2} = h^2$

Accuracy of prediction
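$r_{a,y} = \dfrac{cov(a, y)}{\sigma_a \sigma_y} = \dfrac{\sigma_a^2}{\sigma_a \sigma_y} = h$
and reliability equals $h^2$.
The variance of the EBV, $var(\hat{a}_i)$, is
$var(\hat{a}_i) = var(by) = var(h^2 y) = h^4\sigma_y^2 = r_{a,y}^2 h^2 \sigma_y^2 = r_{a,y}^2 \sigma_a^2$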
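
The single-record formulas above can be checked numerically. The following is a minimal sketch, not part of the original slides: it borrows the heritability (0.43), phenotypic standard deviation (80) and calf record (900 g/day) from the worked example later in the deck, while the population mean of 850 and the quantile 1.96 are assumptions made only for illustration.

```python
import math

h2 = 0.43            # heritability (borrowed from the later worked example)
sigma_y = 80.0       # phenotypic standard deviation, g/day
y = 900.0            # single record on the candidate, g/day
pop_mean = 850.0     # HYPOTHETICAL population mean, not given in the slides
t_alpha = 1.96       # ASSUMED quantile for an approximate 95% interval

var_y = sigma_y ** 2           # phenotypic variance
var_a = h2 * var_y             # additive genetic variance

b = var_a / var_y              # regression of genotype on phenotype, equals h2
ebv = b * (y - pop_mean)       # predicted breeding value
accuracy = math.sqrt(h2)       # r_{a,y} = h
reliability = accuracy ** 2    # r^2 = h2
var_ebv = reliability * var_a  # var(a_hat) = r^2 * sigma_a^2 = h^4 * sigma_y^2
pev = (1.0 - reliability) * var_a
half_width = t_alpha * math.sqrt(pev)
ci = (ebv - half_width, ebv + half_width)

print(f"EBV = {ebv:.2f}, accuracy = {accuracy:.3f}, CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```

With a single record the interval is wide because reliability is only $h^2$; the selection index that follows narrows it by adding records on relatives.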

Combine information on candidate + relatives:
Selection index
The selection index is a method for estimating the breeding value of an animal by combining all information available on the candidate and its relatives:
$I = b_1(y_1 - \mu_1) + b_2(y_2 - \mu_2) + b_3(y_3 - \mu_3) = b'(y - \mu)$, with $b = P^{-1}Ga$

Combine information on individual + relatives:
Selection index
• $I_i = \hat{a}_i = b_1(y_1 - \mu_1) + b_2(y_2 - \mu_2) + b_3(y_3 - \mu_3)$
– $b_i$ are the weights on each measurement. They represent partial regression coefficients of the individual's breeding value on each record
– $I_i$ = selection index = estimate of genetic merit
• It is referred to as the best linear prediction (BLP) because:
– It minimises the average squared prediction error, that is, it minimises the average of all $(a_i - \hat{a}_i)^2$.
– It maximises the correlation ($r_{a,\hat{a}}$) between the true breeding value and the index (accuracy).

Selection Index
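The weights are obtained by solving the equations:
$b_1 p_{11} + b_2 p_{12} + \dots + b_m p_{1m} = g_{11}$
$b_1 p_{21} + b_2 p_{22} + \dots + b_m p_{2m} = g_{12}$
$\vdots$
$b_1 p_{m1} + b_2 p_{m2} + \dots + b_m p_{mm} = g_{1m}$
where
$p_{mm}$ = phenotypic variance for individual m
$g_{mm}$ = genetic variance for individual m
$p_{mn}$ and $g_{mn}$ are the phenotypic and genetic covariances, respectively, between individuals m and n.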

Selection index
In matrix form, the equations are:
$Pb = G$
$b = P^{-1}G$
$I = \hat{a} = (P^{-1}G)'(y - \mu) = b'(y - \mu)$
P = variance and covariance matrix for observations
G = covariance matrix between observations and the breeding value to be predicted
μ = estimates of environmental effects on records, assumed to be known without error. When these include management or fixed effects that have to be estimated, the use of the selection index is limited.
The accuracy of the index is
$r_{a,I} = \sqrt{\dfrac{\sum_{j=1}^{m} b_j g_{ij}}{\sigma_a^2}}$


Example : Individual + relatives
Suppose the average daily gains (ADG) for a bull calf ($y_1$), his sire ($y_2$) and his dam ($y_3$) are 900, 800 and 450 g/day, respectively.
Assuming all records were obtained in the same herd, a heritability of 0.43 and a phenotypic standard deviation of 80, predict the breeding value of the bull calf for ADG and its accuracy. Compare this accuracy with that obtained from using only the record of the calf.
From the parameters given:
$p_{11} = p_{22} = p_{33} = \sigma_y^2 = 80^2 = 6400$
$p_{12} = cov(y_1, y_2) = \tfrac{1}{2}\sigma_a^2 = \tfrac{1}{2}(2752) = 1376$
$p_{13} = p_{12} = 1376$
$p_{23} = 0$
$g_{11} = \sigma_a^2 = 0.43(6400) = 2752$
$g_{12} = g_{13} = \tfrac{1}{2}\sigma_a^2 = 1376$

$$\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} p_{11} & p_{12} & p_{13} \\ p_{21} & p_{22} & p_{23} \\ p_{31} & p_{32} & p_{33} \end{bmatrix}^{-1} \begin{bmatrix} g_{11} \\ g_{12} \\ g_{13} \end{bmatrix} = \begin{bmatrix} 6400 & 1376 & 1376 \\ 1376 & 6400 & 0 \\ 1376 & 0 & 6400 \end{bmatrix}^{-1} \begin{bmatrix} 2752 \\ 1376 \\ 1376 \end{bmatrix}$$

Solutions to the above equations are
$b_1 = 0.372$, $b_2 = 0.135$ and $b_3 = 0.135$
The index is
$I = 0.372(900 - \mu) + 0.135(800 - \mu) + 0.135(450 - \mu)$
where μ is the herd average.
The accuracy is
$r = \sqrt{[0.372(2752) + 0.135(1376) + 0.135(1376)] / 2752} = 0.712$
If only the record of the calf is used for prediction:
$r = \sqrt{0.43} = 0.656$
Using the parents' records increased accuracy by about 9%.
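
The example can be reproduced in a few lines. This is a sketch using only the Python standard library; the small `solve` helper (Gaussian elimination) is written here for illustration, and the herd average `mu = 700` is a hypothetical value because the slides leave μ unspecified.

```python
import math

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(map(float, A[i])) + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

h2, sigma_y = 0.43, 80.0
var_y = sigma_y ** 2            # 6400
var_a = h2 * var_y              # 2752

# Order of records: calf, sire, dam
P = [[var_y,     var_a / 2, var_a / 2],
     [var_a / 2, var_y,     0.0],
     [var_a / 2, 0.0,       var_y]]
G = [var_a, var_a / 2, var_a / 2]

b = solve(P, G)                                              # index weights
r_index = math.sqrt(sum(bj * gj for bj, gj in zip(b, G)) / var_a)
r_own = math.sqrt(h2)                                        # own record only

records = [900.0, 800.0, 450.0]
mu = 700.0                      # HYPOTHETICAL herd average, not given in the slides
index = sum(bj * (yj - mu) for bj, yj in zip(b, records))

print([round(v, 3) for v in b], round(r_index, 3), round(r_own, 3))
```

The printed weights and accuracies should agree with the values quoted above; the value of `index` depends entirely on the assumed herd average.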

Accounting for all relationships
• In the previous example we dealt with 3 animals, and it was easy to include the relationships among them.
• A matrix that describes these relationships is called the numerator relationship matrix (A).
• For the example, with 1 = sire, 2 = dam and 3 = calf, it is
$$A = \begin{bmatrix} 1 & 0 & 0.5 \\ 0 & 1 & 0.5 \\ 0.5 & 0.5 & 1 \end{bmatrix}$$
Next, we look at rules to compute A for a large population.

Numerator relationship matrix
• In animal breeding data, animals tend to be related, and the genetic relationships among them are needed for more accurate genetic evaluation.
• The genetic covariance among individuals comprises three components:
– the additive genetic variance
– the dominance variance and
– the epistatic variance.
• This lecture addresses only the additive genetic relationship.
• Use of the additive genetic relationship matrix usually increases the accuracy of evaluations and should help account for previous selection decisions if all pedigrees are utilised.

Numerator relationship matrix
• The numerator relationship matrix (A) describes the additive genetic relationship among individuals.
• Coancestry ($r_{xy}$) = probability that a gene drawn at random from x is identical by descent with a gene drawn at random from y.
• It equals the inbreeding coefficient of their progeny (z) if x and y were mated ($f_z = r_{xy}$), and $r_{zz} = (1 + f_z)/2$.
• The additive genetic relationship between animals x and y is twice their coancestry.
• The matrix A is symmetric and its
– diagonal element for animal i ($a_{ii}$) equals $1 + F_i$, where $F_i$ is the inbreeding coefficient of animal i
– off-diagonal element $a_{ij}$ equals the numerator of the coefficient of relationship between animals i and j.
• When multiplied by the additive genetic variance ($\sigma_u^2$), A gives the covariance among breeding values. Thus $var(u_i) = a_{ii}\sigma_u^2 = (1 + F_i)\sigma_u^2$.

Recursive method for computing A
• Henderson (1976) described a recursive method for calculating the matrix A.
• Animals in the pedigree are coded 1 to n and ordered such that parents precede their progeny.
If both parents (s and d) of animal i are known:
$a_{ji} = a_{ij} = 0.5(a_{js} + a_{jd})$; j = 1 to i-1
$a_{ii} = 1 + 0.5(a_{sd})$
If only one parent (s) is known and is assumed unrelated to the mate:
$a_{ji} = a_{ij} = 0.5(a_{js})$; j = 1 to i-1
$a_{ii} = 1$
If both parents are unknown and are assumed unrelated:
$a_{ji} = a_{ij} = 0$; j = 1 to i-1
$a_{ii} = 1$

Example pedigree
Calf   Sire   Dam
---------------------------
3      1      2
4      1      unknown
5      4      3
6      5      2
Animals 1 and 2 have unknown parents.
• $a_{11} = 1 + 0 = 1$
• $a_{12} = 0.5(0 + 0) = 0 = a_{21}$
• $a_{22} = 1 + 0 = 1$
• $a_{13} = 0.5(a_{11} + a_{12}) = 0.5(1.0 + 0) = 0.5 = a_{31}$
• $a_{23} = 0.5(a_{12} + a_{22}) = 0.5(0 + 1.0) = 0.5 = a_{32}$

Recursive method for computing A
       1      2      3      4      5      6
----------------------------------------------------
1    1.000  0.000  0.500  0.500  0.500  0.250
2    0.000  1.000  0.500  0.000  0.250  0.625
3    0.500  0.500  1.000  0.250  0.625  0.563
4    0.500  0.000  0.250  1.000  0.625  0.313
5    0.500  0.250  0.625  0.625  1.125  0.688
6    0.250  0.625  0.563  0.313  0.688  1.125
• $a_{66} = 1 + 0.5(a_{52}) = 1 + 0.5(0.25) = 1.125$
• From the above calculation, the inbreeding coefficient for calf 6 is 0.125.
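
The recursive rules translate directly into a short loop. The sketch below is illustrative rather than a reference implementation: the function name `relationship_matrix` and the convention of coding an unknown parent as 0 are assumptions made here, not something the slides prescribe.

```python
def relationship_matrix(pedigree):
    """Numerator relationship matrix by the recursive (tabular) method.

    pedigree: list of (sire, dam) for animals 1..n, with 0 meaning unknown.
    Animals must be ordered so that parents precede their progeny.
    """
    n = len(pedigree)
    # Row/column 0 stands for an unknown parent and stays zero throughout.
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i, (s, d) in enumerate(pedigree, start=1):
        for j in range(1, i):
            A[i][j] = A[j][i] = 0.5 * (A[j][s] + A[j][d])
        A[i][i] = 1.0 + 0.5 * A[s][d]
    return [row[1:] for row in A[1:]]   # drop the dummy row and column

# Example pedigree from the slides: animals 1 and 2 have unknown parents.
pedigree = [(0, 0), (0, 0), (1, 2), (1, 0), (4, 3), (5, 2)]
A = relationship_matrix(pedigree)

for row in A:
    print("  ".join(f"{v:6.3f}" for v in row))

inbreeding = [A[i][i] - 1.0 for i in range(len(A))]   # F_i = a_ii - 1
```

Using a dummy zero row for unknown parents lets the three cases in the rules (both, one or no parent known) collapse into the same two assignments.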

Rules for the inverse of A (ignoring inbreeding)
For the MME, it is the inverse of A that is needed.
Henderson (1976) used the decomposition $A^{-1} = (T^{-1})' D^{-1} T^{-1}$, with D diagonal, to develop rules for $A^{-1}$.
Diagonal elements of $D^{-1}$:
2 if both parents are known
4/3 if one parent is known
1 if no parent is known
Let $d_i$ = diagonal element of $D^{-1}$ for animal i. Then
$d_i = 4/(2 + \text{number of parents unknown})$

Rules for the inverse of A (ignoring inbreeding)
Start with $A^{-1} = 0$.
If both parents of the ith animal are known, add
$d_i$ to the (i,i) element
$-d_i/2$ to the (s,i), (i,s), (d,i) and (i,d) elements
$d_i/4$ to the (s,s), (s,d), (d,s) and (d,d) elements
If only one parent (s) of the ith animal is known, add
$d_i$ to the (i,i) element
$-d_i/2$ to the (s,i) and (i,s) elements
$d_i/4$ to the (s,s) element
If neither parent of the ith animal is known, add
$d_i$ to the (i,i) element

Example using the given pedigree
Calf   Sire      Dam
----------------------------
1      unknown   unknown
2      unknown   unknown
3      1         2
4      1         unknown
5      4         3
6      5         2
----------------------------

Set up a 6 by 6 table for the animals.
For animals 1 and 2, both parents are unknown, therefore $d_1 = d_2 = 1$. Add 1 to their diagonal elements, (1,1) and (2,2).
For animal 3, both parents are known, therefore $d_3 = 2$. Add 2 to the (3,3) element, -1 to the (3,1), (1,3), (3,2) and (2,3) elements and 1/2 to the (1,1), (1,2), (2,1) and (2,2) elements.
For animal 4, only one parent is known, therefore $d_4 = 4/3$. Add 4/3 to the (4,4) element, -2/3 to the (4,1) and (1,4) elements and 1/3 to the (1,1) element.

After the first four animals, the table is

      1            2        3     4      5    6
   |------------------------------------------------
1  |  1+1/2+1/3    1/2      -1    -2/3
2  |  1/2          1+1/2    -1
3  |  -1           -1       2
4  |  -2/3         0        0     4/3
5  |
6  |

After applying the relevant rules to animals 5 and 6, the inverse of A is

       1        2       3       4       5       6
   |------------------------------------------------
1  |  1.833    0.500  -1.000  -0.667   0.000   0.000
2  |  0.500    2.000  -1.000   0.000   0.500  -1.000
3  | -1.000   -1.000   2.500   0.500  -1.000   0.000
4  | -0.667    0.000   0.500   1.833  -1.000   0.000
5  |  0.000    0.500  -1.000  -1.000   2.500  -1.000
6  |  0.000   -1.000   0.000   0.000  -1.000   2.000

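
The rules for the inverse of A can be applied animal by animal without ever forming A itself. A minimal sketch follows, assuming (as in the slides) that inbreeding is ignored; the function name `a_inverse` and the coding of unknown parents as 0 are illustrative choices.

```python
def a_inverse(pedigree):
    """Inverse of A from Henderson's rules, ignoring inbreeding.

    pedigree: list of (sire, dam) for animals 1..n, with 0 meaning unknown.
    """
    n = len(pedigree)
    Ainv = [[0.0] * n for _ in range(n)]
    for i, (s, d) in enumerate(pedigree):
        parents = [p - 1 for p in (s, d) if p > 0]   # zero-based known parents
        n_unknown = 2 - len(parents)
        di = 4.0 / (2 + n_unknown)                   # 2, 4/3 or 1
        Ainv[i][i] += di
        for p in parents:
            Ainv[i][p] -= di / 2
            Ainv[p][i] -= di / 2
            for q in parents:
                Ainv[p][q] += di / 4                 # (s,s), (s,d), (d,s), (d,d)
    return Ainv

pedigree = [(0, 0), (0, 0), (1, 2), (1, 0), (4, 3), (5, 2)]
Ainv = a_inverse(pedigree)

for row in Ainv:
    print("  ".join(f"{v:7.3f}" for v in row))
```

Each animal contributes at most nine non-zero additions, which is why this construction scales to large pedigrees where inverting A directly would be impractical.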

BLUP-Individual Animal model
• We have seen the use of the selection index (best linear prediction) for including information from relatives.
• Disadvantages:
– Records have to be pre-adjusted for fixed and environmental effects. These are usually not known, which is a major issue if there is no prior data for new subclasses of fixed effects.
– $P^{-1}$ is needed, and P may be too large to invert with large data sets.
• The mixed linear model presents a better framework for the simultaneous estimation of fixed effects and prediction of genetic merit.
• Prediction of genetic merit for all individuals using data and pedigree information is called the individual animal model.

Linear mixed models
• Mixed linear model: the explanatory variables consist of both fixed effects and random effects.
• There is no general consensus on the classification of effects as either fixed or random.
• In general, factors are considered as fixed effects when
– the main focus is on the levels of the factor represented in the study, e.g. plots, lactation numbers, or perhaps farms
– inferences are therefore made with respect to the levels of the factor in the model

Linear mixed models
• Random effects are usually assumed to be drawn from a normal distribution, N(0, V), with mean 0 and where V is the variance of the effects.
• For genetic effects in a mixed linear model, this assumption is equivalent to assuming that traits are determined by additive alleles of infinitesimally small effects at infinitely many unlinked loci.

BLUP
• Henderson (1949) developed Best Linear Unbiased Prediction (BLUP), by which fixed effects and breeding values can be estimated simultaneously.
• Properties include:
• Best - it maximizes the correlation between the true (a) and predicted ($\hat{a}$) breeding value, or equivalently minimizes the prediction error variance, $var(a - \hat{a})$.
• Linear - predictors are linear functions of the observations.
• Unbiased - estimates of realized values of a random variable, such as animal breeding values, are unbiased ($E(\hat{a}) = E(a)$).
• Prediction - it involves prediction of the true breeding value, or of future performance in terms of what an individual will pass on to its progeny.

Mixed Model Equations (MME)
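• The MME to be solved to obtain solutions for fixed and random effects are:
$$\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + A^{-1}\alpha \end{bmatrix} \begin{bmatrix} \hat{b} \\ \hat{a} \end{bmatrix} = \begin{bmatrix} X'y \\ Z'y \end{bmatrix}$$
with $\alpha = \sigma_e^2/\sigma_a^2$ or $(1 - h^2)/h^2$.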


Individual Animal model
Calf   Pen    Sire   Dam       WWG (kg)
---------------------------------------------------
4      Pen1   1      unknown   4.5
5      Pen2   3      2         2.9
6      Pen2   1      2         3.9
7      Pen1   4      5         3.5
8      Pen1   3      6         5.0
---------------------------------------------------
Assume $\sigma_a^2 = 20$ and $\sigma_e^2 = 40$, so that $\alpha = \sigma_e^2/\sigma_a^2 = 40/20 = 2$.

Individual Animal model
• Aim: estimate the effects of pen and predict breeding values for all animals.
• The model to describe the observations is
• $y_{ij} = p_i + a_j + e_{ij}$
• where
• $y_{ij}$ = the WWG of the jth calf in the ith pen
• $p_i$ = the fixed effect of the ith pen
• $a_j$ = the random effect of the jth calf
• $e_{ij}$ = the random error effect.

Linear mixed model
• In matrix notation, a mixed linear model may be represented as
• $y = Xb + Za + e$
• where
• y = n x 1 vector of observations; n = number of records
• b = p x 1 vector of fixed effects; p = number of levels of fixed effects
• a = q x 1 vector of random animal effects; q = number of levels of random effects
• e = n x 1 vector of random residual effects
• X = design matrix of order n x p, which relates records to fixed effects
• Z = design matrix of order n x q, which relates records to random animal effects
• X and Z are both termed design or incidence matrices.

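Incidence matrices
$$y' = \begin{bmatrix} 4.5 & 2.9 & 3.9 & 3.5 & 5.0 \end{bmatrix}; \quad X' = \begin{bmatrix} 1 & 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 \end{bmatrix};$$
$$Z = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$
$$X'Z = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \end{bmatrix}$$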

Matrices for the MME
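• $Z'X$ = transpose of $X'Z$
$$X'X = diag(3, 2); \quad X'y = \begin{bmatrix} 13.0 \\ 6.8 \end{bmatrix}$$
$$Z'y = \text{transpose of } \begin{bmatrix} 0 & 0 & 0 & 4.5 & 2.9 & 3.9 & 3.5 & 5.0 \end{bmatrix}$$
$$Z'Z = diag(0, 0, 0, 1, 1, 1, 1, 1)$$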
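
These matrices follow mechanically from the data table. The sketch below builds X and Z from the pen and calf identifiers and forms the cross-products; the helper functions `transpose`, `matmul` and `matvec` are small illustrative utilities written for this example, not part of any particular library.

```python
# One entry per record, in the order of the data table (calves 4 to 8).
pens = [1, 2, 2, 1, 1]
animals = [4, 5, 6, 7, 8]
y = [4.5, 2.9, 3.9, 3.5, 5.0]
n_pens, n_animals = 2, 8       # animals 1 to 3 have no records of their own

# X relates each record to its pen, Z relates each record to its animal.
X = [[1 if pens[r] == p + 1 else 0 for p in range(n_pens)]
     for r in range(len(y))]
Z = [[1 if animals[r] == a + 1 else 0 for a in range(n_animals)]
     for r in range(len(y))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

XtX = matmul(transpose(X), X)
XtZ = matmul(transpose(X), Z)
ZtZ = matmul(transpose(Z), Z)
Xty = matvec(transpose(X), y)
Zty = matvec(transpose(Z), y)

print(XtX, XtZ, [round(v, 1) for v in Xty], sep="\n")
```

Because every record maps to exactly one pen and one animal, X'X and Z'Z are diagonal counts of records, which is why production software accumulates them directly instead of storing X and Z.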

Least Square Equations (LSE)
$$\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z \end{bmatrix} \begin{bmatrix} \hat{b} \\ \hat{a} \end{bmatrix} = \begin{bmatrix} X'y \\ Z'y \end{bmatrix}$$

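LSE
$$\begin{bmatrix}
3 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 \\
0 & 2 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} \hat{b}_1 \\ \hat{b}_2 \\ \hat{a}_1 \\ \hat{a}_2 \\ \hat{a}_3 \\ \hat{a}_4 \\ \hat{a}_5 \\ \hat{a}_6 \\ \hat{a}_7 \\ \hat{a}_8 \end{bmatrix}
=
\begin{bmatrix} 13.0 \\ 6.8 \\ 0 \\ 0 \\ 0 \\ 4.5 \\ 2.9 \\ 3.9 \\ 3.5 \\ 5.0 \end{bmatrix}$$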

The A-1 for the example data
• $A^{-1}\alpha$ is easily obtained by multiplying every element of $A^{-1}$ by 2, the value of α.
• The MME are obtained by adding $A^{-1}\alpha$ to $Z'Z$ in the LSE.
$$A^{-1} = \begin{bmatrix}
1.833 & 0.500 & 0.000 & -0.667 & 0.000 & -1.000 & 0.000 & 0.000 \\
0.500 & 2.000 & 0.500 & 0.000 & -1.000 & -1.000 & 0.000 & 0.000 \\
0.000 & 0.500 & 2.000 & 0.000 & -1.000 & 0.500 & 0.000 & -1.000 \\
-0.667 & 0.000 & 0.000 & 1.833 & 0.500 & 0.000 & -1.000 & 0.000 \\
0.000 & -1.000 & -1.000 & 0.500 & 2.500 & 0.000 & -1.000 & 0.000 \\
-1.000 & -1.000 & 0.500 & 0.000 & 0.000 & 2.500 & 0.000 & -1.000 \\
0.000 & 0.000 & 0.000 & -1.000 & -1.000 & 0.000 & 2.000 & 0.000 \\
0.000 & 0.000 & -1.000 & 0.000 & 0.000 & -1.000 & 0.000 & 2.000
\end{bmatrix}$$

Solutions to the MME
Effects Solutions
Pens
1 4.358
2 3.404
Animals
1 0.098
2 -0.019
3 -0.041
4 -0.009
5 -0.186
6 0.177
7 -0.249
8 0.183
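
The whole example can be assembled and solved in one short script. This is a sketch under the slides' assumptions (α = 2, inbreeding ignored), using only the standard library; `solve` and `a_inverse` are illustrative helpers, and real evaluations would use sparse or iterative solvers.

```python
def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(map(float, A[i])) + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def a_inverse(pedigree):
    """Henderson's rules for the inverse of A, ignoring inbreeding (0 = unknown)."""
    n = len(pedigree)
    Ainv = [[0.0] * n for _ in range(n)]
    for i, (s, d) in enumerate(pedigree):
        parents = [p - 1 for p in (s, d) if p > 0]
        di = 4.0 / (2 + (2 - len(parents)))
        Ainv[i][i] += di
        for p in parents:
            Ainv[i][p] -= di / 2
            Ainv[p][i] -= di / 2
            for q in parents:
                Ainv[p][q] += di / 4
    return Ainv

# (sire, dam) for animals 1..8 and (calf, pen, WWG) records from the data table
pedigree = [(0, 0), (0, 0), (0, 0), (1, 0), (3, 2), (1, 2), (4, 5), (3, 6)]
records = [(4, 1, 4.5), (5, 2, 2.9), (6, 2, 3.9), (7, 1, 3.5), (8, 1, 5.0)]
alpha = 40.0 / 20.0                      # sigma_e^2 / sigma_a^2
n_pens, n_animals = 2, len(pedigree)
size = n_pens + n_animals

C = [[0.0] * size for _ in range(size)]  # MME coefficient matrix
rhs = [0.0] * size
for calf, pen, wwg in records:
    p, a = pen - 1, n_pens + calf - 1
    C[p][p] += 1.0                       # X'X
    C[p][a] += 1.0                       # X'Z
    C[a][p] += 1.0                       # Z'X
    C[a][a] += 1.0                       # Z'Z
    rhs[p] += wwg                        # X'y
    rhs[a] += wwg                        # Z'y

Ainv = a_inverse(pedigree)
for i in range(n_animals):
    for j in range(n_animals):
        C[n_pens + i][n_pens + j] += alpha * Ainv[i][j]

sol = solve(C, rhs)
pen_effects, ebvs = sol[:n_pens], sol[n_pens:]
print([round(v, 3) for v in pen_effects])
print([round(v, 3) for v in ebvs])
```

Animals 1 to 3 have no records, yet they receive breeding values because $A^{-1}\alpha$ links them to their recorded progeny.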

Fixed effect solutions
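From the first row of the MME:
$(X'X)\hat{b} = X'y - (X'Z)\hat{a}$
$\hat{b} = (X'X)^{-1}X'(y - Z\hat{a})$
- The solution for calves in pen 1 is
$\hat{b}_1 = [(4.5 + 3.5 + 5.0) - (-0.009 + -0.249 + 0.183)]/3 = 4.358$
In general, for level i of the fixed effect:
$\hat{b}_i = \sum_j (y_{ij} - \hat{a}_{ij}) / diag_i$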

Understanding animal EBV
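From the second row of the MME:
$(Z'Z + A^{-1}\alpha)\hat{a} = Z'y - (Z'X)\hat{b}$
$(Z'Z + A^{-1}\alpha)\hat{a} = Z'(y - X\hat{b})$
$(Z'Z + A^{-1}\alpha)\hat{a} = (Z'Z)YD$
with $YD = (Z'Z)^{-1}Z'(y - X\hat{b})$
$(Z'Z + u_{ii}\alpha)\hat{a}_i = \alpha u_{ip}(\hat{a}_s + \hat{a}_d) + (Z'Z)YD + \alpha\sum_k u_{im}(\hat{a}_{anim} - 0.5\hat{a}_m)$
$(Z'Z + u_{ii}\alpha)\hat{a}_i = \alpha u_{par}(PA) + (Z'Z)YD + 0.5\alpha\sum_k u_{prog}(2\hat{a}_{anim} - \hat{a}_m)$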

EBV is made of contributions from:
Parent average (PA), the animal's own record (YD) and its progeny contribution (PC):
$EBV = n_1 PA + n_2 YD + n_3 PC$
where $n_1 + n_2 + n_3 = 1$
Numerator of:
$n_1$ = 2α, (4/3)α or 1α if both, one or no parents are known
$n_2$ = number of records
$n_3$ = ½α or ⅓α when the mate is known or not known
Denominator = sum of the numerators of $n_1$, $n_2$ and $n_3$
YD = record of the animal corrected for fixed effects

Animal 8 as an example:
$EBV_8 = n_1(PA) + n_2(y_8 - \hat{b}_1)$
where
$n_1 = 2\alpha/5 = 4/5$ and $n_2 = 1/5$
$EBV_8 = n_1((EBV_3 + EBV_6)/2) + n_2(5.0 - 4.358)$
$EBV_8 = n_1(0.068) + n_2(5.0 - 4.358) = 0.183$
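
The weights for animal 8 can be verified with a few lines of arithmetic. The sketch below takes the pen and animal solutions quoted in the solutions table as inputs; the variable names are illustrative.

```python
alpha = 2.0
n_records = 1

# Animal 8 has both parents known (sire 3, dam 6) and no progeny,
# so only the parent-average and own-record terms contribute.
num_pa = 2 * alpha            # numerator of n1 when both parents are known
num_yd = float(n_records)     # numerator of n2
denom = num_pa + num_yd

n1 = num_pa / denom           # weight on parent average
n2 = num_yd / denom           # weight on yield deviation

ebv3, ebv6 = -0.041, 0.177    # solutions for the parents, from the table above
b1 = 4.358                    # solution for pen 1
y8 = 5.0                      # record of animal 8

pa = (ebv3 + ebv6) / 2        # parent average
yd = y8 - b1                  # record corrected for the pen effect
ebv8 = n1 * pa + n2 * yd

print(f"n1 = {n1:.1f}, n2 = {n2:.1f}, PA = {pa:.3f}, YD = {yd:.3f}, EBV8 = {ebv8:.3f}")
```

With α = 2 the parent average carries four times the weight of the animal's single record, which reflects the low heritability implied by the assumed variances.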

Application to Crop situation
• Evaluation of different cultivars, with each cultivar of identical genotype evaluated in different plots.
– Each cultivar can be regarded as an 'individual'
– The mean of each cultivar could then be used as the response variable, and the standard error of the mean can be used as a weight

Accuracy and prediction error variance
• The accuracy (r) of predictions is the correlation between true and predicted breeding values.
• In dairy cattle evaluations, accuracy is usually expressed in terms of reliability, which is $r^2$.
• Calculation of r or $r^2$ requires the diagonal elements of the inverse of the MME coefficient matrix.
• If the coefficient matrix of the MME is represented as C:
$$C = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix} \quad \text{and a generalised inverse} \quad C^{-} = \begin{bmatrix} C^{11} & C^{12} \\ C^{21} & C^{22} \end{bmatrix}$$
• Prediction error variance (PEV) = $var(a - \hat{a}) = C^{22}\sigma_e^2$
• PEV can be regarded as the fraction of additive genetic variance not accounted for by the prediction. Therefore
• $PEV = C^{22}\sigma_e^2 = (1 - r^2)\sigma_a^2$, or $PEV = (1 - r^2)\sigma_s^2$ for a sire model
• with $r^2$ = squared correlation between the true and estimated breeding values

• Thus, for animal i:
• $d_i\sigma_e^2 = (1 - r^2)\sigma_a^2$
• where $d_i$ is the ith diagonal element of $C^{22}$
• $d_i\sigma_e^2/\sigma_a^2 = 1 - r^2$
• $r^2 = 1 - d_i\alpha$
• The standard error of prediction (SEP) is
• $SEP = \sqrt{var(a - \hat{a})} = \sqrt{d_i\sigma_e^2}$ for animal i
• Note that $r^2 = 1 - SEP^2/\sigma_a^2$. ASReml gives SEP and not $r^2$, so compute $r^2$ from SEP.
• Diagonal elements from the 3 by 3 block for the 3 sires in the example were 0.975, 0.970 and 1.045. The corresponding reliabilities for the 3 sires were therefore 0.55, 0.57 and 0.54, respectively.
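
As an illustration of these formulas, the sketch below inverts the MME coefficient matrix of the eight-animal pen example (not the sire example quoted above, whose data are not reproduced in the slides) and converts the animal diagonals into PEV, reliability and SEP. The helpers `invert` and `a_inverse` are illustrative and rely only on the standard library; direct inversion is feasible here only because the system is tiny.

```python
import math

def invert(M):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(M)
    W = [list(map(float, M[i])) + [1.0 if i == j else 0.0 for j in range(n)]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(W[r][col]))
        W[col], W[pivot] = W[pivot], W[col]
        pv = W[col][col]
        W[col] = [v / pv for v in W[col]]
        for r in range(n):
            if r != col:
                f = W[r][col]
                W[r] = [vr - f * vc for vr, vc in zip(W[r], W[col])]
    return [row[n:] for row in W]

def a_inverse(pedigree):
    """Henderson's rules for the inverse of A, ignoring inbreeding (0 = unknown)."""
    n = len(pedigree)
    Ainv = [[0.0] * n for _ in range(n)]
    for i, (s, d) in enumerate(pedigree):
        parents = [p - 1 for p in (s, d) if p > 0]
        di = 4.0 / (2 + (2 - len(parents)))
        Ainv[i][i] += di
        for p in parents:
            Ainv[i][p] -= di / 2
            Ainv[p][i] -= di / 2
            for q in parents:
                Ainv[p][q] += di / 4
    return Ainv

pedigree = [(0, 0), (0, 0), (0, 0), (1, 0), (3, 2), (1, 2), (4, 5), (3, 6)]
records = [(4, 1), (5, 2), (6, 2), (7, 1), (8, 1)]    # (calf, pen)
sigma2_a, sigma2_e = 20.0, 40.0
alpha = sigma2_e / sigma2_a
n_pens, n_animals = 2, len(pedigree)
size = n_pens + n_animals

C = [[0.0] * size for _ in range(size)]               # MME coefficient matrix
for calf, pen in records:
    p, a = pen - 1, n_pens + calf - 1
    for r, c in ((p, p), (p, a), (a, p), (a, a)):
        C[r][c] += 1.0
Ainv = a_inverse(pedigree)
for i in range(n_animals):
    for j in range(n_animals):
        C[n_pens + i][n_pens + j] += alpha * Ainv[i][j]

Cinv = invert(C)
results = []
for i in range(n_animals):
    d_i = Cinv[n_pens + i][n_pens + i]    # ith diagonal of the animal block C^22
    pev = d_i * sigma2_e                  # prediction error variance
    r2 = 1.0 - d_i * alpha                # reliability
    sep = math.sqrt(pev)                  # standard error of prediction
    results.append((i + 1, d_i, r2, sep))

for animal, d_i, r2, sep in results:
    print(f"animal {animal}: d = {d_i:.3f}, r2 = {r2:.3f}, SEP = {sep:.2f}")
```

The same conversion, $r^2 = 1 - SEP^2/\sigma_a^2$, applies when the SEP values come from software output rather than from an explicit inverse.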

Acknowledgements
I thank the organizers of the conference especially Prof.
Wallace Cowling for the invitation

This presentation is licensed for use under the Creative Commons Attribution 4.0 International Licence.
better lives through livestock
ilri.org
ILRI thanks all donors and organizations who globally supported its work through their contributions
to the CGIAR system
