FST & Some Selection Index
Genetic Epidemiology 2017
김진섭
GSPH, SNU
November 22, 2017

Fst, selection index


Genetic Epidemiology 2017 (same as previous version)


  1. FST & Some Selection Index. Genetic Epidemiology 2017. 김진섭, GSPH, SNU. November 22, 2017. Footer repeated on every slide: 김진섭 (GSPH, SNU), FST & Some Selection Index, November 22, 2017.
  2. Contents. (1) Fst: Wright’s F-statistics, Cockerham’s θ-statistics. (2) Selection Index: EHH, iHS, xp-EHH. (3) Practice.
  3. Fst: Wright’s F-statistics. Three types of heterozygosity[4]: individual, subpopulation, total population.
       1. $H_I = \frac{1}{n}\sum_{i=1}^{n} \hat{H}_i$
       2. $H_S = \frac{1}{n}\sum_{i=1}^{n} 2 p_i q_i$
       3. $H_T = 2\bar{p}\bar{q}$
     ($n$: number of subpopulations; $\hat{H}_i$: observed heterozygosity in the $i$th subpopulation; $2 p_i q_i$: expected heterozygosity in the $i$th subpopulation; $2\bar{p}\bar{q}$: expected heterozygosity of the total population.) The values are computed separately for each locus.
  4. Fst: Wright’s F-statistics[4].
       1. $F_{IS} = \frac{H_S - H_I}{H_S}$
       2. $F_{ST} = \frac{H_T - H_S}{H_T}$
       3. $F_{IT} = \frac{H_T - H_I}{H_T}$
     Example: $F_{ST} = 0$ → the subpopulation has no effect; there is no differentiation. $F_{ST} = 1$ → the subpopulations differ strongly. Simple relation: $1 - F_{IT} = (1 - F_{IS})(1 - F_{ST})$.
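The three ratios on slides 3 and 4 can be checked numerically. The sketch below is a minimal illustration for one biallelic locus, assuming the subpopulation allele frequencies and observed heterozygote frequencies are known exactly; the function name `f_statistics` and the input values are made up for the example.

```python
def f_statistics(p, h_obs):
    """Wright's F-statistics for one biallelic locus.

    p     : allele frequency p_i in each subpopulation (hypothetical inputs)
    h_obs : observed heterozygote frequency in each subpopulation
    """
    n = len(p)
    h_i = sum(h_obs) / n                          # H_I: mean observed heterozygosity
    h_s = sum(2 * pi * (1 - pi) for pi in p) / n  # H_S: mean of 2 p_i q_i
    p_bar = sum(p) / n                            # unweighted mean allele frequency
    h_t = 2 * p_bar * (1 - p_bar)                 # H_T = 2 p-bar q-bar
    f_is = (h_s - h_i) / h_s
    f_st = (h_t - h_s) / h_t
    f_it = (h_t - h_i) / h_t
    return f_is, f_st, f_it


f_is, f_st, f_it = f_statistics(p=[0.2, 0.8], h_obs=[0.30, 0.28])
print(f_is, f_st, f_it)
# The identity 1 - F_IT = (1 - F_IS)(1 - F_ST) holds up to rounding error.
print(abs((1 - f_it) - (1 - f_is) * (1 - f_st)) < 1e-12)
```

With these inputs $H_I = 0.29$, $H_S = 0.32$ and $H_T = 0.5$, so $F_{ST} = 0.36$.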
  5. Fst: Wright’s F-statistics. Figure. Source: http://academic.reed.edu/biology/professors/srenn/pages/research/2011_students/sean/SM_thesis.html
  6. Fst: Wright’s F-statistics. Figure. Source: http://www.johnderbyshire.com/Miscellaneous/Other/Fst.jpg
  7. Fst: Wright’s F-statistics. FST inference[5]: a convenient measure of genetic differentiation; the most widely used descriptive statistic in population and evolutionary genetics; can point to natural selection in a particular subpopulation.
  8. Fst: Wright’s F-statistics. Problem in estimation: $H_T = 2\bar{p}\bar{q}$.
       1. What if the sample size differs between subpopulations?
       2. Example: SASIA 1000 people, Oceania 100 people.
       3. Then $\bar{p}$ is not estimated properly.
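To make the sample-size problem concrete, the sketch below compares an unweighted mean of subpopulation allele frequencies with a sample-size-weighted mean. The sample sizes 1000 and 100 come from the slide; the allele frequencies 0.10 and 0.60 and the function name `mean_allele_freq` are assumptions for illustration. It only shows that the two choices of $\bar{p}$ give different $H_T$, not which one is correct.

```python
def mean_allele_freq(p, n_samples=None):
    """Mean allele frequency across subpopulations.

    Unweighted when n_samples is None, otherwise weighted by sample size.
    """
    if n_samples is None:
        return sum(p) / len(p)
    return sum(pi * ni for pi, ni in zip(p, n_samples)) / sum(n_samples)


p = [0.10, 0.60]   # hypothetical allele frequencies (not from the slides)
n = [1000, 100]    # sample sizes from the slide: SASIA 1000, Oceania 100

p_unweighted = mean_allele_freq(p)
p_weighted = mean_allele_freq(p, n)

h_t_unweighted = 2 * p_unweighted * (1 - p_unweighted)
h_t_weighted = 2 * p_weighted * (1 - p_weighted)
print(p_unweighted, p_weighted)
print(h_t_unweighted, h_t_weighted)
```

The larger sample dominates the weighted mean, so $H_T$ and therefore $F_{ST}$ depend on how the pooling is done.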
  9. Fst: Cockerham’s θ-statistics. ANOVA approach[1, 5]: $\theta = \frac{\sigma_P}{\sigma_T}$ ($\sigma_P$: variance due to population, $\sigma_T$: total variance).
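One way to see why a variance ratio and the heterozygosity ratio agree: for known allele frequencies, $H_T - H_S = 2\,\mathrm{Var}(p_i)$ and $H_T = 2\bar{p}\bar{q}$, so $(H_T - H_S)/H_T = \mathrm{Var}(p_i)/(\bar{p}\bar{q})$. The sketch below checks this on made-up frequencies. It assumes exact population frequencies and equal weights; it is not the sample estimator with sample-size corrections used in practice, and the function names are hypothetical.

```python
def theta_variance_ratio(p):
    """Between-population variance of p_i divided by total variance p-bar * q-bar."""
    n = len(p)
    p_bar = sum(p) / n
    var_between = sum((pi - p_bar) ** 2 for pi in p) / n
    var_total = p_bar * (1 - p_bar)
    return var_between / var_total


def fst_heterozygosity(p):
    """F_ST = (H_T - H_S) / H_T from the same allele frequencies."""
    n = len(p)
    p_bar = sum(p) / n
    h_s = sum(2 * pi * (1 - pi) for pi in p) / n
    h_t = 2 * p_bar * (1 - p_bar)
    return (h_t - h_s) / h_t


p = [0.1, 0.4, 0.7]  # hypothetical subpopulation allele frequencies
theta = theta_variance_ratio(p)
fst = fst_heterozygosity(p)
print(theta, fst)  # both are 0.25 up to rounding error
```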
  10. Fst: Cockerham’s θ-statistics. Wright’s $F_{ST}$ = Cockerham’s θ. In practice, most calculations use θ.
  11. Fst: Cockerham’s θ-statistics. θ inference. With more than 2 populations: a high value says that some population differs from the majority, but it does not say which population. Pairwise FST: computed from 2 populations only; a relative comparison.
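The difference between a global value and pairwise values can be sketched as follows. The population labels, allele frequencies and function names are hypothetical, and the simple heterozygosity-based $F_{ST}$ is used in place of a sample-corrected θ estimator.

```python
from itertools import combinations


def fst(p):
    """F_ST = (H_T - H_S) / H_T for known allele frequencies, equal weights."""
    n = len(p)
    p_bar = sum(p) / n
    h_s = sum(2 * pi * (1 - pi) for pi in p) / n
    h_t = 2 * p_bar * (1 - p_bar)
    return (h_t - h_s) / h_t


def pairwise_fst(freqs):
    """F_ST for every pair of populations; freqs maps label -> allele frequency."""
    return {(a, b): fst([freqs[a], freqs[b]]) for a, b in combinations(freqs, 2)}


freqs = {"A": 0.50, "B": 0.52, "C": 0.90}  # hypothetical; C is the outlier
global_fst = fst(list(freqs.values()))
pairs = pairwise_fst(freqs)

print(global_fst)
for pair, value in pairs.items():
    print(pair, round(value, 4))
```

The global value only signals that differentiation exists; the pairs involving C stand out in the pairwise table, which identifies the population that differs.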
  12. Fst: Cockerham’s θ-statistics. (No text on this slide.)
  13. Fst: Cockerham’s θ-statistics. Figure. FST calculated for each SNP between Tibetan and Han populations[6].
  14. Fst: Cockerham’s θ-statistics. Figure. Inter-population pairwise comparisons of FST statistics. Source: http://academic.reed.edu/biology/professors/srenn/pages/research/2011_students/sean/SM_thesis.html
  15. Selection Index. Contents. (1) Fst: Wright’s F-statistics, Cockerham’s θ-statistics. (2) Selection Index: EHH, iHS, xp-EHH. (3) Practice.
  16. Selection Index. Is a particular haplotype unusually common in a particular population? Example: Erik Corona’s slides, starting on the next slide.
  17-23. Selection Index. Population Genetics (diagram labels: Lactase + H2O, Glucose). Haplotype frequencies across the build steps; values in parentheses are the changes printed on the slides:
       Haplotype        Slides 17-18  Slides 19-20  Slide 21    Slide 22    Slide 23
       GATTACAGATTACA   22%           22%           21% (-1%)   21% (-1%)   20% (-2%)
       AATTACAGATTAAA   3%            3%            3%          3%          3%
       GACTACAGATTACC   19%           19%           19%         19%         19%
       GATTACCTATTAAC   24%           24%           24%         23% (-1%)   23% (-1%)
       AACTACAGATTACC   16%           16%           16%         15% (-1%)   15% (-1%)
       GATTACAGACTACA   7%            7%            7%          7%          6% (-1%)
       AATTACAGATTACA   9%            9%            8% (-1%)    7% (-2%)    5% (-4%)
       AATTGCAGATTACA   (absent)      <1%           2% (+2%)    5% (+5%)    9% (+9%)
  24. Selection Index: EHH. EHH (Extended Haplotype Homozygosity): Sabeti, Reich et al. (2002)[7]. The probability that two randomly chosen haplotypes are identical. 0 → all haplotypes are different. 1 → all haplotypes are the same. The haplotype of interest is called the core.
     $\mathrm{EHH}_t = \frac{\sum_{i=1}^{s} \binom{e_{ti}}{2}}{\binom{c_t}{2}}$
     ($t$: core haplotype; $c_t$: number of samples carrying core haplotype $t$; $e_{ti}$: number of samples carrying the $i$th extended haplotype of core $t$; $s$: number of unique extended haplotypes.)
  25. Selection Index: EHH. How can we detect positive selection? Over a 50 KB core region: AATTACAGATTACA, 50 people have this; GATTACAGATTACA, 50 people have this.
  26-29. Selection Index: EHH. Extended Haplotype Homozygosity (EHH). How can we detect positive selection? Extend the 50 KB core by 20 KB: 50 KB + 20 KB = 70 KB.
       Core haplotype   Extension  Count
       AATTACAGATTACA   AACACGC    10
       AATTACAGATTACA   ATGATAG    8
       AATTACAGATTACA   AACCCAG    7
       AATTACAGATTACA   CTGACAG    5
       AATTACAGATTACA   CAGACAG    3
       AATTACAGATTACA   AACACAG    6
       AATTACAGATTACA   CACACAG    4
       AATTACAGATTACA   CACCCAG    7
       GATTACAGATTACA   CACATAG    24
       GATTACAGATTACA   CACACAG    26
     EHH of the AATTACAGATTACA core, built up term by term over slides 28-29:
     $\mathrm{EHH} = \frac{\binom{10}{2}+\binom{8}{2}+\binom{7}{2}+\binom{5}{2}+\binom{3}{2}+\binom{6}{2}+\binom{4}{2}+\binom{7}{2}}{\binom{50}{2}} = \frac{149}{1225} \approx 0.121$ (printed on the slide as 0.121; 0.1216 to four decimals).
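The EHH arithmetic for this table can be reproduced in a few lines. This is a minimal sketch of the formula from slide 24 applied to the counts on the slide; the function name `ehh` is an assumption, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def ehh(extended_counts):
    """EHH for one core haplotype.

    extended_counts: number of samples carrying each distinct extended
    haplotype of the core (the e_ti); their sum is the core count c_t.
    """
    c_t = sum(extended_counts)
    return sum(comb(e, 2) for e in extended_counts) / comb(c_t, 2)


# Counts from the slide table.
ehh_a_core = ehh([10, 8, 7, 5, 3, 6, 4, 7])  # AATTACAGATTACA core: 149/1225
ehh_g_core = ehh([24, 26])                   # GATTACAGATTACA core: 601/1225
print(round(ehh_a_core, 4), round(ehh_g_core, 4))
```

The A core splits into eight extended haplotypes and scores about 0.12; the G core splits into only two and scores about 0.49.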
  30. Selection Index: EHH. EHH drops over genetic distance: it drops off quickly, starts at 1 and ends at 0, because every haplotype block will eventually be unique.
  31. Selection Index: EHH. EHH: what it is and what it isn’t. It detects over-representation of a haplotype, which raises P(two haplotypes are homozygous). It does NOT detect whether a haplotype spread quickly: low recombination != spread quickly. The slide repeats the 70 KB table (A core with eight extensions counted 10, 8, 7, 5, 3, 6, 4, 7; G core with two extensions counted 24, 26) next to a second table:
       Core haplotype   Extension  Count
       AATTACAGATTACA   AACACGC    22
       AATTACAGATTACA   ATGATAG    28
       GATTACAGATTACA   CACATAG    24
       GATTACAGATTACA   CACACAG    26
  32. Selection Index: EHH. Compare EHH scores on the 70 KB table (A core with extensions counted 10, 8, 7, 5, 3, 6, 4, 7; G core with extensions counted 24, 26). A core: EHH = 0.121. G core: $\mathrm{EHH} = \frac{\binom{24}{2}+\binom{26}{2}}{\binom{50}{2}} = \frac{601}{1225} \approx 0.490$. Slide labels: Low Recombination, Over Represented.
  33. Selection Index: EHH. Can EHH detect positive selection?
  34. Selection Index: EHH. Relative EHH. Detects over-representation of a haplotype and low recombination; both raise P(two haplotypes are homozygous). Does detect whether a haplotype spread quickly, because the other haplotype blocks act as controls. Agnostic to recombination cold-spots and hot-spots: the score is low if both alleles are associated with high or with low recombination. Table on the slide:
       Core haplotype   Extension  Count
       AATTACAGATTACA   AACACGC    22
       AATTACAGATTACA   ATGATAG    28
       GATTACAGATTACA   CACATAG    24
       GATTACAGATTACA   CACACAG    26
  35. Selection Index: EHH. Relative EHH on the 70 KB table (A core: EHH = 0.121; G core: EHH = 0.490): $\mathrm{REHH} = \frac{0.490}{0.121} = 4.05$.
  36. Selection Index: EHH. REHH: Problem #1. We get a different REHH value at different genetic distance cutoffs. At 50 KB (AATTACAGATTACA 50, GATTACAGATTACA 50): REHH = 1.0. At 70 KB (A core with extensions counted 10, 8, 7, 5, 3, 6, 4, 7; G core with extensions counted 24, 26): REHH = 4.05.
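The dependence on the cutoff can be reproduced from the counts on the slides. This sketch assumes REHH is the EHH of one core divided by the EHH of the other, as on slide 35; the function names are hypothetical.

```python
from math import comb


def ehh(extended_counts):
    """EHH: sum of C(e, 2) over extended haplotypes divided by C(c, 2)."""
    c_t = sum(extended_counts)
    return sum(comb(e, 2) for e in extended_counts) / comb(c_t, 2)


def rehh(core_counts, other_counts):
    """Relative EHH of one core haplotype against the other core."""
    return ehh(core_counts) / ehh(other_counts)


# 50 KB cutoff: each core is still a single haplotype carried by 50 people.
rehh_50kb = rehh([50], [50])

# 70 KB cutoff: G core has 2 extended haplotypes, A core has 8.
rehh_70kb = rehh([24, 26], [10, 8, 7, 5, 3, 6, 4, 7])

print(rehh_50kb, round(rehh_70kb, 2))
```

With unrounded EHH values the 70 KB ratio is 601/149, about 4.03; the slide's 4.05 comes from dividing the rounded values 0.490 and 0.121.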
  37-45. Selection Index: EHH. Which REHH value to use? The window can be extended to the right (slides 37-42) or to the left (slides 43-45). Every build step shows the 70 KB window with REHH = 4.05.
     Haplotypes on slide 37:
       AGTTACAGATTACAAACACGC
       AAATACAGATTACAATGATAG
       AATTACAGATTACAAACCCAG
       AATTTCAGATTACACTGACAG
       AATTAAAGATTACACAGACAG
       AATTACCGATTACAAACACAG
       AATTACAAATTACACACACAG
       AATTACAGGTTACACACCCAG
       GATTACAGATTACACACATAG
       GATTACAGATTACACACACAG
     Haplotypes on slides 38-45:
       …ACAGATTACAGTTACAGATTACAAACACGC…
       …ACAGATTACAAATACAGATTACAATGATAG…
       …ACAGATTACAATTACAGATTACAAACCCAG…
       …ACAGATTACAATTTCAGATTACACTGACAG…
       …ACAGATTACAATTAAAGATTACACAGACAG…
       …ACAGATTACAATTACCGATTACAAACACAG…
       …ACAGATTACAATTACAAATTACACACACAG…
       …ACAGATTACAATTACAGTTACACACCCAG…
       …TACAGATTAGATTACAGATTACACACATAG
       …TACAGATTAGATTACAGATTACACACACAG
  46. Selection Index: EHH. REHH: Problem #2. The REHH score is heavily biased by allele frequencies, so it must be normalized: P(REHH | allele frequency).
  47. Selection Index: EHH. REHH: Problem #3. It is not possible to detect selection in high-frequency alleles. The solution requires a cross-population approach (discussed later).
  48. Selection Index: EHH. REHH overview. Leaves a lot to be desired: picking the maximum is arbitrary (why not the mean REHH score?), and it is biased by allele frequency, with ln(REHH | allele frequency) approximately normally distributed. Still widely used and published with.
  49. Selection Index: EHH. Site-specific EHH[9]: roughly the average of the two alleles’ EHH values, weighted by the squared allele frequencies. It gives the approximate EHH level at the focal SNP.
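Read literally, the description above is a weighted mean. The sketch below implements only that approximation (weights equal to squared allele frequencies); it is not claimed to be the exact definition in [9], and the function name and the second set of inputs are hypothetical. The first call reuses the EHH values of the two cores from the 70 KB example (149/1225 and 601/1225, each core at frequency 0.5).

```python
def site_specific_ehh(ehh_by_allele, freq_by_allele):
    """Approximate site-specific EHH: mean of per-allele EHH values
    weighted by squared allele frequencies (assumption based on the slide)."""
    weights = [f * f for f in freq_by_allele]
    weighted_sum = sum(w * e for w, e in zip(weights, ehh_by_allele))
    return weighted_sum / sum(weights)


# Two cores from the 70 KB example, 50 samples each.
ehhs_equal = site_specific_ehh([149 / 1225, 601 / 1225], [0.5, 0.5])

# Hypothetical unequal frequencies: the common allele dominates the average.
ehhs_unequal = site_specific_ehh([0.2, 0.8], [0.9, 0.1])

print(round(ehhs_equal, 4), round(ehhs_unequal, 4))
```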
  50. Selection Index: iHS. iHS: Sabeti et al. (2007)[8]. Integrate EHH over all positions, then compare.
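  51. Selection Index: iHS. Integrated Haplotype Score (iHS). Unstandardized iHS = the log of the ratio of the two alleles’ integrated EHH, each integral $\int_{y}^{x} \mathrm{EHH}$ taken over distance from the core ($y$ = backward distance, $x$ = forward distance; $\mathrm{EHH}_D$ = EHH of the derived allele, $\mathrm{EHH}_A$ = EHH of the ancestral allele).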
  52. Selection Index: iHS. Integrated Haplotype Score (iHS), worked example on these haplotypes:
       …ACAGATTACAGTTACAGATTACAAACACGC…
       …ACAGATTACAAATACAGATTACAATGATAG…
       …ACAGATTACAATTACAGATTACAAACCCAG…
       …ACAGATTACAATTTCAGATTACACTGACAG…
       …ACAGATTACAATTAAAGATTACACAGACAG…
       …ACAGATTACAATTACCGATTACAAACACAG…
       …ACAGATTACAATTACAAATTACACACACAG…
       …ACAGATTACAATTACAGTTACAACACCCAG…
       …TACAGATTAGATTACAGATTACACACATAG
       …TACAGATTAGATTACAGATTACACACACAG
     Integrated EHH (backward + forward) printed on the slide: 0.7 + 0.5 = 1.2 and 4.0 + 4.4 = 8.4. Unstandardized iHS as printed: ln(8.4/3.2) = 0.419. The printed figures are not mutually consistent: the first sum is 1.2 while the ratio uses 3.2, and 0.419 is the base-10 logarithm of 8.4/3.2, whose natural logarithm is about 0.965.
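The integration step can be sketched with the trapezoid rule. The EHH curves, the distance grid and the function names below are made up for illustration; real implementations also choose integration limits and handle gaps between markers, which this sketch ignores.

```python
from math import log


def integrate_ehh(positions, ehh_values):
    """Area under an EHH curve by the trapezoid rule.

    positions  : distances from the core (negative = backward, positive = forward)
    ehh_values : EHH at each position
    """
    area = 0.0
    for k in range(1, len(positions)):
        width = positions[k] - positions[k - 1]
        area += 0.5 * width * (ehh_values[k] + ehh_values[k - 1])
    return area


def unstandardized_ihs(area_numerator, area_denominator):
    """Natural log of the ratio of two integrated EHH values."""
    return log(area_numerator / area_denominator)


positions = [-2, -1, 0, 1, 2]            # hypothetical distances from the core
ehh_slow = [0.6, 0.8, 1.0, 0.8, 0.6]     # allele whose EHH decays slowly
ehh_fast = [0.0, 0.2, 1.0, 0.2, 0.0]     # allele whose EHH decays quickly

area_slow = integrate_ehh(positions, ehh_slow)
area_fast = integrate_ehh(positions, ehh_fast)
ihs = unstandardized_ihs(area_slow, area_fast)
print(area_slow, area_fast, round(ihs, 3))
```

Swapping numerator and denominator flips the sign, and two identical curves give exactly 0.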
  53. Selection Index: iHS. iHS characteristics. When both alleles have the same area under the EHH curve (AUC), iHS is zero. Large negative values indicate selection of the allele in the denominator. Large positive values indicate selection of the allele in the numerator. Still heavily biased by allele frequency, hence Z-score normalization.
  54. Selection Index: iHS. Integrated Haplotype Score (iHS):
     $\mathrm{iHS} = \frac{\text{unstandardized iHS} - E(\mathrm{iHS} \mid \text{allele frequency})}{SD(\mathrm{iHS} \mid \text{allele frequency})}$
     $E(\mathrm{iHS} \mid \text{allele frequency})$ and $SD(\mathrm{iHS} \mid \text{allele frequency})$ are both estimated from the empirical distribution.
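A minimal sketch of frequency-conditional standardization: SNPs are grouped into allele-frequency bins, and each score is centered and scaled with the mean and standard deviation of its own bin. The number of bins, the use of the population standard deviation, the toy data and the function name are all assumptions, not details from the slides.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_by_frequency(scores, freqs, n_bins=20):
    """Z-score each unstandardized score within its allele-frequency bin."""
    def bin_of(f):
        return min(int(f * n_bins), n_bins - 1)

    by_bin = defaultdict(list)
    for s, f in zip(scores, freqs):
        by_bin[bin_of(f)].append(s)

    # Empirical mean and SD per bin.
    stats = {b: (mean(v), pstdev(v)) for b, v in by_bin.items()}

    out = []
    for s, f in zip(scores, freqs):
        m, sd = stats[bin_of(f)]
        out.append((s - m) / sd if sd > 0 else 0.0)
    return out


# Toy data: two frequency bins whose raw scores sit on very different scales.
raw = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
freq = [0.12, 0.13, 0.14, 0.62, 0.63, 0.64]
z = standardize_by_frequency(raw, freq)
print([round(v, 3) for v in z])
```

After standardization the two bins are on a common scale, so a score is extreme only relative to SNPs of similar frequency.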
  55. Selection Index: iHS. iHS overview. iHS and REHH are EHH-based methods to detect positive selection. iHS outperforms REHH at specific allele frequencies; neither completely outperforms the other.
  56. Selection Index: iHS. iHS: Problem #1. It still cannot detect selection in high-frequency (old) alleles, because relatively high EHH values are not present in high-frequency (old) alleles. Use a reference population: if positive selection did not take place in the reference population, EHH is high.
  57. Selection Index: xp-EHH. xp-EHH: Sabeti et al. (2007)[8]. Compare the integrated EHH of the same allele between populations.
  58. Selection Index: xp-EHH. Cross Population EHH (XP-EHH).
       Population   Core haplotype   Extension  Count
       first        AATTACAGATTACA   AACACGC    10
       first        AATTACAGATTACA   ATGATAG    8
       first        AATTACAGATTACA   AACCCAG    7
       first        AATTACAGATTACA   CTGACAG    5
       first        AATTACAGATTACA   CAGACAG    3
       first        AATTACAGATTACA   AACACAG    6
       first        AATTACAGATTACA   CACACAG    4
       first        AATTACAGATTACA   CACCCAG    7
       second       AATTACAGATTACA   CACATAG    20
       second       AATTACAGATTACA   CACACAG    30
     The second block is the same allele in a different population. Integrated EHH values printed on the slide: 0.5 and 3.3. XP-EHH = ln(3.3/0.5) = 1.89, followed by Z-score normalization. EHH is integrated over distance from the allele, for the forward and reverse sides independently, until EHH = 0.04 in each population.
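The two steps named on the slide, integrating EHH out to a cutoff and taking the log ratio between populations, can be sketched as follows. The function names and the toy EHH curve are assumptions; only the values 3.3, 0.5 and the 0.04 cutoff come from the slide, and the sketch integrates one side of one population at a time.

```python
from math import log


def integrate_until(distances, ehh_values, cutoff=0.04):
    """Trapezoid area under EHH moving away from the core,
    stopping at the first point where EHH falls below the cutoff."""
    area = 0.0
    for k in range(1, len(distances)):
        if ehh_values[k] < cutoff:
            break
        width = distances[k] - distances[k - 1]
        area += 0.5 * width * (ehh_values[k] + ehh_values[k - 1])
    return area


def xp_ehh(integrated_pop_a, integrated_pop_b):
    """Unstandardized XP-EHH: log ratio of integrated EHH in two populations."""
    return log(integrated_pop_a / integrated_pop_b)


# Toy one-sided EHH curve (hypothetical): the last point is below the cutoff.
toy_area = integrate_until([0, 1, 2, 3], [1.0, 0.5, 0.1, 0.01])

# Integrated values printed on the slide.
score = xp_ehh(3.3, 0.5)
print(toy_area, round(score, 2))
```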
  59. Selection Index: xp-EHH. Overview. REHH and iHS are more or less complementary: each is better at detecting positive selection at different allele frequencies. XP-EHH can detect positive selection in high-frequency alleles, but it is susceptible to differences in recombination rate between populations.
  60. Selection Index: xp-EHH. Final verdict: REHH vs iHS vs XP-EHH. (Figure with panels labeled REHH, iHS test and XP-EHH.)
  61. Selection Index: xp-EHH. Rsb[9]: another index for comparing populations. The comparison is made per population only, not per allele. For each locus, the two alleles’ integrated EHH values are averaged into iES. The approximate strength of selection at a locus is then compared between populations.
  62. Practice. Contents. (1) Fst: Wright’s F-statistics, Cockerham’s θ-statistics. (2) Selection Index: EHH, iHS, xp-EHH. (3) Practice.
  63. Practice. FST: hierfstat[3], applied to the PER3 gene in HGDP (Human Genome Diversity Panel): 289 SNPs and 7 populations. EHH, iHS: rehh[2], using the examples that ship with the package.
  64. Practice. References.
       [1] Cockerham, C. C. (1969). Variance of gene frequencies. Evolution, pages 72–84.
       [2] Gautier, M. and Vitalis, R. (2012). rehh: an R package to detect footprints of selection in genome-wide SNP data from haplotype structure. Bioinformatics, 28(8):1176–1177.
       [3] Goudet, J. (2005). Hierfstat, a package for R to compute and test hierarchical F-statistics. Molecular Ecology Notes, 5(1):184–186.
       [4] Hamilton, M. (2011). Population genetics. John Wiley & Sons.
       [5] Holsinger, K. E. and Weir, B. S. (2009). Genetics in geographically structured populations: defining, estimating and interpreting FST. Nature Reviews Genetics, 10(9):639–650.
       [6] Huerta-Sánchez, E., Jin, X., Bianba, Z., Peter, B. M., Vinckenbosch, N., Liang, Y., Yi, X., He, M., Somel, M., Ni, P., et al. (2014). Altitude adaptation in Tibetans caused by introgression of Denisovan-like DNA. Nature, 512(7513):194–197.
       [7] Sabeti, P. C., Reich, D. E., Higgins, J. M., Levine, H. Z., Richter, D. J., Schaffner, S. F., Gabriel, S. B., Platko, J. V., Patterson, N. J., McDonald, G. J., et al. (2002). Detecting recent positive selection in the human genome from haplotype structure. Nature, 419(6909):832–837.
       [8] Sabeti, P. C., Varilly, P., Fry, B., Lohmueller, J., Hostetter, E., Cotsapas, C., Xie, X., Byrne, E. H., McCarroll, S. A., Gaudet, R., et al. (2007). Genome-wide detection and characterization of positive selection in human populations. Nature, 449(7164):913–918.
       [9] Tang, K., Thornton, K. R., and Stoneking, M. (2007). A new approach for using genome scans to detect recent positive selection in the human genome. PLoS Biology, 5(7):e171.
  65. Practice. END. Email: secondmath85@gmail.com. Office: (02)880-2743. H.P: 010-9192-5385.
