EVOLUTIONARY FORCES
SHAPING HUMAN
GENETIC VARIATION
Ryan D. Hernandez
ryan.hernandez@ucsf.edu twitter: @rdhernand
http://www.finca.org
[Map: "Human Migration" / "Human colonization of the world". Generalized routes numbered 1-6 across Africa, Asia, Australia, Europe, North America, and South America, labeled with migration dates (200,000; 70,000-50,000; 50,000; 40,000; 40,000-30,000; 20,000-15,000; 15,000-12,000 years ago).
Fossil or artifact sites: Omo Kibish (oldest modern human, 195,000 years ago); Klasies River Mouth (120,000); Qafzeh (100,000); Malakunanja (50,000); Lake Mungo (45,000); Niah Cave (40,000); Pestera cu Oase (35,000); Yana River (30,000); Meadowcroft (19,000-12,000); Minatogawa (18,000); Monte Verde (14,800); Clovis (13,500); Zhoukoudian (Shandingdong) (11,000); Kennewick (9,500); Spirit Cave (9,500-9,400).
Sources: Susan Anton, New York University; Alison Brooks, George Washington University; Peter Forster, University of Cambridge; James F. O'Connell, University of Utah; Stephen Oppenheimer, Oxford University; Spencer Wells, National Geographic Society; Ofer Bar-Yosef, Harvard University. NGM Maps, © 2006 National Geographic Society. All rights reserved. http://ngm.nationalgeographic.com]
Putatively neutral diversity levels
The Effect of “Positive Selection”
Adaptive

Neutral

Nearly Neutral

Mildly Deleterious

Fairly Deleterious

Strongly Deleterious
“Selective Sweep”
• Repeated fixation of functional mutations in coding regions over
evolutionary timescales can lead to a disproportional number of
amino acid substitutions relative to observed polymorphisms.
• This can be summarized by a 2x2 table and analyzed using the
McDonald-Kreitman test:
              Non-Syn   Syn
Fixed         F_N       F_S
Polymorphic   P_N       P_S
1000 Genomes Project Data
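The 2x2 table above can be turned into numbers with a few lines of code. The following is a minimal sketch, not the analysis used in the talk: the function name `mk_test`, its argument names, and the example counts are hypothetical, and the neutrality index, the estimate alpha = 1 - NI, and the two-sided Fisher exact test are standard textbook choices assumed here for illustration.

```python
from math import comb


def mk_test(fn, fs, pn, ps):
    """Summarize a McDonald-Kreitman 2x2 table for one gene.

    fn, fs: fixed non-synonymous / synonymous differences
    pn, ps: polymorphic non-synonymous / synonymous sites
    Returns (neutrality_index, alpha, fisher_p_value).
    """
    # Neutrality index (Pn/Ps) / (Fn/Fs); values below 1 indicate an excess
    # of fixed amino acid differences relative to polymorphism.
    ni = (pn * fs) / (ps * fn) if ps * fn > 0 else float("nan")
    # Commonly used estimate of the fraction of adaptive substitutions.
    alpha = 1 - ni

    # Two-sided Fisher exact test: sum the hypergeometric probabilities of
    # all tables (same margins) that are no more likely than the observed one.
    n = fn + fs + pn + ps
    row1 = fn + fs          # total fixed differences
    col1 = fn + pn          # total non-synonymous changes

    def prob(a):
        return comb(col1, a) * comb(n - col1, row1 - a) / comb(n, row1)

    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    p_obs = prob(fn)
    p = sum(prob(a) for a in range(lo, hi + 1) if prob(a) <= p_obs * (1 + 1e-9))
    return ni, alpha, min(p, 1.0)


# Hypothetical counts, chosen only to illustrate the call.
ni, alpha, p = mk_test(fn=20, fs=10, pn=5, ps=25)
print(f"NI={ni:.2f} alpha={alpha:.2f} p={p:.2g}")
```

With so few counts per gene the p-value is often unstable, which is the noise problem that motivates pooling information across genes.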
SnIPRE: an improvement to MKT
Since few SNPs and substitutions are usually observed per gene, the MKT can be noisy. Pooling observations across the genome using a mixed-effects model vastly increases power.
Eilertson et al. (2012)
SnIPREASR in 1000 Genomes Project
Human-chimp divergence
Pos Sel   Conserved
410       8027
• Conserved genes are either neutral or under
purifying selection.
SnIPREASR: an improvement to SnIPRE
• Alignments are generated using MOSAIC, a program we developed that rigorously integrates putative orthologs from an arbitrary number of sources.
• Using PAML, we perform AIC-based model selection to infer the substitutions along the human lineage since our divergence from chimp.
pythonhosted.org/bio-MOSAIC/
Maher & Hernandez (arXiv)
[Tree diagram: Human, Chimp, Orang, Gorilla, …]
Cyrus Maher
SnIPREASR works well for positive selection
• Simulations: human-specific substitutions; Gutenkunst et al. demographic model.
• 𝛾 is the population-scaled selection coefficient.
• SnIPREASR is best-powered to estimate values of 𝛾 > 0.
[Tree diagram: Human, Chimp, Orang, Gorilla, …]
ASR removes genes positively selected in chimp
                            Human-chimp divergence
                            Pos Sel   Conserved   Total
Human only    Pos Sel         343         0        343
(ASR)         Conserved        67      8027       8094
              Total           410      8027
• Conserved genes are either neutral or under
purifying selection.
• 67/410 (16%) of genes identified as positively
selected when comparing human-chimp are
conserved along the human lineage.
Positively selected genes dominated
by smell & response to pathogens
GOrilla
[Diagram: neutral diversity levels around an amino acid substitution, across n substitutions. One annotation reads "Reflects the fraction of amino acid substitutions that are adaptive"; the other reads "Reflects the typical strength of selection".]
The footprint of adaptive amino acid substitutions
• Goal: compare the pattern around amino acid substitutions to
the pattern around synonymous substitutions.
Hernandez et al. Science (2011)
Observed Patterns of Diversity
Around Human Substitutions
Hernandez et al. Science (2011)
Genetic diversity reduced: π = f0 π0
(decrease in effective population size [Ne])
Adaptive

Neutral

Nearly Neutral

Mildly Deleterious

Fairly Deleterious

Strongly Deleterious
Putatively neutral diversity levels
The Effect of Negative Selection
Background Selection in Humans
Observed
Predicted
Hernandez, et al. Science (2011).
BGS correlates with Fst at neutral sites
4 - Population Differentiation as a Function of BGS
The local decrease in Ne across the genome caused by BGS (inferred [2] from the value B, where lower values indicate stronger BGS) may affect the rate of genetic drift at specific loci. To investigate this effect, we measured FST between TGP populations as a function of BGS strength. Our results suggest that the strength of BGS predicts population differentiation, with an increase in genetic drift driving this effect.
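The binning of FST by B value described above could be sketched as follows. This is an illustration under stated assumptions, not the poster's pipeline: the slides say only "Fst (estimator method)", so Hudson's estimator (ratio of summed numerators to summed denominators) is swapped in here as one reasonable choice, and the function name, the input tuple layout, and the 25-unit bin width are hypothetical.

```python
from collections import defaultdict


def hudson_fst_by_bin(sites, bin_width=25):
    """Estimate FST per B-value bin using Hudson's estimator.

    sites: iterable of (b_value, p1, n1, p2, n2), where b_value is on a
    0-1000 scale, p1/p2 are allele frequencies in the two populations and
    n1/n2 are the numbers of sampled chromosomes.
    Returns {bin_index: fst}; bin 0 covers B in 0-24, and the last bin
    also absorbs B = 1000.
    """
    num = defaultdict(float)
    den = defaultdict(float)
    last_bin = 1000 // bin_width - 1
    for b, p1, n1, p2, n2 in sites:
        k = min(int(b) // bin_width, last_bin)
        # Per-site numerator and denominator of Hudson's estimator,
        # with the finite-sample correction terms.
        num[k] += ((p1 - p2) ** 2
                   - p1 * (1 - p1) / (n1 - 1)
                   - p2 * (1 - p2) / (n2 - 1))
        den[k] += p1 * (1 - p2) + p2 * (1 - p1)
    # Ratio of sums within each bin, rather than a mean of per-site ratios.
    return {k: num[k] / den[k] for k in sorted(num) if den[k] > 0}


# Hypothetical input: two sites, one in the lowest and one in the highest bin.
example = [(10, 1.0, 10, 0.0, 10), (1000, 0.5, 11, 0.5, 11)]
print(hudson_fst_by_bin(example))
```

Plotting the returned values against bin index would give the kind of curve summarized on these slides, with low B (strong BGS) on the left.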
5 - Forward Simulations of Demography and BGS
Using a distribution of fitness effects and a demographic model inferred from previous studies [3,4], we ran forward simulations with SFS_CODE [5] to estimate how human demography affects the reduction in genetic diversity caused by BGS. The effects of BGS are strongest for populations that have experienced sharp bottlenecks (i.e., Europeans and Asians). However, the expected reduction in diversity due to BGS across all human populations is still greater than for a simulated population of constant size, illustrating the importance of population expansions for determining …
[Figure: FST (estimator method) vs. background selection, in three panels: African vs. Asian, African vs. European, European vs. Asian. x-axis: B value (BGS strength), in bins from 0-24 to 975-1000; y-axis: FST (population differentiation), from 0.10 to 0.22.]
• Neutral sites defined as PhyloP score ∈ (-1.2, 1.2)
BGS in the human genome
Low Coverage
WGS
High Coverage
exome
[Figure: panels for low-coverage WGS and high-coverage exome data across 20 TGP populations, grouped as AFR, EUR, EASN, and SASN.]
4 - BGS Skews the SFS Towards Rare Variants
Purifying selection on linked sites can distort gene genealogies, leading to potential skews in the site-frequency spectrum (SFS). To investigate these effects, we measured the SFS as a function of B separately across the high-coverage and low-coverage regions of phase 3 TGP populations. We observed a marked increase in the number of rare variants, especially singletons, in both datasets as a function of BGS strength. This pattern is amplified in non-African vs. African populations.
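The comparison of singleton frequencies between strong-BGS and weak-BGS regions could be sketched as below. This is a simplified illustration, not the authors' code: the function names and the input layout are hypothetical, and the default bin edges are an assumption borrowed from the lowest and highest B bins shown in the figure legends (0-50 and 951-1000).

```python
from collections import Counter


def sfs_proportions(derived_counts):
    """Return {derived allele count: proportion of sites}.

    Sites with a derived allele count of 0 are skipped; the caller is
    assumed to pass only segregating sites otherwise.
    """
    counts = Counter(c for c in derived_counts if c > 0)
    total = sum(counts.values())
    return {k: v / total for k, v in sorted(counts.items())}


def singleton_ratio(sites, strong=(0, 50), weak=(951, 1000)):
    """Ratio of singleton frequency in the strong-BGS bin to the weak-BGS bin.

    sites: list of (b_value, derived_allele_count) tuples.
    Raises ZeroDivisionError if the weak bin contains no singletons.
    """
    def singleton_frac(lo, hi):
        sfs = sfs_proportions(c for b, c in sites if lo <= b <= hi)
        return sfs.get(1, 0.0)

    return singleton_frac(*strong) / singleton_frac(*weak)


# Hypothetical toy data: (B value, derived allele count) per site.
toy = [(10, 1), (20, 1), (30, 2), (40, 1),
       (960, 1), (970, 2), (980, 3), (990, 2)]
print(singleton_ratio(toy))
```

A ratio above 1 means singletons make up a larger share of segregating sites where BGS is strong, which is the direction of the skew described above.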
[Figure: site-frequency spectra for YRI, TSI, CHS, and ITU in the low-coverage and high-coverage data. x-axis: derived allele count (log scale, 1 to 150); y-axis: frequency. Curves for three B bins: 0-50, 476-525, 951-1000. Inset: ratio of singleton frequency in the strong-BGS bin vs. the weak-BGS bin, for low-coverage and high-coverage data.]
• Neutral sites defined as PhyloP score ∈ (-1.2, 1.2)
Modeling assumptions impact results
[Figure: π/π0 vs. distance (bp), from -20,000 to 20,000, in four panels: multiplicative, 2Ns = -5; additive, 2Ns = -5; multiplicative, 2Ns ~ Γ(α, β); additive, 2Ns ~ Γ(α, β). Curves for L = 5e4, 5e5, 1e6, 1e7, 5e7, 1e8.]
Lawrence Uricchio
Complex signatures of selection
• Soft selective sweeps result in multiple
haplotypes increasing in frequency.
Soft Sweep
Zach Szpiech
Extended Multiple Haplotype Homozygosity
-- haplotype sample size
-- set of distinct haplotypes from the locus to marker x
-- ith most frequent haplotype
-- number of haplotypes
EHH
SelScan: Szpiech & Hernandez (arXiv)
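The symbols on this slide did not survive extraction, so the sketch below does not reproduce the slide's formula. It shows a plain sample-wide haplotype homozygosity in the spirit of EHH, plus one plausible reading of a "multiple haplotype" variant in which the two most frequent haplotype classes are pooled before computing homozygosity. Both function names, the string encoding of haplotypes, and the pooling rule are assumptions made for illustration, and neither function is claimed to match the selscan implementation.

```python
from collections import Counter
from math import comb


def haplotype_homozygosity(haplotypes, core, marker):
    """Probability that two sampled haplotypes are identical between the
    core locus and the marker (inclusive).

    haplotypes: list of equal-length strings such as "0110" (at least two)
    core, marker: 0-based site indices
    """
    lo, hi = sorted((core, marker))
    counts = Counter(h[lo:hi + 1] for h in haplotypes)
    n = len(haplotypes)
    return sum(comb(c, 2) for c in counts.values()) / comb(n, 2)


def pooled_top2_homozygosity(haplotypes, core, marker):
    """Same as above, but the two most frequent haplotype classes are
    merged first (assumed variant), so a sweep carried by two haplotypes
    is not penalized."""
    lo, hi = sorted((core, marker))
    counts = sorted(Counter(h[lo:hi + 1] for h in haplotypes).values(),
                    reverse=True)
    pooled = [sum(counts[:2])] + counts[2:]
    n = len(haplotypes)
    return sum(comb(c, 2) for c in pooled) / comb(n, 2)


# Hypothetical sample of six haplotypes over three sites.
haps = ["000", "000", "001", "011", "111", "111"]
print(haplotype_homozygosity(haps, 0, 2), pooled_top2_homozygosity(haps, 0, 2))
```

In the toy sample two haplotypes are equally common, so the plain statistic is low while the pooled one stays higher, which is the behavior one would want when a soft sweep raises several haplotypes in frequency.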
Sorry, redacted for now… More
coming soon!!
[Figure: two rows of three panels each, for constant, African, and European demography (s = 0.01). Top row y-axis: % increase in power over iHS; bottom row y-axis: power. x-axis: frequency at which selection begins (0, 0.01, 0.02, 0.05, 0.10). Curves for sampling frequencies 0.70, 0.80, 0.90.]
A genomic approach to
detecting selection
• Most SNPs are non-coding.
• Most regulatory elements do not act on the
nearest gene.
• We can use genome-wide signatures of selection
to infer selection on genes using eQTL
information.
ARTICLE
Sherlock: Detecting Gene-Disease Associations by Matching Patterns of Expression QTL and GWAS
Xin He, Chris K. Fuller, Yi Song, Qingying Meng, Bin Zhang, Xia Yang, and Hao Li
Genetic mapping of complex diseases to date depends on variations inside or close to the genes that perturb their activities. A strong body of evidence suggests that changes in gene expression play a key role in complex diseases and that numerous loci perturb gene expression in trans. The information in trans variants, however, has largely been ignored in the current analysis paradigm. Here we present a statistical framework for genetic mapping by utilizing collective information in both cis and trans variants. We reason that for a disease-associated gene, any genetic variation that perturbs its expression is also likely to influence the disease risk. Thus, the expression quantitative trait loci (eQTL) of the gene, which constitute a unique "genetic signature," should overlap significantly with the set of loci associated with the disease. We translate this idea into a computational algorithm (named Sherlock) to search for gene-disease associations from GWASs, taking advantage of independent eQTL data. Application of this strategy to Crohn disease and type 2 diabetes predicts a number of genes with possible disease roles, including several predictions supported by solid experimental evidence. Importantly, predicted genes are often implicated by multiple trans eQTL with moderate associations. These genes are far from any GWAS association signals and thus cannot be identified from the GWAS alone. Our approach allows analysis of association data from a new perspective and is applicable to any complex phenotype. It is readily generalizable to molecular traits other than gene expression, such as metabolites, noncoding RNAs, and epigenetic modifications.
Introduction
Recent application of genome-wide association studies (GWASs) to complex human diseases led to the discovery …
… both cis- and trans-expression QTL in the context of association studies. So far, information from trans variations has largely been ignored because only cis variants can be assigned to their target genes based on proximity by using the GWAS data alone. The growing collection of eQTL …
He et al. AJHG (2013)
Detecting selection on
regulatory networks
Figure 1. The Sherlock Algorithm: Matching Genetic Signatures of Gene Expression Traits to that of the Disease to Identify Gene-Disease Associations
He et al. AJHG (2013)
Detecting selection on regulatory networks
Rank   GENE        BF
1      IFNAR2      14.0216
2      DARS        13.3106
3      RARRES2     12.7859
4      SLC25A43    11.8157
5**    EXT1        11.4169
6      FAM20B      11.3852
7**    MICB        11.2997
8**    MICA        11.2997
9**    HLA-B       11.2997
10**   HLA-C       11.2359
11     RHBDL1      11.1828
12**   RBMS3       11.142
13     FNBP1       11.1387
14     P4HB        10.8784
15**   SOX5        10.8667
16     KCNK3       10.7472
17     RGS20       10.5487
18     MPST        10.5474
19**   HLA-DPB1    10.4441
20     QSOX1       10.4326
21**   IL16        10.4201
22**   SYT17       10.3908
23     MALL        10.3165
24**   CRTC1       10.2577
25     MEMO1       10.2574
26     ISOC2       10.2464
27     PCF11       10.0775
28     XKR8        10.0043
29     RNF216L     10.0043
30**   SCG2        10.0012
** indicates genes in GWAS association database for complex phenotype
Selection on standing variation
driven by response to pathogens
Description                                P-value     FDR q-value
cytokine-mediated signaling pathway        5.92E-06    6.26E-02
immune effector process                    7.47E-06    3.95E-02
regulation of immune system process        7.47E-06    2.64E-02
regulation of defense response to virus    8.53E-06    2.26E-02
lymphocyte costimulation                   9.36E-06    1.98E-02
T cell costimulation                       9.36E-06    1.65E-02
GOrilla
Haplotype-based selection signals recapitulate geography
[Figure: PCA of the top 1% of windows. x-axis: PC1 (14.4%); y-axis: PC2 (12.6%). Points labeled by population: ACB, ASW, CDX, CEU, CHB, CHS, CLM, FIN, GBR, GIH, IBS, JPT, KHV, LWK, MKK, MXL, PEL, PUR, TSI, YRI.]
• TGP samples with phased OMNI genotype data
• Used iHS
• 100kb windows for each population are coded 1 if selection score is in top 1% (0 otherwise)
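The window coding described in the last bullet can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the example scores are hypothetical, the slide does not say how ties or rounding at the cutoff are handled (here ties at the cutoff are all coded 1), and the subsequent PCA step is left out.

```python
def top_fraction_indicator(scores, fraction=0.01):
    """Code each window 1 if its selection score is in the top `fraction`
    of windows for that population, and 0 otherwise.

    scores: list of per-window scores for one population
    """
    k = max(1, round(len(scores) * fraction))
    # Score of the k-th highest window serves as the cutoff.
    cutoff = sorted(scores, reverse=True)[k - 1]
    return [1 if s >= cutoff else 0 for s in scores]


# Hypothetical per-window scores for two populations (200 windows each).
window_scores = {
    "POP_A": [float(i) for i in range(200)],
    "POP_B": [float(199 - i) for i in range(200)],
}
indicator_matrix = {pop: top_fraction_indicator(s)
                    for pop, s in window_scores.items()}
print({pop: sum(row) for pop, row in indicator_matrix.items()})
```

Each population then contributes one 0/1 row, and stacking those rows gives the matrix on which a PCA like the one on this slide could be run.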
Conclusions
• Many complex signatures of selection in the human
genome.
• Mixtures of positive and negative selection
• Complicated modes of selection (including soft sweeps)
• Predominant signature of ancient human-lineage
selection seems to be from olfactory processes
• Recent selection on standing variation associated with
complex traits, including pathogen response.
Thanks!
1000 Genomes Project Consortium
Funding: NHGRI; QB3; CHARM; CTSI
ryan.hernandez@ucsf.edu
Nicolas Strauli
Cyrus Maher
Raul Torres
Lawrence Uricchio
Zach Szpiech

Contenu connexe

En vedette

Hernandez ashg 2016_share
Hernandez ashg 2016_shareHernandez ashg 2016_share
Hernandez ashg 2016_sharerdhernand
 
How do we teach science
How do we teach scienceHow do we teach science
How do we teach sciencelanie1501
 
Trippy store
Trippy storeTrippy store
Trippy storeGabi Gc
 
Dream of Detroit Community Meeting Presentation
Dream of Detroit Community Meeting PresentationDream of Detroit Community Meeting Presentation
Dream of Detroit Community Meeting Presentationdreamofdetroit
 
Psed 2 ( theorist)1
Psed 2 ( theorist)1Psed 2 ( theorist)1
Psed 2 ( theorist)1lanie1501
 
Brain based lesson plan
Brain based lesson planBrain based lesson plan
Brain based lesson planlanie1501
 
Integrated lesson plan (2)
Integrated lesson plan (2)Integrated lesson plan (2)
Integrated lesson plan (2)lanie1501
 
Brain based lesson plan
Brain based lesson planBrain based lesson plan
Brain based lesson planlanie1501
 

En vedette (10)

Hernandez ashg 2016_share
Hernandez ashg 2016_shareHernandez ashg 2016_share
Hernandez ashg 2016_share
 
How do we teach science
How do we teach scienceHow do we teach science
How do we teach science
 
Trippy store
Trippy storeTrippy store
Trippy store
 
Dream of Detroit Community Meeting Presentation
Dream of Detroit Community Meeting PresentationDream of Detroit Community Meeting Presentation
Dream of Detroit Community Meeting Presentation
 
คอม
คอมคอม
คอม
 
5 min spn_audioscript
5 min spn_audioscript5 min spn_audioscript
5 min spn_audioscript
 
Psed 2 ( theorist)1
Psed 2 ( theorist)1Psed 2 ( theorist)1
Psed 2 ( theorist)1
 
Brain based lesson plan
Brain based lesson planBrain based lesson plan
Brain based lesson plan
 
Integrated lesson plan (2)
Integrated lesson plan (2)Integrated lesson plan (2)
Integrated lesson plan (2)
 
Brain based lesson plan
Brain based lesson planBrain based lesson plan
Brain based lesson plan
 

Dernier

Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 sciencefloriejanemacaya1
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfnehabiju2046
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
Work, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE PhysicsWork, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE Physicsvishikhakeshava1
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 

Dernier (20)

Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 science
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Work, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE PhysicsWork, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE Physics
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Evolutionary Forces Shaping Human Genetic Variation

  • 1. EVOLUTIONARY FORCES SHAPING HUMAN GENETIC VARIATION Ryan D. Hernandez ryan.hernandez@ucsf.edu twitter: @rdhernand
  • 3. Human colonization of the world (map, http://ngm.nationalgeographic.com). Legend: migration date; generalized route; fossil or artifact site. Geographic labels: Nile River, Red Sea, Andaman Islands, Equator, Africa, Asia, Australia, Europe, North America, South America. Fossil or artifact sites: Meadowcroft (19,000-12,000 years ago); Kennewick (9,500 years ago); Spirit Cave (9,500-9,400 years ago); Monte Verde (14,800 years ago); Niah Cave (40,000 years ago); Qafzeh (100,000 years ago); Lake Mungo (45,000 years ago); Malakunanja (50,000 years ago); Omo Kibish (oldest modern human, 195,000 years ago); Pestera cu Oase (35,000 years ago); Yana River (30,000 years ago); Zhoukoudian (Shandingdong) (11,000 years ago); Minatogawa (18,000 years ago); Clovis (13,500 years ago); Klasies River Mouth (120,000 years ago). Migration dates along the routes: 200,000; 70,000-50,000; 50,000; 40,000; 40,000-30,000; 20,000-15,000; 15,000-12,000 years ago. SOURCES: SUSAN ANTON, NEW YORK UNIVERSITY; ALISON BROOKS, GEORGE WASHINGTON UNIVERSITY; PETER FORSTER, UNIVERSITY OF CAMBRIDGE; JAMES F. O'CONNELL, UNIVERSITY OF UTAH; STEPHEN OPPENHEIMER, OXFORD UNIVERSITY; SPENCER WELLS, NATIONAL GEOGRAPHIC SOCIETY; OFER BAR-YOSEF, HARVARD UNIVERSITY. NGM MAPS © 2006 National Geographic Society. All rights reserved.
  • 4. The Effect of “Positive Selection”. Figure labels: putatively neutral diversity levels; legend: Adaptive, Neutral, Nearly Neutral, Mildly Deleterious, Fairly Deleterious, Strongly Deleterious.
  • 5. The Effect of “Positive Selection”: “Selective Sweep” (same figure and legend, next animation step).
  • 6. The Effect of “Positive Selection”: “Selective Sweep” (same figure and legend, next animation step).
  • 7. The Effect of “Positive Selection”: “Selective Sweep” • Repeated fixation of functional mutations in coding regions over evolutionary timescales can lead to a disproportional number of amino acid substitutions relative to observed polymorphisms. • This can be summarized by a 2x2 table and analyzed using the McDonald-Kreitman test:
                     Non-Syn    Syn
      Fixed             F        F
      Polymorphic       P        P
    Data: 1000 Genomes Project. Figure labels: putatively neutral diversity levels; legend: Adaptive, Neutral, Nearly Neutral, Mildly Deleterious, Fairly Deleterious, Strongly Deleterious.
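The 2x2 table on slide 7 can be analyzed along the following lines. This is a minimal sketch, not the analysis behind the slides: the function names are hypothetical, the counts in the usage lines are illustrative rather than 1000 Genomes values, and the two-sided Fisher exact test is written out with the standard library so the block has no dependencies.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is at most as probable as the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


def mk_test(fixed_nonsyn, fixed_syn, poly_nonsyn, poly_syn):
    """Return (neutrality index, alpha, p-value) for one gene.

    Assumes fixed_nonsyn and poly_syn are non-zero; sparse per-gene tables
    often violate this, which is one reason per-gene MK tests are noisy.
    """
    neutrality_index = (poly_nonsyn * fixed_syn) / (poly_syn * fixed_nonsyn)
    alpha = 1.0 - neutrality_index  # estimated fraction of adaptive substitutions
    p_value = fisher_exact_two_sided(fixed_nonsyn, fixed_syn, poly_nonsyn, poly_syn)
    return neutrality_index, alpha, p_value


# Illustrative counts only: Fixed (Non-Syn, Syn) = (7, 17), Polymorphic = (2, 42).
ni, alpha, p = mk_test(7, 17, 2, 42)
print(f"NI = {ni:.3f}, alpha = {alpha:.3f}, p = {p:.4f}")
```

A neutrality index below 1 (alpha above 0) indicates an excess of fixed non-synonymous differences relative to polymorphism, which is the pattern the slide attributes to repeated fixation of functional mutations.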
  • 8. SnIPRE: an improvement to MKT. Since few SNPs and substitutions are usually observed per gene, MKT can be noisy. Pooling observations across the genome using a mixed effects model vastly increases power. (Eilertson et al., 2012)
  • 9. SnIPREASR in 1000 Genomes Project. Human-chimp divergence: Pos Sel 410 genes; Conserved 8027 genes. • Conserved genes are either neutral or under purifying selection.
  • 10. SnIPREASR: an improvement to SnIPRE • Alignments are generated using MOSAIC, a program we developed that rigorously integrates putative orthologs from an arbitrary number of sources. • Using PAML, we perform AIC-based model selection to infer the substitutions along the human lineage since our divergence from chimp. pythonhosted.org/bio-MOSAIC/ Maher & Hernandez (arXiv). Figure: phylogeny of Human, Chimp, Orang, Gorilla, … Credit: Cyrus Maher
  • 11. SnIPREASR works well for positive selection • Simulations: human-specific substitutions; Gutenkunst et al. demographic model. • 𝛾 is the population-scaled selection coefficient. • SnIPREASR is best-powered to estimate values of 𝛾 > 0. Figure: phylogeny of Human, Chimp, Orang, Gorilla, …
  • 12. ASR removes genes positively selected in chimp
                                 Human-chimp divergence
      Human only (ASR)       Pos Sel    Conserved    Total
      Pos Sel                   343          0         343
      Conserved                  67       8027        8094
      Total                     410       8027
    • Conserved genes are either neutral or under purifying selection. • 67/410 (16%) of genes identified as positively selected when comparing human-chimp are conserved along the human lineage.
  • 13. Positively selected genes dominated by smell & response to pathogens GOrilla
  • 14. The footprint of adaptive amino acid substitutions • Goal: compare the pattern around amino acid substitutions to the pattern around synonymous substitutions. Figure labels: amino acid substitution; neutral diversity levels; n substitutions; “… Reflects the fraction of amino acid substitutions that are adaptive”; “… Reflects the typical strength of selection”. Hernandez et al. Science (2011)
  • 15. Observed Patterns of Diversity Around Human Substitutions Hernandez et al. Science (2011)
  • 16. The Effect of Negative Selection. Genetic diversity reduced: π = f0 π0 (decrease in effective population size [Ne]). Figure labels: putatively neutral diversity levels; legend: Adaptive, Neutral, Nearly Neutral, Mildly Deleterious, Fairly Deleterious, Strongly Deleterious.
  • 17. The Effect of Negative Selection (same figure and legend, next animation step).
  • 18. The Effect of Negative Selection (same figure and legend, next animation step).
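The relation π = f0 π0 on slides 16 to 18 can be made concrete with a small worked example. The sketch below assumes the standard neutral expectation π0 = 4·Ne·μ for a diploid population, which the slides do not state explicitly; the function names and the numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def neutral_diversity(ne, mu):
    """Expected neutral diversity pi_0 = 4 * Ne * mu (diploid, per site)."""
    return 4.0 * ne * mu


def diversity_with_linked_selection(ne, mu, f0):
    """pi = f0 * pi_0, i.e. selection at linked sites acts like Ne -> f0 * Ne."""
    if not 0.0 <= f0 <= 1.0:
        raise ValueError("f0 is a fraction between 0 and 1")
    return f0 * neutral_diversity(ne, mu)


# Hypothetical values: Ne = 10,000, mu = 1.25e-8 per site per generation, f0 = 0.8.
pi0 = neutral_diversity(10_000, 1.25e-8)
pi = diversity_with_linked_selection(10_000, 1.25e-8, 0.8)
print(pi0, pi)
```

Under this reading, a region with f0 = 0.8 has the diversity expected of a population whose effective size is 20% smaller, which is the "decrease in effective population size" noted on the slide.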
  • 19. Background Selection in Humans. Figure: Observed vs. Predicted. Hernandez et al. Science (2011).
  • 20. BGS correlates with Fst at neutral sites • Neutral sites defined as PhyloP ∈ (-1.2, 1.2)
    Poster excerpt, 4 - Population Differentiation as a Function of BGS: The decrease in Ne locally across the genome as a result of BGS (inferred [2] by the value, B, in which lower values indicate stronger BGS) may impact the rate of genetic drift at specific loci. To investigate this effect, we measured FST between TGP populations as a function of BGS strength. Our results suggest that the strength of BGS is a predictor of population differentiation, with an increase in genetic drift driving this effect.
    Poster excerpt, 5 - Forward Simulations of Demography and BGS: Using a distribution of fitness effects and a demographic model inferred from previous studies [3,4], we ran forward simulations using SFS_CODE [5] to estimate the effect of human demography on determining the reduction in genetic diversity caused by BGS, observing that the effects of BGS are strongest for those populations that have experienced sharp population bottlenecks (i.e., Europeans and Asians). However, the expected reduction in diversity due to BGS across all human populations is still greater than for a simulated population of constant size, illustrating the importance of population expansions for determining …
    Figure: Fst (estimator method) vs. Background Selection, three panels (African vs. Asian; African vs. European; European vs. Asian). x-axis: B value in bins from 0−24 to 975−1000 (BGS strength, strong to weak); y-axis: FST (population differentiation), 0.10 to 0.22.
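The "FST as a function of B" analysis on slide 20 can be sketched as follows. The slide says only "estimator method", so the choice of Hudson's estimator (ratio of summed numerators to summed denominators within each bin) is an assumption, as are the function names, the input layout, and the toy allele frequencies. Bins are 25 B-units wide to match the axis labels (0−24 … 975−1000).

```python
from collections import defaultdict


def hudson_terms(p1, n1, p2, n2):
    """Per-site numerator and denominator of Hudson's FST estimator.

    p1, p2 are allele frequencies; n1, n2 are numbers of sampled chromosomes.
    """
    numerator = (
        (p1 - p2) ** 2
        - p1 * (1 - p1) / (n1 - 1)
        - p2 * (1 - p2) / (n2 - 1)
    )
    denominator = p1 * (1 - p2) + p2 * (1 - p1)
    return numerator, denominator


def fst_by_b_bin(sites, n1, n2, bin_width=25):
    """Map bin start -> FST, for sites given as (b_value, p1, p2), B in 0..1000."""
    last_bin = 1000 // bin_width - 1  # B = 1000 falls into the final bin
    sums = defaultdict(lambda: [0.0, 0.0])
    for b_value, p1, p2 in sites:
        index = min(int(b_value) // bin_width, last_bin)
        numerator, denominator = hudson_terms(p1, n1, p2, n2)
        sums[index][0] += numerator
        sums[index][1] += denominator
    return {
        index * bin_width: num / den
        for index, (num, den) in sorted(sums.items())
        if den > 0
    }


# Toy data (hypothetical): two sites under strong BGS, two under weak BGS.
toy_sites = [(10, 0.9, 0.1), (12, 0.5, 0.5), (990, 0.6, 0.4), (1000, 0.5, 0.5)]
print(fst_by_b_bin(toy_sites, n1=101, n2=101))
```

Summing numerators and denominators before dividing keeps sites with little information from dominating a bin, which matters when bins at the extremes of B contain few sites.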
  • 21. BGS in the human genome: Low Coverage WGS and High Coverage exome • Neutral sites defined as PhyloP ∈ (-1.2, 1.2)
    Poster excerpt, 4 - BGS Skews the SFS Towards Rare Variants: Purifying selection on linked sites can cause distortions in gene genealogies, leading to potential skews in the site-frequency spectrum. To investigate these effects, we measured the SFS as a function of B separately across the high-coverage and low-coverage regions of phase 3 TGP populations. We observed a marked increase in the number of rare variants, especially singletons, in both datasets as a function of BGS strength. This pattern is amplified in non-African vs. African populations.
    Figure: SFS panels for YRI, CHS, TSI, and ITU in Low-Coverage and High-Coverage data, each with curves for B: 0−50, B: 476−525, and B: 951−1000. x-axis: Derived Allele Count (log-scale), 1 to 150; y-axis: frequency. Additional panels: per-population results grouped as AFR, EUR, EASN, SASN; Ratio of Singleton Frequency in Strong BGS Bin vs. Weak BGS Bin (Low-Coverage and High-Coverage).
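The singleton comparison on slide 21 can be sketched as below. The B ranges (0−50 for strong BGS, 951−1000 for weak BGS) are taken from the figure legend; the function names, the input layout of (B value, derived allele count) pairs, and the toy data are assumptions made for illustration.

```python
from collections import Counter


def sfs_proportions(derived_counts):
    """Proportional site-frequency spectrum over segregating sites (count > 0)."""
    counts = Counter(c for c in derived_counts if c > 0)
    total = sum(counts.values())
    return {k: v / total for k, v in sorted(counts.items())}


def singleton_fraction(sites, low, high):
    """Fraction of segregating sites with low <= B <= high that are singletons."""
    counts = [dac for b_value, dac in sites if low <= b_value <= high and dac > 0]
    if not counts:
        raise ValueError("no segregating sites in this B range")
    return sum(1 for dac in counts if dac == 1) / len(counts)


def singleton_ratio(sites, strong=(0, 50), weak=(951, 1000)):
    """Singleton frequency in the strong-BGS bin relative to the weak-BGS bin."""
    return singleton_fraction(sites, *strong) / singleton_fraction(sites, *weak)


# Toy data (hypothetical): (B value, derived allele count) per neutral site.
toy_sites = [(10, 1), (20, 1), (30, 1), (40, 5), (960, 1), (970, 2), (980, 3), (990, 10)]
print(singleton_ratio(toy_sites))
```

A ratio above 1 corresponds to the skew toward rare variants that the poster text reports for regions under strong BGS.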
  • 22. Modeling assumptions impact results. Figure: four panels (Multiplicative, 2Ns = -5; Additive, 2Ns = -5; Multiplicative, 2Ns~Γ(α, β); Additive, 2Ns~Γ(α, β)). x-axis: distance (bp), -20000 to 20000; y-axis: π/π0, 0.0 to 1.0; curves for L = 5e4, 5e5, 1e6, 1e7, 5e7, 1e8. Credit: Lawrence Uricchio
  • 23. Complex signatures of selection • Soft selective sweeps result in multiple haplotypes increasing in frequency. Figure: Soft Sweep. Credit: Zach Szpiech
  • 24. Extended Multiple Haplotype Homozygosity. Definitions listed on the slide: haplotype sample size; set of distinct haplotypes from the locus to marker x; ith most frequent haplotype; number of haplotypes. EHH. SelScan: Szpiech & Hernandez (arXiv). Sorry, redacted for now… More coming soon!!
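The statistic on slide 24 is redacted, so nothing here reconstructs it. For orientation only, the widely used extended haplotype homozygosity (EHH) that the slide labels can be sketched as the probability that two haplotypes drawn from the sample are identical from the core locus out to marker x. The function name and the string encoding of haplotypes are assumptions.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, x):
    """Classic EHH between marker indices `core` and `x` (inclusive).

    haplotypes: equal-length sequences (e.g. strings of '0'/'1'), typically
    restricted beforehand to those carrying the same core allele.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    low, high = sorted((core, x))
    # Group haplotypes that are identical across the whole interval.
    groups = Counter(tuple(h[low:high + 1]) for h in haplotypes)
    # Pairs that match, divided by all possible pairs.
    return sum(comb(size, 2) for size in groups.values()) / comb(n, 2)


# Toy sample (hypothetical): homozygosity decays with distance from marker 0.
sample = ["0000", "0001", "0011", "0011"]
print([ehh(sample, 0, x) for x in range(4)])
```

Because every pair is required to match exactly, this form is most sensitive to a single dominant haplotype, which is the limitation that motivates multi-haplotype extensions for soft sweeps.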
  • 25. Power. Figure: panels for Constant Demography, African Demography, and European Demography (s = 0.01), shown once with y-axis “% increase in power over iHS” and once with y-axis “Power”. x-axis: Frequency at which selection begins (0, 0.01, 0.02, 0.05, 0.10); series: Sampling Frequency 0.70, 0.80, 0.90.
  • 26. A genomic approach to detecting selection • Most SNPs are non-coding. • Most regulatory elements do not act on the nearest gene. • We can use genome-wide signatures of selection to infer selection on genes using eQTL information.
    Article shown on slide: Sherlock: Detecting Gene-Disease Associations by Matching Patterns of Expression QTL and GWAS. Xin He, Chris K. Fuller, Yi Song, Qingying Meng, Bin Zhang, Xia Yang, and Hao Li. He et al. AJHG (2013)
    Abstract: Genetic mapping of complex diseases to date depends on variations inside or close to the genes that perturb their activities. A strong body of evidence suggests that changes in gene expression play a key role in complex diseases and that numerous loci perturb gene expression in trans. The information in trans variants, however, has largely been ignored in the current analysis paradigm. Here we present a statistical framework for genetic mapping by utilizing collective information in both cis and trans variants. We reason that for a disease-associated gene, any genetic variation that perturbs its expression is also likely to influence the disease risk. Thus, the expression quantitative trait loci (eQTL) of the gene, which constitute a unique “genetic signature,” should overlap significantly with the set of loci associated with the disease. We translate this idea into a computational algorithm (named Sherlock) to search for gene-disease associations from GWASs, taking advantage of independent eQTL data. Application of this strategy to Crohn disease and type 2 diabetes predicts a number of genes with possible disease roles, including several predictions supported by solid experimental evidence. Importantly, predicted genes are often implicated by multiple trans eQTL with moderate associations. These genes are far from any GWAS association signals and thus cannot be identified from the GWAS alone. Our approach allows analysis of association data from a new perspective and is applicable to any complex phenotype. It is readily generalizable to molecular traits other than gene expression, such as metabolites, noncoding RNAs, and epigenetic modifications.
    Introduction (fragment): Recent application of genome-wide association studies (GWASs) to complex human diseases led to the discovery … both cis- and trans-expression QTL in the context of association studies. So far, information from trans variations has largely been ignored because only cis variants can be assigned to their target genes based on proximity by using the GWAS data alone. The growing collection of eQTL …
  • 27. Detecting selection on regulatory networks. Figure 1. The Sherlock Algorithm: Matching Genetic Signatures of Gene Expression Traits to that of the Disease to Identify Gene-Disease Associations. He et al. AJHG (2013)
  • 28. Detecting selection on regulatory networks
      Rank   GENE        BF
      1      IFNAR2      14.0216
      2      DARS        13.3106
      3      RARRES2     12.7859
      4      SLC25A43    11.8157
      5**    EXT1        11.4169
      6      FAM20B      11.3852
      7**    MICB        11.2997
      8**    MICA        11.2997
      9**    HLA-B       11.2997
      10**   HLA-C       11.2359
      11     RHBDL1      11.1828
      12**   RBMS3       11.142
      13     FNBP1       11.1387
      14     P4HB        10.8784
      15**   SOX5        10.8667
      16     KCNK3       10.7472
      17     RGS20       10.5487
      18     MPST        10.5474
      19**   HLA-DPB1    10.4441
      20     QSOX1       10.4326
      21**   IL16        10.4201
      22**   SYT17       10.3908
      23     MALL        10.3165
      24**   CRTC1       10.2577
      25     MEMO1       10.2574
      26     ISOC2       10.2464
      27     PCF11       10.0775
      28     XKR8        10.0043
      29     RNF216L     10.0043
      30**   SCG2        10.0012
    ** indicates genes in GWAS association database for complex phenotype
  • 29. Selection on standing variation driven by response to pathogens (GOrilla)
      Description                                  P-value     FDR q-value
      cytokine-mediated signaling pathway          5.92E-06    6.26E-02
      immune effector process                      7.47E-06    3.95E-02
      regulation of immune system process          7.47E-06    2.64E-02
      regulation of defense response to virus      8.53E-06    2.26E-02
      lymphocyte costimulation                     9.36E-06    1.98E-02
      T cell costimulation                         9.36E-06    1.65E-02
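The relationship between the P-value and FDR q-value columns on slide 29 can be checked numerically. The sketch below assumes a Benjamini-Hochberg-style correction of the form q = p × N / rank, where N is the number of tested terms. N is not given on the slide, so it is back-calculated here from the first row and should be read as an inference, not a reported value; the function name is hypothetical.

```python
# (description, P-value, FDR q-value) exactly as listed on slide 29.
rows = [
    ("cytokine-mediated signaling pathway", 5.92e-06, 6.26e-02),
    ("immune effector process", 7.47e-06, 3.95e-02),
    ("regulation of immune system process", 7.47e-06, 2.64e-02),
    ("regulation of defense response to virus", 8.53e-06, 2.26e-02),
    ("lymphocyte costimulation", 9.36e-06, 1.98e-02),
    ("T cell costimulation", 9.36e-06, 1.65e-02),
]

# Assumption: number of tested terms, inferred from rank 1 (q = p * N / 1).
n_terms = rows[0][2] / rows[0][1]


def q_value(p_value, rank, n_tests):
    """Benjamini-Hochberg-style q-value for the term at a given rank."""
    return p_value * n_tests / rank


recomputed = [
    q_value(p_value, rank, n_terms)
    for rank, (_, p_value, _) in enumerate(rows, start=1)
]
for (description, _, reported), estimate in zip(rows, recomputed):
    print(f"{description}: reported {reported:.2e}, recomputed {estimate:.2e}")
```

Under this assumption the recomputed values agree with the slide to within rounding, which also explains why q decreases down the table even though the P-values increase: the rank in the denominator grows faster than p.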
  • 30. Haplotype-based selection signals recapitulate geography • TGP samples with phased OMNI genotype data • Used iHS • 100kb windows for each population are coded 1 if selection score is in top 1% (0 otherwise). Figure: Top 1% of windows, PC1 (14.4%) vs. PC2 (12.6%); populations: ACB, ASW, CDX, CEU, CHB, CHS, CLM, FIN, GBR, GIH, IBS, JPT, KHV, LWK, MKK, MXL, PEL, PUR, TSI, YRI.
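The window coding described on slide 30 can be sketched as follows. The function names and the input layout (one list of per-window selection scores per population, in the same window order) are assumptions; how per-SNP iHS values were summarized into a window score is not stated on the slide and is not modeled here.

```python
def code_top_windows(scores, top_fraction=0.01):
    """Return a 0/1 list marking windows whose score is in the top fraction.

    Ties at the cutoff are all coded 1, so slightly more than the requested
    fraction can be marked when scores repeat.
    """
    n_top = max(1, int(len(scores) * top_fraction))
    cutoff = sorted(scores, reverse=True)[n_top - 1]
    return [1 if score >= cutoff else 0 for score in scores]


def code_populations(scores_by_population, top_fraction=0.01):
    """Apply the coding separately within each population."""
    return {
        population: code_top_windows(scores, top_fraction)
        for population, scores in scores_by_population.items()
    }


# Toy scores (hypothetical): 200 windows in each of two populations.
toy = {"YRI": list(range(200)), "CEU": list(range(199, -1, -1))}
coded = code_populations(toy)
print(sum(coded["YRI"]), sum(coded["CEU"]))
```

Stacking the coded vectors into a populations × windows matrix gives the input for a principal component analysis such as the PC1/PC2 plot on the slide; the PCA step itself is left out of this sketch.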
  • 31. Conclusions • Many complex signatures of selection in the human genome. • Mixtures of positive and negative selection • Complicated modes of selection (including soft sweeps) • Predominant signature of ancient human-lineage selection seems to be from olfactory processes • Recent selection on standing variation associated with complex traits, including pathogen response.
  • 32. Thanks! 1000 Genomes Project Consortium Funding: NHGRI; QB3; CHARM; CTSI ryan.hernandez@ucsf.edu Nicolas Strauli Cyrus Maher Raul Torres Lawrence Uricchio Zach Szpiech