Combining SNPs, STRs, & Genealogy
to build a Surname Origins Tree
Dr Maurice Gleeson
11th Annual FTDNA Conference
15th Nov 2015
http://gleesondna.blogspot.co.uk/
YouTube – DNA and Family History Research
Google: YouTube Genetic Genealogy Ireland
A Combined Mutation / Family History Tree
… using DNA markers when people run out
… is it possible? Can you do it?
Topics for Discussion
• Building a tree with STRs
• Building a tree with SNPs
• Combining STRs & SNPs
• Dating branching points in the tree
• Combining STRs, SNPs & genealogy
• Opportunities for the years ahead
Modal Haplotype for Lineage II
• Lots of Parallel Mutations!
o Back Mutations remain hidden
• Is resolution enough to define the tree?
• Is this the “best fit” model?
[Figure: hand-drawn tree with branch numbers; labelled mutations include 570 (17-18) and CDYa (38>39), the latter appearing on two separate branches]
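As a minimal sketch of what a modal haplotype is, the snippet below takes the most common value at each marker across a group and lists where one member departs from it. The function names and the five-member, three-marker data set are hypothetical and chosen only for illustration; they are not the project's actual results.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Return the most common value at each marker across all members."""
    modal = {}
    for marker in haplotypes[0]:
        values = [h[marker] for h in haplotypes if marker in h]
        modal[marker] = Counter(values).most_common(1)[0][0]
    return modal


def differences_from_modal(haplotype, modal):
    """Map each differing marker to (modal value, member's value)."""
    return {m: (modal[m], v) for m, v in haplotype.items() if modal.get(m) != v}


# Hypothetical results for five members (illustration only).
members = [
    {"DYS570": 17, "CDYa": 38, "DYS390": 24},
    {"DYS570": 18, "CDYa": 38, "DYS390": 24},
    {"DYS570": 17, "CDYa": 39, "DYS390": 24},
    {"DYS570": 17, "CDYa": 39, "DYS390": 25},
    {"DYS570": 17, "CDYa": 38, "DYS390": 24},
]

modal = modal_haplotype(members)
for index, member in enumerate(members, start=1):
    print(index, differences_from_modal(member, modal))
```

Members 3 and 4 both show CDYa 38>39. Nothing in the marker values alone says whether that is one shared mutation or two parallel ones, which is the ambiguity the slide raises.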
Fluxus cladogram (courtesy of Ralph Taylor)
[Figure: Fluxus cladogram; labelled members include G64 and G39]
• It can help
- useful to check against the Hand-Drawn Tree
• Shows “maximum parsimony” version
• Cumbersome, fiddly, easy to make mistakes, difficult to interpret, time-consuming
• Difficult to visualise as a “Family Tree”
• Gives all markers equal weight & ignores differing mutation rates
www.isogg.org/wiki/Cladogram
• Several “Best Fit” models
- at least 8 BF models …
- Tree is not anchored
• No single “most likely” option
• So not enough information at 37 markers to define the branching pattern
• Parallel Mutations still persist
- 390, 392, CDYa&b
• Back Mutations also possible
• Not clear which mutation came before which
www.isogg.org/wiki/Cladogram
[Figure: Hand Drawn Tree compared with Fluxus Tree v1, with branch numbers; labelled mutations include 570 (17-18) and CDYa (38>39)]
[Figure: Fluxus Cladogram at 111 markers (G64, G39, G73 labelled) compared with Fluxus Cladogram at 37 markers (G64, G39 labelled)]
www.isogg.org/wiki/Cladogram Courtesy of Ralph Taylor
• No weighting … but mutation rates vary by a factor of 400
• James Irvine developed an algorithm for weighting markers
$\text{weighting} = 99 \times \left(1 - \frac{\text{mutation rate}}{0.04}\right)^{2}$
https://en.wikipedia.org/wiki/List_of_Y-STR_markers
www.isogg.org/wiki/Cladogram Courtesy of Ralph Taylor
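The weighting formula above can be sketched in a few lines. This assumes the trailing "2" on the slide is an exponent (a squared term), which is a reading of the flattened text rather than something the slide states outright. The function name and parameter names are hypothetical; the two mutation rates are the CDY and 464 figures quoted later in this deck.

```python
def marker_weight(mutation_rate, max_rate=0.04, scale=99):
    """Weight a marker so that slow-mutating markers count for more.

    Assumed form: scale * (1 - mutation_rate / max_rate) ** 2
    """
    return scale * (1 - mutation_rate / max_rate) ** 2


# Per-generation mutation rates quoted elsewhere in the deck.
rates = {"CDY": 0.03531, "DYS464": 0.00566}

for marker, rate in rates.items():
    print(f"{marker}: rate={rate}, weight={marker_weight(rate):.1f}")
```

Under this reading a fast marker such as CDY gets a weight of about 1.4 while 464 gets about 73, so fast markers contribute very little to the weighted cladogram.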
• Torso disappears
• No alternative pathways
= 1 single “Best Fit” model
[Figure: Fluxus Cladogram at 111 markers (G64, G39, G73 labelled) compared with the weighted 111-marker Fluxus Cladogram]
Some markers behave unusually
• Marker 389: this is tested in 2 parts – mutation in Part 1 is also counted in Part 2 => so just use Part 2 (389ii) … and we did!
– www.familytreedna.com/learn/y-dna-testing/y-str/different-str-markers-dys389i-dys398ii-dys389-2-result-family-tree-dna-different-genographic-project/
• Multi-copy markers 464abcd
(but also 385, 459, YCAII, CDY, DYF395S1, 413)
– mutations in multi-copy markers may not be in the correct order
– Kittler test defines relative positions for 385 … not applicable here?
– www.familytreedna.com/learn/y-dna-testing/y-str/infinite-allele-palindromic-markers/
– http://www.isogg.org/wiki/DYS_464
• Multi-copy marker 464abcd: 2 types = c & g
– 464x test defines which type (but not position) … not accounted for!
– http://www.dna-fingerprint.com/static/PalindromicPres.pdf
• 464abcd, CDYa & b: fast-mutating palindromic markers
– http://www.isogg.org/wiki/RecLOH
[Figure: weighted 111-marker Fluxus Cladogram compared with the weighted 111-marker version without CDY and 464]
Which is more accurate?
with or without CDY & 464?
or some version in between?
How likely is it that 464 & CDY will screw things up?
• Gleeson surname origin = 1000 AD
=> Surname has had 1000 years to mutate = 33.3 generations (30 y/gen)
• How many mutations would you expect in 1000 years?
• CDY mutation rate = 0.03531 / gen
= 1.176 per member = c.16 mutations for all 14 branches of Lineage II
Observed count is 4 for CDYa, and 3 for CDYb
=> 12/16 and 13/16 mutations respectively are hidden?
– So predictions based on CDY will be incorrect (12/16 + 13/16)/2 = 78% of the time?
• 464 mutation rate = 0.00566 / gen
= 0.188 per member = 2.6 per 14 members (on each of 464abcd)
Observed count is 0 for 464a & d, and 2 for 464b & c
=> 2.6/2.6 & 0.6/2.6 mutations respectively are hidden?
– So predictions based on 464 will be incorrect (2.6/2.6 + 0.6/2.6)/2 = 62% of the time?
https://en.wikipedia.org/wiki/List_of_Y-STR_markers
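The arithmetic on this slide can be reproduced with a short sketch. It uses only the figures quoted above (1000 years, 30 years per generation, 14 branches, the two mutation rates and the observed counts). The function names are hypothetical, and treating "expected minus observed" as the number of hidden mutations is the slide's own rough approximation, not a formal model.

```python
def expected_mutations(rate_per_gen, years=1000, years_per_gen=30, members=14):
    """Return (expected mutations per member, expected total across members)."""
    generations = years / years_per_gen
    per_member = rate_per_gen * generations
    return per_member, per_member * members


def hidden_fraction(expected, observed):
    """Share of expected mutations that were not observed (floored at zero)."""
    if expected <= 0:
        return 0.0
    return max(expected - observed, 0.0) / expected


cdy_per_member, cdy_total = expected_mutations(0.03531)
cdy_hidden = [hidden_fraction(cdy_total, n) for n in (4, 3)]  # CDYa, CDYb
cdy_mean = sum(cdy_hidden) / len(cdy_hidden)

d464_per_member, d464_total = expected_mutations(0.00566)
d464_hidden = [hidden_fraction(d464_total, n) for n in (0, 2)]  # 464a/d, 464b/c
d464_mean = sum(d464_hidden) / len(d464_hidden)

print(f"CDY: {cdy_per_member:.3f} per member, {cdy_total:.1f} total, "
      f"{cdy_mean:.0%} hidden")
print(f"464: {d464_per_member:.3f} per member, {d464_total:.1f} total, "
      f"{d464_mean:.0%} hidden")
```

Without the intermediate rounding used on the slide (16 expected CDY mutations), the CDY figure comes out nearer 79% than 78%; the 464 figure stays at about 62%.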
• Less of a problem in those branches related within the last 200-300 years?
– less time to mutate back
– lower chance of back mutations
– more useful for branch-defining
• More of a problem with those branches more distantly related (600-1000 yrs)?
– more time to mutate back
– higher chance of back mutations
– less useful for branch-defining
=> Choose v3a (i.e. use CDY & 464 data)
• Tree will be less than 100% correct
• Be especially wary of mutations in more distant reaches of the tree
[Figure: Hand-Drawn Trees (HDT) at Y-12 and Y-37]
Caveats & Limitations
• Missing data
– Fluxus fills in the blanks - is its “best guess” valid?
– No adequate mutation rates for many markers
• The Tree is not yet “anchored”
– More so in the upper reaches of the tree (sub-branches seem stable)
– Several interpretations are still possible, even at 111 markers (v3a vs v4)
– Will this reduce as more people test? or upgrade?
– Are there hidden Back Mutations?
• Tree may be skewed by recent mutations (last 5-6 generations)
=> Triangulate on each MDKA
– Test at least 2 known distant cousins from each family branch in order to characterise the haplotype of each MDKA
– Helps eliminate recent mutations which might cloud the interpretation
– Costly … $339 for a 111-marker test … x2 = $678
• Is there Convergence in the Tree? (e.g. 3/111)
www.isogg.org/wiki/Fluxus
http://dna-explained.com/2014/10/15/tenth-annual-family-tree-dna-conference-wrapup/
Deep Clade Panel 2.0
- Targeted subclade panels
- $119
Is fine-scale SNP testing
the best method of determining
branching patterns within a Genetic Family?
… how to do it as cheaply &
efficiently as possible?
Google: YouTube Genetic Genealogy Ireland
Working with SNPs
– Opportunities & Challenges
• Declaring SNPs - false positives
• Missing SNPs - false negatives
• Constant change
– “Known, Novel, Shared & Private”
• No name, just a location
• SNP naming process unregulated
– Same SNP, different names
• Making results user-friendly
• Lots of help available
– independent verification & interpretation possible
Problems encountered with “declaring a genuine SNP”
Problem | Reason(s) | Implication
Detection | No coverage | False Negative – SNP is present on Y but remains undetected
Low no. of Calls | Poor coverage | False Negative – SNP present but fails to meet threshold criteria
Recognition | Detection Filter / Threshold too strict? | False Negative - SNP is present in data but missed by analysis - detectable by manual analysis of possible SNPs on BAM file
Localisation | Difficult location on Y (centromere, palindrome, in STR / repetitive region) | False Positive or Negative - SNP may be genuine but its exact position cannot be known for sure or may vary
Instability | Unstable SNP – frequent & unpredictable mutation | False Positive or Negative - SNP may or may not be genuine
InDels | Not SNPs, but rather a deletion (usually) | False Positive or Negative - may or may not be genuine
So is the SNP really present?
… or absent?
Just because it is detected, doesn’t mean it is there …
Just because it’s not detected, doesn’t mean it isn’t there
“Known, Novel, Shared & Private” – the fluid categorisation of SNPs
• SNPs are either Known (already discovered) or New (never discovered before)
• SNPs are either Shared (with someone else) or Not shared (Unique / Private)
• Shared Novel Variants: no names … just positions
• Private SNPs (unique): no names … just positions
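The two-way categorisation can be sketched with simple counting. Everything here is hypothetical: the positions are made-up integers, the catalogue of named SNPs is a stand-in for whatever naming source is consulted, and the rule "shared means carried by more than one tester" is a plain reading of the labels above.

```python
from collections import Counter


def categorise_snps(results, known_names):
    """Label each position as (known|novel, shared|private).

    results: {tester: set of Y positions where a variant was called}
    known_names: {position: SNP name} for variants that already have a name
    """
    carriers = Counter(pos for positions in results.values() for pos in positions)
    categories = {}
    for pos, count in carriers.items():
        discovery = "known" if pos in known_names else "novel"
        sharing = "shared" if count > 1 else "private"
        categories[pos] = (discovery, sharing)
    return categories


# Hypothetical data for illustration only.
known = {1001: "SNP-A"}
results = {"G21": {1001, 1002, 1003}, "G57": {1001, 1002, 1004}}
before = categorise_snps(results, known)

# A third tester arrives and carries position 1004 as well.
results["G55"] = {1001, 1004}
after = categorise_snps(results, known)

print(before[1004], "->", after[1004])
```

Position 1004 moves from private to shared once a second carrier tests, which is why the slide calls the categorisation fluid.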
[Figure: Shared Novel Variants in the Z16437 subgroup, as reported by each source: FTDNA Results (FT), Project Admin (LL), Haplogroup Admins*, Alex Williamson (Big Tree), Nigel McCarthy (Munster), YFULL (YF)]
* Neal Downing, John Murphy, James Kane & Z255 Yahoo group
Gleeson Family Tree based on newly discovered SNP markers (Lisa Little, project member)
Z255 Haplogroup Project Colour Coded Spreadsheet
(John Murphy)
Gleeson-specific SNP markers
https://groups.yahoo.com/neo/groups/R1b-Z255-Project
James Kane’s tree
www.it2kane.org/matrix/R.html
https://www.familytreedna.com/groups/r-l21-south-irish/about/background
http://www.ytree.net
Alex Williamson’s “Big Tree”
… aka BY2853
Jan 2015
Apr 2015
Jun 2015
Oct 2015
www.ytree.net/DisplayTree.php?blockID=319&star=false
Clicking on a marker or name
brings up further analysis
www.ytree.net/MutMatrix.php
Grey = no coverage
Pink = marginal coverage
My simplistic interpretation
+ Definite
* Probable
** Possible
*** Unlikely
The Big Tree: R-A5629 Mutation Matrix of Shared SNPs
Currently Unique SNPs … 3 (1), 3 (2), 13 (5) = 19 (8)
http://www.ytree.net/SNPinfoForPerson.php?personID=1288 (Alex Williamson’s “Big Tree”)
YFULL Novel SNPs
www.yfull.com
• Are they really SNPs?
- different thresholds & filters
• SNPs trapped in Private Collections
- Private SNPs will be liberated as more people test & SNPs become “not private” anymore – move up into the shared area of the tree … but they will run out! When?
• No names, just locations
- will need to be translated into SNP names in time => consult Ybrowse, other utilities??
Inconsistency in “declaring a genuine SNP”
Different strokes for different folks
Who is right?
… or more accurately …
who has estimated correctly?
End Result
SNP = definite, probable, possible, or unlikely
… subject to change ... & Sanger Sequencing?
Some Bold Predictions …
Despite NGS, Sanger Sequencing will still be required
• Chip-based SNP testing will still be needed to confirm or refute discoveries made by NGS
• Multiple Deep Clade Panels will need to be created … for subclades, surnames, & genetic clusters
• SNP results consistent?
• Need to tidy it up
456 15-16
• SNPs are further up the tree than STRs
• Tell us nothing about branches on left
• Only use “definite SNPs” (not probable/possible)
• Private SNPs are still trapped in Private Collections
Mutation sequence?
BY2853 > A5629 > 456 …
> G68 (Glisson, Branch 14)
> A5628
> Y16880 (Branch 2,7,6)
> A660 (Branch 9)
http://freepages.genealogy.rootsweb.ancestry.com/~skibbgirl/McCarthyDNAProject/
Nigel McCarthy’s Z255 Group E
[Figure: tree of members G21, G22, G39, G42, G51, G54, G55, G57, G66, G68 and G73, annotated: no BY2852 block; extra marker; Private SNPs; 2 pink SNPs omitted; differing Modal Haplotype; <67 markers excluded]
Iain McDonald, The 2015 report to the U106 group (Sep 2015)
www.jb.man.ac.uk/~mcdonald/genetics/u106-geography-2015-revised.pdf
www.familytreedna.com/groups/tmrca-case-studies/about
Up till now, we know there are branches that come off the Modal
But which came first?
Can we place them in the correct order?
G57, 60393
G21, N74958
G55, 338070
G39, N101540
G51, 244645
• YFULL analysis offers TMRCA estimates for SNPs
… and includes Calculation Formula
[Figure: YFULL TMRCA estimate, range -60% to +50%]
TMRCA probability estimates (in generations) for G21 vs G57
Markers tested | GD | 5% | 50% (MLE) | 95% | Range (%)
12 | 1 | 3 | 17 | >24 | -82% to ???
25 | 1 | 1 | 7 | 20 | -85% to +186%
37 | 1 | 0 | 3 | 10 | -100% to +233%
67 | 2 | 1 | 4 | 11 | -75% to +175%
111 | 6 | 4 | 8 | 15 | -50% to +88%
495 | 24 | 6 | 9 | 12 | -33% to +33%
MLE, Maximum Likelihood Estimate(?)
• Ranges are wide & skewed toward distant generations
• 111 markers give the “best estimate” with the smallest upper range, but the upper limit is still almost double the mid-value
• Individually extracted 5%, 50% & 95% estimates (90% Confidence Interval)
• Markers tested: White = 111, Yellow = 67, Cream = 37, Blue = 25
• 50% probability estimate ranges from 1 to >24 generations
• Use triangulation to get better overall estimate?
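The "Range (%)" column in the table above appears to be the 5% and 95% estimates expressed relative to the 50% estimate. The sketch below recomputes it from the tabulated generations under that assumption. The function name is hypothetical, and `None` stands in for the open-ended ">24" entry.

```python
def percent_range(low, mid, high):
    """Express the 5% and 95% estimates as percentages of the 50% estimate."""
    lower = round((low - mid) / mid * 100)
    upper = None if high is None else round((high - mid) / mid * 100)
    return lower, upper


# markers tested -> (5%, 50%, 95%) in generations, as tabulated above.
estimates = {
    12: (3, 17, None),
    25: (1, 7, 20),
    37: (0, 3, 10),
    67: (1, 4, 11),
    111: (4, 8, 15),
    495: (6, 9, 12),
}

for markers, (low, mid, high) in estimates.items():
    print(markers, percent_range(low, mid, high))
```

The recomputed values agree with the table except for the 25-marker lower bound, which rounds to -86% here against the -85% shown on the slide.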
TMRCA Triangulation
[Figure: tree annotated with pairwise TMRCA estimates (in generations) and summary values at the branching points; axis ticks 750, 325, 50]
Will additional STR markers help refine TMRCA estimates?
• But … 5% differ? ... some are missing? ... not detected by NGS?
• 35 mutations between G21 & G55
• 24 mutations between G21 & G57
• 9 mutations between G21 & G57
http://dna-project.clan-donald-usa.org/tmrca.htm
[Figure: TMRCA Triangulation tree, shown again alongside the probability table for G21 vs G57]
[Figure: TMRCA Triangulation tree, annotated with MDKA Profiles]
http://gleesondna.blogspot.com
A Combined Mutation / Family History Tree
… using DNA markers when people run out
… is it possible?
Lessons Learned & Future Opportunities
• Transcription errors are easy => triple-check, automate
• Re STRs
– Lots of Parallel Mutations … where are the Back Mutations?
– 111 markers best define the branching pattern
– Placement of CDY & 464 is likely to be incorrect (esp. in upstream generations)
– Most project members have not tested other male cousins to triangulate on their MDKA
– Convergence may be a problem (even at 3/111)
– We need more people to test
– We need more people to upgrade to 111 markers
– YFULL analysis liberates 495 STRs
• Re SNPs
– Difficult to declare a genuine SNP
– Different SNPs from different lips
– Definite, probable, possible, unlikely
– Likely to be lots of false negatives (& false positives)
– No names (locations too long)
– Naming is unregulated
– Many SNPs trapped in Private Collections
– Current NGS is discovery, not confirmatory => further testing (with other NGS?) needed to confirm
• Re combining STRs & SNPs
– Adding SNPs changed the upper reaches of the tree
– SNPs are still located relatively upstream - STRs offer better definition downstream
– Start with the Modal of your Haplogroup subgroup
• Re TMRCA estimates
– SNP-based estimates work best for distant branching points (haplogroup projects)
– STR-based estimates have wide ranges and are skewed toward distant generations
– Even at 111, upper range ~ double the mid-value
– Even 495 markers have a wide range (+/- 33%)
• Re combining STRs, SNPs & genealogy
– We need to overlay documentary data on DNA
– Some pedigrees not supplied / incomplete
– Need to add MPRs to all (MDKA Profile)
– Need to take a One Name Study approach?
• Collate all Gleeson data worldwide
• Establish a relational database (Access?)
• Assign data to different family branches
• This early draft MHT serves as a useful basis
– Will evolve over time as more people test & upgrade
– Will facilitate collaboration between project members
– Will help attract new project members
Vision 2020
Where will we be in 5 years' time?
Here are some bold predictions …
What would happen if …
• Everyone upgraded to 111 markers?
– Better definition of branching pattern
– More precise TMRCA estimates (with narrower range)
• Everyone did the Big Y?
– SNPs only good for upstream branches? (<1500 AD)
– We will run out of Private SNPs
• Everyone tested on a Surname Specific Panel?
– Would elucidate branching pattern up to 1500 AD? Later?
• Everyone did Whole Genome Sequencing?
– No better than Big Y? Better coverage? Better read length?
– What will happen to Probable / Possible / Unlikely SNPs?
Some Bold Predictions …
• (To help stimulate discussion & to learn)
• What is most useful for Surname Projects – more SNPs or more STRs?
– More STRs … we will run out of Private SNPs
– 111 vs 50,000
– 500 vs 40?
• In 2020, FTDNA will offer 500 STRs for $129
Some Bold Predictions …
• How do we best generate a Surname-Specific SNP Panel?
– Q: How many discovery Big Y tests are needed to liberate sufficient Private SNPs to adequately define the Surname Panel?
– A: 5-10 Big Y tests per genetic cluster
– We need another few people to Big Y test, then generate the Surname Panel for Lineage II
• In 2020, FTDNA will offer over 4000 Surname Specific SNP Panels for $100 each
Generate MHTree
More tools
Lineage I
Lineage II
Lineage III
Lineage IV
Lineage II Mutation History Tree
Acknowledgements
• Bennett Greenspan
• Max Blankfield
• Janine Cloud
• FTDNA team
• Judy Claassen
• Lisa Little
• James Irvine
• Ralph Taylor
• John Cleary
• Haplogroup Admins
• John Murphy
• Neal Downing
• James Kane
• Alex Williamson
• Nigel McCarthy
• Dennis Wright
• Alasdair MacDonald
• YFULL team
The Genetic Genealogy Community

Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPirithiRaju
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringPrajakta Shinde
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxEran Akiva Sinbar
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensorsonawaneprad
 
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxGENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxRitchAndruAgustin
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxBerniceCayabyab1
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPirithiRaju
 
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptx
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptxECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptx
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptxmaryFF1
 

Dernier (20)

trihybrid cross , test cross chi squares
trihybrid cross , test cross chi squarestrihybrid cross , test cross chi squares
trihybrid cross , test cross chi squares
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdf
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather Station
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
 
ALL ABOUT MIXTURES IN GRADE 7 CLASS PPTX
ALL ABOUT MIXTURES IN GRADE 7 CLASS PPTXALL ABOUT MIXTURES IN GRADE 7 CLASS PPTX
ALL ABOUT MIXTURES IN GRADE 7 CLASS PPTX
 
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In DubaiDubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
 
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather Station
 
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical Engineering
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptx
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensor
 
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptxGENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
GENERAL PHYSICS 2 REFRACTION OF LIGHT SENIOR HIGH SCHOOL GENPHYS2.pptx
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
 
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptx
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptxECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptx
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptx
 

Building a Mutation History Tree

  • 1. Combining SNPs, STRs, & Genealogy to build a Surname Origins Tree Dr Maurice Gleeson 11th Annual FTDNA Conference 15th Nov 2015 http://gleesondna.blogspot.co.uk/ YouTube – DNA and Family History Research
  • 2. Google: YouTube Genetic Genealogy Ireland
  • 3. A Combined Mutation / Family History Tree … using DNA markers when people run out … is it possible? Can you do it?
  • 4. Topics for Discussion • Building a tree with STRs • Building a tree with SNPs • Combining STRs & SNPs • Dating branching points in the tree • Combining STRs, SNPs & genealogy • Opportunities for the years ahead
  • 5. Topics for Discussion • Building a tree with STRs • Building a tree with SNPs • Combining STRs & SNPs • Dating branching points in the tree • Combining STRs, SNPs & genealogy • Challenges for the years ahead
  • 6.
  • 7.
  • 8. Modal Haplotype for Lineage II • Lots of Parallel Mutations! o Back Mutations remain hidden • Is resolution enough to define the tree? • Is this the “best fit” model? 570 (17-18) CDYa (38>39) CDYa (38>39) 3 Branch numbers
  • 9. Courtesy of Ralph Taylor G64 G39 Fluxus cladogram • It can help - useful to check against the Hand-Drawn Tree • Shows “maximum parsimony” version • Cumbersome, fiddly, easy to make mistakes, difficult to interpret, time-consuming • Difficult to visualise as a “Family Tree” • Gives all markers equal weight & ignores differing mutation rates www.isogg.org/wiki/Cladogram
  • 10. Courtesy of Ralph Taylor G64 G39 Fluxus cladogram • Several “Best Fit” models - at least 8 BF models … - Tree is not anchored • No single “most likely” option • So not enough information at 37 markers to define the branching pattern • Parallel Mutations still persist - 390, 392, CDYa&b • Back Mutations also possible • Not clear which mutation came before which www.isogg.org/wiki/Cladogram
  • 11. 570 (17-18) CDYa (38>39) CDYa (38>39) Hand Drawn Tree 570 (17-18) CDYa (38>39) CDYa (38>39) Fluxus Tree v1 Branch numbers
  • 12. 570 (17-18) CDYa (38>39) CDYa (38>39) Hand Drawn Tree 570 (17-18) CDYa (38>39) CDYa (38>39) Fluxus Tree v1 Branch numbers
  • 13. Fluxus Cladogram (111 markers) G64 G39 G73 G64 G39 Fluxus Cladogram (37 markers) www.isogg.org/wiki/Cladogram Courtesy of Ralph Taylor
  • 14. Essential technology for project success
  • 16. Fluxus Cladogram (111 markers) G64 G39 G73 G64 G39 Fluxus Cladogram (37 markers) Courtesy of Ralph Taylor • No weighting … but mutation rates vary by a factor of 400 • James Irvine developed an algorithm for weighting markers: weighting = 99 × (1 – mutation rate/0.04)² https://en.wikipedia.org/wiki/List_of_Y-STR_markers
  • 17. www.isogg.org/wiki/Cladogram Courtesy of Ralph Taylor • Torso disappears • No alternative pathways = 1 single “Best Fit” model Fluxus Cladogram (111 markers) G64 G39 G73 Fluxus Cladogram (111 markers, weighted)
  • 18.
  • 19. Some markers behave unusually • Marker 389: this is tested in 2 parts – mutation in Part 1 is also counted in Part 2 => so just use Part 2 (389ii) … and we did! – www.familytreedna.com/learn/y-dna-testing/y-str/different-str-markers-dys389i-dys398ii-dys389-2-result-family-tree-dna-different-genographic-project/ • Multi-copy markers 464abcd (but also 385, 459, YCAII, CDY, DYF395S1, 413) – mutations in multi-copy markers may not be in the correct order – Kittler test defines relative positions for 385 … not applicable here? – www.familytreedna.com/learn/y-dna-testing/y-str/infinite-allele-palindromic-markers/ – http://www.isogg.org/wiki/DYS_464 • Multi-copy marker 464abcd: 2 types = c & g – 464x test defines which type (but not position) … not accounted for! – http://www.dna-fingerprint.com/static/PalindromicPres.pdf • 464abcd, CDYa & b: fast-mutating palindromic markers – http://www.isogg.org/wiki/RecLOH
  • 20. Fluxus Cladogram (111 markers, weighted) Fluxus Cladogram (111 markers, weighted, no CDY,464)
  • 21.
  • 22. Which is more accurate? with or without CDY & 464? or some version in between?
  • 23. How likely is it that 464 & CDY will screw things up? • Gleeson surname origin = 1000 AD => Surname has had 1000 years to mutate = 33.3 generations (30 y/gen) • How many mutations would you expect in 1000 years? • CDY mutation rate = 0.03531 / gen = 1.176 per member = c.16 mutations for all 14 branches of Lineage II. Observed number is 4 for CDYa, and 3 for CDYb => 12/16 and 13/16 mutations respectively are hidden? – So predictions based on CDY will be incorrect (12/16 + 13/16)/2 = 78% of the time? • 464 mutation rate = 0.00566 / gen = 0.188 per member = 2.6 per 14 members (on each of 464abcd). Observed number is 0 for 464a & d, and 2 for 464b & c => 2.6/2.6 & 0.6/2.6 mutations respectively are hidden? – So predictions based on 464 will be incorrect (2.6/2.6 + 0.6/2.6)/2 = 62% of the time? https://en.wikipedia.org/wiki/List_of_Y-STR_markers
  • 24. How likely is it that 464 & CDY will screw things up? • Less of a problem in those branches related within the last 200-300 years? – less time to mutate back – lower chance of back mutations – more useful for branch-defining • More of a problem with those branches more distantly related (600-1000 yrs)? – more time to mutate back – higher chance of back mutations – less useful for branch-defining => Choose v3a (i.e. use CDY & 464 data) • Tree will be less than 100% correct • Be especially wary of mutations in more distant reaches of the tree https://en.wikipedia.org/wiki/List_of_Y-STR_markers
  • 26. Caveats & Limitations • Missing data – Fluxus fills in the blanks - is its “best guess” valid? – No adequate mutation rates for many markers • The Tree is not yet “anchored” – More so in the upper reaches of the tree (sub-branches seem stable) – Several interpretations are still possible, even at 111 markers (v3a vs v4) – Will this reduce as more people test? or upgrade? – Are there hidden Back Mutations? • Tree may be skewed by recent mutations (last 5-6 generations) => Triangulate on each MDKA – Test at least 2 known distant cousins from each family branch in order to characterise the haplotype of each MDKA – Helps eliminate recent mutations which might cloud the interpretation – Costly … $339 for a 111 marker test … x2 = $678 • Is there Convergence in the Tree? (e.g. 3/111) www.isogg.org/wiki/Fluxus
  • 27. Topics for Discussion • Brief overview of key concepts • Building a tree with STRs • Building a tree with SNPs • Combining STRs & SNPs • Dating branching points in the tree • Combining STRs, SNPs & genealogy • Challenges for the years ahead
  • 29. Is fine-scale SNP testing the best method of determining branching patterns within a Genetic Family? … how to do it as cheaply & efficiently as possible?
  • 30. Google: YouTube Genetic Genealogy Ireland
  • 31. Working with SNPs – Opportunities & Challenges • Declaring SNPs - false positives • Missing SNPs - false negatives • Constant change – “Known, Novel, Shared & Private” • No name, just a location • SNP naming process unregulated – Same SNP, different names • Making results user-friendly • Lots of help available – independent verification & interpretation possible
  • 32. Problems encountered with “declaring a genuine SNP”
      Problem | Reason(s) | Implication
      Detection | No coverage | False Negative – SNP is present on Y but remains undetected
      Low no. of Calls | Poor coverage | False Negative – SNP present but fails to meet threshold criteria
      Recognition | Detection Filter / Threshold too strict? | False Negative – SNP is present in data but missed by analysis; detectable by manual analysis of possible SNPs on BAM file
      Localisation | Difficult location on Y (centromere, palindrome, in STR / repetitive region) | False Positive or Negative – SNP may be genuine but its exact position cannot be known for sure or may vary
      Instability | Unstable SNP – frequent & unpredictable mutation | False Positive or Negative – SNP may or may not be genuine
      InDels | Not SNPs, but rather a deletion (usually) | False Positive or Negative – may or may not be genuine
      So is the SNP really present? … or absent? Just because it is detected, doesn’t mean it is there … Just because it’s not detected, doesn’t mean it isn’t there
  • 33. SNPs Known SNPs (already discovered) New SNPs (never discovered before) Shared (with someone else) Not shared (Unique / Private) “Known, Novel, Shared & Private” – the fluid categorisation of SNPs
  • 34.
  • 36. Private SNPs (unique) No names … just positions
  • 37. FTDNA Results (FT) Project Admin (LL) Haplogroup Admins* Alex (Big Tree) Williamson Nigel (Munster) McCarthy YFULL (YF) 11 2 3 2 1 4 Shared Novel Variants in Z16437 subgroup * Neal Downing, John Murphy, James Kane & Z255 Yahoo group
  • 39. Gleeson Family Tree based on newly discovered SNP markers Lisa Little, project member
  • 40. Z255 Haplogroup Project Colour Coded Spreadsheet (John Murphy) Gleeson-specific SNP markers https://groups.yahoo.com/neo/groups/R1b-Z255-Project
  • 43. … aka BY2853 Jan 2015 Apr 2015 Jun 2015 Oct 2015 www.ytree.net/DisplayTree.php?blockID=319&star=false Clicking on a marker or name brings up further analysis
  • 44. www.ytree.net/MutMatrix.php Grey = no coverage Pink = marginal coverage My simplistic interpretation + Definite * Probable ** Possible *** Unlikely The Big Tree: R-A5629 Mutation Matrix of Shared SNPs
  • 45. Currently Unique SNPs … 3 (1), 3 (2), 13 (5) = 19 (8) http://www.ytree.net/SNPinfoForPerson.php?personID=1288 Alex Williamson’s “Big Tree”
  • 46. YFULL Novel SNPs Alex Williamson’s “Big Tree” www.yfull.com
  • 47. • Are they really SNPs? - different thresholds & filters • SNPs trapped in Private Collections - Private SNPs will be liberated as more people test & SNPs become “not private” anymore – move up into the shared area of the tree … but they will run out! When? • No names, just locations - will need to be translated into SNP names in time => consult Ybrowse, other utilities?? Inconsistency in “declaring a genuine SNP”
  • 48. Different strokes for different folks Who is right? … or more accurately … who has estimated correctly? End Result SNP = definite, probable, possible, or unlikely … subject to change ... & Sanger Sequencing?
  • 49. Despite NGS, Sanger Sequencing will still be required • Chip-based SNP testing will still be needed to confirm or refute discoveries made by NGS • Multiple Deep Clade Panels will need to be created … for subclades, surnames, & genetic clusters Some Bold Predictions …
  • 50. Topics for Discussion • Brief overview of key concepts • Building a tree with STRs • Building a tree with SNPs • Combining STRs & SNPs • Dating branching points in the tree • Combining STRs, SNPs & genealogy • Challenges for the years ahead
  • 51. • SNP results consistent? • Need to tidy it up 456 15-16
  • 52. • SNPs are further up the tree than STRs • Tell us nothing about branches on left • Only use “definite SNPs” (not probable/possible) • Private SNPs are still trapped in Private Collections Mutation sequence? BY2853 > A5629 > 456 … > G68 (Glisson, Branch 14) > A5628 > Y16880 (Branch 2,7,6) > A660 (Branch 9)
  • 54. G54 G39 G51 G73 G66 G22 G42 G55 G57 G21 Nigel McCarthy’s Z255 Group E http://freepages.genealogy.rootsweb.ancestry.com/~skibbgirl/McCarthyDNAProject/ G68 No BY2852 block Extra marker Private SNPs (×3) 2 pink SNPs omitted Differing Modal Haplotype <67 markers excluded
  • 55. Topics for Discussion • Brief overview of key concepts • Building a tree with STRs • Building a tree with SNPs • Combining STRs & SNPs • Dating branching points in the tree • Combining STRs, SNPs & genealogy • Challenges for the years ahead
  • 56. Iain McDonald, The 2015 report to the U106 group (Sep 2015) www.jb.man.ac.uk/~mcdonald/genetics/u106-geography-2015-revised.pdf
  • 57. www.familytreedna.com/groups/tmrca-case-studies/about Up till now, we know there are branches that come off the Modal But which came first? Can we place them in the correct order?
  • 58.
  • 59. G57, 60393 G21, N74958 G55, 338070 G39, N101540 G51, 244645 • YFULL analysis offers TMRCA estimates for SNPs … and includes Calculation Formula -60% to +50%
  • 61. G21 G57
      Markers tested | GD | Probability 5% | Probability 50% (MLE) | Probability 95% | Range (%)
      12 | 1 | 3 | 17 | >24 | -82% to ???
      25 | 1 | 1 | 7 | 20 | -85% to +186%
      37 | 1 | 0 | 3 | 10 | -100% to +233%
      67 | 2 | 1 | 4 | 11 | -75% to +175%
      111 | 6 | 4 | 8 | 15 | -50% to +88%
      495 | 24 | 6 | 9 | 12 | -33% to +33%
      MLE, Maximum Likelihood Estimate(?) • Ranges are wide & skewed toward distant generations • 111 markers gives the “best estimate” with smallest upper ranges but still almost double the mid-value
  • 62. • Individually extracted 5%, 50% & 95% estimates (90% Confidence Interval) • Markers tested: White = 111, Yellow = 67, Cream = 37, Blue = 25 • 50% probability estimate ranges from 1 to >24 generations • Use triangulation to get better overall estimate? TMRCA Triangulation
  • 64. Will additional STR markers help refine TMRCA estimates? • But … 5% differ? ... some are missing? ... not detected by NGS? • 35 mutations between G21 & G55 • 24 mutations between G21 & G57 • 9 mutations between G21 & G57
  • 66. G21 G57
      Markers tested | GD | Probability 5% | Probability 50% (MLE) | Probability 95% | Range (%)
      12 | 1 | 3 | 17 | >24 | -82% to ???
      25 | 1 | 1 | 7 | 20 | -85% to +186%
      37 | 1 | 0 | 3 | 10 | -100% to +233%
      67 | 2 | 1 | 4 | 11 | -75% to +175%
      111 | 6 | 4 | 8 | 15 | -50% to +88%
      495 | 24 | 6 | 9 | 12 | -33% to +33%
  • 68. Topics for Discussion • Brief overview of key concepts • Building a tree with STRs • Building a tree with SNPs • Combining STRs & SNPs • Dating branching points in the tree • Combining STRs, SNPs & genealogy • Challenges for the years ahead
  • 70.
  • 73. A Combined Mutation / Family History Tree … using DNA markers when people run out … is it possible?
  • 74. Topics for Discussion • Brief overview of key concepts • Building a tree with STRs • Building a tree with SNPs • Combining STRs & SNPs • Dating branching points in the tree • Combining STRs, SNPs & genealogy • Opportunities for the years ahead
  • 75. Lessons Learned & Future Opportunities • Transcription errors are easy => triple-check, automate • Re STRs – Lots of Parallel Mutations … where are the Back Mutations? – 111 markers best define the branching pattern – Placement of CDY & 464 is likely to be incorrect (esp. in upstream generations) – Most project members have not tested other male cousins to triangulate on their MDKA – Convergence may be a problem (even at 3/111) – We need more people to test – We need more people to upgrade to 111 markers – YFULL analysis liberates 495 STRs
  • 76. Lessons Learned & Future Opportunities • Re SNPs – Difficult to declare a genuine SNP – Different SNPs from different lips – Definite, probable, possible, unlikely – Likely to be lots of false negatives (& false positives) – No names (locations too long) – Naming is unregulated – Many SNPs trapped in Private Collections – Current NGS is discovery, not confirmatory => further testing (with other NGS?) needed to confirm
  • 77. Lessons Learned & Future Opportunities • Re combining STRs & SNPs – Adding SNPs changed the upper reaches of the tree – SNPs are still located relatively upstream - STRs offer better definition downstream – Start with the Modal of your Haplogroup subgroup • Re TMRCA estimates – SNP-based estimates work best for distant branching points (haplogroup projects) – STR-based estimates have wide ranges, and are skewed toward distant generations – Even at 111, upper range ~ double the mid-value – Even 495 markers has a wide range (+/- 33%)
  • 78. Lessons Learned & Future Opportunities • Re combining STRs, SNPs & genealogy – We need to overlay documentary data on DNA – Some pedigrees not supplied / incomplete – Need to add MPRs to all (MDKA Profile) – Need to take a One Name Study approach? • Collate all Gleeson data worldwide • Establish a relational database (Access?) • Assign data to different family branches • This early draft MHT serves as a useful basis – Will evolve over time as more people test & upgrade – Will facilitate collaboration between project members – Will help attract new project members
  • 79. Vision 2020 Where will we be in 5 years’ time? Here are some bold predictions …
  • 80. What would happen if … • Everyone upgraded to 111 markers? – Better definition of branching pattern – More precise TMRCA estimates (with narrower range) • Everyone did the Big Y? – SNPs only good for upstream branches? (<1500 AD) – We will run out of Private SNPs • Everyone tested on a Surname Specific Panel? – Would elucidate branching pattern up to 1500 AD? Later? • Everyone did Whole Genome Sequencing? – No better than Big Y? Better coverage? Better read length? – What will happen to Probable / Possible / Unlikely SNPs?
  • 81. Some Bold Predictions … • (To help stimulate discussion & to learn) • What is most useful for Surname Projects – more SNPs or more STRs? – More STRs … we will run out of Private SNPs – 111 vs 50,000 – 500 vs 40? • In 2020, FTDNA will offer 500 STRs for $129
  • 82. Some Bold Predictions … • How do we best generate a Surname-Specific SNP Panel? – Q: How many discovery Big Y tests are needed to liberate sufficient Private SNPs to adequately define the Surname Panel? – A: 5-10 Big Y tests per genetic cluster – We need another few people to Big Y test, then generate the Surname Panel for Lineage II • In 2020, FTDNA will offer over 4000 Surname Specific SNP Panels for $100 each
  • 83. Generate MHTree More tools Lineage I Lineage II Lineage III Lineage IV Lineage II Mutation History Tree
  • 84. Acknowledgements • Bennett Greenspan • Max Blankfield • Janine Cloud • FTDNA team • Judy Claassen • Lisa Little • James Irvine • Ralph Taylor • John Cleary • Haplogroup Admins • John Murphy • Neal Downing • James Kane • Alex Williamson • Nigel McCarthy • Dennis Wright • Alasdair MacDonald • YFULL team The Genetic Genealogy Community
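
The arithmetic on slides 16 and 23 above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the weighting is read as 99 × (1 – rate/0.04) squared (an interpretation of the slide text), and the slide's own figures of 30 years per generation, 1000 years and 14 branches are assumed. The slide rounds the expected counts to 16 and 2.6 before dividing, so unrounded results differ slightly.

```python
# Illustrative sketch; names are hypothetical, figures are those quoted on the slides.

GENERATIONS = 1000 / 30  # 1000 years at 30 years per generation (slide 23)
MEMBERS = 14             # branches of Lineage II (slide 23)


def marker_weight(mutation_rate: float, max_rate: float = 0.04) -> float:
    """Marker weighting from slide 16, assuming the formula ends in a square."""
    return 99 * (1 - mutation_rate / max_rate) ** 2


def expected_mutations(rate_per_generation: float,
                       generations: float = GENERATIONS,
                       members: int = MEMBERS) -> float:
    """Mutations expected across all members since the surname arose."""
    return rate_per_generation * generations * members


def hidden_fraction(expected: float, observed: float) -> float:
    """Share of the expected mutations that were not observed (never below 0)."""
    return max(expected - observed, 0.0) / expected


if __name__ == "__main__":
    for name, rate in (("CDY", 0.03531), ("464", 0.00566)):
        print(f"{name}: weight {marker_weight(rate):.1f}, "
              f"expected mutations {expected_mutations(rate):.1f}")

    # Slide 23 uses the rounded expectations (16 for CDY, 2.6 for 464).
    cdy_hidden = (hidden_fraction(16, 4) + hidden_fraction(16, 3)) / 2
    dys464_hidden = (hidden_fraction(2.6, 0) + hidden_fraction(2.6, 2)) / 2
    print(f"CDY hidden: {cdy_hidden:.0%}, 464 hidden: {dys464_hidden:.0%}")
```

Under this reading a fast marker such as CDY receives a weight close to zero while slow markers approach 99, which matches the intent described on slide 16 of down-weighting fast-mutating markers.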

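
The "Range (%)" column in the TMRCA table on slide 61 appears to be the distance of the 5% and 95% estimates from the 50% estimate, expressed as a percentage of the 50% value. The sketch below recomputes it under that assumption; the function name and data layout are hypothetical, and the values are the G21 vs G57 figures from the slide. The slide's percentages seem to be truncated in places, so rounded results can differ by one point.

```python
from typing import List, Optional, Tuple


def range_percent(low: float, mid: float,
                  high: Optional[float]) -> Tuple[float, Optional[float]]:
    """Return (lower, upper) bounds as percentages relative to the 50% estimate."""
    lower = (low - mid) / mid * 100
    upper = None if high is None else (high - mid) / mid * 100
    return lower, upper


# (markers tested, 5%, 50%, 95%) in generations, from slide 61.
# None stands for the open-ended ">24" upper bound at 12 markers.
ESTIMATES: List[Tuple[int, float, float, Optional[float]]] = [
    (12, 3, 17, None),
    (25, 1, 7, 20),
    (37, 0, 3, 10),
    (67, 1, 4, 11),
    (111, 4, 8, 15),
    (495, 6, 9, 12),
]

if __name__ == "__main__":
    for markers, low, mid, high in ESTIMATES:
        lower, upper = range_percent(low, mid, high)
        upper_text = "???" if upper is None else f"{upper:+.0f}%"
        print(f"{markers:>3} markers: {lower:+.0f}% to {upper_text}")
```

Laid out this way, the asymmetry noted on the slide is easy to see: at every marker level below 495 the upper bound sits further from the mid-value than the lower bound does.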
Editor's notes

  1. What are the chances of 5 parallel mutations!? There are several “best fit” models, so there is not enough resolution at 37 markers to define the branching pattern. Parallel mutations are unavoidable – either 390 & 392, CDYa & b … There are several pathways to the final mutations per member ... but it is not clear which came before which.
  2. Looks like the constellation of Ursa Major, or instructions on how to assemble Swedish furniture. There are several “best fit” models, so there is not enough resolution at 37 markers to define a single “most probable” option for the branching pattern. Parallel mutations are unavoidable – either 390 & 392, CDYa & b … There are several pathways to the final mutations per member ... but it is not clear which came before which. The Best Fit option also includes possible Back Mutations.
  3. There are several “best fit” models, so there is not enough resolution at 37 markers to define a single “most probable” option for the branching pattern. Parallel mutations are unavoidable – either 390 & 392, CDYa & b … There are several pathways to the final mutations per member ... but it is not clear which came before which. The Best Fit option also includes possible Back Mutations.
  4. If we compare the Hand Drawn Tree (HDT) with the Fluxus-based tree (or rather 1 version thereof, as several different versions are possible), the main area where Fluxus improves on what we already have in the HDT is in the amalgamation of Branches 2 and 6. If we move them over beside each other, you can see that both branches have parallel mutations on marker 456.
  5. If we take a closer look at these branches, let’s assume that the mutation in marker 456 on Branch 6 occurred before the mutation in marker 389. This allows us to create Branch 6 as a sub-branch of Branch 2, rather than both branches being offshoots from the modal haplotype. This new branching configuration is then reinserted into our tree, and Branches 3, 4 & 5 are moved over to make room. Only Branches 2 and 6 are reconfigured … everything else remains the same … all other Parallel Mutations still persist. So our Hand Drawn Tree comes out pretty well compared to the Fluxus-based Tree. But that is only at 37 markers … when we have to deal with 111 markers, and many more project members, the option of a Hand Drawn Tree becomes unfeasible and we have to turn to Fluxus or other software to help us achieve the Best Fit Tree.
  6. Even at 111 markers there is no overall most likely Best Fit Tree: there are 2 possible pathways to G71, and 2 to G22. However, the rest of the branches appear to be relatively well demarcated by the increase in the number of markers. One problem, however, is that not everyone has tested to 111 markers. “99” can be put in place of missing marker values, thus allowing the programme to insert the “best fit” marker value for those that are missing, but there is no guarantee that the programme has chosen the appropriate / correct values. Nevertheless, this cladogram can now be converted into a Family Tree. Some additional members have been added to the tree, namely G02, G68, G70, G71, & G73.
  7. Essential piece of technology
  8. Compared to the 37-marker based Fluxus Tree … - some of the branches haven’t changed at all - some new branches have been added as new members have joined (11-14, in green) - Branch 1 has accumulated a few more mutations - Branches 4 & 5 have retained their relative position, with 5 being an offshoot of 4 - Branches 2 & 6 have retained their shape (6 an offshoot of 2), as have 9 & 10, and have accumulated a few more mutations as well. But major changes have happened to Branches 7 and 8 - Branch 7 (G55) used to be most closely positioned to Branch 8 (G66): GD was 2/37 but is now 11/111. Now it is closest to Branch 6 (G21): GD was 4/37, now 8/111. - Similarly, Branch 8 (G66) has moved over to an entirely different part of the tree. It is now closest to Branch 4 (G22): GD was 4/37, now 2/111 (reconfiguring the tree has removed a parallel mutation at 464b). Some Parallel Mutations still exist (CDYa 38-39, 464c 17-16, CDYb 40-39) and others have appeared (461, 390), but others have disappeared completely (464b, 456). A Back Mutation is now evident in Branch 1 (G05): 464b mutates forward from 16-17, then back from 17-16, then mutates again from 16-15 in Branch 11 (G02) ?? Generally these developments seem to represent [??a significant step forward], even if James and Ralph aren’t too confident they have hit upon a reliable weighting algorithm, or that the basic mutation rates used (Chandler/Ancestry) are reliable. But it does seem that the use of some weighting algorithm, even if its exact form and content are unreliable, is better than none.
  9. From James Irvine: “Conventional fluxus diagrams give equal weight to all markers, regardless of the fact that their mutation rates vary by a factor of 400, or perhaps arbitrarily exclude the fastest moving markers such as the CDYs. James Irvine introduced me to the concept of weighting markers, and he and Ralph Taylor, who has kindly produced all my fluxus diagrams, have come up with the simple weighting algorithm of: weighting = 99 × (1 – mutation rate/0.04)². Note the application of this algorithm has the effect of making the “torso” or green ring disappear, although this is only significant if it clearly explains the data in a way that is still consistent with a most parsimonious version of the tree.”
  10. Changes are only apparent in the upper reaches of the tree. There is no change in the lower area: the relationship of sub-branches remains the same, with the exception of Branch 1 (G05). This no longer seems as closely related to Branches 8, 4 & 5 as previously. It may be more closely related to Branch 13 (G70). G70 (Branch 13) now does not come directly off the modal but has a mutation in CDYb. Following a further mutation (in 464c), 2 other branches now split off: Branch 10 (G54) and Branch 1 (G05). This new configuration allows us to remove some of the mutations that were indicated in some of the branches (now crossed out in red) and reposition them (in black) to other areas of the tree (eg on Branch 1, the crossed-out markers CDYb & 464c are repositioned further up the tree). But during this process, Nigel McCarthy spotted an error in Branch 6: GATA A10 is only present in Branch 6 (G21) and not in Branch 7 (G55). This turned out to be a transcription error during the initial transfer of the data from FTDNA to WFN. The lesson here is: transcription errors are easy to make and happen all the time, so we need to double-check and triple-check all these values. So now Branch 7 is no longer a sub-branch of Branch 6.
  11. Caveats: 1) Fluxus fills in the blanks / missing data - the question remains: is its “best guess” valid? 2) Some markers behave unusually: - 464: mutations may not be in the correct order; a Kittler test is needed to define relative positions - 389: the marker is in two parts, and a mutation in the first part is also counted as a mutation in the second part. 3) Some markers (esp. 68-111) have no modal value - we need more people to test, & at higher levels; these markers may become differentiating in the future. 4) The Tree may be skewed by recent mutations (ie within the past 5-6 generations). Ideally we would test at least 2 known distant cousins from each family branch in order to characterise the haplotype of each MDKA. Triangulation on all MDKAs would help eliminate recent mutations which might cloud the interpretation of the tree beyond the level of the MDKAs.
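The 389 caveat can be shown with a short Python sketch. It assumes the usual reporting convention in which the second value (389II) includes the first (389I), so subtracting one from the other isolates the second part; the values are invented:

```python
def split_389(dys389i, dys389ii):
    """Return (first part, second part only), assuming 389II is reported
    as the total that already includes 389I."""
    return dys389i, dys389ii - dys389i


def distance_389(a, b, adjust=True):
    """Stepwise difference on the 389 pair, with or without the adjustment."""
    if adjust:
        a, b = split_389(*a), split_389(*b)
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


# Hypothetical values: one man has a single extra repeat in the first part
modal = (13, 29)
variant = (14, 30)

print(distance_389(modal, variant, adjust=False))  # 2 (one mutation counted twice)
print(distance_389(modal, variant, adjust=True))   # 1
```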
  12. G71 has become part of G22. G22 is now a sub-branch of G66. G02 is a separate branch from the Modal and no longer a sub-branch of G05. G70 remains in roughly the same relation to other branches, as do G57, G55, G21, G68, G54 & G51. So removing the unreliable markers CDY and 464 does not result in substantial changes to most of the tree, because there are other mutations that maintain the tree structure / branching pattern.
  16. SNP testing: SNP testing is required to confirm haplogroup assignments, to learn more about your deep ancestry and to rule out false positive matches. Y-SNP chip tests are available from the Genographic Project and BritainsDNA. More comprehensive sequencing tests using next generation sequencing technology are available from Full Genomes Corporation and Family Tree DNA. Single SNPs can be ordered from Family Tree DNA and YSEQ. For advice on SNP testing consult the project administrator of the relevant Y-DNA haplogroup project. Single SNPs: There are currently two companies that offer single SNPs. Family Tree DNA offer a range of single SNPs at US $39 per SNP.[1] SNPs can only be ordered by existing FTDNA customers who have already taken a Y chromosome DNA test with the company. In October 2013 over 3500 individual SNPs were available to order from FTDNA. The placement on the phylogenetic tree is unknown for most of them. SNPs with a known placement are highlighted in the customer's personal results page under the Y-DNA Haplotree & SNPs section. Customers can also request that new SNPs are added to the catalogue. YSEQ is a new company established by Thomas and Astrid Krahn in November 2013. YSEQ offer a potentially unlimited menu of single SNPs to order. For further details of this company see the blog post by Debbie Kennett entitled YSEQ a new company offering single SNPs. Full Genomes Corporation have indicated that they will soon be offering single SNPs for sale. Deep clade tests: Family Tree DNA used the term "deep clade test" to refer to a panel of Y-chromosome SNP tests. This panel was intended to establish which particular subclade the Y-chromosome belonged to. The deep clade test was effectively superseded by the new Geno 2.0 test from the Genographic Project. This new test was introduced in the autumn/fall of 2012 and tests over 12,000 Y-DNA SNPs. 
In January 2013 FTDNA announced that they were removing the deep clade test from sale.[2] A link is now provided that will allow people to order the Genographic 2.0 test, whose Y-SNP results can be transferred back to Family Tree DNA. Family Tree DNA announced at their conference in November 2013 that they would be reintroducing some deep clade tests in 2014, probably in the first quarter of the year.
  17. Sometimes true SNPs are not detected by the machine (FALSE NEGATIVE) due to poor coverage. Sometimes they are detected by the machine but not recognised by the analyst / analyser: some true SNPs are missed, and James / John picks them up (FALSE NEGATIVE). If they are detected (by machine & analyst), some are clearly true positives, others are of acceptable quality, others are ambiguous, others are unreliable, and others are clearly not true positives (ie false matches). But each of these assessments could be true or false or unsure. How many of each will eventually be true or false SNPs? What is the sensitivity & specificity? Find out the L21 story. The Detection Threshold / Filter can include criteria of coverage, quality, region of the Y, multiple copy on the Y, and known presence in multiple haplogroups. From John Cleary: As for the Recognition one, I’m not convinced by the “Problem with recognition algorithm” issue.  Do we have concrete examples of such problems?  After all, we’re talking about something that is essentially comparing two strings of text symbols with each other, which sounds like a pretty simple kind of algorithm to write, for those who can do such a thing.  It seems to me that when a SNP is missed – if there are such cases – it will be because of other filters written into the algorithm, in other words it is a type of the problem in the line above when something is rejected because it doesn’t meet the threshold criteria.  These can be criteria of coverage, quality, region of the Y, multiple copy on the Y, known presence in multiple haplogroups etc.  If these filters are set too strictly, then viable SNP candidates can be rejected and never be thrown up for manual analysis.      I doubt whether ‘manual analysis of BAM’ is really feasible if positions for investigation are not being thrown up by a prior automated search of the BAM file.  We can’t eyeball 14 million positions manually.  
We might get lucky in some cases, but what we really want is a soft set of filters that will throw up borderline cases for manual investigation, and err on the side of the false positive, so we can investigate and reject them if they are flawed.  Do we know that our friends at YFull / FGC / ClarifY are not in fact doing exactly this?   So perhaps the Recognition one could be something like – Filters/algorithms insensitive to borderline cases??  False Positive – SNP can be investigated by manual analysis of BAM file; False Negative – SNP will be missed?? And I think we should build a log of cases of the latter type, so that we know it is actually happening.
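The "soft set of filters" idea can be sketched in Python as a three-way triage rather than a pass/fail cut. Everything here is hypothetical: the field names, the thresholds and the margin are invented for illustration and do not describe how YFull, FGC or any other analyser actually works:

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate SNP position thrown up by an automated scan (hypothetical fields)."""
    position: int
    read_depth: int
    quality: float
    in_ambiguous_region: bool  # eg palindromic or multi-copy region of the Y


def triage(c, min_depth=10, min_quality=30.0, soft_margin=0.5):
    """Return 'accept', 'review' or 'reject'.

    Borderline candidates go to 'review' (manual check of the BAM file)
    instead of being silently rejected, erring on the side of false positives.
    """
    if c.in_ambiguous_region:
        return "review"
    if c.read_depth >= min_depth and c.quality >= min_quality:
        return "accept"
    if c.read_depth >= min_depth * soft_margin and c.quality >= min_quality * soft_margin:
        return "review"
    return "reject"


candidates = [
    Candidate(position=1001, read_depth=25, quality=40.0, in_ambiguous_region=False),
    Candidate(position=1002, read_depth=6, quality=22.0, in_ambiguous_region=False),
    Candidate(position=1003, read_depth=2, quality=5.0, in_ambiguous_region=False),
]

for c in candidates:
    print(c.position, triage(c))
```

Logging every 'review' and 'reject' decision alongside its eventual outcome would give the record of missed cases that John suggests building.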
  18. Only interested in zero difference. Known SNPs: high & medium confidence. Novel Variants: high only. Shared NV: high & medium. Unique NV: high & medium.
  19. Men whose NGS data have been fully analyzed are indicated with a grey background color. Red is used for men whose data has not yet been fully analyzed. His position on the tree is not yet final, and will in general be downstream of the current position. He may not be positive for all the SNPs/INDELs in the block he descends from. A green SNP name with a '?' indicates that the SNP's status for the block is uncertain, but assumed to be positive. The same SNP probably occurs in an upstream block. It will be necessary to check BAM files or perhaps Sanger sequence some men to prove the result. A red SNP name with a '?' indicates that the SNP's status for the block is uncertain, but assumed to be negative. The same SNP probably occurs in a parallel block. It will be necessary to check BAM files or perhaps Sanger sequence some men to prove the result. Mutations written with a red background fall within a region of the Y chromosome, such as the palindromic region, which has left the position of the mutation ambiguous. The true mutation may be at the indicated position, or at any one of a number of alternate positions.
  20. In Alex’s Big Tree, we are looking at only those SNPs that are shared between the men who have currently tested. Private SNPs are excluded. His Tree is characterised by Undifferentiated SNP blocks. It’s useful to look at the progression of Alex’s Big Tree over time. Most of this happened in 2015, so these results are really very new & the science is changing rapidly. In the early days, the 2 Gleason men were lumped together with a Carroll man pending analysis of the second Gleeson man’s results (indicated by the red background). But by April, Alex’s analysis split the Carroll man from the Gleason men. And when my Dad did his Big Y, his results split the Gleason group in two. Notice how all the SNPs apart from 1 (A660) have NOT been named and are identified by their location numbers only - this is a challenge because quoting these numbers, using them in conversation or checking them is unwieldy. But by June, some of the SNPs had names, some were still referred to by location only, and several had 2 separate names - another challenge: a single name would help avoid confusion. And by Oct, a Glisson man split the Gleeson bunch into 3 distinct branches, and also split the A5629 SNP block into 2 - instead of a block of 5 SNPs, it is now a block of 4 SNPs with 1 SNP breaking away to form its own sub-branch. And look at the Carroll man: he has now been joined by a second Carroll man and now there is a whole block of Shared SNPs above their names. These would have been in the first Carroll man’s Private Collection of SNPs prior to the arrival of the second Carroll man. This nicely shows how, as more people test, more SNPs will move from individuals’ Private Collections into the Shared SNP sections of the Big Tree.
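The move from Private to Shared can be illustrated with a few lines of Python. This is only a sketch of the bookkeeping, not how the Big Tree is built, and the tester labels and SNP names are placeholders:

```python
from collections import Counter


def shared_and_private(results):
    """results maps tester -> set of derived SNPs.

    A SNP is 'shared' if at least two testers have it, otherwise it is
    'private' to the one tester who has it.
    """
    counts = Counter(snp for snps in results.values() for snp in snps)
    shared = {snp for snp, n in counts.items() if n >= 2}
    private = {who: {s for s in snps if counts[s] == 1} for who, snps in results.items()}
    return shared, private


# Placeholder data: SNP names are invented
before = {
    "Carroll 1": {"snp_a", "snp_b", "snp_c"},
    "Gleeson 1": {"snp_a", "snp_x"},
}
after = dict(before, **{"Carroll 2": {"snp_a", "snp_b", "snp_c", "snp_z"}})

for label, data in (("before", before), ("after", after)):
    shared, private = shared_and_private(data)
    print(label, "shared:", sorted(shared), "| Carroll 1 private:", sorted(private["Carroll 1"]))
```

Before the second Carroll man tests, snp_b and snp_c sit in the first man's private list; afterwards they appear in the shared set and his private list is empty.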
  21. Grey = no coverage; Pink = marginal coverage; + = Definite; * = Probable; ** = Possible; *** = Unlikely
  22. L21 story
  23. First, let’s identify those members who have undergone SNP-testing (all have done the Big Y test): Red arrow = Branch 9; Yellow arrow = Branch 14; Purple arrow = Branch 2 (with sub-branches 6 and 7). So at first glance, the SNPs seem consistent with the present version of the Fluxus-based Tree. The ancient Glisson branch (14) is completely separate from all others and characterised by the absence of SNP A5628. 2 of the 3 brothers in Branch 9 have tested and are grouped together under the same SNP mutation block, A660. And the 3 remaining Big Y testees are all closely related to each other by STR analysis & are all grouped under SNP Y16880. So at first glance, the STR grouping and SNP grouping seem consistent with each other. The BY2853 SNP block can be imagined to be positioned far up in the tree, above the Gleeson MRCA. Similarly for the next block of SNPs (A5629 block). Then Glisson & the A5628 block split off … but this is where an inconsistency emerges. All the remaining testees (Branches 2, 7, 8, 9, 10) sit underneath A5628, so we have to create a different link to their branches. A second problem is that Branches 9 & 10 do not have a 456 16-15 mutation. So to make this fit, we could imagine that mutation 456 was a relatively early mutation for ALL the branches in question, followed some time later by A5628, and somewhere along the way Branch 9 (& 10) developed a Back Mutation in marker 456, which explains why this mutation is missing from the haplotypes of the present-day members of Branch 9. This further reconfiguration of the tree suggests that other branches that have not yet been tested may also have the A5628 mutation, i.e. seeing as how Glisson is an ancient branch, it may be that Branches 8, 4, 5, 12 also share the A5628 mutation – single SNP testing?
  24. The major difference is the differing Modal Haplotype at the top of the tree. I use the MH for the surname group, but Nigel uses the MH from the overall Z255 group … which is probably a better way of doing it. The branching pattern is identical – it’s just that sometimes his mutations are mirror-images of mine. This is noticeable in 464, CDY, 461, 390 & 389. But overall, the two trees are consistent with each other. I take mine one step further by including 37-marker data.
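Why the choice of Modal Haplotype produces mirror-image mutations can be seen in a short Python sketch. It assumes the modal is simply the most common value at each marker within the chosen group, and the haplotypes are invented:

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Most common value per marker. Ties go to the value seen first,
    which is an arbitrary choice made for this sketch."""
    markers = haplotypes[0].keys()
    return {m: Counter(h[m] for h in haplotypes).most_common(1)[0][0] for m in markers}


# Hypothetical data: a small surname group inside a larger haplogroup-level group
surname_group = [
    {"DYS390": 24, "DYS461": 12},
    {"DYS390": 24, "DYS461": 12},
    {"DYS390": 25, "DYS461": 12},
]
wider_group = surname_group + [{"DYS390": 25, "DYS461": 11}] * 4

print(modal_haplotype(surname_group))  # {'DYS390': 24, 'DYS461': 12}
print(modal_haplotype(wider_group))    # {'DYS390': 25, 'DYS461': 11}
```

Against the surname modal, the third man shows a 390 mutation of 24>25; against the wider modal, it is the first two men who show 25>24. The grouping is the same, but the direction of the mutation is reversed.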
  25. Using the 495 STR marker TMRCA estimates, the branching point can be further refined. We can also calculate the missing value between Branch 8, 5, 4, 12 and the MH. Are these TMRCA estimates consistent with the Branching Pattern? Yes, except for the red circle (Branch 13), but this could be because it is based on a Y37 result in G70.
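For readers unfamiliar with how a TMRCA figure can be derived from STR data, here is a deliberately crude Python sketch. It is not the method behind the estimates on the slide: it assumes mutations accumulate independently on both lines of descent, ignores back and parallel mutations, and uses an invented average mutation rate and an assumed 30 years per generation:

```python
def tmrca_generations(genetic_distance, mutation_rates):
    """Rough estimate: generations = GD / (2 * sum of per-marker mutation rates).

    The factor of 2 reflects that both men's lines have been
    accumulating mutations since the common ancestor.
    """
    return genetic_distance / (2 * sum(mutation_rates))


# Hypothetical inputs: GD of 2 over 37 markers, all with the same invented rate
rates = [0.004] * 37
generations = tmrca_generations(2, rates)
years = generations * 30  # assumed years per generation

print(f"about {generations:.1f} generations, or roughly {years:.0f} years")
```

An estimate like this carries a very wide margin of error, which is one reason to check it against the branching pattern and the known genealogy rather than take it at face value.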