Whole-Genome Prokaryote Phylogeny without Sequence Alignment
Bailin HAO and Ji QI
T-Life Research Center, Fudan University, Shanghai 200433, China
Institute of Theoretical Physics, Academia Sinica, Beijing 100080, China
http://www.itp.ac.cn/~hao/
Classification of Prokaryotes: A Long-Standing Problem
The SSU rRNA Tree of Life: major progress in the molecular phylogeny of prokaryotes, as evidenced by the history of Bergey’s Manual
Bergey’s Manual Trust: Bergey’s Manual
Our Final Result
Protein Tree for 145 Organisms from 82 Genera (K=5): 16 Archaea (11 genera, 16 species), 123 Bacteria (65 genera, 98 species), 6 Eukaryotes
Complete Bacterial Genomes Appeared since 1995: Early Expectations
Debate on Lateral Gene Transfer
[Figure: Gene A, A′, B; Gene B′, C, ?; 1st species, 2nd species]
Our Motivations
Other Whole-Genome Approaches
Comparison of Complete Genomes/Proteomes
A Key Improvement: Subtraction of Random Background
Frequency and Probability
Predicting the number of K-strings from the numbers of (K-1)- and (K-2)-strings
(K-2)-th Order Markov Model
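A sketch of the predictor this slide title refers to, assuming it matches the composition vector method published by the same group: the probability of a K-string is estimated from its two overlapping (K-1)-substrings and the (K-2)-substring they share. The symbols p (observed frequency) and p^0 (predicted frequency) are notation chosen here, not taken from the slide.

```latex
p^{0}(a_1 a_2 \cdots a_K) =
  \frac{p(a_1 a_2 \cdots a_{K-1}) \; p(a_2 a_3 \cdots a_K)}
       {p(a_2 a_3 \cdots a_{K-1})}
```

Under the same assumption, the "random background" is removed by taking the relative difference (p - p^0) / p^0 for every K-string whose prediction p^0 is nonzero.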
Composition Distance
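The pipeline named by the preceding slide titles (K-string frequencies, Markov prediction from shorter strings, background subtraction, composition distance) can be sketched as follows. This is a minimal illustration, not the authors' program: it assumes the formulas of the published composition vector method, namely a = (p - p0) / p0 per K-string and D = (1 - C) / 2, where C is the cosine of the angle between two composition vectors. All function names and the toy sequences are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def kstring_freqs(proteins, k):
    """Relative frequencies of all overlapping k-strings in a list of sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


def composition_vector(proteins, k=5):
    """Map each K-string with a nonzero prediction p0 to (p - p0) / p0."""
    if k < 3:
        raise ValueError("k must be at least 3 in this sketch")
    p_k = kstring_freqs(proteins, k)
    p_k1 = kstring_freqs(proteins, k - 1)
    p_k2 = kstring_freqs(proteins, k - 2)

    # Index the (K-1)-strings by their first K-2 letters so that every pair
    # (u, v) overlapping in K-2 letters can be joined into one K-string.
    by_prefix = defaultdict(list)
    for v in p_k1:
        by_prefix[v[:-1]].append(v)

    vec = {}
    for u, p_u in p_k1.items():
        for v in by_prefix.get(u[1:], ()):
            s = u + v[-1]
            # Markov prediction from the two (K-1)-strings and the shared
            # (K-2)-string; p0 > 0 because u and v were both observed.
            p0 = p_u * p_k1[v] / p_k2[u[1:]]
            # Unobserved K-strings have p = 0 and therefore the value -1.
            vec[s] = (p_k.get(s, 0.0) - p0) / p0
    return vec


def composition_distance(a, b):
    """D = (1 - C) / 2, with C the cosine between two composition vectors."""
    dot = sum(x * b.get(s, 0.0) for s, x in a.items())
    norm_a = sqrt(sum(x * x for x in a.values()))
    norm_b = sqrt(sum(x * x for x in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("composition vector is empty or all zero")
    return (1.0 - dot / (norm_a * norm_b)) / 2.0


# Made-up toy "proteomes" purely for illustration; real input would be the
# full set of protein sequences of each genome, with K = 5 or 6.
toy_a = ["MKVLAAGIVGKSTLMKVLA", "GKSTLHAMSCMKV"]
toy_b = ["MKILAAGLVGKSTLMRVLA", "HAMSCGKSTL"]
vec_a = composition_vector(toy_a, k=3)
vec_b = composition_vector(toy_b, k=3)
print(composition_distance(vec_a, vec_b))
```

The distance lies between 0 and 1, and a matrix of such pairwise distances is the kind of input a neighbor-joining program expects. Storing the vectors as dictionaries keeps only K-strings with a nonzero prediction, which matters because there are 20^K possible strings for proteins.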
Materials: Genomes from NCBI (ftp.ncbi.nih.gov/genomes/Bacteria/), not the original GenBank files. 6 eukaryote genomes were included for reference. Tree construction: Neighbor-Joining in PHYLIP.
Protein Tree for 132 Species (K=5): 16 Archaea (11 genera, 16 species), 110 Bacteria (57 genera, 88 species), 6 Eukaryotes
Protein Tree for 132 Species (K=6): 16 Archaea (11 genera, 16 species), 110 Bacteria (57 genera, 88 species), 6 Eukaryotes
Protein Class vs. Whole Proteome
Genus Tree Based on Ribosomal Proteins
A Genus Tree Based on Aminoacyl-tRNA Synthetases
Chloroplast Tree
Chloroplast tree
Coronaviruses, Including Human SARS-CoV
Coronavirus tree
Understanding the Subtraction Procedure: Analysis of Extreme Cases in E. coli
GKSTL: how does 58 reduce to 0.646?
HAMSC: how does 1 grow to 197?
6121 Exact Matches of GKSTL in PIR Rel. 1.26 with >1.2 Million Proteins
15 Exact Matches of HAMSC in PIR Rel. 1.26 with >1.2 Million Proteins
Stable Topology of the Tree
Statistical Test of the Tree
About 70% of the genes of every species were selected in one bootstrap
“K-string Picture” of Evolution
The Problem of Higher Taxa
Conclusion: The Tree of Life is saved! There is phylogenetic information in the prokaryotic proteomes. Time to work on a molecular definition of taxa. Thank you!
A Failed Attempt Using Avoidance Signatures
Comparison with Bergey’s Manual
           Phyla  Classes  Orders  Families  Genera  Species  Strains
Archaea        2        7       9         9       9       13       13
Bacteria       9       14      23        28      37       46       57
Total         11       21      32        37      46       59       70
Early Expectations from Genome Data
Subtraction of Random Background
What to Do Next
