DISCOVERY AND EXPLORATION
OF CONSERVED MODULES OF
CHEMICAL TRANSFORMATIONS
IN METABOLISM
MARIA SOROKINA
3 FEBRUARY 2016
Doctoral school
Structure et Dynamique des Systèmes Vivants
What is metabolism?
Metabolism is the set of biochemical processes by which
living organisms stay alive, grow, reproduce and
interact with their environment
« Μεταβολή » (metabôlé), Greek: change, transformation
Chemical transformations mainly concern small molecules
(metabolites), which are modified by (bio)chemical reactions
Biochemical reactions are often catalysed by enzymes: proteins encoded in the
organism's genome that are able to facilitate specific reactions
Successive reactions aimed at producing or degrading a target metabolite are
described as metabolic pathways
[Diagram labels: Genome, Enzymes, Reactions, Transforming metabolites]
Presentation Outline
Introduction: From Genome to Metabolism
Part I: Orphan Enzymes
Part II: Reaction Molecular Signature Network and Conserved Modules
Part III: Combining Genomic and Metabolic Contexts
Conclusions & Perspectives
From Genome To Metabolism
Sequencing → Finding CDS (protein-coding genes) → Functional annotation
Functional Annotation
Assigning a biological function to a protein:
«  Through experimentation (high confidence)
«  Homology detection through sequence similarity (BLAST, …)
«  Genomic context
«  Protein structural analysis
«  Rule-based annotation systems
«  Community annotation systems
From Genome To Metabolism
Sequencing → Finding CDS (protein-coding genes) → Functional annotation → Metabolism reconstruction
Representing The Metabolism
Models for structural analysis
«  Networks
Models for flow analyses in metabolism
«  Flux Balance Analysis
«  Capacitance analysis
Models for dynamic analysis
«  Models including reaction kinetics
Metabolic Networks
Toy example: part of the Escherichia coli metabolism
Bipartite network of metabolites and reactions
o  Nodes = metabolites and reactions
[Figure: bipartite network of the toy example. Metabolite nodes: Pyruvate, Formate, Acetyl-CoA, Acetaldehyde, Ethanol, Coenzyme A, NADH, NAD+. Reaction nodes: 2.3.1.54, 1.2.1.10, 1.1.1.1]
Metabolite network
o  Nodes = metabolites
o  Edge between two nodes if there is a reaction in which one of the metabolites is a substrate and the other is a product
[Figure: metabolite network of the toy example (Pyruvate, Formate, Acetyl-CoA, Acetaldehyde, Ethanol, Coenzyme A, NADH, NAD+)]
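The bipartite and metabolite views of the toy example can be sketched as follows. This is a minimal illustration, not the representation used in the thesis: the substrate/product assignment of each reaction is an assumption taken from textbook biochemistry (fermentative direction), and the function names are hypothetical. Edges are directed here, whereas the slide does not say whether they are.

```python
from itertools import product

# Toy reactions keyed by EC number: (substrates, products).
# Stoichiometry and direction are assumptions, not taken from the slides.
REACTIONS = {
    "2.3.1.54": ({"Pyruvate", "Coenzyme A"}, {"Acetyl-CoA", "Formate"}),
    "1.2.1.10": ({"Acetyl-CoA", "NADH"}, {"Acetaldehyde", "Coenzyme A", "NAD+"}),
    "1.1.1.1": ({"Acetaldehyde", "NADH"}, {"Ethanol", "NAD+"}),
}


def bipartite_edges(reactions):
    """Edges substrate -> reaction and reaction -> product."""
    edges = set()
    for rxn, (substrates, products) in reactions.items():
        edges |= {(s, rxn) for s in substrates}
        edges |= {(rxn, p) for p in products}
    return edges


def metabolite_edges(reactions):
    """Edges substrate -> product, one per pair sharing a reaction."""
    edges = set()
    for substrates, products in reactions.values():
        edges |= set(product(substrates, products))
    return edges


bip = bipartite_edges(REACTIONS)
met = metabolite_edges(REACTIONS)
print(len(bip), "bipartite edges,", len(met), "metabolite edges")
```

Note that the pair (NADH, NAD+) is produced by two reactions but stored once, which already hints at the ubiquitous-compound problem discussed later.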
Reaction network
o  Nodes = reactions
o  Edge between two nodes if a metabolite produced by one reaction is a substrate of the other reaction
[Figure: reaction network of the toy example (Reaction 2.3.1.54, Reaction 1.2.1.10, Reaction 1.1.1.1)]
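A reaction network can be derived from the same kind of reaction table. The sketch below assumes the substrate/product sets shown in the code (textbook stoichiometry, not given on the slides) and uses a hypothetical function name; it only illustrates the edge rule stated above.

```python
# Toy reactions keyed by EC number: (substrates, products).
# Direction and stoichiometry are assumptions for illustration.
REACTIONS = {
    "2.3.1.54": ({"Pyruvate", "Coenzyme A"}, {"Acetyl-CoA", "Formate"}),
    "1.2.1.10": ({"Acetyl-CoA", "NADH"}, {"Acetaldehyde", "Coenzyme A", "NAD+"}),
    "1.1.1.1": ({"Acetaldehyde", "NADH"}, {"Ethanol", "NAD+"}),
}


def reaction_edges(reactions):
    """Directed edge r1 -> r2 if a product of r1 is a substrate of r2."""
    edges = set()
    for r1, (_, products) in reactions.items():
        for r2, (substrates, _) in reactions.items():
            if r1 != r2 and products & substrates:
                edges.add((r1, r2))
    return edges


edges = reaction_edges(REACTIONS)
for src, dst in sorted(edges):
    print(src, "->", dst)
```

In this toy case the edge 1.2.1.10 -> 2.3.1.54 exists only because Coenzyme A is recycled, which shows how cofactors can add links that do not reflect the main carbon flow.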
Enzyme network
o  Nodes = enzymes
o  Edge between two nodes if a metabolite produced by one enzyme is a substrate of the other enzyme
o  Limitations:
   o  An enzyme can catalyse several reactions
   o  A reaction can be catalysed by several enzymes
   o  Incomplete knowledge of enzymes (orphan enzymes)
[Figure: enzyme network of the toy example (Pyruvate formate lyase, Acetaldehyde dehydrogenase, Alcohol dehydrogenase)]
Ubiquitous compounds problem
CO2, ATP/ADP, H2O, H+, NAD(P)+/NAD(P)H, …
These compounds create major hubs in the metabolic network, so they need to be taken into account
“Primary” and “secondary” metabolites in the reactions of pathways
Ubiquitous: existing or being everywhere, especially at the same time; omnipresent
Hub: a highly connected node in a graph
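One simple way to take ubiquitous compounds into account is to drop them before building metabolite edges. The sketch below is an assumption about how this could be done, not the procedure used in the thesis; the reaction table uses textbook stoichiometry and the `UBIQUITOUS` set is just the part of the slide's list that occurs in the toy example.

```python
from collections import Counter
from itertools import product

# Toy reactions: (substrates, products); stoichiometry is assumed.
REACTIONS = {
    "2.3.1.54": ({"Pyruvate", "Coenzyme A"}, {"Acetyl-CoA", "Formate"}),
    "1.2.1.10": ({"Acetyl-CoA", "NADH"}, {"Acetaldehyde", "Coenzyme A", "NAD+"}),
    "1.1.1.1": ({"Acetaldehyde", "NADH"}, {"Ethanol", "NAD+"}),
}
UBIQUITOUS = {"NADH", "NAD+"}  # hypothetical exclusion list


def metabolite_edges(reactions, ignore=frozenset()):
    """Substrate -> product edges, skipping any compound listed in `ignore`."""
    edges = set()
    for substrates, products in reactions.values():
        edges |= set(product(substrates - ignore, products - ignore))
    return edges


# How many reactions each metabolite takes part in (a crude hub indicator).
participation = Counter(
    m for substrates, products in REACTIONS.values() for m in substrates | products
)

all_edges = metabolite_edges(REACTIONS)
filtered = metabolite_edges(REACTIONS, ignore=UBIQUITOUS)
print(len(all_edges), "edges before filtering,", len(filtered), "after")
```

On a network this small the participation counts do not single out hubs, so the exclusion list is given explicitly; on a genome-scale network a degree threshold would be another option.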
Main difficulties in metabolic network reconstruction
from whole genomes:
« Gene functional annotation issues: >60% of functional annotations in UniProt may be erroneous (2009)
« Orphan enzymes
Part I
Orphan Enzymes
What is an orphan enzyme?
An “orphan enzyme activity” (or “orphan enzyme” for short) is a known
biochemical activity for which there is no associated sequence (yet)
Orphan enzymes
2004: Karp: Call for an enzyme genomics initiative. (38% of orphan enzymes)
2005: Lespinet & Labedan: Orphan enzymes? (42% of orphan enzymes)
2006: Lespinet & Labedan: ORENZA database. (36% of orphan enzymes)
2007: Chen & Vitkup: Distribution of orphan metabolic activities. (34% of orphan enzymes)
2007: Pouliot & Karp: A survey of orphan enzyme activities. (34% of orphan enzymes)
Orphan enzymes
[Pie chart: >5,000 enzymatic activities (IUBMB EC numbers), split 22% / 78%; labelled segment: Orphan enzymes]
Enzyme Commission (EC) number:
official classification of enzyme activities, written as four dot-separated fields:
o  Reaction class
o  Metabolite type
o  Reaction nature
o  Serial number
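As an illustration, an EC number can be split into the four levels named above. This is a minimal sketch: the class and field names simply mirror the slide's labels, `parse_ec` is a hypothetical helper, and preliminary identifiers such as `1.1.1.n1` are deliberately not handled.

```python
from typing import NamedTuple, Optional


class ECNumber(NamedTuple):
    reaction_class: int
    metabolite_type: Optional[int]
    reaction_nature: Optional[int]
    serial_number: Optional[int]


def parse_ec(text: str) -> ECNumber:
    """Parse 'a.b.c.d'; a '-' marks a level left unspecified (partial EC number)."""
    text = text.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    fields = text.split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four fields, got {len(fields)}: {text!r}")
    values = [None if f == "-" else int(f) for f in fields]
    if values[0] is None:
        raise ValueError("the reaction class (first field) is required")
    return ECNumber(*values)


print(parse_ec("EC 2.3.1.54"))
print(parse_ec("1.2.-.-"))
```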
Enzyme activities and annotated proteins over the years
Limited number of recently discovered activities
[Figure: timeline annotated with Protein sequencing, DNA sequencing, Expression cloning, Genomics]
Enzyme discovery and protein families
[Pie chart: >14,000 protein families (Pfam), split 23% / 77%; labelled segment: Unknown function]
[Pie chart: >5,000 enzymatic activities (IUBMB EC numbers), split 22% / 78%; labelled segment: Orphan enzymes]
Newly discovered enzymatic activities are mostly associated with already
known enzyme families
Local Orphan Enzymes
Enzymatic activities that have been observed
in at least one organism of a given clade and
that have an associated sequence in another clade
but not in this one
Local orphan EC numbers                                                   Archaea   Bacteria   Eukaryotes
Total number of concerned EC numbers                                      79        133        299
% of EC retrieved with PRIAM (significant hit with a detected protein)    30%       30%        59%
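The definition of a local orphan enzyme translates directly into set operations. The sketch below uses made-up observations (the EC numbers are only labels) and a hypothetical function name; it is not the pipeline that produced the figures in the table.

```python
def local_orphans(observed, sequenced):
    """Per clade: activities observed there, lacking a sequence there,
    but with a sequence in at least one other clade.

    observed[clade]  = EC numbers whose activity was observed in the clade
    sequenced[clade] = EC numbers with an associated sequence in the clade
    """
    result = {}
    for clade, activities in observed.items():
        elsewhere = set().union(
            *(ecs for other, ecs in sequenced.items() if other != clade)
        )
        result[clade] = (activities - sequenced.get(clade, set())) & elsewhere
    return result


# Hypothetical data, for illustration only.
observed = {
    "Archaea": {"1.1.1.1", "2.3.1.54"},
    "Bacteria": {"1.1.1.1", "1.2.1.10"},
    "Eukaryotes": {"1.1.1.1"},
}
sequenced = {
    "Archaea": {"1.1.1.1"},
    "Bacteria": {"1.1.1.1", "2.3.1.54"},
    "Eukaryotes": {"1.1.1.1"},
}
orphans = local_orphans(observed, sequenced)
print(orphans)
```

In this made-up data 2.3.1.54 is a local orphan in Archaea, while 1.2.1.10 has no sequence anywhere and is therefore a global orphan rather than a local one.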
Main difficulties in metabolic network reconstruction
from whole genomes:
« Gene functional annotation issues
« Orphan enzymes
« Lack of knowledge on organism metabolic diversity
Part II
Reaction Molecular Signature Networks and Conserved
Modules
[Figure: reaction network with 5,830 nodes and 11,197 edges]
Reactions from model organisms account for 57% of nodes and 83% of edges:
✴  E. coli
✴  B. subtilis
✴  S. cerevisiae
✴  H. sapiens
✴  A. thaliana
✴  D. melanogaster
[Figure: the same network without these reactions: 57% of nodes and 83% of edges suppressed]
Lack of knowledge about metabolic diversity in non-model organisms
What strategy can be adopted to counter this lack of knowledge?
All the main hypotheses on metabolic pathway evolution agree on the
importance of enzyme promiscuity, i.e. the capacity of enzymes to catalyse
one or several reactions on more or less different substrates…
…so we should look at the conservation of chemical transformations in
pathways, and not only at the conservation of enzymatic reactions
Reactions and chemical transformation types
[Figure: example of a transformation type: Dehydrogenation]
How to represent molecules, reactions and
their chemical transformation types?
Representing Molecules
Need to be able to describe
molecular substructures and their
properties
Molecular Signatures
Molecular signature:
set of sub-graphs of a given diameter (height) centred on each atom of the
molecule
Carbonell, P., Carlsson, L., Faulon, J.-L.: Stereo signature molecular descriptor. Journal of Chemical Information and Modeling
53(4), 887–97 (2013)
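To make the definition concrete, here is a simplified sketch of an atom-centred signature. It is not the stereo signature algorithm of the cited paper: stereochemistry, bond orders and ring-closure handling are ignored, neighbours are simply sorted alphabetically, and the function names and ethanol encoding are illustrative assumptions.

```python
from collections import Counter


def atom_signature(atoms, neighbours, root, height, parent=None):
    """String for the tree rooted at `root`: the atom, then its neighbours
    (except the one we came from), expanded down to `height` bonds."""
    label = f"[{atoms[root]}]"
    if height == 0:
        return label
    children = sorted(
        atom_signature(atoms, neighbours, n, height - 1, parent=root)
        for n in neighbours[root]
        if n != parent
    )
    return label + "(" + "".join(children) + ")" if children else label


def molecular_signature(atoms, bonds, height):
    """Multiset of atom signatures, one per atom of the molecule."""
    neighbours = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    return Counter(atom_signature(atoms, neighbours, i, height) for i in neighbours)


# Ethanol, CH3-CH2-OH, with explicit hydrogens.
atoms = ["C", "C", "O", "H", "H", "H", "H", "H", "H"]
bonds = [(0, 1), (1, 2), (0, 3), (0, 4), (0, 5), (1, 6), (1, 7), (2, 8)]

sig0 = molecular_signature(atoms, bonds, 0)
sig1 = molecular_signature(atoms, bonds, 1)
sig2 = molecular_signature(atoms, bonds, 2)
for signature, count in sorted(sig1.items()):
    print(count, signature)
```

At height 0 the signature is just the atom count; each extra unit of height adds one more shell of neighbours around every atom.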
How To Represent Reactions And Their Chemical Transformation Type?
Reaction molecular signature (RMS):
difference between the molecular signatures of the products and of the substrates of the
reaction
… specifically, only the substructures that change are kept, which gives a way to encode the chemical
transformation
Carbonell, P., Carlsson, L., Faulon, J.-L.: Stereo signature molecular descriptor. Journal of Chemical Information and Modeling
53(4), 887–97 (2013)
Reaction molecular signature (RMS):
★  At height 0, the RMS is null (all atoms are subtracted): 0.0
★  Height 1 RMS:
-1.0*[O]([C][P])
1.0*[O]([H][C])
-1.0*[O]([H][H])
1.0*[O]([H][P])
★  Height 2 RMS:
1.0*[C@@]([H][C@@]([H][C][O])[C@@]([H][C@][O])[O]([H]))
1.0*[C@@]([H][C@]([H][C@@][O])[C@@]([H][C@@][O])[O]([H]))
1.0*[C@@]([H][C@]([H][C@@][O])[O]([H])[O]([C@@]))
-1.0*[C@@]([H][C@]([H][C@][O])[C@@]([H][C@][O])[O]([H]))
-1.0*[C@@]([H][C@]([H][O][O])[C@@]([H][C@][O])[O]([H]))
1.0*[C@@]([H][C]([H][H][O])[C@@]([H][C@@][O])[O]([C@@]))
1.0*[C@]([H][C@@]([H][C@@][O])[C@@]([H][O][O])[O]([H]))
-1.0*[C@]([H][C@@]([H][C@@][O])[O]([H])[O]([C@]))
-1.0*[C@]([H][C@]([H][C][O])[C@@]([H][C@@][O])[O]([H]))
-1.0*[C@]([H][C]([H][H][O])[C@]([H][C@@][O])[O]([C@]))
1.0*[C]([H][H][C@@]([H][C@@][O])[O]([H]))
-1.0*[C]([H][H][C@]([H][C@][O])[O]([P]))
1.0*[H]([C@@]([C@@][C@@][O]))
-1.0*[H]([C@@]([C@][C@@][O]))
1.0*[H]([C@@]([C@][O][O]))
1.0*[H]([C@@]([C][C@@][O]))
1.0*[H]([C@]([C@@][C@@][O]))
-1.0*[H]([C@]([C@@][O][O]))
-1.0*[H]([C@]([C@][C@@][O]))
-1.0*[H]([C@]([C][C@][O]))
2.0*[H]([C]([H][C@@][O]))
-2.0*[H]([C]([H][C@][O]))
1.0*[H]([O]([C@@]))
-1.0*[H]([O]([C@]))
1.0*[H]([O]([C]))
-2.0*[H]([O]([H]))
1.0*[H]([O]([P]))
1.0*[O]([C@@]([H][C][C@@])[C@@]([H][C@][O]))
-1.0*[O]([C@]([H][C][C@])[C@]([H][C@@][O]))
-1.0*[O]([C]([H][H][C@])[P]([O][O]=[O]))
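The height 0 and height 1 behaviour can be reproduced on a small case. The sketch below assumes a hypothetical example reaction (hydrolysis of methyl phosphate into methanol and phosphoric acid, chosen because it has the same C-O-P to C-O-H pattern as the height 1 RMS above) and a simplified signature without stereochemistry; neighbour ordering inside a signature is alphabetical, so `[O]([C][H])` here corresponds to `[O]([H][C])` on the slide.

```python
from collections import Counter


def signature(atoms, bonds, height):
    """Height 0: multiset of atoms. Height 1: each atom with its sorted neighbours."""
    if height == 0:
        return Counter(f"[{a}]" for a in atoms)
    if height != 1:
        raise ValueError("this sketch only supports heights 0 and 1")
    neigh = {i: [] for i in range(len(atoms))}
    for a, b, order in bonds:
        prefix = "=" if order == 2 else ""
        neigh[a].append(f"{prefix}[{atoms[b]}]")
        neigh[b].append(f"{prefix}[{atoms[a]}]")
    return Counter(f"[{atoms[i]}]({''.join(sorted(neigh[i]))})" for i in neigh)


def rms(substrates, products, height):
    """Signatures of products minus signatures of substrates; zero terms dropped."""
    total = Counter()
    for atoms, bonds in products:
        total.update(signature(atoms, bonds, height))
    for atoms, bonds in substrates:
        total.subtract(signature(atoms, bonds, height))
    return {sig: n for sig, n in total.items() if n != 0}


# Molecules as (atoms, bonds); each bond is (atom index, atom index, bond order).
methyl_phosphate = (
    ["C", "H", "H", "H", "O", "P", "O", "O", "H", "O", "H"],
    [(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1), (4, 5, 1),
     (5, 6, 2), (5, 7, 1), (7, 8, 1), (5, 9, 1), (9, 10, 1)],
)
water = (["O", "H", "H"], [(0, 1, 1), (0, 2, 1)])
methanol = (
    ["C", "H", "H", "H", "O", "H"],
    [(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1), (4, 5, 1)],
)
phosphoric_acid = (
    ["P", "O", "O", "H", "O", "H", "O", "H"],
    [(0, 1, 2), (0, 2, 1), (2, 3, 1), (0, 4, 1), (4, 5, 1), (0, 6, 1), (6, 7, 1)],
)

substrates = [methyl_phosphate, water]
products = [methanol, phosphoric_acid]
rms0 = rms(substrates, products, 0)
rms1 = rms(substrates, products, 1)
print(rms0)
print(rms1)
```

Because the reaction is atom-balanced, every height 0 term cancels; at height 1 only the four oxygen-centred substructures that actually change remain.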
RMS group reactions by the type of chemical
transformation they perform
[Roadmap diagram; stages so far: Reaction, Molecular signature, Reaction molecular signature, Reaction network]
Reaction network
«  Nodes represent reactions
«  Two nodes are linked by a directed edge if a metabolite produced by the first reaction is consumed by the second reaction
«  5,830 nodes
«  11,197 edges
RMS network
Transformation of a reaction network into an RMS network
Order-1 Markov chain transition probabilities between connected RMS (RMSi and RMSj)
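The collapse of a reaction network into an RMS network can be sketched as below. How the transition probabilities were actually estimated is not stated on the slide; this sketch assumes the simplest choice, the relative frequency of reaction edges leaving each RMS. Reaction and RMS names are made up, and `rms_network` is a hypothetical helper.

```python
from collections import Counter, defaultdict


def rms_network(reaction_edges, rms_of):
    """Merge reactions sharing an RMS and weight each RMS edge by
    P(RMS_j | RMS_i), estimated from reaction-edge counts (an assumption)."""
    counts = Counter((rms_of[a], rms_of[b]) for a, b in reaction_edges)
    outgoing = defaultdict(int)
    for (src, _), n in counts.items():
        outgoing[src] += n
    return {(src, dst): n / outgoing[src] for (src, dst), n in counts.items()}


# Hypothetical reactions r1..r5 described by three RMS: A, B and C.
rms_of = {"r1": "A", "r2": "A", "r3": "B", "r4": "C", "r5": "B"}
reaction_edges = [("r1", "r3"), ("r2", "r3"), ("r2", "r4"), ("r3", "r4"), ("r5", "r4")]

probs = rms_network(reaction_edges, rms_of)
for (src, dst), p in sorted(probs.items()):
    print(f"{src} -> {dst}: {p:.2f}")
```

Five reaction nodes become three RMS nodes here, the same kind of reduction as the one reported for the full network.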
Reaction network: 5,830 nodes, 11,197 edges
RMS network: 3,365 nodes, 8,721 edges
Node reduction rate: 0.57 (×1 → ×0.57)
Search and analysis of conserved paths
Path conservation metrics
Pathway conservation index (PCI)
✴  Computed for each RMS path present in at least one known metabolic pathway
✴  Represents the number of corresponding reaction paths that are present in at
least one MetaCyc pathway
… it captures the chemical redundancy across known metabolism
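A PCI-like count can be sketched as follows. This is one reading of the definition, under stated assumptions: pathways are treated as linear, ordered lists of reactions, and the index counts the pathways that contain a run of consecutive reactions matching the RMS path. Pathway, reaction and RMS names are hypothetical.

```python
def pathway_conservation_index(rms_path, pathways, rms_of):
    """Number of pathways containing consecutive reactions whose RMS
    sequence equals `rms_path` (pathways assumed linear and ordered)."""
    target = tuple(rms_path)
    k = len(target)
    count = 0
    for reactions in pathways.values():
        labels = [rms_of[r] for r in reactions]
        if any(tuple(labels[i:i + k]) == target for i in range(len(labels) - k + 1)):
            count += 1
    return count


# Hypothetical pathways and RMS assignments.
pathways = {
    "P1": ["r1", "r2", "r3"],
    "P2": ["r4", "r5", "r6", "r7"],
    "P3": ["r8", "r9"],
}
rms_of = {
    "r1": "A", "r2": "B", "r3": "C",
    "r4": "X", "r5": "A", "r6": "B", "r7": "C",
    "r8": "A", "r9": "B",
}

pci_abc = pathway_conservation_index(["A", "B", "C"], pathways, rms_of)
pci_ab = pathway_conservation_index(["A", "B"], pathways, rms_of)
print(pci_abc, pci_ab)
```

Different reactions (r1, r5, r8) count towards the same RMS path because they perform the same transformation type, which is what lets the index capture chemical redundancy.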
Beta-oxidation module: PCI = 14 (conserved in 14 pathways)
Aldoxime biosynthesis: PCI = 7 (conserved in 7 pathways)
Pathway conservation index (PCI) for all MetaCyc pathways
Paths of length 2 & PCI >= 2: 365 conserved modules
Previous study: Muto et al. (J. Chem. Inf. Model., 2013) identified 34 conserved modules

Pathway type    MetaCyc pathways with conserved modules
Biosynthesis    263 (42%)
Degradation     172 (47%)
Detox           3 (27%)
Energy          61 (78%)
Other           19 (33%)
All             518 (46%)
Conservation of all RMS paths in the network
Path enumeration in the RMS network
>72,000 paths of length 2 (2 edges and 3 nodes)
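Enumerating paths of length 2 in a directed network is straightforward. The sketch below reads "2 edges and 3 nodes" as requiring three distinct nodes, which is an assumption; the example network and function name are hypothetical.

```python
from collections import defaultdict


def paths_of_length_2(edges):
    """All directed paths a -> b -> c over three distinct nodes."""
    successors = defaultdict(set)
    for a, b in edges:
        successors[a].add(b)
    paths = []
    for a in sorted(successors):
        for b in sorted(successors[a]):
            for c in sorted(successors.get(b, ())):
                if len({a, b, c}) == 3:
                    paths.append((a, b, c))
    return paths


# Hypothetical RMS network.
edges = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "A"), ("B", "A")]
paths = paths_of_length_2(edges)
print(paths)
```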
RMS path scores
wRea: number of reactions described by an RMS
scoreRea: diversity of reactions performing the same chemical transformation
wPageRank: feedback centrality. The more neighbours a node
has, the more central it is; the more central a node
is, the more central its neighbours are
scorePageRank: topological importance of the module in the network, highlighting
chemical hubs
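For reference, PageRank can be computed by power iteration. The sketch below is a generic textbook version, not the implementation used for wPageRank: the damping factor of 0.85 and the fixed number of iterations are conventional defaults assumed here, and the example network is made up.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed graph given as (source, target) pairs.
    Rank held by nodes without outgoing edges is spread evenly over all nodes."""
    nodes = sorted({n for edge in edges for n in edge})
    out = {n: [] for n in nodes}
    for a, b in edges:
        out[a].append(b)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        dangling = sum(rank[n] for n in nodes if not out[n])
        new = {n: (1 - damping + damping * dangling) / len(nodes) for n in nodes}
        for n in nodes:
            for m in out[n]:
                new[m] += damping * rank[n] / len(out[n])
        rank = new
    return rank


# Hypothetical network: A and B both feed C, which feeds D.
ranks = pagerank([("A", "C"), ("B", "C"), ("C", "D")])
for node, value in sorted(ranks.items()):
    print(node, round(value, 3))
```

Node C ranks above A and B because it receives their rank, which is the "feedback" behaviour described above.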
wProt: estimated number of proteins
associated with a given RMS
scoreProt: diversity of enzymes performing the same chemical transformation
30% of RMS have weightProt = 0
→ Link with orphan enzymes?
Significant difference between the score distributions of known metabolic pathways and of all/random
paths in the RMS network
(Kruskal-Wallis & Tukey HSD tests for validation: p-value << 0.05)
Learning pathway types from known metabolic pathways using rules combining
scoreProt, scoreRea and scorePageRank
NNge algorithm
Pathway type prediction with an accuracy of 89% for RMS paths
5 metabolic pathway types:
✴  biosynthesis
✴  degradation
✴  detoxification
✴  energy creation
✴  other
[Roadmap diagram; stages: Reaction, Molecular signature, Reaction molecular signature, Reaction network, RMS network, Search and analysis of conserved paths, Linking to genomic context]
Part III
Combining Genomic and Metabolic Contexts
Metabolic context: RMS network
Genomic context: gene cluster
Gene Clusters: Operons
Operon: genomic unit containing a group of genes that are:
«  co-localised on the same strand
«  controlled by the same promoter
«  co-transcribed into a polycistronic mRNA
«  often associated with the same cellular function
Directons: predicting operons
Directon: maximal set of adjacent CDS located on the same DNA strand and not interrupted by a
CDS on the opposite strand
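The directon definition amounts to grouping consecutive CDS by strand. A minimal sketch, assuming a single linear replicon (wrap-around on circular chromosomes is ignored) and a hypothetical `(name, start, strand)` tuple layout for each CDS:

```python
from itertools import groupby


def directons(cds):
    """Split CDS into maximal runs of adjacent genes on the same strand.

    cds: iterable of (name, start, strand) tuples, strand being '+' or '-'.
    """
    ordered = sorted(cds, key=lambda c: c[1])
    return [
        [name for name, _, _ in run]
        for _, run in groupby(ordered, key=lambda c: c[2])
    ]


# Hypothetical gene layout.
genes = [
    ("g1", 100, "+"), ("g2", 900, "+"), ("g3", 1500, "-"),
    ("g4", 2300, "+"), ("g5", 3100, "+"), ("g6", 4000, "+"),
]
result = directons(genes)
print(result)
```

A directon is only a candidate operon: it uses strand and adjacency alone, with no promoter or transcription evidence.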
Linking directon genes to RMS
[Diagram: directon genes linked to RMS through known enzymes and EC numbers]
A Pfam family is often associated with several RMS (RMS1, RMS2, RMS3, RMS4)
→ A gene is therefore often associated with several RMS
Projection of directon RMS on the network
Extraction of the selected nodes and all edges between them, then selection of
maximal connected components
Gene-based node colouring
Best path selection for the directon:
«  Max number of gene colours
«  High path scores (scoreRea,
scorePageRank, scoreProt)
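The projection and selection steps above can be sketched as below. Everything here is illustrative: the network, the gene-to-RMS links and the single combined `score` value are hypothetical (the slides use three separate scores), and counting colours as the union of genes along a path is a simplifying assumption.

```python
def connected_components(nodes, edges):
    """Connected components (edge direction ignored) of the sub-network
    induced by `nodes`."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        if a in adjacency and b in adjacency:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


def rank_paths(paths, genes_of, score):
    """Prefer paths covering more distinct genes (colours), then higher score."""
    def colours(path):
        return len(set().union(*(genes_of[r] for r in path)))
    return sorted(paths, key=lambda p: (colours(p), score[p]), reverse=True)


# Hypothetical RMS network and directon.
network_edges = [("R1", "R2"), ("R2", "R3"), ("R3", "R4"), ("R5", "R6"), ("R4", "R7")]
directon_rms = {"R1", "R2", "R3", "R5", "R6"}
genes_of = {
    "R1": {"geneA"}, "R2": {"geneA", "geneB"}, "R3": {"geneC"},
    "R5": {"geneB"}, "R6": {"geneB"},
}
candidate_paths = [("R5", "R6"), ("R1", "R2", "R3")]
score = {("R5", "R6"): 7.0, ("R1", "R2", "R3"): 5.0}

components = connected_components(directon_rms, network_edges)
largest = max(components, key=len)
ranked = rank_paths(candidate_paths, genes_of, score)
print(largest, ranked[0])
```

In this example the three-node path wins despite its lower score because it covers three genes of the directon instead of one.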
Protein Family Case
Case study: the Baeyer-Villiger monooxygenase protein family
A protein family is a group of
proteins that share a common
evolutionary origin, reflected by
their related functions and
similarities in sequence or structure.
Baeyer-Villiger monooxygenases (BVMOs)
«  Flavoenzymes (FAD dependent)
«  Water soluble
«  Two classes: I and II
[Reaction scheme: linear or cyclic ketone → ester or lactone]
Strategy:
★  All RMS catalysed by the protein family
★  All directons containing a member of the protein family
→ Directon clustering based on their RMS content
→ Network projection of common RMS from each directon cluster
→ Path selection: max colours, high scores
Baeyer-Villiger Monooxygenation
3 RMS describe this type of reaction
Directons containing a BVMO
«  814 BVMO sequences
«  812 directons
«  468 organisms – only bacteria
Clustering of BVMO-containing directons according to their RMS content
Cluster      Directons   Common RMS
Cluster 1    251         0
Cluster 2    308         32
Cluster 3    125         10
Cluster 4    69          86
Cluster 5    59          5
Cluster projection on the RMS network
& selection of maximal connected components
Cluster 2
«  Pink nodes: BVMO RMS
«  Grey nodes: RMS known to be in the BVMO metabolic context
«  Blue nodes: RMS never seen in the BVMO metabolic context
«  Green edges: links between RMS from known metabolic paths in which BVMOs are involved
Path selection
RMS path   scoreRea   scoreProt   scorePageRank
Path 1     7.9        2.5         5.0 × 10^-4
Path 2     7.6        2.5         4.8 × 10^-4
Path 3     5.5        0.5         3.6 × 10^-4
Path 4     5.2        0.4         3.4 × 10^-4
Difficulty: defining the exact location on the molecule
where the described reaction happens
To Conclude….
What has been done?
«  Orphan enzyme survey
«  Update of statistics
«  Protein families and local orphan enzymes
«  A new representation of metabolism using a network of chemical
transformations
«  Definition and detection of conserved modules
«  Rules for module type prediction
«  Network exploration using genomic and metabolic contexts
«  Definition of a strategy to explore the functional diversity of enzyme families
«  Application to the Baeyer-Villiger Monooxygenases
What’s next?
Method improvements
«  Detect branched and cyclic conserved modules
«  Determine specific domains/profiles for RMS: using PRIAM/MKDOM-like
methods
«  Improve gene cluster projection on the RMS network
Applications
«  RMS to classify enzyme activities
«  Assign sequences for orphan enzymes and reactions for orphan metabolites
«  Application on other protein families
«  A way to study biological systems from a chemical point of view
Acknowledgements
David Vallenet
Claudine Médigue
Systems biology team:
Karine Bastard
Mark Stam
Jonathan Mercier
Guillaume Reboul
And all LABGeM
Jean-Loup Faulon
Olivier Lespinet
Additional slides
Metabolic Networks
Metabolite hypergraph
o  Nodes = metabolites
o  Hyperedge linking all metabolites involved in a reaction
[Figure: hypergraph of the toy example (Pyruvate, Formate, Acetyl-CoA, Acetaldehyde, Ethanol, Coenzyme A, NADH, NAD+)]
EC numbers vs RMS

Contenu connexe

Tendances

Enzyme histochemistry
Enzyme histochemistryEnzyme histochemistry
Enzyme histochemistryShabab Ali
 
Molecular chaperones
Molecular chaperonesMolecular chaperones
Molecular chaperonesanju vs
 
Principles of clinical enzymology
Principles of clinical enzymologyPrinciples of clinical enzymology
Principles of clinical enzymologyAli Raza Ph.D
 
Learning Keys , Lehninger Chapter # 3 Amino Acids,Peptides and Proteins
Learning Keys , Lehninger Chapter # 3 Amino Acids,Peptides and ProteinsLearning Keys , Lehninger Chapter # 3 Amino Acids,Peptides and Proteins
Learning Keys , Lehninger Chapter # 3 Amino Acids,Peptides and ProteinsTauqeer Ahmad
 
Heat shock proteins presentation
Heat shock proteins presentationHeat shock proteins presentation
Heat shock proteins presentationPooja Chaudhary
 
Enzyme mode of action by KK Sahu sir
Enzyme mode of action by KK Sahu sirEnzyme mode of action by KK Sahu sir
Enzyme mode of action by KK Sahu sirKAUSHAL SAHU
 
Regulatory enzymes:THE ENZYMES WHICH CATALYSE AGAIN AND AGAIN
Regulatory enzymes:THE ENZYMES WHICH CATALYSE AGAIN AND AGAINRegulatory enzymes:THE ENZYMES WHICH CATALYSE AGAIN AND AGAIN
Regulatory enzymes:THE ENZYMES WHICH CATALYSE AGAIN AND AGAINGunaaditya Kalavagadda
 
Enzymology BIOCHEMISTRY REVISION NOTES
Enzymology BIOCHEMISTRY REVISION NOTESEnzymology BIOCHEMISTRY REVISION NOTES
Enzymology BIOCHEMISTRY REVISION NOTESTONY SCARIA
 

Tendances (20)

Enzyme
EnzymeEnzyme
Enzyme
 
Enzymes
Enzymes Enzymes
Enzymes
 
Isoenzyme
IsoenzymeIsoenzyme
Isoenzyme
 
Enzymes introduction
Enzymes introductionEnzymes introduction
Enzymes introduction
 
Enzyme histochemistry
Enzyme histochemistryEnzyme histochemistry
Enzyme histochemistry
 
Enzyme regulation
Enzyme regulationEnzyme regulation
Enzyme regulation
 
Metabolic control
Metabolic controlMetabolic control
Metabolic control
 
Heat shock proteins
Heat shock proteinsHeat shock proteins
Heat shock proteins
 
Enzymes
EnzymesEnzymes
Enzymes
 
Molecular chaperones
Molecular chaperonesMolecular chaperones
Molecular chaperones
 
Principles of clinical enzymology
Principles of clinical enzymologyPrinciples of clinical enzymology
Principles of clinical enzymology
 
Molecular chaperones
Molecular chaperonesMolecular chaperones
Molecular chaperones
 
11 proteases.ppt
11  proteases.ppt11  proteases.ppt
11 proteases.ppt
 
Learning Keys , Lehninger Chapter # 3 Amino Acids,Peptides and Proteins
Learning Keys , Lehninger Chapter # 3 Amino Acids,Peptides and ProteinsLearning Keys , Lehninger Chapter # 3 Amino Acids,Peptides and Proteins
Learning Keys , Lehninger Chapter # 3 Amino Acids,Peptides and Proteins
 
Chaperones
Chaperones Chaperones
Chaperones
 
Heat shock proteins presentation
Heat shock proteins presentationHeat shock proteins presentation
Heat shock proteins presentation
 
Enzyme mode of action by KK Sahu sir
Enzyme mode of action by KK Sahu sirEnzyme mode of action by KK Sahu sir
Enzyme mode of action by KK Sahu sir
 
Heat shock proteins final presentation
Heat shock proteins final presentationHeat shock proteins final presentation
Heat shock proteins final presentation
 
Regulatory enzymes:THE ENZYMES WHICH CATALYSE AGAIN AND AGAIN
Regulatory enzymes:THE ENZYMES WHICH CATALYSE AGAIN AND AGAINRegulatory enzymes:THE ENZYMES WHICH CATALYSE AGAIN AND AGAIN
Regulatory enzymes:THE ENZYMES WHICH CATALYSE AGAIN AND AGAIN
 
Enzymology BIOCHEMISTRY REVISION NOTES
Enzymology BIOCHEMISTRY REVISION NOTESEnzymology BIOCHEMISTRY REVISION NOTES
Enzymology BIOCHEMISTRY REVISION NOTES
 

En vedette

Κοινωνική ευθύνη της επιχείρησης
Κοινωνική ευθύνη της επιχείρησηςΚοινωνική ευθύνη της επιχείρησης
Κοινωνική ευθύνη της επιχείρησηςSofiaVasilaki1
 
인터넷바카라[[SX797。CΟM]]바카라사이트 사이트
인터넷바카라[[SX797。CΟM]]바카라사이트 사이트인터넷바카라[[SX797。CΟM]]바카라사이트 사이트
인터넷바카라[[SX797。CΟM]]바카라사이트 사이트hijhfkjdsh
 
모바일카지노\\【SX797。СOM】\\인터넷카지노 사이트
모바일카지노\\【SX797。СOM】\\인터넷카지노 사이트모바일카지노\\【SX797。СOM】\\인터넷카지노 사이트
모바일카지노\\【SX797。СOM】\\인터넷카지노 사이트ajshdajsh
 
The Awesome Python Class Part-2
The Awesome Python Class Part-2The Awesome Python Class Part-2
The Awesome Python Class Part-2Binay Kumar Ray
 
Marketing Múliplas Soluções
Marketing Múliplas SoluçõesMarketing Múliplas Soluções
Marketing Múliplas SoluçõesMarcos Alves
 
global warming
global warmingglobal warming
global warmingulfayona
 
What is magine
What is magineWhat is magine
What is magineMagine
 
Esterhuyse, Stephan Christiaan - Skills Matrix Template
Esterhuyse, Stephan Christiaan - Skills Matrix TemplateEsterhuyse, Stephan Christiaan - Skills Matrix Template
Esterhuyse, Stephan Christiaan - Skills Matrix TemplateStephan Esterhuyse
 
Innovative Businesses: IKEA, Starbucks, COCO-MAT
Innovative Businesses: IKEA, Starbucks, COCO-MATInnovative Businesses: IKEA, Starbucks, COCO-MAT
Innovative Businesses: IKEA, Starbucks, COCO-MATMarianna Nakou
 
Instant search - A hands-on tutorial
Instant search  - A hands-on tutorialInstant search  - A hands-on tutorial
Instant search - A hands-on tutorialGanesh Venkataraman
 
Machine Learning Introduction for Digital Business Leaders
Machine Learning Introduction for Digital Business LeadersMachine Learning Introduction for Digital Business Leaders
Machine Learning Introduction for Digital Business LeadersSudha Jamthe
 
Big Data - Fast Machine Learning at Scale + Couchbase
Big Data - Fast Machine Learning at Scale + CouchbaseBig Data - Fast Machine Learning at Scale + Couchbase
Big Data - Fast Machine Learning at Scale + CouchbaseFujio Turner
 

En vedette (13)

Κοινωνική ευθύνη της επιχείρησης
Κοινωνική ευθύνη της επιχείρησηςΚοινωνική ευθύνη της επιχείρησης
Κοινωνική ευθύνη της επιχείρησης
 
인터넷바카라[[SX797。CΟM]]바카라사이트 사이트
인터넷바카라[[SX797。CΟM]]바카라사이트 사이트인터넷바카라[[SX797。CΟM]]바카라사이트 사이트
인터넷바카라[[SX797。CΟM]]바카라사이트 사이트
 
These_Maria_Sorokina
These_Maria_SorokinaThese_Maria_Sorokina
These_Maria_Sorokina
 
모바일카지노\\【SX797。СOM】\\인터넷카지노 사이트
모바일카지노\\【SX797。СOM】\\인터넷카지노 사이트모바일카지노\\【SX797。СOM】\\인터넷카지노 사이트
모바일카지노\\【SX797。СOM】\\인터넷카지노 사이트
 
The Awesome Python Class Part-2
The Awesome Python Class Part-2The Awesome Python Class Part-2
The Awesome Python Class Part-2
 
Marketing Múliplas Soluções
Marketing Múliplas SoluçõesMarketing Múliplas Soluções
Marketing Múliplas Soluções
 
global warming
global warmingglobal warming
global warming
 
What is magine
What is magineWhat is magine
What is magine
 
Esterhuyse, Stephan Christiaan - Skills Matrix Template
Esterhuyse, Stephan Christiaan - Skills Matrix TemplateEsterhuyse, Stephan Christiaan - Skills Matrix Template
Esterhuyse, Stephan Christiaan - Skills Matrix Template
 
Innovative Businesses: IKEA, Starbucks, COCO-MAT
Innovative Businesses: IKEA, Starbucks, COCO-MATInnovative Businesses: IKEA, Starbucks, COCO-MAT
Innovative Businesses: IKEA, Starbucks, COCO-MAT
 
Instant search - A hands-on tutorial
Instant search  - A hands-on tutorialInstant search  - A hands-on tutorial
Instant search - A hands-on tutorial
 
Machine Learning Introduction for Digital Business Leaders
Machine Learning Introduction for Digital Business LeadersMachine Learning Introduction for Digital Business Leaders
Machine Learning Introduction for Digital Business Leaders
 
Big Data - Fast Machine Learning at Scale + Couchbase
Big Data - Fast Machine Learning at Scale + CouchbaseBig Data - Fast Machine Learning at Scale + Couchbase
Big Data - Fast Machine Learning at Scale + Couchbase
 


soutenance

  • 1. Discovery and Exploration of Conserved Modules of Chemical Transformations in Metabolism. Maria Sorokina, 3 February 2016. Doctoral school: Structure et Dynamique des Systèmes Vivants
  • 7. What is metabolism? Metabolism is the set of biochemical processes by which living organisms stay alive, grow, reproduce and interact with their environment. « Μεταβολή » (metabôlé), Greek: change, transformation. Chemical transformations mainly concern small molecules (metabolites), which are modified by (bio)chemical reactions. Successive reactions aimed at producing or degrading a target metabolite are described as metabolic pathways. Biochemical reactions are often catalysed by enzymes: proteins encoded in the organism's genome that facilitate specific reactions
  • 8. Biochemical reactions are often catalysed by enzymes: proteins encoded in the organism's genome that facilitate specific reactions. Diagram: Genome → Enzymes → Reactions transforming metabolites
  • 9. Presentation Outline. Introduction: From Genome to Metabolism; Part I: Orphan Enzymes; Part II: Reaction Molecular Signature Network and Conserved Modules; Part III: Combining Genomic and Metabolic Contexts; Conclusions & Perspectives
  • 10. From Genome To Metabolism. Diagram: Genome → Enzymes → Reactions transforming metabolites
  • 11. From Genome To Metabolism. Steps: sequencing. Diagram: Genome → Enzymes → Reactions transforming metabolites
  • 12. From Genome To Metabolism. Steps: sequencing; finding CDS (protein-coding genes). Diagram: Genome → Enzymes → Reactions transforming metabolites
  • 13. From Genome To Metabolism. Steps: sequencing; finding CDS (protein-coding genes); functional annotation. Diagram: Genome → Enzymes → Reactions transforming metabolites
  • 14. Functional Annotation: assigning a biological function to a protein. Approaches: experimentation (high confidence); homology detection through sequence similarity (BLAST, ...); genomic context; protein structural analysis; rule-based annotation systems; community annotation systems
  • 15. From Genome To Metabolism. Steps: sequencing; finding CDS (protein-coding genes); functional annotation; metabolism reconstruction. Diagram: Genome → Enzymes → Reactions transforming metabolites
  • 16. Representing The Metabolism. Models for structural analysis: networks. Models for flow analyses in metabolism: Flux Balance Analysis; capacitance analysis. Models for dynamic analysis: models including reaction kinetics
  • 18. Metabolic Networks. Toy example: part of the Escherichia coli metabolism
  • 19. Metabolic Networks. Bipartite network of metabolites and reactions. Nodes = metabolites and reactions. Metabolites in the toy example: pyruvate, formate, acetyl-CoA, acetaldehyde, ethanol, coenzyme A, NADH, NAD+. Reactions: 2.3.1.54, 1.2.1.10, 1.1.1.1
  • 20. Metabolic Networks. Metabolite network. Nodes = metabolites. There is an edge between two nodes if there is a reaction in which one of the metabolites is a substrate and the other is a product. Metabolites: pyruvate, formate, acetyl-CoA, acetaldehyde, ethanol, coenzyme A, NADH, NAD+
  • 21. Metabolic Networks. Reaction network. Nodes = reactions. There is an edge between two nodes if a metabolite produced by one reaction is a substrate of the other reaction. Reactions: 2.3.1.54, 1.2.1.10, 1.1.1.1
  • 22. Metabolic Networks. Enzyme network. Nodes = enzymes. There is an edge between two nodes if a metabolite produced by one enzyme is a substrate of the other enzyme. Limitations: an enzyme can catalyse several reactions; a reaction can be catalysed by several enzymes; knowledge of enzymes is incomplete (orphan enzymes). Enzymes: pyruvate formate lyase, acetaldehyde dehydrogenase, alcohol dehydrogenase
  • 23. Metabolic Networks. Ubiquitous compounds problem: CO2, ATP/ADP, H2O, H+, NAD(P)+/NAD(P)H, ... They create important hubs in the metabolic network, so they need to be taken into account. “Primary” and “secondary” metabolites in reactions in pathways. Definitions. Ubiquitous: existing or being everywhere, especially at the same time; omnipresent. Hub: a highly connected node in a graph
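The reaction-network definition and the ubiquitous-compounds problem can be illustrated on the toy example. The following is a minimal sketch, not the code used in the thesis: the reaction directions (fermentative), the omission of protons, the data layout and the `UBIQUITOUS` list are assumptions made for illustration.

```python
# Toy E. coli reactions from the slides, written in the fermentative direction.
# Protons are omitted; the dictionary layout is an illustrative assumption.
REACTIONS = {
    "2.3.1.54": ({"pyruvate", "CoA"}, {"acetyl-CoA", "formate"}),
    "1.2.1.10": ({"acetyl-CoA", "NADH"}, {"acetaldehyde", "CoA", "NAD+"}),
    "1.1.1.1": ({"acetaldehyde", "NADH"}, {"ethanol", "NAD+"}),
}

# Compounds treated as ubiquitous in this toy case (hypothetical list).
UBIQUITOUS = frozenset({"CoA", "NADH", "NAD+"})


def reaction_network(reactions, ignored=frozenset()):
    """Return directed edges (r1, r2) where a product of r1 is a substrate of r2.

    Metabolites listed in `ignored` never create an edge.
    """
    edges = set()
    for r1, (_, products) in reactions.items():
        for r2, (substrates, _) in reactions.items():
            if r1 != r2 and (products & substrates) - ignored:
                edges.add((r1, r2))
    return edges


all_edges = reaction_network(REACTIONS)
filtered_edges = reaction_network(REACTIONS, UBIQUITOUS)
print(sorted(all_edges))
print(sorted(filtered_edges))
```

Without filtering, coenzyme A alone links reaction 1.2.1.10 back to 2.3.1.54; ignoring the ubiquitous compounds leaves only the linear chain from pyruvate to ethanol.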
  • 24. Main difficulties in metabolic network reconstruction from whole genomes: gene functional annotation issues
  • 25. Main difficulties in metabolic network reconstruction from whole genomes: gene functional annotation issues. More than 60% of functional annotations in UniProt may be erroneous (2009)
  • 26. Main difficulties in metabolic network reconstruction from whole genomes: gene functional annotation issues; orphan enzymes
  • 28. What is an orphan enzyme? An “orphan enzyme activity” (or “orphan enzyme” for short) is a known biochemical activity for which there is no associated sequence (yet)
  • 29. Orphan enzymes: previous surveys. 2004, Karp: Call for an enzyme genomics initiative (38% of orphan enzymes). 2005, Lespinet & Labedan: Orphan enzymes? (42% of orphan enzymes). 2006, Lespinet & Labedan: ORENZA database (36% of orphan enzymes). 2007, Chen & Vitkup: Distribution of orphan metabolic activities (34% of orphan enzymes). 2007, Pouliot & Karp: A survey of orphan enzyme activities (34% of orphan enzymes)
  • 30. Orphan enzymes. Pie chart: 22% / 78% of >5,000 enzymatic activities (IUBMB EC numbers), slice label: orphan enzymes. Enzyme Commission (EC) number: official classification of enzyme activities, made of four fields: reaction class, metabolite type, reaction nature, serial number
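The four-field structure of an EC number can be made concrete with a small parser. This is a sketch under assumptions: the field names follow the labels on the slide, `parse_ec` is a hypothetical helper, and preliminary serial numbers such as `n1` are not handled.

```python
from typing import NamedTuple, Optional


class ECNumber(NamedTuple):
    reaction_class: int
    metabolite_type: Optional[int]
    reaction_nature: Optional[int]
    serial_number: Optional[int]


def parse_ec(text: str) -> ECNumber:
    """Parse 'a.b.c.d'; a '-' marks a level that is not assigned (partial EC number)."""
    parts = text.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields, got {text!r}")
    fields = [None if part == "-" else int(part) for part in parts]
    if fields[0] is None:
        raise ValueError("the reaction class is mandatory")
    return ECNumber(*fields)


adh = parse_ec("1.1.1.1")     # alcohol dehydrogenase, from the toy example
pfl = parse_ec("2.3.1.54")    # pyruvate formate lyase, from the toy example
partial = parse_ec("1.2.-.-")  # partial EC number: only the first two levels known
print(adh, pfl, partial)
```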
  • 31. Enzyme activities and annotated proteins over the years. Limited number of recently discovered activities. Timeline labels: protein sequencing, DNA sequencing, expression cloning, genomics
  • 32. Enzyme discovery and protein families. Pie charts: 23% / 77% of >14,000 protein families (Pfam), slice label: unknown function; 22% / 78% of >5,000 enzymatic activities (IUBMB EC numbers), slice label: orphan enzymes
  • 33. Enzyme discovery and protein families. Newly discovered enzymatic activities are mostly associated with already known enzyme families
  • 34. Local Orphan Enzymes: enzymatic activities that have been observed in at least one organism of a given clade and that have an associated sequence in another clade, but not in this one. Local orphan EC numbers per clade (total number of EC numbers concerned; % of EC numbers retrieved with PRIAM, i.e. a significant hit with a detected protein): Archaea 79, 30%; Bacteria 133, 30%; Eukaryotes 299, 59%
  • 35. Main difficulties in metabolic network reconstruction from whole genomes: gene functional annotation issues; orphan enzymes; lack of knowledge on organism metabolic diversity
  • 36. Part II Reaction Molecular Signature Networks and Conserved Modules
  • 39. Reactions from model organisms (E. coli, B. subtilis, S. cerevisiae, H. sapiens, A. thaliana, D. melanogaster): 57% of nodes, 83% of edges
  • 41. 57% of nodes removed, 83% of edges removed
  • 42. Lack of knowledge about metabolic diversity in non-model organisms
  • 43. Lack of knowledge about metabolic diversity in non-model organisms. What strategy can be adopted to counter this lack of knowledge?
  • 44. All main hypotheses on metabolic pathway evolution agree on the importance of enzyme promiscuity, i.e. the capacity of enzymes to catalyse one or several reactions on more or less different substrates. We should therefore look at the conservation of chemical transformations in pathways, and not only at the conservation of enzymatic reactions
  • 45. Reactions and chemical transformation types. Example: dehydrogenation
  • 46. How to represent molecules, reactions and their chemical transformation types?
  • 48. Representing Molecules. We need to be able to describe molecular substructures and their properties
  • 50. Molecular signature: the set of sub-graphs of a given diameter (height) centred on each atom of the molecule. Reference: Carbonell, P., Carlsson, L., Faulon, J.-L.: Stereo signature molecular descriptor. Journal of Chemical Information and Modeling 53(4), 887–97 (2013)
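To make the idea of atom-centred sub-graphs concrete, here is a deliberately simplified sketch. It is not the canonical signature algorithm of the cited reference: stereochemistry and bond orders are ignored, neighbours are ordered alphabetically, and rings are unrolled as trees. The function names and the water example are illustrative assumptions.

```python
from collections import Counter


def atom_signature(atoms, bonds, root, height, parent=None):
    """Bracketed string describing the neighbourhood of `root` up to `height` bonds."""
    label = f"[{atoms[root]}]"
    if height == 0:
        return label
    children = sorted(
        atom_signature(atoms, bonds, neighbour, height - 1, root)
        for neighbour in bonds[root]
        if neighbour != parent  # do not walk straight back to the previous atom
    )
    return label + ("(" + "".join(children) + ")" if children else "")


def molecular_signature(atoms, bonds, height):
    """Multiset of atom signatures, one per atom of the molecule."""
    return Counter(atom_signature(atoms, bonds, atom, height) for atom in atoms)


# Water as a tiny labelled graph: atom index -> element, atom index -> neighbours.
WATER_ATOMS = {0: "O", 1: "H", 2: "H"}
WATER_BONDS = {0: [1, 2], 1: [0], 2: [0]}

for h in (0, 1, 2):
    print(h, dict(molecular_signature(WATER_ATOMS, WATER_BONDS, h)))
```

Raising the height makes each atom's description more specific, which is why the height controls how finely chemical environments are distinguished.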
  • 55. How To Represent Reactions And Their Chemical Transformation Type?
  • 56. Reaction molecular signature (RMS): the difference between the molecular signatures of the products and those of the substrates of the reaction. Reference: Carbonell, P., Carlsson, L., Faulon, J.-L.: Stereo signature molecular descriptor. Journal of Chemical Information and Modeling 53(4), 887–97 (2013)
  • 57. Reaction molecular signature (RMS): the difference between the molecular signatures of the products and those of the substrates of the reaction. Specifically, only the changing substructures are kept, which gives a way to encode the chemical transformation. Reference: Carbonell, P., Carlsson, L., Faulon, J.-L.: Stereo signature molecular descriptor. Journal of Chemical Information and Modeling 53(4), 887–97 (2013)
  • 60. Reaction molecular signature (RMS). At height 0, the RMS is null
  • 61. Reaction molecular signature (RMS). At height 0, the RMS is null: 0.0. Height 1 RMS: -1.0*[O]([C][P]) 1.0*[O]([H][C]) -1.0*[O]([H][H]) 1.0*[O]([H][P])
  • 62. Reaction molecular signature (RMS). At height 0, the RMS is null (all atoms are subtracted): 0.0. Height 1 RMS: -1.0*[O]([C][P]) 1.0*[O]([H][C]) -1.0*[O]([H][H]) 1.0*[O]([H][P]). Height 2 RMS:
      1.0*[C@@]([H][C@@]([H][C][O])[C@@]([H][C@][O])[O]([H]))
      1.0*[C@@]([H][C@]([H][C@@][O])[C@@]([H][C@@][O])[O]([H]))
      1.0*[C@@]([H][C@]([H][C@@][O])[O]([H])[O]([C@@]))
      -1.0*[C@@]([H][C@]([H][C@][O])[C@@]([H][C@][O])[O]([H]))
      -1.0*[C@@]([H][C@]([H][O][O])[C@@]([H][C@][O])[O]([H]))
      1.0*[C@@]([H][C]([H][H][O])[C@@]([H][C@@][O])[O]([C@@]))
      1.0*[C@]([H][C@@]([H][C@@][O])[C@@]([H][O][O])[O]([H]))
      -1.0*[C@]([H][C@@]([H][C@@][O])[O]([H])[O]([C@]))
      -1.0*[C@]([H][C@]([H][C][O])[C@@]([H][C@@][O])[O]([H]))
      -1.0*[C@]([H][C]([H][H][O])[C@]([H][C@@][O])[O]([C@]))
      1.0*[C]([H][H][C@@]([H][C@@][O])[O]([H]))
      -1.0*[C]([H][H][C@]([H][C@][O])[O]([P]))
      1.0*[H]([C@@]([C@@][C@@][O]))
      -1.0*[H]([C@@]([C@][C@@][O]))
      1.0*[H]([C@@]([C@][O][O]))
      1.0*[H]([C@@]([C][C@@][O]))
      1.0*[H]([C@]([C@@][C@@][O]))
      -1.0*[H]([C@]([C@@][O][O]))
      -1.0*[H]([C@]([C@][C@@][O]))
      -1.0*[H]([C@]([C][C@][O]))
      2.0*[H]([C]([H][C@@][O]))
      -2.0*[H]([C]([H][C@][O]))
      1.0*[H]([O]([C@@]))
      -1.0*[H]([O]([C@]))
      1.0*[H]([O]([C]))
      -2.0*[H]([O]([H]))
      1.0*[H]([O]([P]))
      1.0*[O]([C@@]([H][C][C@@])[C@@]([H][C@][O]))
      -1.0*[O]([C@]([H][C][C@])[C@]([H][C@@][O]))
      -1.0*[O]([C]([H][H][C@])[P]([O][O]=[O]))
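The "products minus substrates" definition can be sketched with multisets. The example below is an assumption-laden illustration: the signature multisets are hand-written and partial (only a few atoms of a hypothetical phosphate ester hydrolysis, R-O-P + H2O giving R-OH + HO-P), chosen so that the height 1 result matches the four terms shown on the slide. `reaction_signature` is a hypothetical helper name.

```python
from collections import Counter


def reaction_signature(substrates, products):
    """Sum of product signatures minus sum of substrate signatures, zeros dropped."""
    total = Counter()
    for signature in products:
        total.update(signature)
    for signature in substrates:
        total.subtract(signature)
    return {sub: coeff for sub, coeff in total.items() if coeff != 0}


# Height 0: only element labels, so every atom cancels out.
h0_substrates = [Counter({"[C]": 1, "[O]": 1}), Counter({"[O]": 1})]
h0_products = [Counter({"[C]": 1, "[O]": 1}), Counter({"[O]": 1})]

# Height 1: hand-written partial signatures (illustrative, not computed).
h1_substrates = [
    Counter({"[C]([H][H][H][O])": 1, "[O]([C][P])": 1}),  # ester side
    Counter({"[O]([H][H])": 1}),                          # water
]
h1_products = [
    Counter({"[C]([H][H][H][O])": 1, "[O]([H][C])": 1}),  # alcohol
    Counter({"[O]([H][P])": 1}),                          # phosphate hydroxyl
]

rms_h0 = reaction_signature(h0_substrates, h0_products)
rms_h1 = reaction_signature(h1_substrates, h1_products)
print(rms_h0)
print(rms_h1)
```

The unchanged carbon environment cancels, so only the substructures touched by the transformation remain, with negative coefficients for what disappears and positive ones for what appears.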
  • 63. RMS group reactions by the type of chemical transformation they perform
  • 65. Workflow diagram (labels): Reaction, Molecular signature, Reaction molecular signature, Reaction network
  • 66. Reaction network. Nodes represent reactions. Two nodes are linked by a directed edge if a metabolite produced by the first reaction is consumed by the second reaction. 5,830 nodes; 11,197 edges
  • 67. Workflow diagram (labels): Reaction, Molecular signature, Reaction molecular signature, Reaction network, RMS network
  • 68. Transformation of a reaction network into an RMS network
  • 71. Transformation of a reaction network into an RMS network. Edges carry first-order Markov chain transition probabilities between connected RMS (from RMSi to RMSj)
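The collapse of a reaction network into an RMS network can be sketched as follows. The slides do not say how the transition probabilities are estimated; this sketch assumes they are the relative frequencies of reaction-level edges leaving each RMS. The reaction names, the RMS labels and the `rms_network` helper are hypothetical.

```python
from collections import Counter, defaultdict


def rms_network(reaction_edges, rms_of):
    """Merge reactions sharing an RMS and estimate first-order transition probabilities.

    Assumption: P(RMSi -> RMSj) = (reaction edges from RMSi to RMSj)
                                  / (all reaction edges leaving RMSi).
    """
    counts = defaultdict(Counter)
    for source, target in reaction_edges:
        counts[rms_of[source]][rms_of[target]] += 1
    probabilities = {}
    for rms_i, outgoing in counts.items():
        total = sum(outgoing.values())
        probabilities[rms_i] = {rms_j: n / total for rms_j, n in outgoing.items()}
    return probabilities


# Hypothetical toy data: five reactions described by three RMS.
RMS_OF = {"R1": "RMS_A", "R2": "RMS_A", "R3": "RMS_B", "R4": "RMS_B", "R5": "RMS_C"}
REACTION_EDGES = [("R1", "R3"), ("R2", "R4"), ("R2", "R5")]

transitions = rms_network(REACTION_EDGES, RMS_OF)
reduction_rate = len(set(RMS_OF.values())) / len(RMS_OF)
print(transitions)
print(reduction_rate)
```

In this toy case five reaction nodes become three RMS nodes, the same kind of reduction as the one reported for the full network.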
  • 72. Reaction network: 5,830 nodes, 11,197 edges. RMS network: 3,365 nodes, 8,721 edges. Node reduction rate: 0.57 (from ×1 to ×0.57)
  • 73. Workflow diagram (labels): Reaction, Molecular signature, Reaction molecular signature, Reaction network, RMS network, Search and analysis of conserved paths
  • 75. Pathway conservation index (PCI). Computed for each RMS path present in at least one known metabolic pathway. Represents the number of corresponding reaction paths that are present in at least one MetaCyc pathway. It captures the chemical redundancy across the known metabolism
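A possible reading of the PCI can be sketched in a few lines. The assumptions are loud here: pathways are simplified to linear lists of reactions, each pathway is counted at most once per RMS path (consistent with the "conserved in N pathways" wording of the examples), and all pathway, reaction and RMS names are hypothetical.

```python
from collections import Counter


def rms_paths(pathway, rms_of, length=2):
    """Distinct RMS paths with `length` edges along a linear pathway."""
    signatures = [rms_of[reaction] for reaction in pathway]
    return {
        tuple(signatures[i:i + length + 1])
        for i in range(len(signatures) - length)
    }


def pathway_conservation_index(pathways, rms_of, length=2):
    """For each RMS path, count the pathways that contain it."""
    pci = Counter()
    for reactions in pathways.values():
        for path in rms_paths(reactions, rms_of, length):
            pci[path] += 1
    return pci


# Hypothetical toy data.
RMS_OF = {
    "r1": "A", "r2": "B", "r3": "C",
    "r4": "A", "r5": "B", "r6": "C", "r7": "D",
    "r8": "A", "r9": "B",
}
PATHWAYS = {
    "P1": ["r1", "r2", "r3"],
    "P2": ["r4", "r5", "r6", "r7"],
    "P3": ["r8", "r9"],  # too short to contain a path of length 2
}

pci = pathway_conservation_index(PATHWAYS, RMS_OF)
conserved_modules = sorted(path for path, value in pci.items() if value >= 2)
print(pci)
print(conserved_modules)
```

Different reactions (r1, r2, r3 versus r4, r5, r6) map to the same RMS path, which is how the index captures chemical redundancy rather than reaction identity.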
  • 77. Beta-oxidation module: PCI = 14 (conserved in 14 pathways)
  • 78. Aldoxime biosynthesis: PCI = 7 (conserved in 7 pathways)
  • 79. Pathway conservation index (PCI) for all MetaCyc pathways. Paths of length 2 with PCI >= 2: 365 conserved modules. Previous study: Muto et al. (J. Chem. Inf. Model., 2013) identified 34 conserved modules. MetaCyc pathways with conserved modules, by pathway type: Biosynthesis 263 (42%); Degradation 172 (47%); Detox 3 (27%); Energy 61 (78%); Other 19 (33%); All 518 (46%)
  • 82. Conservation of all RMS paths in the network
  • 83. Path enumeration in the RMS network: >72,000 paths of length 2 (2 edges and 3 nodes)
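Enumerating paths of length 2 in a directed network is a short double loop over successors. This sketch assumes that the three nodes of a path must be distinct, which is one reading of "2 edges and 3 nodes"; the edge list and the function name are hypothetical.

```python
def paths_of_length_two(edges):
    """All directed paths a -> b -> c made of three distinct nodes."""
    successors = {}
    for source, target in edges:
        successors.setdefault(source, set()).add(target)
    return sorted(
        (a, b, c)
        for a, nexts in successors.items()
        for b in nexts
        for c in successors.get(b, ())
        if len({a, b, c}) == 3
    )


# Hypothetical toy RMS network.
EDGES = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "A"), ("B", "A")]
paths = paths_of_length_two(EDGES)
print(paths)
```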
  • 86. RMS path scores. wRea: number of reactions described by an RMS. scoreRea: diversity of reactions performing the same chemical transformation
  • 87. RMS path scores. wPageRank: feedback centrality; the more neighbours a node has, the more central it is, and the more central a node is, the more central its neighbours are. scorePageRank: topological importance of the module in the network, highlighting chemical hubs
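The feedback-centrality idea behind wPageRank can be sketched with a plain power iteration. This is the textbook PageRank with a damping factor of 0.85, an assumed parameter; the slides do not state which settings were used, nor how node values are combined into scorePageRank for a path. The edge list is hypothetical.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration on a directed edge list."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


# Hypothetical toy network in which C receives edges from A, B and D.
EDGES = [("A", "C"), ("B", "C"), ("D", "C"), ("C", "A")]
ranks = pagerank(EDGES)
hub = max(ranks, key=ranks.get)
print(ranks)
print(hub)
```

Node A ranks second although it has a single incoming edge, because that edge comes from the hub C: centrality propagates to neighbours.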
  • 88. RMS path scores. wProt: estimated number of proteins associated with a given RMS. scoreProt: diversity of enzymes performing the same chemical transformation
  • 89. RMS path scores. wProt: estimated number of proteins associated with a given RMS. scoreProt: diversity of enzymes performing the same chemical transformation. 30% of RMS have wProt = 0: a link with orphan enzymes?
  • 90. RMS path scores. Significant difference between the score distributions of known metabolic pathways and those of all/random paths in the RMS network (Kruskal-Wallis and Tukey HSD tests for validation: p-value << 0.05)
  • 91. RMS path scores. Learning pathway types from known metabolic pathways using rules combining scoreProt, scoreRea and scorePageRank (NNge algorithm). Pathway type prediction with an accuracy of 89% for RMS paths. Five metabolic pathway types: biosynthesis, degradation, detoxification, energy creation, other
  • 93. Workflow diagram (labels): Reaction, Molecular signature, Reaction molecular signature, Reaction network, RMS network, Search and analysis of conserved paths, Linking to genomic context
  • 94. Part III Combining Genomic and Metabolic Contexts
  • 95. Metabolic context: RMS network. Genomic context: gene cluster
  • 96. Gene Clusters: Operons. Operon: genomic unit containing a group of genes that are co-localised on the same strand, controlled by the same promoter, co-transcribed into a polycistronic mRNA, and often associated with the same cellular function
  • 97. Directons predicting operons. Directon: maximal set of adjacent CDS located on the same DNA strand and not interrupted by a CDS on the opposite strand
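The directon definition translates directly into grouping consecutive genes by strand. A minimal sketch, assuming a single linear replicon (the wrap-around of a circular chromosome is ignored) and a hypothetical `(gene_id, start, strand)` tuple layout:

```python
from itertools import groupby


def directons(cds):
    """Group CDS, given as (gene_id, start, strand), into maximal same-strand runs."""
    ordered = sorted(cds, key=lambda gene: gene[1])  # order along the replicon
    return [
        [gene_id for gene_id, _, _ in run]
        for _, run in groupby(ordered, key=lambda gene: gene[2])
    ]


# Hypothetical gene coordinates.
GENES = [
    ("g3", 1500, "-"),
    ("g1", 100, "+"),
    ("g2", 900, "+"),
    ("g5", 3100, "+"),
    ("g4", 2300, "+"),
]
groups = directons(GENES)
print(groups)
```

The single gene on the opposite strand (g3) interrupts the run and splits the region into three directons.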
  • 98. Linking directon genes to RMS. Diagram labels: known enzymes, EC number, RMS
  • 99. Linking directon genes to RMS (RMS1, RMS2, RMS3, RMS4). A Pfam is often associated with several RMS, so a gene is often associated with several RMS
  • 100. Projection of directon RMS on the network
  • 101. Extraction of the selected nodes and of all the edges between them; selection of maximal connected components
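The extraction step can be sketched as an induced subgraph followed by a connected-component search. The sketch assumes that edge direction is ignored when grouping nodes (weak connectivity), which the slides do not specify; node names are hypothetical.

```python
def connected_components(selected, edges):
    """Components of the subgraph induced by `selected`, ignoring edge direction."""
    neighbours = {node: set() for node in selected}
    for source, target in edges:
        if source in neighbours and target in neighbours:
            neighbours[source].add(target)
            neighbours[target].add(source)
    seen, components = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal from `start`
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical RMS network and directon projection: D is not selected,
# so E is cut off from the A-B-C chain.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
SELECTED = {"A", "B", "C", "E"}
components = connected_components(SELECTED, EDGES)
print(components)
```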
  • 103. Best path selection for the directon: maximum number of gene colours; high path scores (scoreRea, scorePageRank, scoreProt)
  • 104. Protein Family Case. Case study for the Baeyer-Villiger MonoOxygenases protein family. A protein family is a group of proteins that share a common evolutionary origin, reflected by their related functions and similarities in sequence or structure
  • 105. Baeyer-Villiger MonoOxygenases (BVMOs): flavoenzymes (FAD dependent); water soluble; two classes, I and II. Reaction: linear or cyclic ketone → ester or lactone
  • 106. Workflow. Inputs: all RMS catalysed by the protein family; all directons containing a member of the protein family. Steps: directon clustering based on RMS content; network projection of the common RMS from each directon cluster; path selection (maximum number of colours, high scores)
  • 107. Baeyer-Villiger Monooxygenation: 3 RMS describe this type of reaction
  • 109. Directons containing a BVMO: 814 BVMO sequences; 812 directons; 468 organisms (only bacteria)
  • 111. Clustering of BVMO-containing directons according to their RMS content. Cluster 1: 251 directons, 0 common RMS. Cluster 2: 308 directons, 32 common RMS. Cluster 3: 125 directons, 10 common RMS. Cluster 4: 69 directons, 86 common RMS. Cluster 5: 59 directons, 5 common RMS
  • 113. Cluster projection on the RMS network and selection of maximal connected components (Cluster 2). Pink nodes: RMS of BVMOs. Grey nodes: RMS known to be in the metabolic context of BVMOs. Blue nodes: RMS never seen in the metabolic context of BVMOs. Green edges: links between RMS from known metabolic paths in which BVMOs are involved
  • 114. Cluster projection on the RMS network and selection of maximal connected components
  • 116. Path selection (RMS path: scoreRea, scoreProt, scorePageRank). Path 1: 7.9, 2.5, 5.0 × 10⁻⁴. Path 2: 7.6, 2.5, 4.8 × 10⁻⁴. Path 3: 5.5, 0.5, 3.6 × 10⁻⁴. Path 4: 5.2, 0.4, 3.4 × 10⁻⁴
  • 117. Difficulty in defining the exact location on the molecule where the described reaction takes place
  • 119. What has been done? Orphan enzyme survey; update of statistics; protein families and local orphan enzymes; a new representation of metabolism using a network of chemical transformations; definition and detection of conserved modules; rules for module type prediction; network exploration using genomic and metabolic contexts; definition of a strategy to explore the functional diversity of enzyme families; application to the Baeyer-Villiger Monooxygenases
  • 120. What’s next? Method improvements: detect branched and cyclic conserved modules; determine specific domains/profiles for RMS using PRIAM/MKDOM-like methods; improve gene cluster projection on the RMS network. Applications: RMS to classify enzyme activities; assign sequences to orphan enzymes and reactions to orphan metabolites; application to other protein families; a way to study biological systems from a chemical point of view
  • 121. Acknowledgements: David Vallenet, Claudine Médigue. Systems biology team: Karine Bastard, Mark Stam, Jonathan Mercier, Guillaume Reboul, and all LABGeM. Jean-Loup Faulon, Olivier Lespinet
  • 124. Metabolic Networks. Metabolites hypergraph. Nodes = metabolites. A hyperedge links all the metabolites involved in a reaction. Metabolites: pyruvate, formate, acetyl-CoA, acetaldehyde, ethanol, coenzyme A, NADH, NAD+