2. Session 3 – Applications and
Case Studies
14:00 Systems Biology in Cancer
– Dr Andrei Zinovyev, Institut Curie, Paris
14:20 Single Molecule Imaging Technology
– Professor George Fraser, University of Leicester
14:35 Knowledge Engineering for Biomedical
Research
– Dr Jonathan Tedds, University of Leicester
14:50 Applications in an SME Environment
– Dr Kevin Slater, PetScreen Ltd
4. Systems Biology of Cancer
Andrei Zinovyev
Institut Curie - INSERM U900 / Mines ParisTech
Computational Systems Biology of Cancer
Stratified Medicine - Opportunities for Business
Leicester - 23 January 2013
5. Institut Curie, Bioinformatics and Systems
Biology of Cancer Department
Institut Curie
• Created in 1909 by Marie Curie
• From fundamental research to innovative treatment
• Comprehensive Cancer Center
• 2 cancer hospitals, focus on breast cancer, pediatric tumors,
uveal melanoma
• 15 research departments
• 3,000 staff
Computational Systems Biology of Cancer group (http://sysbio.curie.fr)
• 15 people (physicists, mathematicians, biologists)
• Cancer data analysis
• Mathematical modeling of cancer processes
• Collaborations with pharmaceutical companies
6. Example of Stratified/Personalized Medicine:
SHIVA clinical trial at Institut Curie
Trial flowchart:
1) Patients with refractory cancer (all tumor types); informed consent signed
2) Tumor biopsy + blood sample
3) Molecular profiling by high-throughput sequencing
4) Molecular biology board: specific therapy available?
• NO: prospective cohort
• YES: eligible patient; informed consent signed; randomisation (R)
5) Randomised arms, with cross-over:
• Therapy based on molecular profiling (approved molecularly targeted agent)
• Conventional therapy based on oncologist’s choice
7. One of the problems of personalized medicine:
existence of complex feedbacks in a cancer cell
An example of a “paradoxical” response to treatment (Prahallad et al, Nature 2012)
8. HOW SYSTEMS BIOLOGY CAN HELP?
(what is systems biology?)
can it be a support for rational
decision-making in stratified medicine?
9. Two “systems biologies”
2001: …studying biological systems by systematically disturbing them
and monitoring the gene, protein and informational pathway
responses and integrating these data in mathematical models
Danger: high-throughput stamp collection
2002: …studying structure and dynamics of cellular and organismal
function, rather than the characteristics of isolated parts of a
cell, with particular emphasis on emerging system
properties such as robustness…
Danger: creating fruitless abstractions
10. Computational Systems Biology of Cancer
Specific flavor of systems biology
Object: cancer and cancer treatment
Tools:
1) High-throughput data with
particular emphasis on individual
genomic data,
2) Statistical analysis in large dimensions
3) Mathematical modeling (“what if”
questions)
Objective: prediction of cancer
treatment success in a concrete patient
(virtual tumour in virtual patient?)
11. Computational Systems Biology of Cancer
group at Institut Curie
Objective of our group: based on existing knowledge and
data, explain why certain mutations of the normal
genome can lead to tumorigenesis, and how to reverse
their effect
Tools:
Formal representation of biological knowledge
(map of cancer)
Mathematical modeling (“animation”) of biological
diagrams
Mechanistic models of epistasis (genetic interactions)
12. Cancer: hallmarks, networks and maps
Task: assemble this network
at its full complexity
Problems:
What language to use?
How to navigate?
How to maintain?
How to use?
Hanahan and Weinberg, 2011, Cell
13. Towards an
Atlas of Cancer Signaling Networks
• CellDesigner tool (diagram
editor for signaling network
representation)
• Systems Biology Graphical
Notation (SBGN) visual
syntax
Maps:
• RB/E2F-Cell Cycle (Calzone et al, Mol Syst Bio 2008)
• DNA repair-Cell Cycle (Kuperstein et al, unpublished)
• Cell Survival (Cohen et al, unpublished)
• Cell death-energy metabolism (Fourquet et al, unpublished)
• Coming: maps of EMT, motility, polarity, immune
response
14. NaviCell: Navigation and curation of
Atlas of Cancer Signaling Networks
NaviCell = Google map + Semantic zoom + Blog
NaviCell: a web tool for navigation, curation and maintenance of molecular interaction maps. http://navicell.curie.fr
Kuperstein I, Pook S, Cohen DPA, Calzone L, Barillot E and Zinovyev A (submitted) navicell@curie.fr
16. Pathway “staining” and Anna Karenina’s principle
506, G1, T1, noninvasive 1533-1, G3, T4, invasive 2307, G2, T2, invasive
870-1, normal 3721-10, normal 915-1, normal
17. Using the maps:
finding alternative routes
All paths of length <30 from succinate to DNA damage:
• Through ROS formation by the respiratory chain
• Through transfer of the reductive equivalents of succinate to NADPH and
thioredoxin, then ROS detoxification or RNR activity and DNA repair
• Through reduction of ubiquinone, the oxidative equivalents of which are
necessary for pyrimidine biosynthesis and DNA repair
(see Khutornenko AA et al., PNAS, 2010,107,12828)
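Searching a map for all routes between two entities can be sketched as a depth-limited enumeration of simple paths in a directed graph. The snippet below is only an illustration under stated assumptions: the toy graph, its node names and the function `paths_up_to` are hypothetical and are not taken from the actual map or from the group's software.

```python
from typing import Dict, List

def paths_up_to(graph: Dict[str, List[str]], source: str, target: str,
                max_len: int) -> List[List[str]]:
    """Return all simple directed paths from source to target with at most max_len edges."""
    found: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        if node == target:
            found.append(path)
            return
        if len(path) - 1 >= max_len:      # edge budget used up
            return
        for nxt in graph.get(node, []):
            if nxt not in path:           # keep paths simple (no repeated nodes)
                walk(nxt, path + [nxt])

    walk(source, [source])
    return found

# Hypothetical toy network, loosely named after the three routes on the slide.
toy = {
    "succinate": ["respiratory_chain", "NADPH", "ubiquinone"],
    "respiratory_chain": ["ROS"],
    "ROS": ["DNA_damage"],
    "NADPH": ["thioredoxin"],
    "thioredoxin": ["ROS", "RNR"],
    "RNR": ["DNA_damage"],
    "ubiquinone": ["pyrimidine_biosynthesis"],
    "pyrimidine_biosynthesis": ["DNA_damage"],
}

routes = paths_up_to(toy, "succinate", "DNA_damage", max_len=30)
short_routes = paths_up_to(toy, "succinate", "DNA_damage", max_len=3)
for r in routes:
    print(" -> ".join(r))
```

On a real map the number of simple paths grows quickly with the length limit, so the bound (here 30, as on the slide) is what keeps the search tractable.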
18. Example: Cell fate decision mechanism fragilities
utilized by cancers (Calzone et al, 2010)
Cancer types labelled on the diagram:
• Ewing’s sarcoma, lung cancer, neuroblastomas
• Lung cancers, cervical cancers, oesophageal squamous cell carcinomas
• Lymphomas
• Colorectal tumors
• Lymphomas, breast cancer
19. Compute phenotype probabilities using
state transition graphs
Influence graph = Asynchronous state transition graph
The probability to reach a final state from an initial state
= probability of observing a phenotype in experiment
Phenotypes: Apoptosis, Necrosis, Survival
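The idea of reading phenotype probabilities off an asynchronous state transition graph can be sketched on a toy model. Everything below is an assumption made for illustration: a hypothetical two-node mutual-inhibition network (not the cell fate model of the talk), uniform choice among the possible single-node updates, and the function names `successors` and `reach_probabilities`.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[int, ...]
Rule = Callable[[State], int]

def successors(state: State, rules: List[Rule]) -> List[State]:
    """Asynchronous update: each successor differs from state in exactly one node."""
    out = []
    for i, rule in enumerate(rules):
        new_value = rule(state)
        if new_value != state[i]:
            out.append(state[:i] + (new_value,) + state[i + 1:])
    return out

def reach_probabilities(initial: State, rules: List[Rule],
                        steps: int = 200) -> Dict[State, float]:
    """Propagate probability mass through the graph, choosing uniformly among successors.

    Only mass that ends in stable states is returned; cyclic attractors are ignored here.
    """
    dist: Dict[State, float] = {initial: 1.0}
    for _ in range(steps):
        nxt: Dict[State, float] = {}
        for state, p in dist.items():
            succ = successors(state, rules)
            if not succ:                          # stable state keeps its mass
                nxt[state] = nxt.get(state, 0.0) + p
            else:
                for s in succ:
                    nxt[s] = nxt.get(s, 0.0) + p / len(succ)
        dist = nxt
    return {s: p for s, p in dist.items() if not successors(s, rules)}

# Hypothetical toggle: node 0 is inhibited by node 1 and vice versa.
wild_type = [lambda s: 1 - s[1], lambda s: 1 - s[0]]
probs = reach_probabilities((0, 0), wild_type)

# A deletion can be mimicked by fixing a node's rule to 0.
node0_deleted = [lambda s: 0, lambda s: 1 - s[0]]
probs_mutant = reach_probabilities((0, 0), node0_deleted)
print(probs, probs_mutant)
```

In this toy case the wild type splits evenly between the two stable states, while the mutant always ends in the state where node 1 is on; comparing such distributions with observed phenotypes is the kind of check a mutant validation performs.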
20. Validate the model with mutants
TNF=1
Example : Caspase 8 deletion
• ≈ 85% survival (NFkB)
• ≈ 15% necrosis
• No apoptosis
Qualitatively consistent with the literature
“TNF-induced apoptosis is blocked though not necrosis”
[Kawahara, Ohsawa et al., J Cell Biol 1998]
(Jurkat cells, C8-/-)
Phenotypes in the figure: naïve survival, NFkB survival, apoptosis, necrosis
21. Synthetic lethality and cancer treatment:
hot topic in new anticancer drug development
If gene A is already mutated in cancer cells,
targeting B will specifically
kill cancer cells leaving normal
cells intact
Example: BRCA1+PARP
synthetic lethal pair
(PARP inhibitors,
Helleday, Carcinogenesis, 2010)
If gene A is amplified in cancer,
then one should look for
synthetic dosage lethality
There is a big promise here for stratified medicine
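The logic of a synthetic lethal pair can be written as a two-line Boolean rule. This is a deliberately crude sketch: the rule "a cell survives if at least one of the two genes still works" and the function names are assumptions for illustration, not a model of any specific gene pair.

```python
def viable(a_functional: bool, b_functional: bool) -> bool:
    """Toy rule: the cell survives if at least one of the two genes still works."""
    return a_functional or b_functional

def drug_outcome(a_mutated_in_tumour: bool) -> dict:
    """Viability of normal cells (A intact) and tumour cells when a drug inhibits B."""
    return {
        "normal": viable(a_functional=True, b_functional=False),
        "tumour": viable(a_functional=not a_mutated_in_tumour, b_functional=False),
    }

outcome = drug_outcome(a_mutated_in_tumour=True)
print(outcome)
```

Under this rule the drug is selective only for tumours that carry the mutation in gene A, which is why the approach lends itself to stratification by genotype.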
22. Example: Metastases in mouse model
of colon cancer
Experimental system: p53-null mouse
Colon cancer is associated with:
Mutations in APC gene (β-catenin/WNT pathway)
Mutations in RAS gene
Less frequent mutations in many other pathways
(Notch, MLH, PTEN, SMAD, etc.)
Question: which combinations of mutations in these pathways
lead to rapid metastatic tumorigenesis?
25. Synthetic interaction between p53 and overexpression of NICD
leads to EMT in a mouse model of metastasizing colon cancer
Figure panels: p53 is down; NICD is up; NICD is up and p53 is down
26. Take home message
Implementing personalized (stratified) medicine faces a number
of obstacles, including the complex response of cancer cells to
treatment
Understanding and predicting this response requires either a
“try and fail” approach
and/or
a more intelligent guess (systems biology)
Use of synthetic interactions (synthetic lethality) is a new
paradigm of individualized cancer treatment
27. Acknowledgements
Curie - INSERM U900 / Mines ParisTech
Computational Systems Biology of Cancer team:
Emmanuel Barillot
Valentina Boeva
Eric Bonnet
Laurence Calzone
David Cohen
Simon Fourquet
Inna Kuperstein
Loredana Martignetti
Tatiana Popova
Daniel Rovera
Meriem Sefta
Gautier Stoll
Bruno Tesson
Paola Vera-Licona
Collaborators:
Daniel Louvard (Institut Curie)
Sylvie Robine (Institut Curie)
Boris Zhivotovsky (Karolinska)
Wolf-Dietrich Heyer (UC Davis)
Alexander Gorban (Leicester, UK)
Funding:
MAE MOST-FI P2R
ANR SITCON
Ligue contre le cancer
EC FP7 APO-SYS
ANR CALAMAR
INCA SYBEWING
Curie-Servier Alliance
Institut des Systèmes Complexes
EC FP7 ASSET
INCA IVOIRES
INCA Breast cancer predisposition
Investissements d’avenir Bioinformatique ABS4NGS
EC FP7 RAID
Cancéropole IDF Data integration
ITMO cancer Systems Biology INVADE
PIC Computational Systems Biology of Cancer
29. A Physical Analysis of Microarray Data
G.W. Fraser
Space Research Centre, Department of Physics and Astronomy,
Michael Atiyah Building, University of Leicester, Leicester LE1 7RH, UK.
30. The Future of Biology is the Detection of Light
• Spin-off company since 2002 based on ESA/ESTEC optical STJ detector technology
• Disruptive hyperspectral imaging of unequalled sensitivity
• Operation at 0.3 K
• Hardware entry point to studies of basic fluorophore response and microarray analysis
[Figure: Self-quenching, S(n) against n, fluorophores/molecule]
[Figure: Comparison of measured and tabulated emission spectra, counts/10 nm/second against wavelength (nm), 450 to 800 nm]
Dyes labelled in the plots: Texas Red, Alexa 488, Fluorescein-EX, Alexa 546
31. The Microarray as a Two-Dimensional
Electronic Imaging Device
Microarrays exhibit a number of “confounding factors” familiar to the
detector physicist :
• Spatial non-uniformity (imperfect flat-field and fixed-pattern noise)
• Temporal variability (photobleaching)
• Integral Non-linearity (output not linearly dependent on input)
• Digital divide errors and preferred locations *
• Differential Non-linearity (non-uniform sensitivity) *
Data from:
(a) two-colour Red/Green Cy3,Cy5 spotted arrays
(SMD Blader3932 and Willert wnt3a)
(b) Affymetrix Genepix (TDF458 SMD)
(c) Quantile data (courtesy Dr J Luo, MRC Toxicology Unit / Tas Gohir)
46. BRISSKit context: The I4Health goal of applying knowledge engineering to close the
‘ICT gap’ between research and healthcare (Beck, T. et al 2012)
Data as a public good & research efficiencies
= strategic priority for government, NHS, funders (e.g. MRC, Wellcome,
CRUK)
47. Overview of BRISSKit
• Developing “software as a service” data
management infrastructure based on open-
source applications
• More efficient & easier for researchers
• Offers significant savings in research database
and IT support costs
• Development funded by HEFCE
• University of Leicester in partnership with the
University Hospitals Leicester Trust and the
Cardiovascular BRU
49. BRISSkit USPs
Integrated support for core research processes
Well-established mature open source applications as
prototyped in Cardiovascular: fully UK customised
A platform for seamless management and integration
between applications
An API allows integration with existing clinical systems
Easy set up, use and administration through browser
(including on mobile devices)
Capability of being hosted in any compliant cloud
provider including UHL (NHS information governance)
50. BRISSkit components = web services
CiviCRM
Enables end-to-end
contact management
for volunteers and
research participants,
tracking approaches,
contact, responses,
recruitment,
exclusions.
CiviCRM was designed
for the 'civic sector'
and has an object
model that reflects
community building
and non-profit
relationships.
51. OBiBa Onyx
Records participant
consent, questionnaire
data and primary
specimen IDs.
Web-based, secure
data entry by research
staff. E.g. used for all
patient recruits in
LCBRU – mobile
computing on wards
and outpatient clinic in
TMF.
Await significant new
release…
52. caTissue
Holds data on
primary, derived
and aliquot
specimen,
including linear
and 2d barcodes.
Storage
inventory, order
tracking –
currently over
30,000 LCBRU
samples stored
and recorded.
54. The semantic bridge
Bio-ontology!
OBiBa Onyx: Records participant consent, questionnaire
data and primary specimen IDs
?
i2b2: Cohort selection and data querying
56. Market: who is BRISSkit for?
Modular approaches and scalable tools with open
source licenses make good investments
• Individual researchers and associates
• enterprise-level tools without the IT overheads
• Research themes and departments
• stand-alone instances of required tools to
accelerate research
• Research units and centres
• integrated toolkit with clinical data loading
services, or 'jigsaw pieces' to complement existing
provision
59. Dogs, Cancer and Mathematics
An SME Perspective on University Collaboration.
Kevin Slater
Ilias Alexandrakis, Renu Tuli
Alexander Gorban, Evgeny Mirkes
61. Why Dogs? Why Lymphoma?
USA dog population = 78 Million
Canine Lymphoma - Incidence
- 20% of all canine tumours are lymphoma cases
- 0.1% of older dogs will develop lymphoma
- Very high incidence in some breeds, e.g. Golden Retrievers 25% in USA
Canine Lymphoma - Symptoms
• Lymphadenopathy
• Lethargy
• Weakness
• Fever
• Anorexia
• Polyuria/polydipsia (PU/PD)
62. Canine Lymphoma – Treatment
• Predominantly treated with chemotherapy
• Diverse range of treatment protocols
• Initially responds well to treatment
Canine Lymphoma – Prognosis
B-cell lymphoma has a more favorable prognosis than T-cell lymphoma
Clinical stage (Stage V has a poorer prognosis than Stage I)
Dogs treated with chemotherapy experience a longer survival
time
Recurrence almost inevitable
Presents a good model for Non-Hodgkin’s Lymphoma in humans
63. Canine Lymphoma Diagnosis
Cytology
Histology
Immunophenotyping (T or B cell)
• Generally invasive procedures
• FNA prone to non-diagnostic samples
• Not suitable for treatment monitoring
64. Serum Biomarkers
Serum easily accessible
Potential for picking up circulating biomarkers
Potential biomarkers identified for a diverse array of cancers:
Prostate cancer
Breast cancer
Melanoma
Developed a serum biomarker approach to assist with detection of
canine lymphoma
68. Data Processing
79 peaks identified on first pass.
More than 30 peaks with P<0.05 (Mann-Whitney U test)
between the two populations
Manual triage resulted in 19 candidate peaks for CART
analysis
Final algorithm focuses on two key biomarkers, 1 upregulated
and 1 downregulated
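The peak-filtering step (keep peaks whose intensities differ between the two populations at P<0.05) can be sketched as follows. This is not the software used in the study: the function names are hypothetical, the intensities are invented, and the p-value uses the plain normal approximation without tie or continuity correction, which is a simplification.

```python
import math
from typing import Dict, List, Sequence

def mann_whitney_p(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-sided p-value of the Mann-Whitney U test (normal approximation)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg = (i + j) / 2 + 1              # average 1-based rank for a block of ties
        for k in range(i, j + 1):
            ranks[k] = avg
        i = j + 1
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    u1 = r1 - n1 * (n1 + 1) / 2            # U statistic of the first sample
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))

def significant_peaks(cases: Dict[str, List[float]],
                      controls: Dict[str, List[float]],
                      alpha: float = 0.05) -> List[str]:
    """Names of peaks whose intensities differ between the two groups at level alpha."""
    return [name for name in cases
            if mann_whitney_p(cases[name], controls[name]) < alpha]

# Invented intensities for two peaks (not data from the study).
lymphoma = {"peak_A": [9.1, 8.7, 9.5, 10.2, 9.9, 8.9, 9.4, 10.0],
            "peak_B": [5.0, 5.2, 4.9, 5.1, 5.3, 4.8, 5.0, 5.1]}
control = {"peak_A": [6.0, 6.3, 5.8, 6.1, 6.4, 5.9, 6.2, 6.0],
           "peak_B": [5.1, 4.9, 5.0, 5.2, 5.0, 5.3, 4.8, 5.1]}
selected = significant_peaks(lymphoma, control)
print(selected)
```

With 79 peaks tested, some will pass P<0.05 by chance alone, which is one reason a manual triage or a multiple-testing correction follows such a filter.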
69. Classification and Regression Tree
Breiman L, Friedman JH, Olshen RA, Stone CJ.
Classification and Regression Trees.
Chapman & Hall (Wadsworth, Inc.): New York, 1984.
70. Bioinformatic model generation - CART
Initial Training Sample Set (Biomarker Identification):
Samples used to develop model (n=21) randomly selected:
10 non-lymphoma
11 lymphoma
Initial Test Sample Set (Biomarker Verification):
Samples used as independent test set (n= 158):
82 Non-lymphoma
76 Lymphoma
The algorithm was blind to these samples
72. Summary of Biomarker Identification Studies
• Protein sequence analysis identified 3 different biomarkers
• Limited information in the literature about the function of 2
biomarkers and their involvement in lymphoma
• Third biomarker identified as Haptoglobin, known to be
upregulated in canine lymphoma.
• No antibodies available to the unique biomarkers, therefore
had to work with human antibodies with poor cross
reactivity to the canine proteins.
75. Acute Phase Protein Response in Dogs
Infection Inflammation
Monocyte - Macrophage
IL-1 IL-6 TNF-α
C-RP
Haptoglobin
SAA
AGP
76. APP in Malignant Lymphoma
[Figure: C-reactive protein (mg/L) and haptoglobin (g/L) in dogs with lymphoma, ALL, CLL and myeloma and in controls, with outside and far outside values marked]
Significant difference from control: P <0.0001; P <0.0001; <0.0001; <0.001; <0.02
lymphoma (n=16), acute lymphoblastic leukaemia (ALL) (n=11),
chronic lymphocytic leukaemia (CLL) (n=7) and multiple myeloma (n=9), control (n=25)
Mischke et al Vet J 2006 174:188-92
77. From MS to ELISA
1) More than 19 protein peaks identified as significantly different on MS
2) Further investigation in order to characterise and identify the proteins
3) Identification of Haptoglobin
4) Development of a multi-marker test using Acute Phase Proteins (Haptoglobin, CRP)
5) Use of Biomarker Pattern Software to create unique algorithms
A unique new method of quickly and accurately diagnosing
canine lymphoma.
The combination of two Acute Phase Protein assays,
Haptoglobin and a specific canine CRP, with a
unique diagnostic algorithm provides a diagnostic system
78. Tri-Screen Assay Development
• Serum samples collected from dogs with lymphoma, healthy dogs and dogs
with other diseases (many with similar presentation to lymphoma). Positive
samples were confirmed by either FNA or excisional biopsy. Non lymphoma
dogs were confirmed to be free of the disease at a minimum of six months after
providing the serum sample
• Samples were tested in batches using HAPT & CRP assay kits
• Ciphergen Biomarker Pattern Software was used to generate a series of
algorithms using the Classification and Regression Tree (CART) procedure.
Through an iterative process, the software uses the training set of data to build
trees until optimal differentiation between the populations is
achieved.
• Blinded sample test performed.
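The core step of the CART procedure, choosing the threshold that best separates two populations by Gini impurity, can be sketched in a few lines. This is not the Ciphergen software: `gini` and `best_split` are hypothetical helper names, the (HAPT, CRP) values are invented, and a full CART would apply the split recursively and then prune the tree.

```python
from typing import List, Tuple

def gini(labels: List[int]) -> float:
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows: List[Tuple[float, ...]],
               labels: List[int]) -> Tuple[int, float, float]:
    """Return (feature index, threshold, weighted impurity) of the best binary split."""
    best = (-1, 0.0, gini(labels))         # start from the impurity of the unsplit node
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2              # candidate threshold between neighbours
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (f, t, score)
    return best

# Invented training rows: (HAPT g/L, CRP mg/L); label 1 = lymphoma, 0 = non-lymphoma.
rows = [(1.0, 5.0), (1.5, 8.0), (2.0, 4.0), (6.0, 30.0), (7.0, 6.0), (8.0, 45.0)]
labels = [0, 0, 0, 1, 1, 1]
split = best_split(rows, labels)
print(split)
```

On this toy set the search picks the first feature with a threshold halfway between the two groups; a blinded test set, as on the slide, is what shows whether such a rule generalises.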
82. Developments with The University of Leicester
Database:
Lymphoma – 97, Other disease – 135, Healthy – 71
Two cohorts and their problems:
• Clinically suspected: differential diagnosis
• Healthy: screening
Challenge: The Estimation of Lymphoma Risk
83. Methodologies
K nearest neighbours
• Classic kNN with k from 1 to 30
• kNN with Fisher’s distance transformations
• kNN with adaptive distance transformations
Decision tree
• Information gain (C4.5)
• Gini gain (CART)
• DKM
Probability density function estimation
• Radial-basis function (statistics kernel)
• Three random values (Lymphoma, Other diseases, Healthy)
Risk maps: x-axis CRP, y-axis Hapt
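A risk map of the classic kNN kind can be sketched as the fraction of lymphoma cases among the k nearest database samples, evaluated on a grid of (CRP, Hapt) values. The samples below are invented, `knn_risk` is a hypothetical name, and plain Euclidean distance is used; the Fisher and adaptive distance transformations listed on the slide are not reproduced here.

```python
import math
from typing import List, Tuple

Sample = Tuple[Tuple[float, float], int]   # ((CRP, Hapt), label), label 1 = lymphoma

def knn_risk(point: Tuple[float, float], samples: List[Sample], k: int) -> float:
    """Fraction of lymphoma cases among the k samples nearest to point."""
    ranked = sorted(samples, key=lambda s: math.dist(point, s[0]))
    return sum(label for _, label in ranked[:k]) / k

# Invented database entries (not data from the study).
samples: List[Sample] = [
    ((5.0, 1.0), 0), ((8.0, 1.5), 0), ((4.0, 2.0), 0), ((6.0, 1.2), 0),
    ((40.0, 6.0), 1), ((55.0, 7.5), 1), ((35.0, 8.0), 1), ((60.0, 5.5), 1),
]

high = knn_risk((50.0, 7.0), samples, k=3)
low = knn_risk((5.0, 1.5), samples, k=3)

# Coarse risk map: rows are Hapt values, columns are CRP values.
risk_map = [[knn_risk((float(crp), float(hapt)), samples, k=3)
             for crp in range(0, 61, 20)]
            for hapt in range(0, 9, 4)]
print(high, low, risk_map)
```

Because CRP and Hapt are on different scales, the raw Euclidean distance is dominated by CRP; rescaling the axes is the simplest form of the distance transformation the slide refers to.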
84. Software tools
Microsoft Excel: database maintenance
• Add new data
• Delete old data
Canine lymphoma software:
Selection of the best methods for
each problem and input data set.
Best solutions are exported to the applet
Diagnosis applet:
Providing access for
practitioner vets
86. Summary
• MS and other proteomic work confirmed already known findings that APP
levels are increased in canine lymphoma
• Application of CART algorithms is able to confer improved specificity over
previously non-specific APP assays.
• Facilitated the development of a useful test kit to aid in the differential
diagnosis of lymphoma in dogs.
• The delivery and performance of this test has been dramatically enhanced
through working with the Dept of Mathematics at the University of
Leicester.
• We have so far been unable to produce reliable canine ELISAs for the
two previously unknown biomarkers discovered in the MS work.
• However, we have very good ELISAs for these markers in human blood.
• Now embarking on a study of these markers in human NHL
88. From Visualisation to Prediction
using Data
Professor Jeremy Levesley
Department of Mathematics
University of Leicester
Stratified Medicine, January 2013
www.le.ac.uk
89. Data Mining: Confluence of Multiple
Disciplines
Data Mining draws on:
• Database Technology
• Statistics
• Machine Learning
• Visualization
• Information Science
• Other Disciplines
90. What do we have
• Practical experience in Data Mining for Medical
Datasets (~40 expert and diagnostic systems;
main techniques: Neural Networks, Cluster
Analysis, Visualization)
• New algorithms for Data Approximation and
Visualization
• Fast algorithms for Neural Networks
91. Growing principal tree:
branching data distribution
Iris data set
Toy data set
Together with A. Zinovyev (Curie, Paris)
93. The process
• Data consolidation and preparation
• Data selection and preprocessing
• Data mining tasks and methods
• Automated exploration and discovery
• Prediction and classification
• Interpretation and evaluation
• Visualization tools can be very helpful
95. www.spaceideashub.com
enquiries@spaceideashub.com
0116 229 7700
Contact us for a FREE 2‐day
project, problem evaluation and consulting
Space IDEAS Hub
@spaceideashub