CDAC 2018 Ciccolella inferring
1. Inferring cancer progression from single-cell sequencing while allowing loss of mutations
Simone Ciccolella, Mauricio Soto Gomez, Murray Patterson,
Gianluca Della Vedova, Iman Hajirasouliha and Paola Bonizzoni
Lake Como School of Advanced Studies, 2018
3. Cancer evolution
Simone Ciccolella CDAC 2018
• Different clones make up different fractions of the tumor
• Accumulation of mutations over time
• Detecting the evolutionary history of a tumor is key to developing targeted therapies
4. Cancer evolution
Picture from: Ding et al., Nature, 2012.
5. Infinite Sites Assumption
• The most common assumption in the inference of cancer evolution
• Permits the use of the simplest phylogeny model
• Easiest model from a computational perspective
No two mutations can occur at the same locus (site). Kimura, Genetics, 1969.
6. Infinite Sites Assumption
• “Our results refute the general validity of the infinite sites assumption and indicate that more complex models
are needed to adequately quantify intra-tumor heterogeneity for more effective cancer treatment.”
From: Single-cell sequencing data reveal widespread recurrence and loss of mutational hits in the life histories of tumors.
Kuipers et al., Genome Research, 2017.
• “In genomically unstable cancers, deletion of large chromosomal segments is common.”
From: Phylogenetic analysis of metastatic progression in breast cancer using somatic mutations and copy number aberrations.
Brown et al., Nature Communications 8, 2017.
7. Phylogenies: Perfect vs Dollo
Perfect Phylogeny: each mutation is acquired once in the evolutionary history.
[Figure: example perfect phylogeny tree]
8. Phylogenies: Perfect vs Dollo
Perfect Phylogeny: each mutation is acquired once in the evolutionary history.
Dollo(k) Phylogeny: each mutation is acquired once, but it can be lost at most k times in the evolutionary history.
[Figure: a perfect phylogeny tree beside a Dollo(k) tree in which nodes A1–, B1– and A2– mark losses of mutations A and B]
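The difference between the two models can be checked mechanically. The following is a minimal sketch, assuming a toy encoding (not taken from the slides) in which every non-root node carries one event, either the gain or the loss of a mutation. The names TREE, genotype and is_dollo are hypothetical.

```python
from collections import Counter

# Hypothetical toy tree: node -> (parent, (event kind, mutation)).
# The root has no parent and carries no event.
TREE = {
    "root": (None, None),
    "a": ("root", ("gain", "A")),
    "b": ("a", ("gain", "B")),
    "a_lost": ("b", ("loss", "A")),
    "c": ("a_lost", ("gain", "C")),
}

def genotype(tree, node):
    """Return the set of mutations present at `node`, replaying events from the root."""
    events = []
    while node is not None:
        parent, event = tree[node]
        if event is not None:
            events.append(event)
        node = parent
    present = set()
    for kind, mutation in reversed(events):  # root-to-node order
        if kind == "gain":
            present.add(mutation)
        else:
            present.discard(mutation)
    return present

def is_dollo(tree, k):
    """True if every mutation is gained exactly once and lost at most k times."""
    gains, losses = Counter(), Counter()
    for _parent, event in tree.values():
        if event is None:
            continue
        kind, mutation = event
        (gains if kind == "gain" else losses)[mutation] += 1
    gained_once = all(count == 1 for count in gains.values())
    losses_ok = all(m in gains and count <= k for m, count in losses.items())
    return gained_once and losses_ok
```

With k = 0 the check reduces to the perfect phylogeny condition, so the toy tree above fails is_dollo(TREE, 0) and passes is_dollo(TREE, 1). The sketch only counts events; it does not verify that a loss sits below the corresponding gain.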
9. Loss of a mutation
[Figure sequence, slides 9 to 14: illustration of the loss of a mutation]
16. Available methods for SCS
SCITE [1]:
• Markov Chain Monte Carlo (MCMC) maximum likelihood tree search
• Relies on the Perfect Phylogeny model
• Produces solutions that respect the Infinite Sites Assumption
[1] Tree inference for single-cell data.
Jahn K., Kuipers J. and Beerenwinkel N., Genome Biology, 2016.
17. Available methods for SCS
SiFit [2]:
• Hidden Markov Model (HMM) maximum likelihood tree search
• Does not impose any specific phylogeny model
• Can produce solutions that violate the Infinite Sites Assumption
[2] SiFit: inferring tumor trees from single-cell sequencing data under finite-sites models.
Zafar H., Tzen A., Navin N., Chen K. and Nakhleh L., Genome Biology, 2017.
18. SASC: Methods
SASC is a Simulated Annealing maximum likelihood tree search algorithm:
• Optimization criterion:
$$\max \sum_{i} \sum_{j} \log P(I_{ij} \mid E_{ij})$$
• Simulated annealing is a heuristic method for approximating the global optimum of a given function in a large search space.
• The algorithm moves from one solution to a new one using a set of predefined moves.
• If the new solution is better, it is accepted with probability 1; otherwise the acceptance probability depends on the temperature:
$$p = e^{-\frac{\text{old solution} - \text{new solution}}{\text{temperature}}}$$
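The optimization criterion and the acceptance rule on this slide can be sketched in a few lines. This is a hedged illustration, not the SASC implementation: it assumes that I is the observed binary matrix, that E is the matrix implied by a candidate tree, that solutions are scored by log-likelihood (higher is better), and that errors follow a simple false-positive/false-negative model. The rates alpha and beta and all function names are assumptions.

```python
import math
import random

def log_likelihood(observed, expected, alpha=0.01, beta=0.2):
    """Sum of log P(I_ij | E_ij) over all cells i and mutations j.

    alpha (false-positive rate) and beta (false-negative rate) are
    illustrative values, not parameters taken from the slides.
    """
    prob = {
        (0, 0): 1 - alpha,  # observed 0, expected 0: true negative
        (1, 0): alpha,      # observed 1, expected 0: false positive
        (0, 1): beta,       # observed 0, expected 1: false negative
        (1, 1): 1 - beta,   # observed 1, expected 1: true positive
    }
    return sum(
        math.log(prob[(i_val, e_val)])
        for i_row, e_row in zip(observed, expected)
        for i_val, e_val in zip(i_row, e_row)
    )

def acceptance_probability(old_score, new_score, temperature):
    """Probability of moving from the old solution to the new one."""
    if new_score >= old_score:
        return 1.0  # an equal or better solution is always accepted
    # A worse solution has new_score < old_score, so the exponent is negative.
    return math.exp((new_score - old_score) / temperature)

def accept(old_score, new_score, temperature, rng=random):
    """Draw the accept/reject decision for one proposed move."""
    return rng.random() < acceptance_probability(old_score, new_score, temperature)
```

As the temperature decreases, the same drop in score is accepted less often, which is what lets the search explore early on and settle later.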
19. SASC: Moves – Subtree Prune and Regraft
[Figure: example tree over nodes A to I, with the subtree to prune and the regraft point marked]
20. SASC: Moves – Subtree Prune and Regraft
[Figure: the same tree before and after the subtree is pruned and regrafted]
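A subtree prune and regraft move can be sketched on a child-to-parent map. The tree below is a hypothetical example over the same node names, not a reconstruction of the figure, and the function names are assumptions.

```python
# Hypothetical example tree: child -> parent (the root maps to None).
TREE = {
    "A": None,
    "B": "A", "C": "A",
    "D": "B", "E": "B", "F": "C",
    "G": "D", "H": "F", "I": "F",
}

def descendants(parent, node):
    """Return `node` together with every node below it."""
    found = {node}
    changed = True
    while changed:
        changed = False
        for child, par in parent.items():
            if par in found and child not in found:
                found.add(child)
                changed = True
    return found

def prune_and_regraft(parent, node, new_parent):
    """Return a new tree with the subtree rooted at `node` moved under `new_parent`."""
    if new_parent not in parent:
        raise KeyError(new_parent)
    if parent[node] is None:
        raise ValueError("cannot prune the root")
    if new_parent in descendants(parent, node):
        raise ValueError("regraft point lies inside the pruned subtree")
    moved = dict(parent)      # leave the input tree untouched
    moved[node] = new_parent  # the whole subtree follows its root
    return moved

# Move the subtree rooted at F (with H and I) from under C to under E.
MOVED = prune_and_regraft(TREE, "F", "E")
```

Returning a copy rather than editing in place makes it easy to discard a rejected move during the annealing search.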
21. SASC: Moves – Swap nodes labels
[Figure: example tree over nodes A to I before the move]
22. SASC: Moves – Swap nodes labels
[Figure: the tree before and after swapping the labels of nodes B and F]
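Swapping node labels leaves the shape of the tree unchanged and only exchanges which mutation sits at which position. A minimal sketch, assuming a hypothetical encoding with numeric node ids and a separate id-to-label map:

```python
# Hypothetical example: topology on node ids, labels stored separately.
PARENT = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 4}
LABELS = {0: "A", 1: "B", 2: "C", 3: "D", 4: "E", 5: "F"}

def swap_labels(labels, u, v):
    """Return a new label map with the labels of nodes `u` and `v` exchanged."""
    swapped = dict(labels)
    swapped[u], swapped[v] = labels[v], labels[u]
    return swapped

def path_labels(parent, labels, node):
    """Labels on the path from the root down to `node`."""
    path = []
    while node is not None:
        path.append(labels[node])
        node = parent[node]
    return path[::-1]

SWAPPED = swap_labels(LABELS, 1, 5)
before = path_labels(PARENT, LABELS, 5)   # ['A', 'B', 'E', 'F']
after = path_labels(PARENT, SWAPPED, 5)   # ['A', 'F', 'E', 'B']
```

In this example the swap reverses the ancestor-descendant order of B and F while every other relationship in the tree stays the same.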
23. SASC: Moves – Add a deletion
[Figure: example tree before the move]
24. SASC: Moves – Add a deletion
[Figure: the tree before and after adding the deletion node F1–, which marks a loss of mutation F]
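Adding a deletion can be sketched as inserting a new loss node below an existing node, subject to the Dollo(k) bound. The encoding, the validity checks and the names below are assumptions made for illustration; the slides do not specify how SASC chooses where the deletion goes or which children move beneath it.

```python
def make_node(parent, label, loss=False):
    return {"parent": parent, "label": label, "loss": loss}

# Hypothetical example tree, not a reconstruction of the figure.
TREE = {
    "A": make_node(None, "A"),
    "F": make_node("A", "F"),
    "C": make_node("A", "C"),
    "B": make_node("F", "B"),
    "I": make_node("B", "I"),
}

def path_to_root(tree, node):
    """Yield the node records from `node` up to the root."""
    while node is not None:
        yield tree[node]
        node = tree[node]["parent"]

def add_deletion(tree, mutation, below, k, adopted=()):
    """Return a copy of `tree` with a node that loses `mutation` under `below`.

    Children of `below` listed in `adopted` are re-attached beneath the new node.
    """
    path = list(path_to_root(tree, below))
    if not any(n["label"] == mutation and not n["loss"] for n in path):
        raise ValueError("mutation was never gained above this point")
    if any(n["label"] == mutation and n["loss"] for n in path):
        raise ValueError("mutation is already lost on this path")
    used = sum(1 for n in tree.values() if n["loss"] and n["label"] == mutation)
    if used >= k:
        raise ValueError("Dollo(k) bound reached for this mutation")
    index = 1
    while f"{mutation}{index}-" in tree:  # pick an unused node name
        index += 1
    new_id = f"{mutation}{index}-"
    updated = {name: dict(node) for name, node in tree.items()}
    updated[new_id] = make_node(below, mutation, loss=True)
    for child in adopted:
        if tree[child]["parent"] != below:
            raise ValueError("adopted nodes must be children of `below`")
        updated[child]["parent"] = new_id
    return updated

# Lose F below B and move I under the new deletion node.
WITH_LOSS = add_deletion(TREE, "F", below="B", k=1, adopted=["I"])
```

The checks ensure that a mutation is only lost where it is actually present, and never more than k times across the whole tree.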
25. SASC: Moves – Remove a deletion
[Figure: example tree containing the deletion node F1–]
26. SASC: Moves – Remove a deletion
[Figure: the tree before and after removing the deletion node F1–]
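Removing a deletion is the inverse operation: the loss node disappears and its children move up to its parent. A minimal sketch under an assumed encoding (node name mapped to parent, label and a loss flag); the example tree is hypothetical.

```python
def make_node(parent, label, loss=False):
    return {"parent": parent, "label": label, "loss": loss}

# Hypothetical example tree with one deletion node, F1-.
TREE = {
    "A": make_node(None, "A"),
    "F": make_node("A", "F"),
    "C": make_node("A", "C"),
    "B": make_node("F", "B"),
    "F1-": make_node("B", "F", loss=True),
    "I": make_node("F1-", "I"),
}

def remove_deletion(tree, loss_node):
    """Return a copy of `tree` without `loss_node`; its children move up one level."""
    if not tree[loss_node]["loss"]:
        raise ValueError("node does not represent a deletion")
    grandparent = tree[loss_node]["parent"]
    updated = {}
    for name, node in tree.items():
        if name == loss_node:
            continue  # drop the deletion node itself
        copy = dict(node)
        if copy["parent"] == loss_node:
            copy["parent"] = grandparent  # re-attach orphaned children
        updated[name] = copy
    return updated

WITHOUT_LOSS = remove_deletion(TREE, "F1-")
```

After the move, node I hangs directly under B again, so cells attached to I would be expected to carry mutation F once more.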
28. SASC: Assignment of cells
[Figure: the tree over nodes A to I with cells cell1 to cell8 attached to its nodes]
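One plausible way to attach cells to a tree, consistent with a maximum likelihood objective, is to place each cell at the node whose genotype best explains the cell's observed profile. The sketch below assumes this rule together with a simple false-positive/false-negative error model; the rates, the toy genotypes and the function names are illustrative and not taken from the slides.

```python
import math

ALPHA = 0.01  # assumed false-positive rate
BETA = 0.2    # assumed false-negative rate

def cell_log_likelihood(observed, genotype):
    """Log-probability of an observed 0/1 profile given a node genotype."""
    prob = {(0, 0): 1 - ALPHA, (1, 0): ALPHA, (0, 1): BETA, (1, 1): 1 - BETA}
    return sum(math.log(prob[(o, g)]) for o, g in zip(observed, genotype))

def assign_cells(cells, genotypes):
    """Map each cell to the node with the highest log-likelihood."""
    return {
        cell: max(genotypes, key=lambda node: cell_log_likelihood(profile, genotypes[node]))
        for cell, profile in cells.items()
    }

# Toy data over mutations (A, B, C): node -> genotype, cell -> observed profile.
GENOTYPES = {"A": (1, 0, 0), "B": (1, 1, 0), "C": (1, 0, 1)}
CELLS = {
    "cell1": (1, 1, 0),
    "cell2": (1, 0, 1),
    "cell3": (1, 0, 0),
    "cell4": (0, 1, 0),  # A is missing here, most cheaply explained as a false negative
}

ASSIGNMENT = assign_cells(CELLS, GENOTYPES)
```

In this toy data cell4 lands on node B: one false negative on A costs less than the false positive on B that node A or C would require.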
29. Results: Simulated data
• Ancestor-Descendant accuracy:
Pairs of mutations in Ancestor-Descendant relationship
correctly inferred
• Different lineages accuracy:
Pairs of mutations in different branches correctly inferred
F1 scores
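The two measures can be computed by comparing sets of mutation pairs between the true tree and the inferred tree. The sketch below is one plausible reading of the definitions above, with trees given as child-to-parent maps over mutations; the example trees and function names are hypothetical.

```python
from itertools import combinations

def ancestors(parent, node):
    """All strict ancestors of `node`."""
    found = set()
    node = parent[node]
    while node is not None:
        found.add(node)
        node = parent[node]
    return found

def ancestor_descendant_pairs(parent):
    """Ordered pairs (a, d) such that a is an ancestor of d."""
    return {(a, d) for d in parent for a in ancestors(parent, d)}

def different_lineage_pairs(parent):
    """Unordered pairs of mutations that lie on different branches."""
    related = ancestor_descendant_pairs(parent)
    return {
        frozenset((x, y))
        for x, y in combinations(parent, 2)
        if (x, y) not in related and (y, x) not in related
    }

def f1_score(truth, predicted):
    """Harmonic mean of precision and recall between two sets of pairs."""
    true_positives = len(truth & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)

# Toy example: the inferred tree wrongly places C below B.
TRUE_TREE = {"A": None, "B": "A", "C": "A", "D": "B"}
INFERRED_TREE = {"A": None, "B": "A", "C": "B", "D": "B"}

ad_f1 = f1_score(ancestor_descendant_pairs(TRUE_TREE),
                 ancestor_descendant_pairs(INFERRED_TREE))
dl_f1 = f1_score(different_lineage_pairs(TRUE_TREE),
                 different_lineage_pairs(INFERRED_TREE))
```

For these toy trees the ancestor-descendant F1 is 8/9 and the different-lineages F1 is 2/3: a single misplaced mutation affects both measures.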
30. Results: Simulated data
• Accuracy of deletion detection:
Classification accuracy of mutational losses
(*) SCITE and SiFit detect no deletion
31. Results: Real data
Lymphoblastic Leukemia. Data from: Dissecting the clonal origins of childhood acute lymphoblastic leukemia by single-cell genomics. Gawad et al., Proceedings of the National Academy of Sciences, 2014.
Breast Cancer. Data from: Single-cell RNA-seq enables comprehensive tumour and immune cell profiling in primary breast cancer. Chung et al., Nature Communications, 2017.
Bold-faced mutations are driver mutations
32. Conclusions
• SASC is an accurate tool for inferring intra-tumor progression and subclonal composition from SCS
data
• SASC is highly accurate on both simulated and real datasets
• SASC infers mutation losses employing a Dollo model
• SASC provides a new progression model on Single Cell data
• Future directions:
1. Explore different heuristics (Genetic Programming)
2. Define new methods that reduce the dimensions and complexity of the search space
3. Find a better name for the tool