1) The document summarizes research on analyzing the genetic code found in cancers. It discusses how the genetic code is being decoded through advances in sequencing technology that have allowed researchers to read the 3.2 billion letters in the human genome.
2) Mutations or "bugs" in the genetic code can cause cancers by disrupting genes involved in processes like cell growth. New sequencing methods are helping researchers precisely locate mutations and determine which may drive tumor growth.
3) Interpreting the sequencing data remains a challenge, but identifying recurrent mutations and comparing mutations across samples and genomes can help distinguish driver mutations from harmless changes.
2. The Code
♂
♀
• A code passed from cell to cell
• ~3.2 billion ‘letters’ in total
• Encodes about 22,000 proteins
Image Credit: National Human Genome Research Institute
3. ‘Bugs’ in the Code
Mutations in the Genetic Code
GACCTGGCAGCCAGGAACGTACTGGT
• Vast majority have no consequence…
…but occasionally…
GACCTGGCAGCC----ACGTACTGGT
• Alteration causes a selective growth advantage, increasing the ratio of cell birth to cell death.
[Figure: hallmarks of cancer. Sustain proliferative signalling; Evade growth suppressors; Avoid immune destruction; Deregulating cellular energetics; Enabling replicative immortality; Tumor-promoting inflammation; Resisting cell death; Genome instability & mutation; Promoting local blood supply; Activating invasion & metastasis]
Hanahan D & Weinberg R (2011) Hallmarks of Cancer: The Next Generation. Cell, Volume 144, Issue 5, Pages 646-674
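The two sequences on this slide differ by a short deletion. As a minimal sketch, assuming the mutated sequence is already aligned to the reference with '-' marking missing bases (as drawn on the slide), the deleted bases can be located like this. The function name `find_deletions` is hypothetical and chosen only for illustration.

```python
def find_deletions(reference: str, aligned_sample: str) -> list[tuple[int, str]]:
    """Return (start, deleted_bases) for every run of '-' in the aligned sample."""
    if len(reference) != len(aligned_sample):
        raise ValueError("sequences must be aligned to the same length")
    deletions = []
    i = 0
    while i < len(aligned_sample):
        if aligned_sample[i] == "-":
            j = i
            # Extend to the end of this run of gap characters.
            while j < len(aligned_sample) and aligned_sample[j] == "-":
                j += 1
            deletions.append((i, reference[i:j]))
            i = j
        else:
            i += 1
    return deletions


reference = "GACCTGGCAGCCAGGAACGTACTGGT"
sample = "GACCTGGCAGCC----ACGTACTGGT"
deletions = find_deletions(reference, sample)
```

Here `deletions` holds one entry: the four bases `AGGA` missing from 0-based position 12 onwards.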
4. Reading the Code: 1
[Figure: cost per genome (log scale, $100 to $100,000,000) and size of the Short Read Archive (Tb, 0 to 1400), plotted by year from 2001 to 2014. Annotated milestones: Draft Human Genome Project; ‘Massively Parallel’ sequencing; 1st Tumour; >12,000 tumours]
Wetterstrand KA. DNA Sequencing Costs: Data from the NHGRI Genome Sequencing Program (GSP) [27/01/2014]
Wang and Wheeler (2014) Genomic Sequencing for Cancer Diagnosis and Therapy Annu. Rev. Med. 65: 33-48
5. Reading the Code: 2
Challenge: Rate of data production has overtaken improvements in long-term storage capacity.
Response: Novel Compression Algorithms
Stein (2010) The case for cloud computing in genome informatics Genome Biology, 11:207
Bonfield JK, Mahoney MV (2013) Compression of FASTQ and SAM Format Sequencing Data. PLoS ONE 8(3): e59190.
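To give a feel for why sequence data compresses well, here is a minimal sketch of fixed 2-bit packing: each of the four bases needs only two bits rather than a full text character. This is an illustration only, not the method of the papers cited on this slide. It assumes the sequence contains only A, C, G and T, and it ignores quality scores and ambiguous bases, which real FASTQ and SAM compressors must handle. The names `pack` and `unpack` are hypothetical.

```python
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte (2 bits each)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        # Left-align a short final chunk so unpacking is uniform.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


seq = "GACCTGGCAGCCAGGAACGTACTGGT"
packed = pack(seq)
restored = unpack(packed, len(seq))
```

The 26-letter example fits in 7 bytes instead of 26, and the original length has to be stored alongside the packed bytes so the padding can be trimmed.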
6. Reading the Code: 3
‘I would say the Human Genome Project is probably more significant than splitting the atom or going to the moon.’ Francis Collins, CNN.
Shatter & Scan (X 300M)
Fragments: probab, ing to, ignifi, y more, ly mor, obably, Genome, the Hu, ifican, Proje, itting, signif, ing to, oject, roject
Identify Original
‘the Human Genome Project … probably more significant’
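The shatter-and-reassemble idea can be sketched on the quote itself. The toy example below cuts a phrase into overlapping reads, shuffles them, and greedily merges the pair with the longest suffix-to-prefix overlap until one piece remains. The read length, step, minimum overlap and function names are all arbitrary assumptions for illustration. It works here only because this phrase has no repeated three-letter stretch; real genomes are highly repetitive and real assemblers are far more sophisticated.

```python
import random


def shatter(text: str, read_len: int, step: int) -> list[str]:
    """Cut text into overlapping reads, making sure the end is covered."""
    starts = list(range(0, len(text) - read_len + 1, step))
    if starts[-1] != len(text) - read_len:
        starts.append(len(text) - read_len)
    return [text[s:s + read_len] for s in starts]


def overlap(left: str, right: str, min_len: int) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of pieces with the largest overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left that overlaps enough to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


original = "the Human Genome Project is probably more significant"
reads = shatter(original, read_len=12, step=4)
random.Random(0).shuffle(reads)  # the order of the reads carries no information
contigs = greedy_assemble(reads)
```

After the loop, `contigs` contains a single string equal to `original`, regardless of the shuffle order.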
8. Placing the Code Snippets
• Encodes a protein change in the EGFR gene
• This particular coding change (‘L858R’) renders the tumour sensitive to targeted therapy (Afatinib)
• ‘Personalised medicine’
D Gonzalez de Castro, P A Clarke, B Al-Lazikani and P Workman (2013) Personalized Cancer Medicine: Molecular Diagnostics, Predictive Biomarkers, and Drug Resistance. Clinical Pharmacology & Therapeutics 93(3): 252–259
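Placing a snippet means finding where a short read best matches a known reference. A minimal sketch, assuming ungapped placement scored by the number of mismatching letters, is shown below. The reference reuses the toy sequence from the earlier slide and the read carries one made-up substitution; neither is claimed to be the actual L858R site. The function name `place_read` is hypothetical, and production aligners use indexed, gapped methods instead of this exhaustive scan.

```python
def place_read(reference: str, read: str) -> tuple[int, int]:
    """Return (offset, mismatches) for the best ungapped placement of `read`."""
    best_offset, best_mismatches = -1, len(read) + 1
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = sum(a != b for a, b in zip(window, read))
        # Strict '<' keeps the leftmost placement when scores tie.
        if mismatches < best_mismatches:
            best_offset, best_mismatches = offset, mismatches
    return best_offset, best_mismatches


reference = "GACCTGGCAGCCAGGAACGTACTGGT"
read = "CAGCCTGGAACG"  # reference[7:19] with one illustrative substitution (A to T)
offset, mismatches = place_read(reference, read)
```

The read lands at 0-based offset 7 with a single mismatch, which is the candidate variant that a downstream caller would then need to assess.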
9. Identifying Bugs, confidently
• “What are we missing?” The influence of sample heterogeneity, purity and read depth.
• ‘False’ mutations: sequencing errors & inaccurate alignment.
• Mutation calling is a work in progress.
• Crowdsourcing a solution.
Cibulskis et al (2013) Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nature Biotechnology 31, 213–219
Roberts et al (2013) A comparative analysis of algorithms for somatic SNV detection in cancer. Bioinformatics 2013;29:2223-2230
Caldas C (2012) Cancer sequencing unravels clonal evolution. Nature Biotechnology 30, 408–410
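The effect of purity and read depth on what gets missed can be sketched with a simple binomial model. The assumptions are strong and stated loudly: the mutation is heterozygous in a diploid region, reads sample the two alleles at random, there are no sequencing errors, and a caller needs a fixed minimum number of variant reads. The function names and the threshold of 3 reads are hypothetical.

```python
from math import comb


def expected_vaf(purity: float, clonal_fraction: float = 1.0) -> float:
    """Expected variant allele fraction: heterozygous mutation, diploid region."""
    return purity * clonal_fraction / 2


def detection_probability(depth: int, vaf: float, min_alt_reads: int) -> float:
    """P(at least `min_alt_reads` variant reads) under binomial sampling."""
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1 - p_fewer


# Compare a pure, clonal sample with an impure, subclonal one at the same depth.
pure = detection_probability(30, expected_vaf(1.0), min_alt_reads=3)
impure = detection_probability(30, expected_vaf(0.3, clonal_fraction=0.5), min_alt_reads=3)
deeper = detection_probability(200, expected_vaf(0.3, clonal_fraction=0.5), min_alt_reads=3)
```

Under this model, `impure` is lower than `pure` at the same depth, and `deeper` recovers much of the lost sensitivity, which is one way to read the slide's question.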
10. Interpreting the Code
Distinguishing between mutations that confer a selective advantage and those that are selectively neutral.
• Recurrent mutations
• Account for variable background mutation rates
• Comparative genomics
Imielinski M (2012) Mapping the hallmarks of lung adenocarcinoma with massively parallel sequencing. Cell. 2012 Sep 14;150(6):1107-20.
Wang L & Wheeler DA. (2014) Genomic sequencing for cancer diagnosis and therapy. Annu Rev Med. 65:33-48.
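One way to see why recurrence must be judged against the background mutation rate is a small binomial calculation. The sketch below asks how surprising it is to see a gene mutated in at least a given number of samples if mutations arose only by chance. It is an illustration under simple assumptions (independent samples, one uniform per-base rate per gene), not the method used in the cited studies. All numbers and function names are hypothetical.

```python
from math import comb


def p_sample_mutated(rate_per_base: float, gene_length: int) -> float:
    """Chance that a sample carries at least one background mutation in the gene."""
    return 1 - (1 - rate_per_base) ** gene_length


def recurrence_p_value(
    n_samples: int, n_mutated: int, rate_per_base: float, gene_length: int
) -> float:
    """P(at least `n_mutated` of `n_samples` are mutated) by chance alone."""
    p = p_sample_mutated(rate_per_base, gene_length)
    return sum(
        comb(n_samples, k) * p**k * (1 - p) ** (n_samples - k)
        for k in range(n_mutated, n_samples + 1)
    )


# Same observation (5 of 100 samples), two hypothetical background rates.
low_background = recurrence_p_value(100, 5, rate_per_base=1e-6, gene_length=3000)
high_background = recurrence_p_value(100, 5, rate_per_base=1e-5, gene_length=3000)
```

The same count is far less surprising when the gene sits in a region with a high background rate, so `high_background` is larger than `low_background`.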
11. Thank you for your attention
Particular thanks go to:
BABS, Prof. Swanton
& above all to the patients