Predicting Stroke Patient Recovery from Brain Images: A Machine Learning Approach

Alastair Smith
Supervised by Prof. Glyn Humphreys

Objectives

 Can machine learning techniques applied to Computed
Tomography (CT) brain imaging data provide meaningful
predictions of functional recovery in stroke patients?

 Which of several machine learning techniques provides the most accurate predictions?

 What aspects of the images are utilised by the machine learning algorithms to inform
predictions?

Introduction

Stroke: The Consequences

Impact in the U.K. (National Stroke Strategy, 2007)
 Every year approximately 110,000 people in England have a stroke, with over 900,000
people currently living in England who have had a stroke.
 Stroke is the single largest cause of adult disability with a third of people who have a
stroke left with long-term disability.
 Stroke costs the NHS and the economy about £7 billion a year. Despite U.K. services being
among the most expensive, outcomes for U.K. patients are comparatively poor, with
unnecessarily long lengths of stay and high levels of avoidable disability and mortality.

Recovery & Rehabilitation:
 Effects include physical disability, loss of cognitive and communication skills, and mental
health problems.
 Recovery programmes are specific to patient symptoms and commonly require intervention
from physiotherapists, psychologists, occupational therapists, speech therapists and
specialist nurses and doctors.
 A third of patients make a close to full physical recovery and are able to live an
independent life, a third will require assistance in daily activities, and a third of
patients will die within a year. (http://www.nhs.uk)

Machine Learning & Brain Imaging (1)

 Machine Learning Techniques:
 Increasingly influential in Neuroscience and Clinical Medicine
(Bellazzi & Zupan, 2008)
 Informing individual patient management, selecting appropriate
treatments (Seker et al, 2003)

 Brain Imaging Data
 Large number of features, small number of samples
 Techniques must avoid the ‘overfitting’ problem


 MRI & fMRI
 Support Vector Machine (SVM) applied to MRI data
 Ecker et al (2010), Autistic Spectrum Disorder
 Kloppel et al (2008), Alzheimer's Disease (acc = 96%, n=68)
 Detection of other diseases: Fan et al (2005), Kawasaki et al (2007)

 SVM applied to fMRI data
 Classifiers developed to distinguish between stimuli, mental states and behaviours, demonstrating
that the data contain sufficient information
 For review see Norman et al (2006) and Haynes & Rees (2006)
 Saur et al (2010) predicted recovery of stroke patients' language abilities after 6 months
(acc = 76%, n=21)

 Relevance Vector Regression (RVR) applied to fMRI data
 Stonnington et al (2010):
 Predicted continuous measure
 Clinical measures of Alzheimer's Disease
 Predicted and actual scores highly correlated (p<0.0001, n=163)


 PET & RVM
 Phillips et al (2011):
 Distinguish between levels of consciousness
 Acc = 100%, n = 58

 Computed Tomography (CT)
 Automated image segmentation, Li et al (2006)
 Haemorrhage detection, Liu et al (2008)
 Reid et al (2010):
 CT-derived variables did not significantly improve the predictions of functional
recovery in stroke patients made by multivariate logistic regression models

Nottingham Extended ADL

 Ranked assessment of patient's ability to complete activities of daily living (ADL)
independently
 Developed specifically for use with stroke patients (Nouri & Lincoln, 1987)
 Completed by patient or carer via post or interview

 Demonstrated to be a useful measure of outcome in stroke research
 Gladman et al (1993)
 Cited in 14 studies as a measure of stroke patient outcomes (Green et al, 2001)

 Composed of 21 questions, split into 4 subsections:
 Mobility, Kitchen, Domestic, Leisure

 High scores indicate low disability
 Maximum score = 21, Minimum Score = 0

Method

Data Acquisition

 Participants

 Patients of stroke units within the West Midlands area

 Recruited as part of Birmingham University Cognitive Screen (BUCS) project
Inclusion Criteria:
• Informed Consent
• New Acute Stroke
• Alert
• Sufficient English Comprehension

Exclusion Criteria:
• Unwell
• Decline to participate
• Concentration span <35mins

 All patients selected for current study had suffered ischemic stroke
Data set   Age     Time from stroke to scan (days)   Time from stroke to testing (days)   n
NEADL      69.54   1.79                              299.3                                155

NEADL data sets

[Figure: histogram of NEADL scores (x-axis: NEADL, 0 to 21; y-axis: No., 0 to 20). Scores are split into Poor Recovery and Good Recovery; the bottom 42 percentile is marked Very Poor Recovery and the top 42 percentile Very Good Recovery.]

Group                Score   n    Mean   SD
Good Recovery        >=17    65   19.3   1.46
Poor Recovery        <17     90   9.02   4.72
Very Good Recovery   >=17    65   19.3   1.46
Very Poor Recovery   <=12    65   14.5   1.24

Data Acquisition

 Computed Tomography (CT) images:
 Capture density of tissue
 In-plane resolution 0.5x0.5mm², slice thickness 4-5mm
 Whole Brain

 Pre-processing & Image Compression
 Images of poor quality (due to head movement or other imaging issues)
removed from sample
 Images normalised to an in-house CT template (Ashburner & Friston, 2003)
using SPM8
 Images segmented using unified segmentation in SPM8 (Seghier et al, 2005) to
form Grey Matter, White Matter and Cerebrospinal Fluid images
 A further Abnormal tissue class was produced by adding an additional
probability map (Seghier et al, 2008)
 Grey and White Matter images smoothed using a 12mm³ FWHM Gaussian kernel
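
As a rough illustration of the smoothing step, the sketch below converts a kernel's full width at half maximum (FWHM) to the standard deviation of the Gaussian and samples a normalised 1D kernel (a 3D Gaussian is separable into three of these). It reads the 12mm³ kernel as 12 mm FWHM in each dimension and uses a 2 mm voxel size; both are assumptions, and the function names are hypothetical rather than SPM8 calls.

```python
import math
from typing import List


def fwhm_to_sigma(fwhm_mm: float) -> float:
    """Convert a Gaussian's full width at half maximum to its standard deviation."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_kernel_1d(fwhm_mm: float, voxel_mm: float, radius: int) -> List[float]:
    """Sample a Gaussian at integer voxel offsets and normalise it to sum to 1."""
    sigma_vox = fwhm_to_sigma(fwhm_mm) / voxel_mm
    weights = [math.exp(-0.5 * (i / sigma_vox) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


sigma = fwhm_to_sigma(12.0)  # roughly 5.1 mm
# voxel_mm=2.0 and radius=9 are illustrative values, not taken from the study.
kernel = gaussian_kernel_1d(12.0, voxel_mm=2.0, radius=9)
```

Applying the 1D kernel along each image axis in turn gives the same result as a single 3D convolution, which is why only the 1D case is sketched.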

Training & Testing

 Cross Validation
 Applied in 5 folds
 Data set(s) randomly divided into 5 equal test sets
 In each fold
 Model trained on all samples not present in test set
 Model tested on ability to assign correct labels to test set
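
The 5-fold procedure above can be sketched as follows. This is a minimal illustration, not the project's code: `k_fold_indices`, `cross_validate` and the toy threshold classifier are hypothetical stand-ins, and the data are made up.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Randomly partition sample indices into k near-equal test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(samples, labels, train_fn, predict_fn, k=5, seed=0):
    """Train on all samples outside each test set; return per-fold test accuracy."""
    accuracies = []
    for test_idx in k_fold_indices(len(samples), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        model = train_fn([samples[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies


# Toy stand-in for a classifier: threshold halfway between the two class means.
def train_threshold(xs, ys):
    good = [x for x, y in zip(xs, ys) if y == 1]
    poor = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(good) / len(good) + sum(poor) / len(poor)) / 2


def predict_threshold(threshold, x):
    return 1 if x > threshold else 0


xs = [0.1 * i for i in range(20)]  # made-up one-dimensional feature
ys = [0] * 10 + [1] * 10           # made-up labels (0 = poor, 1 = good)
fold_acc = cross_validate(xs, ys, train_threshold, predict_threshold)
mean_acc = sum(fold_acc) / len(fold_acc)  # mean performance across the 5 folds
```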

 Measures of performance
 Performance measures record mean performance across all 5 folds
 Accuracy = Proportion of correct classifications
 Specificity = Proportion of ‘Bad’ samples correctly classified as ‘Bad’
 Sensitivity = Proportion of ‘Good’ samples correctly classified as ‘Good’
 MCC = Matthews Correlation Coefficient (Matthews, 1975)
 Common measure of performance for classifiers within machine learning literature
 Balanced measure, suitable for classes of unequal size
 Correlation coefficient equal to phi coefficient
 +1 = perfect prediction
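
For concreteness, the four measures can be computed from the confusion counts as in the sketch below. It treats ‘Good’ as the positive class, which is an assumption about the labelling convention; the labels and predictions are made up.

```python
import math


def performance(y_true, y_pred, positive="Good"):
    """Accuracy, sensitivity, specificity and MCC for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        # MCC is conventionally set to 0 when any marginal total is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


y_true = ["Good"] * 4 + ["Bad"] * 6
y_pred = ["Good", "Good", "Good", "Bad", "Bad", "Bad", "Bad", "Bad", "Good", "Good"]
result = performance(y_true, y_pred)
# accuracy 0.70, sensitivity 0.75, specificity about 0.67, MCC about 0.41
```

Because MCC uses all four confusion counts, a classifier that always predicts the larger class scores 0 on it even when its raw accuracy looks respectable.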

Improving Efficiency

 Recursive Feature Elimination (RFE):
 Features with the lowest weights attributed by the model are eliminated
iteratively
 On each iteration:
 Feature with lowest weight identified and eliminated from training data
 New model trained on new training set
 Training therefore becomes focused on voxels for which high weights are
assigned
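
The elimination loop can be sketched as below. To keep the example self-contained, the weights come from a simple stand-in linear model (difference of class means per feature) rather than from an SVM, SLR or RVM; that substitution, the function names and the toy data are all assumptions made for illustration.

```python
def train_mean_difference(X, y):
    """Stand-in linear model: weight_j = mean of feature j in class 1 minus class 0."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    return [
        sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
        for j in range(len(X[0]))
    ]


def recursive_feature_elimination(X, y, n_keep, train_fn=train_mean_difference):
    """Drop the lowest-|weight| feature, retrain on the rest, repeat."""
    active = list(range(len(X[0])))  # original column indices still in use
    while len(active) > n_keep:
        reduced = [[row[j] for j in active] for row in X]
        weights = train_fn(reduced, y)
        weakest = min(range(len(active)), key=lambda k: abs(weights[k]))
        del active[weakest]
    return active


# Made-up data: features 1 and 3 separate the classes, features 0 and 2 do not.
X = [
    [0.5, 0.0, 0.3, 1.0],
    [0.4, 0.1, 0.3, 0.9],
    [0.5, 1.0, 0.2, 0.0],
    [0.4, 0.9, 0.4, 0.1],
]
y = [0, 0, 1, 1]
kept = recursive_feature_elimination(X, y, n_keep=2)  # indices of surviving features
```

With voxel data the same loop would normally remove features in batches, since retraining once per eliminated voxel is expensive.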

 Principal Component Analysis (PCA):
 Reduces dimensionality of data set
 Transforms set of correlated variables to smaller set of
uncorrelated variables
[Figure: PCA applied to 2D data set (Jehan, 2005)]
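
A minimal sketch of PCA on a small 2D data set, keeping components until a chosen share of the variance is explained (99% here, matching the "PCA (99%)" setting in the results). It uses power iteration with deflation on the covariance matrix so that it needs only the standard library; the function names and data are hypothetical, and a real voxel-level analysis would use an SVD routine instead.

```python
import math


def covariance_matrix(X):
    """Return the sample covariance matrix and the mean-centred data."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in X]
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)] for a in range(d)]
    return cov, centred


def top_eigenpair(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    d = len(matrix)
    v = [1.0 / (j + 1) for j in range(d)]  # deterministic, non-uniform start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    value = sum(v[a] * sum(matrix[a][b] * v[b] for b in range(d)) for a in range(d))
    return value, v


def pca_by_variance(X, variance_to_keep=0.99):
    """Keep leading components until the requested share of variance is explained."""
    cov, centred = covariance_matrix(X)
    d = len(cov)
    total = sum(cov[j][j] for j in range(d))
    components, explained = [], 0.0
    while explained < variance_to_keep * total and len(components) < d:
        value, vector = top_eigenpair(cov)
        components.append(vector)
        explained += value
        # Deflate: remove the component just found before searching for the next.
        cov = [[cov[a][b] - value * vector[a] * vector[b] for b in range(d)] for a in range(d)]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in components] for r in centred]
    return components, scores


# Made-up, strongly correlated 2D data: one component should carry over 99% of the variance.
X = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.8], [5.0, 5.1]]
components, scores = pca_by_variance(X, variance_to_keep=0.99)
```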

Machine Learning Techniques

 Support Vector Machine (Classifier):
 Images treated as points in higher dimensional space
 SVM aims to identify a hyperplane that separates the two classes while maximising the margin between them.
 The hyperplane is defined by the set of images (support vectors) that lie on the maximal margin
 Joachims (2002, 1999), based on Vapnik (1995)

 Sparse Logistic Regression (Classifier):
 Logistic regression method applied within Bayesian framework
 Sparse Gaussian prior is assumed with mean zero
 Iterative algorithm in which least informative features are pruned
according to assigned weights
 Yamashita et al (2008)

 Relevance Vector Machine (Classification & Regression):
 Applies Bayesian techniques within a functional form similar to that of an SVM
 Probabilistic model therefore able to indicate probability of class membership
 By altering the conditional distribution of the target variable RVMs can be applied to both classification and regression problems
 Tipping et al (2001, 2003).

[Figure: Optimal Separating Hyperplane defined by set of support vectors]

NEADL Results (SVM)

SVM (UnG = Unsmoothed Grey Matter, AbT = Abnormal Tissue, SmG = Smoothed Grey Matter)

Configuration  Tissue  Accuracy  Accuracy  Sensitivity  Specificity  MCC    p <
               Type    (max)     (mean)    (max)        (max)        (max)
Standard       UnG     65%       n/a       54%          73%          0.27   0.001
with PCA       AbT     69%       59%       46%          87%          0.30   0.001
with RFE       AbT     69%       62%       66%          71%          0.37   0.0001
99% Var        AbT     70%       60%       66%          73%          0.40   0.0001
Extremes       SmG     74%       65%       71%          76%          0.48   0.0001

Results


NEADL Results (SVM)

[Figure: SVM relevance map shown in Sagittal Plane, Horizontal Plane and Frontal Section views, with right (R) and left (L) marked]

Relevance map thresholded at 90%:
• Voxels with weights (absolute value) attributed by the model in the top 10 percentile
• Blue = negative weight
• Red = positive weight
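
The thresholding behind such a relevance map can be sketched as follows: keep the voxels whose absolute weight falls in the top 10%, zero the rest, and colour the survivors by sign. The weights below are made up and the function names are hypothetical; the sketch treats the weight vector as a flat list rather than a 3D volume.

```python
def relevance_map(weights, top_fraction=0.10):
    """Zero every weight except those in the top fraction by absolute value."""
    n_keep = max(1, int(len(weights) * top_fraction))
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(ranked[:n_keep])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


def display_colour(weight):
    """Red for positive weights, blue for negative, nothing for thresholded voxels."""
    if weight > 0:
        return "red"
    if weight < 0:
        return "blue"
    return None


# Twenty made-up voxel weights; the top 10% by magnitude are -0.90 and 0.75.
weights = [0.02, -0.90, 0.10, 0.75, -0.05, 0.01, 0.30, -0.20, 0.04, -0.03,
           0.06, -0.07, 0.08, -0.09, 0.11, -0.12, 0.13, -0.14, 0.15, -0.16]
thresholded = relevance_map(weights)
colours = [display_colour(w) for w in thresholded if w != 0.0]
```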

NEADL Results (SVM & SLR)

Model  Configuration         Tissue  Accuracy  Accuracy  Sensitivity  Specificity  MCC    p <
                             Type    (max)     (mean)    (max)        (max)        (max)
SVM    Standard              UnG     65%       n/a       54%          73%          0.27   0.001
SVM    with PCA              AbT     69%       59%       46%          87%          0.30   0.001
SVM    with RFE              AbT     69%       62%       66%          71%          0.37   0.0001
SVM    99% Var               AbT     70%       60%       66%          73%          0.40   0.0001
SVM    Extremes              SmG     74%       65%       71%          76%          0.48   0.0001
SLR    Standard              UnG     58%       n/a       50%          63%          0.13   0.15
SLR    with PCA (99%) & RFE  AbT     68%       58%       74%          62%          0.37   0.0001

NEADL Results (SVM, SLR & RVM)

Model  Configuration         Tissue  Accuracy  Accuracy  Sensitivity  Specificity  MCC    p <
                             Type    (max)     (mean)    (max)        (max)        (max)
SVM    Standard              UnG     65%       n/a       54%          73%          0.27   0.001
SVM    with PCA              AbT     69%       59%       46%          87%          0.30   0.001
SVM    with RFE              AbT     69%       62%       66%          71%          0.37   0.0001
SVM    99% Var               AbT     70%       60%       66%          73%          0.40   0.0001
SVM    Extremes              SmG     74%       65%       71%          76%          0.48   0.0001
SLR    Standard              UnG     58%       n/a       50%          63%          0.13   0.15
SLR    with PCA (99%) & RFE  AbT     68%       58%       74%          62%          0.37   0.0001
RVM    Standard              SmG     67%       *         53%          76%          0.33   0.0001
RVM    with PCA (99%) & RFE  AbT     69%       *         77%          62%          0.40   0.0001

* The source gives a single RVM mean accuracy, 58%, without showing which RVM configuration it belongs to.

NEADL Results (SVM, SLR, RVM & RVR)

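For SVM, SLR and RVM the table reports Accuracy and MCC (max); for RVR it reports Pearson's r and RMSE (min).

Model  Configuration                                   Tissue  Accuracy / r  Accuracy / r  Sensitivity  Specificity  MCC /   p <
                                                       Type    (max)         (mean)        (max)        (max)        RMSE
SVM    Standard                                        UnG     65%           n/a           54%          73%          0.27    0.001
SVM    with PCA                                        AbT     69%           59%           46%          87%          0.30    0.001
SVM    with RFE                                        AbT     69%           62%           66%          71%          0.37    0.0001
SVM    99% Var                                         AbT     70%           60%           66%          73%          0.40    0.0001
SVM    Extremes                                        SmG     74%           65%           71%          76%          0.48    0.0001
SLR    Standard                                        UnG     58%           n/a           50%          63%          0.13    0.15
SLR    with PCA (99%) & RFE                            AbT     68%           58%           74%          62%          0.37    0.0001
RVM    Standard                                        SmG     67%           *             53%          76%          0.33    0.0001
RVM    with PCA (99%) & RFE                            AbT     69%           *             77%          62%          0.40    0.0001
RVR    Standard                                        UnG     0.28          n/a           -            -            6.75    0.001
RVR    with PCA (99%), RFE & Standardised Scores       AbT     0.39          0.35          -            -            0.76    0.0001

* The source gives a single RVM mean accuracy, 58%, without showing which RVM configuration it belongs to.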
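
The two regression measures reported for RVR, Pearson's r and RMSE, can be computed as in the sketch below. It also shows why RMSE values on raw and standardised scores are not directly comparable: standardising divides the error by the standard deviation of the scores while leaving r unchanged. The scores are made up, and the use of the population standard deviation is an assumption.

```python
import math


def pearson_r(predicted, actual):
    """Pearson correlation between predicted and actual scores."""
    n = len(predicted)
    mean_p, mean_a = sum(predicted) / n, sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    spread_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual))
    return cov / (spread_p * spread_a)


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual scores."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def standardise(values, mean, sd):
    """Express scores in standard-deviation units around a given mean."""
    return [(v - mean) / sd for v in values]


actual = [2.0, 9.0, 12.0, 17.0, 20.0]     # made-up NEADL-like scores
predicted = [6.0, 8.0, 13.0, 15.0, 18.0]  # made-up model predictions

mean = sum(actual) / len(actual)
sd = math.sqrt(sum((a - mean) ** 2 for a in actual) / len(actual))

r_raw = pearson_r(predicted, actual)
rmse_raw = rmse(predicted, actual)
rmse_std = rmse(standardise(predicted, mean, sd), standardise(actual, mean, sd))
```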

Summary

 Abnormal Tissue, Smoothed Grey Matter and Unsmoothed Grey Matter consistently
outperform other tissue types

 Application of PCA and RFE improves model performance

 Best performance produced when model trained on extreme samples within data set

 RVM, SVM & SLR classifiers predict patient recovery with significant levels of accuracy
(p<0.001)

 SVM & RVM produce similar levels of performance yet outperform SLR

 RVR predictions are highly correlated with true scores (p<0.001)

Discussion

Wider Implications

 Performance comparable to results in literature

 Saur et al (2010) predict language outcome 6 months after stroke with 76% accuracy
using SVM classifier

 Stonnington et al (2010) found predicted and actual clinical measures of
Alzheimer's Disease to be correlated (p<0.0001)

 Stroke lesions generally more heterogeneous than those typically found in
Alzheimer's Disease patients

 Few studies within the current literature apply Machine Learning to CT data to
predict patient recovery

Methodological Issues

 Model evaluation and selection
 Noise may account for maximum values

 Accepted methods of evaluation and model selection:

 Average across 100 trials with sample order randomised

 Adapt algorithm to select when performance peaks

 Analyse in the context of 100 random trials with scores randomly assigned
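
The last of these checks, comparing observed performance with trials in which the scores are randomly assigned, can be sketched as a permutation test. The `evaluate` callable stands for the whole train-and-test pipeline; here it is a hypothetical leave-one-out nearest-class-mean classifier on made-up data, chosen only to keep the example self-contained.

```python
import random


def permutation_test(evaluate, labels, n_trials=100, seed=0):
    """Compare the score on true labels with scores on randomly reassigned labels."""
    observed = evaluate(labels)
    rng = random.Random(seed)
    null_scores = []
    for _ in range(n_trials):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        null_scores.append(evaluate(shuffled))
    n_as_good = sum(score >= observed for score in null_scores)
    # Adding 1 to both counts keeps the estimated p-value above zero.
    p_value = (n_as_good + 1) / (n_trials + 1)
    return observed, null_scores, p_value


features = [0.1 * i for i in range(20)]  # made-up one-dimensional feature
true_labels = [0] * 10 + [1] * 10        # made-up outcome labels


def leave_one_out_accuracy(labels):
    """Retrain a nearest-class-mean classifier for every held-out sample."""
    correct = 0
    for held_out in range(len(features)):
        means = {}
        for cls in (0, 1):
            members = [features[i] for i in range(len(features))
                       if i != held_out and labels[i] == cls]
            means[cls] = sum(members) / len(members)
        predicted = min((0, 1), key=lambda cls: abs(features[held_out] - means[cls]))
        correct += predicted == labels[held_out]
    return correct / len(features)


observed, null_scores, p_value = permutation_test(leave_one_out_accuracy, true_labels)
```

The model is retrained inside `evaluate` for every shuffle, so the null distribution reflects the whole pipeline rather than a single fitted model.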

Future Study

 Improving Performance:
 Poor performance currently restricts application to patient management or assessment of
intervention programs
 Additional Variables – e.g. blood vessel affected
 Isolate ROI:
 Informed by literature (Saur et al, 2010)
 Weight maps (Ecker et al, 2010)
 Ensemble methods (Opitz, 1999):
 Train on individual lobes
 Bootstrap Aggregating

 Predict improvement in ADL scores
 Saur et al, 2010

 Investigate role of weighted voxels

Acknowledgments

 Alan Meeson
 Provided:
 Original code for machine learning algorithms
 Support and guidance throughout project

 Vaia Lestou
 Assisted in the design and analysis of current study
