This document discusses a hierarchical classification system for skin cancer images. It uses binary (pairwise) classifiers and voting to reach 76% accuracy, compared with 22% for a pretrained CNN model on the same image dataset, without requiring a very large training set. Lesions are first classified with a CNN. Features are then extracted from the output of the CNN's fully connected layer and used to build a hierarchical classifier from pairwise classifiers and voting. This approach derives a taxonomy of skin cancer classes from image data alone, recovering knowledge otherwise held by domain experts. It can be deployed in mobile applications to improve healthcare access.
2. • Skin cancer is usually detected by visual diagnosis: medical
professionals first screen skin lesions by analysing features such as
colour, depth and texture, potentially followed by dermoscopic
evaluation, biopsy and histopathological examination. In this project
we extract domain knowledge from lesion image data, building a
hierarchy of skin cancer classes by combining binary (pairwise)
classifiers with a voting mechanism applied to their results.
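
As a rough illustration of voting over pairwise classifiers, the sketch below trains nothing; it assumes each pair of classes already has a binary classifier and simply tallies the winners. The class names, the 2-D toy features and the nearest-centroid rule standing in for a trained binary classifier are all hypothetical and are not taken from the project.

```python
import math
from collections import Counter
from itertools import combinations


def vote(classes, pairwise_predict, sample):
    """Run every pairwise contest and return (winning class, vote counts)."""
    votes = Counter({c: 0 for c in classes})
    for a, b in combinations(classes, 2):
        # Each binary classifier picks one of its two classes.
        votes[pairwise_predict(a, b, sample)] += 1
    # Ties are broken by the order of `classes` (an arbitrary choice).
    best = max(classes, key=lambda c: votes[c])
    return best, dict(votes)


def make_centroid_classifier(centroids):
    """Hypothetical stand-in for trained binary classifiers:
    the class whose centroid is nearer to the sample wins the contest."""
    def pairwise_predict(a, b, sample):
        da = math.dist(sample, centroids[a])
        db = math.dist(sample, centroids[b])
        return a if da <= db else b
    return pairwise_predict


# Toy feature space with three made-up classes.
centroids = {
    "class_a": (0.0, 0.0),
    "class_b": (4.0, 0.0),
    "class_c": (0.0, 4.0),
}
predict = make_centroid_classifier(centroids)
label, votes = vote(list(centroids), predict, (3.5, 0.5))
print(label, votes)
```

With k classes this runs k(k-1)/2 binary contests per sample, and each contest only has to separate two classes at a time.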
3. • Previously, only deep learning algorithms such as CNNs have been
used for cancer image classification, and they require a very large
number of images to train the model to the desired prediction
accuracy. In this project we develop an algorithm that combines
binary classifiers with voting on their predictions to achieve higher
accuracy (76% for our model, compared with 22% for a pretrained CNN
model on the same image dataset), making a very large image dataset
unnecessary.
4. • Initially, lesions are classified into diagnoses by a convolutional
neural network (CNN), which suits automated classification of lesion
images with fine-grained variability. The inputs to the CNN are the
images and their diagnosis labels. The output of the CNN's fully
connected layer is then used to extract features and build a binary
hierarchical classifier bottom-up, grouping classes by similarity
through a voting mechanism over pairwise classifiers. The taxonomy
that is otherwise known only to domain experts and dermatologists is
thus derived from data. This fast and scalable method can also be
deployed in mobile applications, broadening access to primary
healthcare.
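
One way to picture the bottom-up construction is greedy agglomerative merging: repeatedly join the two most similar groups of classes until a single binary tree remains. The sketch below assumes a class-to-class similarity score is already available (for example, derived from how often pairwise classifiers confuse two classes) and uses average linkage between groups. The similarity values, the class names and the linkage rule are illustrative assumptions, not the project's actual procedure.

```python
from itertools import combinations


def build_hierarchy(classes, similarity):
    """Merge the two most similar groups until one binary tree remains.

    `similarity` maps frozenset({a, b}) -> score for leaf classes a and b.
    Group-to-group similarity is the mean over all leaf pairs
    (average linkage, an assumed choice).
    """
    # Each cluster is (subtree, tuple of leaf classes under it).
    clusters = [(c, (c,)) for c in classes]

    def group_similarity(x, y):
        pairs = [(a, b) for a in x[1] for b in y[1]]
        return sum(similarity[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Pick the indices of the most similar pair of clusters.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: group_similarity(clusters[ij[0]], clusters[ij[1]]),
        )
        left, right = clusters[i], clusters[j]
        merged = ((left[0], right[0]), left[1] + right[1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Made-up similarity scores between four hypothetical classes.
similarity = {
    frozenset({"A", "B"}): 0.9,
    frozenset({"C", "D"}): 0.8,
    frozenset({"A", "C"}): 0.2,
    frozenset({"A", "D"}): 0.1,
    frozenset({"B", "C"}): 0.3,
    frozenset({"B", "D"}): 0.2,
}
tree = build_hierarchy(["A", "B", "C", "D"], similarity)
print(tree)
```

Each internal node of the resulting nested tuple corresponds to one binary decision, so the tree can be read as a taxonomy whose leaves are the original diagnosis classes.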
5. • The Inception V3 architecture, pre-trained on 1.28 million images
in 1000 categories from the ImageNet Large Scale Visual Recognition
Challenge, was used in this project and trained on the skin cancer
image dataset using transfer learning. After a few additional layers
are added, the output of the model's last fully connected layer is
used as input to the binary classification and voting algorithm.