2. About me
• Education
• NCU (MIS), NCCU (CS)
• Work Experience
• Telecom big data Innovation
• AI projects
• Retail marketing technology
• User Group
• TW Spark User Group
• TW Hadoop User Group
• Taiwan Data Engineer Association Director
• Research
• Big Data / ML / AIoT / AI columnist
3. Tutorial
Content
Image in AI process types
Overall technologies
Homework
Deep Learning history timeline
Exercise in image classification
6. Deep Learning history timeline
• From 1943 to 2019
Ref: https://machinelearningknowledge.ai/brief-history-of-deep-learning/
7. Deep Learning history timeline
• 2018 Turing Award
• Bengio, Hinton, and LeCun are sometimes referred to as the "Godfathers of AI" and "Godfathers of Deep Learning"
Ref: https://awards.acm.org/about/2018-turing
8. Deep Learning history timeline
• ImageNet dataset
• Over 15 million images with more than 22,000 categories
• ILSVRC
Ref: https://image-net.org/about.php
Ref: https://www.cs.princeton.edu/courses/archive/spr18/cos598B/slides/cos598b_7feb18_imagenet.pdf
9. Deep Learning history timeline
• ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
Ref: https://medium.com/nanonets/how-to-automate-surveillance-with-deep-learning-c8dea1d6387f
Ref: https://image-net.org/index.php
10. Deep Learning history timeline
• Object detection
val top-1 error: the model makes a single prediction (its most probable class); it counts as an error if that class does not match the true label
val top-5 error: the model's five most probable classes are checked; it counts as an error only if none of them matches the true label
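The two error rates above can be sketched with NumPy; the scores and labels below are made-up toy values, not from any real model:

```python
import numpy as np

def top_k_error(scores, labels, k):
    """Fraction of samples whose true label is NOT among the k highest scores."""
    top_k = np.argsort(scores, axis=1)[:, -k:]       # indices of the k largest scores
    hit = np.any(top_k == labels[:, None], axis=1)   # True if the label is in the top k
    return 1.0 - hit.mean()

# Toy scores for 4 samples over 6 classes (invented numbers)
scores = np.array([
    [0.10, 0.60, 0.10, 0.10, 0.05, 0.05],   # top-1 = class 1
    [0.30, 0.20, 0.25, 0.10, 0.10, 0.05],   # top-1 = class 0
    [0.05, 0.05, 0.10, 0.10, 0.20, 0.50],   # top-1 = class 5
    [0.40, 0.30, 0.10, 0.10, 0.05, 0.05],   # top-1 = class 0
])
labels = np.array([1, 2, 5, 0])

print(top_k_error(scores, labels, k=1))  # 0.25: only sample 2's label misses the top-1
print(top_k_error(scores, labels, k=5))  # 0.0: every label appears in some top-5
```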
12. Deep Learning history timeline
• Object detection from video
• IoU (Intersection over Union)
• IoU ≥ 0.5: we count the detection as a true positive (TP)
• IoU < 0.5: we count the detection as a false positive (FP)
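IoU itself is straightforward to compute from box corner coordinates; the two boxes below are invented toy values:

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) corners."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)        # intersection area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)         # intersection over union

gt   = (0, 0, 10, 10)   # ground-truth box, area 100
pred = (5, 5, 15, 15)   # predicted box, area 100, overlapping 5x5 = 25
print(iou(gt, pred))    # 25 / 175 ≈ 0.143 → below 0.5, so this detection is an FP
```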
14. More
• AP (Average Precision): used in object detection; the area under the precision-recall curve for a single class
• mAP (mean Average Precision): the AP averaged over all object classes
precision-recall curve
AP = area under the curve (AUC)
Detections are ranked by their predicted "is dog" probability (8 dogs in total):
at a strict threshold, Precision = 3/4 = 0.75 and Recall = 3/8 = 0.375
at a looser threshold, Precision = 5/10 = 0.5 and Recall = 5/8 = 0.625
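A simplified, non-interpolated AP can be sketched as the trapezoidal area under the raw precision-recall curve. The detection ranking below is invented so that it reproduces the two (precision, recall) points above: 3 TP after 4 detections, 5 TP after all 10, with 8 ground-truth dogs.

```python
import numpy as np

def average_precision(scores, is_tp, n_gt):
    """Simplified AP: area under the raw precision-recall curve (no interpolation).
    scores: confidence of each detection; is_tp: 1 if it matched a ground truth;
    n_gt: total number of ground-truth objects."""
    order = np.argsort(scores)[::-1]                  # rank detections, highest score first
    tp = np.cumsum(np.asarray(is_tp)[order])          # cumulative true positives
    precision = tp / np.arange(1, len(tp) + 1)        # TP / detections so far
    recall = tp / n_gt                                # TP / all ground-truth objects
    # trapezoidal area under precision as a function of recall
    return float(np.sum((recall[1:] - recall[:-1]) * (precision[1:] + precision[:-1]) / 2))

scores = np.arange(10, 0, -1)              # 10 detections, already sorted for readability
is_tp  = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]    # after 4 detections: 3 TP; after 10: 5 TP
print(average_precision(scores, is_tp, n_gt=8))
```

Standard benchmarks (e.g. PASCAL VOC, COCO) use interpolated variants of this curve, so their AP values differ slightly from this raw-area sketch.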
15. Deep Learning history timeline
• CS231n:
• Convolutional Neural Networks for Visual Recognition
Ref: http://cs231n.stanford.edu/index.html
16. More
• How to Automate Surveillance with Deep Learning?
Ref: https://medium.com/nanonets/how-to-automate-surveillance-with-deep-learning-c8dea1d6387f
17. Image for deep learning process flow
Ref: https://www.aldec.com/en/solutions/embedded/deep-learning-using-fpga
18. Main type of AI image recognition
• Image classification
• Image detection
• Image segmentation
Can you tell the difference?
23. More
• Object Detection Raspberry Pi using OpenCV
• Online computer vision courses
Ref: https://www.youtube.com/watch?v=Vg9rrOFmwHo
Ref: https://www.computervision.zone/course-list/
Ref: https://www.youtube.com/watch?v=46TBBb5rAy4
Ref: https://circuitdigest.com/microcontroller-projects/license-plate-recognition-using-raspberry-pi-and-
opencv
24. The input is fed to a network of stacked Conv, Pool and Dense layers
Ref: https://learnopencv.com/image-classification-using-convolutional-neural-networks-in-keras/
The output can be a Softmax layer that gives a probability for each class
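A minimal Keras sketch of such a stack; the 32×32 RGB input size and the 10 output classes are arbitrary choices for illustration:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(16, 3, activation="relu"),   # Conv: feature extraction
    layers.MaxPooling2D(2),                    # Pool: spatial down-sampling
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),       # Dense: classification
    layers.Dense(10, activation="softmax"),    # one probability per class
])

probs = model(np.zeros((1, 32, 32, 3), dtype="float32"))
print(probs.shape)  # (1, 10)
```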
33. Improving model prediction accuracy
• Different geometric transformations
Ref: https://medium.com/analytics-vidhya/data-augmentation-is-it-really-necessary-b3cb12ab3c3f
34. Feature extractor
• Kernel maps: small matrices for edge detection, sharpening, and so on, also called image filters
• Convolution: the convolutional and pooling layers together act as the feature extractor
• Feature maps: the outputs produced by applying the kernel maps
37. More
• What if you want the feature map to be the same size as the input image? Use zero padding.
Ref: https://towardsdatascience.com/convolution-neural-networks-a-beginners-guide-implementing-a-mnist-hand-written-digit-8aa60330d022
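The effect of zero padding follows from the usual output-size formula o = (n + 2p − k) / s + 1 for input size n, kernel size k, padding p, and stride s; a quick sketch with arbitrary sizes:

```python
def conv_output_size(n, k, p=0, s=1):
    """Spatial output size of a convolution: input n, kernel k, padding p, stride s."""
    return (n + 2 * p - k) // s + 1

n, k = 28, 3
print(conv_output_size(n, k))                  # 26: no padding shrinks the map
print(conv_output_size(n, k, p=(k - 1) // 2))  # 28: zero padding keeps the input size
```

For stride 1 and an odd kernel size k, padding p = (k − 1) / 2 is exactly what Keras applies when you pass `padding="same"` to a Conv2D layer.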
38. Feature extractor
• Pooling
• Max
• Average
Ref: https://www.researchgate.net/figure/Toy-example-illustrating-the-drawbacks-of-max-pooling-and-average-pooling_fig2_300020038
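Both pooling variants can be illustrated on a toy 4×4 feature map with 2×2 windows and stride 2 (the values are invented):

```python
import numpy as np

def pool2x2(x, op):
    """Apply a 2x2, stride-2 pooling operation to a 2-D feature map."""
    h, w = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2)   # split into non-overlapping 2x2 blocks
    return op(blocks, axis=(1, 3))             # reduce each block with max or mean

fmap = np.array([[1, 3, 2, 0],
                 [5, 6, 1, 2],
                 [7, 2, 9, 4],
                 [0, 1, 3, 3]])

print(pool2x2(fmap, np.max))   # [[6 2] [7 9]]
print(pool2x2(fmap, np.mean))  # [[3.75 1.25] [2.5  4.75]]
```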
40. Classifier
• Flatten Layer
• It is used to convert the data into a 1D array, creating a single feature vector. After flattening, we forward the data to a fully connected layer for the final classification
Ref: https://data-flair.training/blogs/keras-convolution-neural-network/
41. Classifier
• Dense Layer
• It is a fully connected layer: each node is connected to every node in the previous layer
• This layer is used at the final stage of a CNN to perform classification
Ref: https://data-flair.training/blogs/keras-convolution-neural-network/
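In NumPy terms, Flatten is just a reshape and a Dense layer is a matrix multiply plus bias; the shapes and random values below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy batch of feature maps from the extractor: 4x4 spatial, 8 channels.
features = rng.standard_normal((1, 4, 4, 8))

# Flatten: collapse everything except the batch axis into one feature vector.
flat = features.reshape(features.shape[0], -1)   # shape (1, 128)

# Dense: every input feature connects to every output node (weights W, bias b).
W = rng.standard_normal((128, 10))
b = np.zeros(10)
logits = flat @ W + b                            # shape (1, 10)

print(flat.shape, logits.shape)  # (1, 128) (1, 10)
```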
43. Classifier
• Activation function
• It makes the neural network non-linear
• It has two main roles
• The function itself is applied in the feed-forward step
• Its derivative is used in the backpropagation step
• You can think of feed-forward as prediction, and backpropagation as training (learning)
Ref: https://sefiks.com/2020/02/02/dance-moves-of-deep-learning-activation-functions/
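Taking sigmoid as one concrete example, the two roles look like this (toy inputs):

```python
import numpy as np

def sigmoid(x):
    """Feed-forward role: the activation itself."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Backpropagation role: the derivative, sigmoid(x) * (1 - sigmoid(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))       # used when predicting (feed forward)
print(sigmoid_grad(x))  # used when training (backpropagation)
```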
44. Classifier
• Dropout Layer
• It is used to prevent the network from overfitting
Ref: https://data-flair.training/blogs/keras-convolution-neural-network/
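A sketch of what dropout does at training time, using the common "inverted dropout" formulation (the rate and shapes are arbitrary):

```python
import numpy as np

def dropout(x, rate, rng):
    """Inverted dropout (training time): zero out a fraction `rate` of units and
    rescale the survivors so the expected activation stays unchanged."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

rng = np.random.default_rng(42)
acts = np.ones((2, 8))
print(dropout(acts, rate=0.5, rng=rng))  # roughly half zeros, survivors scaled to 2.0
```

At inference time the layer does nothing; the rescaling during training is what keeps the two phases consistent.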
45. Keras framework
• Keras is a deep learning API written in Python, running on top of the
machine learning platform TensorFlow.
• It was developed with a focus on enabling fast experimentation
• Integrates with TensorFlow 2
• Efficiently executes low-level tensor operations on CPU, GPU, or TPU
• Speeds up development of deep-learning networks
• Provides fully connected, convolutional, pooling, RNN, LSTM… layers
• The latest version: 2.8 (2022-02-03)
Ref: https://keras.io/about/
46. Dog and cat image dataset
• This example shows how to do image classification from scratch,
starting from JPEG image files on disk, without leveraging pre-trained
weights or a pre-made Keras Application model
• We demonstrate the workflow on Cats vs Dogs binary classification
dataset
What are pre-trained weights?
47. Dog and cat image dataset
• Cat and Dog: 23,422 images
• Training: 18,738 images
• Validation: 4,684 images
• Image sizes are not fixed!
• 350×320
• 448×329
• …
• …
48. More
• Data imbalance
• Suppose a dataset consists of 100 cat and 900 dog images. If we train the neural network on this data, it will just learn to predict "dog" every time
• In this case, we can easily balance the data using sampling techniques
• Down-sampling
• By removing some dog examples
• Up-sampling
• By creating more cat examples using image augmentation or any other method
Ref: Multi-Label Image Classification with Neural Network | Keras | by Shiva Verma | Towards Data Science
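Both sampling directions can be sketched with plain random selection; the file names are hypothetical placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
cats = [f"cat_{i}.jpg" for i in range(100)]   # minority class
dogs = [f"dog_{i}.jpg" for i in range(900)]   # majority class

# Down-sampling: drop dog examples until both classes match.
dogs_down = list(rng.choice(dogs, size=len(cats), replace=False))

# Up-sampling: repeat cat examples (in practice, augmented copies) to match.
cats_up = list(rng.choice(cats, size=len(dogs), replace=True))

print(len(cats), len(dogs_down))  # 100 100
print(len(cats_up), len(dogs))    # 900 900
```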
49. Dog and cat image dataset
• Network Calculator
Ref: https://madebyollin.github.io/convnet-calculator/
51. More
• Reinstall the conda packages below if an error occurs
!conda uninstall pydot
!conda uninstall pydotplus
!conda uninstall graphviz
!conda install pydot
!conda install pydotplus
Graphviz is open source graph visualization software.
Graph visualization is a way of representing structural information as diagrams of abstract graphs and networks.
https://graphviz.org
52. More
• Model progress can be saved during and after training. This means a
model can resume where it left off and avoid long training times.
Saving also means you can share your model and others can recreate
your work
• https://www.tensorflow.org/tutorials/keras/save_and_load
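One way to sketch this with Keras is saving and restoring weights; the tiny model and the file name `demo.weights.h5` are arbitrary choices for illustration:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_model():
    return keras.Sequential([
        layers.Input(shape=(4,)),
        layers.Dense(3, activation="softmax"),
    ])

model = build_model()
x = np.array([[0.1, 0.2, 0.3, 0.4]], dtype="float32")
before = model(x).numpy()

model.save_weights("demo.weights.h5")     # save the learned parameters

restored = build_model()                  # a fresh model with the same architecture
restored.load_weights("demo.weights.h5")  # picks up exactly where we left off
after = restored(x).numpy()

print(np.allclose(before, after))  # True
```

`model.save(...)` additionally stores the architecture and optimizer state, which is what allows training to resume; the tutorial link above covers both options.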
53. More
• Transformer model (Transformers Replace CNNs, RNNs)
Ref: https://blogs.nvidia.com/blog/2022/03/25/what-is-a-transformer-model/
54. Homework
• Try reducing the cat and dog images to 500 each, and check the effect on performance
• Try adding bird images as a third image label
• Download the bird images
• https://drive.google.com/file/d/1NgmjVrRug_qPqlfU_Zb2O-
kyd5Hblaat/view?usp=sharing
• Modify the code and build the model for multi-class prediction
55. Hint
• one-hot-encoded labels/classes like [0, 0, 0, 0, 0, 0, 1]
• categorical_crossentropy
• sparse (integer) labels/classes like 0, 1, 2, …, 6
• sparse_categorical_crossentropy
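The two losses compute the same quantity; a NumPy sketch with an invented 4-class probability vector makes the equivalence concrete:

```python
import numpy as np

probs = np.array([0.1, 0.2, 0.6, 0.1])   # softmax output over 4 classes
label = 2                                 # sparse (integer) label
one_hot = np.eye(4)[label]                # [0., 0., 1., 0.]

# categorical_crossentropy expects the one-hot vector...
cat_ce = -np.sum(one_hot * np.log(probs))
# ...sparse_categorical_crossentropy expects the integer and indexes directly.
sparse_ce = -np.log(probs[label])

print(cat_ce, sparse_ce)  # identical: both equal -log(0.6) ≈ 0.5108
```

So the choice between the two Keras losses is purely about how your labels are stored, not about what is optimized.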