2. About me
• Education
• NCU (MIS), NCCU (CS)
• Work Experience
• Telecom big data Innovation
• AI projects
• Retail marketing technology
• User Group
• TW Spark User Group
• TW Hadoop User Group
• Taiwan Data Engineer Association Director
• Research
• Big Data / ML / AIoT / AI columnist
3. Tutorial
Content
Image in AI process types
Overall technologies
Homework
Deep Learning history timeline
Exercise in image classification
6. Deep Learning history timeline
• From 1943 to 2019
Ref: https://machinelearningknowledge.ai/brief-history-of-deep-learning/
7. Deep Learning history timeline
• 2018 Turing Award
• Bengio, Hinton, and LeCun are sometimes referred to as the "Godfathers of AI" and "Godfathers of Deep Learning"
Ref: https://awards.acm.org/about/2018-turing
8. Deep Learning history timeline
• ImageNet dataset
• Over 15 million images with more than 22,000 categories
• ILSVRC
Ref: https://image-net.org/about.php
Ref: https://www.cs.princeton.edu/courses/archive/spr18/cos598B/slides/cos598b_7feb18_imagenet.pdf
9. Deep Learning history timeline
• ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
Ref: https://medium.com/nanonets/how-to-automate-surveillance-with-deep-learning-c8dea1d6387f
Ref: https://image-net.org/index.php
10. Deep Learning history timeline
• Object detection
val top-1 error: the model makes a single prediction (its most probable class); it counts as an error if that class does not match the true label
val top-5 error: the model's five most probable classes are checked; it counts as an error only if none of them matches the true label
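The two error rates above can be sketched with NumPy; the scores and labels below are made-up toy values, not from any real model:

```python
import numpy as np

def top_k_error(scores, labels, k):
    """Fraction of samples whose true label is NOT among the k highest scores."""
    top_k = np.argsort(scores, axis=1)[:, -k:]       # indices of the k largest scores
    hit = np.any(top_k == labels[:, None], axis=1)   # True if the label is in the top k
    return 1.0 - hit.mean()

# Toy scores for 4 samples over 6 classes (invented numbers)
scores = np.array([
    [0.10, 0.60, 0.10, 0.10, 0.05, 0.05],   # top-1 = class 1
    [0.30, 0.20, 0.25, 0.10, 0.10, 0.05],   # top-1 = class 0
    [0.05, 0.05, 0.10, 0.10, 0.20, 0.50],   # top-1 = class 5
    [0.40, 0.30, 0.10, 0.10, 0.05, 0.05],   # top-1 = class 0
])
labels = np.array([1, 2, 5, 0])

print(top_k_error(scores, labels, k=1))  # 0.25: only sample 2's label misses the top-1
print(top_k_error(scores, labels, k=5))  # 0.0: every label appears in some top-5
```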
12. Deep Learning history timeline
• Object detection from video
• IoU (Intersection over Union)
• IoU ≥ 0.5: we count the detection as a true positive (TP)
• IoU < 0.5: we count the detection as a false positive (FP)
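IoU itself is straightforward to compute from box corner coordinates; the two boxes below are invented toy values:

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) corners."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)        # intersection area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)         # intersection over union

gt   = (0, 0, 10, 10)   # ground-truth box, area 100
pred = (5, 5, 15, 15)   # predicted box, area 100, overlapping 5x5 = 25
print(iou(gt, pred))    # 25 / 175 ≈ 0.143 → below 0.5, so this detection is an FP
```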
14. More
• AP (Average Precision): used in object detection; the area under the precision-recall curve for a single class
• mAP (mean Average Precision): the AP averaged over all object classes
precision-recall curve
AP = area under the curve (AUC)
Detections are ranked by their predicted "is dog" probability (8 dogs in total):
at a strict threshold, Precision = 3/4 = 0.75 and Recall = 3/8 = 0.375
at a looser threshold, Precision = 5/10 = 0.5 and Recall = 5/8 = 0.625
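A simplified, non-interpolated AP can be sketched as the trapezoidal area under the raw precision-recall curve. The detection ranking below is invented so that it reproduces the two (precision, recall) points above: 3 TP after 4 detections, 5 TP after all 10, with 8 ground-truth dogs.

```python
import numpy as np

def average_precision(scores, is_tp, n_gt):
    """Simplified AP: area under the raw precision-recall curve (no interpolation).
    scores: confidence of each detection; is_tp: 1 if it matched a ground truth;
    n_gt: total number of ground-truth objects."""
    order = np.argsort(scores)[::-1]                  # rank detections, highest score first
    tp = np.cumsum(np.asarray(is_tp)[order])          # cumulative true positives
    precision = tp / np.arange(1, len(tp) + 1)        # TP / detections so far
    recall = tp / n_gt                                # TP / all ground-truth objects
    # trapezoidal area under precision as a function of recall
    return float(np.sum((recall[1:] - recall[:-1]) * (precision[1:] + precision[:-1]) / 2))

scores = np.arange(10, 0, -1)              # 10 detections, already sorted for readability
is_tp  = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]    # after 4 detections: 3 TP; after 10: 5 TP
print(average_precision(scores, is_tp, n_gt=8))
```

Standard benchmarks (e.g. PASCAL VOC, COCO) use interpolated variants of this curve, so their AP values differ slightly from this raw-area sketch.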
15. Deep Learning history timeline
• CS231n:
• Convolutional Neural Networks for Visual Recognition
Ref: http://cs231n.stanford.edu/index.html
16. More
• How to Automate Surveillance with Deep Learning?
Ref: https://medium.com/nanonets/how-to-automate-surveillance-with-deep-learning-c8dea1d6387f
17. Image for deep learning process flow
Ref: https://www.aldec.com/en/solutions/embedded/deep-learning-using-fpga
18. Main type of AI image recognition
• Image classification
• Image detection
• Image segmentation
Can you tell the difference?
23. More
• Object Detection Raspberry Pi using OpenCV
• Online computer vision courses
Ref: https://www.youtube.com/watch?v=Vg9rrOFmwHo
Ref: https://www.computervision.zone/course-list/
Ref: https://www.youtube.com/watch?v=46TBBb5rAy4
Ref: https://circuitdigest.com/microcontroller-projects/license-plate-recognition-using-raspberry-pi-and-
opencv
24. The input is fed to a network of stacked Conv, Pool and Dense layers
Ref: https://learnopencv.com/image-classification-using-convolutional-neural-networks-in-keras/
The output can be a Softmax layer that gives a probability for each class
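A minimal Keras sketch of such a stack; the 32×32 RGB input size and the 10 output classes are arbitrary choices for illustration:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(16, 3, activation="relu"),   # Conv: feature extraction
    layers.MaxPooling2D(2),                    # Pool: spatial down-sampling
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),       # Dense: classification
    layers.Dense(10, activation="softmax"),    # one probability per class
])

probs = model(np.zeros((1, 32, 32, 3), dtype="float32"))
print(probs.shape)  # (1, 10)
```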
33. Improving model prediction accuracy
• Different geometric transformations
Ref: https://medium.com/analytics-vidhya/data-augmentation-is-it-really-necessary-b3cb12ab3c3f
34. Feature extractor
• Kernel maps: small matrices for edge detection, sharpening, and so on, also called image filters
• Convolution: the convolutional and pooling layers together act as the feature extractor
• Feature maps: the outputs produced by applying the kernel maps
37. More
• What if you want the feature map to be the same size as the input image? Use zero padding.
Ref: https://towardsdatascience.com/convolution-neural-networks-a-beginners-guide-implementing-a-mnist-hand-written-digit-8aa60330d022
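The effect of zero padding follows from the usual output-size formula o = (n + 2p − k) / s + 1 for input size n, kernel size k, padding p, and stride s; a quick sketch with arbitrary sizes:

```python
def conv_output_size(n, k, p=0, s=1):
    """Spatial output size of a convolution: input n, kernel k, padding p, stride s."""
    return (n + 2 * p - k) // s + 1

n, k = 28, 3
print(conv_output_size(n, k))                  # 26: no padding shrinks the map
print(conv_output_size(n, k, p=(k - 1) // 2))  # 28: zero padding keeps the input size
```

For stride 1 and an odd kernel size k, padding p = (k − 1) / 2 is exactly what Keras applies when you pass `padding="same"` to a Conv2D layer.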
38. Feature extractor
• Pooling
• Max
• Average
Ref: https://www.researchgate.net/figure/Toy-example-illustrating-the-drawbacks-of-max-pooling-and-average-pooling_fig2_300020038
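Both pooling variants can be illustrated on a toy 4×4 feature map with 2×2 windows and stride 2 (the values are invented):

```python
import numpy as np

def pool2x2(x, op):
    """Apply a 2x2, stride-2 pooling operation to a 2-D feature map."""
    h, w = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2)   # split into non-overlapping 2x2 blocks
    return op(blocks, axis=(1, 3))             # reduce each block with max or mean

fmap = np.array([[1, 3, 2, 0],
                 [5, 6, 1, 2],
                 [7, 2, 9, 4],
                 [0, 1, 3, 3]])

print(pool2x2(fmap, np.max))   # [[6 2] [7 9]]
print(pool2x2(fmap, np.mean))  # [[3.75 1.25] [2.5  4.75]]
```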
40. Classifier
• Flatten Layer
• It is used to convert the data into a 1D array, creating a single feature vector. After flattening, we forward the data to a fully connected layer for the final classification
Ref: https://data-flair.training/blogs/keras-convolution-neural-network/
41. Classifier
• Dense Layer
• It is a fully connected layer: each node is connected to every node in the previous layer
• This layer is used at the final stage of a CNN to perform classification
Ref: https://data-flair.training/blogs/keras-convolution-neural-network/
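In NumPy terms, Flatten is just a reshape and a Dense layer is a matrix multiply plus bias; the shapes and random values below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy batch of feature maps from the extractor: 4x4 spatial, 8 channels.
features = rng.standard_normal((1, 4, 4, 8))

# Flatten: collapse everything except the batch axis into one feature vector.
flat = features.reshape(features.shape[0], -1)   # shape (1, 128)

# Dense: every input feature connects to every output node (weights W, bias b).
W = rng.standard_normal((128, 10))
b = np.zeros(10)
logits = flat @ W + b                            # shape (1, 10)

print(flat.shape, logits.shape)  # (1, 128) (1, 10)
```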
43. Classifier
• Activation function
• It makes the neural network non-linear
• It has two main roles
• The function itself is applied in the feed-forward step
• Its derivative is used in the backpropagation step
• You can think of feed-forward as prediction, and backpropagation as training (learning)
Ref: https://sefiks.com/2020/02/02/dance-moves-of-deep-learning-activation-functions/
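Taking sigmoid as one concrete example, the two roles look like this (toy inputs):

```python
import numpy as np

def sigmoid(x):
    """Feed-forward role: the activation itself."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Backpropagation role: the derivative, sigmoid(x) * (1 - sigmoid(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))       # used when predicting (feed forward)
print(sigmoid_grad(x))  # used when training (backpropagation)
```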
44. Classifier
• Dropout Layer
• It is used to prevent the network from overfitting
Ref: https://data-flair.training/blogs/keras-convolution-neural-network/
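A sketch of what dropout does at training time, using the common "inverted dropout" formulation (the rate and shapes are arbitrary):

```python
import numpy as np

def dropout(x, rate, rng):
    """Inverted dropout (training time): zero out a fraction `rate` of units and
    rescale the survivors so the expected activation stays unchanged."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

rng = np.random.default_rng(42)
acts = np.ones((2, 8))
print(dropout(acts, rate=0.5, rng=rng))  # roughly half zeros, survivors scaled to 2.0
```

At inference time the layer does nothing; the rescaling during training is what keeps the two phases consistent.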
45. Keras framework
• Keras is a deep learning API written in Python, running on top of the
machine learning platform TensorFlow.
• It was developed with a focus on enabling fast experimentation
• Integrates with TensorFlow 2
• Efficiently executes low-level tensor operations on CPU, GPU, or TPU
• Speeds up development of deep-learning networks
• Provides fully connected, convolutional, pooling, RNN, LSTM… layers
• The latest version: 2.8 (2022-02-03)
Ref: https://keras.io/about/
46. Dog and cat image dataset
• This example shows how to do image classification from scratch,
starting from JPEG image files on disk, without leveraging pre-trained
weights or a pre-made Keras Application model
• We demonstrate the workflow on Cats vs Dogs binary classification
dataset
What are pre-trained weights?
47. Dog and cat image dataset
• Cat and Dog: 23,422 images
• Training: 18,738 images
• Validation: 4,684 images
• Image sizes are not fixed!
• 350×320
• 448×329
• …
• …
48. More
• Data imbalance
• Suppose a dataset consists of 100 cat and 900 dog images. If we train the neural network on this data, it will just learn to predict "dog" every time
• In this case, we can easily balance the data using sampling techniques
• Down-sampling
• By removing some dog examples
• Up-sampling
• By creating more cat examples using image augmentation or any other method
Ref: Multi-Label Image Classification with Neural Network | Keras | by Shiva Verma | Towards Data Science
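Both sampling directions can be sketched with plain random selection; the file names are hypothetical placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
cats = [f"cat_{i}.jpg" for i in range(100)]   # minority class
dogs = [f"dog_{i}.jpg" for i in range(900)]   # majority class

# Down-sampling: drop dog examples until both classes match.
dogs_down = list(rng.choice(dogs, size=len(cats), replace=False))

# Up-sampling: repeat cat examples (in practice, augmented copies) to match.
cats_up = list(rng.choice(cats, size=len(dogs), replace=True))

print(len(cats), len(dogs_down))  # 100 100
print(len(cats_up), len(dogs))    # 900 900
```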
49. Dog and cat image dataset
• Network Calculator
Ref: https://madebyollin.github.io/convnet-calculator/
51. More
• Reinstall the conda packages below if an error occurs
!conda uninstall pydot
!conda uninstall pydotplus
!conda uninstall graphviz
!conda install pydot
!conda install pydotplus
Graphviz is open source graph visualization software.
Graph visualization is a way of representing structural information as diagrams of abstract graphs and networks.
https://graphviz.org
52. More
• Model progress can be saved during and after training. This means a
model can resume where it left off and avoid long training times.
Saving also means you can share your model and others can recreate
your work
• https://www.tensorflow.org/tutorials/keras/save_and_load
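One way to sketch this with Keras is saving and restoring weights; the tiny model and the file name `demo.weights.h5` are arbitrary choices for illustration:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_model():
    return keras.Sequential([
        layers.Input(shape=(4,)),
        layers.Dense(3, activation="softmax"),
    ])

model = build_model()
x = np.array([[0.1, 0.2, 0.3, 0.4]], dtype="float32")
before = model(x).numpy()

model.save_weights("demo.weights.h5")     # save the learned parameters

restored = build_model()                  # a fresh model with the same architecture
restored.load_weights("demo.weights.h5")  # picks up exactly where we left off
after = restored(x).numpy()

print(np.allclose(before, after))  # True
```

`model.save(...)` additionally stores the architecture and optimizer state, which is what allows training to resume; the tutorial link above covers both options.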
53. More
• Transformer model (Transformers Replace CNNs, RNNs)
Ref: https://blogs.nvidia.com/blog/2022/03/25/what-is-a-transformer-model/
54. Homework
• Try reducing the cat and dog images to 500 each, and check the effect on performance
• Try adding bird images as a third image label
• Download the bird images
• https://drive.google.com/file/d/1NgmjVrRug_qPqlfU_Zb2O-
kyd5Hblaat/view?usp=sharing
• Modify the code and build the model for multi-class prediction
55. Hint
• one-hot-encoded labels/classes like [0, 0, 0, 0, 0, 0, 1]
• categorical_crossentropy
• sparse (integer) labels/classes like 0, 1, 2, …, 6
• sparse_categorical_crossentropy
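The two losses compute the same quantity; a NumPy sketch with an invented 4-class probability vector makes the equivalence concrete:

```python
import numpy as np

probs = np.array([0.1, 0.2, 0.6, 0.1])   # softmax output over 4 classes
label = 2                                 # sparse (integer) label
one_hot = np.eye(4)[label]                # [0., 0., 1., 0.]

# categorical_crossentropy expects the one-hot vector...
cat_ce = -np.sum(one_hot * np.log(probs))
# ...sparse_categorical_crossentropy expects the integer and indexes directly.
sparse_ce = -np.log(probs[label])

print(cat_ce, sparse_ce)  # identical: both equal -log(0.6) ≈ 0.5108
```

So the choice between the two Keras losses is purely about how your labels are stored, not about what is optimized.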