Convolutional neural network
Implementation of a deep convolutional network
Consider padding = 0 and stride = 1.
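For an n x n input, a k x k filter with padding p = 0 and stride s = 1 produces an output of size n - k + 1. A minimal sketch of the general output-size formula (the example sizes below are illustrative assumptions, not from the slides):

def conv_output_size(n, k, p=0, s=1):
    # Convolution output size: floor((n + 2p - k) / s) + 1
    return (n + 2 * p - k) // s + 1

print(conv_output_size(6, 3))  # 6x6 input, 3x3 filter, p=0, s=1 -> 4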
Transfer learning with a pre-trained CNN (we already did this in the previous unit)
Data augmentation
• Data augmentation is a technique to artificially create new training data from existing training data. This is done
by applying domain-specific techniques to examples from the training data that create new and different
training examples.
• Image data augmentation is perhaps the most well-known type of data augmentation and involves creating
transformed versions of images in the training dataset that belong to the same class as the original image.
• Transforms include a range of operations from the field of image manipulation, such as shifts, flips, zooms, and
much more.
• The intent is to expand the training dataset with new, plausible examples: variations of the training set images
that the model is likely to see. For example, a horizontal flip of a picture of a cat may make sense, because the
photo could have been taken from the left or the right.
• A vertical flip of a photo of a cat does not make sense and would probably not be appropriate, given that the
model is very unlikely to see a photo of an upside-down cat.
• As such, the specific data augmentation techniques used for a training dataset must be chosen carefully,
within the context of the training dataset and with knowledge of the problem domain.
• In addition, it can be useful to experiment with data augmentation methods in isolation and in concert to see
if they result in a measurable improvement to model performance, perhaps with a small prototype dataset,
model, and training run.
• Modern deep learning algorithms, such as the convolutional neural network, or CNN, can learn features that
are invariant to their location in the image.
• Nevertheless, augmentation can further aid this transform-invariant approach to learning, helping the model
learn features that are also invariant to transforms such as left-to-right and top-to-bottom ordering, light levels
in photographs, and more.
• Image data augmentation is typically applied only to the training dataset, not to the validation or test
dataset. This is different from data preparation steps such as image resizing and pixel scaling, which must be
performed consistently across all datasets that interact with the model.
Some of the most common data augmentation techniques used for images are:
Position augmentation
• Scaling
• Cropping
• Flipping
• Padding
• Rotation
• Translation
• Affine transformation
Color augmentation
• Brightness
• Contrast
• Saturation
• Hue
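Many of these transforms are available as Keras preprocessing layers. Below is a minimal sketch, assuming a recent TensorFlow release where these layers live under tf.keras.layers; the factor values are illustrative, not tuned:

import tensorflow as tf
from tensorflow.keras import layers

# Chain position and color augmentations into one pipeline.
# These layers transform inputs only when called with training=True,
# so they are inactive at inference time.
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),     # flipping
    layers.RandomRotation(0.1),          # rotation (fraction of a full turn)
    layers.RandomZoom(0.1),              # scaling / zoom
    layers.RandomTranslation(0.1, 0.1),  # translation (height, width factors)
    layers.RandomContrast(0.2),          # contrast
])

# Usage: augmented_images = data_augmentation(images, training=True)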
Image segmentation
Image segmentation is a computer vision task that segments an image into multiple areas by assigning
a label to every pixel of the image. It provides much more information about an image than object
detection, which draws a bounding box around the detected object, or image classification, which
assigns a label to the object.
Segmentation is used in real-world applications such as medical imaging, clothes segmentation, flood
mapping, self-driving cars, etc.
There are two types of image segmentation:
• Semantic segmentation: classify each pixel with a label.
• Instance segmentation: classify each pixel and differentiate each object instance.
U-Net is a semantic segmentation technique originally proposed for medical imaging segmentation. It’s
one of the earlier deep learning segmentation models, and the U-Net architecture is also used in many
GAN variants such as the Pix2Pix generator.
U-Net Architecture
The model architecture is fairly simple: an encoder (for downsampling) and a decoder (for upsampling) with
skip connections. As the figure shows, it is shaped like the letter U, hence the name U-Net.
The gray arrows indicate the skip connections that concatenate the encoder feature map with the decoder,
which helps the backward flow of gradients for improved training.
Import required libraries
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow_datasets as tfds
import matplotlib.pyplot as plt
import numpy as np
Dataset
We can use tfds to load the dataset by specifying its name, and get the dataset info by setting
with_info=True:
dataset, info = tfds.load('oxford_iiit_pet:3.*.*', with_info=True)
Print the dataset info with print(info), and you will see all kinds of detailed information about the Oxford
pet dataset. For example, in the figure below, it can be seen that there are a total of 7349 images with a
built-in train/test split.
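The split sizes can also be read programmatically from info (a small sketch; the counts match those reported later in this section):

print(info.splits["train"].num_examples)  # 3680
print(info.splits["test"].num_examples)   # 3669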
We need to make a few changes to the downloaded data before training U-Net with it.
First, resize the images and masks to 128x128:
def resize(input_image, input_mask):
    input_image = tf.image.resize(input_image, (128, 128), method="nearest")
    input_mask = tf.image.resize(input_mask, (128, 128), method="nearest")
    return input_image, input_mask
Create a function to augment the dataset by randomly flipping the images and masks horizontally:
def augment(input_image, input_mask):
    if tf.random.uniform(()) > 0.5:
        # Randomly flip the image and mask together
        input_image = tf.image.flip_left_right(input_image)
        input_mask = tf.image.flip_left_right(input_mask)
    return input_image, input_mask
Create a function to normalize the dataset by scaling the images to the range [0, 1] and decreasing the mask
labels by 1, so they become 0, 1, and 2:
def normalize(input_image, input_mask):
    input_image = tf.cast(input_image, tf.float32) / 255.0
    input_mask -= 1
    return input_image, input_mask
Create two functions to preprocess the training and test datasets, with a slight difference between the two: we only
perform image augmentation on the training dataset.
def load_image_train(datapoint):
    input_image = datapoint["image"]
    input_mask = datapoint["segmentation_mask"]
    input_image, input_mask = resize(input_image, input_mask)
    input_image, input_mask = augment(input_image, input_mask)
    input_image, input_mask = normalize(input_image, input_mask)
    return input_image, input_mask

def load_image_test(datapoint):
    input_image = datapoint["image"]
    input_mask = datapoint["segmentation_mask"]
    input_image, input_mask = resize(input_image, input_mask)
    input_image, input_mask = normalize(input_image, input_mask)
    return input_image, input_mask
Now, build an input pipeline with tf.data by using the map() function:
train_dataset = dataset["train"].map(load_image_train, num_parallel_calls=tf.data.AUTOTUNE)
test_dataset = dataset["test"].map(load_image_test, num_parallel_calls=tf.data.AUTOTUNE)
If we execute print(train_dataset), we will notice that the images have shape 128x128x3 with dtype
tf.float32, while the masks have shape 128x128x1 with dtype tf.uint8.
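A quick, hedged way to confirm this is to print the dataset's element spec:

print(train_dataset.element_spec)
# Expect roughly: (TensorSpec(shape=(128, 128, 3), dtype=tf.float32, ...),
#                  TensorSpec(shape=(128, 128, 1), dtype=tf.uint8, ...))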
We define a batch size of 64 and a buffer size of 1000 for creating batches of the training and test datasets.
With the original TFDS dataset, there are 3680 training samples and 3669 test samples; below, the test samples
are further split into 3000 validation samples and 669 test samples. We will use train_batches and
validation_batches for training the U-Net model. After training finishes, we will use test_batches to test the
model predictions.
BATCH_SIZE = 64
BUFFER_SIZE = 1000
train_batches = train_dataset.cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE).repeat()
train_batches = train_batches.prefetch(buffer_size=tf.data.AUTOTUNE)
validation_batches = test_dataset.take(3000).batch(BATCH_SIZE)
test_batches = test_dataset.skip(3000).take(669).batch(BATCH_SIZE)
Now the datasets are ready for training. Let's visualize a random sample image and its mask from the
training dataset to get an idea of what the data looks like.
def display(display_list):
    plt.figure(figsize=(15, 15))
    title = ["Input Image", "True Mask", "Predicted Mask"]
    for i in range(len(display_list)):
        plt.subplot(1, len(display_list), i + 1)
        plt.title(title[i])
        plt.imshow(tf.keras.utils.array_to_img(display_list[i]))
        plt.axis("off")
    plt.show()

sample_batch = next(iter(train_batches))
random_index = np.random.choice(sample_batch[0].shape[0])
sample_image, sample_mask = sample_batch[0][random_index], sample_batch[1][random_index]
display([sample_image, sample_mask])
Output
Model Architecture
Now that we have the data ready for training, let’s define the U-Net model architecture. As mentioned earlier, the
U-Net is shaped like a letter U with an encoder, decoder, and the skip connections between them. So we will create
a few building blocks to make the U-Net model.
Building blocks
First, we create a function double_conv_block with layers Conv2D-ReLU-Conv2D-ReLU, which we will use in both the
encoder (or the contracting path) and the bottleneck of the U-Net.
def double_conv_block(x, n_filters):
    # Conv2D then ReLU activation
    x = layers.Conv2D(n_filters, 3, padding="same", activation="relu", kernel_initializer="he_normal")(x)
    # Conv2D then ReLU activation
    x = layers.Conv2D(n_filters, 3, padding="same", activation="relu", kernel_initializer="he_normal")(x)
    return x
Then we define a downsample_block function for downsampling or feature extraction to be used in the encoder.
def downsample_block(x, n_filters):
    f = double_conv_block(x, n_filters)
    p = layers.MaxPool2D(2)(f)
    p = layers.Dropout(0.3)(p)
    return f, p
Finally, we define an upsampling function upsample_block for the decoder (or expanding path) of the U-Net.
def upsample_block(x, conv_features, n_filters):
    # upsample
    x = layers.Conv2DTranspose(n_filters, 3, 2, padding="same")(x)
    # concatenate with the encoder feature map (skip connection)
    x = layers.concatenate([x, conv_features])
    # dropout
    x = layers.Dropout(0.3)(x)
    # Conv2D twice with ReLU activation
    x = double_conv_block(x, n_filters)
    return x
U-Net has a fairly simple architecture; however, to create the skip connections between
the encoder and decoder, we will need to concatenate some layers. So the Keras
Functional API is most appropriate for this purpose.
First, we create a build_unet_model function, specifying the inputs, encoder layers,
bottleneck, decoder layers, and finally the output layer, a Conv2D with softmax
activation. Note that the input image shape is 128x128x3. The output has three channels
corresponding to the three classes the model classifies each pixel into: background,
foreground object, and object outline.
def build_unet_model():
    # inputs
    inputs = layers.Input(shape=(128, 128, 3))
    # encoder: contracting path - downsample
    # 1 - downsample
    f1, p1 = downsample_block(inputs, 64)
    # 2 - downsample
    f2, p2 = downsample_block(p1, 128)
    # 3 - downsample
    f3, p3 = downsample_block(p2, 256)
    # 4 - downsample
    f4, p4 = downsample_block(p3, 512)
    # 5 - bottleneck
    bottleneck = double_conv_block(p4, 1024)
    # decoder: expanding path - upsample
    # 6 - upsample
    u6 = upsample_block(bottleneck, f4, 512)
    # 7 - upsample
    u7 = upsample_block(u6, f3, 256)
    # 8 - upsample
    u8 = upsample_block(u7, f2, 128)
    # 9 - upsample
    u9 = upsample_block(u8, f1, 64)
    # outputs
    outputs = layers.Conv2D(3, 1, padding="same", activation="softmax")(u9)
    # unet model with Keras Functional API
    unet_model = tf.keras.Model(inputs, outputs, name="U-Net")
    return unet_model

unet_model = build_unet_model()
Compile and Train U-Net
To compile unet_model, we specify the optimizer, the loss function, and the accuracy metrics to track
during training:
unet_model.compile(optimizer=tf.keras.optimizers.Adam(),
                   loss="sparse_categorical_crossentropy",
                   metrics=["accuracy"])
We train unet_model by calling its fit() method, training for 20 epochs.
NUM_EPOCHS = 20
TRAIN_LENGTH = info.splits["train"].num_examples
STEPS_PER_EPOCH = TRAIN_LENGTH // BATCH_SIZE
VAL_SUBSPLITS = 5
TEST_LENGTH = info.splits["test"].num_examples
VALIDATION_STEPS = TEST_LENGTH // BATCH_SIZE // VAL_SUBSPLITS
model_history = unet_model.fit(train_batches,
                               epochs=NUM_EPOCHS,
                               steps_per_epoch=STEPS_PER_EPOCH,
                               validation_steps=VALIDATION_STEPS,
                               validation_data=validation_batches)
After training for 20 epochs, we get a training accuracy and a validation accuracy of ~0.88. The
learning curves show that the model performs well on both the training and validation sets,
indicating that it generalizes without much overfitting (as shown in the figure below).
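A minimal sketch for plotting those learning curves from model_history, assuming the default Keras history keys produced by metrics=["accuracy"]:

# Training vs. validation accuracy per epoch
acc = model_history.history["accuracy"]
val_acc = model_history.history["val_accuracy"]
plt.plot(acc, label="Training accuracy")
plt.plot(val_acc, label="Validation accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()
plt.show()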
Prediction
Now that we have completed training the unet_model, let’s use it to make predictions on a
few sample images of the test dataset.
def create_mask(pred_mask):
    # Take the most likely class per pixel and keep a channel axis
    pred_mask = tf.argmax(pred_mask, axis=-1)
    pred_mask = pred_mask[..., tf.newaxis]
    return pred_mask[0]

def show_predictions(dataset=None, num=1):
    if dataset:
        for image, mask in dataset.take(num):
            pred_mask = unet_model.predict(image)
            display([image[0], mask[0], create_mask(pred_mask)])
    else:
        display([sample_image, sample_mask,
                 create_mask(unet_model.predict(sample_image[tf.newaxis, ...]))])
count = 0
for i in test_batches:
    count += 1
print("number of batches:", count)
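For example, to visualize predictions on a few test batches (a usage sketch with the functions defined above):

# Display the input image, true mask, and predicted mask
# for the first 3 batches of the test set
show_predictions(test_batches, 3)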
The figure below shows the input images, the true masks, and the masks predicted by the trained U-Net model.
Thank You!!!