5. A scientific computing package that replaces NumPy
to use the power of GPUs.
A deep learning research platform that provides
maximum flexibility and speed
6. A complete Python rewrite of the machine learning library
called Torch, which was written in Lua
Chainer - Deep learning library, huge in the NLP
community, a big inspiration to the PyTorch team
HIPS Autograd - Automatic differentiation library,
which became one of the big features of PyTorch
Motivated by the need for dynamic execution
7. 2017 in review
January 2017
PyTorch was born 🍼
July 2017
Kaggle Data Science Bowl won using PyTorch 🎉
August 2017
PyTorch 0.2 🚢
September 2017
fast.ai switches to PyTorch 🚀
October 2017
Salesforce releases QRNN 🖖
November 2017
Uber releases Pyro 🚗
December 2017
PyTorch 0.3 release! 🛳
9. Killer Features
Just Python,
on steroids
Dynamic computation allows flexible
inputs
Best suited for research and
prototyping
10. Summary
• PyTorch is a Python machine learning library focused
on research
• Released in January 2017, used by tech
companies and universities
• A dynamic and Pythonic way to do machine
learning
21. Operators
import torch
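# Uninitialized 5x3 tensor (its contents are arbitrary)
x = torch.Tensor(5, 3)
# Random 5x3 tensor, uniform on [0, 1)
y = torch.rand(5, 3)
# Element-wise addition, two equivalent spellings
print(x + y) # or
print(torch.add(x, y))
# Matrix multiplication: (2x3) times (3x3) gives (2x3)
a = torch.randn(2, 3)
b = torch.randn(3, 3)
print(torch.mm(a, b))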
https://pytorch.org/docs/stable/tensors.html
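A small sketch of the shape rules behind the operators above, assuming a recent PyTorch release: element-wise addition keeps the operand shape, while torch.mm needs the inner dimensions to agree. The names total, product and same are only illustrative.

```python
import torch

# Element-wise addition keeps the shape of its operands.
x = torch.zeros(5, 3)
y = torch.rand(5, 3)
total = x + y
print(total.shape)    # torch.Size([5, 3])

# Matrix multiplication: (2 x 3) times (3 x 3) gives (2 x 3).
a = torch.randn(2, 3)
b = torch.randn(3, 3)
product = torch.mm(a, b)
print(product.shape)  # torch.Size([2, 3])

# For 2-D tensors the @ operator is another spelling of torch.mm.
same = torch.allclose(product, a @ b)
```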
22. Working With NumPy
import torch
a = torch.ones(5)
print(a) # tensor([1., 1., 1., 1., 1.])
b = a.numpy()
print(b) # [1. 1. 1. 1. 1.]
import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print(a) # [2. 2. 2. 2. 2.]
print(b)
# tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
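The sharing also works in the other direction. A minimal sketch, assuming CPU tensors: the array returned by numpy() views the same memory as the tensor, and clone() is one way to get an independent copy. The variable c is only illustrative.

```python
import torch

a = torch.ones(5)
b = a.numpy()            # b shares memory with a (CPU tensors only)
a.add_(1)                # in-place add on the tensor
print(b)                 # [2. 2. 2. 2. 2.]

c = a.clone().numpy()    # clone first to get an independent copy
a.add_(1)                # b follows this change, c does not
```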
23. Working With GPU
import torch
x = torch.Tensor(5, 3)
y = torch.rand(5, 3)
if torch.cuda.is_available():
    x = x.cuda()
    y = y.cuda()
    x + y
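A device-agnostic variant of the same idea, as a sketch assuming PyTorch 0.4 or later, where torch.device and Tensor.to are available. The code runs unchanged with or without a GPU; the names device, z and result are only illustrative.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.zeros(5, 3, device=device)  # create directly on the chosen device
y = torch.rand(5, 3).to(device)       # or move an existing tensor
z = x + y                             # computed on the chosen device

result = z.cpu()                      # copy back to host memory if needed
```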
24. Summary
• A tensor is like a Rubik's cube: a multidimensional array
• Scalars, vectors, matrices and tensors are the same
thing with a different number of dimensions
• We can use torch.Tensor() to create a
tensor.
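To make the "same thing, different number of dimensions" point concrete, here is a short sketch using torch.tensor and dim(). The values are arbitrary.

```python
import torch

scalar = torch.tensor(3.14)             # 0 dimensions
vector = torch.tensor([1.0, 2.0, 3.0])  # 1 dimension
matrix = torch.ones(2, 3)               # 2 dimensions
cube = torch.zeros(3, 3, 3)             # 3 dimensions, the "Rubik's cube"

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, t.dim(), tuple(t.shape))
```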
26. Differentiation
Refresher
IF $y = f(x) = 2x$ THEN $\frac{dy}{dx} = 2$
IF $y = f(x_1, x_2, \ldots, x_n)$ THEN $\left[ \frac{dy}{dx_1}, \frac{dy}{dx_2}, \ldots, \frac{dy}{dx_n} \right]$
is the gradient of y w.r.t. $[x_1, x_2, \ldots, x_n]$
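A small worked case of the gradient, as a sketch: the function y = x1² + x2² + x3² is an arbitrary choice for illustration, and its gradient is [2·x1, 2·x2, 2·x3].

```python
import torch

# Three inputs x1, x2, x3 held in one tensor.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

y = (x ** 2).sum()   # y = x1^2 + x2^2 + x3^2
y.backward()         # fills x.grad with [dy/dx1, dy/dx2, dy/dx3]

print(x.grad)        # tensor([2., 4., 6.])
```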
27. Autograd
• The calculus chain rule on steroids
• Derivative of a function within a function
• Complex functions can be written as
compositions of many simple functions
• Provides automatic differentiation for all tensor
operations
• Lives in the torch.autograd module
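The chain rule point can be checked on a small composition. A sketch, with y = sin(x²) chosen only as an example: by hand, dy/dx = 2x · cos(x²), and autograd should agree.

```python
import math
import torch

x = torch.tensor(1.5, requires_grad=True)
u = x ** 2           # inner function
y = torch.sin(u)     # outer function
y.backward()         # autograd applies the chain rule through both steps

# Hand-derived value: dy/dx = 2x * cos(x^2)
expected = 2 * 1.5 * math.cos(1.5 ** 2)
print(x.grad.item(), expected)
```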
28. Variable
• Crucial data structure, needed for automatic
differentiation
• A wrapper around Tensor
• Records a reference to the function that created it
29. Variable
import torch
from torch.autograd import Variable
x = Variable(torch.FloatTensor([11.2]),
requires_grad=True)
y = 2 * x
print(x)
# tensor([11.2000], requires_grad=True)
print(y)
# tensor([22.4000], grad_fn=<MulBackward>)
print(x.data) # tensor([11.2000])
print(y.data) # tensor([22.4000])
print(x.grad_fn) # None
print(y.grad_fn)
# <MulBackward object at 0x10ae58e48>
y.backward() # Calculates the gradients
print(x.grad) # tensor([2.])
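Since PyTorch 0.4, Tensor and Variable are merged: the Variable wrapper is still accepted but deprecated, and requires_grad can be set on a tensor directly. A sketch of the same example without Variable:

```python
import torch

x = torch.tensor([11.2], requires_grad=True)  # no Variable wrapper needed
y = 2 * x

print(x.grad_fn)               # None, x was created by the user
print(y.grad_fn is not None)   # True, y was created by an operation

y.backward()                   # calculates the gradients
print(x.grad)                  # tensor([2.])
```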
30. Summary
• Autograd provides automatic differentiation for all tensor
operations, inside the torch.autograd module
• Variable is a wrapper around Tensor that records
a reference to the creator function
41. Access Dataset
Iterate over the data and train the model
for i, data in enumerate(trainloader):
    data, labels = data
    print(type(data)) # <class 'torch.Tensor'>
    print(data.size()) # torch.Size([10, 3, 32, 32])
    print(type(labels)) # <class 'torch.Tensor'>
    print(labels.size()) # torch.Size([10])
    # Model training happens here...
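To try the loop without downloading anything, here is a sketch that swaps CIFAR10 for random stand-in data of the same shape. The tensors images and targets, and the choice of 40 examples, are assumptions for illustration only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for CIFAR10: 40 random 3x32x32 "images", labels 0..9.
images = torch.rand(40, 3, 32, 32)
targets = torch.randint(0, 10, (40,))
trainloader = DataLoader(TensorDataset(images, targets),
                         batch_size=10, shuffle=True)

batch_shapes = []
for i, (data, labels) in enumerate(trainloader):
    # Each iteration yields one batch of inputs and matching labels.
    batch_shapes.append((tuple(data.size()), tuple(labels.size())))

print(len(batch_shapes))   # 4
print(batch_shapes[0])     # ((10, 3, 32, 32), (10,))
```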
43. Summary
• We can use existing datasets provided by torch and
torchvision, such as CIFAR10
• A dataset is the set of training examples, an epoch is one
pass of the whole dataset through the model, a batch is a subset of the
training examples, and an iteration is a single pass of one batch
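The relation between these terms is simple arithmetic. A sketch, assuming the 50,000-image CIFAR10 training split and the batch size of 10 used earlier; num_epochs is an arbitrary illustrative value.

```python
import math

num_examples = 50000   # size of the CIFAR10 training split
batch_size = 10        # examples per batch
num_epochs = 2         # illustrative value only

# One iteration processes one batch, one epoch covers every example once.
iterations_per_epoch = math.ceil(num_examples / batch_size)
total_iterations = iterations_per_epoch * num_epochs

print(iterations_per_epoch)  # 5000
print(total_iterations)      # 10000
```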
50. Summary
• A neural net is a collection of connected neurons,
each consisting of inputs, weights, a bias and an output.
• To generate its output, a neuron applies an
activation function such as sigmoid, tanh or ReLU.
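A single neuron can be sketched in a few lines: a weighted sum of the inputs plus a bias, passed through an activation function. The input, weight and bias values below are made up for illustration.

```python
import torch

inputs = torch.tensor([1.0, 2.0, 3.0])
weights = torch.tensor([0.5, -0.25, 0.1])
bias = torch.tensor(0.2)

# Weighted sum plus bias: 0.5 - 0.5 + 0.3 + 0.2 = 0.5
z = torch.dot(inputs, weights) + bias

# The same pre-activation value through three common activations.
sigmoid_out = torch.sigmoid(z)   # squashes to (0, 1)
tanh_out = torch.tanh(z)         # squashes to (-1, 1)
relu_out = torch.relu(z)         # keeps positive values, zeroes negatives

print(z.item(), sigmoid_out.item(), tanh_out.item(), relu_out.item())
```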
55. The Iris
import torch
import torch.nn as nn
import matplotlib.pyplot as plt
from torch.autograd import Variable
from data import iris
Import things
56. The Iris
class IrisNet(nn.Module):
    def __init__(self, input_size,
                 hidden1_size, hidden2_size, num_classes):
        super(IrisNet, self).__init__()
        # Three fully connected layers with ReLU in between
        self.layer1 = nn.Linear(input_size, hidden1_size)
        self.act1 = nn.ReLU()
        self.layer2 = nn.Linear(hidden1_size, hidden2_size)
        self.act2 = nn.ReLU()
        self.layer3 = nn.Linear(hidden2_size, num_classes)
    def forward(self, x):
        out = self.layer1(x)
        out = self.act1(out)
        out = self.layer2(out)
        out = self.act2(out)
        out = self.layer3(out)
        return out
# 4 input features, hidden layers of 100 and 50 units, 3 classes
model = IrisNet(4, 100, 50, 3)
print(model)
Create Module and Instance
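A quick shape check helps confirm the layer sizes line up. The sketch below uses nn.Sequential as a compact stand-in with the same layer sizes as IrisNet(4, 100, 50, 3), and a random batch of 5 samples, which is an assumption for illustration rather than real Iris data.

```python
import torch
import torch.nn as nn

# Compact stand-in with the same layer sizes as IrisNet(4, 100, 50, 3).
model = nn.Sequential(
    nn.Linear(4, 100), nn.ReLU(),
    nn.Linear(100, 50), nn.ReLU(),
    nn.Linear(50, 3),
)

batch = torch.rand(5, 4)           # 5 hypothetical flowers, 4 features each
out = model(batch)                 # raw class scores, one row per flower
predicted = out.argmax(dim=1)      # index of the highest score per row

n_params = sum(p.numel() for p in model.parameters())
print(out.shape)    # torch.Size([5, 3])
print(n_params)     # 5703
```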
59. Loss Function
In PyTorch
• L1Loss
• MSELoss
• CrossEntropyLoss
• BCELoss
• SoftMarginLoss
• More: https://pytorch.org/docs/stable/nn.html?#loss-functions
60. Loss Function
CrossEntropyLoss
Measures the performance of a classification model
whose output is a probability value between 0 and 1.
             🍎     🍌     🍍
Prediction   0.02   0.88   0.1
Actual              🍌
Loss Score   0.98   0.12   0.9
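As a rough sketch of how nn.CrossEntropyLoss scores the fruit example: it expects raw scores rather than probabilities and applies log-softmax internally, so the loss reduces to the negative log of the probability given to the true class. The number it returns is therefore on a different scale from the illustrative "Loss Score" row. Passing log(probs) as the scores is a trick that works here only because the probabilities already sum to 1.

```python
import math
import torch
import torch.nn as nn

probs = torch.tensor([[0.02, 0.88, 0.10]])  # apple, banana, pineapple
target = torch.tensor([1])                  # index of the actual class, banana

# log(probs) acts as the raw scores: softmax(log(p)) == p when p sums to 1.
criterion = nn.CrossEntropyLoss()
loss = criterion(torch.log(probs), target)

print(round(loss.item(), 4))   # 0.1278, which is -log(0.88)
```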
63. Back to Iris
Let’s Train The Neural Network
net = IrisNet(4, 100, 50, 3)
# Loss Function
criterion = nn.CrossEntropyLoss()
# Optimizer
learning_rate = 0.001
optimizer = torch.optim.SGD(net.parameters(),
lr=learning_rate,
nesterov=True,
momentum=0.9,
dampening=0)
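To see the optimizer settings in action, here is a toy sketch on a single hypothetical parameter w with loss (w - 3)². The learning rate of 0.1 is larger than the slide's 0.001 so the effect shows within a few steps. Note that nesterov=True requires a nonzero momentum and zero dampening.

```python
import torch

w = torch.zeros(1, requires_grad=True)   # hypothetical single parameter
optimizer = torch.optim.SGD([w], lr=0.1, nesterov=True,
                            momentum=0.9, dampening=0)

def loss_fn():
    # Toy loss with its minimum at w = 3
    return ((w - 3.0) ** 2).sum()

initial_loss = loss_fn().item()   # 9.0
for _ in range(50):
    optimizer.zero_grad()         # reset gradients from the previous step
    loss = loss_fn()
    loss.backward()               # calculate the gradient
    optimizer.step()              # adjust w based on the gradient
final_loss = loss_fn().item()

print(initial_loss, final_loss)
```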
64. Back to Iris
Let’s Train The Neural Network
num_epochs = 500
for epoch in range(num_epochs):
    train_correct = 0
    train_total = 0
    for i, (items, classes) in enumerate(train_loader):
        # Convert torch tensor to Variable
        items = Variable(items)
        classes = Variable(classes)
65. Back to Iris
Let’s Train The Neural Network
        net.train() # Training mode
        optimizer.zero_grad() # Reset gradients from past operation
        outputs = net(items) # Forward pass
        loss = criterion(outputs, classes) # Calculate the loss
        loss.backward() # Calculate the gradient
        optimizer.step() # Adjust weight based on gradients
        train_total += classes.size(0)
        _, predicted = torch.max(outputs.data, 1)
        train_correct += (predicted == classes.data).sum().item()
        # loss.item() replaces loss.data[0], which fails on 0-dim tensors
        print('Epoch %d/%d, Iteration %d/%d, Loss: %.4f'
              % (epoch+1, num_epochs, i+1,
                 len(train_ds)//batch_size, loss.item()))
66. Back to Iris
Let’s Train The Neural Network
    net.eval() # Put the network into evaluation mode
    train_loss.append(loss.item())
    train_accuracy.append((100 * train_correct / train_total))
    # Record the testing loss
    test_items = torch.FloatTensor(test_ds.data.values[:, 0:4])
    test_classes = torch.LongTensor(test_ds.data.values[:, 4])
    outputs = net(Variable(test_items))
    loss = criterion(outputs, Variable(test_classes))
    test_loss.append(loss.item())
    # Record the testing accuracy
    _, predicted = torch.max(outputs.data, 1)
    total = test_classes.size(0)
    correct = (predicted == test_classes).sum().item()
    test_accuracy.append((100 * correct / total))
67. Back to Iris
Let’s Train The Neural Network
import torch
import torch.nn as nn
import matplotlib.pyplot as plt
from torch.autograd import Variable
from data import iris
# Create the module
class IrisNet(nn.Module):
    def __init__(self, input_size, hidden1_size, hidden2_size, num_classes):
        super(IrisNet, self).__init__()
        self.layer1 = nn.Linear(input_size, hidden1_size)
        self.act1 = nn.ReLU()
        self.layer2 = nn.Linear(hidden1_size, hidden2_size)
        self.act2 = nn.ReLU()
        self.layer3 = nn.Linear(hidden2_size, num_classes)
    def forward(self, x):
        out = self.layer1(x)
        out = self.act1(out)
        out = self.layer2(out)
        out = self.act2(out)
        out = self.layer3(out)
        return out
# Create a model instance
model = IrisNet(4, 100, 50, 3)
print(model)
# Create the DataLoader
batch_size = 60
iris_data_file = 'data/iris.data.txt'
train_ds, test_ds = iris.get_datasets(iris_data_file)
print('# instances in training set: ', len(train_ds))
print('# instances in testing/validation set: ', len(test_ds))
train_loader = torch.utils.data.DataLoader(dataset=train_ds, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_ds, batch_size=batch_size, shuffle=True)
# Model
net = IrisNet(4, 100, 50, 3)
# Loss Function
criterion = nn.CrossEntropyLoss()
# Optimizer
learning_rate = 0.001
optimizer = torch.optim.SGD(net.parameters(),
                            lr=learning_rate,
                            nesterov=True,
                            momentum=0.9,
                            dampening=0)
# Training iteration
num_epochs = 500
train_loss = []
test_loss = []
train_accuracy = []
test_accuracy = []
for epoch in range(num_epochs):
    train_correct = 0
    train_total = 0
    for i, (items, classes) in enumerate(train_loader):
        # Convert torch tensor to Variable
        items = Variable(items)
        classes = Variable(classes)
        net.train() # Training mode
        optimizer.zero_grad() # Reset gradients from past operation
        outputs = net(items) # Forward pass
        loss = criterion(outputs, classes) # Calculate the loss
        loss.backward() # Calculate the gradient
        optimizer.step() # Adjust weight/parameter based on gradients
        train_total += classes.size(0)
        _, predicted = torch.max(outputs.data, 1)
        train_correct += (predicted == classes.data).sum().item()
        # loss.item() replaces loss.data[0], which fails on 0-dim tensors
        print('Epoch %d/%d, Iteration %d/%d, Loss: %.4f'
              % (epoch+1, num_epochs, i+1, len(train_ds)//batch_size, loss.item()))
    net.eval() # Put the network into evaluation mode
    train_loss.append(loss.item())
    train_accuracy.append((100 * train_correct / train_total))
    # Record the testing loss
    test_items = torch.FloatTensor(test_ds.data.values[:, 0:4])
    test_classes = torch.LongTensor(test_ds.data.values[:, 4])
    outputs = net(Variable(test_items))
    loss = criterion(outputs, Variable(test_classes))
    test_loss.append(loss.item())
    # Record the testing accuracy
    _, predicted = torch.max(outputs.data, 1)
    total = test_classes.size(0)
    correct = (predicted == test_classes).sum().item()
    test_accuracy.append((100 * correct / total))
69. Summary
• Created a feed-forward neural network to predict the
type of a flower
• Started by reading the dataset and creating the DataLoader
• Chose a loss function and an optimizer
• Trained and evaluated the model