Machine learning is a kind of programming that gives computers the capability to learn automatically from data without being explicitly programmed.
In other words, these programs change their behavior by learning from data.
In this course we will cover various aspects of machine learning.
Of course, everything will be related to Python, so it is machine learning using Python.
What is the best programming language for machine learning?
Python is clearly one of the top players!
Topics:
• k-nearest Neighbor Classifier
• Neural Networks
– Neural Networks from Scratch in Python
– Neural Network in Python using NumPy
– Dropout Neural Networks
– Neural Networks with Scikit
– Machine Learning with Scikit and Python
• Naive Bayes Classifier
• Introduction to Text Classification using Naive Bayes and Python
Machine learning can be roughly
separated into three categories:
• Supervised learning
– The machine learning program is given both the input data and the corresponding labels. This means that the training data has to be labelled by a human being beforehand.
• Unsupervised learning
– No labels are provided to the learning algorithm. The algorithm has to figure out a clustering of the input data.
• Reinforcement learning
– A computer program dynamically interacts with its environment. This means that the program receives positive and/or negative feedback to improve its performance.
Machine Learning Terminology
• Classifier: A program or a function which maps from unlabeled instances to classes is called a classifier.
• Confusion Matrix: A confusion matrix, also called a contingency table or error matrix, is used to visualize the performance of a classifier.
• The columns of the matrix represent the instances of the predicted classes and the rows represent the instances of the actual classes. (Note: It can be the other way around as well.)
• In the case of binary classification the table has 2 rows and 2 columns.
Example:

Confusion            Predicted classes
Matrix              male      female
Actual    male       42          8
classes   female     18         32

• This means that the classifier correctly predicted a male person in 42 cases and wrongly predicted 8 male instances as female. It correctly predicted 32 instances as female, while 18 cases had been wrongly predicted as male instead of female.
Accuracy
• Accuracy is a statistical measure which is defined as the number of correct predictions made by a classifier divided by the total number of predictions made by the classifier.
• The classifier in our previous example correctly predicted 42 male instances and 32 female instances.
• Therefore, the accuracy can be calculated as:
• accuracy = (42 + 32) / (42 + 8 + 18 + 32)
• which is 0.74
• Let's assume we have a classifier which always predicts "female". Since 50 of the 100 instances are female, we would have an accuracy of 50 % in this case.
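The calculation above can be sketched in Python. The matrix layout (rows = actual classes, columns = predicted classes) follows the example; the variable names are my own:

```python
# Confusion matrix from the example, order [male, female]:
# rows = actual classes, columns = predicted classes
confusion = [[42, 8],    # actual male: 42 predicted male, 8 predicted female
             [18, 32]]   # actual female: 18 predicted male, 32 predicted female

correct = confusion[0][0] + confusion[1][1]   # the diagonal holds the correct predictions
total = sum(sum(row) for row in confusion)    # all predictions made by the classifier
accuracy = correct / total
print(accuracy)  # 0.74
```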
• We will demonstrate the so-called accuracy paradox.
• A spam recognition classifier is described by the following confusion matrix:

Confusion            Predicted classes
Matrix              spam       ham
Actual    spam        4          1
classes   ham         4         91
• The accuracy of this classifier is (4 + 91) / 100, i.e. 95 %.
• The following classifier predicts solely "ham" and has the same accuracy:

Confusion            Predicted classes
Matrix              spam       ham
Actual    spam        0          5
classes   ham         0         95

• The accuracy of this classifier is 95 %, even though it is not capable of recognizing any spam at all.
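The paradox can be checked with a short sketch, using the two confusion matrices above (variable names are my own):

```python
# Two spam classifiers with the same accuracy but very different behaviour.
# rows = actual [spam, ham], columns = predicted [spam, ham]
clf_a = [[4, 1],
         [4, 91]]
clf_b = [[0, 5],     # this classifier never predicts "spam"
         [0, 95]]

def accuracy(cm):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

print(accuracy(clf_a))  # 0.95
print(accuracy(clf_b))  # 0.95 as well, although clf_b never detects any spam
```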
• "Show me who your friends are and I'll tell you who you are."
• The concept of the k-nearest neighbor classifier could hardly be described more simply. This is an old saying, which can be found in many languages and many cultures. It is also mentioned in other words in the Bible: "He who walks with wise men will be wise, but the companion of fools will suffer harm" (Proverbs 13:20)
k-nearest neighbor classifier
• This means that the concept of the k-nearest neighbor classifier is part of our everyday life and judging: Imagine you meet a group of people who are all very young, stylish and sporty.
• They talk about their friend Ben, who isn't with them. So, what is your mental picture of Ben? Right, you imagine him as being young, stylish and sporty as well.
k-nearest neighbor classifier
• What if you learn that Ben lives in a neighborhood where people vote conservative and where the average income is above 200,000 dollars a year?
• And that both his neighbors even make more than 300,000 dollars per year?
• What do you think of Ben?
• Most probably, you do not consider him to be an underdog, and you may suspect him to be a conservative as well.
k-nearest neighbor classifier
• The principle behind nearest neighbor classification consists in finding a predefined number k of training samples closest in distance to a new sample which has to be classified. The label of the new sample will be determined from these neighbors.
• k-nearest neighbor classifiers have a fixed, user-defined constant for the number of neighbors which have to be determined.
• There are also radius-based neighbor learning algorithms, which have a varying number of neighbors based on the local density of points: all the samples inside a fixed radius.
Neighbors-based methods are
known as non-generalizing
machine learning methods
• The distance can, in general, be any metric measure; the standard Euclidean distance is the most common choice.
• Neighbors-based methods are known as non-generalizing machine learning methods, since they simply "remember" all of their training data.
• Classification is computed by a majority vote of the nearest neighbors of the unknown sample.
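A minimal sketch of such a distance function, using NumPy's norm (the function name is my own choice):

```python
import numpy as np

def distance(instance1, instance2):
    """Standard Euclidean distance between two feature vectors."""
    instance1 = np.array(instance1)
    instance2 = np.array(instance2)
    return np.linalg.norm(instance1 - instance2)

print(distance([3, 4], [0, 0]))  # 5.0
```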
k-NN algorithm
• The k-NN algorithm is among the simplest
of all machine learning algorithms, but
despite its simplicity, it has been quite
successful in a large number of
classification and regression problems, for
example character recognition or image
analysis.
• Now let's get a little bit more mathematical:
• The k-Nearest-Neighbor Classifier (k-NN) works directly on the learned samples instead of creating rules, in contrast to other classification methods.
Nearest Neighbor Algorithm:
• Given a set of categories {c1, c2, ..., cn}, also called classes, e.g. {"male", "female"}.
• There is also a learnset LS consisting of labelled instances.
• The task of classification consists in assigning a category or class to an arbitrary instance o. If the instance o is an element of LS, the label of the instance will be used.
• Now we will look at the case where o is not in LS: o is compared with all instances of LS.
• A distance metric is used for comparison. We determine the k closest neighbors of o, i.e. the items with the smallest distances. k is a user-defined constant and a positive integer, which is usually small.
• The most common class among these k neighbors will be assigned to the instance o. If k = 1, the object is simply assigned to the class of its single nearest neighbor.
• The algorithm for the k-nearest neighbor classifier is among the simplest of all machine learning algorithms. k-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all the computations are performed when we do the actual classification.
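The steps above can be sketched as follows; the helper names get_neighbors and vote are my own, and this is a simplified illustration rather than the course's actual implementation:

```python
import numpy as np
from collections import Counter

def get_neighbors(training_set, labels, test_instance, k):
    """Return the labels of the k training samples closest to test_instance."""
    distances = [np.linalg.norm(np.array(item) - np.array(test_instance))
                 for item in training_set]
    nearest = np.argsort(distances)[:k]   # indices of the k smallest distances
    return [labels[i] for i in nearest]

def vote(neighbor_labels):
    """Majority vote: the most common class among the neighbors."""
    return Counter(neighbor_labels).most_common(1)[0][0]

# Tiny toy learnset with two classes
training_set = [[1, 1], [1, 2], [5, 5], [6, 5], [6, 6]]
labels = ["a", "a", "b", "b", "b"]
print(vote(get_neighbors(training_set, labels, [1.5, 1.5], 3)))  # 'a'
```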
Preparing the Dataset
• Before we actually start with writing a nearest neighbor
classifier, we need to think about the data, i.e. the
learnset.
• We will use the "iris" dataset provided by the datasets of
the sklearn module.
• The data set consists of 50 samples from each of three
species of Iris
– Iris setosa,
– Iris virginica and
– Iris versicolor.
• Four features were measured from each sample: the
length and the width of the sepals and petals, in
centimetres.
Preparing the Dataset

import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
iris_data = iris.data
iris_labels = iris.target

print(iris_data[0], iris_data[79], iris_data[100])
print(iris_labels[0], iris_labels[79], iris_labels[100])

The output:
[5.1 3.5 1.4 0.2] [5.7 2.6 3.5 1. ] [6.3 3.3 6.  2.5]
0 1 2

We create a learnset from the sets above. We use permutation from np.random
to split the data randomly.
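The permutation split mentioned above might look like this; the test-set size of 12 and the fixed seed are my own choices:

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
iris_data = iris.data
iris_labels = iris.target

np.random.seed(42)                       # fixed seed, only for reproducibility
indices = np.random.permutation(len(iris_data))
n_test = 12                              # hold out 12 samples for testing
learn_data = iris_data[indices[:-n_test]]
learn_labels = iris_labels[indices[:-n_test]]
test_data = iris_data[indices[-n_test:]]
test_labels = iris_labels[indices[-n_test:]]
print(learn_data.shape, test_data.shape)  # (138, 4) (12, 4)
```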
Preparing the Dataset
• The following code is only necessary to
visualize the data of our learnset.
• Our data consists of four values per iris
item, so we will reduce the data to three
values by summing up the third and fourth
value.
• This way, we are capable of depicting the
data in 3-dimensional space:
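A possible sketch of this visualization with matplotlib; the colour choices and the output file name are mine:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets

iris = datasets.load_iris()
iris_data = iris.data
iris_labels = iris.target

# Reduce the four features to three by summing the third and fourth value
reduced = np.array([(x[0], x[1], x[2] + x[3]) for x in iris_data])

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
for label, colour in zip(range(3), ("r", "g", "b")):
    points = reduced[iris_labels == label]     # all samples of one species
    ax.scatter(points[:, 0], points[:, 1], points[:, 2], c=colour)
plt.savefig("iris_3d.png")   # or plt.show() in an interactive session
```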
k-Nearest Neighbor
• The k-NN is an instance-based classifier. The underlying idea is that the likelihood that two instances of the instance space belong to the same category or class increases with the proximity of the instances. Proximity or closeness can be defined with a distance or similarity function.