This document provides an overview of how to implement artificial intelligence solutions. It discusses getting started in AI either by creating new techniques as a scientist or by implementing existing techniques as an engineer. It then covers machine learning algorithms such as linear regression, decision trees, random forests, naive Bayes, k-nearest neighbors, k-means, and support vector machines. Finally, it introduces deep learning concepts such as artificial neural networks, neurons, layers, gradients, optimizers, overfitting, and regularization. The document serves as a guide for implementing both machine learning and deep learning techniques in AI applications.
14. What is Artificial Intelligence?
Artificial intelligence (AI) is the simulation of human intelligence processes by machines, especially computer systems. These processes include learning (the acquisition of information and rules for the use of information), reasoning (using the rules to reach approximate or definitive conclusions) and self-correction.
18. It is able to understand us
Although we give it very basic concepts, AI can create more complex concepts.
19. Creating videos that impersonate a person
It has been possible to create videos that supplant someone's identity.
20. Songs composed by AI
A song can be composed from the sound (a hum) of the human voice.
22. 1958
The computer will be the first device to think like the human brain. It can understand its surroundings, recognize you, speak your name and translate in real time.
It is possible to build brains that could reproduce themselves on an assembly line and which would be conscious of their existence.
Rosenblatt died in an accident.
24. Fun fact about George Boole
Geoffrey Hinton is a great-great-grandson of George Boole.
Boole's logic helped change the world from the mechanical industrial era to the digital era.
25. AI is also in the interest of governments
Geoffrey Hinton currently resides in Canada, and the government is investing in being a reference in the AI scene.
28. How do I start?
The first thing you should know is that you start either as an implementer of
existing techniques or as a creator of new techniques.
29. Creator of new techniques (Scientist)
In order to propose new algorithms, deep knowledge is required in several
areas, including:
● Linear algebra
● Statistics
● Equation systems
● Calculus
● Parallel computing
● Data structures
● Programming languages (usually C or Python)
… Among other areas
30. Implementer of existing techniques (Engineer)
As an implementer you just need to know how each technique differs from the others and which one to use depending on the problem. Implementing only involves adapting the inputs to what each algorithm requires, without the need to program a new algorithm. This requires:
● Data processing
● Programming languages (usually C or Python)
● Parallel computing
Among others
31. This talk will focus on how to implement
While it is true that the great discoveries come from the generation of new
algorithms, reaching a sufficient level of understanding of the foundations of
artificial intelligence takes time and usually requires postgraduate studies.
As an implementer you will eventually be able to improve existing
techniques.
32. The evolution of the methods
A simple way to understand how to implement AI is to compare it with a conventional programming method. A method is a piece of code that runs and returns a result, for example:
house_price = get_price( house_location )
The result can be looked up in a database or calculated with an existing formula (existing rules).
33. The evolution of the methods
When we implement an AI technique, it is used in a manner similar to a method, but there is no defined formula to obtain the result, and it returns two values: a predicted value and a percentage of certainty.
house_price, certainty_percentage = predict_price( house_location )
The result is inferred from previous data. There is no exact formula; it is always estimated from similar data.
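As a minimal sketch of this idea (not from the deck; predict_price, the data, and the choice of a linear model are assumptions for illustration), the AI-style method can be backed by a model trained on previous sales:
from sklearn.linear_model import LinearRegression
# Previous sales: [distance to the shopping center (km)] -> price
X = [[1], [2], [4], [7]]
y = [950000, 800000, 600000, 400000]
model = LinearRegression().fit(X, y)
def predict_price(house_location):
    # The result is inferred from similar previous data, not from a fixed formula
    predicted = model.predict([[house_location]])[0]
    certainty = model.score(X, y)  # stand-in for a real certainty estimate
    return predicted, certainty
house_price, certainty_percentage = predict_price(3)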
34. The most basic example: Hot Dog / Not Hot Dog
To start, a classic example is Hot Dog / Not Hot Dog. Hundreds of photos of hot dogs and hundreds of photos of anything else are collected. With that, many of the techniques detect precisely whether there is a hot dog. This is what the function would look like:
detected_class, certainty_percentage = is_a_hot_dog( image )
Where detected_class is “Hot Dog” and certainty_percentage ranges from 0 to 1.
If the value is higher than a threshold (e.g. 0.5), then it is a Hot Dog.
36. Hot Dog / Not Hot Dog use case
As in the Silicon Valley series.
37. Detecting cats and dogs
In this case we need 3 data sets: hundreds of photos of dogs, hundreds of cats, and hundreds of anything else.
detected_classes, certainty_percentages = cat_or_dog( images )
Because there is more than one category, the function returns a data array, where:
detected_classes[0] is “dog” and certainty_percentages[0] is 0.2
detected_classes[1] is “cat” and certainty_percentages[1] is 0.7
detected_classes[2] is “other” and certainty_percentages[2] is 0.1
Then it is most likely to be a cat.
38. Self-driving cars use the same principles
A car that drives itself, even if it sounds very complex, is just the implementation of several AI functions. This is a very simple example:
detected_classes, certainty_percentages = vision_sensor_1( camera_image )
IF detected_classes[0] == ‘Another car’ AND certainty_percentages[0] < 0.1
AND detected_classes[1] == ‘Pedestrian’ AND certainty_percentages[1] < 0.1
THEN SWITCH_LANE()
39. Autonomous car engineers use scale models
Working on autonomous cars is very attainable for anyone, since the vast majority of tests are done on scale models and not on real cars. Once the system is 100% effective on the models, tests are carried out on cars. An autonomous car laboratory can be set up at a very low cost.
40. Classification and Regression
We have already learned these concepts with examples of Supervised Learning.
Regression predicts a value: as in the case of the house, a house closer to a shopping center and main roads returns a larger price, for example 1,000,000.
Classification predicts a category: in the examples, the "classes" or categories to search for are fixed, and based on them the function tells us whether it is a "dog" or a "cat". We know the categories in advance.
41. But there is an extra case (Clustering)
When we don't know the categories of the data (Unsupervised Learning), we need the computer to group the items that resemble each other, and then give them a name:
class_found, certainty_percentage = find_category( data[0] )
class_found, certainty_percentage = find_category( data[1] )
class_found, certainty_percentage = find_category( data[2] )
This case is called Clustering. An example would be to provide pictures of drawn letters; the result should be to group them into separate letters.
42. Now you know the basics
With what has been learned, it is enough to pose a problem and solve it using different types of learning. Now we will see what techniques exist and when each one is used.
44. Machine Learning requires proper data
Machine Learning is the general concept of how a computer can learn.
Learning comes from receiving new information and adapting its parameters
to correctly identify new cases. This requires a lot of information, but you
also need to know what parameters ("features") to analyze.
45. Features?
Yes, we need to find which parameters differentiate one case from another. For example, there are very similar species of fish: which characteristics would you use to identify them? Color, size, number of fins?
46. The next step is to “annotate” (labeling)
Since we have a record of each specimen, we must now manually specify its "class", for example:

Color   Size  Fins  CLASS
Golden  20    5     Tilapia
Silver  10    6     Tuna
Silver  30    6     Salmon

And then, when unseen data are presented, we can predict:

Color   Size  Fins  CLASS
Silver  20    6     ???
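As a minimal sketch of this step (assuming scikit-learn; the numeric encoding of the colors and the choice of a decision tree are assumptions), the annotated table can be fed to a classifier directly:
from sklearn.tree import DecisionTreeClassifier
# Features: [color (0 = Golden, 1 = Silver), size, fins]
X = [[0, 20, 5],  # Tilapia
     [1, 10, 6],  # Tuna
     [1, 30, 6]]  # Salmon
y = ['Tilapia', 'Tuna', 'Salmon']
classifier = DecisionTreeClassifier().fit(X, y)
print(classifier.predict([[1, 20, 6]]))  # the unseen Silver specimen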
47. With the Classes and Features we can start
Now we can feed any algorithm to get a result.
The most used are:
● kNN
● Bayes
● Support Vector Machines
● Neural Networks
● Decision Trees
● Random Forest
● ...
Among many others
48. Now we have to train the model
Once we have the Classes and the Features and we choose an algorithm, we proceed to train it. The outcome is a file called a Model, with the rules to classify data not seen before.
49. But without good features, it won't be accurate
Example: to predict the unseen specimen below, the features are not enough. Tuna and Salmon share the same color and number of fins, and the unseen size (20) falls between them, so we need to gather more data or features that are more discriminative.

Color   Size  Fins  CLASS
Golden  20    5     Tilapia
Silver  10    6     Tuna
Silver  30    6     Salmon

Color   Size  Fins  CLASS
Silver  20    6     ???
62. When to use it?
● Simple regression problems
○ How much the rent should cost in a certain area
○ How much I should charge for a specific amount of work
● Problems where we want to define a rule that separates two categories that are similar, e.g. a Premium or Basic price for customers under certain parameters (number of rooms vs. number of cars)
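The matching code slide is not included in this text; the rent and pricing examples above suggest linear regression, so here is a minimal sketch in the same style as the other code slides (assuming scikit-learn, with made-up data):
from sklearn.linear_model import LinearRegression
# Made-up data: [distance to the city center (km)] -> monthly rent
X = [[1], [2], [3], [5], [8]]
y = [1200, 1000, 900, 700, 500]
regressor = LinearRegression()
regressor.fit(X, y)
print(regressor.predict([[4]]))  # estimated rent at 4 km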
68. When to use it?
● When we need to know what decisions the machine is taking
● When we need to explain to others how the features are evaluated
● When there are not many features
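These bullets (knowing the machine's decisions, explaining how features are evaluated) describe decision trees; since the matching code slide is not in this text, here is a minimal sketch following the same pattern as the other code slides (assuming scikit-learn):
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2)
classifier = DecisionTreeClassifier()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print('accuracy is', accuracy_score(y_pred, y_test))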
70. Code
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
iris = load_iris()
# Hold out 20% of the data for testing
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2)
# Train a forest of 100 decision trees
classifier = RandomForestClassifier(n_estimators=100)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print('accuracy is', accuracy_score(y_pred, y_test))
71. When to use it?
● When we want to know alternative ways of evaluating a problem.
● When we want to manually discard flows that are biased.
● When we want to manage an ensemble from one single method.
75. Code
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2)
# Gaussian Naive Bayes models each feature as normally distributed per class
classifier = GaussianNB()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print('accuracy is', accuracy_score(y_pred, y_test))
76. When to use it?
● When we want to know the probabilities of the different cases.
● When we need a probabilistic model.
● When we need a model that is easy to work through on paper.
82. Code
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2)
# Classify each sample by a majority vote of its 5 nearest neighbors
classifier = KNeighborsClassifier(n_neighbors=5)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print('accuracy is', accuracy_score(y_pred, y_test))
83. When to use it?
● When intuition says that the problem can be solved by finding the most similar option.
● When the information is not exhaustive.
● When we want to justify the decision of the algorithm with common human reasoning.
86. Code
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.preprocessing import scale
digits = load_digits()
# Standardize the features, then project them down to 2 dimensions
data = scale(digits.data)
reduced_data = PCA(n_components=2).fit_transform(data)
# Group the digits into 10 clusters (ideally one per digit)
kmeans = KMeans(init='k-means++', n_clusters=10, n_init=10)
kmeans.fit(reduced_data)
print(kmeans.predict(reduced_data[:10]))  # cluster ids of the first 10 digits
87. When to use it?
● When we don’t know how to interpret the data.
● When we want to optimize resources by grouping related elements.
● When we want the computer to create the labels for us.
90. Code
from sklearn.datasets import load_iris
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2)
# Support Vector Classifier with the default RBF kernel
classifier = SVC()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print('accuracy is',accuracy_score(y_pred, y_test))
91. When to use it?
● It was the most effective technique before Neural Networks; it can achieve excellent results with less processing.
● Mathematically speaking, it is based on very strong principles: it creates complex multidimensional hyperplanes that separate the classes precisely.
● It is not a white-box technique, but it may be the best option for problems where we want to get the best of the Machine Learning approach without dealing with Neural Networks.
93. Code
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2)
# Multinomial logistic regression handles the 3 iris classes directly
classifier = LogisticRegression(multi_class='multinomial')
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print('accuracy is', accuracy_score(y_pred, y_test))
94. When to use it?
● When we want to optimize a regression
● When we want to binarize the output
● As a preliminary analysis before implementing neural networks
96. Code
from sklearn.linear_model import Perceptron
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2)
# A single-layer perceptron: the simplest neural network
classifier = Perceptron()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print('accuracy is', accuracy_score(y_pred, y_test))
97. When to use it?
● When we have very few features and there are no extra details that can be extracted from hidden layers.
● Perceptrons are in fact neural networks, and we do not always need to use them for deep learning; they can be used for machine learning when we benchmark against other machine learning techniques.
● When we want the power of neural networks and we don’t have much computational power.
132. When to use it?
● When sequences are provided
○ Text sequences
○ Image sequences (videos)
○ Time series
● When we need to provide an ordered output
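These bullets describe recurrent/sequence models; the surrounding code slides are not in this text, so this is only a minimal sketch (assuming TensorFlow/Keras, which the deck does not prescribe, and toy data):
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense
# Toy data: 100 sequences of 10 time steps, 1 feature each
X = np.random.rand(100, 10, 1)
y = (X.sum(axis=1) > 5).astype(int)  # the label depends on the whole sequence
model = Sequential([
    SimpleRNN(16, input_shape=(10, 1)),  # the hidden state carries order information
    Dense(1, activation='sigmoid'),      # binary output
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=5, verbose=0)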
137. When to use it?
● When we want to benchmark models
● When different models are stronger when these are evaluated together
● When the individual processing is not exhaustive
146. When to use it?
● When a robot explores a place and needs to learn from the environment.
● When we can try as much as we can in a simulator.
● When we want to find the optimal path
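As a minimal sketch of the try-and-error idea (not from the deck; the one-dimensional corridor environment and all parameters are made up), tabular Q-learning looks like this:
import random
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]  # move left / move right
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.5, 0.9, 0.1
for episode in range(200):
    state = 0
    while state != GOAL:
        # Explore on epsilon or on ties, otherwise pick the best-known action
        qs = Q[state]
        if random.random() < epsilon or qs[0] == qs[1]:
            a = random.randrange(2)
        else:
            a = qs.index(max(qs))
        next_state = min(max(state + ACTIONS[a], 0), N_STATES - 1)
        reward = 1.0 if next_state == GOAL else 0.0
        # Move Q toward the reward plus the discounted best future value
        Q[state][a] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][a])
        state = next_state
print(Q)  # 'move right' should end up with the higher value in every state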
149. When to use it?
● When we have too many features and we do not know which of them are useful.
● When we want to reduce the dimensionality of our model.
● When we want to plot our decision boundaries.
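These bullets describe dimensionality reduction; a minimal sketch (assuming scikit-learn and PCA, one common choice among several):
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
iris = load_iris()
# Project the 4 iris features down to 2 components for plotting
reduced = PCA(n_components=2).fit_transform(iris.data)
print(reduced[:5])  # each flower is now a 2-D point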
151. When to use it?
● When we have limited data
● When we want to help our model to generalize more
● When our unseen data comes in very different formats.
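These bullets describe data augmentation; a minimal sketch (assuming TensorFlow/Keras and random stand-in images, since the deck shows no code here):
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator
images = np.random.rand(10, 28, 28, 1)  # stand-in for a small image dataset
# Each pass yields randomly rotated, shifted and flipped variants
datagen = ImageDataGenerator(rotation_range=15, width_shift_range=0.1, horizontal_flip=True)
augmented_batch = next(datagen.flow(images, batch_size=10))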
163. When to use it?
● When we want to compress data.
● When we need to change one type of input to other type of output.
● When we don’t need much variability in the generated data.
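These bullets describe autoencoders; a minimal sketch (assuming TensorFlow/Keras and random stand-in data):
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
X = np.random.rand(200, 64)  # stand-in for 64-dimensional inputs
autoencoder = Sequential([
    Dense(16, activation='relu', input_shape=(64,)),
    Dense(2, activation='relu'),      # the compressed representation
    Dense(16, activation='relu'),
    Dense(64, activation='sigmoid'),  # the reconstruction
])
autoencoder.compile(optimizer='adam', loss='mse')
autoencoder.fit(X, X, epochs=5, verbose=0)  # the target is the input itself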
165. When to use it?
● When we need to transfer a style
● When we need more variability in the generated output
● When we need to keep context in the generation.
167. When to use it?
● When we generate text
● When we generate the next sequence from a series
● When the order in the generated output matters.
169. When to use it?
● When context is an essential part of the generated output
● When we need to keep consistency in the frequency space.
● When we have enough computational resources.
170. Summary
● Predict values from numeric features (Regression)
○ Linear regression
○ Polynomial regressions
○ Any machine learning technique without an activation function
● Predict categories from labeled numeric data (Classification)
○ Logistic Regression
○ k-Nearest Neighbors
○ Decision Tree
○ Random Forest
○ Support Vector Machine
○ Perceptron
○ Multi Layer Perceptron
○ Naive Bayes
171. Summary
● Find labels from unlabeled data (Clustering)
○ K-means
○ LDA
○ NMF
● Detect objects in an image (Classification)
○ Convolutional Neural Networks
● Detect the category of a text or video (Classification)
○ Recurrent Neural Networks
● Learn to navigate and find optimal paths (Try and error)
○ Reinforcement Learning
● Generate images
○ Autoencoders
○ Generative Adversarial Networks
172. Summary
● Generate audio and text
○ Sequence models
○ Transformers
● Explainable approaches
○ k-Nearest Neighbors
○ Naive Bayes
○ Decision Trees
○ Linear regression
○ K-means
173. Resources
● Working examples ready to run online (Jupyter)
○ http://bit.ly/awesome-machine-learning
● Communities around the world
○ http://theschool.ai
174. Conclusions
● Depending on the application that we want to develop we need different
techniques and frameworks
● Model training is 20% of the work; processing the data into something useful is the remaining 80%.
● We need a lot of data to train a model. If you don’t have enough data, there are always augmentation, transformation and generation techniques to create more synthetic data from the existing data.
● We don’t need to be data scientists to implement Artificial Intelligence solutions; we only need to know how the techniques work and how to tune them.
● Now that you know everything you need, it's time to put it into practice.