B2B Travel technology
Machine Learning Introduction
October 20th 2016
Introduction
Johnny RAHAJARISON
@brainstorm_me
johnny.rahajarison@mylittleadventure.com
MyLittleAdventure
@mylitadventure
Agenda
What is machine learning?
Usage examples
Complexity
Algorithm families
Let’s go!
Troubleshoot
Tech insights
Next steps
Conclusion
Machine learning
Introduction
What is machine learning?
Software that does something without being explicitly programmed to, just by learning from examples
The same software can be used for various tasks
It learns from experience with respect to some task and performance measure, and improves with that experience
Usage examples (1/2)
Some typical usage examples
Use cases: MyLittleAdventure (2/2)
MyLittleAdventure usage
Language detection
Clustering
Anomaly detection
Recommendation
Choice of parameters
Complexity
"""Tests for convolution related functionality in tensorflow.ops.nn."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
from six.moves import xrange  # pylint: disable=redefined-builtin
import tensorflow as tf


class Conv2DTransposeTest(tf.test.TestCase):

    def testConv2DTransposeSingleStride(self):
        with self.test_session():
            strides = [1, 1, 1, 1]

            # Input, output: [batch, height, width, depth]
            x_shape = [2, 6, 4, 3]
            y_shape = [2, 6, 4, 2]

            # Filter: [kernel_height, kernel_width, output_depth, input_depth]
            f_shape = [3, 3, 2, 3]

            x = tf.constant(1.0, shape=x_shape, name="x", dtype=tf.float32)
            f = tf.constant(1.0, shape=f_shape, name="filter", dtype=tf.float32)
            output = tf.nn.conv2d_transpose(x, f, y_shape, strides=strides,
                                            padding="SAME")
            value = output.eval()

            # We count the number of cells being added at the locations in the output.
            # At the center, #cells=kernel_height * kernel_width
            # At the corners, #cells=ceil(kernel_height/2) * ceil(kernel_width/2)
            # At the borders, #cells=ceil(kernel_height/2)*kernel_width or
            #                        kernel_height * ceil(kernel_width/2)
            for n in xrange(x_shape[0]):
                for k in xrange(f_shape[2]):
                    for w in xrange(y_shape[2]):
                        for h in xrange(y_shape[1]):
                            target = 4 * 3.0
                            h_in = h > 0 and h < y_shape[1] - 1
                            w_in = w > 0 and w < y_shape[2] - 1
                            if h_in and w_in:
                                target += 5 * 3.0
"""GradientDescent for TensorFlow."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from tensorflow.python.framework import ops
from tensorflow.python.ops import math_ops
from tensorflow.python.training import optimizer
from tensorflow.python.training import training_ops


class GradientDescentOptimizer(optimizer.Optimizer):
    """Optimizer that implements the gradient descent algorithm.

    @@__init__
    """

    def __init__(self, learning_rate, use_locking=False, name="GradientDescent"):
        """Construct a new gradient descent optimizer.

        Args:
          learning_rate: A Tensor or a floating point value. The learning
            rate to use.
          use_locking: If True use locks for update operations.
          name: Optional name prefix for the operations created when applying
            gradients. Defaults to "GradientDescent".
        """
        super(GradientDescentOptimizer, self).__init__(use_locking, name)
        self._learning_rate = learning_rate

    def _apply_dense(self, grad, var):
        return training_ops.apply_gradient_descent(
            var,
            math_ops.cast(self._learning_rate_tensor, var.dtype.base_dtype),
            grad,
            use_locking=self._use_locking).op

    def _apply_sparse(self, grad, var):
        delta = ops.IndexedSlices(
            grad.values *
"""Tests for tensorflow.ops.linalg_grad."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf


class ShapeTest(tf.test.TestCase):

    def testBatchGradientUnknownSize(self):
        with self.test_session():
            batch_size = tf.constant(3)
            matrix_size = tf.constant(4)
            batch_identity = tf.tile(
                tf.expand_dims(
                    tf.diag(tf.ones([matrix_size])), 0), [batch_size, 1, 1])
            determinants = tf.matrix_determinant(batch_identity)
            reduced = tf.reduce_sum(determinants)
            sum_grad = tf.gradients(reduced, batch_identity)[0]
            self.assertAllClose(batch_identity.eval(), sum_grad.eval())


class MatrixUnaryFunctorGradientTest(tf.test.TestCase):
    pass  # Filled in below


def _GetMatrixUnaryFunctorGradientTest(functor_, dtype_, shape_, **kwargs_):

    def Test(self):
        with self.test_session():
            np.random.seed(1)
            m = np.random.uniform(low=-1.0,
                                  high=1.0,
                                  size=np.prod(shape_)).reshape(shape_).astype(dtype_)
            a = tf.constant(m)
            b = functor_(a, **kwargs_)

            # Optimal stepsize for central difference is O(epsilon^{1/3}).
            epsilon = np.finfo(dtype_).eps
            delta = 0.1 * epsilon**(1.0 / 3.0)
            # tolerance obtained by looking at actual differences using
            # np.linalg.norm(theoretical-numerical, np.inf) on -mavx build
Complex algorithms before (the TensorFlow internals above)…
…and machine learning now:
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
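To make concrete what such a one-line optimizer call hides, here is a minimal plain-Python sketch of gradient descent. The toy loss, its gradient, the starting point and the step count are assumptions chosen for illustration; this is not TensorFlow's implementation. Only `learning_rate` plays the same role as the value passed to `GradientDescentOptimizer`.

```python
def gradient_descent(grad, start, learning_rate, steps):
    """Repeatedly step against the gradient, starting from `start`."""
    w = start
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w


# Toy loss (an assumption for illustration): loss(w) = (w - 3) ** 2.
# Its gradient is 2 * (w - 3) and its minimum sits at w = 3.
def toy_gradient(w):
    return 2.0 * (w - 3.0)


w_min = gradient_descent(toy_gradient, start=0.0, learning_rate=0.1, steps=200)
print(round(w_min, 6))
```

A library optimizer does the same kind of update, but on every model parameter at once and with gradients computed automatically.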
Machine learning
Algorithm families
Supervised algorithms
Classification
Regression
Unsupervised algorithms
Clustering
Anomaly detection
Machine learning
Let’s go!
Recipe
Collect training data: files, database, cache, data flow
Train algorithm: selection of model and (hyper)parameters
Make predictions: use or store your trained estimator
Measure: accuracy, precision
Collect training data
What about the data?
Get good-quality data
Get some samples
Don’t collect data for months before trying anything
Go fast and try things.
Fruit identification example
Weight (g)  Width (cm)  Height (cm)  Label
192         8.4         7.3          Granny smith apple
86          6.2         4.7          Mandarin
178         7.1         7.8          Braeburn apple
162         7.4         7.2          Cripps pink apple
118         6.1         8.1          Unidentified lemons
144         6.8         7.4          Turkey orange
362         9.6         9.2          Spanish jumbo orange
…           …           …            …
Prepare your data
We need to have some tests
Convert your features and labels to numbers
Put them on the same scale (normalization)?
Weight (g)  Width (cm)  Height (cm)  Label
192         8.4         7.3          1
86          6.2         4.7          2
178         7.1         7.8          3
162         7.4         7.2          5
118         6.1         8.1          10
144         6.8         7.4          8
362         9.6         9.2          9
…           …           …            …
Training set: learning phase (60% - 80%)
Test set: analytics phase (20% - 40%)
Prepare your data (code)
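A minimal sketch of this preparation step in plain Python, with no external library. The rows are the fruit measurements and numeric labels from the table above; the helper names `min_max_scale` and `train_test_split`, the 30% test ratio and the fixed seed are assumptions made for illustration.

```python
import random

# Rows copied from the fruit table: (weight g, width cm, height cm) and label.
features = [
    (192, 8.4, 7.3), (86, 6.2, 4.7), (178, 7.1, 7.8), (162, 7.4, 7.2),
    (118, 6.1, 8.1), (144, 6.8, 7.4), (362, 9.6, 9.2),
]
labels = [1, 2, 3, 5, 10, 8, 9]


def min_max_scale(rows):
    """Rescale every column to the [0, 1] range (one common normalization)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple((value - lo) / (hi - lo) for value, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


def train_test_split(rows, targets, test_ratio=0.3, seed=0):
    """Shuffle the samples, then hold out `test_ratio` of them for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(rows) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([rows[i] for i in train_idx], [rows[i] for i in test_idx],
            [targets[i] for i in train_idx], [targets[i] for i in test_idx])


scaled = min_max_scale(features)
X_train, X_test, y_train, y_test = train_test_split(scaled, labels)
print(len(X_train), "training samples,", len(X_test), "test samples")
```

For brevity the scaling here uses all rows; on a real project the minimum and maximum would usually be computed on the training set only and then reused on the test set.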
Train algorithm
We need to choose an estimator
Choose a classifier
Fit the decision tree
Weight (g)  Width (cm)  Height (cm)  Label
192         8.4         7.3          1
86          6.2         4.7          2
178         7.1         7.8          3
162         7.4         7.2          5
118         6.1         8.1          10
144         6.8         7.4          8
362         9.6         9.2          9
…           …           …            …
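To show what "fit the decision tree" means, here is a small from-scratch sketch in plain Python. It is an illustration under stated assumptions (Gini impurity as the split criterion, no depth limit, hypothetical helper names `fit_tree` and `predict`), not the estimator used in the talk; in practice a library classifier would replace all of it.

```python
from collections import Counter

# Rows and numeric labels copied from the fruit table above.
features = [
    (192, 8.4, 7.3), (86, 6.2, 4.7), (178, 7.1, 7.8), (162, 7.4, 7.2),
    (118, 6.1, 8.1), (144, 6.8, 7.4), (362, 9.6, 9.2),
]
labels = [1, 2, 3, 5, 10, 8, 9]


def gini(group):
    """Gini impurity: 0.0 when every label in the group is identical."""
    total = len(group)
    return 1.0 - sum((n / total) ** 2 for n in Counter(group).values())


def best_split(rows, targets):
    """Return the (feature index, threshold) with the lowest weighted impurity,
    or None when no split improves on the unsplit group."""
    best, best_score = None, gini(targets)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(targets)
            if score < best_score:
                best, best_score = (feature, threshold), score
    return best


def fit_tree(rows, targets):
    """Grow the tree recursively; a leaf is simply the majority label."""
    split = best_split(rows, targets)
    if split is None:
        return Counter(targets).most_common(1)[0][0]
    feature, threshold = split
    left = [(r, t) for r, t in zip(rows, targets) if r[feature] <= threshold]
    right = [(r, t) for r, t in zip(rows, targets) if r[feature] > threshold]
    return (feature, threshold, fit_tree(*zip(*left)), fit_tree(*zip(*right)))


def predict(tree, row):
    """Walk from the root to a leaf (internal nodes are tuples, leaves are labels)."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if row[feature] <= threshold else right
    return tree


tree = fit_tree(features, labels)
print([predict(tree, row) for row in features])
```

With one sample per label, the tree simply memorizes the training rows, which is why predictions must be judged on data held out from training.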
Make predictions
What do our predictions look like?
Test set
Weight (g)  Width (cm)  Height (cm)  Label
192         8.4         7.3          1
86          6.2         4.7          2
178         7.1         7.8          3
Predictions
Weight (g)  Width (cm)  Height (cm)  Label
192         8.4         7.3          1
86          6.2         4.7          2
178         7.1         7.8          1
Measure (1/2)
Evaluate on data that your model has never seen during training
Accuracy
Correct predictions / total predictions
Gives a simple confidence score of our performance level
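As a small worked example of the accuracy formula, the sketch below compares the three test-set labels with the three predictions from the "Make predictions" slide. The function name `accuracy` is an assumption for illustration.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for truth, guess in zip(actual, predicted) if truth == guess)
    return correct / len(actual)


# Labels of the three test rows and the predictions made for them.
test_labels = [1, 2, 3]
predictions = [1, 2, 1]

print(accuracy(test_labels, predictions))
```

Two of the three predictions are correct, so the accuracy is 2/3.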
Measure (2/2)
Try to visualize and analyze your data, and know what you want
Confusion matrix
                 Actual true     Actual false
Predicted true   True positive   False positive
Predicted false  False negative  True negative
Skewed classes
Precision = true positives / number of predicted positives
Recall = true positives / number of actual positives
F1 score (trade-off) = 2 * (precision * recall) / (precision + recall)
Measure and prediction (code)
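A minimal sketch of these measures in plain Python. The boolean labels below are hypothetical data invented to mimic skewed classes (3 positives out of 10), and the helper names are assumptions; F1 is computed as the harmonic mean of precision and recall, that is 2 * P * R / (P + R).

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for boolean labels."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(actual, predicted):
        if guess and truth:
            tp += 1
        elif guess and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(actual, predicted):
    """Precision, recall and F1, with 0.0 returned when a ratio is undefined."""
    tp, fp, fn, _ = confusion_counts(actual, predicted)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical skewed data: only 3 positives among 10 samples.
actual = [True, False, False, False, True, False, False, False, False, True]
predicted = [True, False, False, True, False, False, False, False, False, True]

print(confusion_counts(actual, predicted))
print(precision_recall_f1(actual, predicted))
```

On this data the accuracy is 8/10, while precision and recall are both 2/3, which is why accuracy alone can look flattering on skewed classes.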
Machine learning
Troubleshoot
Troubleshoot (1/4)
Under/Overfitting situation
Troubleshoot (2/4)
What are the different options?
Underfitting
Add / create more features
Use a more sophisticated model
Adding more samples alone will not help
Decrease regularization
Overfitting
Use fewer features
Use a simpler model
Use more samples
Increase regularization
Troubleshoot (3/4)
Using the learning curves…
[Figure: learning curves for the underfitting and the overfitting cases]
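A learning curve can be sketched in a few lines: train on growing subsets and record the error on the training subset and on held-out validation data. Everything below is an assumption made for illustration: the synthetic data (a noisy line), the one-dimensional least-squares model and the helper names.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return slope, mean_y - slope * mean_x


def mse(model, xs, ys):
    """Mean squared error of the fitted line on the given points."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def learning_curve(train, validation, sizes):
    """For each training size, return (size, training error, validation error)."""
    val_xs, val_ys = zip(*validation)
    rows = []
    for size in sizes:
        xs, ys = zip(*train[:size])
        model = fit_line(xs, ys)
        rows.append((size, mse(model, xs, ys), mse(model, val_xs, val_ys)))
    return rows


rng = random.Random(42)


def sample(n):
    """Synthetic points around y = 2x + 1 with Gaussian noise (hypothetical data)."""
    points = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        points.append((x, 2 * x + 1 + rng.gauss(0, 1)))
    return points


train, validation = sample(80), sample(40)
curve = learning_curve(train, validation, sizes=[5, 10, 20, 40, 80])
for size, train_err, val_err in curve:
    print(f"{size:3d} samples  train MSE={train_err:.2f}  validation MSE={val_err:.2f}")
```

As a reading guide: two errors that are both high and close together suggest underfitting, while a low training error with a much higher validation error suggests overfitting.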
Troubleshoot: model choice (4/4)
Machine learning
Tech insights
Platforms: easy-peasy
Very high-level solutions
You don’t even have to code to build something (*wink wink* business developers)
Built-in models
Data munging
Model management by UI
PaaS
Languages
What language for what purpose?
Matlab, Octave: for understanding and prototyping an implementation
Go, Python: the most valuable languages, comfortable for prototyping yet powerful for industrialisation
Java, C++: for bigger companies and projects, and fine-tuned software
Libraries
You will have great power using a library
Built-in models
Data munging
Fine-tuning
Full integration into your product
Golearn
Machine learning
Next steps…
Next steps
Split your data in three: training / cross-validation / test set
Know the top algorithms
Explore advanced techniques and optimizers (online learning, stacking)
Deep and reinforcement learning
Partial and semi-supervised learning
Transfer learning
How do we store and analyse big data? How do we scale?
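The three-way split from the first item of the list above can be sketched as follows. The 60/20/20 ratios, the fixed seed and the function name `three_way_split` are assumptions for illustration.

```python
import random


def three_way_split(samples, train=0.6, validation=0.2, seed=0):
    """Shuffle, then cut into training, cross-validation and test sets.
    Whatever is left after the first two cuts becomes the test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


train_set, validation_set, test_set = three_way_split(range(100))
print(len(train_set), len(validation_set), len(test_set))
```

The cross-validation set is used to compare models and tune hyperparameters, so that the test set stays untouched until the final measurement.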
Try it! Find your best tools and have some fun
Conclusion
Try it and let’s get in touch!
Machine learning is not just a buzzword
The difficulties are not always where we expect them!
Machine learning is more about experiments and tests than just algorithms
There is no single perfect solution
There are plenty of easy-to-use solutions for beginners
Machine learning
One more thing!
TensorFlow
TensorFlow Learn
Thank you
Machine Learning Introduction
October 20th 2016
Questions?
