B2B Travel technology
Machine Learning Introduction
October 20th 2016
Introduction
Johnny RAHAJARISON
@brainstorm_me
johnny.rahajarison@mylittleadventure.com
MyLittleAdventure
@mylitadventure
Agenda
What’s Machine Learning ?
Usage examples
Complexity
Algorithm families
Let’s go!
Troubleshoot
Tech insights
Next steps
Conclusion
Machine learning
Introduction
What’s Machine Learning ?
Software that does something without being explicitly programmed to, just by learning from examples
The same software can be used for various tasks
It learns from experience with respect to some task and performance measure, and improves with that experience
Usage examples (1/2)
Some typical usage examples
Use cases : MyLittleAdventure (2/2)
Language detection
Clustering
Anomaly detection
Recommendation
Choice of parameters
MyLittleAdventure usage
Complexity
"""Tests for convolution related functionality in tensorflow.ops.nn."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
from six.moves import xrange  # pylint: disable=redefined-builtin
import tensorflow as tf


class Conv2DTransposeTest(tf.test.TestCase):

    def testConv2DTransposeSingleStride(self):
        with self.test_session():
            strides = [1, 1, 1, 1]
            # Input, output: [batch, height, width, depth]
            x_shape = [2, 6, 4, 3]
            y_shape = [2, 6, 4, 2]
            # Filter: [kernel_height, kernel_width, output_depth, input_depth]
            f_shape = [3, 3, 2, 3]
            x = tf.constant(1.0, shape=x_shape, name="x", dtype=tf.float32)
            f = tf.constant(1.0, shape=f_shape, name="filter", dtype=tf.float32)
            output = tf.nn.conv2d_transpose(x, f, y_shape, strides=strides,
                                            padding="SAME")
            value = output.eval()
            # We count the number of cells being added at the locations in the output.
            # At the center, #cells=kernel_height * kernel_width
            # At the corners, #cells=ceil(kernel_height/2) * ceil(kernel_width/2)
            # At the borders, #cells=ceil(kernel_height/2)*kernel_width or
            #                        kernel_height * ceil(kernel_width/2)
            for n in xrange(x_shape[0]):
                for k in xrange(f_shape[2]):
                    for w in xrange(y_shape[2]):
                        for h in xrange(y_shape[1]):
                            target = 4 * 3.0
                            h_in = h > 0 and h < y_shape[1] - 1
                            w_in = w > 0 and w < y_shape[2] - 1
                            if h_in and w_in:
                                target += 5 * 3.0

"""GradientDescent for TensorFlow."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from tensorflow.python.framework import ops
from tensorflow.python.ops import math_ops
from tensorflow.python.training import optimizer
from tensorflow.python.training import training_ops


class GradientDescentOptimizer(optimizer.Optimizer):
    """Optimizer that implements the gradient descent algorithm.

    @@__init__
    """

    def __init__(self, learning_rate, use_locking=False, name="GradientDescent"):
        """Construct a new gradient descent optimizer.

        Args:
          learning_rate: A Tensor or a floating point value. The learning
            rate to use.
          use_locking: If True use locks for update operations.
          name: Optional name prefix for the operations created when applying
            gradients. Defaults to "GradientDescent".
        """
        super(GradientDescentOptimizer, self).__init__(use_locking, name)
        self._learning_rate = learning_rate

    def _apply_dense(self, grad, var):
        return training_ops.apply_gradient_descent(
            var,
            math_ops.cast(self._learning_rate_tensor, var.dtype.base_dtype),
            grad,
            use_locking=self._use_locking).op

    def _apply_sparse(self, grad, var):
        delta = ops.IndexedSlices(
            grad.values *

"""Tests for tensorflow.ops.linalg_grad."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf


class ShapeTest(tf.test.TestCase):

    def testBatchGradientUnknownSize(self):
        with self.test_session():
            batch_size = tf.constant(3)
            matrix_size = tf.constant(4)
            batch_identity = tf.tile(
                tf.expand_dims(
                    tf.diag(tf.ones([matrix_size])), 0), [batch_size, 1, 1])
            determinants = tf.matrix_determinant(batch_identity)
            reduced = tf.reduce_sum(determinants)
            sum_grad = tf.gradients(reduced, batch_identity)[0]
            self.assertAllClose(batch_identity.eval(), sum_grad.eval())


class MatrixUnaryFunctorGradientTest(tf.test.TestCase):
    pass  # Filled in below


def _GetMatrixUnaryFunctorGradientTest(functor_, dtype_, shape_, **kwargs_):

    def Test(self):
        with self.test_session():
            np.random.seed(1)
            m = np.random.uniform(low=-1.0,
                                  high=1.0,
                                  size=np.prod(shape_)).reshape(shape_).astype(dtype_)
            a = tf.constant(m)
            b = functor_(a, **kwargs_)

            # Optimal stepsize for central difference is O(epsilon^{1/3}).
            epsilon = np.finfo(dtype_).eps
            delta = 0.1 * epsilon**(1.0 / 3.0)
            # tolerance obtained by looking at actual differences using
            # np.linalg.norm(theoretical-numerical, np.inf) on -mavx build
Complex algorithm before
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
And Machine learning now…
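For intuition about what a gradient descent optimizer does behind a one-line call like the one above, here is a minimal pure-Python sketch. The function name `gradient_descent` and the quadratic loss are illustrative assumptions; this is not TensorFlow's implementation.

```python
# Illustrative sketch only: plain gradient descent on a one-variable function.
# Names and the example loss are assumptions, not part of TensorFlow.

def gradient_descent(gradient, start, learning_rate=0.5, steps=50):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(steps):
        x = x - learning_rate * gradient(x)
    return x


# Hypothetical loss f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0, learning_rate=0.1)
print(round(minimum, 3))
```

The learning rate controls the step size: too small and convergence is slow, too large and the updates can overshoot the minimum.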
Machine learning
Algorithm families
Supervised algorithms
Classification
Regression
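The difference between the two supervised families can be sketched in a few lines of plain Python: regression predicts a number, classification predicts a category. The helper names `fit_line` and `nearest_label` and the toy data are assumptions made for illustration.

```python
# Illustrative sketch: regression vs. classification on toy, made-up data.

def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def nearest_label(x, examples):
    """Classification: return the label of the closest known example."""
    return min(examples, key=lambda example: abs(example[0] - x))[1]


# Regression: learn y = 2x from examples, then predict a number.
slope, intercept = fit_line([1, 2, 3, 4], [2, 4, 6, 8])
predicted_value = slope * 5 + intercept

# Classification: predict a category from a weight in grams.
predicted_label = nearest_label(90, [(86, "mandarin"), (192, "apple")])
print(predicted_value, predicted_label)
```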
Unsupervised algorithms
Clustering
Anomaly detection
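As a small example of an unsupervised task, anomaly detection can be sketched with a z-score rule: flag values that sit far from the mean, with no labels involved. The function name, the threshold of 2 standard deviations, and the sample weights are assumptions for illustration.

```python
# Illustrative sketch: z-score anomaly detection on made-up weights (grams).
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return the values further than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    return [v for v in values if abs(v - mu) > threshold * sigma]


weights = [150, 152, 148, 151, 149, 150, 400]
anomalies = find_anomalies(weights)
print(anomalies)
```

Note that a single extreme value also inflates the standard deviation, so on small samples the threshold needs to be chosen with care.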
Machine learning
Let’s go!
Recipe
Collect training data: files, database, cache, data flow
Select a model and its (hyper)parameters
Train the algorithm
Use or store your trained algorithm
Make predictions
Measure accuracy and precision
Collect training data
Get good-quality data
Get some samples
Don't collect data for months before trying anything
Go fast and try things
Weight (g)  Width (cm)  Height (cm)  Label
192         8.4         7.3          Granny smith apple
86          6.2         4.7          Mandarin
178         7.1         7.8          Braeburn apple
162         7.4         7.2          Cripps pink apple
118         6.1         8.1          Unidentified lemons
144         6.8         7.4          Turkey orange
362         9.6         9.2          Spanish jumbo orange
…           …           …            …
What about the data?
Fruit identification example
Prepare your data
Encode your features and labels as numbers
Put them on the same scale (normalization)?
Weight (g)  Width (cm)  Height (cm)  Label
192         8.4         7.3          1
86          6.2         4.7          2
178         7.1         7.8          3
162         7.4         7.2          5
118         6.1         8.1          10
144         6.8         7.4          8
362         9.6         9.2          9
…           …           …            …
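Turning labels into numbers and putting a feature on a common scale could look like the sketch below. The helper names `encode_labels` and `min_max_scale` are hypothetical, and the integer codes are assigned in order of first appearance, which will not match the codes in the slide's table.

```python
# Illustrative sketch: label encoding and min-max normalization in plain Python.

def encode_labels(labels):
    """Map each distinct label to an integer, in order of first appearance."""
    mapping = {}
    for label in labels:
        mapping.setdefault(label, len(mapping) + 1)
    return [mapping[label] for label in labels], mapping


def min_max_scale(column):
    """Rescale a numeric column to the range [0, 1]."""
    low, high = min(column), max(column)
    if high == low:
        return [0.0 for _ in column]
    return [(value - low) / (high - low) for value in column]


# Weights (g) taken from the fruit table; labels shortened for the example.
weights = [192, 86, 178, 162, 118, 144, 362]
scaled_weights = min_max_scale(weights)

encoded, mapping = encode_labels(["apple", "mandarin", "apple", "lemon"])
print(scaled_weights, encoded, mapping)
```

Scaling matters most for models that compare distances or sum weighted features; without it, weight in grams would dominate width and height in centimetres.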
We need to hold out some data for testing
Training set: learning phase (60% - 80%)
Test set: analytics phase (20% - 40%)
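A training/test split can be sketched as a shuffle followed by a cut. The function name `train_test_split`, the 30% test ratio, and the fixed seed are assumptions chosen for the example.

```python
# Illustrative sketch: shuffle the rows, then hold out a fraction for testing.
import random


def train_test_split(rows, test_ratio=0.3, seed=42):
    """Return (train_rows, test_rows) after a reproducible shuffle."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), len(test_rows))
```

Shuffling before the cut avoids accidentally putting, say, all the oranges in the test set when the source data is sorted by label.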
Train algorithm
Choose a classifier
Fit the decision tree
Weight (g)  Width (cm)  Height (cm)  Label
192         8.4         7.3          1
86          6.2         4.7          2
178         7.1         7.8          3
162         7.4         7.2          5
118         6.1         8.1          10
144         6.8         7.4          8
362         9.6         9.2          9
…           …           …            …
We need to choose an estimator
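To show what fitting a tree-based estimator involves, here is a sketch of a decision stump, which is a decision tree with a single split. This is a deliberate simplification, not a full decision tree and not any library's API. The names `fit_stump` and `predict` are hypothetical, and the fourth training row is invented for the example.

```python
# Illustrative sketch: a one-split decision tree (decision stump).
from collections import Counter


def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels):
    """Find the (feature, threshold) split with the fewest training errors."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must separate the data into two groups
            left_label, right_label = majority(left), majority(right)
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, left_label, right_label)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def predict(stump, row):
    """Apply the learned split to one row of features."""
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


# Features: weight (g), width (cm), height (cm). Last row is made up.
rows = [(192, 8.4, 7.3), (86, 6.2, 4.7), (178, 7.1, 7.8), (90, 6.0, 4.9)]
labels = ["apple", "mandarin", "apple", "mandarin"]
stump = fit_stump(rows, labels)
print(stump, predict(stump, (88, 6.1, 4.8)))
```

A real decision tree repeats this search recursively on each side of the split until a stopping rule is met.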
Make predictions
What do our predictions look like?
Test set
Weight (g)  Width (cm)  Height (cm)  Label
192         8.4         7.3          1
86          6.2         4.7          2
178         7.1         7.8          3
Predictions
Weight (g)  Width (cm)  Height (cm)  Label
192         8.4         7.3          1
86          6.2         4.7          2
178         7.1         7.8          1
Measure (1/2)
Evaluate on a dataset that your model has never seen during training
Accuracy
Correct predictions / total predictions
Gives a simple score of our performance level
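The ratio of correct predictions to total predictions (usually called accuracy) can be computed as in this sketch. The function name `accuracy` is an assumption; the labels are the three test-set rows and predictions shown in the deck.

```python
# Illustrative sketch: accuracy = correct predictions / total predictions.

def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot score an empty test set")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


actual = [1, 2, 3]      # labels of the test set
predicted = [1, 2, 1]   # labels produced by the model
score = accuracy(actual, predicted)
print(score)
```

Two of the three predictions match, so the score is two thirds; the error rate is one minus that value.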
Measure (2/2)
Try to visualize and analyze your data, and know what you want
Confusion Matrix
                 Actual true     Actual false
Predicted true   True positive   False positive
Predicted false  False negative  True negative
Skewed classes
Precision = True positives / #predicted positives
Recall = True positives / #actual positives
F1 score (trade-off) = 2 * (precision * recall) / (precision + recall)
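The confusion matrix counts and the metrics derived from them can be sketched as follows, with F1 taken as the harmonic mean of precision and recall, 2PR / (P + R). The function names and the sample predictions are assumptions made for the example.

```python
# Illustrative sketch: confusion counts, precision, recall and F1 for a binary task.

def confusion_counts(actual, predicted, positive=True):
    """Return (true_pos, false_pos, false_neg, true_neg)."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Compute the three metrics, returning 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, True, False, False, False]
tp, fp, fn, tn = confusion_counts(actual, predicted)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(tp, fp, fn, tn, precision, recall, f1)
```

With skewed classes, a model that always predicts the majority class can reach high accuracy while its recall on the rare class is zero, which is why these metrics are reported alongside accuracy.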
Machine learning
Troubleshoot
Troubleshoot (1/4)
Under/Overfitting situation
Troubleshoot (2/4)
Underfitting
Add / create more features
Use a more sophisticated model
Use fewer samples
Decrease regularization
Overfitting
Use fewer features
Use a simpler model
Use more samples
Increase regularization
What are the different options ?
Troubleshoot (3/4)
Underfitting Overfitting
Using the learning curves…
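Reading learning curves usually comes down to comparing training error with validation error. The rule of thumb below is a rough sketch; the function name `diagnose` and both thresholds are hypothetical and would need tuning for a real problem.

```python
# Illustrative sketch: a rule of thumb for reading the end of a learning curve.
# The thresholds are assumptions, not established constants.

def diagnose(train_error, validation_error, high_error=0.2, large_gap=0.1):
    """Classify a model as underfitting, overfitting, or acceptable."""
    if train_error >= high_error:
        # The model cannot even fit the data it was trained on.
        return "underfitting"
    if validation_error - train_error >= large_gap:
        # The model fits the training data but does not generalize.
        return "overfitting"
    return "ok"


print(diagnose(0.30, 0.32))  # high error on both sets
print(diagnose(0.02, 0.25))  # large gap between the two sets
print(diagnose(0.05, 0.08))  # low error, small gap
```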
Troubleshoot : Model choice (4/4)
Machine learning
Tech insights
Platforms : easy, peasy
You don’t even have to code to build something
(*wink wink* business developers)
Built-in models
Data munging
Model management by UI
PaaS
Very high-level solutions
Languages
For understanding & prototyping implementation
Most Valuable Languages
Comfortable for prototyping, yet powerful for industrialisation
For bigger companies & projects, and fine-tuned software
Matlab Octave Go Python Java C++
What language for what purpose ?
Libraries
Built-in models
Data munging
Fine-tuning
Full integration to your product
You will have great power using a library
Golearn
Machine learning
Next steps…
Next steps
Split your data in 3 : Training / Cross validation / Test set
Know the top algorithms
Search advanced techniques and optimizers (online learning, stacking)
Deep and reinforcement learning
Partial and semi-supervised learning
Transfer learning
How to store and analyse big data ? How do we scale ?
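The three-way split from the first item in this list can be sketched as below. The function name `three_way_split` and the 60/20/20 ratios are assumptions; the idea is that the cross-validation set is used to choose models and hyperparameters, so that the test set stays untouched until the final measurement.

```python
# Illustrative sketch: training / cross-validation / test split.
import random


def three_way_split(rows, train=0.6, validation=0.2, seed=0):
    """Return (train_rows, validation_rows, test_rows) after a reproducible shuffle."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_validation = round(len(shuffled) * validation)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_validation],
            shuffled[n_train + n_validation:])


rows = list(range(10))
train_rows, validation_rows, test_rows = three_way_split(rows)
print(len(train_rows), len(validation_rows), len(test_rows))
```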
Try it ! Find your best tools and have some fun
Conclusion
Try it and let’s get in touch!
Machine learning is not just a buzz word
Difficulties are not always what we think!
Machine learning is more about experiments and testing than just algorithms
There is no single perfect solution
There are plenty of easy-to-use solutions for beginners
Machine learning
One more thing!
TensorFlow & MNIST
TensorFlow Learn
Thank you
Questions ?

An introduction to Machine Learning

  • 1. B2B Travel technology Machine Learning Introduction October 20th 2016
  • 3. Agenda: What’s Machine Learning? Usage examples. Complexity. Algorithm families. Let’s go! Troubleshoot. Tech insights. Next steps. Conclusion.
  • 5. What’s Machine Learning? Software that does something without being explicitly programmed to, just by learning from examples. The same software can be used for various tasks. It learns from experience with respect to some task and performance measure, and improves with that experience.
  • 6. Usage examples (1/2): some typical usage examples
  • 7. Use cases: MyLittleAdventure (2/2). MyLittleAdventure usage: language detection, clustering, anomaly detection, recommendation, choice of parameters.
  • 8. Complexity. Complex algorithm before: [slide background: truncated excerpts of TensorFlow source code, namely the Conv2DTransposeTest convolution test, the GradientDescentOptimizer class, and the linalg_grad gradient tests]. And machine learning now: train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
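As a rough illustration of what the one-line optimizer call on this slide hides, here is a minimal sketch of plain gradient descent in pure Python. The function names and the quadratic loss are illustrative assumptions, not part of the slides or of TensorFlow's implementation; only the learning rate of 0.5 is taken from the slide.

```python
def gradient_descent(grad, start, learning_rate=0.5, steps=50):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


def loss_gradient(x):
    """Gradient of the hypothetical loss (x - 3)^2 / 2, which is x - 3."""
    return x - 3.0


# The loss is smallest at x = 3, so the iterates should approach 3.
minimum = gradient_descent(loss_gradient, start=0.0)
print(minimum)
```

A library optimizer does the same kind of update for every model parameter at once, with the gradient computed automatically.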
  • 13. Recipe. Collect training data (files, database, cache, data flow). Select a model and its (hyper)parameters. Train the algorithm. Use or store your trained algorithm. Make predictions. Measure (accuracy, precision).
  • 14. Collect training data. What about the data? Get good-quality data. Get some samples. Don’t collect data for months before trying anything: go fast and try things. Fruit identification example:
      Weight (g)  Width (cm)  Height (cm)  Label
      192         8.4         7.3          Granny smith apple
      86          6.2         4.7          Mandarin
      178         7.1         7.8          Braeburn apple
      162         7.4         7.2          Cripps pink apple
      118         6.1         8.1          Unidentified lemons
      144         6.8         7.4          Turkey orange
      362         9.6         9.2          Spanish jumbo orange
      …           …           …            …
  • 15. Prepare your data. Convert your features and labels to numbers. Put them on the same scale (normalization). We need to have some tests: training set for the learning phase (60% - 80%), test set for the analytics phase (20% - 40%).
      Weight (g)  Width (cm)  Height (cm)  Label
      192         8.4         7.3          1
      86          6.2         4.7          2
      178         7.1         7.8          3
      162         7.4         7.2          5
      118         6.1         8.1          10
      144         6.8         7.4          8
      362         9.6         9.2          9
      …           …           …            …
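The two preparation steps on this slide can be sketched in a few lines of standard-library Python. This is only a sketch under assumptions: min-max scaling is one of several possible normalizations, and the helper names, the 70% training ratio (picked from the 60% - 80% range on the slide) and the fixed seed are illustrative choices.

```python
import random

# Fruit samples from the slide: (weight g, width cm, height cm), label id.
samples = [
    ((192, 8.4, 7.3), 1),
    ((86, 6.2, 4.7), 2),
    ((178, 7.1, 7.8), 3),
    ((162, 7.4, 7.2), 5),
    ((118, 6.1, 8.1), 10),
    ((144, 6.8, 7.4), 8),
    ((362, 9.6, 9.2), 9),
]


def min_max_scale(rows):
    """Rescale every feature column to the [0, 1] range."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


def train_test_split(data, train_ratio=0.7, seed=0):
    """Shuffle a copy of the data and cut it into training and test parts."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


features = min_max_scale([f for f, _ in samples])
labels = [label for _, label in samples]
train_set, test_set = train_test_split(list(zip(features, labels)))
print(len(train_set), len(test_set))
```

For brevity the scaling bounds are computed on all rows here; in a real project they would usually be computed on the training set only and then reused on the test set.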
  • 16. Train algorithm. We need to choose an estimator: choose a classifier, then fit the decision tree.
      Weight (g)  Width (cm)  Height (cm)  Label
      192         8.4         7.3          1
      86          6.2         4.7          2
      178         7.1         7.8          3
      162         7.4         7.2          5
      118         6.1         8.1          10
      144         6.8         7.4          8
      362         9.6         9.2          9
      …           …           …            …
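To make "fit the decision tree" concrete, here is a small pure-Python sketch of a classification tree trained on the slide's table. It is an assumption-laden toy (Gini impurity, exhaustive threshold search, no pruning, hypothetical function names), not the implementation used in the talk; a library classifier would normally be used instead.

```python
from collections import Counter

# Training rows from the slide: (weight g, width cm, height cm) -> label id.
X = [(192, 8.4, 7.3), (86, 6.2, 4.7), (178, 7.1, 7.8), (162, 7.4, 7.2),
     (118, 6.1, 8.1), (144, 6.8, 7.4), (362, 9.6, 9.2)]
y = [1, 2, 3, 5, 10, 8, 9]


def gini(labels):
    """Gini impurity: 0 when every label in the group is the same."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def fit_tree(rows, labels):
    """Recursively split on the (feature, threshold) with lowest impurity."""
    if len(set(labels)) == 1:
        return labels[0]  # pure leaf: predict its single label
    best = None
    for feature in range(len(rows[0])):
        # Dropping the largest value keeps both sides of the split non-empty.
        for threshold in sorted({row[feature] for row in rows})[:-1]:
            left = [l for r, l in zip(rows, labels) if r[feature] <= threshold]
            right = [l for r, l in zip(rows, labels) if r[feature] > threshold]
            score = (len(left) * gini(left)
                     + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    if best is None:  # identical rows with different labels: majority vote
        return Counter(labels).most_common(1)[0][0]
    _, feature, threshold = best
    left = [(r, l) for r, l in zip(rows, labels) if r[feature] <= threshold]
    right = [(r, l) for r, l in zip(rows, labels) if r[feature] > threshold]
    return (feature, threshold,
            fit_tree([r for r, _ in left], [l for _, l in left]),
            fit_tree([r for r, _ in right], [l for _, l in right]))


def predict(tree, row):
    """Walk from the root to a leaf, following the threshold tests."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if row[feature] <= threshold else right
    return tree


tree = fit_tree(X, y)
print([predict(tree, row) for row in X])
```

Because the tree is grown until every leaf is pure, it reproduces the training labels exactly, which is also why it should be judged on data it has not seen.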
  • 17. Make predictions. What do our predictions look like?
      Test set:
      Weight (g)  Width (cm)  Height (cm)  Label
      192         8.4         7.3          1
      86          6.2         4.7          2
      178         7.1         7.8          3
      Predictions:
      Weight (g)  Width (cm)  Height (cm)  Label
      192         8.4         7.3          1
      86          6.2         4.7          2
      178         7.1         7.8          1
  • 18. Measure (1/2). Evaluate on a dataset that your model has never seen during training. Accuracy = correct predictions / total predictions (the error level is 1 minus this value). It gives a simple score of our performance level.
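The ratio of correct predictions to total predictions can be computed as below. This is a minimal sketch: the function name is an assumption, and the labels are the three test rows and predictions shown on the "Make predictions" slide.

```python
def accuracy(actual, predicted):
    """Share of predictions that match the actual labels."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


test_labels = [1, 2, 3]   # labels of the held-out test rows
predictions = [1, 2, 1]   # what the model answered for those rows
score = accuracy(test_labels, predictions)
print(score)
```

Two of the three test rows are predicted correctly, so the score is two thirds.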
  • 19. Measure (2/2). Try to visualize and analyze your data, and know what you want. Confusion matrix:
                       Actual true      Actual false
      Predicted true   True positive    False positive
      Predicted false  False negative   True negative
    For skewed classes: Precision = true positives / #predicted positives. Recall = true positives / #actual positives. F1 score (trade-off) = 2 * (precision * recall) / (precision + recall).
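The confusion-matrix counts and the derived scores can be sketched as follows. The function name and the ten-sample data set are hypothetical, chosen only to show a skewed case; the F1 score uses the standard definition, 2 * precision * recall / (precision + recall).

```python
def precision_recall_f1(actual, predicted):
    """Compute precision, recall and F1 for boolean labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # true positives
    fp = sum(1 for a, p in pairs if not a and p)      # false positives
    fn = sum(1 for a, p in pairs if a and not p)      # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical skewed data: only 2 positives among 10 samples.
actual = [True, True] + [False] * 8
predicted = [True, False, True] + [False] * 7
print(precision_recall_f1(actual, predicted))
```

On this data 8 of 10 predictions are correct, yet only half of the predicted positives are right and only half of the actual positives are found, which is what precision and recall expose.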
  • 22. Troubleshoot (2/4). What are the different options? Underfitting: add / create more features, use a more sophisticated model, decrease regularization (collecting more samples alone will not help). Overfitting: use fewer features, use a simpler model, use more samples, increase regularization.
  • 24. Troubleshoot: model choice (4/4)
  • 26. Platforms: easy peasy. Very high-level solutions (PaaS): you don’t even have to code to build something (*wink wink* business developers). Built-in models, data munging, model management through a UI.
  • 27. Languages. What language for what purpose? For understanding and prototyping an implementation. Most valuable languages: comfortable for prototyping, yet powerful for industrialisation. For bigger companies and projects, and fine-tuned software. Languages shown: Matlab, Octave, Go, Python, Java, C++.
  • 28. Libraries. You will have great power using a library: built-in models, data munging, fine-tuning, full integration into your product. Library shown: Golearn.
  • 30. Next steps. Split your data in 3: training / cross-validation / test set. Know the top algorithms. Look into advanced techniques and optimizers (online learning, stacking). Deep and reinforcement learning. Partial and semi-supervised learning. Transfer learning. How to store and analyse big data? How do we scale? Try it! Find your best tools and have some fun.
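The three-way split mentioned on this slide could look like the sketch below. The 60% / 20% / 20% proportions, the function name and the fixed seed are assumptions for illustration; the slide does not give ratios for this split.

```python
import random


def three_way_split(data, train=0.6, validation=0.2, seed=0):
    """Shuffle a copy of the data and cut it into three disjoint parts."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    first = round(len(shuffled) * train)
    second = first + round(len(shuffled) * validation)
    # Whatever is left after the first two cuts becomes the test set.
    return shuffled[:first], shuffled[first:second], shuffled[second:]


train_set, validation_set, test_set = three_way_split(range(100))
print(len(train_set), len(validation_set), len(test_set))
```

The cross-validation part is used to compare models and tune hyperparameters, so that the test set stays untouched until the final measurement.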
  • 31. Conclusion. Try it and let’s get in touch! Machine learning is not just a buzzword. The difficulties are not always where we expect them! Machine learning is more about experimenting and testing than just algorithms. There is no single perfect solution. There are plenty of easy-to-use solutions for beginners.
  • 33. TensorFlow & MNIST
  • 35. Thank you. Questions?