SlideShare une entreprise Scribd logo
1  sur  21
Practical Machine
Learning and Rails
Andrew Cantino
  VP Engineering, Mavenlink    @tectonic




  Founder, Agile Productions   @ryanstout
This talk will
- introduce machine learning

- make you ML-aware

- have examples
This talk will not
- give you a PhD

- implement algorithms

- cover collaborative filtering,
  optimization, clustering, advanced statistics,   genetic algorithms, classical AI, NLP, ...
What is Machine Learning?
Many different algorithms

that predict data

from other data

using applied statistics.
"Enhance and rotate 20 degrees"
What data?
       The web is data.

                                           User decisions
       APIs         A/B Tests
                                 Databases
                   Logs          Streams



Browser versions
                       Reviews
                                  Clicktrails
Okay. We have data.
What do we do with it?


We   classify it.
Classification
Classification



            OR
Classification



    :)      OR   :(
Classification
• Documents
    o Sort email (Gmail's importance filter)
    o Route questions to appropriate expert (Aardvark)
    o Categorize reviews (Amazon)



•   Users
    o   Expertise; interests; pro vs free; likelihood of paying;
        expected future karma


•   Events
    o   Abnormal vs. normal
Algorithms:
     Decision Tree Learning
Algorithms:
        Decision Tree Learning

                                                                Features
                            Email contains
                            word "viagra"

                            no       yes

           Email contains                    Email contains
            word "Ruby"                       attachment?


           no         yes                     no        yes


   P(Spam)=10%     P(Spam)=5%         P(Spam)=70%       P(Spam)=95%




                                                       Labels
Algorithms:
     Support Vector Machines (SVMs)




                          Graphics from Wikipedia
Algorithms:
     Support Vector Machines (SVMs)




                          Graphics from Wikipedia
Algorithms:
           Naive Bayes

•   Break documents into words and treat each
    word as an independent feature

•   Surprisingly effective on simple text and
    document classification

•   Works well when you have lots of data



                                          Graphics from Wikipedia
Algorithms:
             Naive Bayes

You received 100 emails, 70 of which were spam.
Word                 Spam with this word   Ham with this word

viagra               42 (60%)              1 (3.3%)

ruby                 7 (10%)               15 (50%)

hello                35 (50%)              24 (80%)



A new email contains hello and viagra. The probability that it
is spam is:
P(S|hello,viagra) = P(S) * P(hello,viagra|S) / P(hello,viagra)
                  = 0.7 * (0.5 * 0.6)        / (0.59 * 0.43)
                  = 82%
                                                      Graphics from Wikipedia
Algorithms:
               Neural Nets
                         Hidden layer

Input layer (features)

                                        Output layer (Classification)




                                                      Graphics from Wikipedia
Curse of Dimensionality

The more features
  and labels that you
  have, the more data
  that you need.




       http://www.iro.umontreal.ca/~bengioy/yoshua_en/research_files/CurseDimensionality.jpg
Overfitting
•   With enough parameters, anything is
    possible.

•   We want our algorithms to generalize and
    infer, not memorize specific training
    examples.

•   Therefore, we test our algorithms on
    different data than we train them on.

Contenu connexe

Tendances

無限関係モデル (続・わかりやすいパターン認識 13章)
無限関係モデル (続・わかりやすいパターン認識 13章)無限関係モデル (続・わかりやすいパターン認識 13章)
無限関係モデル (続・わかりやすいパターン認識 13章)Shuyo Nakatani
 
GAN(と強化学習との関係)
GAN(と強化学習との関係)GAN(と強化学習との関係)
GAN(と強化学習との関係)Masahiro Suzuki
 
クラシックな機械学習の入門  9. モデル推定
クラシックな機械学習の入門  9. モデル推定クラシックな機械学習の入門  9. モデル推定
クラシックな機械学習の入門  9. モデル推定Hiroshi Nakagawa
 
IIBMP2016 深層生成モデルによる表現学習
IIBMP2016 深層生成モデルによる表現学習IIBMP2016 深層生成モデルによる表現学習
IIBMP2016 深層生成モデルによる表現学習Preferred Networks
 
[DL輪読会]ICLR2020の分布外検知速報
[DL輪読会]ICLR2020の分布外検知速報[DL輪読会]ICLR2020の分布外検知速報
[DL輪読会]ICLR2020の分布外検知速報Deep Learning JP
 
【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについて【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについてDeep Learning JP
 
SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~
SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~
SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~SSII
 
(DL hacks輪読) Deep Kernel Learning
(DL hacks輪読) Deep Kernel Learning(DL hacks輪読) Deep Kernel Learning
(DL hacks輪読) Deep Kernel LearningMasahiro Suzuki
 
[DL輪読会]data2vec: A General Framework for Self-supervised Learning in Speech,...
[DL輪読会]data2vec: A General Framework for  Self-supervised Learning in Speech,...[DL輪読会]data2vec: A General Framework for  Self-supervised Learning in Speech,...
[DL輪読会]data2vec: A General Framework for Self-supervised Learning in Speech,...Deep Learning JP
 
研究室内PRML勉強会 8章1節
研究室内PRML勉強会 8章1節研究室内PRML勉強会 8章1節
研究室内PRML勉強会 8章1節Koji Matsuda
 
深層生成モデルと世界モデル, 深層生成モデルライブラリPixyzについて
深層生成モデルと世界モデル,深層生成モデルライブラリPixyzについて深層生成モデルと世界モデル,深層生成モデルライブラリPixyzについて
深層生成モデルと世界モデル, 深層生成モデルライブラリPixyzについてMasahiro Suzuki
 
【DL輪読会】Segment Anything
【DL輪読会】Segment Anything【DL輪読会】Segment Anything
【DL輪読会】Segment AnythingDeep Learning JP
 
「世界モデル」と関連研究について
「世界モデル」と関連研究について「世界モデル」と関連研究について
「世界モデル」と関連研究についてMasahiro Suzuki
 
[DL輪読会] GAN系の研究まとめ (NIPS2016とICLR2016が中心)
[DL輪読会] GAN系の研究まとめ (NIPS2016とICLR2016が中心)[DL輪読会] GAN系の研究まとめ (NIPS2016とICLR2016が中心)
[DL輪読会] GAN系の研究まとめ (NIPS2016とICLR2016が中心)Yusuke Iwasawa
 
Transformerを多層にする際の勾配消失問題と解決法について
Transformerを多層にする際の勾配消失問題と解決法についてTransformerを多層にする際の勾配消失問題と解決法について
Transformerを多層にする際の勾配消失問題と解決法についてSho Takase
 
ZDD入門-お姉さんを救う方法
ZDD入門-お姉さんを救う方法ZDD入門-お姉さんを救う方法
ZDD入門-お姉さんを救う方法nishio
 
ベルヌーイ分布における超パラメータ推定のための経験ベイズ法
ベルヌーイ分布における超パラメータ推定のための経験ベイズ法ベルヌーイ分布における超パラメータ推定のための経験ベイズ法
ベルヌーイ分布における超パラメータ推定のための経験ベイズ法智文 中野
 
【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...
【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...
【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...Deep Learning JP
 

Tendances (20)

ResNetの仕組み
ResNetの仕組みResNetの仕組み
ResNetの仕組み
 
無限関係モデル (続・わかりやすいパターン認識 13章)
無限関係モデル (続・わかりやすいパターン認識 13章)無限関係モデル (続・わかりやすいパターン認識 13章)
無限関係モデル (続・わかりやすいパターン認識 13章)
 
GAN(と強化学習との関係)
GAN(と強化学習との関係)GAN(と強化学習との関係)
GAN(と強化学習との関係)
 
クラシックな機械学習の入門  9. モデル推定
クラシックな機械学習の入門  9. モデル推定クラシックな機械学習の入門  9. モデル推定
クラシックな機械学習の入門  9. モデル推定
 
IIBMP2016 深層生成モデルによる表現学習
IIBMP2016 深層生成モデルによる表現学習IIBMP2016 深層生成モデルによる表現学習
IIBMP2016 深層生成モデルによる表現学習
 
[DL輪読会]ICLR2020の分布外検知速報
[DL輪読会]ICLR2020の分布外検知速報[DL輪読会]ICLR2020の分布外検知速報
[DL輪読会]ICLR2020の分布外検知速報
 
【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについて【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについて
 
SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~
SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~
SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~
 
ELBO型VAEのダメなところ
ELBO型VAEのダメなところELBO型VAEのダメなところ
ELBO型VAEのダメなところ
 
(DL hacks輪読) Deep Kernel Learning
(DL hacks輪読) Deep Kernel Learning(DL hacks輪読) Deep Kernel Learning
(DL hacks輪読) Deep Kernel Learning
 
[DL輪読会]data2vec: A General Framework for Self-supervised Learning in Speech,...
[DL輪読会]data2vec: A General Framework for  Self-supervised Learning in Speech,...[DL輪読会]data2vec: A General Framework for  Self-supervised Learning in Speech,...
[DL輪読会]data2vec: A General Framework for Self-supervised Learning in Speech,...
 
研究室内PRML勉強会 8章1節
研究室内PRML勉強会 8章1節研究室内PRML勉強会 8章1節
研究室内PRML勉強会 8章1節
 
深層生成モデルと世界モデル, 深層生成モデルライブラリPixyzについて
深層生成モデルと世界モデル,深層生成モデルライブラリPixyzについて深層生成モデルと世界モデル,深層生成モデルライブラリPixyzについて
深層生成モデルと世界モデル, 深層生成モデルライブラリPixyzについて
 
【DL輪読会】Segment Anything
【DL輪読会】Segment Anything【DL輪読会】Segment Anything
【DL輪読会】Segment Anything
 
「世界モデル」と関連研究について
「世界モデル」と関連研究について「世界モデル」と関連研究について
「世界モデル」と関連研究について
 
[DL輪読会] GAN系の研究まとめ (NIPS2016とICLR2016が中心)
[DL輪読会] GAN系の研究まとめ (NIPS2016とICLR2016が中心)[DL輪読会] GAN系の研究まとめ (NIPS2016とICLR2016が中心)
[DL輪読会] GAN系の研究まとめ (NIPS2016とICLR2016が中心)
 
Transformerを多層にする際の勾配消失問題と解決法について
Transformerを多層にする際の勾配消失問題と解決法についてTransformerを多層にする際の勾配消失問題と解決法について
Transformerを多層にする際の勾配消失問題と解決法について
 
ZDD入門-お姉さんを救う方法
ZDD入門-お姉さんを救う方法ZDD入門-お姉さんを救う方法
ZDD入門-お姉さんを救う方法
 
ベルヌーイ分布における超パラメータ推定のための経験ベイズ法
ベルヌーイ分布における超パラメータ推定のための経験ベイズ法ベルヌーイ分布における超パラメータ推定のための経験ベイズ法
ベルヌーイ分布における超パラメータ推定のための経験ベイズ法
 
【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...
【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...
【DL輪読会】Ego-Exo: Transferring Visual Representations from Third-person to Firs...
 

Similaire à Practical Machine Learning and Rails Part1

Static Analysis
Static AnalysisStatic Analysis
Static Analysisalice yang
 
Cs221 lecture5-fall11
Cs221 lecture5-fall11Cs221 lecture5-fall11
Cs221 lecture5-fall11darwinrlo
 
Machine Learning 101 - AWS Machine Learning Web Day
Machine Learning 101 - AWS Machine Learning Web DayMachine Learning 101 - AWS Machine Learning Web Day
Machine Learning 101 - AWS Machine Learning Web DayAWS Germany
 
The Art of Identifying Vulnerabilities - CascadiaFest 2015
The Art of Identifying Vulnerabilities  - CascadiaFest 2015The Art of Identifying Vulnerabilities  - CascadiaFest 2015
The Art of Identifying Vulnerabilities - CascadiaFest 2015Adam Baldwin
 
Barga Data Science lecture 9
Barga Data Science lecture 9Barga Data Science lecture 9
Barga Data Science lecture 9Roger Barga
 
Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Turi, Inc.
 
2020 01 21 Data Platform Geeks - Machine Learning.Net
2020 01 21 Data Platform Geeks - Machine Learning.Net2020 01 21 Data Platform Geeks - Machine Learning.Net
2020 01 21 Data Platform Geeks - Machine Learning.NetBruno Capuano
 
A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...
A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...
A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...Silvio Cesare
 
The Magical Art of Extracting Meaning From Data
The Magical Art of Extracting Meaning From DataThe Magical Art of Extracting Meaning From Data
The Magical Art of Extracting Meaning From Datalmrei
 
Knowledge graphs, meet Deep Learning
Knowledge graphs, meet Deep LearningKnowledge graphs, meet Deep Learning
Knowledge graphs, meet Deep LearningConnected Data World
 
Machine learning, biomarker accuracy and best practices
Machine learning, biomarker accuracy and best practicesMachine learning, biomarker accuracy and best practices
Machine learning, biomarker accuracy and best practicesPradeep Redddy Raamana
 
07-Classification.pptx
07-Classification.pptx07-Classification.pptx
07-Classification.pptxShree Shree
 
Machine Learning Classifiers
Machine Learning ClassifiersMachine Learning Classifiers
Machine Learning ClassifiersMostafa
 
Probabilistic Programming: Why, What, How, When?
Probabilistic Programming: Why, What, How, When?Probabilistic Programming: Why, What, How, When?
Probabilistic Programming: Why, What, How, When?Salesforce Engineering
 
Defcon 21-pinto-defending-networks-machine-learning by pseudor00t
Defcon 21-pinto-defending-networks-machine-learning by pseudor00tDefcon 21-pinto-defending-networks-machine-learning by pseudor00t
Defcon 21-pinto-defending-networks-machine-learning by pseudor00tpseudor00t overflow
 
Practical Data Analysis in Python
Practical Data Analysis in PythonPractical Data Analysis in Python
Practical Data Analysis in PythonHilary Mason
 
Barga Data Science lecture 8
Barga Data Science lecture 8Barga Data Science lecture 8
Barga Data Science lecture 8Roger Barga
 
Part 3 Machine Learnning
Part 3 Machine LearnningPart 3 Machine Learnning
Part 3 Machine LearnningMohamed Essam
 

Similaire à Practical Machine Learning and Rails Part1 (20)

Static Analysis
Static AnalysisStatic Analysis
Static Analysis
 
Cs221 lecture5-fall11
Cs221 lecture5-fall11Cs221 lecture5-fall11
Cs221 lecture5-fall11
 
Machine Learning 101 - AWS Machine Learning Web Day
Machine Learning 101 - AWS Machine Learning Web DayMachine Learning 101 - AWS Machine Learning Web Day
Machine Learning 101 - AWS Machine Learning Web Day
 
NAIVE BAYES ALGORITHM
NAIVE BAYES ALGORITHMNAIVE BAYES ALGORITHM
NAIVE BAYES ALGORITHM
 
The Art of Identifying Vulnerabilities - CascadiaFest 2015
The Art of Identifying Vulnerabilities  - CascadiaFest 2015The Art of Identifying Vulnerabilities  - CascadiaFest 2015
The Art of Identifying Vulnerabilities - CascadiaFest 2015
 
Barga Data Science lecture 9
Barga Data Science lecture 9Barga Data Science lecture 9
Barga Data Science lecture 9
 
Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015
 
Data mining on yelp dataset
Data mining on yelp datasetData mining on yelp dataset
Data mining on yelp dataset
 
2020 01 21 Data Platform Geeks - Machine Learning.Net
2020 01 21 Data Platform Geeks - Machine Learning.Net2020 01 21 Data Platform Geeks - Machine Learning.Net
2020 01 21 Data Platform Geeks - Machine Learning.Net
 
A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...
A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...
A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...
 
The Magical Art of Extracting Meaning From Data
The Magical Art of Extracting Meaning From DataThe Magical Art of Extracting Meaning From Data
The Magical Art of Extracting Meaning From Data
 
Knowledge graphs, meet Deep Learning
Knowledge graphs, meet Deep LearningKnowledge graphs, meet Deep Learning
Knowledge graphs, meet Deep Learning
 
Machine learning, biomarker accuracy and best practices
Machine learning, biomarker accuracy and best practicesMachine learning, biomarker accuracy and best practices
Machine learning, biomarker accuracy and best practices
 
07-Classification.pptx
07-Classification.pptx07-Classification.pptx
07-Classification.pptx
 
Machine Learning Classifiers
Machine Learning ClassifiersMachine Learning Classifiers
Machine Learning Classifiers
 
Probabilistic Programming: Why, What, How, When?
Probabilistic Programming: Why, What, How, When?Probabilistic Programming: Why, What, How, When?
Probabilistic Programming: Why, What, How, When?
 
Defcon 21-pinto-defending-networks-machine-learning by pseudor00t
Defcon 21-pinto-defending-networks-machine-learning by pseudor00tDefcon 21-pinto-defending-networks-machine-learning by pseudor00t
Defcon 21-pinto-defending-networks-machine-learning by pseudor00t
 
Practical Data Analysis in Python
Practical Data Analysis in PythonPractical Data Analysis in Python
Practical Data Analysis in Python
 
Barga Data Science lecture 8
Barga Data Science lecture 8Barga Data Science lecture 8
Barga Data Science lecture 8
 
Part 3 Machine Learnning
Part 3 Machine LearnningPart 3 Machine Learnning
Part 3 Machine Learnning
 

Plus de ryanstout

Neural networks - BigSkyDevCon
Neural networks - BigSkyDevConNeural networks - BigSkyDevCon
Neural networks - BigSkyDevConryanstout
 
Isomorphic App Development with Ruby and Volt - Rubyconf2014
Isomorphic App Development with Ruby and Volt - Rubyconf2014Isomorphic App Development with Ruby and Volt - Rubyconf2014
Isomorphic App Development with Ruby and Volt - Rubyconf2014ryanstout
 
Reactive programming
Reactive programmingReactive programming
Reactive programmingryanstout
 
Concurrency Patterns
Concurrency PatternsConcurrency Patterns
Concurrency Patternsryanstout
 
Practical Machine Learning and Rails Part2
Practical Machine Learning and Rails Part2Practical Machine Learning and Rails Part2
Practical Machine Learning and Rails Part2ryanstout
 
Intro to Advanced JavaScript
Intro to Advanced JavaScriptIntro to Advanced JavaScript
Intro to Advanced JavaScriptryanstout
 

Plus de ryanstout (8)

Neural networks - BigSkyDevCon
Neural networks - BigSkyDevConNeural networks - BigSkyDevCon
Neural networks - BigSkyDevCon
 
Volt 2015
Volt 2015Volt 2015
Volt 2015
 
Isomorphic App Development with Ruby and Volt - Rubyconf2014
Isomorphic App Development with Ruby and Volt - Rubyconf2014Isomorphic App Development with Ruby and Volt - Rubyconf2014
Isomorphic App Development with Ruby and Volt - Rubyconf2014
 
Reactive programming
Reactive programmingReactive programming
Reactive programming
 
Concurrency Patterns
Concurrency PatternsConcurrency Patterns
Concurrency Patterns
 
EmberJS
EmberJSEmberJS
EmberJS
 
Practical Machine Learning and Rails Part2
Practical Machine Learning and Rails Part2Practical Machine Learning and Rails Part2
Practical Machine Learning and Rails Part2
 
Intro to Advanced JavaScript
Intro to Advanced JavaScriptIntro to Advanced JavaScript
Intro to Advanced JavaScript
 

Dernier

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 

Dernier (20)

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 

Practical Machine Learning and Rails Part1

  • 2. Andrew Cantino VP Engineering, Mavenlink @tectonic Founder, Agile Productions @ryanstout
  • 3. This talk will - introduce machine learning - make you ML-aware - have examples
  • 4. This talk will not - give you a PhD - implement algorithms - cover collaborative filtering, optimization, clustering, advanced statistics, genetic algorithms, classical AI, NLP, ...
  • 5. What is Machine Learning? Many different algorithms that predict data from other data using applied statistics.
  • 6. "Enhance and rotate 20 degrees"
  • 7. What data? The web is data. User decisions APIs A/B Tests Databases Logs Streams Browser versions Reviews Clicktrails
  • 8. Okay. We have data. What do we do with it? We classify it.
  • 11. Classification :) OR :(
  • 12. Classification • Documents o Sort email (Gmail's importance filter) o Route questions to appropriate expert (Aardvark) o Categorize reviews (Amazon) • Users o Expertise; interests; pro vs free; likelihood of paying; expected future karma • Events o Abnormal vs. normal
  • 13. Algorithms: Decision Tree Learning
  • 14. Algorithms: Decision Tree Learning Features Email contains word "viagra" no yes Email contains Email contains word "Ruby" attachment? no yes no yes P(Spam)=10% P(Spam)=5% P(Spam)=70% P(Spam)=95% Labels
  • 15. Algorithms: Support Vector Machines (SVMs) Graphics from Wikipedia
  • 16. Algorithms: Support Vector Machines (SVMs) Graphics from Wikipedia
  • 17. Algorithms: Naive Bayes • Break documents into words and treat each word as an independent feature • Surprisingly effective on simple text and document classification • Works well when you have lots of data Graphics from Wikipedia
  • 18. Algorithms: Naive Bayes You received 100 emails, 70 of which were spam. Word Spam with this word Ham with this word viagra 42 (60%) 1 (3.3%) ruby 7 (10%) 15 (50%) hello 35 (50%) 24 (80%) A new email contains hello and viagra. The probability that it is spam is: P(S|hello,viagra) = P(S) * P(hello,viagra|S) / P(hello,viagra) = 0.7 * (0.5 * 0.6) / (0.59 * 0.43) = 82% Graphics from Wikipedia
  • 19. Algorithms: Neural Nets Hidden layer Input layer (features) Output layer (Classification) Graphics from Wikipedia
  • 20. Curse of Dimensionality The more features and labels that you have, the more data that you need. http://www.iro.umontreal.ca/~bengioy/yoshua_en/research_files/CurseDimensionality.jpg
  • 21. Overfitting • With enough parameters, anything is possible. • We want our algorithms to generalize and infer, not memorize specific training examples. • Therefore, we test our algorithms on different data than we train them on.