SlideShare une entreprise Scribd logo
1  sur  21
Practical Machine
Learning and Rails
Andrew Cantino
  VP Engineering, Mavenlink    @tectonic




  Founder, Agile Productions   @ryanstout
This talk will
- introduce machine learning

- make you ML-aware

- have examples
This talk will not
- give you a PhD

- implement algorithms

- cover collaborative filtering,
  optimization, clustering, advanced statistics,   genetic algorithms, classical AI, NLP, ...
What is Machine Learning?
Many different algorithms

that predict data

from other data

using applied statistics.
"Enhance and rotate 20 degrees"
What data?
       The web is data.

                                           User decisions
       APIs         A/B Tests
                                 Databases
                   Logs          Streams



Browser versions
                       Reviews
                                  Clicktrails
Okay. We have data.
What do we do with it?


We   classify it.
Classification
Classification



            OR
Classification



    :)      OR   :(
Classification
• Documents
    o Sort email (Gmail's importance filter)
    o Route questions to appropriate expert (Aardvark)
    o Categorize reviews (Amazon)



•   Users
    o   Expertise; interests; pro vs free; likelihood of paying;
        expected future karma


•   Events
    o   Abnormal vs. normal
Algorithms:
     Decision Tree Learning
Algorithms:
        Decision Tree Learning

                                                                Features
                            Email contains
                            word "viagra"

                            no       yes

           Email contains                    Email contains
            word "Ruby"                       attachment?


           no         yes                     no        yes


   P(Spam)=10%     P(Spam)=5%         P(Spam)=70%       P(Spam)=95%




                                                       Labels
Algorithms:
     Support Vector Machines (SVMs)




                          Graphics from Wikipedia
Algorithms:
     Support Vector Machines (SVMs)




                          Graphics from Wikipedia
Algorithms:
           Naive Bayes

•   Break documents into words and treat each
    word as an independent feature

•   Surprisingly effective on simple text and
    document classification

•   Works well when you have lots of data



                                          Graphics from Wikipedia
Algorithms:
             Naive Bayes

You received 100 emails, 70 of which were spam.
Word                 Spam with this word   Ham with this word

viagra               42 (60%)              1 (3.3%)

ruby                 7 (10%)               15 (50%)

hello                35 (50%)              24 (80%)



A new email contains hello and viagra. The probability that it
is spam is:
P(S|hello,viagra) = P(S) * P(hello,viagra|S) / P(hello,viagra)
                  = 0.7 * (0.5 * 0.6)        / (0.59 * 0.43)
                  = 82%
                                                      Graphics from Wikipedia
Algorithms:
               Neural Nets
                         Hidden layer

Input layer (features)

                                        Output layer (Classification)




                                                      Graphics from Wikipedia
Curse of Dimensionality

The more features
  and labels that you
  have, the more data
  that you need.




       http://www.iro.umontreal.ca/~bengioy/yoshua_en/research_files/CurseDimensionality.jpg
Overfitting
•   With enough parameters, anything is
    possible.

•   We want our algorithms to generalize and
    infer, not memorize specific training
    examples.

•   Therefore, we test our algorithms on
    different data than we train them on.

Contenu connexe

Tendances

RustによるGPUプログラミング環境
RustによるGPUプログラミング環境RustによるGPUプログラミング環境
RustによるGPUプログラミング環境KiyotomoHiroyasu
 
非参照型メトリクスを用いた放射線動画の評価
非参照型メトリクスを用いた放射線動画の評価非参照型メトリクスを用いた放射線動画の評価
非参照型メトリクスを用いた放射線動画の評価Yutaka KATAYAMA
 
クラシックな機械学習の入門  9. モデル推定
クラシックな機械学習の入門  9. モデル推定クラシックな機械学習の入門  9. モデル推定
クラシックな機械学習の入門  9. モデル推定Hiroshi Nakagawa
 
[cvpaper.challenge] 超解像メタサーベイ #meta-study-group勉強会
[cvpaper.challenge] 超解像メタサーベイ #meta-study-group勉強会[cvpaper.challenge] 超解像メタサーベイ #meta-study-group勉強会
[cvpaper.challenge] 超解像メタサーベイ #meta-study-group勉強会S_aiueo32
 
【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCH
【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCH【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCH
【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCHDeep Learning JP
 
[DL輪読会]Graph R-CNN for Scene Graph Generation
[DL輪読会]Graph R-CNN for Scene Graph Generation[DL輪読会]Graph R-CNN for Scene Graph Generation
[DL輪読会]Graph R-CNN for Scene Graph GenerationDeep Learning JP
 
深層学習フレームワークChainerの特徴
深層学習フレームワークChainerの特徴深層学習フレームワークChainerの特徴
深層学習フレームワークChainerの特徴Yuya Unno
 
【DL輪読会】Scaling laws for single-agent reinforcement learning
【DL輪読会】Scaling laws for single-agent reinforcement learning【DL輪読会】Scaling laws for single-agent reinforcement learning
【DL輪読会】Scaling laws for single-agent reinforcement learningDeep Learning JP
 
決定木・回帰木に基づくアンサンブル学習の最近
決定木・回帰木に基づくアンサンブル学習の最近決定木・回帰木に基づくアンサンブル学習の最近
決定木・回帰木に基づくアンサンブル学習の最近Ichigaku Takigawa
 
バイオインフォマティクスで実験ノートを取ろう
バイオインフォマティクスで実験ノートを取ろうバイオインフォマティクスで実験ノートを取ろう
バイオインフォマティクスで実験ノートを取ろうMasahiro Kasahara
 
Ph.D. Defense Presentation Slides (Changhee Han) カリスの東大博論審査会(公聴会)発表スライド Patho...
Ph.D. Defense Presentation Slides (Changhee Han) カリスの東大博論審査会(公聴会)発表スライド Patho...Ph.D. Defense Presentation Slides (Changhee Han) カリスの東大博論審査会(公聴会)発表スライド Patho...
Ph.D. Defense Presentation Slides (Changhee Han) カリスの東大博論審査会(公聴会)発表スライド Patho...カリス 東大AI博士
 
金融リスクとポートフォリオマネジメント
金融リスクとポートフォリオマネジメント金融リスクとポートフォリオマネジメント
金融リスクとポートフォリオマネジメントKei Nakagawa
 
【DL輪読会】Monocular real time volumetric performance capture
【DL輪読会】Monocular real time volumetric performance capture 【DL輪読会】Monocular real time volumetric performance capture
【DL輪読会】Monocular real time volumetric performance capture Deep Learning JP
 
[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative Models
[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative Models[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative Models
[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative ModelsDeep Learning JP
 
PCAの最終形態GPLVMの解説
PCAの最終形態GPLVMの解説PCAの最終形態GPLVMの解説
PCAの最終形態GPLVMの解説弘毅 露崎
 
文献紹介:CutDepth: Edge-aware Data Augmentation in Depth Estimation
文献紹介:CutDepth: Edge-aware Data Augmentation in Depth Estimation文献紹介:CutDepth: Edge-aware Data Augmentation in Depth Estimation
文献紹介:CutDepth: Edge-aware Data Augmentation in Depth EstimationToru Tamaki
 
Image Filtering in the Frequency Domain
Image Filtering in the Frequency DomainImage Filtering in the Frequency Domain
Image Filtering in the Frequency DomainAmnaakhaan
 
Variational Template Machine for Data-to-Text Generation
Variational Template Machine for Data-to-Text GenerationVariational Template Machine for Data-to-Text Generation
Variational Template Machine for Data-to-Text Generationharmonylab
 

Tendances (20)

RustによるGPUプログラミング環境
RustによるGPUプログラミング環境RustによるGPUプログラミング環境
RustによるGPUプログラミング環境
 
非参照型メトリクスを用いた放射線動画の評価
非参照型メトリクスを用いた放射線動画の評価非参照型メトリクスを用いた放射線動画の評価
非参照型メトリクスを用いた放射線動画の評価
 
クラシックな機械学習の入門  9. モデル推定
クラシックな機械学習の入門  9. モデル推定クラシックな機械学習の入門  9. モデル推定
クラシックな機械学習の入門  9. モデル推定
 
[cvpaper.challenge] 超解像メタサーベイ #meta-study-group勉強会
[cvpaper.challenge] 超解像メタサーベイ #meta-study-group勉強会[cvpaper.challenge] 超解像メタサーベイ #meta-study-group勉強会
[cvpaper.challenge] 超解像メタサーベイ #meta-study-group勉強会
 
【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCH
【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCH【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCH
【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCH
 
[DL輪読会]Graph R-CNN for Scene Graph Generation
[DL輪読会]Graph R-CNN for Scene Graph Generation[DL輪読会]Graph R-CNN for Scene Graph Generation
[DL輪読会]Graph R-CNN for Scene Graph Generation
 
深層学習フレームワークChainerの特徴
深層学習フレームワークChainerの特徴深層学習フレームワークChainerの特徴
深層学習フレームワークChainerの特徴
 
【DL輪読会】Scaling laws for single-agent reinforcement learning
【DL輪読会】Scaling laws for single-agent reinforcement learning【DL輪読会】Scaling laws for single-agent reinforcement learning
【DL輪読会】Scaling laws for single-agent reinforcement learning
 
決定木・回帰木に基づくアンサンブル学習の最近
決定木・回帰木に基づくアンサンブル学習の最近決定木・回帰木に基づくアンサンブル学習の最近
決定木・回帰木に基づくアンサンブル学習の最近
 
バイオインフォマティクスで実験ノートを取ろう
バイオインフォマティクスで実験ノートを取ろうバイオインフォマティクスで実験ノートを取ろう
バイオインフォマティクスで実験ノートを取ろう
 
Ph.D. Defense Presentation Slides (Changhee Han) カリスの東大博論審査会(公聴会)発表スライド Patho...
Ph.D. Defense Presentation Slides (Changhee Han) カリスの東大博論審査会(公聴会)発表スライド Patho...Ph.D. Defense Presentation Slides (Changhee Han) カリスの東大博論審査会(公聴会)発表スライド Patho...
Ph.D. Defense Presentation Slides (Changhee Han) カリスの東大博論審査会(公聴会)発表スライド Patho...
 
金融リスクとポートフォリオマネジメント
金融リスクとポートフォリオマネジメント金融リスクとポートフォリオマネジメント
金融リスクとポートフォリオマネジメント
 
Local binary pattern
Local binary patternLocal binary pattern
Local binary pattern
 
劣微分
劣微分劣微分
劣微分
 
【DL輪読会】Monocular real time volumetric performance capture
【DL輪読会】Monocular real time volumetric performance capture 【DL輪読会】Monocular real time volumetric performance capture
【DL輪読会】Monocular real time volumetric performance capture
 
[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative Models
[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative Models[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative Models
[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative Models
 
PCAの最終形態GPLVMの解説
PCAの最終形態GPLVMの解説PCAの最終形態GPLVMの解説
PCAの最終形態GPLVMの解説
 
文献紹介:CutDepth: Edge-aware Data Augmentation in Depth Estimation
文献紹介:CutDepth: Edge-aware Data Augmentation in Depth Estimation文献紹介:CutDepth: Edge-aware Data Augmentation in Depth Estimation
文献紹介:CutDepth: Edge-aware Data Augmentation in Depth Estimation
 
Image Filtering in the Frequency Domain
Image Filtering in the Frequency DomainImage Filtering in the Frequency Domain
Image Filtering in the Frequency Domain
 
Variational Template Machine for Data-to-Text Generation
Variational Template Machine for Data-to-Text GenerationVariational Template Machine for Data-to-Text Generation
Variational Template Machine for Data-to-Text Generation
 

Similaire à Practical Machine Learning and Rails Part1

Static Analysis
Static AnalysisStatic Analysis
Static Analysisalice yang
 
Cs221 lecture5-fall11
Cs221 lecture5-fall11Cs221 lecture5-fall11
Cs221 lecture5-fall11darwinrlo
 
Machine Learning 101 - AWS Machine Learning Web Day
Machine Learning 101 - AWS Machine Learning Web DayMachine Learning 101 - AWS Machine Learning Web Day
Machine Learning 101 - AWS Machine Learning Web DayAWS Germany
 
The Art of Identifying Vulnerabilities - CascadiaFest 2015
The Art of Identifying Vulnerabilities  - CascadiaFest 2015The Art of Identifying Vulnerabilities  - CascadiaFest 2015
The Art of Identifying Vulnerabilities - CascadiaFest 2015Adam Baldwin
 
Barga Data Science lecture 9
Barga Data Science lecture 9Barga Data Science lecture 9
Barga Data Science lecture 9Roger Barga
 
Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Turi, Inc.
 
2020 01 21 Data Platform Geeks - Machine Learning.Net
2020 01 21 Data Platform Geeks - Machine Learning.Net2020 01 21 Data Platform Geeks - Machine Learning.Net
2020 01 21 Data Platform Geeks - Machine Learning.NetBruno Capuano
 
A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...
A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...
A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...Silvio Cesare
 
The Magical Art of Extracting Meaning From Data
The Magical Art of Extracting Meaning From DataThe Magical Art of Extracting Meaning From Data
The Magical Art of Extracting Meaning From Datalmrei
 
Knowledge graphs, meet Deep Learning
Knowledge graphs, meet Deep LearningKnowledge graphs, meet Deep Learning
Knowledge graphs, meet Deep LearningConnected Data World
 
Machine learning, biomarker accuracy and best practices
Machine learning, biomarker accuracy and best practicesMachine learning, biomarker accuracy and best practices
Machine learning, biomarker accuracy and best practicesPradeep Redddy Raamana
 
07-Classification.pptx
07-Classification.pptx07-Classification.pptx
07-Classification.pptxShree Shree
 
Machine Learning Classifiers
Machine Learning ClassifiersMachine Learning Classifiers
Machine Learning ClassifiersMostafa
 
Probabilistic Programming: Why, What, How, When?
Probabilistic Programming: Why, What, How, When?Probabilistic Programming: Why, What, How, When?
Probabilistic Programming: Why, What, How, When?Salesforce Engineering
 
Defcon 21-pinto-defending-networks-machine-learning by pseudor00t
Defcon 21-pinto-defending-networks-machine-learning by pseudor00tDefcon 21-pinto-defending-networks-machine-learning by pseudor00t
Defcon 21-pinto-defending-networks-machine-learning by pseudor00tpseudor00t overflow
 
Practical Data Analysis in Python
Practical Data Analysis in PythonPractical Data Analysis in Python
Practical Data Analysis in PythonHilary Mason
 
Barga Data Science lecture 8
Barga Data Science lecture 8Barga Data Science lecture 8
Barga Data Science lecture 8Roger Barga
 
Part 3 Machine Learnning
Part 3 Machine LearnningPart 3 Machine Learnning
Part 3 Machine LearnningMohamed Essam
 

Similaire à Practical Machine Learning and Rails Part1 (20)

Static Analysis
Static AnalysisStatic Analysis
Static Analysis
 
Cs221 lecture5-fall11
Cs221 lecture5-fall11Cs221 lecture5-fall11
Cs221 lecture5-fall11
 
Machine Learning 101 - AWS Machine Learning Web Day
Machine Learning 101 - AWS Machine Learning Web DayMachine Learning 101 - AWS Machine Learning Web Day
Machine Learning 101 - AWS Machine Learning Web Day
 
NAIVE BAYES ALGORITHM
NAIVE BAYES ALGORITHMNAIVE BAYES ALGORITHM
NAIVE BAYES ALGORITHM
 
The Art of Identifying Vulnerabilities - CascadiaFest 2015
The Art of Identifying Vulnerabilities  - CascadiaFest 2015The Art of Identifying Vulnerabilities  - CascadiaFest 2015
The Art of Identifying Vulnerabilities - CascadiaFest 2015
 
Barga Data Science lecture 9
Barga Data Science lecture 9Barga Data Science lecture 9
Barga Data Science lecture 9
 
Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015
 
Data mining on yelp dataset
Data mining on yelp datasetData mining on yelp dataset
Data mining on yelp dataset
 
2020 01 21 Data Platform Geeks - Machine Learning.Net
2020 01 21 Data Platform Geeks - Machine Learning.Net2020 01 21 Data Platform Geeks - Machine Learning.Net
2020 01 21 Data Platform Geeks - Machine Learning.Net
 
A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...
A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...
A Fast Flowgraph Based Classification System for Packed and Polymorphic Malwa...
 
The Magical Art of Extracting Meaning From Data
The Magical Art of Extracting Meaning From DataThe Magical Art of Extracting Meaning From Data
The Magical Art of Extracting Meaning From Data
 
Knowledge graphs, meet Deep Learning
Knowledge graphs, meet Deep LearningKnowledge graphs, meet Deep Learning
Knowledge graphs, meet Deep Learning
 
Machine learning, biomarker accuracy and best practices
Machine learning, biomarker accuracy and best practicesMachine learning, biomarker accuracy and best practices
Machine learning, biomarker accuracy and best practices
 
07-Classification.pptx
07-Classification.pptx07-Classification.pptx
07-Classification.pptx
 
Machine Learning Classifiers
Machine Learning ClassifiersMachine Learning Classifiers
Machine Learning Classifiers
 
Probabilistic Programming: Why, What, How, When?
Probabilistic Programming: Why, What, How, When?Probabilistic Programming: Why, What, How, When?
Probabilistic Programming: Why, What, How, When?
 
Defcon 21-pinto-defending-networks-machine-learning by pseudor00t
Defcon 21-pinto-defending-networks-machine-learning by pseudor00tDefcon 21-pinto-defending-networks-machine-learning by pseudor00t
Defcon 21-pinto-defending-networks-machine-learning by pseudor00t
 
Practical Data Analysis in Python
Practical Data Analysis in PythonPractical Data Analysis in Python
Practical Data Analysis in Python
 
Barga Data Science lecture 8
Barga Data Science lecture 8Barga Data Science lecture 8
Barga Data Science lecture 8
 
Part 3 Machine Learnning
Part 3 Machine LearnningPart 3 Machine Learnning
Part 3 Machine Learnning
 

Plus de ryanstout

Neural networks - BigSkyDevCon
Neural networks - BigSkyDevConNeural networks - BigSkyDevCon
Neural networks - BigSkyDevConryanstout
 
Isomorphic App Development with Ruby and Volt - Rubyconf2014
Isomorphic App Development with Ruby and Volt - Rubyconf2014Isomorphic App Development with Ruby and Volt - Rubyconf2014
Isomorphic App Development with Ruby and Volt - Rubyconf2014ryanstout
 
Reactive programming
Reactive programmingReactive programming
Reactive programmingryanstout
 
Concurrency Patterns
Concurrency PatternsConcurrency Patterns
Concurrency Patternsryanstout
 
Practical Machine Learning and Rails Part2
Practical Machine Learning and Rails Part2Practical Machine Learning and Rails Part2
Practical Machine Learning and Rails Part2ryanstout
 
Intro to Advanced JavaScript
Intro to Advanced JavaScriptIntro to Advanced JavaScript
Intro to Advanced JavaScriptryanstout
 

Plus de ryanstout (8)

Neural networks - BigSkyDevCon
Neural networks - BigSkyDevConNeural networks - BigSkyDevCon
Neural networks - BigSkyDevCon
 
Volt 2015
Volt 2015Volt 2015
Volt 2015
 
Isomorphic App Development with Ruby and Volt - Rubyconf2014
Isomorphic App Development with Ruby and Volt - Rubyconf2014Isomorphic App Development with Ruby and Volt - Rubyconf2014
Isomorphic App Development with Ruby and Volt - Rubyconf2014
 
Reactive programming
Reactive programmingReactive programming
Reactive programming
 
Concurrency Patterns
Concurrency PatternsConcurrency Patterns
Concurrency Patterns
 
EmberJS
EmberJSEmberJS
EmberJS
 
Practical Machine Learning and Rails Part2
Practical Machine Learning and Rails Part2Practical Machine Learning and Rails Part2
Practical Machine Learning and Rails Part2
 
Intro to Advanced JavaScript
Intro to Advanced JavaScriptIntro to Advanced JavaScript
Intro to Advanced JavaScript
 

Dernier

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 

Dernier (20)

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 

Practical Machine Learning and Rails Part1

  • 2. Andrew Cantino VP Engineering, Mavenlink @tectonic Founder, Agile Productions @ryanstout
  • 3. This talk will - introduce machine learning - make you ML-aware - have examples
  • 4. This talk will not - give you a PhD - implement algorithms - cover collaborative filtering, optimization, clustering, advanced statistics, genetic algorithms, classical AI, NLP, ...
  • 5. What is Machine Learning? Many different algorithms that predict data from other data using applied statistics.
  • 6. "Enhance and rotate 20 degrees"
  • 7. What data? The web is data. User decisions APIs A/B Tests Databases Logs Streams Browser versions Reviews Clicktrails
  • 8. Okay. We have data. What do we do with it? We classify it.
  • 11. Classification :) OR :(
  • 12. Classification • Documents o Sort email (Gmail's importance filter) o Route questions to appropriate expert (Aardvark) o Categorize reviews (Amazon) • Users o Expertise; interests; pro vs free; likelihood of paying; expected future karma • Events o Abnormal vs. normal
  • 13. Algorithms: Decision Tree Learning
  • 14. Algorithms: Decision Tree Learning Features Email contains word "viagra" no yes Email contains Email contains word "Ruby" attachment? no yes no yes P(Spam)=10% P(Spam)=5% P(Spam)=70% P(Spam)=95% Labels
  • 15. Algorithms: Support Vector Machines (SVMs) Graphics from Wikipedia
  • 16. Algorithms: Support Vector Machines (SVMs) Graphics from Wikipedia
  • 17. Algorithms: Naive Bayes • Break documents into words and treat each word as an independent feature • Surprisingly effective on simple text and document classification • Works well when you have lots of data Graphics from Wikipedia
  • 18. Algorithms: Naive Bayes You received 100 emails, 70 of which were spam. Word Spam with this word Ham with this word viagra 42 (60%) 1 (3.3%) ruby 7 (10%) 15 (50%) hello 35 (50%) 24 (80%) A new email contains hello and viagra. The probability that it is spam is: P(S|hello,viagra) = P(S) * P(hello,viagra|S) / P(hello,viagra) = 0.7 * (0.5 * 0.6) / (0.59 * 0.43) = 82% Graphics from Wikipedia
  • 19. Algorithms: Neural Nets Hidden layer Input layer (features) Output layer (Classification) Graphics from Wikipedia
  • 20. Curse of Dimensionality The more features and labels that you have, the more data that you need. http://www.iro.umontreal.ca/~bengioy/yoshua_en/research_files/CurseDimensionality.jpg
  • 21. Overfitting • With enough parameters, anything is possible. • We want our algorithms to generalize and infer, not memorize specific training examples. • Therefore, we test our algorithms on different data than we train them on.