Intelligent Ruby + Machine Learning: what, why, the trends, and the toolkit. Ilya Grigorik (@igrigorik)
Machine Learning is ___________ speak up!
“Machine learning is a discipline that is concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data”
ML & AI in academia, and how it’s commonly taught (diagram labels: Algorithm, Data Input, Data Output, Runtime)
ML & AI in the real world or, at least, where the trends are going (diagram labels: Algorithm, Data Input, Data Output, Runtime)
Runtime is a practical constraint which is often overlooked by academia (diagram labels: Algorithm, Data Input, Data Output, Runtime repeated several times)
 CPU vs GPU?
 on-demand supercomputing
 supercomputer by the hour (cloud)
Data is often no longer scarce… in fact, we (Rubyists) are responsible for generating a lot of it… (diagram labels: Algorithm, Data Output, Data Input and Runtime each repeated several times)
 Trillions of social connections
 Petabytes of unstructured data
 Growing at an exponential rate
Mo’ data, Mo’ problems? Requires more resources? No better off…? (diagram labels: Data Input and Runtime each repeated several times, followed by “?”)
“More input data vs. Better Algorithms”: “Mitigating the Paucity-of-Data Problem: Exploring the Effect of Training Corpus Size on Classifier Performance for Natural Language Processing”, Michele Banko, Eric Brill http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.12.646
“Data-Driven Learning”: "We were able to significantly reduce the error rate, compared to the best system trained on the standard training set size, simply by adding more training data... We see that even out to a billion words the learners continue to benefit from additional training data."
Brute-forcing “learning” with Big-Data: data as the algorithm…
NLP with Big-Data: Google does this better than anyone else…
 Examples: 新星歐唐尼爾 保守特立獨行 and Wordsegmentationistricky, which should become Word|segmentation|is|tricky
 Strategy 1: Grammar for dummies
 Strategy 2: Natural language toolkit (encode a language model)
 Strategy 3: Take a guess!
Word Segmentation: Take a guess! Estimate the probability of every segmentation, pick the best performer
 P(W) x P(ordsegmentationistricky)
 P(Wo) x P(rdsegmentationistricky)
 …
 P(Word) x P(segmentationistricky)
 Take the argmax. But P(W) = ????
Word Segmentation: Take a guess! That’s how Google does it, and does it well…
 P(W) = # of Google hits / ~# of pages on the web. Not kidding… it works. Exercise: write a Ruby script for it.
 P(W) = count in Google’s n-gram dataset / # of n-grams http://bit.ly/dyTvLO
 Adding a new language: scrape the web, count the words, done.
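As a rough sketch of the "take a guess" strategy, the Ruby below scores every split of the input as P(first word) x P(best segmentation of the rest) and keeps the argmax. The `COUNTS` table is a tiny, made-up stand-in for Google hit counts or the n-gram dataset, and the penalty for unknown words is an assumption chosen only to keep the toy example well behaved.

```ruby
# Toy unigram counts standing in for web-scale data (numbers are invented).
COUNTS = {
  "word" => 500, "segmentation" => 50, "is" => 5000, "tricky" => 40,
  "a" => 9000, "i" => 4000
}.freeze
TOTAL = COUNTS.values.sum.to_f

# P(w): relative frequency for known words. Unknown words get a tiny
# probability that shrinks with length (an assumed smoothing rule).
def word_prob(word)
  count = COUNTS[word]
  return count / TOTAL if count
  1.0 / (TOTAL * 10**word.length)
end

# Returns [words, probability] for the most probable segmentation of text.
# Every prefix is tried as the first word; results are memoized per suffix.
def segment(text, memo = {})
  return [[], 1.0] if text.empty?
  memo[text] ||= begin
    candidates = (1..text.length).map do |i|
      first = text[0...i]
      rest_words, rest_prob = segment(text[i..-1], memo)
      [[first] + rest_words, word_prob(first) * rest_prob]
    end
    candidates.max_by { |_, prob| prob }
  end
end

words, _prob = segment("wordsegmentationistricky")
puts words.join("|")
```

With real counts the only change would be where `word_prob` gets its numbers from; the search itself stays the same.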
Of course, smarter algorithms still matter! Don’t get me wrong… (diagram labels: Algorithm, Data Output, Data Input and Runtime each repeated several times)
Learning vs. Compression: closely correlated concepts
 “Machine Learning”: if we can identify significant concepts (within a dataset), then we can represent a large dataset with fewer bits.
 If we can represent our data with fewer bits (compress our data), then we have identified “significant” concepts!
Ex: Classification
Predicting a “tasty fruit” with the perceptron algorithm (y = mx + b) http://bit.ly/bMcwhI
 Plot: axes Color and Feel; Red = Not tasty, Green = Tasty; unlabeled points marked “?” (Tasty…?)
 Exercise: maximize the margin
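A minimal perceptron sketch for the fruit example might look like the following. The feature values, the learning rate and the epoch limit are all invented for illustration. It learns a separating line in the spirit of y = mx + b, but it does not attempt the "maximize the margin" exercise.

```ruby
# Hypothetical training data: [color, feel] scaled to 0..1,
# label +1 = tasty, -1 = not tasty. Values are made up for illustration.
SAMPLES = [
  [[0.9, 0.8], 1], [[0.8, 0.9], 1], [[0.7, 0.7], 1],
  [[0.2, 0.3], -1], [[0.1, 0.2], -1], [[0.3, 0.1], -1]
].freeze

# Sign of the weighted sum: which side of the line the point falls on.
def predict(weights, bias, features)
  activation = bias + weights.zip(features).sum { |w, x| w * x }
  activation >= 0 ? 1 : -1
end

# Classic perceptron rule: nudge the line towards every misclassified point,
# and stop once a full pass over the data produces no mistakes.
def train_perceptron(samples, epochs: 100, rate: 0.1)
  weights = Array.new(samples.first.first.size, 0.0)
  bias = 0.0
  epochs.times do
    errors = 0
    samples.each do |features, label|
      next if predict(weights, bias, features) == label
      weights = weights.zip(features).map { |w, x| w + rate * label * x }
      bias += rate * label
      errors += 1
    end
    break if errors.zero?
  end
  [weights, bias]
end

weights, bias = train_perceptron(SAMPLES)
puts predict(weights, bias, [0.85, 0.85]) == 1 ? "tasty" : "not tasty"
```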
Where the perceptron breaks down: we need a better model… (plot legend: Green = Positive, Purple = Negative)
Idea: y = x^2. Throw the data into a “higher dimensional” space! Perfect! (plot legend: Green = Positive, Purple = Negative) http://bit.ly/dfG7vD
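To make the y = x^2 idea concrete, here is a small sketch with invented one-dimensional data where the positives sit on both sides of the negatives. No single threshold on x separates them, but after lifting each point to [x, x^2] a horizontal line does. The threshold of 2.0 is an assumption picked by eye for this toy data.

```ruby
# Invented 1-D data: positives lie on both sides of the negatives.
POINTS = {
  -3.0 => :positive, -2.5 => :positive,
  -0.5 => :negative, 0.0 => :negative, 0.7 => :negative,
  2.2 => :positive, 3.1 => :positive
}.freeze

# Lift a point into two dimensions: x -> [x, x^2].
def lift(x)
  [x, x**2]
end

# In the lifted space the classes are split by the line x^2 = threshold.
def classify(x, threshold: 2.0)
  lift(x).last > threshold ? :positive : :negative
end

POINTS.each { |x, label| puts "#{x}: #{classify(x)} (expected #{label})" }
```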
Support Vector Machines: that’s the core insight! Simple as that. http://bit.ly/a2oyMu
    require 'SVM'
    sp = Problem.new
    sp.addExample("spam", [1,1,0])
    sp.addExample("ham",  [0,1,1])
    pa = Parameter.new
    m = Model.new(sp, pa)
    m.predict [1, 0, 0]
Ex: Recommendations
Linear Algebra + Singular Value Decomposition: a bit of linear algebra for good measure… (table: rows Ben, Fred, Tom, James, Bob; columns A, B, C, D)
 Any M x N matrix (where M >= N) can be decomposed into:
  M x M - call it U
  M x N - call it S
  N x N - call it V
 Observation: we can use this decomposition to approximate the original M x N matrix (by fiddling with S and then recomputing U x S x V)
SVD in action: bread and butter of computer vision systems
gem install linalg to do the heavy-lifting… http://bit.ly/9lXuOL
    require 'linalg'
    m = Linalg::DMatrix[[1,0,1,0], [1,1,1,1], ...]
    # Compute the SVD decomposition
    u, s, vt = m.singular_value_decomposition
    # ... compute user similarity
    # ... make recommendations based on similar users!
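For readers without the linalg gem, the low-rank approximation described above can be sketched with Ruby's `matrix` library alone. This is not the approach from the slides: it swaps in an eigendecomposition of A^T * A, whose eigenvectors are the right singular vectors of A, and projects A onto the top k of them. That gives the same matrix as truncating S and recomputing the product. The ratings are invented.

```ruby
require 'matrix'

# Invented ratings: one row per user, one column per item A..D.
RATINGS = Matrix[
  [5.0, 5.0, 0.0, 1.0],
  [4.0, 5.0, 1.0, 0.0],
  [0.0, 1.0, 5.0, 4.0],
  [1.0, 0.0, 4.0, 5.0],
  [5.0, 4.0, 1.0, 1.0]
]

# Rank-k approximation of a: keep the k eigenvectors of a^T * a with the
# largest eigenvalues (the top right singular vectors) and project onto them.
def low_rank(a, k)
  eig = (a.transpose * a).eigensystem
  top = eig.eigenvalues.each_with_index.max_by(k) { |value, _| value }.map(&:last)
  v_k = Matrix.columns(top.map { |i| eig.eigenvectors[i].to_a })
  a * v_k * v_k.transpose
end

approx = low_rank(RATINGS, 2)
approx.to_a.each { |row| puts row.map { |x| x.round(2) }.inspect }
```

Rows of the approximation that look alike suggest users with similar taste, which is the starting point for the "compute user similarity" step.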
Ex: Clustering

  • 34. Raw data: (1) AAAA AAA AAAA AAA AAAAA; (2) BBBBB BBBBBB BBBBB BBBBB; (3) AAAA BBBBB AAA BBBBB AA. Similarity? similarity(1, 3) > similarity(1, 2) and similarity(2, 3) > similarity(1, 2). Yeah… but how did you figure that out? Some of you ran Lempel-Ziv on it… Learning & compression are closely correlated concepts.
  • 35. Clustering with Zlib: no knowledge of the domain, just straight up compression. Similarity = amount of space saved when compressed together vs. individually. Exercise: cluster your iTunes library…
    require 'zlib'
    require 'pp'

    files = Dir['data/*']

    # Compressed size of the given files concatenated together
    def deflate(*files)
      z = Zlib::Deflate.new
      z.deflate(files.collect { |f| File.read(f) }.join(""), Zlib::FINISH).size
    end

    # Score each pair by the bytes saved when compressing the two together
    pairwise = files.combination(2).collect do |f1, f2|
      a = deflate(f1)
      b = deflate(f2)
      both = deflate(f1, f2)
      { :files => [f1, f2], :score => (a + b) - both }
    end

    pp pairwise.sort { |a, b| b[:score] <=> a[:score] }.first(20)
  • 36. “Ensemble Methods in Machine Learning”, Thomas G. Dietterich (2000): “Ensemble methods are learning algorithms that construct a set of classifiers and then classify new data points by taking a vote of their predictions… ensembles can often perform better than any single classifier.” (diagram labels: Data Input, Algorithm and Runtime each repeated several times, with a single Data Output)
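The "taking a vote" part of that definition can be sketched in a few lines of Ruby. The three member classifiers here are hypothetical, deliberately simple rules over made-up [color, feel] features. A real ensemble would combine trained models instead.

```ruby
# Three simple, hypothetical classifiers over [color, feel] features.
CLASSIFIERS = [
  ->(f) { f[0] > 0.5 ? :tasty : :not_tasty },           # looks at color only
  ->(f) { f[1] > 0.5 ? :tasty : :not_tasty },           # looks at feel only
  ->(f) { f.sum / f.size > 0.5 ? :tasty : :not_tasty }  # looks at the average
].freeze

# Classify a data point by majority vote over the members' predictions.
def ensemble_predict(classifiers, features)
  votes = classifiers.map { |classifier| classifier.call(features) }
  votes.group_by(&:itself).max_by { |_, group| group.size }.first
end

puts ensemble_predict(CLASSIFIERS, [0.9, 0.4])
```

An odd number of members avoids ties for a two-class problem, which is why the sketch uses three.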
  • 37. The Ensemble = 30+ members; BellKor = 7 members. http://nyti.ms/ccR7ul
  • 38. Collaborative, Collaborative Filtering? Unfortunately, GitHub didn’t buy into the idea…
    require 'open-uri'

    class Crowdsource
      def initialize
        load_leaderboard  # scrape github contest leaders
        parse_leaders     # find their top performing results
        fetch_results     # download best results
        cleanup_leaders   # cleanup missing or incorrect data
        crunchit          # build an ensemble
      end
      #...
    end

    Crowdsource.new
  • 39.
  • 40. Ensembles: embrace complexity of many small, independent models!
  • 41. Complex ideas are constructed on simple ideas: explore the simple ideas. More resources, More data, More Models = Collaborative, Data-Driven Learning
  • 42. Phew, time for questions? Hope this convinced you to explore the area further…
    Collaborative Filtering with Ensembles: http://www.igvita.com/2009/09/01/collaborative-filtering-with-ensembles/
    Support Vector Machines in Ruby: http://www.igvita.com/2008/01/07/support-vector-machines-svm-in-ruby/
    SVD Recommendation System in Ruby: http://www.igvita.com/2007/01/15/svd-recommendation-system-in-ruby/
    gem install ai4r: http://ai4r.rubyforge.org/

Editor's notes

  1. Now, I believe that as the Rails ecosystem grows and becomes older, end-to-end performance only becomes more important, because all of a sudden the projects are larger and more successful, and they’re feeling the pain of “scaling the Rails stack”.