GBM & Random Forest in H2O
Mark Landry
  
Presentation Outline
•  Algorithm Background
   o  Decision Trees
   o  Random Forest
   o  Gradient Boosted Machines (GBM)
•  H2O Implementations
   o  Code examples
   o  Description of parameters and general usage
  
Decision Trees: Concept
•  Separate the data according to a series of questions
   o  Age > 9.5?
•  The questions are found automatically to optimize separation of the data points by the “target”

Example decision tree: Predicting survival of Titanic passengers
Source: wikimedia CART tree Titanic survivors

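To make "the questions are found automatically" concrete, here is a minimal sketch of a split search on one numeric feature. It assumes Gini impurity as the separation measure and uses made-up toy data; the function names are hypothetical and this is not the code behind the Titanic example.

```python
from typing import List, Tuple

def gini(labels: List[int]) -> float:
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(values: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (threshold, weighted impurity) of the best 'value > threshold?' question."""
    distinct = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2.0 for a, b in zip(distinct, distinct[1:])]
    best = (float("nan"), gini(labels))  # baseline: asking no question at all
    for t in candidates:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical toy data (not the Titanic data): ages and a 0/1 target.
ages = [2.0, 4.0, 8.0, 9.0, 10.0, 25.0, 40.0, 60.0]
survived = [1, 1, 1, 1, 0, 0, 0, 0]
threshold, impurity = best_split(ages, survived)
print(f"Age > {threshold}?  weighted impurity after split: {impurity}")
```

A full tree repeats this search over every feature and then recurses into each side of the chosen split.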
Decision Trees: Practical Use
Strengths
•  Nonlinear
•  Robust to correlated features
•  Robust to feature distributions
•  Robust to missing values
•  Simple to comprehend
•  Fast to train
•  Fast to score
Weaknesses
•  Poor accuracy
•  Cannot project (extrapolate beyond the range of the training data)
•  Inefficiently fits linear relationships

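The "cannot project" weakness can be seen with a small regression tree. The sketch below is an assumption-laden illustration (a hand-rolled tree on a hypothetical linear data set, y = 2x), not any library's implementation: every input beyond the largest training value lands in the same leaf and receives the same constant prediction.

```python
from typing import List

def sse(ys: List[float]) -> float:
    """Sum of squared errors around the mean."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def fit_tree(xs: List[float], ys: List[float], depth: int):
    """Return a leaf (float mean) or a node (threshold, left_subtree, right_subtree)."""
    distinct = sorted(set(xs))
    if depth == 0 or len(distinct) < 2:
        return sum(ys) / len(ys)
    best_t, best_err = None, sse(ys)
    for a, b in zip(distinct, distinct[1:]):
        t = (a + b) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        err = sse(left) + sse(right)
        if err < best_err:
            best_t, best_err = t, err
    if best_t is None:  # no split improves the fit
        return sum(ys) / len(ys)
    left_pairs = [(x, y) for x, y in zip(xs, ys) if x <= best_t]
    right_pairs = [(x, y) for x, y in zip(xs, ys) if x > best_t]
    return (
        best_t,
        fit_tree([x for x, _ in left_pairs], [y for _, y in left_pairs], depth - 1),
        fit_tree([x for x, _ in right_pairs], [y for _, y in right_pairs], depth - 1),
    )

def predict(tree, x: float) -> float:
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree

# Hypothetical data: a perfectly linear relationship, y = 2x for x in 0..9.
train_x = [float(i) for i in range(10)]
train_y = [2.0 * x for x in train_x]
tree = fit_tree(train_x, train_y, depth=2)

# Far outside the training range the tree repeats its right-most leaf value.
print(predict(tree, 9.0), predict(tree, 100.0))
```

The same example hints at the last weakness: a straight line that a linear model captures with two coefficients needs many leaves to approximate with constant steps.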
Improved Decision Trees: Ensembles
Random Forest
•  Bootstrap aggregation (bagging)
•  Fit many trees against different samples of the data and average together
GBM
•  Boosting
•  Fits consecutive trees where each solves for the net error of the prior trees

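As a rough illustration of bagging, the sketch below fits one-split regression trees ("stumps") to bootstrap samples and averages their predictions. The stump learner, the function names, and the toy data are all assumptions made for the example; real implementations use deeper trees.

```python
import random
from typing import List, Tuple

Stump = Tuple[float, float, float]  # (threshold, mean if x <= threshold, mean otherwise)

def mean(values: List[float]) -> float:
    return sum(values) / len(values)

def fit_stump(xs: List[float], ys: List[float]) -> Stump:
    """One-split regression tree chosen to minimize squared error."""
    overall = mean(ys)
    best: Stump = (float("inf"), overall, overall)  # no split: everything goes left
    best_err = sum((y - overall) ** 2 for y in ys)
    distinct = sorted(set(xs))
    for a, b in zip(distinct, distinct[1:]):
        t = (a + b) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        ml, mr = mean(left), mean(right)
        err = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if err < best_err:
            best, best_err = (t, ml, mr), err
    return best

def predict_stump(stump: Stump, x: float) -> float:
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def bagged_stumps(xs: List[float], ys: List[float], n_trees: int, seed: int = 0) -> List[Stump]:
    """Fit each stump to a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps

def predict_bagged(stumps: List[Stump], x: float) -> float:
    """Average the predictions of all stumps."""
    return mean([predict_stump(s, x) for s in stumps])

# Hypothetical toy data: y = x on 0..19.
xs = [float(i) for i in range(20)]
ys = list(xs)
stumps = bagged_stumps(xs, ys, n_trees=25, seed=0)
print(predict_bagged(stumps, 0.0), predict_bagged(stumps, 19.0))
```

Because each bootstrap sample puts the split in a slightly different place, the averaged prediction is smoother than any single stump.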
Random Forest
Conceptual
•  Combine multiple decision trees, each fit to a random sample of the original data
•  Randomly samples
   o  Rows
   o  Columns
•  Reduce variance, with minimal increase in bias
Practical
•  Strengths
   o  Easy to use
      •  Few parameters
      •  Well-established default values for parameters
   o  Robust
   o  Competitive accuracy on most data sets
•  Weaknesses
   o  Slow to score
   o  Lack of transparency

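The row and column sampling can be sketched as follows. This is a simplified illustration with hypothetical names: rows are drawn with replacement, and a column subset is drawn once per tree for brevity, whereas many Random Forest implementations draw a fresh column subset at every split.

```python
import random
from typing import List, Tuple

def draw_tree_sample(
    n_rows: int, n_cols: int, cols_per_tree: int, rng: random.Random
) -> Tuple[List[int], List[int]]:
    """Pick the row and column indices one tree will be trained on."""
    rows = [rng.randrange(n_rows) for _ in range(n_rows)]    # with replacement
    cols = sorted(rng.sample(range(n_cols), cols_per_tree))  # without replacement
    return rows, cols

rng = random.Random(42)
# Hypothetical data set shape: 1000 rows, 12 columns, 4 columns per tree, 5 trees.
samples = [draw_tree_sample(1000, 12, 4, rng) for _ in range(5)]

# Rows never drawn for a tree are "out of bag" for that tree; with a full-size
# bootstrap sample this is about 1/e (roughly 37%) of the rows.
first_rows, first_cols = samples[0]
oob_fraction = 1.0 - len(set(first_rows)) / 1000
print(first_cols, round(oob_fraction, 3))
```

Since every tree sees different rows and columns, the trees' errors are less correlated, which is what lets averaging reduce variance.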
Gradient Boosted Machines (GBM)
Conceptual
•  Boosting: ensemble of weak learners*
•  Fits consecutive trees where each solves for the net loss of the prior trees
•  Results of new trees are applied partially to the entire solution
Practical
•  Strengths
   o  Often best possible model
   o  Robust
   o  Directly optimizes cost function
•  Weaknesses
   o  Overfits
      •  Need to find proper stopping point
   o  Sensitive to noise and extreme values
   o  Several hyper-parameters
   o  Lack of transparency
* The notion of “weak” is being challenged in practice

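The boosting loop can be sketched in a few lines. The example below assumes squared-error loss, for which "the net loss of the prior trees" reduces to the current residuals, and it uses one-split stumps as weak learners. The `learn_rate` factor is what applies each new tree only partially. All names and data are hypothetical.

```python
from typing import List, Tuple

Stump = Tuple[float, float, float]  # (threshold, mean if x <= threshold, mean otherwise)

def mean(values: List[float]) -> float:
    return sum(values) / len(values)

def fit_stump(xs: List[float], ys: List[float]) -> Stump:
    """One-split regression tree chosen to minimize squared error."""
    overall = mean(ys)
    best: Stump = (float("inf"), overall, overall)  # no split: everything goes left
    best_err = sum((y - overall) ** 2 for y in ys)
    distinct = sorted(set(xs))
    for a, b in zip(distinct, distinct[1:]):
        t = (a + b) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        ml, mr = mean(left), mean(right)
        err = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if err < best_err:
            best, best_err = (t, ml, mr), err
    return best

def predict_stump(stump: Stump, x: float) -> float:
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def boost(xs: List[float], ys: List[float], n_trees: int, learn_rate: float):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = mean(ys)
    preds = [base] * len(xs)
    stumps: List[Stump] = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Apply the new tree only partially (shrinkage).
        preds = [p + learn_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, stumps

def predict(base: float, stumps: List[Stump], learn_rate: float, x: float) -> float:
    return base + learn_rate * sum(predict_stump(s, x) for s in stumps)

def training_mse(base, stumps, learn_rate, xs, ys) -> float:
    return mean([(y - predict(base, stumps, learn_rate, x)) ** 2 for x, y in zip(xs, ys)])

# Hypothetical toy data: y = x^2 on 0..9.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]
base, stumps = boost(xs, ys, n_trees=50, learn_rate=0.1)
for k in (0, 1, 50):
    print(k, "trees:", round(training_mse(base, stumps[:k], 0.1, xs, ys), 3))
```

Training error keeps falling as trees are added, which is why the slide lists overfitting and the need for a stopping point: in practice the number of trees is chosen by watching error on held-out data.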
Trees in H2O
•  Individual tree fitting is performed in parallel
•  Shared histograms calculate cut-points
•  Greedy search of histogram bins, optimizing squared error
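The histogram idea can be sketched as below. This is not H2O's actual code, only an illustration under simple assumptions (equal-width bins, one numeric feature, hypothetical names): each bin stores a count, a sum, and a sum of squares of the target, and the greedy search evaluates only bin boundaries as candidate cut-points.

```python
from typing import List, Optional, Tuple

def histogram_split(
    xs: List[float], ys: List[float], n_bins: int
) -> Tuple[Optional[float], float]:
    """Return (cut_point, squared_error) of the best bin-boundary split.

    Rows with x < cut_point go left. cut_point is None if no boundary helps.
    """
    lo, hi = min(xs), max(xs)
    total_n = len(ys)
    total_s = sum(ys)
    total_q = sum(y * y for y in ys)
    no_split_err = total_q - total_s ** 2 / total_n
    if hi == lo:
        return None, no_split_err

    # Build the histogram: per-bin count, sum of y, and sum of y^2.
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    sums = [0.0] * n_bins
    squares = [0.0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int((x - lo) / width), n_bins - 1)  # the maximum falls in the last bin
        counts[b] += 1
        sums[b] += y
        squares[b] += y * y

    # Greedy scan over bin boundaries using running totals for the left side.
    best_cut, best_err = None, no_split_err
    n_left, s_left, q_left = 0, 0.0, 0.0
    for b in range(n_bins - 1):
        n_left += counts[b]
        s_left += sums[b]
        q_left += squares[b]
        n_right = total_n - n_left
        if n_left == 0 or n_right == 0:
            continue
        err_left = q_left - s_left ** 2 / n_left
        err_right = (total_q - q_left) - (total_s - s_left) ** 2 / n_right
        if err_left + err_right < best_err:
            best_cut, best_err = lo + (b + 1) * width, err_left + err_right
    return best_cut, best_err

# Hypothetical toy data: the target jumps from 0 to 10 at x = 6.
xs = [float(i) for i in range(10)]
ys = [0.0 if x < 6 else 10.0 for x in xs]
cut, err = histogram_split(xs, ys, n_bins=3)
print(cut, err)
```

Because the per-bin counts and sums are additive, histograms built on separate chunks of the data can be merged by element-wise addition, which is one reason this representation suits parallel training.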
  
Explore Further through Examples
•  I have H2O installed
•  I have R installed
•  I have the H2O World data sets

H2O World - GBM and Random Forest in H2O - Mark Landry
