Fast Perceptron Decision Tree Learning
from Evolving Data Streams
Albert Bifet, Geoff Holmes, Bernhard Pfahringer, and Eibe Frank
University of Waikato
Hamilton, New Zealand
Hyderabad, 23 June 2010
14th Pacific-Asia Conference on Knowledge Discovery and Data Mining
(PAKDD’10)
Motivation
RAM-Hours
Time and memory in one measure
Hoeffding Decision Trees with Perceptron learners at the leaves
Improve performance of classification methods for data streams
2 / 28
Outline
1 RAM-Hours
2 Perceptron Decision Tree Learning
3 Empirical evaluation
3 / 28
Mining Massive Data
2007
Digital Universe: 281 exabytes (billion gigabytes)
The amount of information created exceeded available storage for the first time
Web 2.0
106 million registered users
600 million search queries per day
3 billion requests a day via its API.
4 / 28
Green Computing
Green Computing
Study and practice of using computing resources efficiently.
Algorithmic Efficiency
A main approach of Green Computing
Data Streams
Fast methods that do not store the whole dataset in memory
5 / 28
Data stream classification cycle
1 Process an example at a time, and inspect it only once (at most)
2 Use a limited amount of memory
3 Work in a limited amount of time
4 Be ready to predict at any point
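The four requirements above can be illustrated with a small test-then-train loop. This is a minimal sketch and not MOA's API: the `MajorityClass` learner and the `run_stream` helper are hypothetical names, and the majority-class learner merely stands in for any incremental classifier.

```python
from collections import Counter
from typing import Hashable, Iterable, Optional, Tuple


class MajorityClass:
    """Hypothetical incremental learner; its memory is bounded by the number of classes."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def predict(self, x) -> Optional[Hashable]:
        # Requirement 4: a prediction is available at any point, even before training.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y: Hashable) -> None:
        # Requirements 2 and 3: constant work and constant extra memory per example.
        self.counts[y] += 1


def run_stream(stream: Iterable[Tuple[object, Hashable]], learner: MajorityClass) -> float:
    """Test-then-train: each example is first used for testing, then for learning, once."""
    correct = total = 0
    for x, y in stream:  # Requirement 1: one example at a time, inspected only once
        if learner.predict(x) == y:
            correct += 1
        learner.learn(x, y)
        total += 1
    return correct / total if total else 0.0


stream = [((0,), "a"), ((1,), "a"), ((2,), "b"), ((3,), "a")]
accuracy = run_stream(iter(stream), MajorityClass())
```

Because the stream is consumed through an iterator, nothing in the loop can go back to an earlier example, which is the point of the "inspect it only once" requirement.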
6 / 28
Mining Massive Data
Koichi Kawana: “Simplicity means the achievement of maximum effect with minimum means.”
[Diagram: Data Streams, with the three dimensions time, accuracy and memory]
7 / 28
Evaluation Example
              Accuracy  Time  Memory
Classifier A  70%       100   20
Classifier B  80%       20    40
Which classifier is performing better?
8 / 28
RAM-Hours
RAM-Hour
One RAM-Hour: 1 GB of RAM deployed for 1 hour
Cloud Computing Rental Cost Options
9 / 28
Evaluation Example
              Accuracy  Time  Memory  RAM-Hours
Classifier A  70%       100   20      2,000
Classifier B  80%       20    40      800
Which classifier is performing better?
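The RAM-Hours column in this example is the product of the Time and Memory columns. The sketch below reproduces it, assuming (the slide does not state units) that Time is in hours and Memory in GB; `ram_hours` is a hypothetical helper name.

```python
def ram_hours(time_hours: float, memory_gb: float) -> float:
    """RAM-Hours: gigabytes of RAM deployed, multiplied by the hours they are deployed."""
    return time_hours * memory_gb


# (accuracy, time, memory) as given in the example; the units are an assumption.
classifiers = {
    "Classifier A": (0.70, 100, 20),
    "Classifier B": (0.80, 20, 40),
}

costs = {name: ram_hours(t, m) for name, (_, t, m) in classifiers.items()}
cheapest = min(costs, key=costs.get)
```

With a single cost figure the comparison becomes direct: Classifier B is both more accurate and cheaper in RAM-Hours, even though it uses twice the memory.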
10 / 28
Outline
1 RAM-Hours
2 Perceptron Decision Tree Learning
3 Empirical evaluation
11 / 28
Hoeffding Trees
Hoeffding Tree: VFDT
Pedro Domingos and Geoff Hulten. Mining high-speed data streams, 2000.
With high probability, constructs a model identical to the one a traditional (greedy) method would learn
With theoretical guarantees on the error rate
[Figure: example decision tree with test nodes Time (branches Day / Night) and Contains “Money” (branches Yes / No), and leaves YES / NO]
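The "high probability" guarantee rests on the Hoeffding bound, which the slide does not spell out. As a hedged sketch: after n observations of a quantity with range R, the observed mean is within ε = sqrt(R² ln(1/δ) / (2n)) of the true mean with probability at least 1 − δ, and a leaf is split once the gap between the two best split candidates exceeds ε. The function names and the numbers below are illustrative assumptions, and refinements such as tie-breaking are left out.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Epsilon such that the observed mean of n samples is within epsilon of the
    true mean with probability at least 1 - delta."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split when the best attribute beats the runner-up by more than epsilon."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Illustrative values only: gain range 1.0 (two classes), delta = 1e-7.
eps_after_200 = hoeffding_bound(1.0, 1e-7, 200)
eps_after_1000 = hoeffding_bound(1.0, 1e-7, 1000)
```

The bound shrinks as more examples reach the leaf, so a gap of 0.15 between two candidate attributes is not enough to split after 200 examples but is enough after 1,000.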
12 / 28
Hoeffding Naive Bayes Tree
Hoeffding Tree
Majority Class learner at leaves
Hoeffding Naive Bayes Tree
G. Holmes, R. Kirkby, and B. Pfahringer. Stress-testing Hoeffding trees, 2005.
monitors accuracy of a Majority Class learner
monitors accuracy of a Naive Bayes learner
predicts using the most accurate method
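The monitoring scheme can be sketched as a leaf that scores every learner on each example before training on it, then predicts with whichever learner has been right most often. This is an illustrative sketch, not MOA's implementation: `AdaptiveLeaf` and `LookupLearner` are hypothetical names, and `LookupLearner` is only a simple stand-in for a Naive Bayes learner.

```python
from collections import Counter


class MajorityClass:
    """Predicts the most frequent class seen so far, ignoring the attributes."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn(self, x, y):
        self.counts[y] += 1


class LookupLearner:
    """Hypothetical stand-in for Naive Bayes: remembers the last label per attribute vector."""

    def __init__(self):
        self.table = {}

    def predict(self, x):
        return self.table.get(x)

    def learn(self, x, y):
        self.table[x] = y


class AdaptiveLeaf:
    """Keeps several learners and predicts with the one that has been most accurate."""

    def __init__(self, learners):
        self.learners = learners
        self.correct = {name: 0 for name in learners}

    def learn(self, x, y):
        for name, learner in self.learners.items():
            # Score the learner on the example before it is allowed to train on it.
            if learner.predict(x) == y:
                self.correct[name] += 1
            learner.learn(x, y)

    def predict(self, x):
        best = max(self.correct, key=self.correct.get)
        return self.learners[best].predict(x)


leaf = AdaptiveLeaf({"majority": MajorityClass(), "lookup": LookupLearner()})
for x, y in [((0,), "a"), ((1,), "b")] * 3:
    leaf.learn(x, y)
prediction = leaf.predict((1,))
```

The same structure extends to any number of learners at a leaf, since the leaf only needs a dictionary of objects exposing `predict` and `learn`.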
13 / 28
Perceptron
[Figure: perceptron with inputs Attribute 1 to Attribute 5, weights $w_1, \dots, w_5$, and output $h_w(x_i)$]
Data stream: $(x_i, y_i)$
Classical perceptron: $h_w(x_i) = \operatorname{sgn}(w^T x_i)$
Minimize mean-square error: $J(w) = \frac{1}{2} \sum_i (y_i - h_w(x_i))^2$
14 / 28
Perceptron
We use the sigmoid function $h_w = \sigma(w^T x)$, where
$\sigma(x) = 1/(1+e^{-x})$
$\sigma'(x) = \sigma(x)(1-\sigma(x))$
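The derivative identity for the sigmoid is what makes the later update rule cheap, since the gradient needs only the output already computed for prediction. A quick numerical check, as a sketch with hypothetical helper names, compares the closed form with a central finite difference:

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_prime(x: float) -> float:
    """Closed form: sigma'(x) = sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def numeric_derivative(f, x: float, h: float = 1e-6) -> float:
    """Central finite difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


points = (-3.0, -1.0, 0.0, 0.5, 2.0)
max_gap = max(abs(sigmoid_prime(x) - numeric_derivative(sigmoid, x)) for x in points)
```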
14 / 28
Perceptron
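Minimize mean-square error: $J(w) = \frac{1}{2} \sum_i (y_i - h_w(x_i))^2$
Stochastic gradient descent: $w = w - \eta \nabla J$, applied one example $x_i$ at a time
Gradient of the error function:
$\nabla J = -\sum_i (y_i - h_w(x_i)) \nabla h_w(x_i)$
$\nabla h_w(x_i) = h_w(x_i)(1 - h_w(x_i))\, x_i$
Weight update rule
$w = w + \eta \sum_i (y_i - h_w(x_i))\, h_w(x_i)(1 - h_w(x_i))\, x_i$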
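The step from the error function to the weight update rule uses the chain rule together with the identity $\sigma'(z) = \sigma(z)(1-\sigma(z))$. Written out in full under the slide's definitions, with the factor $x_i$ from differentiating $w^T x_i$ made explicit:

```latex
\begin{align*}
J(w) &= \tfrac{1}{2}\sum_i \bigl(y_i - h_w(x_i)\bigr)^2,
  \qquad h_w(x_i) = \sigma(w^T x_i) \\
\nabla_w h_w(x_i) &= \sigma'(w^T x_i)\, x_i
  = h_w(x_i)\bigl(1 - h_w(x_i)\bigr)\, x_i \\
\nabla_w J &= -\sum_i \bigl(y_i - h_w(x_i)\bigr)\, \nabla_w h_w(x_i) \\
w &\leftarrow w - \eta\, \nabla_w J
  = w + \eta \sum_i \bigl(y_i - h_w(x_i)\bigr)\, h_w(x_i)\bigl(1 - h_w(x_i)\bigr)\, x_i
\end{align*}
```

Descending the gradient means subtracting $\eta \nabla_w J$; the leading minus sign of $\nabla_w J$ is what turns this into an addition in the final rule.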
14 / 28
Perceptron
PERCEPTRON LEARNING(Stream, η)
    for each class
        do PERCEPTRON LEARNING(Stream, class, η)

PERCEPTRON LEARNING(Stream, class, η)
    ▷ Let w_0 and w be randomly initialized
    for each example (x, y) in Stream
        do if class = y
              then δ = (1 − h_w(x)) · h_w(x) · (1 − h_w(x))
              else δ = (0 − h_w(x)) · h_w(x) · (1 − h_w(x))
           w = w + η · δ · x

PERCEPTRON PREDICTION(x)
    return argmax_class h_{w_class}(x)
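The pseudocode above can be turned into a small runnable sketch. The assumptions are labelled: `StreamPerceptron` is a hypothetical class name, the initial weight range, learning rate and seed are arbitrary choices, and the per-class perceptrons are updated together on each example rather than in separate passes, which gives the same result because they do not share weights.

```python
import math
import random
from typing import Dict, Hashable, List, Sequence


def sigmoid(z: float) -> float:
    z = max(-60.0, min(60.0, z))  # clamp to keep exp() well inside float range
    return 1.0 / (1.0 + math.exp(-z))


class StreamPerceptron:
    """One sigmoid perceptron per class; prediction is the class with the largest output."""

    def __init__(self, classes: Sequence[Hashable], n_attributes: int,
                 eta: float = 0.5, seed: int = 1) -> None:
        rng = random.Random(seed)
        self.eta = eta
        # Index 0 holds the bias weight w_0; small random initial values are an assumption.
        self.weights: Dict[Hashable, List[float]] = {
            c: [rng.uniform(-0.05, 0.05) for _ in range(n_attributes + 1)]
            for c in classes
        }

    def _h(self, c: Hashable, x: Sequence[float]) -> float:
        w = self.weights[c]
        return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

    def learn(self, x: Sequence[float], y: Hashable) -> None:
        for c, w in self.weights.items():
            h = self._h(c, x)
            target = 1.0 if c == y else 0.0
            delta = (target - h) * h * (1.0 - h)
            w[0] += self.eta * delta              # bias input is fixed at 1
            for j, xj in enumerate(x, start=1):
                w[j] += self.eta * delta * xj     # w = w + eta * delta * x

    def predict(self, x: Sequence[float]) -> Hashable:
        return max(self.weights, key=lambda c: self._h(c, x))


model = StreamPerceptron(classes=["pos", "neg"], n_attributes=2)
for _ in range(200):
    model.learn((1.0, 0.0), "pos")
    model.learn((0.0, 1.0), "neg")
```

Each update costs time proportional to the number of classes times the number of attributes, and the model stores only one weight vector per class.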
15 / 28
Hybrid Hoeffding Trees
Hoeffding Naive Bayes Tree
Two learners at leaves: Naive Bayes and Majority Class
Hoeffding Perceptron Tree
Two learners at leaves: Perceptron and Majority Class
Hoeffding Naive Bayes Perceptron Tree
Three learners at leaves: Naive Bayes, Perceptron and Majority Class
16 / 28
Outline
1 RAM-Hours
2 Perceptron Decision Tree Learning
3 Empirical evaluation
17 / 28
What is MOA?
Massive Online Analysis (MOA) is a framework for online learning from data streams.
It is closely related to WEKA
It includes a collection of offline and online methods as well as tools for evaluation:
boosting and bagging
Hoeffding Trees with and without Naïve Bayes classifiers at the leaves.
18 / 28
What is MOA?
Easy to extend
Easy to design and run experiments
Philipp Kranen, Hardy Kremer, Timm Jansen, Thomas Seidl, Albert Bifet, Geoff Holmes, Bernhard Pfahringer (RWTH Aachen University, University of Waikato). Benchmarking Stream Clustering Algorithms within the MOA Framework. KDD 2010 Demo.
18 / 28
MOA: the bird
The Moa (another native NZ bird) is not only flightless, like the
Weka, but also extinct.
19 / 28
Concept Drift Framework
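[Figure: sigmoid function $f(t)$ plotted against $t$, with labels $t_0$, $W$, $\alpha$, 0.5 and 1]
Definition
Given two data streams $a$, $b$, we define $c = a \oplus^{W}_{t_0} b$ as the data stream built by joining the two data streams $a$ and $b$, where
$\Pr[c(t) = b(t)] = 1/(1+e^{-4(t-t_0)/W})$
$\Pr[c(t) = a(t)] = 1 - \Pr[c(t) = b(t)]$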
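The definition can be read as a sampling procedure: at each time step t, the joined stream takes its example from b with the sigmoid probability and from a otherwise. A minimal sketch, assuming streams are plain Python iterables and using hypothetical function names and an arbitrary seed:

```python
import math
import random


def drift_probability(t: float, t0: float, width: float) -> float:
    """Pr[c(t) = b(t)] = 1 / (1 + exp(-4 (t - t0) / W))."""
    z = -4.0 * (t - t0) / width
    z = min(z, 700.0)  # keep exp() within floating-point range long before the drift
    return 1.0 / (1.0 + math.exp(z))


def join_streams(a, b, t0: float, width: float, seed: int = 1):
    """Yield c(t): the example from b with the drift probability, else the one from a."""
    rng = random.Random(seed)
    for t, (from_a, from_b) in enumerate(zip(a, b)):
        yield from_b if rng.random() < drift_probability(t, t0, width) else from_a


# Toy streams whose examples simply name their source.
joined = list(join_streams(["a"] * 1000, ["b"] * 1000, t0=500, width=100))
```

At t = t0 the two sources are equally likely, and W controls how gradual the change is: far before t0 the joined stream is almost entirely a, far after it almost entirely b.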
20 / 28
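Concept Drift Framework
Example
$(((a \oplus^{W_0}_{t_0} b) \oplus^{W_1}_{t_1} c) \oplus^{W_2}_{t_2} d) \ldots$
$(((\mathrm{SEA}_9 \oplus^{W}_{t_0} \mathrm{SEA}_8) \oplus^{W}_{2t_0} \mathrm{SEA}_7) \oplus^{W}_{3t_0} \mathrm{SEA}_{9.5})$
$\mathrm{CovPokElec} = (\mathrm{CoverType} \oplus^{5,000}_{581,012} \mathrm{Poker}) \oplus^{5,000}_{1,000,000} \mathrm{ELEC2}$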
20 / 28
Empirical evaluation
Figure: Accuracy on dataset LED with three concept drifts.
[Plot: Accuracy (%) against Instances; series htnbp, htnb, htp and ht]
21 / 28
Empirical evaluation
Figure: Time on dataset LED with three concept drifts.
[Plot: Time (sec.) against Instances; series htnbp, htnb, htp and ht]
22 / 28
Empirical evaluation
Figure: Memory on dataset LED with three concept drifts.
[Plot: Memory (Mb) against Instances; series htnbp, htnb, htp and ht]
23 / 28
Empirical evaluation
Figure: RAM-Hours on dataset LED with three concept drifts.
[Plot: RAM-Hours against Instances; series htnbp, htnb, htp and ht]
24 / 28
Empirical evaluation: Cover Type Dataset
                      Accuracy    Time    Mem   RAM-Hours
Perceptron            81.68      12.21   0.05        1.00
Naïve Bayes           60.52      22.81   0.08        2.99
Hoeffding Tree        68.3       13.43   2.59       56.98
Trees
  Naïve Bayes HT      81.06      24.73   2.59      104.92
  Perceptron HT       83.59      16.53   3.46       93.68
  NB Perceptron HT    85.77      22.16   3.46      125.59
Bagging
  Naïve Bayes HT      85.73     165.75   0.8       217.20
  Perceptron HT       86.33      50.06   1.66      136.12
  NB Perceptron HT    87.88     115.58   1.25      236.65
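The RAM-Hours column here appears to be reported relative to the plain Perceptron, whose value is 1.00. That is an inference from the numbers rather than something the slide states, and the sketch below checks it by recomputing Time × Mem for each row and dividing by the Perceptron's product. The row labels are shortened for illustration.

```python
# (time, memory, reported RAM-Hours) copied from the Cover Type table.
rows = {
    "Perceptron": (12.21, 0.05, 1.00),
    "Naive Bayes": (22.81, 0.08, 2.99),
    "Hoeffding Tree": (13.43, 2.59, 56.98),
    "Tree: NB HT": (24.73, 2.59, 104.92),
    "Tree: Perceptron HT": (16.53, 3.46, 93.68),
    "Tree: NB Perceptron HT": (22.16, 3.46, 125.59),
    "Bagging: NB HT": (165.75, 0.8, 217.20),
    "Bagging: Perceptron HT": (50.06, 1.66, 136.12),
    "Bagging: NB Perceptron HT": (115.58, 1.25, 236.65),
}

base_time, base_mem, _ = rows["Perceptron"]
baseline = base_time * base_mem

# Time x memory of each method, expressed as a multiple of the Perceptron's cost.
relative = {name: (t * m) / baseline for name, (t, m, _) in rows.items()}
max_gap = max(abs(relative[name] - reported) for name, (_, _, reported) in rows.items())
```

Every recomputed ratio agrees with the reported column to within rounding, so a value such as 125.59 can be read as "about 126 times the RAM-Hours of the single Perceptron".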
25 / 28
Empirical evaluation: Electricity Dataset
                      Accuracy    Time    Mem   RAM-Hours
Perceptron            79.07       0.53   0.01        1.00
Naïve Bayes           73.36       0.55   0.01        1.04
Hoeffding Tree        75.35       0.86   0.12       19.47
Trees
  Naïve Bayes HT      80.69       0.96   0.12       21.74
  Perceptron HT       84.24       0.93   0.21       36.85
  NB Perceptron HT    84.34       1.07   0.21       42.40
Bagging
  Naïve Bayes HT      84.36       3.17   0.13       77.75
  Perceptron HT       85.22       2.59   0.44      215.02
  NB Perceptron HT    86.44       3.55   0.3       200.94
26 / 28
Summary
http://moa.cs.waikato.ac.nz/
Summary
Sensor Networks
use Perceptron
Handheld Computers
use Hoeffding Naive Bayes Perceptron Tree
Servers
use Bagging Hoeffding Naive Bayes Perceptron Tree
27 / 28
Summary
http://moa.cs.waikato.ac.nz/
Conclusions
RAM-Hours as a new measure of time and memory
Hoeffding Perceptron Tree
Hoeffding Naive Bayes Perceptron Tree
Future Work
Adaptive learning rate for the Perceptron.
28 / 28

Contenu connexe

Tendances

Albert Bifet – Apache Samoa: Mining Big Data Streams with Apache Flink
Albert Bifet – Apache Samoa: Mining Big Data Streams with Apache FlinkAlbert Bifet – Apache Samoa: Mining Big Data Streams with Apache Flink
Albert Bifet – Apache Samoa: Mining Big Data Streams with Apache Flink
Flink Forward
 
Visualization-Driven Data Aggregation
Visualization-Driven Data AggregationVisualization-Driven Data Aggregation
Visualization-Driven Data Aggregation
Zbigniew Jerzak
 
Scipy 2011 Time Series Analysis in Python
Scipy 2011 Time Series Analysis in PythonScipy 2011 Time Series Analysis in Python
Scipy 2011 Time Series Analysis in Python
Wes McKinney
 
ESWC 2013: A Systematic Investigation of Explicit and Implicit Schema Informa...
ESWC 2013: A Systematic Investigation of Explicit and Implicit Schema Informa...ESWC 2013: A Systematic Investigation of Explicit and Implicit Schema Informa...
ESWC 2013: A Systematic Investigation of Explicit and Implicit Schema Informa...
Thomas Gottron
 

Tendances (20)

A Short Course in Data Stream Mining
A Short Course in Data Stream MiningA Short Course in Data Stream Mining
A Short Course in Data Stream Mining
 
Artificial intelligence and data stream mining
Artificial intelligence and data stream miningArtificial intelligence and data stream mining
Artificial intelligence and data stream mining
 
Streaming Algorithms
Streaming AlgorithmsStreaming Algorithms
Streaming Algorithms
 
Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014
 
Mining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDTMining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDT
 
Mining big data streams with APACHE SAMOA by Albert Bifet
Mining big data streams with APACHE SAMOA by Albert BifetMining big data streams with APACHE SAMOA by Albert Bifet
Mining big data streams with APACHE SAMOA by Albert Bifet
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOA
 
Albert Bifet – Apache Samoa: Mining Big Data Streams with Apache Flink
Albert Bifet – Apache Samoa: Mining Big Data Streams with Apache FlinkAlbert Bifet – Apache Samoa: Mining Big Data Streams with Apache Flink
Albert Bifet – Apache Samoa: Mining Big Data Streams with Apache Flink
 
5.1 mining data streams
5.1 mining data streams5.1 mining data streams
5.1 mining data streams
 
Visualization-Driven Data Aggregation
Visualization-Driven Data AggregationVisualization-Driven Data Aggregation
Visualization-Driven Data Aggregation
 
ReComp: challenges in selective recomputation of (expensive) data analytics t...
ReComp: challenges in selective recomputation of (expensive) data analytics t...ReComp: challenges in selective recomputation of (expensive) data analytics t...
ReComp: challenges in selective recomputation of (expensive) data analytics t...
 
Automatic Features Generation And Model Training On Spark: A Bayesian Approach
Automatic Features Generation And Model Training On Spark: A Bayesian ApproachAutomatic Features Generation And Model Training On Spark: A Bayesian Approach
Automatic Features Generation And Model Training On Spark: A Bayesian Approach
 
Data streaming algorithms
Data streaming algorithmsData streaming algorithms
Data streaming algorithms
 
Anomaly detection
Anomaly detectionAnomaly detection
Anomaly detection
 
Metric based meta_learning
Metric based meta_learningMetric based meta_learning
Metric based meta_learning
 
Formalization and Preliminary Evaluation of a Pipeline for Text Extraction Fr...
Formalization and Preliminary Evaluation of a Pipeline for Text Extraction Fr...Formalization and Preliminary Evaluation of a Pipeline for Text Extraction Fr...
Formalization and Preliminary Evaluation of a Pipeline for Text Extraction Fr...
 
Scipy 2011 Time Series Analysis in Python
Scipy 2011 Time Series Analysis in PythonScipy 2011 Time Series Analysis in Python
Scipy 2011 Time Series Analysis in Python
 
Surface-related multiple elimination through orthogonal encoding in the laten...
Surface-related multiple elimination through orthogonal encoding in the laten...Surface-related multiple elimination through orthogonal encoding in the laten...
Surface-related multiple elimination through orthogonal encoding in the laten...
 
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...
 
ESWC 2013: A Systematic Investigation of Explicit and Implicit Schema Informa...
ESWC 2013: A Systematic Investigation of Explicit and Implicit Schema Informa...ESWC 2013: A Systematic Investigation of Explicit and Implicit Schema Informa...
ESWC 2013: A Systematic Investigation of Explicit and Implicit Schema Informa...
 

En vedette

Mitchell's Face Recognition
Mitchell's Face RecognitionMitchell's Face Recognition
Mitchell's Face Recognition
butest
 
nural network ER. Abhishek k. upadhyay
nural network ER. Abhishek  k. upadhyaynural network ER. Abhishek  k. upadhyay
nural network ER. Abhishek k. upadhyay
abhishek upadhyay
 
Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...
Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...
Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...
Indraneel Pole
 

En vedette (20)

Mitchell's Face Recognition
Mitchell's Face RecognitionMitchell's Face Recognition
Mitchell's Face Recognition
 
Create a MLP
Create a MLPCreate a MLP
Create a MLP
 
Multilayer Perceptron Backpropagation Hagan
Multilayer Perceptron Backpropagation HaganMultilayer Perceptron Backpropagation Hagan
Multilayer Perceptron Backpropagation Hagan
 
Perceptron
PerceptronPerceptron
Perceptron
 
restrictedboltzmannmachines
restrictedboltzmannmachinesrestrictedboltzmannmachines
restrictedboltzmannmachines
 
Multi Layer Perceptron & Back Propagation
Multi Layer Perceptron & Back PropagationMulti Layer Perceptron & Back Propagation
Multi Layer Perceptron & Back Propagation
 
Machine learning in R
Machine learning in RMachine learning in R
Machine learning in R
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
Neural Networks and Deep Learning
Neural Networks and Deep LearningNeural Networks and Deep Learning
Neural Networks and Deep Learning
 
Procedural modeling using autoencoder networks
Procedural modeling using autoencoder networksProcedural modeling using autoencoder networks
Procedural modeling using autoencoder networks
 
nural network ER. Abhishek k. upadhyay
nural network ER. Abhishek  k. upadhyaynural network ER. Abhishek  k. upadhyay
nural network ER. Abhishek k. upadhyay
 
Neural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronNeural Networks: Multilayer Perceptron
Neural Networks: Multilayer Perceptron
 
Neural Networks: Rosenblatt's Perceptron
Neural Networks: Rosenblatt's PerceptronNeural Networks: Rosenblatt's Perceptron
Neural Networks: Rosenblatt's Perceptron
 
Lecture 9 Perceptron
Lecture 9 PerceptronLecture 9 Perceptron
Lecture 9 Perceptron
 
Variational autoencoder talk
Variational autoencoder talkVariational autoencoder talk
Variational autoencoder talk
 
Perceptron (neural network)
Perceptron (neural network)Perceptron (neural network)
Perceptron (neural network)
 
Deep Belief Networks
Deep Belief NetworksDeep Belief Networks
Deep Belief Networks
 
Learning RBM(Restricted Boltzmann Machine in Practice)
Learning RBM(Restricted Boltzmann Machine in Practice)Learning RBM(Restricted Boltzmann Machine in Practice)
Learning RBM(Restricted Boltzmann Machine in Practice)
 
Autoencoders for image_classification
Autoencoders for image_classificationAutoencoders for image_classification
Autoencoders for image_classification
 
Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...
Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...
Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...
 

Similaire à Fast Perceptron Decision Tree Learning from Evolving Data Streams

kantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.pptkantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.ppt
butest
 
Evaluating Classification Algorithms Applied To Data Streams Esteban Donato
Evaluating Classification Algorithms Applied To Data Streams   Esteban DonatoEvaluating Classification Algorithms Applied To Data Streams   Esteban Donato
Evaluating Classification Algorithms Applied To Data Streams Esteban Donato
Esteban Donato
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
butest
 
Advances in Bayesian Learning
Advances in Bayesian LearningAdvances in Bayesian Learning
Advances in Bayesian Learning
butest
 
Multi-Layer Perceptrons
Multi-Layer PerceptronsMulti-Layer Perceptrons
Multi-Layer Perceptrons
ESCOM
 
32_Nov07_MachineLear..
32_Nov07_MachineLear..32_Nov07_MachineLear..
32_Nov07_MachineLear..
butest
 
An Online Spark Pipeline: Semi-Supervised Learning and Automatic Retraining w...
An Online Spark Pipeline: Semi-Supervised Learning and Automatic Retraining w...An Online Spark Pipeline: Semi-Supervised Learning and Automatic Retraining w...
An Online Spark Pipeline: Semi-Supervised Learning and Automatic Retraining w...
Databricks
 

Similaire à Fast Perceptron Decision Tree Learning from Evolving Data Streams (20)

kantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.pptkantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.ppt
 
Evaluating Classification Algorithms Applied To Data Streams Esteban Donato
Evaluating Classification Algorithms Applied To Data Streams   Esteban DonatoEvaluating Classification Algorithms Applied To Data Streams   Esteban Donato
Evaluating Classification Algorithms Applied To Data Streams Esteban Donato
 
2015 illinois-talk
2015 illinois-talk2015 illinois-talk
2015 illinois-talk
 
Data mining
Data mining Data mining
Data mining
 
Slides barcelona risk data
Slides barcelona risk dataSlides barcelona risk data
Slides barcelona risk data
 
Computation and Knowledge
Computation and KnowledgeComputation and Knowledge
Computation and Knowledge
 
eScience: A Transformed Scientific Method
eScience: A Transformed Scientific MethodeScience: A Transformed Scientific Method
eScience: A Transformed Scientific Method
 
Scaling-up collections digitisation
Scaling-up collections digitisationScaling-up collections digitisation
Scaling-up collections digitisation
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Advances in Bayesian Learning
Advances in Bayesian LearningAdvances in Bayesian Learning
Advances in Bayesian Learning
 
A genetic algorithm coupled with tree-based pruning for mining closed associa...
A genetic algorithm coupled with tree-based pruning for mining closed associa...A genetic algorithm coupled with tree-based pruning for mining closed associa...
A genetic algorithm coupled with tree-based pruning for mining closed associa...
 
Big data at experimental facilities
Big data at experimental facilitiesBig data at experimental facilities
Big data at experimental facilities
 
2015 genome-center
2015 genome-center2015 genome-center
2015 genome-center
 
Multi-Layer Perceptrons
Multi-Layer PerceptronsMulti-Layer Perceptrons
Multi-Layer Perceptrons
 
HKU Data Curation MLIM7350 Class 8
HKU Data Curation MLIM7350 Class 8HKU Data Curation MLIM7350 Class 8
HKU Data Curation MLIM7350 Class 8
 
18 Data Streams
18 Data Streams18 Data Streams
18 Data Streams
 
A New Partnership for Cross-Scale, Cross-Domain eScience
A New Partnership for Cross-Scale, Cross-Domain eScienceA New Partnership for Cross-Scale, Cross-Domain eScience
A New Partnership for Cross-Scale, Cross-Domain eScience
 
32_Nov07_MachineLear..
32_Nov07_MachineLear..32_Nov07_MachineLear..
32_Nov07_MachineLear..
 
An Online Spark Pipeline: Semi-Supervised Learning and Automatic Retraining w...
An Online Spark Pipeline: Semi-Supervised Learning and Automatic Retraining w...An Online Spark Pipeline: Semi-Supervised Learning and Automatic Retraining w...
An Online Spark Pipeline: Semi-Supervised Learning and Automatic Retraining w...
 
Braintalk cuso nm
Braintalk cuso nmBraintalk cuso nm
Braintalk cuso nm
 

Plus de Albert Bifet

Apache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache FlinkApache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache Flink
Albert Bifet
 
Multi-label Classification with Meta-labels
Multi-label Classification with Meta-labelsMulti-label Classification with Meta-labels
Multi-label Classification with Meta-labels
Albert Bifet
 
Métodos Adaptativos de Minería de Datos y Aprendizaje para Flujos de Datos.
Métodos Adaptativos de Minería de Datos y Aprendizaje para Flujos de Datos.Métodos Adaptativos de Minería de Datos y Aprendizaje para Flujos de Datos.
Métodos Adaptativos de Minería de Datos y Aprendizaje para Flujos de Datos.
Albert Bifet
 

Plus de Albert Bifet (19)

Apache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache FlinkApache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache Flink
 
Introduction to Big Data Science
Introduction to Big Data ScienceIntroduction to Big Data Science
Introduction to Big Data Science
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Internet of Things Data Science
Internet of Things Data ScienceInternet of Things Data Science
Internet of Things Data Science
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data Management
 
Real-Time Big Data Stream Analytics
Real-Time Big Data Stream AnalyticsReal-Time Big Data Stream Analytics
Real-Time Big Data Stream Analytics
 
Multi-label Classification with Meta-labels
Multi-label Classification with Meta-labelsMulti-label Classification with Meta-labels
Multi-label Classification with Meta-labels
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
 
Mining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsMining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data Streams
 
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and SolutionsPAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
 
MOA : Massive Online Analysis
MOA : Massive Online AnalysisMOA : Massive Online Analysis
MOA : Massive Online Analysis
 
New ensemble methods for evolving data streams
New ensemble methods for evolving data streamsNew ensemble methods for evolving data streams
New ensemble methods for evolving data streams
 
Métodos Adaptativos de Minería de Datos y Aprendizaje para Flujos de Datos.
Métodos Adaptativos de Minería de Datos y Aprendizaje para Flujos de Datos.Métodos Adaptativos de Minería de Datos y Aprendizaje para Flujos de Datos.
Métodos Adaptativos de Minería de Datos y Aprendizaje para Flujos de Datos.
 
Adaptive XML Tree Mining on Evolving Data Streams
Adaptive XML Tree Mining on Evolving Data StreamsAdaptive XML Tree Mining on Evolving Data Streams
Adaptive XML Tree Mining on Evolving Data Streams
 
Adaptive Learning and Mining for Data Streams and Frequent Patterns
Adaptive Learning and Mining for Data Streams and Frequent PatternsAdaptive Learning and Mining for Data Streams and Frequent Patterns
Adaptive Learning and Mining for Data Streams and Frequent Patterns
 
Mining Adaptively Frequent Closed Unlabeled Rooted Trees in Data Streams
Mining Adaptively Frequent Closed Unlabeled Rooted Trees in Data StreamsMining Adaptively Frequent Closed Unlabeled Rooted Trees in Data Streams
Mining Adaptively Frequent Closed Unlabeled Rooted Trees in Data Streams
 
Mining Implications from Lattices of Closed Trees
Mining Implications from Lattices of Closed TreesMining Implications from Lattices of Closed Trees
Mining Implications from Lattices of Closed Trees
 
Kalman Filters and Adaptive Windows for Learning in Data Streams
Kalman Filters and Adaptive Windows for Learning in Data StreamsKalman Filters and Adaptive Windows for Learning in Data Streams
Kalman Filters and Adaptive Windows for Learning in Data Streams
 

Dernier

Dernier (20)

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 

Fast Perceptron Decision Tree Learning from Evolving Data Streams

  • 1. Fast Perceptron Decision Tree Learning from Evolving Data Streams Albert Bifet, Geoff Holmes, Bernhard Pfahringer, and Eibe Frank University of Waikato Hamilton, New Zealand Hyderabad, 23 June 2010 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’10)
  • 2. Motivation RAM Hours Time and Memory in one measure Hoeffding Decision Trees with Perceptron Learners at leaves Improve performance of classification methods for data streams 2 / 28
  • 3. Outline 1 RAM-Hours 2 Perceptron Decision Tree Learning 3 Empirical evaluation 3 / 28
  • 4. Mining Massive Data 2007 Digital Universe: 281 exabytes (billion gigabytes) The amount of information created exceeded available storage for the first time Web 2.0 106 million registered users 600 million search queries per day 3 billion requests a day via its API. 4 / 28
  • 5. Green Computing Green Computing Study and practice of using computing resources efficiently. Algorithmic Efficiency A main approach of Green Computing Data Streams Fast methods without storing all dataset in memory 5 / 28
  • 6. Data stream classification cycle 1 Process an example at a time, and inspect it only once (at most) 2 Use a limited amount of memory 3 Work in a limited amount of time 4 Be ready to predict at any point 6 / 28
  • 7. Mining Massive Data Koichi Kawana Simplicity means the achievement of maximum effect with minimum means. time accuracy memory Data Streams 7 / 28
  • 8. Evaluation Example Accuracy Time Memory Classifier A 70% 100 20 Classifier B 80% 20 40 Which classifier is performing better? 8 / 28
  • 9. RAM-Hours RAM-Hour Every GB of RAM deployed for 1 hour Cloud Computing Rental Cost Options 9 / 28
  • 10. Evaluation Example Accuracy Time Memory RAM-Hours Classifier A 70% 100 20 2,000 Classifier B 80% 20 40 800 Which classifier is performing better? 10 / 28
  • 11. Outline: (1) RAM-Hours; (2) Perceptron Decision Tree Learning; (3) Empirical evaluation.
  • 12. Hoeffding Trees. Hoeffding Tree (VFDT): Pedro Domingos and Geoff Hulten. Mining high-speed data streams. 2000. With high probability, it constructs a model identical to the one a traditional (greedy) method would learn, with theoretical guarantees on the error rate. [Figure: example decision tree with tests on Time (Day / Night) and Contains “Money” (Yes / No), and leaves labeled YES or NO.]
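The slide does not spell out where the guarantee comes from. As background, a Hoeffding tree decides when to split using the standard Hoeffding bound: after n observations of a variable with range R, the true mean lies within ε = sqrt(R² ln(1/δ) / (2n)) of the observed mean with probability 1 − δ. The sketch below illustrates that test; the function names and the example numbers are assumptions for illustration, not taken from the slides.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Epsilon such that the observed mean of n samples is within epsilon
    of the true mean with probability at least 1 - delta."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n):
    """Split when the best attribute beats the runner-up by more than epsilon."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Hypothetical numbers: gains measured on 1,000 examples, delta = 1e-7.
print(should_split(0.30, 0.10, value_range=1.0, delta=1e-7, n=1000))
print(should_split(0.30, 0.25, value_range=1.0, delta=1e-7, n=1000))
```

Because ε shrinks as n grows, a leaf that cannot yet separate two close attributes simply waits for more examples.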
  • 13. Hoeffding Naive Bayes Tree. Hoeffding Tree: Majority Class learner at the leaves. Hoeffding Naive Bayes Tree (G. Holmes, R. Kirkby, and B. Pfahringer. Stress-testing Hoeffding trees, 2005): monitors the accuracy of a Majority Class learner; monitors the accuracy of a Naive Bayes learner; predicts using the most accurate method.
  • 14. Perceptron. [Figure: perceptron with inputs Attribute 1 to Attribute 5, weights $w_1, \dots, w_5$, and output $h_w(x_i)$.] Data stream: $\langle x_i, y_i \rangle$. Classical perceptron: $h_w(x_i) = \mathrm{sgn}(w^T x_i)$. Minimize the mean-square error: $J(w) = \frac{1}{2} \sum_i (y_i - h_w(x_i))^2$.
  • 15. Perceptron. [Figure: perceptron with inputs Attribute 1 to Attribute 5, weights $w_1, \dots, w_5$, and output $h_w(x_i)$.] We use the sigmoid function $h_w = \sigma(w^T x)$, where $\sigma(x) = 1/(1 + e^{-x})$ and $\sigma'(x) = \sigma(x)(1 - \sigma(x))$.
  • 16. Perceptron. Minimize the mean-square error: $J(w) = \frac{1}{2} \sum_i (y_i - h_w(x_i))^2$. Stochastic gradient descent: $w = w - \eta \nabla J$. Gradient of the error function: $\nabla J = -\sum_i (y_i - h_w(x_i)) \nabla h_w(x_i)$, with $\nabla h_w(x_i) = h_w(x_i)(1 - h_w(x_i))\, x_i$. Weight update rule: $w = w + \eta \sum_i (y_i - h_w(x_i))\, h_w(x_i)(1 - h_w(x_i))\, x_i$.
  • 17. Perceptron.
        PERCEPTRON LEARNING(Stream, η)
          for each class
            do PERCEPTRON LEARNING(Stream, class, η)

        PERCEPTRON LEARNING(Stream, class, η)
          ▷ Let w0 and w be randomly initialized
          for each example (x, y) in Stream
            do if class = y
                 then δ = (1 − h_w(x)) · h_w(x) · (1 − h_w(x))
                 else δ = (0 − h_w(x)) · h_w(x) · (1 − h_w(x))
               w = w + η · δ · x

        PERCEPTRON PREDICTION(x)
          return argmax_class h_{w_class}(x)
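The pseudocode above trains one sigmoid unit per class and predicts the class whose unit fires most strongly. A minimal runnable sketch of that scheme follows; the class name `OnlinePerceptron`, the method names, the initialization range, and the toy data are assumptions for illustration and are not the MOA implementation.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


class OnlinePerceptron:
    """One sigmoid unit per class, updated one example at a time."""

    def __init__(self, n_features, classes, eta=0.1, seed=0):
        rng = random.Random(seed)
        self.eta = eta
        # Index 0 of each weight vector is the bias weight w0.
        self.w = {c: [rng.uniform(-0.05, 0.05) for _ in range(n_features + 1)]
                  for c in classes}

    def _h(self, c, x):
        w = self.w[c]
        return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

    def learn_one(self, x, y):
        for c, w in self.w.items():
            h = self._h(c, x)
            target = 1.0 if c == y else 0.0
            delta = (target - h) * h * (1.0 - h)
            w[0] += self.eta * delta              # bias input is fixed at 1
            for j, xj in enumerate(x, start=1):
                w[j] += self.eta * delta * xj     # w = w + eta * delta * x

    def predict_one(self, x):
        return max(self.w, key=lambda c: self._h(c, x))


# Toy stream with two linearly separable classes (hypothetical data).
model = OnlinePerceptron(n_features=2, classes=["pos", "neg"], eta=0.5, seed=1)
for _ in range(500):
    model.learn_one([1.0, 0.0], "pos")
    model.learn_one([0.0, 1.0], "neg")
print(model.predict_one([1.0, 0.0]), model.predict_one([0.0, 1.0]))
```

Each example is seen once and then discarded, so memory use is fixed at one weight vector per class regardless of stream length.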
  • 18. Hybrid Hoeffding Trees. Hoeffding Naive Bayes Tree: two learners at the leaves (Naive Bayes and Majority Class). Hoeffding Perceptron Tree: two learners at the leaves (Perceptron and Majority Class). Hoeffding Naive Bayes Perceptron Tree: three learners at the leaves (Naive Bayes, Perceptron, and Majority Class).
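A hybrid leaf can be pictured as several learners running side by side, with the leaf answering through whichever has been most accurate so far. The sketch below illustrates that idea under stated assumptions: `HybridLeaf`, `MajorityClass`, and `SignMemo` are hypothetical names, and `SignMemo` is only a toy stand-in for a real Naive Bayes or Perceptron learner.

```python
class MajorityClass:
    """Predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = {}

    def learn_one(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1

    def predict_one(self, x):
        return max(self.counts, key=self.counts.get) if self.counts else None


class SignMemo:
    """Toy stand-in learner: remembers the last label seen for each sign of x[0]."""

    def __init__(self):
        self.memo = {}

    def learn_one(self, x, y):
        self.memo[x[0] > 0] = y

    def predict_one(self, x):
        return self.memo.get(x[0] > 0)


class HybridLeaf:
    """Tracks each learner's accuracy and predicts with the best one."""

    def __init__(self, learners):
        self.learners = learners
        self.correct = {name: 0 for name in learners}

    def learn_one(self, x, y):
        for name, learner in self.learners.items():
            if learner.predict_one(x) == y:   # test first, then train
                self.correct[name] += 1
            learner.learn_one(x, y)

    def predict_one(self, x):
        best = max(self.correct, key=self.correct.get)
        return self.learners[best].predict_one(x)


leaf = HybridLeaf({"majority": MajorityClass(), "memo": SignMemo()})
for _ in range(10):
    leaf.learn_one([1.0], "a")
    leaf.learn_one([-1.0], "b")
print(leaf.correct, leaf.predict_one([2.0]), leaf.predict_one([-3.0]))
```

Testing each learner on an example before training on it keeps the accuracy counts honest without holding out any data.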
  • 19. Outline: (1) RAM-Hours; (2) Perceptron Decision Tree Learning; (3) Empirical evaluation.
  • 20. What is MOA? Massive Online Analysis (MOA) is a framework for online learning from data streams. It is closely related to WEKA. It includes a collection of offline and online methods as well as tools for evaluation: boosting and bagging; Hoeffding Trees with and without Naïve Bayes classifiers at the leaves.
  • 21. What is MOA? Easy to extend. Easy to design and run experiments. Philipp Kranen, Hardy Kremer, Timm Jansen, Thomas Seidl, Albert Bifet, Geoff Holmes, Bernhard Pfahringer (RWTH Aachen University, University of Waikato). Benchmarking Stream Clustering Algorithms within the MOA Framework. KDD 2010 Demo.
  • 22. MOA: the bird. The Moa (another native NZ bird) is not only flightless, like the Weka, but also extinct.
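  • 25. Concept Drift Framework. [Figure: sigmoid $f(t)$ rising from 0 to 1, crossing 0.5 at $t_0$, with transition width $W$ and slope angle $\alpha$.] Definition: given two data streams $a$, $b$, we define $c = a \oplus^{W}_{t_0} b$ as the data stream built by joining the two data streams $a$ and $b$, where $\Pr[c(t) = b(t)] = 1/(1 + e^{-4(t - t_0)/W})$ and $\Pr[c(t) = a(t)] = 1 - \Pr[c(t) = b(t)]$.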
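  • 26. Concept Drift Framework. [Figure: sigmoid $f(t)$ rising from 0 to 1, crossing 0.5 at $t_0$, with transition width $W$ and slope angle $\alpha$.] Example: $(((a \oplus^{W_0}_{t_0} b) \oplus^{W_1}_{t_1} c) \oplus^{W_2}_{t_2} d) \dots$; $(((\mathrm{SEA}_9 \oplus^{W}_{t_0} \mathrm{SEA}_8) \oplus^{W}_{2t_0} \mathrm{SEA}_7) \oplus^{W}_{3t_0} \mathrm{SEA}_{9.5})$; $\mathrm{CovPokElec} = (\mathrm{CoverType} \oplus^{5{,}000}_{581{,}012} \mathrm{Poker}) \oplus^{5{,}000}_{1{,}000{,}000} \mathrm{ELEC2}$.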
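The join operator can be read as a coin flip at every time step, with the coin's bias following the sigmoid in the definition. A minimal sketch, assuming both streams are Python iterables consumed in lockstep and that time starts at t = 0; `drift_probability` and `join_streams` are hypothetical names, not MOA functions.

```python
import math
import random


def drift_probability(t, t0, width):
    """Pr[c(t) = b(t)] = 1 / (1 + exp(-4 (t - t0) / W)), computed stably."""
    z = 4.0 * (t - t0) / width
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def join_streams(a, b, t0, width, seed=0):
    """Yield items from a before the drift and from b after it,
    mixing the two at random inside the transition window."""
    rng = random.Random(seed)
    for t, (item_a, item_b) in enumerate(zip(a, b)):
        if rng.random() < drift_probability(t, t0, width):
            yield item_b
        else:
            yield item_a


# Toy streams: constant labels make the change point easy to see.
joined = list(join_streams(["a"] * 1000, ["b"] * 1000, t0=500, width=50))
print(joined[:5], joined[-5:], joined[490:510])
```

Since the result is itself a stream, calls can be nested to chain several drifts, as in the multi-stream examples above.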
  • 27. Empirical evaluation: Accuracy. [Figure: Accuracy (%) versus number of instances on dataset LED with three concept drifts, for htnbp, htnb, htp, and ht.]
  • 28. Empirical evaluation: RunTime. [Figure: Time (sec.) versus number of instances on dataset LED with three concept drifts, for htnbp, htnb, htp, and ht.]
  • 29. Empirical evaluation: Memory. [Figure: Memory (Mb) versus number of instances on dataset LED with three concept drifts, for htnbp, htnb, htp, and ht.]
  • 30. Empirical evaluation: RAM-Hours. [Figure: RAM-Hours versus number of instances on dataset LED with three concept drifts, for htnbp, htnb, htp, and ht.]
  • 31. Empirical evaluation: Cover Type Dataset.
                                  Accuracy     Time    Mem   RAM-Hours
        Perceptron                   81.68    12.21   0.05        1.00
        Naïve Bayes                  60.52    22.81   0.08        2.99
        Hoeffding Tree               68.3     13.43   2.59       56.98
        Trees
          Naïve Bayes HT             81.06    24.73   2.59      104.92
          Perceptron HT              83.59    16.53   3.46       93.68
          NB Perceptron HT           85.77    22.16   3.46      125.59
        Bagging
          Naïve Bayes HT             85.73   165.75   0.8       217.20
          Perceptron HT              86.33    50.06   1.66      136.12
          NB Perceptron HT           87.88   115.58   1.25      236.65
  • 32. Empirical evaluation: Electricity Dataset.
                                  Accuracy     Time    Mem   RAM-Hours
        Perceptron                   79.07     0.53   0.01        1.00
        Naïve Bayes                  73.36     0.55   0.01        1.04
        Hoeffding Tree               75.35     0.86   0.12       19.47
        Trees
          Naïve Bayes HT             80.69     0.96   0.12       21.74
          Perceptron HT              84.24     0.93   0.21       36.85
          NB Perceptron HT           84.34     1.07   0.21       42.40
        Bagging
          Naïve Bayes HT             84.36     3.17   0.13       77.75
          Perceptron HT              85.22     2.59   0.44      215.02
          NB Perceptron HT           86.44     3.55   0.3       200.94
  • 33. Summary (http://moa.cs.waikato.ac.nz/). Sensor networks: use Perceptron. Handheld computers: use Hoeffding Naive Bayes Perceptron Tree. Servers: use Bagging Hoeffding Naive Bayes Perceptron Tree.
  • 34. Summary (http://moa.cs.waikato.ac.nz/). Conclusions: RAM-Hours as a new measure of time and memory; Hoeffding Perceptron Tree; Hoeffding Naive Bayes Perceptron Tree. Future work: adaptive learning rate for the Perceptron.