Patrick Nicolas
http://patricknicolas.blogspot.com
07/13/2013
Need for reliability
Copyright 2013 Patrick Nicolas 2
Existing algorithms used for recommendation,
prediction of consumer behavior or targeted
advertising do not have to be very accurate: the
negative impact of recommending a book or movie
incorrectly, or of failing to detect the interest of a
consumer, is very limited.
However, some problems require a far more reliable
solution: failure to preserve large amounts of data,
detect a security intrusion or predict the progress of a
disease has grave consequences.
Options
Traditional data mining approaches such as
clustering (unsupervised learning) and generative
or discriminative supervised learning algorithms
fail to capture the evolutionary nature of a
system with its states and underlying data.
Supervised learning
Supervised learning is effective for problems with a
large training set compared to the dimension of the model.
However, it suffers from the following limitations:
• Over-fitting: a supervised learning algorithm needs
a large training set to account for bias in the training data
• No descriptive (human-readable) knowledge representation
• The role of the domain expert is limited to providing labeled
data and validating the results
• The model has to be retrained in case of false positives
or false negatives
Unsupervised learning
Unsupervised learning methods such as spectral
clustering and kernel-based K-means are used for anomaly
detection or dimension reduction, but have drawbacks:
• Poor classification when discrete and
continuous variables are mixed
• No descriptive knowledge representation
• Limited leverage of domain expertise: the role of the
domain expert is limited to validating the clusters
• Clusters have to be rebuilt if the number of outliers
increases
Symbolic Regression
Symbolic Regression addresses the key limitations
of unsupervised and supervised learning methods.
It combines evolutionary computation with
reinforcement learning to provide domain experts
with a tool to create, evaluate and modify rules, policies
or models.
The most commonly used algorithms in Symbolic
Regression are:
• Genetic programming
• Learning classifier systems
Symbolic Regression
Symbolic Regression is used in very different
applications, such as:
• Optimization of data archiving
• Intelligent data and instrumentation
streaming
• Predicting behavior of ecommerce site during
“flash” or holiday sales
• Monitoring and predicting security
vulnerabilities in data centers
• Distribution of network traffic and flow in
public cloud
Symbolic representation
The goal is to extract knowledge from data (numerical,
textual, events…) as a symbolic or human-readable
representation using primitives or operators:
• Boolean operators: OR, AND, XOR, …
• Numerical functions: sin, exp, sigmoid, …
• Numerical operators: +, *, o (composition), …
• Differential operators: derivative, integral, …
• Logical operators: predicates, rules, …
[Diagram: Domain Expert, Data Mining and Data, with example primitives: sin, exp, _ * _, "If _ then _", "_ has a _"]
Knowledge Extraction
Knowledge extraction is the process of selecting and
combining the appropriate symbolic primitives or
operators to describe and predict the states of a system.
[Diagram: Expertise and primitives (sin, exp, _ * _, f'', "If _ then _", "_ has a _") feed a Model; the System provides State/Data and the Model produces a Prediction]
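As an illustration of selecting and combining primitives, the following minimal sketch scores a few candidate expressions against observed data and keeps the one with the lowest squared error. The primitive set, the nested-tuple expression format, the sample function and the candidate list are all assumptions made for this example; they are not taken from the slides.

```python
import math

# Hypothetical primitive set, keyed by the symbol used in expression trees.
PRIMITIVES = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "sin": math.sin,
    "exp": math.exp,
}


def evaluate(expr, x):
    """Evaluate a nested-tuple expression such as ("*", 2.0, ("sin", "x")) at x."""
    if expr == "x":
        return x
    if isinstance(expr, (int, float)):
        return float(expr)
    op, *args = expr
    return PRIMITIVES[op](*(evaluate(a, x) for a in args))


def squared_error(expr, samples):
    """Sum of squared differences between the model and the observations."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in samples)


# Observed (state, value) pairs, generated here from 2*sin(x) - exp(x*x)
# purely for illustration.
xs = [i / 10 for i in range(-10, 11)]
samples = [(x, 2.0 * math.sin(x) - math.exp(x * x)) for x in xs]

# Hypothetical candidate models built from the primitives.
candidates = [
    ("*", 2.0, ("sin", "x")),
    ("-", ("*", 2.0, ("sin", "x")), ("exp", ("*", "x", "x"))),
    ("exp", "x"),
]

best = min(candidates, key=lambda e: squared_error(e, samples))
```

Here the search is a plain scan over three hand-written candidates; in practice the space of combinations is far too large for that, which is what motivates an evolutionary search.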
Knowledge Primitives
The generation of knowledge from a set of symbolic
primitives to represent the underlying state of a system is an NP
problem (combinatorial explosion). Moreover, computers
process data in binary format (information theory).
[Diagram: Value, Binary Encoding]
The solution is to represent knowledge as symbolic
primitives in binary format.
Knowledge Encoding
The most common representation is to encode
symbolic primitives as sequences of 0s and 1s.
f(x) = 2·sin(x) - exp(x·x)
Tree form: - ( * (sin, 2), o (exp, sqr) )
Symbol sequence (level order): - * o sin 2 exp sqr
Each symbol is stored as a long, giving binary data such as:
0101001001110111011101110111011101111111000111111011101101000001001000101010
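The encoding step can be sketched as follows, assuming a small hypothetical symbol table and a fixed width of 4 bits per symbol (the slide suggests one long per symbol; the narrower width only keeps the output short). The tree is flattened in level order, which reproduces the symbol sequence shown above.

```python
from collections import deque

# Hypothetical symbol table: the index of a symbol is its numeric code.
SYMBOLS = ["-", "*", "o", "sin", "exp", "sqr", "2"]
BITS_PER_SYMBOL = 4  # assumed width, chosen for readability

# Expression tree for f(x) = 2*sin(x) - exp(x*x) as (symbol, children) tuples.
TREE = ("-", [("*", [("sin", []), ("2", [])]),
              ("o", [("exp", []), ("sqr", [])])])


def level_order(tree):
    """Return the symbols of the tree in breadth-first (level) order."""
    out, queue = [], deque([tree])
    while queue:
        symbol, children = queue.popleft()
        out.append(symbol)
        queue.extend(children)
    return out


def encode(symbols):
    """Concatenate the fixed-width binary code of every symbol."""
    return "".join(format(SYMBOLS.index(s), f"0{BITS_PER_SYMBOL}b") for s in symbols)


def decode(bits):
    """Invert encode(): split into fixed-width chunks and look up each symbol."""
    chunks = [bits[i:i + BITS_PER_SYMBOL] for i in range(0, len(bits), BITS_PER_SYMBOL)]
    return [SYMBOLS[int(chunk, 2)] for chunk in chunks]


sequence = level_order(TREE)
chromosome = encode(sequence)
```

Decoding the bit string gives back the symbol sequence, so a genetic operator can work on the bits while the model stays recoverable, provided the modified bits still map to valid symbols.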
Data Modeling using Genetic Algorithm
For a given state of a system, we need to find the
optimal model (combination of primitives) that describes
the current state, using a Genetic Algorithm. The (0,1)
encoding is treated as a chromosome, to which selection,
cross-over, transposition and mutation operators are applied.
[Figure: example bit-string chromosomes illustrating cross-over (parents to offspring), mutation and transposition]
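The three operators on bit strings can be sketched as below. These are simple textbook-style variants chosen for illustration: single-point cross-over, independent bit-flip mutation, and a transposition that cuts a segment and re-inserts it elsewhere. The slides do not specify which variants the author uses, so the function names and parameters are assumptions.

```python
import random


def crossover(parent_a, parent_b, point):
    """Single-point cross-over: swap the tails of two equal-length bit strings."""
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    flip = {"0": "1", "1": "0"}
    return "".join(flip[bit] if rng.random() < rate else bit for bit in chromosome)


def transpose(chromosome, start, end, target):
    """Cut the segment [start, end) and re-insert it at index `target` of the remainder."""
    segment = chromosome[start:end]
    rest = chromosome[:start] + chromosome[end:]
    return rest[:target] + segment + rest[target:]


rng = random.Random(42)  # seeded so that runs are repeatable
a, b = "11110000", "00001111"
child_1, child_2 = crossover(a, b, point=3)
mutated = mutate(a, rate=0.1, rng=rng)
# Letters instead of bits make the moved segment easy to see.
moved = transpose("abcdefgh", start=2, end=4, target=0)
```

All three operators preserve the chromosome length, which keeps the fixed-width decoding of symbols aligned.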
Computation Flow of Genetic Algorithm
Once the initial set of chromosomes is randomly
generated, the algorithm iterates until the fittest
chromosome emerges.
[Flow diagram, boxes: Initial Pool of Models, Encoding, Initial Chromosomes, New Population, Selection, Fitness, Cross-over, Mutation, Transposition, Fittest Chromosome, Decoding, Best Model]
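The iteration can be sketched as a small loop. The fitness function here is a toy (number of bits matching a fixed target string), and the population size, mutation rate, truncation selection and elitism are all assumptions made to keep the example short; transposition is left out for the same reason.

```python
import random


def fitness(chromosome, target):
    """Toy fitness (an assumption): number of bits that match a target string."""
    return sum(c == t for c, t in zip(chromosome, target))


def evolve(target, pop_size=30, generations=200, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    n = len(target)
    # Initial chromosomes: generated at random.
    population = ["".join(rng.choice("01") for _ in range(n)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Fitness and selection: rank, then keep the fitter half as parents.
        population.sort(key=lambda c: fitness(c, target), reverse=True)
        best = population[0]
        history.append(fitness(best, target))
        if history[-1] == n:
            break  # a perfect chromosome has emerged
        parents = population[: pop_size // 2]
        # Elitism: the fittest chromosome is carried over unchanged.
        offspring = [best]
        while len(offspring) < pop_size:
            a, b = rng.sample(parents, 2)
            point = rng.randrange(1, n)  # cross-over point
            child = a[:point] + b[point:]
            child = "".join(  # mutation
                ("1" if bit == "0" else "0") if rng.random() < mutation_rate else bit
                for bit in child
            )
            offspring.append(child)
        population = offspring  # new population
    return max(population, key=lambda c: fitness(c, target)), history


TARGET = "1011001110001111"
best_chromosome, best_per_generation = evolve(TARGET)
```

Because the best chromosome is always copied into the next generation, the best fitness recorded per generation never decreases. Note that the target, and therefore the objective, is fixed for the whole run.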
Limitation of Genetic Algorithm
The selection of the best chromosome, representing
the best classifier (or model), relies on the
computation of a fitness value under the assumption
that the objective does not change over time.
As most systems evolve over time, so does the
objective. Reinforcement learning is used to adjust
the objective using a reward/credit assignment
mechanism.
Concept of Reinforcement Learning
As the state of the system evolves over time, it
rewards or punishes the fittest classifier, whose action
has been executed. The reward or punishment is
used to adjust the objective and the fitness function.
[Diagram, boxes: System, State/Data, Probes, Effectors, Reward, Best Action, Reward Assignment, Decoding, Genetic Algorithm, Primitives, Best Classifier]
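A minimal sketch of reward assignment, assuming each classifier carries a numeric strength that serves as its fitness and is moved a fraction of the way toward each observed reward. This incremental update rule, the `Classifier` fields and the example rules are assumptions for illustration, not the author's mechanism.

```python
from dataclasses import dataclass


@dataclass
class Classifier:
    rule: str               # human-readable rule (hypothetical examples below)
    strength: float = 0.0   # running estimate of the reward this rule earns


def assign_reward(classifier, reward, learning_rate=0.2):
    """Move the strength a fraction of the way toward the observed reward."""
    classifier.strength += learning_rate * (reward - classifier.strength)
    return classifier.strength


def fittest(classifiers):
    """The classifier whose action would be executed next."""
    return max(classifiers, key=lambda c: c.strength)


rules = [
    Classifier("IF load > 0.9 THEN throttle", strength=0.5),
    Classifier("IF load > 0.9 THEN ignore", strength=0.6),
]

chosen = fittest(rules)              # the "ignore" rule is fittest at first
assign_reward(chosen, reward=-1.0)   # the system punishes the executed action
new_fittest = fittest(rules)         # the "throttle" rule now ranks first
```

A single punishment lowers the strength of the executed rule from 0.6 to 0.28, so the ranking changes without rebuilding the population.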
Elements of Reinforcement Learning
The main challenge of reinforcement learning is to predict the impact
of each action A_n on the global state. We need:
• Actions (or classifiers) that support logic (IF/THEN), numerical
(y = f(x1, …, xn)) and discrete ({ai}) classifiers to predict the impact of a
remedial action on the security of the system
• A metric to measure the security of the overall system (distance
between the current state and the baseline)
• An action discovery and adaptation mechanism
• An efficient optimizer to select the best action in any state:
Stochastic Gradient Descent for continuous variables {xi} only, or
a Genetic Algorithm for a mix of Boolean, Integer and Double variables
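One standard way to estimate the impact of each action on the state is tabular Q-learning, sketched below. The state names, the action list, the rewards and the `alpha`/`gamma` values are hypothetical; the sketch only shows the update rule and a greedy action choice, not a full security model.

```python
from collections import defaultdict

ACTIONS = ["patch", "isolate", "ignore"]  # hypothetical remedial actions


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max Q(s',.) - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def best_action(q, state):
    """Greedy choice: the action with the highest estimated value in this state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


q = defaultdict(float)  # unseen (state, action) pairs start at 0.0

# Isolating the host brings the system back to its baseline: rewarded.
q_update(q, "intrusion", "isolate", reward=1.0, next_state="baseline")
# Ignoring the alert leaves the system compromised: punished.
q_update(q, "intrusion", "ignore", reward=-1.0, next_state="intrusion")

choice = best_action(q, "intrusion")
```

The reward plays the role of the security metric above: it would typically be derived from the distance between the current state and the baseline.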
Putting All Together
[Architecture diagram. Genetic Algorithm: Initial Knowledge, Encoding, Classifiers Population, Match, Select, Cross-over, Mutate, Transpose, Best Classifiers. Reinforcement Learning: Environment, State/Data, Probes, Effectors, Reward, Actions Predictor, Action, Q-Learning, Reward Assignment. Also shown: Expert Supervised Learning]

"ML in Production",Oleksandr Bagan
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 

Data Modeling using Symbolic Regression

  • 2. Need for reliability (Copyright 2013 Patrick Nicolas)
    Existing algorithms used for recommendation, prediction of consumer behavior or targeted advertising do not have to be very accurate: the negative impact of recommending the wrong book or movie, or of failing to detect a consumer's interest, is very limited. However, some problems require a far more reliable solution: failing to preserve large amounts of data, to detect a security intrusion or to predict the progress of a disease has grave consequences.
  • 3. Options
    Traditional data mining approaches such as clustering (unsupervised learning) and generative or discriminative supervised learning algorithms fail to capture the evolutionary nature of a system with its states and underlying data.
  • 4. Supervised learning
    Supervised learning is effective for problems with a large training set compared to the dimension of the model. However, it suffers from the following limitations:
    • Over-fitting: a supervised learning algorithm needs a large training set to account for bias in the training set
    • No descriptive (human) knowledge representation
    • The role of the domain expert is limited to providing labeled data and validating the results
    • The model has to be retrained in case of false positives or false negatives
  • 5. Unsupervised learning
    Unsupervised learning methods such as Spectral Clustering and Kernel-based K-Means are used for anomaly detection or dimension reduction but have drawbacks:
    • Poor classification when discrete and continuous variables are mixed
    • No descriptive knowledge representation
    • Limited leverage of domain expertise: the role of the domain expert is limited to validating the clusters
    • Clusters have to be rebuilt if the number of outliers increases
  • 6. Symbolic Regression
    Symbolic Regression addresses the key limitations of unsupervised and supervised learning methods. It combines evolutionary computation with reinforcement learning to give domain experts a tool to create, evaluate and modify rules, policies or models. The most commonly used algorithms in Symbolic Regression are:
    • Genetic programming
    • Learning Classifier Systems
  • 7. Symbolic Regression
    Symbolic Regression is used in very different applications, such as:
    • Optimization of data archiving
    • Intelligent data and instrumentation streaming
    • Predicting the behavior of an e-commerce site during "flash" or holiday sales
    • Monitoring and predicting security vulnerabilities in data centers
    • Distribution of network traffic and flow in a public cloud
  • 8. Symbolic representation
    The goal is to extract knowledge from data (numerical, textual, events, …) as a symbolic or human-readable representation using primitives or operators:
    • Boolean operators: OR, AND, XOR, …
    • Numerical functions: sin, exp, sigmoid, …
    • Numerical operators: +, *, o (composition), …
    • Differential operators: derivative, integral, …
    • Logical operators: predicates, rules, …
    [Figure labels: Domain Expert, Data Mining, Data; example primitives: sin, exp, "_ * _", "If _ then _", "_ has a _"]
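As a minimal sketch of how such primitives could be combined into a readable model, the snippet below evaluates an expression written as nested tuples. The primitive table, the tuple layout and the names `PRIMITIVES`, `evaluate` and `F` are illustrative assumptions, not something the slides specify.

```python
import math

# Hypothetical primitive table: the names and arities are illustrative only.
PRIMITIVES = {
    "sin": math.sin,
    "exp": math.exp,
    "sqr": lambda v: v * v,
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}

def evaluate(expr, x):
    """Evaluate a nested-tuple expression at the point x."""
    if expr == "x":                       # the single input variable
        return x
    if isinstance(expr, (int, float)):    # numeric constant
        return float(expr)
    op, *args = expr                      # (operator, operand, ...)
    return PRIMITIVES[op](*(evaluate(a, x) for a in args))

# 2*sin(x) - exp(x*x), written as a tree of primitives
F = ("-", ("*", 2, ("sin", "x")), ("exp", ("sqr", "x")))
```

Calling `evaluate(F, 0.0)` walks the tree bottom-up, which is the same structure a domain expert would read as a formula.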
  • 9. Knowledge Extraction
    Knowledge extraction is the process of selecting and combining the appropriate symbolic primitives or operators to describe and predict the states of a system.
    [Figure labels: Expertise, Model, System, State/Data, Prediction; example primitives: sin, exp, f”, "_ * _", "If _ then _", "_ has a _"]
  • 10. Knowledge Primitives
    Generating knowledge from a set of symbolic primitives to represent the underlying state of a system is an NP problem (combinatorial explosion). Moreover, computers process data in binary format (information theory). The solution is to represent knowledge as symbolic primitives in binary format.
    [Figure: a value and its binary encoding]
  • 11. Knowledge Encoding
    The most common representation is to encode symbolic primitives as sequences of 0s and 1s.
    Example: f(x) = 2.sin(x) - exp(x*x)
    Prefix form: - ( * (sin, 2), o (exp, sqr))
    Tree nodes: -, *, o, sin, 2, exp, sqr (stored as long values)
    Binary data: 0101001001110111011101110111011101111111000111111011101101000001001000101010
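The encoding step can be sketched as follows, assuming a fixed-width code book. The slide does not give the actual bit codes (it only hints that nodes are stored as long values), so the 3-bit table `CODE` and the helpers `encode` and `decode` are hypothetical and serve only to show the round trip between a prefix expression and a bit string.

```python
# Hypothetical 3-bit code book; the slide does not specify the real codes.
SYMBOLS = ["-", "*", "o", "sin", "exp", "sqr", "2", "x"]
WIDTH = 3
CODE = {s: format(i, f"0{WIDTH}b") for i, s in enumerate(SYMBOLS)}
DECODE = {bits: s for s, bits in CODE.items()}

def encode(tokens):
    """Concatenate the fixed-width code of each token."""
    return "".join(CODE[t] for t in tokens)

def decode(bits):
    """Split the bit string into WIDTH-sized chunks and look each one up."""
    return [DECODE[bits[i:i + WIDTH]] for i in range(0, len(bits), WIDTH)]

# Prefix traversal of  - ( * (sin, 2), o (exp, sqr))
PREFIX = ["-", "*", "sin", "2", "o", "exp", "sqr"]
chromosome = encode(PREFIX)
```

A fixed width keeps decoding trivial; a real encoder would also have to reject bit patterns that do not decode to a well-formed expression.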
  • 12. Data Modeling using Genetic Algorithm
    For a given state of a system, we need to find the optimal model (combination of primitives) that describes the current state, using a Genetic Algorithm. The (0,1) encoding is treated as a chromosome, with selection, cross-over, transposition and mutation operators.
    [Figure: cross-over (parents to offspring), mutation and transposition illustrated on binary chromosomes]
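The three chromosome operators named on this slide can be sketched on plain bit strings. The function names, the single-point form of cross-over and the cut-and-reinsert form of transposition are assumptions made for illustration; the slide shows the operators only as a picture.

```python
def crossover(a, b, point):
    """Single-point cross-over: the two parents swap their tails."""
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(bits, index):
    """Flip the bit at the given index."""
    flipped = "1" if bits[index] == "0" else "0"
    return bits[:index] + flipped + bits[index + 1:]

def transpose(bits, start, end, target):
    """Cut bits[start:end] and re-insert it at position target of the remainder."""
    segment = bits[start:end]
    rest = bits[:start] + bits[end:]
    return rest[:target] + segment + rest[target:]
```

All three operators preserve chromosome length, so any fixed-width decoding scheme keeps working on the offspring.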
  • 13. Computation Flow of Genetic Algorithm
    Once the initial set of chromosomes is randomly generated, the algorithm iterates until the fittest chromosome emerges.
    [Flow diagram labels: Initial Pool of Models, Encoding, Initial Chromosomes, New Population, Fitness, Selection, Cross-over, Mutation, Transposition, Fittest Chromosome, Decoding, Best Model]
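The iteration described on this slide can be sketched as a short loop. This is a toy version under stated assumptions: the fitness is a stand-in (the count of 1 bits) rather than a decoded model scored against system data, selection simply keeps the fitter half, transposition is omitted for brevity, and the loop runs for a fixed number of generations instead of testing for convergence. The name `evolve` and all its parameters are hypothetical.

```python
import random

def fitness(bits):
    # Stand-in objective (count of 1 bits); a real fitness would decode the
    # chromosome into a model and score it against the observed state.
    return bits.count("1")

def evolve(pop_size=20, length=16, generations=40, p_mut=0.02, seed=0):
    rng = random.Random(seed)
    # Initial chromosomes are generated at random.
    pop = ["".join(rng.choice("01") for _ in range(length)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]        # selection: keep the fitter half
        children = [pop[0]]                   # elitism: the best survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)    # cross-over point
            child = a[:cut] + b[cut:]
            child = "".join(                  # mutation: flip each bit with prob p_mut
                ("1" if c == "0" else "0") if rng.random() < p_mut else c
                for c in child
            )
            children.append(child)
        pop = children                        # new population
    return max(pop, key=fitness)              # fittest chromosome
```

Because the best chromosome is copied unchanged into each new population, the best fitness never decreases from one generation to the next.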
  • 14. Limitation of Genetic Algorithm
    The selection of the best chromosome, representing the best classifier (or model), relies on the computation of a fitness value under the assumption that the objective does not change over time. As most systems evolve over time, so does the objective. Reinforcement learning is used to adjust the objective through a reward/credit assignment mechanism.
  • 15. Concept of Reinforcement Learning
    As the state of the system evolves over time, it rewards or punishes the fittest classifier whose action has been executed. The reward or punishment is used to adjust the objective and the fitness function.
    [Figure labels: System, State/Data, Probes, Effectors, Reward, Best Action, Reward Assignment, Encoding, Decoding, Genetic Algorithm, Primitives, Best Classifier]
  • 16. Elements of Reinforcement Learning
    The main challenge of reinforcement learning is to predict the impact of each action A_n on the global state. We need:
    1. Actions (or classifiers) that support logic (IF/THEN), numerical (y = f(x1, …, xn)) and discrete ({ai}) classifiers to predict the impact of a remedial action on the security of the system
    2. A metric to measure the security of the overall system (distance between the current state and the baseline)
    3. An action discovery and adaptation mechanism
    4. An efficient optimizer to select the best action in any state: Stochastic Gradient Descent for continuous variables {xi} only, or a Genetic Algorithm for a mix of Boolean, integer and double variables
  • 17. Putting All Together
    [Architecture diagram labels: Environment, Expert, Supervised Learning, Initial Knowledge, Encoding, Classifiers Population, State/Data, Probes, Effectors, Match, Select, Cross-over, Mutate, Transpose, Best Classifiers, Actions Predictor, Action, Reward, Reward Assignment, Q-Learning, Genetic Algorithm, Reinforcement Learning]
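The diagram names Q-Learning as the reward assignment mechanism. A minimal tabular sketch of that update is shown below, assuming discrete states and actions. The state names, the action names in `ACTIONS` and the values of `alpha` and `gamma` are hypothetical and chosen only to make the example concrete.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])

def best_action(q, state, actions):
    """Pick the action with the highest estimated value in this state."""
    return max(actions, key=lambda a: q[(state, a)])

ACTIONS = ["patch", "isolate", "ignore"]   # hypothetical remedial actions
q = defaultdict(float)                      # unseen (state, action) pairs start at 0
q_update(q, "alert", "patch", reward=1.0, next_state="normal", actions=ACTIONS)
q_update(q, "alert", "ignore", reward=-1.0, next_state="alert", actions=ACTIONS)
```

After these two updates, `best_action(q, "alert", ACTIONS)` prefers the rewarded action, which is how the reward signal would feed back into the choice of the best classifier.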