Presented to: Dr. Rabie
By: Amr Abd EL Latief Abd El Al
Data Mining Definition
• Definition:
• Data mining is the extraction of interesting patterns or
knowledge from huge amounts of data.
Also known by other names:
• knowledge discovery (mining) in databases (KDD)
• knowledge extraction
• data/pattern analysis
• data archaeology
• data dredging
• information harvesting
• business intelligence, and others. [1]
What is Data Mining
• Data mining enables data exploration, data analysis,
and data visualization of huge databases at a high level
of abstraction, without a specific hypothesis in mind.
• Data mining works by modeling: a model is built from
the data and then used to make predictions.
Data Mining Technologies
• These include:
• artificial neural networks
• decision trees
• genetic algorithms
• machine learning
• evolutionary computing
• MOEA (multi-objective evolutionary computing)
Data Mining System Architecture
Data Mining Procedure
The Process of Data Mining
Classifications
(diagram labels: Data Types, Application, Data Structure, Functionality)
Data Types (Application S.V.)
• Business transactions
• Scientific data
• Medical and personal data
• Surveillance video and pictures
• Satellite sensing
• Text reports and memos (e-mail messages)
• Most communications
• The World Wide Web repositories
Types of Data (Data Structure S.V.)
• Flat files
• Relational databases
• Data warehouses
• Transaction databases
• Multimedia databases
• Spatial databases
• World Wide Web
FUNCTIONALITIES AND CLASSIFICATIONS OF DATA MINING
• Characterization
• Discrimination
• Association analysis
• Classification
• Uses given class labels to order the objects in the data
collection. Classification approaches normally use a
training set where all objects are already associated with
known class labels. The classification algorithm learns from
the training set and builds a model. The model is used to
classify new objects.
• Prediction
Data Mining Systems
(diagram labels: specialized; comprehensive; classification according to
the data source mined, the data model drawn on, the kind of knowledge
discovered, and the mining techniques used)
Classification according to the type
of data source mined
• This classification categorizes data mining systems
according to the type of data handled:
• spatial data
• multimedia data
• time-series data
• text data
• World Wide Web
Classification according to the data
model drawn on
• This classification categorizes data mining systems
based on the data model involved:
• relational database
• object-oriented database
• data warehouse
• transactional
• others
Classification according to the kind
of knowledge discovered
• This classification categorizes data mining systems
based on the kind of knowledge discovered or data
mining functionalities:
• characterization
• discrimination
• association
• classification
• clustering
• others
Classification according to mining
techniques used
• This classification categorizes data mining systems
according to the data analysis approach used:
• machine learning
• neural networks
• genetic algorithms
• statistics
• visualization
• database-oriented
• data warehouse-oriented
• others
Classification can also take into account
the degree of user interaction involved
in the data mining process:
• query-driven systems
• interactive exploratory systems
• autonomous systems
Note:
• A comprehensive system would provide a wide variety
of data mining techniques to fit different situations
and options, and offer different degrees of user
interaction.
[2]
Papers
Data Mining Goals
• The two main goals of DM are:
• description
• prediction
• Standard tasks in the field of DM are: description,
clustering, association discovery, sequential pattern
analysis, classification, and regression.
• Description: can be obtained by characterization or by
discrimination.
• Characterization: a summarization of the general features
of a class.
• Discrimination: close to characterization. It consists of
characterizing a class by comparison with another one.
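As a rough illustration of the two descriptive tasks above, the sketch below first summarizes one class (characterization) and then contrasts it with a second class (discrimination). The customer records, the field layout, and the helper names `characterize` and `discriminate` are invented for this example and do not come from the slides.

```python
from statistics import mean

# Hypothetical records: (class label, age, yearly spend)
records = [
    ("frequent", 34, 1200.0),
    ("frequent", 41, 1500.0),
    ("frequent", 29, 900.0),
    ("occasional", 52, 300.0),
    ("occasional", 47, 250.0),
]

def characterize(rows, label):
    """Summarize the general features of one class (characterization)."""
    members = [r for r in rows if r[0] == label]
    return {
        "count": len(members),
        "mean_age": mean(r[1] for r in members),
        "mean_spend": mean(r[2] for r in members),
    }

def discriminate(rows, target, contrast):
    """Describe the target class by comparison with a contrasting class."""
    t = characterize(rows, target)
    c = characterize(rows, contrast)
    return {key: t[key] - c[key] for key in ("mean_age", "mean_spend")}

summary = characterize(records, "frequent")
difference = discriminate(records, "frequent", "occasional")
print(summary)
print(difference)
```

Here characterization answers "what does a frequent customer look like?", while discrimination answers "how does a frequent customer differ from an occasional one?".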
Data Mining Goals
• Clustering: differs from classification since it analyses data
objects without knowing their class.
• Association discovery: results in a set of association rules
which represent attribute-value conditions frequently
occurring in a given set of data.
• Sequential pattern analysis: consists of searching for
frequently occurring patterns related to time.
• Regression: uses existing values of some variables in order
to forecast what the values of another continuous variable
will be.
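To make the regression task concrete, here is a minimal sketch of a one-variable least-squares fit, assuming a linear relationship between a known variable and the continuous variable to forecast. The data values and the function name `fit_line` are assumptions chosen for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations that happen to lie on y = 2x + 1.
budget = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(budget, sales)
forecast = slope * 5.0 + intercept  # forecast for an unseen value x = 5
print(slope, intercept, forecast)
```

Real data rarely lies exactly on a line, so in practice the fitted line only approximates the observations.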
Machine Learning
• An ML system uses a finite set of examples, which
represent observations of the environment; the learning
algorithm learns a model from this set, which is called
the training set.
• ML in DM includes:
• databases
• data warehouses
• flat files
Classification in DM
• Classification is a form of data analysis that can be used
to extract models describing important classes or to predict
future trends.
• It is a learning paradigm which consists of segmenting data
by assigning it to groups, or classes, that are already defined.
• The usual assumption is a small database size, but in data
mining classification must be a scalable technique.
Classification in DM
• Classes are represented by the values of a particular
attribute, called the goal attribute; the remaining
attributes are called predicting attributes.
• The resulting model is usually represented as a set of
IF-THEN prediction rules, where each one predicts a class
from the predicting attributes.
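One possible way to encode such a rule set is sketched below: each rule pairs conditions on the predicting attributes (the IF part) with a value of the goal attribute (the THEN part). The weather-style attributes, the rule contents, and the names `RULES`, `DEFAULT_CLASS`, and `predict` are hypothetical and chosen only for illustration.

```python
# Each rule: (conditions on predicting attributes, predicted goal-attribute value)
RULES = [
    ({"outlook": "sunny", "humidity": "high"}, "no"),
    ({"outlook": "rainy", "windy": True}, "no"),
    ({"outlook": "overcast"}, "yes"),
]
DEFAULT_CLASS = "yes"  # used when no rule fires (an assumption of this sketch)

def predict(example, rules=RULES, default=DEFAULT_CLASS):
    """Return the class of the first rule whose IF part matches the example."""
    for conditions, predicted_class in rules:
        if all(example.get(attr) == value for attr, value in conditions.items()):
            return predicted_class
    return default

print(predict({"outlook": "sunny", "humidity": "high", "windy": False}))
print(predict({"outlook": "overcast", "humidity": "high", "windy": True}))
```

Taking the first matching rule is only one conflict-resolution strategy; voting among all matching rules is another common choice.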
ML in Classification
• Procedure:
• Algorithms are first applied to the training set, which
contains training examples with a known class, to
discover rules.
• The model is then used to classify a separate set of
examples, called the test set.
• The predictive accuracy of the model is evaluated on the
test set.
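The procedure above can be sketched with a deliberately simple learner that builds one rule per value of a single attribute and predicts the majority class seen in training. This learner, the toy data, and the names `learn_rules`, `classify`, and `accuracy` are assumptions made for the example, not a method taken from the slides.

```python
from collections import Counter, defaultdict

def learn_rules(training_set, attribute):
    """Learn one IF-THEN rule per attribute value: predict the majority class."""
    counts = defaultdict(Counter)
    for example, label in training_set:
        counts[example[attribute]][label] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}

def classify(rules, example, attribute, default):
    """Apply the learned rules; fall back to `default` for unseen values."""
    return rules.get(example[attribute], default)

def accuracy(rules, test_set, attribute, default):
    """Fraction of test examples whose predicted class equals the known class."""
    hits = sum(classify(rules, ex, attribute, default) == label
               for ex, label in test_set)
    return hits / len(test_set)

# Step 1: training examples with a known class.
training_set = [
    ({"outlook": "sunny"}, "no"),
    ({"outlook": "sunny"}, "no"),
    ({"outlook": "sunny"}, "yes"),
    ({"outlook": "overcast"}, "yes"),
    ({"outlook": "rainy"}, "yes"),
    ({"outlook": "rainy"}, "no"),
    ({"outlook": "rainy"}, "yes"),
]
# Step 2: a separate test set, never shown to the learner.
test_set = [
    ({"outlook": "sunny"}, "no"),
    ({"outlook": "overcast"}, "yes"),
    ({"outlook": "rainy"}, "no"),
    ({"outlook": "rainy"}, "yes"),
]

rules = learn_rules(training_set, "outlook")
# Step 3: evaluate predictive accuracy on the test set.
test_accuracy = accuracy(rules, test_set, "outlook", default="yes")
print(rules, test_accuracy)
```

Keeping the test set separate matters because accuracy measured on the training examples tends to overestimate how well the model classifies new objects.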
Classification Methods
• Main classification methods are:
• decision tree induction
• scalability problem
• Bayesian classification
• neural network learning
• Drawbacks:
• time-consuming
• difficulty for humans to interpret their results
ASSOCIATION ANALYSIS
• Association rules show relationships between attributes.
Their typical application domain is market basket and
transaction data analysis.
• Association rules:
• An association rule is generally defined as an expression
• X => Y,
• where X and Y are sets of attribute-value terms.
ASSOCIATION ANALYSIS
• Rules do not have to be strictly correct in order
to be useful. It is generally enough to find
rules which are true only to some degree.
• X implies Y
• X tends to imply Y
• Support and confidence
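Support and confidence are the usual measures of how far a rule X => Y holds. The sketch below computes both under their standard definitions: support is the fraction of transactions containing an itemset, and confidence is support(X and Y together) divided by support(X). The basket data and function names are made up for the example.

```python
# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(x, y, transactions):
    """Estimate of P(Y | X): support of X and Y together over support of X."""
    return support(set(x) | set(y), transactions) / support(x, transactions)

s = support({"bread", "milk"}, transactions)       # 3 of the 5 transactions
c = confidence({"bread"}, {"milk"}, transactions)  # 3 of the 4 bread transactions
print(s, c)
```

A rule such as bread => milk is therefore read as "bread tends to imply milk", with the confidence giving the degree to which it holds.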
Apriori Algorithm
• Depends on frequent occurrence
• Drawbacks:
• Large number of database scans
• Large size of generated intermediate sets
• Apriori mines only Boolean and single-dimensional
association rules.
• These rules are adapted to market basket analysis and can
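For reference, a compact sketch of the level-wise Apriori search for frequent itemsets follows. It is a textbook-style simplification under assumed inputs (a list of item sets and an absolute minimum support count), not an optimized implementation, and the basket data is invented.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for itemsets with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # One pass over the database per level (the "many scans" drawback).
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: build (k+1)-item candidates from frequent k-itemsets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]
frequent = apriori(baskets, min_support=2)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Both drawbacks listed above are visible in the loop: the database is rescanned at every level, and the `joined` set of intermediate candidates can grow very large before pruning.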
GA Advantages in Data Mining
• DM problems need robust solutions and
scalability.
• GA advantages:
• High ability to find patterns in very large spaces
• Parallel implementation
• Performs a kind of global search rather than local
hill-climbing
• The patterns produced are directly understandable
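To show what a population-based search looks like, here is a small genetic algorithm sketch that evolves bit masks over a list of items, where each mask encodes a candidate itemset. The fitness function (support count times itemset size), the parameter values, and all names are invented for this illustration; it makes no claim to find the best pattern.

```python
import random

ITEMS = ["bread", "milk", "butter", "eggs", "jam"]
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "jam"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

def decode(mask):
    """Turn a bit mask into the itemset it encodes."""
    return {item for item, bit in zip(ITEMS, mask) if bit}

def fitness(mask):
    """Made-up objective: support count of the itemset times its size."""
    itemset = decode(mask)
    if not itemset:
        return 0
    return sum(itemset <= t for t in TRANSACTIONS) * len(itemset)

def evolve(generations=30, pop_size=12, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)  # fixed seed keeps the run repeatable
    population = [tuple(rng.randint(0, 1) for _ in ITEMS) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]   # truncation selection
        children = [population[0]]              # elitism: keep the best as is
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(ITEMS))  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability `mutation_rate`.
            child = tuple(bit ^ (rng.random() < mutation_rate) for bit in child)
            children.append(child)
        population = children
    return max(population, key=fitness)

best = evolve()
print(sorted(decode(best)), fitness(best))
```

Because many candidates are kept and recombined at once, the search is not tied to a single starting point the way hill-climbing is, and the result decodes directly into a readable itemset.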
Search Challenges
• Scalability is also an important research
challenge.
• MULTI-OBJECTIVE RULE EXTRACTION
• MOEA Issues
Apriori Example

Data mining concepts and work
