Monday March 22, 2004

                 DSES-4810-01 Intro to COMPUTATIONAL INTELLIGENCE
                                   & SOFT COMPUTING

Instructor:           Prof. Mark J. Embrechts (x 4009 or 371-4562) (embrem@rpi.edu)
Office Hours:         Thursday 10-12 am (CII5217), or by appointment.
Class Time:           Monday/Thursday: 8-9:50 (Amos Eaton 216)
TEXT:                 J. S. Jang, C. T. Sun, E. Mizutani, “Neuro-Fuzzy and Soft Computing,”
                      Prentice Hall, 1996. (1998) ISBN 0-13-261066-3

                     LECTURES #16&17: Data Mining with Neural Networks

1. WHAT IS DATA MINING?

Data Mining (DM), also known as Knowledge Discovery in Databases (KDD), is an area of common interest to
researchers in machine discovery, statistics, databases, knowledge acquisition, machine learning, data visualization,
high performance computing, and knowledge-based systems. The rapid growth of data and information has created a
need and an opportunity for extracting knowledge from databases, and both researchers and application developers
have been responding to that need. KDD applications have been developed for manufacturing, astronomy, biology,
finance, insurance, marketing, medicine, and many other fields.

Data mining is the process of automatically extracting previously unknown and important information from large
databases and using it to make crucial business decisions.

Organizations generate and collect large volumes of data which they use in daily operations such as billing and
inventory. The data necessary for each operation is captured and maintained by the corresponding department. Yet despite
this wealth of data, many companies have been unable to fully capitalize on its value because information implicit in
the data is not easy to discern.

Competitive business pressures and a desire to leverage existing information technology investments have led many
firms to explore the benefits of data mining technology. This technology is designed to help businesses discover
hidden information in their data -- information that can help them understand the purchasing behavior of their key
customers, detect likely credit card or insurance fraud, and predict probable changes in financial markets.

Data mining consists of a number of operations, each of which is supported by a variety of technologies such as rule
induction, neural networks, conceptual clustering, and association discovery. In many real-world applications, such
as marketing analysis, financial analysis, and fraud detection, information extraction requires the cooperative use
of several data mining operations and techniques.
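
To make one of the operations named above concrete, here is a minimal sketch of association discovery in Python. It is an illustration under stated assumptions, not a method taken from the lecture: the transactions, the function names (support, pair_rules) and the thresholds are all hypothetical, and only rules between single items are considered.

```python
from itertools import combinations

# Hypothetical market-basket transactions (illustrative data, not from the lecture).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for single-item rules."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        s = support({a, b}, transactions)
        if s < min_support:
            continue  # the pair is too rare to be interesting
        for x, y in ((a, b), (b, a)):
            # confidence(x -> y) = support(x and y) / support(x)
            conf = s / support({x}, transactions)
            if conf >= min_confidence:
                rules.append((x, y, s, conf))
    return rules


for x, y, s, c in pair_rules(transactions):
    print(f"{x} -> {y}: support={s:.2f}, confidence={c:.2f}")
```

Support measures how often the items occur together, and confidence measures how often the consequent appears when the antecedent does. Practical tools search over larger itemsets and prune the search space; this sketch enumerates pairs only.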

2. ISSUES RELATED TO DATA MINING

 Theory and Foundation Issues in DM:
  Data and knowledge representation for DM
  Probabilistic modeling and uncertainty management in DM
  Modeling of structured, unstructured and multimedia data
  Metrics for evaluation of DM results
  Fundamental advances in search, retrieval, and discovery methods
  Definitions, formalisms, and theoretical issues in DM

 Data Mining Methods and Algorithms:
  Algorithmic complexity, efficiency and scalability issues in data mining
  Probabilistic and statistical models and methods
  Using prior domain knowledge and re-use of discovered knowledge
  Parallel and distributed data mining techniques
  High dimensional datasets and data preprocessing
  Unsupervised discovery and predictive modeling

 DM Process and Human Interaction:
  Models of the DM process
  Methods for evaluating subjective relevance and utility
  Data and knowledge visualization
  Interactive data exploration and discovery
  Privacy and security

 Applications:
  Data mining systems and data mining tools
  Application of DM in business, science, medicine and engineering
  Application of DM methods for mining knowledge in text, image, audio, sensor, numeric, categorical
     or mixed format data
  Resource and knowledge discovery using the Internet

3. DATA MINING TOOLS AND TECHNIQUES

3.1 Multistrategy Tools:
     Public domain: MOBAL
     Research prototype: DBlearn, Emerald
     Commercial: DataMariner, Darwin, INSPECT, Modelware, Recon

3.2 Classification:
 3.2.1 Decision-tree approach:
     Public domain: LMDT, MLC++, OC1, SE-Learn, SIPINA-W
     Commercial: AC2, C4.5, Clementine, IND, KATE-tools, Knowledge Seeker, SPSS CHAID,
     XPERTRule
 3.2.2 Neural network approach:
     Public domain: NN FAQ free software list, NEuroNet site
     Commercial: NN FAQ commercial software list, 4Thought, AIM, BrainMaker, Clementine,
             INSPECT, MATLAB NN Toolbox, SPSS Neural Connection
 3.2.3 Rule Discovery approach:
     Public domain: Brute, CN2, QM
     Research prototype: DBlearn
     Commercial: Datalogic, DataMariner, Data Surveyor, IDIS, PQ->R
 3.2.4 Genetic Programming approach:
     Commercial: GAAF, FUGA
 3.2.5 Nearest Neighbor approach: Research prototype: PEBLS
 3.2.6 Other approaches: Research Prototype: HCV
     Commercial: Information Harvester, Wizrule
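
As an illustration of the nearest-neighbor approach listed in 3.2.5, the following is a minimal sketch in Python. It is a plain k-nearest-neighbor classifier with Euclidean distance and is not the algorithm used by PEBLS or any other tool listed above; the function name knn_classify, the feature values and the labels are hypothetical.

```python
import math
from collections import Counter


def knn_classify(query, examples, k=3):
    """Label query by majority vote among its k nearest labeled examples."""
    # Sort the labeled examples by Euclidean distance to the query point.
    by_distance = sorted(examples, key=lambda ex: math.dist(query, ex[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data: (feature vector, class label).
examples = [
    ((1.0, 1.0), "low_risk"),
    ((1.2, 0.8), "low_risk"),
    ((0.9, 1.3), "low_risk"),
    ((4.0, 4.2), "high_risk"),
    ((4.5, 3.9), "high_risk"),
    ((3.8, 4.4), "high_risk"),
]

print(knn_classify((1.1, 1.0), examples))
print(knn_classify((4.1, 4.0), examples))
```

No model is built in advance: the stored examples are the model, and all of the work happens at query time. Features on very different scales would normally be rescaled first so that no single feature dominates the distance.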

3.3 Clustering:
     Public Domain: Autoclass C, ECOBWEB, SDISCOVER, SNOB
     Commercial: Autoclass III, COBWEB/3, DataEngine

3.4 Dependency Derivation: A catalog of Software for Belief Networks
     Commercial: Hugin, TETRAD II

3.5 Deviation Detection:
     Public domain: Explora
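
One simple way to picture deviation detection is to flag values that lie far from the mean of a series. The sketch below is a hedged illustration in Python and is not how Explora works; the function name find_deviations, the threshold of two standard deviations and the call counts are assumptions made for the example.

```python
import statistics


def find_deviations(values, threshold=2.0):
    """Return (index, value) pairs lying more than threshold std devs from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values are identical, so nothing deviates
    return [
        (i, v) for i, v in enumerate(values)
        if abs(v - mean) / stdev > threshold
    ]


# Hypothetical daily call counts for one customer (illustrative data).
daily_calls = [102, 98, 101, 97, 103, 99, 100, 250, 101, 98]
print(find_deviations(daily_calls))
```

A single extreme value inflates both the mean and the standard deviation, so robust alternatives based on the median are often preferred on real data.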

3.6 Summarization: DBlearn, Emerald, Recon

3.7 Visualization:
     Commercial: Data Desk, DX: IBM Visualization Data Explorer, NetMAP, PV-Wave, PVE, WinViz

3.8 Statistics:
     Commercial: BBN Cornerstone, MATLAB, PC-MARS, JMP, SAS, SPSS

3.9 Dimensional Analysis:
     Commercial: CrossTarget, Cross/Z, Essbase

3.10 Database Systems:
     Commercial: White Cross

4. DATA MINING PROJECTS
The Pattern Matching & Data Mining project (PMDM) at the University of Helsinki
... studies and develops concepts, models, algorithms, and implementations for three closely interconnected areas.

The data mining research at IBM
... has developed approaches for knowledge discovery in databases, with the prospect of becoming the newest and
most promising application of database technology. The data mining project at the IBM Almaden Research Center
has developed innovative ground-breaking technology to discover useful patterns in gigabytes of data in a short
amount of time and without any "pre-analysis" (or filtering) of data.

The ITERATE system
... comprises (1) a conceptual clustering system that generates stable and cohesive clusters through the ADO-star data
ordering technique and an iterative redistribution strategy, and (2) a knowledge-based equation discovery system that
defines homogeneous contexts using clustering techniques and derives analytical equations for the response variable
under the proper context.

The WinViz system
... is data visualisation software with a revolutionary concept for representing N-dimensional data. It allows data
mining through visualisation and many other ML techniques. The visualisation component of WinViz is currently
complete. Enhancements such as supervised machine learning, a self-clustering module, statistical analysis and time
series analysis are under construction.

The Waikato Environment for Knowledge Analysis (WEKA)
... is a machine learning workbench currently being developed at the University of Waikato. Its purpose is to allow
users to access a variety of machine learning techniques for the purpose of experimentation and comparison using
real world data sets. The workbench currently runs on Sun workstations under X-windows, with machine learning
tools written in a variety of programming languages (C, C++, and LISP). It is being used for problems of machine
learning in agriculture, including cow culling, determining when cows are in estrus, mastitis diagnosis, fertilizer
evaluation, etc.

The University of Ulster Data Mine
... provides information on the research undertaken by the Database Mining Interest Group (UUDMIG) at the
University of Ulster. The Database Mining Interest Group is a collection of Database, Machine Learning,
Mathematics, Statistics, Database Mining and Artificial Intelligence Researchers interested in the field of Database
Mining/ Knowledge Discovery in Databases.

The Knowledge Discovery in Databases (KDD)
... project at GTE Laboratories examines ways to extract useful and interesting knowledge from large GTE databases
containing data on customers, healthcare costs, cellular calls, etc. We have developed several techniques for
intelligent data analysis, including algorithms for classification, summarization, discovery of dependencies, and the
detection of deviations and changes.

Automated Discovery Methods for Science
... We try to identify new tasks to automate from within science (biology, chemistry, and physics so far), automate
them, then look for analogies elsewhere. Task automation relies on a logical analysis, and draws on methods of
heuristic search, combinatorial optimization, and elementary mathematics and statistics.

EXPLORA
... is a discovery system for Apple Macintosh. In this project, discovery tools are developed, prototypically
implemented, and evaluated in practical applications. The main research topics relate to patterns, search strategies,
interestingness, and efficiency.

Handouts:

Swets, J. A., Dawes, R. M., and Monahan, J. [2000] “Better Decisions through Science,”
Scientific American, pp. 82-87.

Deadlines

       January 22                   Homework Problem #0 (web browsing)
       January 29                   Project Proposal
       February 17                  Homework Problem #1 (paper)
       February 19                  Quiz #1 (20 minutes, Chapter 2 of Book)
       February 23                  Homework Problem #2 (peaks function)
       March 16                     Progress Report
       March 23                     Homework #3 (fuzzy – groups)
       April 12                     Homework #4 (neural networks - groups)
       April 22/April 26            Final Presentations




  • 1. Monday March 22, 2004

       DSES-4810-01 Intro to COMPUTATIONAL INTELLIGENCE & SOFT COMPUTING

       Instructor:    Prof. Mark J. Embrechts (x 4009 or 371-4562) (embrem@rpi.edu)
       Office Hours:  Thursday 10-12 am (CII5217), or by appointment.
       Class Time:    Monday/Thursday: 8-9:50 (Amos Eaton 216)
       TEXT:          J. S. Jang, C. T. Sun, E. Mizutani, “Neuro-Fuzzy and Soft Computing,”
                      Prentice Hall, 1996. (1998) ISBN 0-13-261066-3

       LECTURES #16&17: Data Mining with Neural Networks

       1. WHAT IS DATA MINING?

       Data Mining (DM), also known as Knowledge Discovery in Databases (KDD), is an area of common interest to researchers in machine discovery, statistics, databases, knowledge acquisition, machine learning, data visualization, high performance computing, and knowledge-based systems. The rapid growth of data and information has created a need and an opportunity for extracting knowledge from databases, and both researchers and application developers have been responding to that need. KDD applications have been developed for manufacturing, astronomy, biology, finance, insurance, marketing, medicine, and many other fields.

       Data mining is the process of automatically extracting previously unknown and important information from large databases and using it to make crucial business decisions.

       Organizations generate and collect large volumes of data which they use in daily operations, e.g. billing, inventory, etc. The data necessary for each operation is captured and maintained by the corresponding department. Yet despite this wealth of data, many companies have been unable to fully capitalize on its value because information implicit in the data is not easy to discern.

       Competitive business pressures and a desire to leverage existing information technology investments have led many firms to explore the benefits of data mining technology. This technology is designed to help businesses discover hidden information in their data -- information that can help them understand the purchasing behavior of their key customers, detect likely credit card or insurance fraud, predict probable changes in financial markets, etc.

       Data mining consists of a number of operations, each of which is supported by a variety of technologies such as rule induction, neural networks, conceptual clustering, association discovery, etc. In many real-world applications such as marketing analysis, financial analysis, fraud detection, etc., information extraction requires the cooperative use of several data mining operations and techniques.

       2. ISSUES RELATED TO DATA MINING

       Theory and Foundation Issues in DM:
           Data and knowledge representation for DM
           Probabilistic modeling and uncertainty management in DM
           Modeling of structured, unstructured and multimedia data
           Metrics for evaluation of DM results
           Fundamental advances in search, retrieval, and discovery methods
           Definitions, formalisms, and theoretical issues in DM

       Data Mining Methods and Algorithms:
  • 2.     Algorithmic complexity, efficiency and scalability issues in data mining
           Probabilistic and statistical models and methods
           Using prior domain knowledge and re-use of discovered knowledge
           Parallel and distributed data mining techniques
           High dimensional datasets and data preprocessing
           Unsupervised discovery and predictive modeling

       DM Process and Human Interaction:
           Models of the DM process
           Methods for evaluating subjective relevance and utility
           Data and knowledge visualization
           Interactive data exploration and discovery
           Privacy and security

       Applications:
           Data mining systems and data mining tools
           Application of DM in business, science, medicine and engineering
           Application of DM methods for mining knowledge in text, image, audio, sensor, numeric, categorical or mixed format data
           Resource and knowledge discovery using the Internet

       3. DATA MINING TOOLS AND TECHNIQUES

       3.1 Multistrategy Tools:
           Public domain: MOBAL
           Research prototype: DBlearn, Emerald
           Commercial: DataMariner, Darwin, INSPECT, Modelware, Recon

       3.2 Classification:

       3.2.1 Decision-tree approach:
           Public domain: LMDT, MLC++, OC1, SE-Learn, SIPINA-W
           Commercial: AC2, C4.5, Clementine, IND, KATE-tools, Knowledge Seeker, SPSS CHAID, XPERTRule

       3.2.2 Neural network approach:
           Public domain: NN FAQ free software list, NEuroNet site
           Commercial: NN FAQ commercial software list, 4Thought, AIM, BrainMaker, Clementine, INSPECT, MATLAB NN Toolbox, SPSS Neural Connection

       3.2.3 Rule Discovery approach:
           Public domain: Brute, CN2, QM
           Research prototype: DBlearn
           Commercial: Datalogic, DataMariner, Data Surveyor, IDIS, PQ->R

       3.2.4 Genetic Programming approach:
           Commercial: GAAF, FUGA

       3.2.5 Nearest Neighbor approach:
           Research prototype: PEBLS

       3.2.6 Other approaches:
           Research prototype: HCV
           Commercial: Information Harvester, Wizrule

       3.3 Clustering:
           Public domain: Autoclass C, ECOBWEB, SDISCOVER, SNOB
           Commercial: Autoclass III, COBWEB/3, DataEngine

       3.4 Dependency Derivation:
           A catalog of Software for Belief Networks
           Commercial: Hugin, TETRAD II

       3.5 Deviation Detection:
  • 3.     Public domain: Explora

       3.6 Summarization:
           DBlearn, Emerald, Recon

       3.7 Visualization:
           Commercial: Data Desk, DX: IBM Visualization Data Explorer, NetMAP, PV-Wave, PVE, WinViz

       3.8 Statistics:
           Commercial: BBN Cornerstone, MATLAB, PC-MARS, JMP, SAS, SPSS

       3.9 Dimensional Analysis:
           Commercial: CrossTarget, Cross/Z, Essbase

       3.10 Database Systems:
           Commercial: White Cross

       4. DATA MINING PROJECTS

       The Pattern Matching & Data Mining project (PMDM) at the University of Helsinki ... studies and develops concepts, models, algorithms, and implementations for three closely interconnected areas.

       The data mining research at IBM ... has developed approaches for knowledge discovery in databases with the prospect of becoming the newest and most promising application of database technology. The data mining project at the IBM Almaden Research Center has developed innovative ground-breaking technology to discover useful patterns in gigabytes of data in a short amount of time and without any "pre-analysis" (or filtering) of data.

       The ITERATE system ... is (1) a conceptual clustering system that generates stable and cohesive clusters through the ADO-star data ordering technique and an iterative redistribution strategy, and (2) a knowledge-based equation discovery system that defines a homogeneous context using clustering techniques and derives analytical equations for the response variable under the proper context.

       The WinViz system ... is data visualisation software with a revolutionary concept of representing N-dimension data. It allows data mining through visualisation and many other ML techniques. The visualisation component of WinViz is currently completed. Enhancements such as supervised machine learning, a self-clustering module, statistical analysis and time series analysis are under construction.

       The Waikato Environment for Knowledge Analysis (WEKA) ... is a machine learning workbench currently being developed at the University of Waikato. Its purpose is to allow users to access a variety of machine learning techniques for experimentation and comparison using real-world data sets. The workbench currently runs on Sun workstations under X-windows, with machine learning tools written in a variety of programming languages (C, C++, and LISP). It is being used for problems of machine learning in agriculture, including cow culling, determining when cows are in estrus, mastitis diagnosis, fertilizer evaluation, etc.

       The University of Ulster Data Mine ... provides information on the research undertaken by the Database Mining Interest Group (UUDMIG) at the University of Ulster. The Database Mining Interest Group is a collection of Database, Machine Learning, Mathematics, Statistics, Database Mining and Artificial Intelligence researchers interested in the field of Database Mining / Knowledge Discovery in Databases.

       The Knowledge Discovery in Databases (KDD)
  • 4. ... project at GTE Laboratories examines ways to extract useful and interesting knowledge from large GTE databases containing data on customers, healthcare costs, cellular calls, etc. We have developed several techniques for intelligent data analysis, including algorithms for classification, summarization, discovery of dependencies, and the detection of deviations and changes.

       Automated Discovery Methods for Science ... We try to identify new tasks to automate from within science (biology, chemistry, and physics so far), automate them, then look for analogies elsewhere. Task automation relies on a logical analysis, and draws on methods of heuristic search, combinatorial optimization, and elementary mathematics and statistics.

       EXPLORA ... is a discovery system for Apple Macintosh. In this project, discovery tools are developed, prototypically implemented, and evaluated in practical applications. The main research topics relate to patterns, search strategies, interestingness, and efficiency.

       Handouts:
           Swets, J. A., Dawes, R. M., and Monahan, J. [2000] “Better Decisions through Science,” Scientific American, pp. 82-87.

       Deadlines
           January 22           Homework Problem #0 (web browsing)
           January 29           Project Proposal
           February 17          Homework Problem #1 (paper)
           February 19          Quiz #1 (20 minutes, Chapter 2 of Book)
           February 23          Homework Problem #2 (peaks function)
           March 16             Progress Report
           March 23             Homework #3 (fuzzy – groups)
           April 12             Homework #4 (neural networks - groups)
           April 22/April 26    Final Presentations