Data Mining – analyse Bank Marketing Data Set by WEKA.
Author: Mateusz Brzoska
ID: M*********
Supervisor: Daming Shi
24/04/2015
Middlesex University 2015
A thesis submitted in partial fulfilment of the requirements for the degree of Bachelor of Science.
Table of Contents
1. Knowledge Discovery in Databases
2. Data Mining
2.1. Overview
2.2. Data Mining Methods
2.3. WEKA Methods
2.4. The problems of knowledge discovery
3. WEKA Software
4. Bank Marketing Data Set
4.1. Description of Data Set
4.2. Cleaning Data Set
4.3. Visualization of Data Set and Examining Data
4.4. Discovering potentially useful patterns from a data set
5. Conclusion
6. References
Abstract
Our lives, networks and computers are filled with enormous volumes of data. Scientific institutions,
businesses and government agencies spend huge sums on collecting and storing data, yet only a
small amount of these data will ever be used. Data structures are very often too complex to analyse
effectively or too big to manage. In the corporate and business world, customer data
are becoming recognized as a strategic asset. In today's competitive world it is becoming more and
more important to be able to extract the useful knowledge hidden in these data and to act
on that knowledge. The increasing use of computers means that systems generate increasingly large
amounts of data. Data mining, a process used within Decision Support Systems (DSS), deals with
discovering new, interesting and useful patterns and the relationships between them,
in order to solve problems involving large volumes of data. Data are exploited in order to establish
general knowledge about a group rather than knowledge about specific individuals, though pattern
analysis may also be used to recognize anomalous individual behaviour such as criminal activity. Since
data mining is a natural activity to carry out on large data sets, one of its biggest target markets is
the entire data-warehousing, data-mart and decision-support community, which includes professionals
from such industries as manufacturing, telecommunications, retail, health care, transportation and
insurance. In the computer industry, data mining is already among the fastest-growing fields. The greatest
strengths of data mining are reflected in its wide range of techniques and methodologies, which can be
applied to a host of problem sets. Data mining has not only survived but matured and been adapted for
practical use in the business world.
The project studies techniques and methodologies in data mining that can be applied to achieve
specific goals. It will focus on data mining as one of the processes within Decision
Support Systems, one that collects and discovers knowledge from data. It will show the techniques,
algorithms and rules used to achieve a certain goal with the Bank Marketing Data Set. The WEKA software
will be used to show how to analyse the data, and the project will explain the various data mining
techniques it uses.
Aims
1. To study techniques and methodologies in data mining that can be applied to achieve a specific goal:
predicting whether the client will subscribe (yes/no) to a term deposit (variable y).
2. To analyse a data set of interest for clustering, classification, learning dependencies and
prediction, using algorithms such as k-means, soft k-means, and decision trees.
3. To process the data and achieve a satisfactory final result.
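As an illustration of the clustering aim above, the following is a minimal sketch of plain (hard) k-means in Python. It is not the project's WEKA workflow: the function name `kmeans`, the two features (age, balance in thousands) and the six data points are all invented for illustration and are not taken from the Bank Marketing Data Set.

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign(points, centroids):
    """Group each point with its nearest centroid (assignment step)."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, centroids, max_iterations=10):
    """Alternate assignment and centroid update until the centroids stop moving."""
    clusters = assign(points, centroids)
    for _ in range(max_iterations):
        # Update step: each centroid becomes the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
        if updated == centroids:
            break  # converged
        centroids = updated
        clusters = assign(points, centroids)
    return centroids, clusters


# Hypothetical toy data: (age, balance in thousands). Invented for illustration only.
points = [(25, 1.0), (30, 1.5), (28, 0.5), (55, 9.0), (60, 10.0), (58, 11.0)]
initial = [(25, 1.0), (60, 10.0)]  # assumed starting centroids

centroids, clusters = kmeans(points, initial)
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

With these starting centroids the toy points separate into a younger, lower-balance group and an older, higher-balance group. On real data the result depends on the choice of k, the initial centroids and the scaling of the features.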
Objectives
1. To study Knowledge Discovery in Databases (KDD) as the process of discovering useful patterns and
knowledge from data sources.
2. To understand the need for analysis of large, complex, information-rich data sets.
3. To describe the decision-making process based on knowledge gained from data mining.
4. To provide essential information about patterns, techniques and methods.
5. To demonstrate the relevant algorithms for each technique and their operation on data.
6. To prepare results and conclusions.
