ASIET Kalady
Data mining
Techniques
Assignment
Respa Peter
10/8/2013
Data mining
Data mining is the analysis of (often large) observational data sets to find unsuspected
relationships and to summarize the data in novel ways that are both understandable and useful to
the data owner.
Data mining Techniques
In addition to using a particular data mining tool, internal auditors can choose from a variety of
data mining techniques. The most commonly used techniques include artificial neural networks, decision
trees, and the nearest-neighbor method. Each of these techniques analyzes data in different ways:
• Artificial neural networks are non-linear, predictive models that learn through training.
Although they are powerful predictive modeling techniques, some of the power comes at
the expense of ease of use and deployment.
One area where auditors can easily use them is when reviewing records to
identify fraud and fraud-like actions. Because of their complexity, they are better
employed in situations where they can be used and reused, such as reviewing credit card
transactions every month to check for anomalies.
• Decision trees are tree-shaped structures that represent decision sets. These decisions
generate rules, which then are used to classify data. Decision trees are the favored
technique for building understandable models. Auditors can use them to assess, for
example, whether the organization is using an appropriate cost-effective marketing
strategy that is based on the assigned value of the customer, such as profit.
• The nearest-neighbor method classifies dataset records based on similar data in a
historical dataset. Auditors can use this approach to define a document that is interesting
to them and ask the system to search for similar items.
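The nearest-neighbor idea described above can be sketched in a few lines. This is a minimal illustration, not a production method: the function name `nearest_neighbor_label`, the two numeric features, and the sample records are all assumptions made for the example.

```python
import math
from collections import Counter

def nearest_neighbor_label(history, query, k=3):
    """Classify `query` by majority vote among its k closest historical records."""
    # Rank historical records by Euclidean distance to the query point.
    ranked = sorted(history, key=lambda rec: math.dist(rec[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical history: (amount in thousands, hour of day) -> label
history = [
    ((0.2, 10), "normal"),
    ((0.3, 14), "normal"),
    ((0.25, 12), "normal"),
    ((9.5, 3), "suspicious"),
    ((8.7, 2), "suspicious"),
    ((9.9, 4), "suspicious"),
]

print(nearest_neighbor_label(history, (9.0, 3)))   # suspicious
print(nearest_neighbor_label(history, (0.3, 11)))  # normal
```

In practice the features would usually be scaled first, since an unscaled feature with a large range dominates the distance.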
Other commonly used data mining techniques include the following:
• Association
Association (or relation) is probably the best-known, most familiar, and most
straightforward data mining technique. Here, you make a simple correlation between two or
more items, often of the same type, to identify patterns. For example, when tracking
people's buying habits, you might identify that a customer always buys cream when they
buy strawberries, and therefore suggest that the next time they buy strawberries they
might also want to buy cream.
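One common way to quantify such an association is with support and confidence. The sketch below assumes shopping baskets are available as sets of item names; the basket data and the function names are invented for illustration.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Estimated share of baskets with `antecedent` that also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)

# Hypothetical purchase data.
baskets = [
    {"strawberries", "cream"},
    {"strawberries", "cream", "sugar"},
    {"strawberries", "cream", "milk"},
    {"milk", "bread"},
    {"bread", "cream"},
]

print(support(baskets, {"strawberries", "cream"}))       # 0.6
print(confidence(baskets, {"strawberries"}, {"cream"}))  # 1.0
```

A confidence of 1.0 here corresponds to "always buys cream when they buy strawberries". Note that `confidence` divides by zero if the antecedent never occurs, so a real implementation would guard against that case.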
• Classification
Classification is used to build up an idea of the type of customer, item, or object by describing
multiple attributes to identify a particular class. For example, you can easily classify cars
into different types (sedan, 4x4, convertible) by identifying different attributes (number
of seats, car shape, driven wheels). Given a new car, you might assign it to a particular
class by comparing its attributes with the known definitions. You can apply the same
principles to customers, for example by classifying them by age and social group.
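The car example can be sketched as a comparison of attributes against known class definitions. The attribute values below are assumptions chosen for illustration, and counting matching attributes is only one simple way to do the comparison.

```python
# Hypothetical class definitions: attribute values that characterise each car type.
CLASS_DEFINITIONS = {
    "sedan":       {"seats": 5, "shape": "three-box", "driven_wheels": 2},
    "4x4":         {"seats": 5, "shape": "two-box",   "driven_wheels": 4},
    "convertible": {"seats": 2, "shape": "open-top",  "driven_wheels": 2},
}

def classify(car):
    """Return the class whose definition shares the most attribute values with `car`."""
    def matches(definition):
        return sum(car.get(attr) == value for attr, value in definition.items())
    return max(CLASS_DEFINITIONS, key=lambda name: matches(CLASS_DEFINITIONS[name]))

new_car = {"seats": 2, "shape": "open-top", "driven_wheels": 2}
print(classify(new_car))  # convertible
```

When two classes match equally well, `max` returns the first one in the dictionary, so a real classifier would need an explicit tie-breaking rule.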
• Clustering
Clustering is the task of segmenting a diverse group into a number of similar subgroups,
or clusters. What distinguishes clustering from classification is that clustering does not rely
on predefined classes.
The records are grouped together on the basis of self-similarity. Clustering is often done as
a prelude to some other form of data mining.
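To illustrate grouping by self-similarity without predefined classes, here is a minimal one-dimensional k-means sketch. K-means is just one clustering method, chosen here as an assumption; the spend figures and starting centers are invented.

```python
def k_means_1d(values, centers, iterations=10):
    """Assign each number to its nearest center, then move each center to its group's mean."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center when a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters, centers

# Hypothetical yearly spend per customer; no class labels are supplied.
spend = [12, 15, 14, 90, 95, 88, 300, 310]
clusters, centers = k_means_1d(spend, centers=[0, 100, 400])
print(clusters)  # [[12, 15, 14], [90, 95, 88], [300, 310]]
```

The resulting groups could then feed another technique, for example by treating each cluster as a class label for later classification.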
• Prediction
Prediction is a wide topic and runs from predicting the failure of components or
machinery to identifying fraud and even predicting company profits. Used in combination
with the other data mining techniques, prediction involves analyzing trends, classification,
pattern matching, and relation. By analyzing past events or instances, you can make a prediction
about an event.
Using credit card authorization as an example, you might combine decision tree analysis of
individual past transactions with classification and historical pattern matches to identify whether
a transaction is fraudulent. If there is a match between the purchase of flights to the US and
transactions in the US, it is likely that the transactions are valid.
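The flight example amounts to matching a new transaction against past events. A very rough sketch of that single rule follows; the record fields, the `home_country` default, and the function name are hypothetical, and a real fraud model would combine many more signals.

```python
def looks_valid(transaction, history, home_country="GB"):
    """Treat a foreign transaction as plausible if an earlier flight purchase matches its country."""
    if transaction["country"] == home_country:
        return True
    return any(
        past.get("category") == "flight"
        and past.get("destination") == transaction["country"]
        for past in history
    )

# Hypothetical past transactions for one cardholder.
history = [
    {"category": "flight", "destination": "US"},
    {"category": "groceries"},
]

print(looks_valid({"country": "US", "amount": 120}, history))  # True
print(looks_valid({"country": "BR", "amount": 120}, history))  # False
```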
• Sequential patterns
Often used over longer-term data, sequential patterns are a useful method for identifying
trends, or regular occurrences of similar events. For example, with customer data you can
identify that customers buy a particular collection of products together at different times of the
year. In a shopping basket application, you can use this information to automatically suggest that
certain items be added to a basket based on their frequency and past purchasing history.
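One simple way to look for sequential patterns is to count how often one item is bought before another across customers. The sketch below assumes each purchase history is a list in chronological order; the data and the `min_count` threshold are illustrative only.

```python
from collections import Counter
from itertools import combinations

def frequent_sequences(histories, min_count=2):
    """Count ordered pairs (earlier item, later item) and keep those seen in enough histories."""
    counts = Counter()
    for purchases in histories:
        # combinations() preserves input order, so each pair is (earlier, later);
        # the set ensures a pair is counted at most once per customer.
        counts.update(set(combinations(purchases, 2)))
    return {pair: n for pair, n in counts.items() if n >= min_count}

# Hypothetical purchase histories, oldest purchase first.
histories = [
    ["tent", "sleeping bag", "stove"],
    ["tent", "stove"],
    ["sleeping bag", "tent", "stove"],
]

print(frequent_sequences(histories))
```

With this data, a tent followed by a stove occurs in all three histories and a sleeping bag followed by a stove in two, so a basket application could suggest a stove to customers who have bought a tent.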
• Decision trees
Related to most of the other techniques (primarily classification and prediction), the
decision tree can be used either as a part of the selection criteria, or to support the use and
selection of specific data within the overall structure. Within the decision tree, you start with a
simple question that has two (or sometimes more) answers. Each answer leads to a further
question to help classify or identify the data so that it can be categorized, or so that a prediction
can be made based on each answer.
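The question-and-answer structure of a decision tree can be sketched as nested dictionaries. The questions, thresholds, and outcome labels below are hypothetical; in practice a tree is usually learned from data rather than written by hand.

```python
# Each internal node asks a yes/no question; each leaf is a final category.
# The questions and thresholds are hypothetical, chosen only for illustration.
TREE = {
    "question": lambda t: t["amount"] > 1000,
    "yes": {
        "question": lambda t: t["foreign"],
        "yes": "review",
        "no": "approve",
    },
    "no": "approve",
}

def decide(node, record):
    """Walk from the root, following the branch for each answer, until a leaf is reached."""
    while isinstance(node, dict):
        node = node["yes"] if node["question"](record) else node["no"]
    return node

print(decide(TREE, {"amount": 2500, "foreign": True}))  # review
print(decide(TREE, {"amount": 40, "foreign": True}))    # approve
```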
Each of these approaches brings different advantages and disadvantages that need to be
considered prior to their use. Neural networks, which are difficult to implement, require all input
and resultant output to be expressed numerically, thus needing some sort of interpretation
depending on the nature of the data-mining exercise.
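The need to express inputs and outputs numerically can be illustrated with one-hot encoding, a common convention that is assumed here rather than taken from the text. The category list and function names are made up for the example.

```python
def one_hot(value, categories):
    """Encode a categorical value as a list with a single 1.0 at its category position."""
    return [1.0 if value == c else 0.0 for c in categories]

def decode(scores, categories):
    """Interpret numeric outputs by picking the category with the highest score."""
    return categories[max(range(len(scores)), key=scores.__getitem__)]

PAYMENT_TYPES = ["card", "cash", "transfer"]  # hypothetical categories

print(one_hot("cash", PAYMENT_TYPES))          # [0.0, 1.0, 0.0]
print(decode([0.1, 0.2, 0.7], PAYMENT_TYPES))  # transfer
```

The `decode` step is the "interpretation" side: the network produces numbers, and something has to map them back to a meaningful answer.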
The decision tree technique is the most commonly used methodology, because it is simple and
straightforward to implement. Finally, the nearest-neighbor method relies more on linking similar items
and, therefore, works better for extrapolation rather than predictive enquiries.
A good way to apply advanced data mining techniques is to have a flexible and interactive data
mining tool that is fully integrated with a database or data warehouse. Using a tool that operates
outside of the database or data warehouse is not as efficient.
Regardless of the technique used, the real value behind data mining is modeling: the process
of building a model based on user-specified criteria from already captured data. Once a model is built, it
can be used in similar situations where an answer is not known.
For example, an organization looking to acquire new customers can create a model of its ideal
customer that is based on existing data captured from people who previously purchased the product.
The model then is used to query data on prospective customers to see if they match the profile.
Modeling also can be used in audit departments to predict the number of auditors required to
undertake an audit plan based on previous attempts and similar work.
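The ideal-customer example can be sketched as building a profile from existing buyers and then querying prospects against it. The fields (`age`, `region`), the profile shape, and the sample records are all assumptions; a real model would typically be statistical rather than a hard range check.

```python
def build_profile(customers):
    """Summarise past buyers as an age range and the set of regions they came from."""
    ages = [c["age"] for c in customers]
    return {
        "min_age": min(ages),
        "max_age": max(ages),
        "regions": {c["region"] for c in customers},
    }

def matches(profile, prospect):
    """Check whether a prospect falls inside the profile built from past buyers."""
    return (profile["min_age"] <= prospect["age"] <= profile["max_age"]
            and prospect["region"] in profile["regions"])

# Hypothetical data captured from people who previously purchased the product.
buyers = [
    {"age": 31, "region": "north"},
    {"age": 38, "region": "north"},
    {"age": 45, "region": "east"},
]
prospects = [
    {"name": "Ana", "age": 40, "region": "north"},
    {"name": "Ben", "age": 22, "region": "north"},
    {"name": "Cy", "age": 35, "region": "south"},
]

profile = build_profile(buyers)
print([p["name"] for p in prospects if matches(profile, p)])  # ['Ana']
```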
