Knowledge Acquisition In
Decision Making (SQIT 3033)
Izwan Nizal Mohd Shaharanee
SQS 4017/ 6866
nizal@uum.edu.my
izwan.nizal@gmail.com
Course Objective
• To introduce knowledge about data mining and data warehousing
• To evaluate and understand several data mining techniques
• To enhance data mining skills through the analysis of business problems
• To be able to apply the commonly used functions of SAS Enterprise Miner and WEKA to solve data mining problems
• To develop skills in data mining modeling and data analysis with SAS Enterprise Miner and WEKA
Course Content
• Intro to Knowledge Acquisition, also known as knowledge discovery (3 hours)
• Knowledge Discovery Process (4 hours)
• Pre-processing data (5 hours)
• Predictive Modeling (10 hours)
  • Decision Tree, Regression, Neural Network, Rough Set
• Evaluation and Implementation (6 hours)
• Descriptive Modeling (7 hours)
  • Clustering, Association Rules
• Data mining ethics (1.5 hours)
• Project presentation
Course Evaluation
• Coursework: 60%
  • Assignments
  • Case study + presentations
  • Project + poster presentations
  • Midterm? Quizzes?
  • Class participation!
• Final exam: 40%
Prerequisites
• A basic statistics course such as SQQS2023 Business Statistics, plus knowledge of a programming language, SAS, databases, spreadsheets, and Web 2.0
• Passion for computer applications
• Willingness to take on challenges
• A sincere heart to understand God’s infinite knowledge
• Attendance is compulsory (no skipping class, “tuang kelas”)
• Mind your gadgets. Please respect others
Timetable
Please introduce yourself..
http://padlet.com/wall/8yly4q2yu8
Facebook Group
https://www.facebook.com/dataharvester2.0
Youtube Channel + Vimeo Video
izwan nizal
http://www.theage.com.au/it-pro/business-it/data-miners-find-theres-gold-in-them-tharfiles-20120511-1yi3q.html
The Age of Big Data
• “The BBC documentary follows people who mine Big Data, including LAPD police officers who use data to predict crime, a London scientist/trader who makes millions with math, and a South African astronomer who wants to catalog the entire cosmos.”
• “Data Scientist” is the sexiest job of the 21st century. The Harvard Business Review made this claim last October, and it seems that everyone (including your grandmother) has been repeating it ever since.
Why Knowledge Acquisition?
• Data explosion (a tremendous amount of data is available, plus cloud computing)
• Data is being warehoused
• Computing power (bionic skin?)
• Competitive pressure
• Hard disks nowadays have capacities of more than 1 TB
What Is Knowledge Acquisition?
• Also known as data mining, knowledge discovery, knowledge extraction, information discovery, information harvesting, etc.
• The process of discovering useful information, hidden patterns, or rules in large quantities of data (nontrivial, previously unknown)
• By automatic or semiautomatic means
• At this scale, finding such patterns by manual methods is impractical
Traditional Approaches
• Traditional database queries: access a database using a well-defined query language such as SQL
• The query output consists of data from the database
• The output is usually a subset of the database
[Diagram: SQL -> DBMS -> DB]
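To make the traditional approach concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, columns, and rows are invented for illustration and are not taken from the course. The query result is simply a subset of the rows already stored, so nothing new is discovered.

```python
import sqlite3

# Hypothetical in-memory database; the table and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (matric TEXT, program TEXT, cgpa REAL)")
rows = [
    ("A001", "SQS", 3.60),
    ("A002", "SQS", 2.45),
    ("A003", "IT", 3.10),
    ("A004", "SQS", 3.85),
]
conn.executemany("INSERT INTO students VALUES (?, ?, ?)", rows)

# A well-defined query: the answer is a subset of what is already stored.
query = (
    "SELECT matric, cgpa FROM students "
    "WHERE program = ? AND cgpa >= ? ORDER BY matric"
)
result = conn.execute(query, ("SQS", 3.5)).fetchall()
conn.close()

print(result)
```

The query can only return what someone thought to ask for, which is the contrast the later slides draw with data mining.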
Disciplines Of Data Mining
[Diagram: Data Mining at the center, drawing on Database Systems, Machine Learning, Algorithms, Statistics, Visualization, and Information Retrieval]
Data Mining Model & Task
Data Mining
• Predictive
  • Classification
  • Regression
  • Time Series Analysis
  • Prediction
• Descriptive
  • Clustering
  • Summarization
  • Association Rules
  • Sequence Discovery
Try to relate this to your previous knowledge
How does data mining differ from forecasting or prediction?
• Are they similar?
Predictive Model
• Makes predictions about values of data using known results found from different data
• Or based on the use of other historical data
• Examples: credit card fraud, breast cancer early warning, terrorist acts, tsunamis, etc.
• In films: Ghost Protocol, Minority Report, Eagle Eye
Predictive Model
• Performs inference on the current data to make predictions
• We know what to predict (based on historical data)
• Never 100% accurate
• Concentrates on the input-output relationship (x, f(x))
• Typical questions
  • Which customers are likely to buy this product in the next four months?
  • What kinds of transactions are likely to be fraudulent?
  • Who is likely to drop this course?
Predictive Model
[Figure: scatter plot of profit (RM) against months; the points marked x are current data, and the point marked O with a question mark is the future value to be predicted]
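The idea in the figure can be sketched as fitting a straight line to the current points and extending it to a future month. The profit figures below are synthetic and perfectly linear so the result is easy to check by hand; real data is noisy, and the function name fit_line is a hypothetical helper, not part of any course tool.

```python
# Synthetic monthly profit figures (RM); invented for illustration only.
months = [1, 2, 3, 4, 5, 6]
profit = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(months, profit)

# Predict the unknown future point (the "O ?" in the figure) for month 7.
predicted = slope * 7 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, month 7 -> RM {predicted:.2f}")
```

A straight line is only one possible choice of f(x); the course covers other predictive models such as decision trees and neural networks.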
Descriptive Model
• Identifies patterns or relationships in data
• Serves as a way to explore the properties of the data examined, not to predict new properties
• Always requires a domain expert
• Examples:
  • Segmenting marketing areas
  • Profiling student performance
  • Profiling Google Play / Apple App Store customers
Descriptive Model
• Discovers new patterns inside the data
• We may not have any idea what the data looks like
• Explores the properties of the data examined
• Patterns at various granularities (e.g., student: university -> faculty -> program -> major)
• Typical questions
  • What is in the data?
  • What does it look like?
  • What does the data suggest for advertising to customer groups?
Descriptive Model
[Figure: scatter plot of results against major; the points (marked x, o, and y) fall into three clusters labeled Group 1, Group 2, and Group 3]
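Grouping of the kind shown in the figure can be sketched with a plain k-means loop. This is a minimal illustration, not the course's method: the points are invented, the initial centroids are picked by hand to keep the run deterministic, and math.dist assumes Python 3.8 or later.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: math.dist(p, centroids[i])
            )
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster;
        # keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster))
            if cluster
            else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Invented 2-D points forming three visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8), (1, 8), (2, 9), (1, 9)]
initial = [points[0], points[3], points[6]]  # hand-picked starting centroids

centroids, clusters = kmeans(points, initial)
for number, cluster in enumerate(clusters, start=1):
    print(f"Group {number}: {cluster}")
```

No target value is supplied anywhere; the groups emerge from the data alone, which is what distinguishes this from the predictive case.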
View Of DM
• Data to be mined
  • Data warehouse, WWW, time series, textual, spatial, multimedia, transactional
• Knowledge to be mined
  • Classification, prediction, summarization, trend
• Techniques utilized
  • Database, machine learning, visualization, statistics
• Applications adapted
  • Marketing, demographic segmentation, stock analysis
DM In Action
• Medical applications (clinical diagnosis, drug analysis)
• Business (marketing segmentation and strategies, insolvency prediction, loan risk assessment)
• Education (online learning)
• Internet (search engines)
• Etc.
Data Mining Methodology
Hypothesis Testing vs Knowledge Discovery
• Hypothesis Testing
  • Top-down approach
  • Attempts to substantiate or disprove a preconceived idea
• Knowledge Discovery
  • Bottom-up approach
  • Starts with the data and tries to get it to tell us something we didn’t already know
Data Mining Methodology
Hypothesis Testing
• Generate good ideas
• Determine what data would allow these hypotheses to be tested
• Locate the data
• Prepare the data for analysis
• Build computer models based on the data
• Evaluate the computer models to confirm or reject the hypotheses
Data Mining Methodology
Knowledge Discovery: Directed
• Identify sources of pre-classified data
• Prepare the data for analysis
• Select appropriate KD techniques based on the data characteristics and the data mining goal
• Divide the data into training, testing, and evaluation sets
• Use the training dataset to build the model
• Tune the model by applying it to the test dataset
• Take action based on the data mining results
• Measure the effect of the action taken
• Restart the DM process, taking advantage of new data generated by the action taken
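The step of dividing data into training, testing, and evaluation sets can be sketched as a shuffled three-way split. The 60/20/20 proportions, the fixed seed, and the function name split_dataset are assumptions made for illustration; the slides do not specify them.

```python
import random


def split_dataset(records, train_pct=60, test_pct=20, seed=42):
    """Shuffle a copy of the records and split it into three disjoint sets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_test = len(shuffled) * test_pct // 100
    training = shuffled[:n_train]
    testing = shuffled[n_train:n_train + n_test]
    evaluation = shuffled[n_train + n_test:]  # whatever remains
    return training, testing, evaluation


records = list(range(100))  # stand-in for 100 pre-classified records
training, testing, evaluation = split_dataset(records)
print(len(training), len(testing), len(evaluation))
```

Keeping the evaluation set untouched until the end gives an estimate of model quality that is not biased by the tuning done on the test set.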
Data Mining Methodology
Knowledge Discovery: Undirected
• Identify available data sources
• Prepare the data for analysis
• Select appropriate undirected KD techniques based on the data characteristics and the data mining goal
• Use the selected technique to uncover hidden structure in the data
• Identify potential targets for directed KD
• Generate new hypotheses to test
Revision: Two Approaches In Data Mining
Data Mining
• Predictive (predict future values)
  • Classification
  • Regression
  • Time Series Analysis
  • Prediction
• Descriptive (define relationships among data)
  • Clustering
  • Summarization
  • Association Rules
  • Sequence Discovery
Knowledge Discovery Process
1.0 Selection
• The data needed for the data mining process may be obtained from many different and heterogeneous data sources
• Examples
  • Business transactions
  • Scientific data
  • Video and pictures
  • UUM student database
Knowledge Discovery Process
2.0 Pre-processing
• Main idea: to ensure that the data is clean (high-quality data)
• The data to be used by the process may contain incorrect or missing values
• There may be anomalous data from multiple sources involving different data types and metrics
• Erroneous data may be corrected or removed, whereas missing data must be supplied or predicted (often using data mining tools)
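A minimal sketch of two common cleaning steps follows: dropping exact duplicate records and filling missing values. Mean imputation is used here only because it is the simplest way to supply a missing value; it is an assumption for illustration, and the records and field names are invented.

```python
from statistics import mean

# Hypothetical raw records: (matric, grade); None marks a missing grade.
raw = [
    ("A001", 3.0),
    ("A002", None),
    ("A003", 4.0),
    ("A001", 3.0),  # exact duplicate of the first record
    ("A004", 2.0),
]


def clean(records):
    """Drop exact duplicates, then fill missing grades with the mean of the known grades."""
    seen = set()
    unique = []
    for rec in records:
        if rec not in seen:
            seen.add(rec)
            unique.append(rec)
    known = [grade for _, grade in unique if grade is not None]
    fill = mean(known)
    return [(m, grade if grade is not None else fill) for m, grade in unique]


cleaned = clean(raw)
print(cleaned)
```

A predictive model trained on the complete records could replace the mean as the source of the filled-in value, which is the "predicted" option the slide mentions.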
Knowledge Discovery Process
3.0 Transformation
• Data from different sources must be converted into a common format for processing
• Some data may be encoded or transformed into more usable formats
• Examples: data cleaning, data integration, data transformation, data reduction, and data discretization
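Two of the transformations named above can be sketched briefly: rescaling numeric values to a common range, and discretizing them into labeled bands. The marks, cut points, and band labels are invented for illustration, and min_max assumes the values are not all identical.

```python
def min_max(values):
    """Rescale values linearly to the range [0, 1] (assumes max > min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def discretize(value, cut_points, labels):
    """Return the label of the first bin whose upper cut point exceeds the value."""
    for cut, label in zip(cut_points, labels):
        if value < cut:
            return label
    return labels[-1]  # at or above the last cut point


marks = [40, 55, 70, 85, 100]
scaled = min_max(marks)
bands = [discretize(m, [50, 80], ["low", "medium", "high"]) for m in marks]

print(scaled)
print(bands)
```

Scaling matters for techniques such as neural networks and distance-based clustering, where attributes on larger numeric ranges would otherwise dominate.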
Knowledge Discovery Process
4.0 Data Mining
• Main idea: to use intelligent methods to extract patterns and knowledge from the database
• This step applies algorithms to the transformed data to generate the desired results
• The heart of the KD process (where unknown patterns are revealed)
• Examples of algorithms: Regression (classification, prediction), Neural Networks (prediction, classification, clustering), Apriori Algorithm (association rules), K-Means (clustering), K-Nearest Neighbor (classification), Decision Tree (classification), Instance-Based Learning (classification)
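As one small illustration of this step, the sketch below computes support and confidence, the two measures that association rule mining relies on. It is not a full implementation of the Apriori algorithm, and the market-basket transactions are invented.

```python
# Hypothetical market-basket transactions; items are illustrative only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(both) / support(antecedent)."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


s = support({"bread", "milk"}, transactions)
c = confidence({"bread"}, {"milk"}, transactions)
print(f"support(bread, milk) = {s:.2f}")
print(f"confidence(bread -> milk) = {c:.2f}")
```

Apriori builds on these measures by growing candidate itemsets level by level and discarding any whose support falls below a chosen threshold.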
Knowledge Discovery Process
5.0 Interpretation/Evaluation
• How the data mining results are presented to the users is extremely important because the usefulness of the results depends on it
• Examples of visualization techniques:
  • Graphical
  • Geometric
  • Icon-based
  • Pixel-based
  • Hierarchical
  • Hybrid
Case Study: Predicting SQS Final Year Students’ Performance
[Flow diagram of the knowledge discovery process applied to a student database]
• Student database (contains 30,000 records): academics, activities
• Selection: selected academic records {matric, PMK, grades}, only 2,000 records (contains incomplete records, etc.)
• Pre-processing: clean records (replace missing values, remove duplicates)
• Transformation: convert values to numerical form for use with neural networks
• Data mining: generated model (pattern for prediction), Y = w1x1 + w2x2 + b1
• Interpretation & evaluation: testing result 90% correct, so accept the model
• Knowledge (apply model)
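The last two stages of the case study can be sketched as evaluating the linear model Y = w1x1 + w2x2 + b1 and checking test accuracy before accepting it. The weights, inputs, labels, and the use of 90% as an acceptance threshold are all assumptions made for illustration; the slide only reports that a 90% correct test result led to accepting the model.

```python
def predict(x1, x2, w1, w2, b1):
    """Evaluate the linear model Y = w1*x1 + w2*x2 + b1."""
    return w1 * x1 + w2 * x2 + b1


def accuracy(predicted_labels, actual_labels):
    """Share of test records whose predicted label matches the actual label."""
    correct = sum(p == a for p, a in zip(predicted_labels, actual_labels))
    return correct / len(actual_labels)


# Hypothetical weights and inputs, chosen so the arithmetic is easy to follow.
y = predict(x1=3.0, x2=2.0, w1=0.5, w2=0.25, b1=1.0)

# Hypothetical test outcome: 9 of 10 predictions match the actual labels.
predicted_labels = ["pass"] * 9 + ["fail"]
actual_labels = ["pass"] * 10
acc = accuracy(predicted_labels, actual_labels)
accept_model = acc >= 0.9  # assumed acceptance threshold

print(y, acc, accept_model)
```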
Assignment 1
• Group assignment: you may be selected (randomly) to present your answer (2 minutes max)
• Discuss how prediction/forecasting relates to your life, or any issues related to prediction/forecasting that interest you
• You may discuss:
  • An appropriate example, e.g., how a weather forecast can determine your daily exercise plan
  • How is it done?
• Minimum 1 page
• Due date: 18 September 2013

Chapter 1: Introduction to Data Mining

  • 1. Knowledge Acquisition In Decision Making (SQIT 3033) Izwan Nizal Mohd Shaharanee SQS 4017/ 6866 nizal@uum.edu.my izwan.nizal@gmail.com
  • 2. Course Objective      To introduce :: knowledge about data mining and data warehouse To evaluate and understand several data mining techniques To enhance skill on data mining through analysis problem in business Being able to apply the commonly used functions of SAS Enterprise Miner and WEKA to solve data mining problems Developing the skills of data mining modeling and data analysis with SAS Enterprise Miner and WEKA
  • 3. Course Content     Intro to Knowledge Acquisition aka ~knowledge discovery~ (3 hours) Knowledge Discovery Process (4 hours) Pre-processing data (5 hours) Predictive Modeling (10 hours)    Evaluation And Implementation (6 hours) Descriptive Modeling (7 hours)    Decision Tree, Regression, Neural Network, Rough Set Clustering, Association Rules Data mining ethics (1.5 hours) PROJECT PRESENTATION
  • 4. Course Evaluation  Assignments  Case study + Presentations  Project + Poster Presentations  Mid Term ? Quizzes ?  Class PARTICIPATION !!  Final Exam 40% 60%
  • 5. PreRequisites       A “Basic statistics course such as SQQS2023”Bussiness Statistical”+” programming language knowledge”+“SAS knowledge”+”Database”+ “spreadsheet+ web 2.0” Passion in computer applications Dare to take the challenges Have a sincere heart to understand infinite God’s knowledge Attendance is compulsory (no freely “tuang kelas”) Behave your “gadget". Please respect others
  • 7. Please introduce yourself.. http://padlet.com/wall/8yly4q2yu8 Facebook Group https://www.facebook.com/dataharvester2.0 Youtube Channel + Vimeo Video izwan nizal
  • 9. The Age of Big Data: “The BBC documentary follows people who mine Big Data, including LAPD police officers who use data to predict crime, a London scientist/trader who makes millions with math, and a South African astronomer who wants to catalog the entire cosmos.” “Data Scientist” is the sexiest job of the 21st century. The Harvard Business Review made this claim last October and it seems that everyone (including your grandmother) has been repeating it ever since.
  • 10. Why Knowledge Acquisition? Data explosion (tremendous amount of data available + cloud computing). Data is being warehoused. Computing power (bionic skin?). Competitive pressure. Hard disks nowadays have capacities of more than 1 TB.
  • 11. What is Knowledge Acquisition? Also known as data mining, knowledge discovery, knowledge extraction, information discovery, information harvesting, etc. The process of discovering useful information, hidden patterns or rules in large quantities of data (nontrivial, previously unknown). Done by automatic or semiautomatic means. At this scale it is impractical to find such patterns manually.
  • 12. Traditional Approaches: Traditional database queries access a database using a well-defined query, for example in SQL. The query output consists of data from the database. The output is usually a subset of the database. [Diagram: SQL -> DBMS -> DB]
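
To make the traditional approach concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `students` table, its columns and its rows are hypothetical and only illustrate that a well-defined query returns a subset of what is already stored.

```python
import sqlite3

# In-memory database with a hypothetical "students" table (illustrative rows only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (matric TEXT, program TEXT, cgpa REAL)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [
        ("A001", "Decision Science", 3.45),
        ("A002", "Statistics", 2.80),
        ("A003", "Decision Science", 3.90),
    ],
)

# A well-defined query: the result is a subset of the stored rows.
query = "SELECT matric, cgpa FROM students WHERE cgpa >= ? ORDER BY matric"
high_achievers = conn.execute(query, (3.0,)).fetchall()
print(high_achievers)
conn.close()
```

The query can only return rows that were explicitly asked for; it does not surface patterns nobody thought to ask about, which is the gap data mining tries to fill.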
  • 13. Disciplines Of Data Mining [Diagram: Data Mining at the centre, drawing on Database Systems, Machine Learning, Algorithms, Statistics, Visualization and Information Retrieval]
  • 14. Data Mining Model & Task: Data mining divides into Predictive (Classification, Regression, Time Series Analysis, Prediction) and Descriptive (Clustering, Summarization, Association Rules, Sequence Discovery).
  • 15. Try to relate this to your previous knowledge. Hmmm… how does data mining differ from forecasting or prediction? Are they similar?
  • 16. Predictive Model: Makes predictions about values of data using known results found from different data, or based on other historical data. Examples: credit card fraud, breast cancer early warning, terrorist acts, tsunamis, etc. (In film: Ghost Protocol, Minority Report, Eagle Eye.)
  • 17. Predictive Model: Performs inference on the current data to make predictions. We know what to predict (based on historical data). Never 100% accurate. Concentrates on the input-output relationship (x, f(x)). Typical questions: Which customers are likely to buy this product in the next four months? What kinds of transactions are likely to be fraudulent? Who is likely to drop this paper?
  • 18. Predictive Model [Figure: scatter plot of profit (RM) against months; current data points are marked x, and an unknown future data point is marked "O ?"]
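
The input-output idea (x, f(x)) behind the profit-versus-months picture can be sketched with an ordinary least squares line. This is only one possible f(x), chosen for brevity; the monthly figures below are made up for illustration and are not taken from the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative figures only: monthly profit (RM) for months 1..6 ("current data").
months = [1, 2, 3, 4, 5, 6]
profit = [100.0, 120.0, 140.0, 160.0, 180.0, 200.0]

slope, intercept = fit_line(months, profit)

# "Future data": apply the fitted relationship to a month not yet observed.
forecast_month_7 = slope * 7 + intercept
print(f"f(x) = {slope:.1f} * x + {intercept:.1f}; month 7 forecast: {forecast_month_7:.1f}")
```

Real profit data would not lie exactly on a line, which is why a predictive model is never 100% accurate.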
  • 19. Descriptive Model: Identifies patterns or relationships in data. Serves as a way to explore the properties of the data examined, not to predict new properties. Always requires a domain expert. Examples: segmenting marketing areas; profiling student performance; profiling Google Play / Apple apps customers.
  • 20. Descriptive Model: Discovers new patterns inside the data. We may not have any idea what the data looks like. Explores the properties of the data examined. Patterns appear at various granularities (e.g. student: university -> faculty -> program -> major). Typical questions: What is the data? What does it look like? What does the data suggest for advertising to a group of customers?
  • 21. Descriptive Model Results [Figure: scatter plot of points marked x, y and o that form three clusters labelled Group 1, Group 2 and Group 3; axis label: major]
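
A grouping like the one in the figure can be produced by a clustering algorithm. The sketch below is a plain k-means written for illustration; the nine 2-D points and the choice of starting centroids are assumptions, not data from the slides.

```python
import math


def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means: alternate between assigning points and updating centroids."""
    centroids = list(initial_centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Illustrative points forming three visually separate groups.
points = [
    (1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
    (8.0, 8.0), (9.0, 9.0), (8.5, 9.5),
    (1.0, 9.0), (2.0, 8.0), (1.5, 9.5),
]
labels, centroids = kmeans(points, [points[0], points[3], points[6]])
print(labels)
```

No target column is supplied: the groups come from the data alone, and a domain expert still has to decide what each group means.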
  • 22. View Of DM: Data to be mined: data warehouse, WWW, time series, textual, spatial, multimedia, transactional. Knowledge to be mined: classification, prediction, summarization, trend. Techniques utilized: database, machine learning, visualization, statistics. Applications adapted: marketing, demographic segmentation, stock analysis.
  • 23. DM In Action: Medical applications (clinical diagnosis, drug analysis). Business (marketing segmentation & strategies, insolvency prediction, loan risk assessment). Education (online learning). Internet (search engines). Etc.
  • 24. Data Mining Methodology: Hypothesis Testing vs Knowledge Discovery. Hypothesis Testing: top-down approach; attempts to substantiate or disprove a preconceived idea. Knowledge Discovery: bottom-up approach; starts with the data and tries to get it to tell us something we didn't already know.
  • 25. Data Mining Methodology: Hypothesis Testing: (1) Generate good ideas. (2) Determine what data would allow these hypotheses to be tested. (3) Locate the data. (4) Prepare the data for analysis. (5) Build computer models based on the data. (6) Evaluate the computer models to confirm or reject the hypotheses.
  • 26. Data Mining Methodology: Knowledge Discovery, Directed: (1) Identify sources of pre-classified data. (2) Prepare the data for analysis. (3) Select appropriate KD techniques based on the data characteristics and the data mining goal. (4) Divide the data into training, testing and evaluation sets. (5) Use the training dataset to build the model. (6) Tune the model by applying it to the test dataset. (7) Take action based on the data mining results. (8) Measure the effect of the action taken. (9) Restart the DM process, taking advantage of new data generated by the action taken.
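
The step of dividing the data into training, testing and evaluation sets can be sketched as follows. The 60/20/20 proportions, the fixed seed and the helper name `split_dataset` are assumptions made for this example; the slides do not prescribe them.

```python
import random


def split_dataset(records, train=0.6, test=0.2, seed=42):
    """Shuffle records and split them into training, testing and evaluation sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_test],
        shuffled[n_train + n_test:],  # whatever remains is held out for evaluation
    )


records = list(range(100))  # stand-in for 100 pre-classified records
training, testing, evaluation = split_dataset(records)
print(len(training), len(testing), len(evaluation))
```

Keeping the three sets disjoint matters: a model tuned and judged on the same records it was built from will look better than it really is.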
  • 27. Data Mining Methodology: Knowledge Discovery, Undirected: (1) Identify available data sources. (2) Prepare the data for analysis. (3) Select appropriate undirected KD techniques based on the data characteristics and the data mining goal. (4) Use the selected technique to uncover hidden structure in the data. (5) Identify potential targets for directed KD. (6) Generate new hypotheses to test.
  • 28. Revision: Two Approaches In Data Mining. Predictive (predict the future value): Classification, Regression, Time Series Analysis, Prediction. Descriptive (define relationships among data): Clustering, Summarization, Association Rules, Sequence Discovery.
  • 31. Knowledge Discovery Process: 1.0 Selection: The data needed for the data mining process may be obtained from many different and heterogeneous data sources. Examples: business transactions, scientific data, video and pictures, UUM student database.
  • 33. Knowledge Discovery Process: 2.0 Pre-processing: Main idea: to ensure that the data is clean (high-quality data). The data to be used by the process may have incorrect or missing data. There may be anomalous data from multiple sources involving different data types and metrics. Erroneous data may be corrected or removed, whereas missing data must be supplied or predicted (often using data mining tools).
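
A minimal sketch of two common pre-processing steps, removing replicated records and supplying missing values. Filling a missing grade with the mean of the known grades is just one simple choice, and the record layout (`matric`, `grade`) is hypothetical.

```python
def clean(records):
    """Drop exact duplicates, then replace missing grades (None) with the mean grade."""
    unique = []
    for record in records:
        if record not in unique:
            unique.append(record)
    known = [r["grade"] for r in unique if r["grade"] is not None]
    mean_grade = sum(known) / len(known)
    return [
        {**r, "grade": mean_grade if r["grade"] is None else r["grade"]}
        for r in unique
    ]


# Illustrative records only.
raw = [
    {"matric": "A001", "grade": 80.0},
    {"matric": "A002", "grade": None},   # missing value
    {"matric": "A003", "grade": 60.0},
    {"matric": "A001", "grade": 80.0},   # replicated record
]
cleaned = clean(raw)
print(cleaned)
```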
  • 34. Knowledge Discovery Process: 3.0 Transformation: Data from different sources must be converted into a common format for processing. Some data may be encoded or transformed into more usable formats. Examples: data cleaning, data integration, data transformation, data reduction and data discretization.
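
Two typical transformations, sketched under simple assumptions: min-max scaling puts numeric values on a common 0..1 range, and discretization maps a number to a labelled band. The cut points and band names are illustrative only.

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest becomes 0.0 and the largest 1.0."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def discretize(value, cut_points, labels):
    """Return the label of the first band whose upper cut point exceeds value.

    labels must contain one more entry than cut_points (the last band is open-ended).
    """
    for cut, label in zip(cut_points, labels):
        if value < cut:
            return label
    return labels[-1]


marks = [50, 75, 100]
scaled = min_max_scale(marks)
bands = [discretize(m, [60, 80], ["low", "medium", "high"]) for m in marks]
print(scaled, bands)
```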
  • 35. Knowledge Discovery Process: 4.0 Data Mining: Main idea: to use intelligent methods to extract patterns and knowledge from the database. This step applies algorithms to the transformed data to generate the desired results. It is the heart of the KD process (where unknown patterns are revealed). Examples of algorithms: Regression (classification, prediction), Neural Networks (prediction, classification, clustering), Apriori Algorithm (association rules), K-Means (clustering), K-Nearest Neighbor (classification), Decision Tree (classification), Instance-Based Learning (classification).
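
As a small illustration of the association-rule entry in the list above, the sketch below computes support and confidence, the two measures that association-rule miners such as Apriori rely on. It is not the Apriori search itself, and the shopping baskets are made up.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing antecedent, the fraction also containing consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Illustrative market baskets only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```

A rule such as "bread -> milk" is usually reported only when both measures clear thresholds chosen by the analyst.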
  • 36. Knowledge Discovery Process: 5.0 Interpretation/Evaluation: How the data mining results are presented to users is extremely important because the usefulness of the results depends on it. Examples: graphical, geometric, icon-based, pixel-based, hierarchical, hybrid.
  • 37. Case Study: Predicting SQS Final Year Students' Performance. Selection: from the student database {contains 30,000 records; academics, activities}, the selected academic records are {matric, PMK, grades}, only 2,000 records (containing incomplete records etc.). Pre-processing: clean the records {replace missing values, remove replicated records}. Transformation: neural networks are used, so transform the data into numerical form. Data mining: generated model, a pattern for prediction: Y = w1x1 + w2x2 + b1. Interpretation & evaluation: testing result 90% correct -> accept the model. Knowledge: apply the model.
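
The model form Y = w1x1 + w2x2 + b1 and the "90% correct, accept model" check from the case study can be sketched as below. The weights, the 0.5 pass threshold and the ten test records are all hypothetical; in practice the weights would come from training, not be set by hand.

```python
def predict(x1, x2, w1, w2, b1):
    """Linear model from the case study: Y = w1*x1 + w2*x2 + b1."""
    return w1 * x1 + w2 * x2 + b1


def accuracy(predicted, actual):
    """Fraction of predictions that match the known outcomes."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical weights and test records (x1, x2 already scaled to 0..1, known outcome).
w1, w2, b1 = 0.5, 0.5, 0.0
test_records = [
    (0.8, 0.6, True), (0.2, 0.4, False), (0.9, 0.9, True), (0.1, 0.3, False),
    (0.6, 0.8, True), (0.4, 0.2, False), (0.7, 0.7, True), (0.3, 0.3, False),
    (1.0, 0.6, True), (0.6, 0.6, False),  # the last record is misclassified
]

# Assumed decision rule: predict "good performance" when Y reaches 0.5.
predicted = [predict(x1, x2, w1, w2, b1) >= 0.5 for x1, x2, _ in test_records]
actual = [outcome for _, _, outcome in test_records]
test_accuracy = accuracy(predicted, actual)
accept_model = test_accuracy >= 0.9
print(test_accuracy, accept_model)
```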
  • 38. Assignment 1: Group assignment; you may be selected (randomly) to present your answer (2 minutes max). Discuss how prediction/forecasting relates to your life, or any issue related to prediction/forecasting that interests you. You may give an appropriate example, e.g. weather forecasting can determine your daily exercise planning, and discuss how it is done. Minimum 1 page. Due date: 18 September 2013.