• INTRODUCTION
• DATA MINING
• WHY DATA MINING
• APPLICATION OF DATA MINING
• STEPS OF DATA MINING
• DATA MINING TECHNIQUES
• THREAT OF DATA MINING
• SOLUTION OF THREAT
• ROLE OF DATA MINING
• DATA WAREHOUSE
• OLTP & OLAP
• DATA MINING TOOLS
• LATEST RESEARCH
INTRODUCTION
Data mining, the extraction of hidden predictive information
from large databases, is a powerful new technology with great
potential to help companies focus on the most important
information in their data warehouses.
DATA MINING
It is the extraction of previously unknown, valid, and understandable
information or patterns from data in repositories or sources such as:
• Databases
• Text files
• Social networks
• Computer simulations
The information obtained should be such that it can be used by
organizations and enterprises for business decision making.
Why Data Mining?
Data, data everywhere, yet:
• I can’t find the data I need
• I can’t get the data I need
• I can’t understand the data I found
• I can’t use the data I found
• Data explosion problem
Advanced data collection tools and database technology lead to
tremendous amounts of data stored in databases.
• We are drowning in data, but starving for
knowledge!
• Solution: data warehousing and data mining
- Data warehousing and on-line analytical processing.
- Extraction of interesting knowledge using data mining.
APPLICATION OF DATA MINING
Data Mining is primarily used today by companies with a strong
consumer focus — retail, financial, communication, and marketing
organizations.
1. FINANCE INDUSTRY
Credit Card Analysis
2. INSURANCE INDUSTRY
Claims and Fraud Analysis
3. TELECOMMUNICATION
Call Record Analysis
4. TRANSPORT
Logistics Management
5. CONSUMER GOODS
Promotion Analysis
6. SCIENTIFIC RESEARCH
Image, Video, Speech
7. UTILITIES
Power Usage Analysis
STEPS OF DATA MINING
• Data integration
• Data selection
• Data transformation
• Data mining
• Pattern evaluation
• Knowledge presentation
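The six steps above can be sketched as a small pipeline. This is only an illustrative sketch: the sample records, the function names, and the "count item purchases" mining step are all assumptions made for the example, not part of the original slides.

```python
from collections import Counter

# Hypothetical raw records from two separate sources (assumed data).
store_sales = [
    {"customer": "a", "item": "Bread", "amount": "2.50"},
    {"customer": "b", "item": "milk", "amount": "1.20"},
]
web_sales = [
    {"customer": "a", "item": "bread", "amount": "2.40"},
    {"customer": "c", "item": "Milk", "amount": "1.30"},
    {"customer": "c", "item": "gift card", "amount": "0.00"},
]

def integrate(*sources):
    """Data integration: combine records from several sources."""
    return [record for source in sources for record in source]

def select(records):
    """Data selection: keep only records relevant to the analysis."""
    return [r for r in records if float(r["amount"]) > 0]

def transform(records):
    """Data transformation: normalize item names and parse amounts."""
    return [
        {"customer": r["customer"], "item": r["item"].lower(), "amount": float(r["amount"])}
        for r in records
    ]

def mine(records):
    """Data mining: here, simply count how often each item is bought."""
    return Counter(r["item"] for r in records)

def evaluate(patterns, min_count=2):
    """Pattern evaluation: keep patterns that meet a minimum count."""
    return {item: n for item, n in patterns.items() if n >= min_count}

def present(patterns):
    """Knowledge presentation: render the surviving patterns as text."""
    return [f"{item}: bought {n} times" for item, n in sorted(patterns.items())]

records = transform(select(integrate(store_sales, web_sales)))
report = present(evaluate(mine(records)))
print("\n".join(report))
```

Each function maps to one step, so a real project could swap in a more serious mining algorithm without touching the other stages.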
DATA MINING TECHNIQUES
Classification and Prediction
example – Focused Hiring
Cluster Analysis
example – Market Segmentation
Outlier Analysis
example – Fraud Detection
Association Analysis
example – Market Basket Analysis
Evolution Analysis
example – Forecasting stock market index using Time series Analysis
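As a rough illustration of association analysis (the market basket example above), the sketch below computes support for item pairs and the confidence of a simple rule. The baskets and the helper names `pair_support` and `confidence` are hypothetical; production systems typically use algorithms such as Apriori rather than this brute-force count.

```python
from itertools import combinations
from collections import Counter

# Hypothetical shopping baskets (assumed data for illustration).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def pair_support(transactions):
    """Return the fraction of transactions containing each item pair."""
    counts = Counter()
    for basket in transactions:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(transactions) for pair, n in counts.items()}

def confidence(transactions, antecedent, consequent):
    """Estimate how often the consequent appears in baskets with the antecedent."""
    with_antecedent = [b for b in transactions if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

support = pair_support(baskets)
print(support[("bread", "milk")])            # support of {bread, milk}
print(confidence(baskets, "bread", "milk"))  # confidence of bread -> milk
```

Support measures how common a pair is overall, while confidence measures how reliably one item predicts the other.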
Threat To Privacy From Data Mining
Companies mine data about your buying habits and the sites you visit so that they
can personalize your search results when you use their search engine. This is
unsettling, but in theory it is also a way for companies to
tailor your online experience. The problem, of course, is that although
the data generally isn't scoured by humans, it is still used by machines.
SOLUTION OF DATA MINING THREAT
SOLUTIONS:
• Purpose Specification & Use Limitation
• Openness
• Security Measures like Encryption
ROLE OF DATA MINING IN IT
[Diagram: Business Intelligence; Model, Tool, Method; Behavioral Basics; Information Technology; Data; Problem; Decision]
DATA WAREHOUSE
Data warehousing is a technology that aggregates
structured data from one or more sources so that it can
be compared and analyzed for greater business
intelligence.
DATA WAREHOUSE
• A data warehouse provides the enterprise with
memory.
• Data mining provides the enterprise with intelligence.
OLTP & OLAP
On-Line Transaction Processing (OLTP)
Short, simple, frequent queries and modifications,
each involving a small number of tuples.
Example – answering queries from a web interface, sales at cash registers,
selling airline tickets.
On-Line Analytical Processing (OLAP)
Few but complex queries, which may run for hours.
Queries do not depend on having an absolutely up-to-date
database.
Example – analysts at Wal-Mart look for items with increasing sales in some
region.
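The contrast between the two workloads can be sketched with Python's built-in sqlite3 module. The `sales` table, its columns, and the sample rows are assumptions made for illustration. A real deployment would normally keep OLTP and OLAP data in separate systems, whereas this sketch uses one in-memory database for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, item TEXT, "
    "month INTEGER, qty INTEGER)"
)

# OLTP-style work: many short transactions, each touching a single row.
rows = [
    ("north", "umbrella", 1, 10),
    ("north", "umbrella", 2, 25),
    ("north", "sunscreen", 1, 30),
    ("north", "sunscreen", 2, 12),
    ("south", "umbrella", 1, 8),
    ("south", "umbrella", 2, 7),
]
for row in rows:
    with conn:  # commits each insert as its own small transaction
        conn.execute(
            "INSERT INTO sales (region, item, month, qty) VALUES (?, ?, ?, ?)", row
        )

# OLTP-style point lookup of one record by its key.
one_sale = conn.execute("SELECT item, qty FROM sales WHERE id = ?", (2,)).fetchone()

# OLAP-style query: scan and aggregate the whole table to find items whose
# sales increased from month 1 to month 2 in each region.
rising = conn.execute(
    """
    SELECT region, item
    FROM sales
    GROUP BY region, item
    HAVING SUM(CASE WHEN month = 2 THEN qty ELSE 0 END)
         > SUM(CASE WHEN month = 1 THEN qty ELSE 0 END)
    ORDER BY region, item
    """
).fetchall()

print(one_sale)
print(rising)
```

The inserts and the keyed lookup touch one tuple each, while the final query reads every row, which is why analytical queries are usually run against a warehouse copy rather than the live transactional database.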
DATA MINING TOOLS
• Microsoft SQL Server 2005
• Microsoft SQL Server 2008
• Oracle Data Mining
• DBMiner
Latest Research and Reviews on Data
Mining
1. Systematic discovery of mutation-specific synthetic lethals by mining
pan-cancer human primary tumor data.
2. Multi-label learning for predicting the activities of antimicrobial
peptides.
3. Semantic correction system - A little complex but interesting. Retrieved
text often contains semantic errors, which lead to wrong results. Applying
semantic correction as a preprocessing step leads to better outcomes.
4. Syntactic correction system - Much needed nowadays. Non-native English
speakers make many syntactic errors. It can also be used as a
preprocessing step in many projects. Your algorithm should
automatically detect such errors and suggest correct grammar.
5. Search engine for Wikipedia - Wikipedia data is available as a dump file.
Check DBpedia for reference. Apply indexing techniques and build a
small search engine for wiki pages. Wikipedia already provides this
functionality, but you can work on better user experience and result
optimization.
6. Twitter tweet classifier - Fairly easy and interesting too. Create a
learning system for categories such as sports, entertainment,
business, politics, Hollywood, etc. Train the classifier (naive Bayes,
SVM) and predict the category of incoming tweets.
7. Sentiment analysis for Twitter, reviews, and conversations - There are a few
packages available in R that can help with this job. One needs to add
a few additional features on top of them to make the results more intuitive.
NLTK and the Stanford tools are good open-source options for the same task.
8. Spam mail detection - Again, a learning-based classification system. Train
the classifier on mail that users have already marked as spam so that it can
classify new incoming mail. If users mark new mail as spam, then
retrain (there may be a better option).
9. Sarcasm detection - This can be a very interesting one. In sentiment
analysis we identify users' sentiment about something; here we identify
sarcasm expressed by users. See the page on psu.edu: Sarcasm detection
on Twitter.
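The tweet classifier and spam detection ideas above both mention training a naive Bayes classifier. Below is a minimal from-scratch sketch of a multinomial naive Bayes text classifier with add-one smoothing. The class name, the whitespace tokenizer, and the tiny training set are assumptions for illustration only; a real project would more likely use an established library.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Split text into lowercase whitespace-separated tokens."""
    return text.lower().split()

class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()            # documents seen per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocabulary = set()

    def train(self, text, label):
        """Update counts with one labeled example (can be called at any time)."""
        self.class_counts[label] += 1
        for word in tokenize(text):
            self.word_counts[label][word] += 1
            self.vocabulary.add(word)

    def predict(self, text):
        """Return the label with the highest log-probability score."""
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical training examples (assumed data).
classifier = NaiveBayes()
classifier.train("win free prize now", "spam")
classifier.train("free money click now", "spam")
classifier.train("meeting at noon tomorrow", "ham")
classifier.train("lunch with the team tomorrow", "ham")

print(classifier.predict("free prize money"))
print(classifier.predict("team meeting tomorrow"))
```

Because `train` only updates counts, a message that a user newly marks as spam can be added incrementally instead of retraining from scratch, which is one possible answer to the retraining question raised in item 8.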
Data Mining and Data Warehouse
