Business Intelligence, Data Warehousing
Data Marts, Data Mining
Presented by
Mr. Manish Tripathi ( I – 15-18-19)
Thakur Institute of Management Studies
&
Research
(Sunday 26 March, 2017)
1
Business Intelligence
2
WHAT IS BUSINESS INTELLIGENCE?
• BI is a technology-driven process for analyzing data
and presenting actionable information to help
corporate executives, business managers and other
end users make more informed business decisions
• BI encompasses a wide variety of tools, applications
and methodologies that enable organizations to
collect data from internal systems and external
sources, prepare it for analysis, develop and run
queries against the data, and create reports,
dashboards and data visualizations
• These outputs make the analytical results available
to corporate decision makers as well as operational
workers
3
WHAT IS BUSINESS INTELLIGENCE?
• BI technologies provide historical, current and
predictive views of business operations
• Identifying new opportunities and implementing
an effective strategy based on insights can provide
businesses with a competitive market advantage
and long-term stability
• Business intelligence can be used to support a
wide range of business decisions ranging from
operational to strategic
4
BENEFITS OF BUSINESS INTELLIGENCE
• The potential benefits of business intelligence
programs include accelerating and improving
decision making; optimizing internal business
processes; increasing operational efficiency;
driving new revenues; and gaining competitive
advantages over business rivals.
• Removes guesswork
• Gives quicker responses to business-related
queries
• Provides important business metrics reports
whenever and wherever you need them
5
BENEFITS OF BUSINESS INTELLIGENCE
• Gain a better understanding of the business's past,
present and future
• Gain valuable insight into your customers'
behaviour
• Pinpoint up-selling as well as cross-selling
opportunities
• Improve efficiency
6
Intelligent Value Chain Networks Result in
Business Intelligence
7
Business Intelligence Process Flow
8
BUSINESS INTELLIGENCE TOOLS
• SAP Crystal Reports
• SAS Enterprise BI Server
• Oracle Business Intelligence Enterprise Edition Plus
• IBM Cognos 8 BI
• Microsoft PowerPivot
• MicroStrategy Reporting Suite
• Salesforce CRM
• TIBCO Spotfire Analytics
• Information Builders WebFOCUS
9
GARTNER 2016 MAGIC QUADRANT FOR BUSINESS INTELLIGENCE
10
Data Warehousing
11
WHAT IS DATA WAREHOUSING?
• A data warehouse is a federated repository for all
the data that an enterprise's various business
systems collect
• It is a collection of corporate information and data
derived from operational systems and external
data sources
• A data warehouse is designed to support business
decisions by allowing data consolidation, analysis
and reporting at different aggregate levels
12
MOST POPULAR DATA WAREHOUSING DEFINITIONS
Ralph Kimball
• A data warehouse is a copy of transaction data
specifically structured for query and analysis
Bill Inmon
• A data warehouse is a subject-oriented,
integrated, time-variant and non-volatile collection
of data in support of management's decision
making process
13
Properties of Data Warehousing
14
Subject-Oriented
A data warehouse can be used to analyze a
particular subject area. For example, "sales"
can be a particular subject
15
Integrated
A data warehouse integrates data from
multiple data sources. For example, source A
and source B may have different ways of
identifying a product, but in a data warehouse,
there will be only a single way of identifying a
product
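A minimal sketch of the integration idea on this slide, assuming two hypothetical sources that identify the same product differently (source A by a SKU string, source B by a numeric code). The mapping tables, field names and sample values are illustrative assumptions, not taken from the presentation.

```python
# Hypothetical cross-reference tables: each source's product ID -> one warehouse key.
SOURCE_A_TO_KEY = {"SKU-0042": 1, "SKU-0077": 2}
SOURCE_B_TO_KEY = {9001: 1, 9002: 2}


def integrate(rows_a, rows_b):
    """Merge sales rows from both sources under a single product key."""
    unified = []
    for sku, amount in rows_a:
        unified.append({"product_key": SOURCE_A_TO_KEY[sku], "amount": amount, "source": "A"})
    for code, amount in rows_b:
        unified.append({"product_key": SOURCE_B_TO_KEY[code], "amount": amount, "source": "B"})
    return unified


rows = integrate([("SKU-0042", 10.0)], [(9001, 5.0), (9002, 7.5)])
# Sales for the same product can now be summed regardless of the source system.
total_for_product_1 = sum(r["amount"] for r in rows if r["product_key"] == 1)
```

Once every row carries the same `product_key`, downstream reports no longer need to know which system a sale came from.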
16
Time-Variant
Historical data is kept in a data warehouse. For
example, one can retrieve data from 3 months,
6 months, 12 months, or even older data from
a data warehouse. This contrasts with a
transaction system, where often only the most
recent data is kept
17
Non-volatile
Once data is in the data warehouse, it will not
change. So, historical data in a data warehouse
should never be altered
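The time-variant and non-volatile properties can be sketched together as an append-only history table. This is only an illustration using Python's built-in `sqlite3` module; the `price_history` table, its columns and the sample prices are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE price_history (product TEXT, price REAL, valid_from TEXT)")


def record_price(product, price, valid_from):
    """Append a new version; existing rows are never updated or deleted."""
    conn.execute("INSERT INTO price_history VALUES (?, ?, ?)", (product, price, valid_from))


def price_as_of(product, date):
    """Return the price that was valid on the given ISO date, or None."""
    row = conn.execute(
        "SELECT price FROM price_history WHERE product = ? AND valid_from <= ? "
        "ORDER BY valid_from DESC LIMIT 1",
        (product, date),
    ).fetchone()
    return row[0] if row else None


record_price("widget", 9.99, "2016-10-01")
record_price("widget", 10.49, "2017-01-01")
```

Because a price change is stored as a new row rather than an update, a query for any past date still returns the value that was valid at that time.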
18
Purpose of Data Warehousing
19
Purpose of Data Warehousing
• Keeping Analysis/Reporting and Production Separate
• Information integration from multiple systems: a single
point of access for information
• Data consistency and quality
• Fast response time: production databases are tuned
to the expected transaction load, not to analytical queries
• Fast response time: dimensional models answer analytical
queries faster than normalized schemas
• Establish the foundation for Decision Support
• Maintain data history, even if the source transaction
systems do not
20
Difference between Data Warehousing and
normal Database
21
Data Warehousing vs. normal Database
1- SIZE
Data warehouses are potentially much bigger than
the databases from where the data is derived.
Databases usually store only the data that is currently
in active use; older records can be purged and moved
to backups, mainly for performance reasons. Data
warehouses are used to store much older historical
records; it's also common to use data warehouses to
store additional information that is bought or
captured elsewhere to complement the information
that is generated and stored by the internal database
system
22
Data Warehousing vs. normal Database
2- Normalization
Databases are usually normalized, which means that
a lot of work is done to guarantee that there's a
unique copy of any given bit of information, which is
important for performance and consistency reasons.
But it's common to store different versions of the
same information on a data warehouse, using
different structures to compose and access the
information. In other words, data warehouses are
messier and more irregular, partly by design, as they
need to be able to work with so many different
sources of information
23
Data Warehousing vs. normal Database
3- Access pattern
Database records are often retrieved and updated
one by one; data warehouses are nearly always
accessed by reporting engines that work on entire
datasets at a time to generate aggregates and other
analytical information. Databases are frequently
updated, sometimes only a field or record at a time;
data warehouses aren't updated very frequently, and
for all practical purposes, never at the field or record
level; instead data is appended in large batches
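The two access patterns can be contrasted in a few lines. The sketch below uses an in-memory SQLite database purely for illustration; the `orders` table and its values are assumptions, and a real warehouse would run on a dedicated platform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("north", 100.0), ("south", 250.0), ("north", 50.0)],
)

# Database-style access: touch a single record by its key.
conn.execute("UPDATE orders SET amount = 120.0 WHERE id = 1")

# Warehouse-style access: scan the whole dataset and aggregate it.
totals = dict(conn.execute("SELECT region, SUM(amount) FROM orders GROUP BY region"))
```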
24
Data Warehousing vs. normal Database
4- Use
Normal databases are used for OLTP whereas data
warehousing is used for OLAP
25
Data Warehousing vs. normal Database
5- Performance
For a normal database, fast transaction processing is
critical and the system is optimized for write operations,
whereas a data warehouse is optimized for read operations
and can tolerate longer-running analytical queries.
26
Data Warehousing vs. normal Database
6- Table & Joins
For a normal database, the tables and joins are complex
because the schema is normalized (as in an RDBMS). This is
done to reduce redundant data and to save storage space.
For a data warehouse, the tables and joins are simple
because the schema is de-normalized. This is done
to reduce the response time for analytical queries.
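As a rough illustration of this trade-off, the sketch below answers the same question twice: once with a join over normalized tables and once from a de-normalized table. The table names and rows are hypothetical, and SQLite stands in for both kinds of system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: customer attributes are stored in one place only.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Mumbai'), (2, 'Pune');
    INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 40.0), (3, 2, 60.0);

    -- De-normalized: the city is repeated on every sales row.
    CREATE TABLE sales_flat (order_id INTEGER, city TEXT, amount REAL);
    INSERT INTO sales_flat
        SELECT o.id, c.city, o.amount
        FROM orders o JOIN customers c ON c.id = o.customer_id;
""")

# The normalized schema needs a join to report sales by city.
with_join = dict(conn.execute(
    "SELECT c.city, SUM(o.amount) FROM orders o "
    "JOIN customers c ON c.id = o.customer_id GROUP BY c.city"
))

# The de-normalized table answers the same question with a single scan.
without_join = dict(conn.execute("SELECT city, SUM(amount) FROM sales_flat GROUP BY city"))
```

Both queries return the same totals; the de-normalized form trades extra storage for a simpler query.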
27
Data Warehousing vs. normal Database
7- Data source
For a normal database, mostly internal data sources are
used, whereas a data warehouse may also draw on external
data sources such as macroeconomic indicators,
competitor data and market data.
28
DATA WAREHOUSING PRODUCTS
• Teradata EDW (enterprise data warehouse)
• Oracle Exadata
• Amazon Redshift
• Cloudera Enterprise Data Hub (EDH)
• MarkLogic
• IBM Netezza data warehouse appliance
• SAP Business Warehouse
• Microsoft SQL Server Parallel Data Warehouse
29
GARTNER 2016 MAGIC QUADRANT FOR DATA WAREHOUSE
30
7 STEPS IN BUILDING DATA WAREHOUSE
(MANAGEMENT VIEW)
• Step 1: Determine Business Objectives
• Step 2: Collect and Analyze Information
• Step 3: Identify Core Business Processes
• Step 4: Construct a Conceptual Data Model
• Step 5: Locate Data Sources and Plan Data
Transformations
• Step 6: Set Tracking Duration
• Step 7: Implement the Plan
31
3 STEPS IN BUILDING DATA WAREHOUSE
(TECHNICAL VIEW)
• Extract
• Transform
• Load
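The three technical steps can be sketched as three small functions. This is a toy pipeline under stated assumptions: the source is an in-memory CSV string, the target is an in-memory SQLite table named `fact_orders`, and the cleaning rules (trim and lower-case the region, drop rows with no amount) are invented for the example.

```python
import csv
import io
import sqlite3

RAW_CSV = "order_id,region,amount\n1, North ,100.50\n2,south,\n3,SOUTH,20\n"


def extract(text):
    """Extract: read raw rows from the source (an in-memory CSV here)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize region names, cast types, drop incomplete rows."""
    clean = []
    for row in rows:
        if not row["amount"].strip():
            continue  # skip rows with a missing amount
        clean.append((int(row["order_id"]), row["region"].strip().lower(), float(row["amount"])))
    return clean


def load(rows, conn):
    """Load: append the cleaned batch to the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders (order_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)


warehouse = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), warehouse)
loaded = warehouse.execute(
    "SELECT order_id, region, amount FROM fact_orders ORDER BY order_id"
).fetchall()
```

Keeping the steps as separate functions makes it possible to test the transformation rules without touching either the source or the warehouse.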
32
ETL Process
33
ETL Process
34
DATA MART
• The data mart is a subset of the data warehouse and
is usually oriented to a specific business line or team
• A data mart is a repository of data that is designed to
serve a particular community of knowledge workers
• Because data marts are optimized to look at data in a
unique way, the design process tends to start with an
analysis of user needs
• Today, data virtualization software can be used to
create virtual data marts, pulling data from disparate
sources and combining it with other data as necessary
to meet the needs of specific business users
35
DATA MART
• A virtual data mart provides knowledge workers
with access to the data they need while
preventing data silos and giving the organization's
data management team a level of control over the
organization's data throughout its lifecycle
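A database view is one simple way to picture a virtual data mart: the mart exposes a subject-specific slice while the data stays in the warehouse. The sketch below is a simplification that assumes a single SQLite database and a hypothetical `fact_sales` table; real data virtualization software combines data from disparate sources.

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")
warehouse.executescript("""
    CREATE TABLE fact_sales (department TEXT, product TEXT, amount REAL);
    INSERT INTO fact_sales VALUES
        ('marketing', 'campaign-a', 500.0),
        ('finance',   'audit',      300.0),
        ('marketing', 'campaign-b', 200.0);

    -- The "mart" is only a view: no data is copied out of the warehouse.
    CREATE VIEW marketing_mart AS
        SELECT product, amount FROM fact_sales WHERE department = 'marketing';
""")

mart_rows = warehouse.execute(
    "SELECT product, amount FROM marketing_mart ORDER BY product"
).fetchall()
```

Because the view holds no data of its own, it cannot drift out of sync with the warehouse, which is how this approach avoids creating a data silo.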
36
REASONS FOR CREATING A DATA MART
• Easy access to frequently needed data
• Creates a collective view for a group of users
• Improves end-user response time
• Ease of creation
• Lower cost than implementing a full data
warehouse
• Potential users are more clearly defined than in a
full data warehouse
• Contains only business essential data and is less
cluttered.
37
Data mart
38
Data mart creation
39
DATA LAKE
• A data lake is a storage repository that holds a vast
amount of raw data in its native format until it is needed
• A data lake uses a flat architecture to store data
• Each data element in a lake is assigned a unique identifier
and tagged with a set of extended metadata tags
• When a business question arises, the data lake can be
queried for relevant data, and that smaller set of data can
then be analyzed to help answer the question
• The term data lake is often associated with Hadoop-
oriented object storage
• In such a scenario, an organization's data is first loaded
into the Hadoop platform, and then business analytics and
data mining tools are applied to the data where it resides
on Hadoop's cluster nodes
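The flat architecture described above (raw data, a unique identifier per element, metadata tags, query on demand) can be mimicked with a small in-memory class. The `DataLake` class and its tag names are hypothetical and only illustrate the idea; they are not the API of Hadoop or any other product.

```python
import uuid


class DataLake:
    """Toy flat store: every raw object gets a unique ID and metadata tags."""

    def __init__(self):
        self._objects = {}  # object_id -> (raw_bytes, tags)

    def put(self, raw_bytes, **tags):
        """Store the data unchanged, in its native format, and return its ID."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (raw_bytes, tags)
        return object_id

    def query(self, **wanted):
        """Return raw objects whose tags match every requested key/value."""
        return [
            raw
            for raw, tags in self._objects.values()
            if all(tags.get(key) == value for key, value in wanted.items())
        ]


lake = DataLake()
lake.put(b'{"user": 1, "event": "click"}', source="web", format="json")
lake.put(b"2017-03-26,store-7,42.00", source="pos", format="csv")
web_objects = lake.query(source="web")
```

Nothing is parsed or reshaped on the way in; structure is applied only when a business question selects a subset for analysis.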
40
Data lake
41
DATA MART VS. DATA WAREHOUSE
1- Data Scope
The first and most obvious difference is
the scope of the information each one stores. A
data warehouse holds all kinds of data
related to the system, whereas a data mart
stores only the information for a specific
subject, which makes it much more
focused.
42
DATA MART VS. DATA WAREHOUSE
2- Size
We can say that a data warehouse is
usually much bigger than data marts,
because it keeps a lot more data.
43
DATA MART VS. DATA WAREHOUSE
3-Integration
A data warehouse usually integrates
several sources of data in order to feed
its database and the system’s needs. In
contrast, a data mart has far less
integration to do, since its data is very
specific
44
DATA MART VS. DATA WAREHOUSE
5- Creation
Creating a data warehouse is far more
difficult and time-consuming than building
a data mart. Building the whole structure and
the relationships between data is a long and
very important step. We also need to think
about and analyse how to integrate all of the
information sources. Since data marts are
smaller and subject-oriented, these actions
tend to be much simpler.
46
DATA MART VS. DATA WAREHOUSE
6-Management
Like creation, the management of data
warehouses is far more complex than that of
data marts, for the same reasons: with
more data, relationships and processes to
manage, the task becomes harder.
47
DATA MART VS. DATA WAREHOUSE
7- Cost
Overall, data marts
are cheaper than data warehouses. To
build and maintain a data warehouse
we need significantly more physical
resources such as servers, disk space,
memory and CPU. Because it is less
complex, a data mart
requires less time to build and operate.
48
DATA MART VS. DATA WAREHOUSE
8- Performance
The performance of a system always
depends on how it is built, the
infrastructure that supports it, the
processes, the number of users, etc.
Usually a data mart is faster than a data
warehouse because of the warehouse's inherent
complexity and larger data volume.
49
Multidimensional Analysis
50
MULTIDIMENSIONAL ANALYSIS
• Multi-Dimensional Analysis is an Informational
Analysis on data which takes into account many
different relationships, each of which represents a
dimension
• For example, a retail analyst may want to
understand the relationships among sales by
region, by quarter, by demographic distribution
(income, education level, gender), by product
• Multi-dimensional analysis will yield results for
these complex relationships
51
MULTIDIMENSIONAL ANALYSIS
• Multi-dimensional Data Analysis (MDDA) refers to
the process of summarizing data across multiple
levels (called dimensions) and then presenting the
results in a multi-dimensional grid format
• This process is also referred to as OLAP cube, data
pivot, decision cube and crosstab
52
OLAP CUBE
• An OLAP cube is a multidimensional database that
is optimized for data warehouse and online
analytical processing (OLAP) applications
• An OLAP cube is a method of storing data in a
multidimensional form, generally for reporting
purposes
• In OLAP cubes, data are categorized by dimensions
• OLAP cubes are often pre-summarized across
dimensions to drastically improve query time over
relational databases
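Pre-summarizing across dimensions can be sketched with plain dictionaries. The example below assumes three dimensions (region, quarter, product) and a handful of invented sales figures; `ALL` is a marker chosen here to mean that a dimension has been rolled up.

```python
from collections import defaultdict
from itertools import product

# Hypothetical fact rows: (region, quarter, product, sales)
FACTS = [
    ("north", "Q1", "tea", 100),
    ("north", "Q2", "tea", 150),
    ("south", "Q1", "tea", 80),
    ("south", "Q1", "coffee", 120),
]
ALL = "*"  # marker meaning "aggregated over this dimension"


def build_cube(facts):
    """Pre-summarize sales for every combination of kept and rolled-up dimensions."""
    cube = defaultdict(int)
    for region, quarter, item, sales in facts:
        # Each fact contributes to 2**3 cells: every dimension is either kept or rolled up.
        for key in product((region, ALL), (quarter, ALL), (item, ALL)):
            cube[key] += sales
    return cube


cube = build_cube(FACTS)
north_total = cube[("north", ALL, ALL)]
q1_tea_total = cube[(ALL, "Q1", "tea")]
```

After the cube is built, any roll-up is a single dictionary lookup instead of a scan over the fact rows, which is the source of the query-time improvement.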
53
Multidimensional Analysis
54
Data mining
55
WHAT IS DATA MINING?
• Data mining is the practice of automatically searching
large stores of data to discover patterns and trends
that go beyond simple analysis
• Data mining uses sophisticated mathematical
algorithms to segment the data and evaluate the
probability of future events
• It is the process of finding anomalies, patterns and
correlations within large data sets to predict outcomes
• The overall goal of the data mining process is to
extract information from a data set and transform it
into an understandable structure for further use
• Also known as Knowledge Discovery in Data (KDD)
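As one small, concrete instance of "finding anomalies", the sketch below flags values that lie far from the mean of a series. The z-score rule, the threshold of 2.0 and the sample order counts are assumptions chosen for illustration; production data mining tools use considerably more sophisticated methods.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Flag values more than `threshold` population standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


daily_orders = [100, 102, 98, 101, 99, 100, 300]
anomalies = find_anomalies(daily_orders)
```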
56
The phases, and the iterative nature, of a data mining project.
The process flow shows that a data mining project does not
stop when a particular solution is deployed. The results of
data mining trigger new business questions, which in turn can
be used to develop more focused models.
57
1- PROBLEM DEFINITION
• This initial phase of a data mining project focuses
on understanding the project objectives and
requirements. Once we have specified the project
from a business perspective, we can formulate it
as a data mining problem and develop a
preliminary implementation plan.
• For example, the business problem might be:
"How can I sell more of my product to customers?"
You might translate this into a data mining
problem such as: "Which customers are most likely
to purchase the product?"
58
2- Data Gathering and Preparation
The data understanding phase involves data collection
and exploration. As you take a closer look at the data,
you can determine how well it addresses the business
problem. You might decide to remove some of the
data or add additional data. This is also the time to
identify data quality problems and to scan for
patterns in the data.
59
3- Model Building and Evaluation
In this phase, you select and apply various modeling
techniques and calibrate the parameters to optimal
values. If the algorithm requires data transformations,
you will need to step back to the previous phase to
implement them.
60
4- Knowledge Deployment
• Knowledge deployment is the use of data mining
within a target environment
• In the deployment phase, insight and actionable
information can be derived from data
61
Data mining process
62
Data Mining Models
• A mining model is created by applying an
algorithm to data
• It is a set of data, statistics, and patterns that can
be applied to new data to generate predictions
and make inferences about relationships
• A data mining model gets data from a mining
structure and then analyzes that data by using a
data mining algorithm
• The mining structure and mining model are
separate objects
• The mining structure stores information that
defines the data source
63
Data Mining Models
• A mining model stores information derived from
statistical processing of the data, such as the
patterns found as a result of analysis
• A mining model is empty until the data provided
by the mining structure has been processed and
analyzed.
• After a mining model has been processed, it
contains metadata, results, and bindings back to
the mining structure
• Model contains metadata, patterns, and bindings
64
Data mining model
65
Data mining model
66
Data Mining Algorithms
• An algorithm in data mining is a set of heuristics
and calculations that creates a model from data
• To create a model, the algorithm first analyzes the
data you provide, looking for specific types of
patterns or trends
• The algorithm uses the results of this analysis over
many iterations to find the optimal parameters for
creating the mining model
• These parameters are then applied across the
entire data set to extract actionable patterns and
detailed statistics.
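The iterate-until-the-parameters-settle idea can be illustrated with k-means clustering in one dimension. k-means is used here only as a familiar example of an iterative algorithm, not as the method the slides have in mind; the starting centers and the spend figures are made up.

```python
def kmeans_1d(values, centers, iterations=10):
    """Lloyd's algorithm in one dimension with caller-supplied starting centers."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to its cluster mean (keep it if the cluster is empty).
        new_centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # the parameters stopped changing
        centers = new_centers
    return centers


spend = [10, 12, 11, 95, 100, 105]
fitted_centers = kmeans_1d(spend, centers=[0, 50])
```

The fitted centers are the model's parameters: a new customer's spend can be assigned to whichever center is nearest.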
67
Data Mining Algorithms
• The mining model that an algorithm creates from
your data can take various forms, including:
1. A set of clusters that describe how the cases in a
dataset are related
2. A decision tree that predicts an outcome, and
describes how different criteria affect that outcome
3. A mathematical model that forecasts sales
4. A set of rules that describe how products are grouped
together in a transaction, and the probabilities that
products are purchased together
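The fourth form, rules with purchase probabilities, can be sketched by counting how often pairs of products share a basket. The transactions are invented, and the function computes only the confidence of pairwise rules; it is a simplified stand-in for a full association-rule algorithm.

```python
from collections import Counter
from itertools import combinations

TRANSACTIONS = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk"},
]


def pair_rules(transactions):
    """Return {(a, b): confidence}, where confidence = P(b in basket | a in basket)."""
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = {}
    for (a, b), together in pair_counts.items():
        rules[(a, b)] = together / item_counts[a]
        rules[(b, a)] = together / item_counts[b]
    return rules


rules = pair_rules(TRANSACTIONS)
butter_then_bread = rules[("butter", "bread")]
```

In this toy data every basket that contains butter also contains bread, so the rule "butter implies bread" has the highest possible confidence, while the reverse rule is weaker.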
68
Data mining Algorithms
69
Data mining tools (Analytics)
70
71

Contenu connexe

Tendances

Basic Introduction of Data Warehousing from Adiva Consulting
Basic Introduction of  Data Warehousing from Adiva ConsultingBasic Introduction of  Data Warehousing from Adiva Consulting
Basic Introduction of Data Warehousing from Adiva Consultingadivasoft
 
Role of Database Management System in A Data Warehouse
Role of Database Management System in A Data Warehouse Role of Database Management System in A Data Warehouse
Role of Database Management System in A Data Warehouse Lesa Cote
 
Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.Vibrant Technologies & Computers
 
Data Warehousing Overview
Data Warehousing OverviewData Warehousing Overview
Data Warehousing OverviewAhmed Gamal
 
Data warehousing - Dr. Radhika Kotecha
Data warehousing - Dr. Radhika KotechaData warehousing - Dr. Radhika Kotecha
Data warehousing - Dr. Radhika KotechaRadhika Kotecha
 
Data Warehousing and Mining
Data Warehousing and MiningData Warehousing and Mining
Data Warehousing and Miningethantelaviv
 
Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data WarehousingEyad Manna
 
data warehousing
data warehousingdata warehousing
data warehousing143sohil
 
DATA Warehousing & Data Mining
DATA Warehousing & Data MiningDATA Warehousing & Data Mining
DATA Warehousing & Data Miningcpjcollege
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing conceptspcherukumalla
 
data warehouse , data mart, etl
data warehouse , data mart, etldata warehouse , data mart, etl
data warehouse , data mart, etlAashish Rathod
 

Tendances (20)

Basic Introduction of Data Warehousing from Adiva Consulting
Basic Introduction of  Data Warehousing from Adiva ConsultingBasic Introduction of  Data Warehousing from Adiva Consulting
Basic Introduction of Data Warehousing from Adiva Consulting
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Role of Database Management System in A Data Warehouse
Role of Database Management System in A Data Warehouse Role of Database Management System in A Data Warehouse
Role of Database Management System in A Data Warehouse
 
Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.
 
Data warehousing ppt
Data warehousing pptData warehousing ppt
Data warehousing ppt
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data Warehousing Overview
Data Warehousing OverviewData Warehousing Overview
Data Warehousing Overview
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Hadoop & Data Warehouse
Hadoop & Data Warehouse Hadoop & Data Warehouse
Hadoop & Data Warehouse
 
Data Warehousing
Data WarehousingData Warehousing
Data Warehousing
 
Data warehousing - Dr. Radhika Kotecha
Data warehousing - Dr. Radhika KotechaData warehousing - Dr. Radhika Kotecha
Data warehousing - Dr. Radhika Kotecha
 
Data Warehousing and Mining
Data Warehousing and MiningData Warehousing and Mining
Data Warehousing and Mining
 
Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data Warehousing
 
data warehousing
data warehousingdata warehousing
data warehousing
 
Datawarehouse
DatawarehouseDatawarehouse
Datawarehouse
 
DATA Warehousing & Data Mining
DATA Warehousing & Data MiningDATA Warehousing & Data Mining
DATA Warehousing & Data Mining
 
Data warehousing and Data mining
Data warehousing and Data mining Data warehousing and Data mining
Data warehousing and Data mining
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing concepts
 
data warehousing
data warehousingdata warehousing
data warehousing
 
data warehouse , data mart, etl
data warehouse , data mart, etldata warehouse , data mart, etl
data warehouse , data mart, etl
 

Similaire à Manish tripathi-ea-dw-bi

DWDM Unit 1 (1).pptx
DWDM Unit 1 (1).pptxDWDM Unit 1 (1).pptx
DWDM Unit 1 (1).pptxSalehaMariyam
 
Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)Harish Chand
 
Data Warehouse Fundamentals
Data Warehouse FundamentalsData Warehouse Fundamentals
Data Warehouse FundamentalsRashmi Bhat
 
ETL processes , Datawarehouse and Datamarts.pptx
ETL processes , Datawarehouse and Datamarts.pptxETL processes , Datawarehouse and Datamarts.pptx
ETL processes , Datawarehouse and Datamarts.pptxParnalSatle
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Precisely
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeDATAVERSITY
 
Prague data management meetup 2017-02-28
Prague data management meetup 2017-02-28Prague data management meetup 2017-02-28
Prague data management meetup 2017-02-28Martin Bém
 
Data Warehouse Design on Cloud ,A Big Data approach Part_One
Data Warehouse Design on Cloud ,A Big Data approach Part_OneData Warehouse Design on Cloud ,A Big Data approach Part_One
Data Warehouse Design on Cloud ,A Big Data approach Part_OnePanchaleswar Nayak
 
Master data management and data warehousing
Master data management and data warehousingMaster data management and data warehousing
Master data management and data warehousingZahra Mansoori
 
Introduction to data mining and data warehousing
Introduction to data mining and data warehousingIntroduction to data mining and data warehousing
Introduction to data mining and data warehousingEr. Nawaraj Bhandari
 
presentationofism-complete-1-100227093028-phpapp01.pptx
presentationofism-complete-1-100227093028-phpapp01.pptxpresentationofism-complete-1-100227093028-phpapp01.pptx
presentationofism-complete-1-100227093028-phpapp01.pptxvipush1
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Traditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A ComparisonTraditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A ComparisonCapgemini
 
Data Mining & Data Warehousing
Data Mining & Data WarehousingData Mining & Data Warehousing
Data Mining & Data WarehousingAAKANKSHA JAIN
 
Day 02 sap_bi_overview_and_terminology
Day 02 sap_bi_overview_and_terminologyDay 02 sap_bi_overview_and_terminology
Day 02 sap_bi_overview_and_terminologytovetrivel
 
Managing Data Warehouse Growth in the New Era of Big Data
Managing Data Warehouse Growth in the New Era of Big DataManaging Data Warehouse Growth in the New Era of Big Data
Managing Data Warehouse Growth in the New Era of Big DataVineet
 

Similaire à Manish tripathi-ea-dw-bi (20)

DWDM Unit 1 (1).pptx
DWDM Unit 1 (1).pptxDWDM Unit 1 (1).pptx
DWDM Unit 1 (1).pptx
 
Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)
 
DATA WAREHOUSING.2.pptx
DATA WAREHOUSING.2.pptxDATA WAREHOUSING.2.pptx
DATA WAREHOUSING.2.pptx
 
Data Warehouse Fundamentals
Data Warehouse FundamentalsData Warehouse Fundamentals
Data Warehouse Fundamentals
 
ETL processes , Datawarehouse and Datamarts.pptx
ETL processes , Datawarehouse and Datamarts.pptxETL processes , Datawarehouse and Datamarts.pptx
ETL processes , Datawarehouse and Datamarts.pptx
 
Data Mining
Data MiningData Mining
Data Mining
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-Purpose
 
Prague data management meetup 2017-02-28
Prague data management meetup 2017-02-28Prague data management meetup 2017-02-28
Prague data management meetup 2017-02-28
 
Data Warehouse Design on Cloud ,A Big Data approach Part_One
Data Warehouse Design on Cloud ,A Big Data approach Part_OneData Warehouse Design on Cloud ,A Big Data approach Part_One
Data Warehouse Design on Cloud ,A Big Data approach Part_One
 
Master data management and data warehousing
Master data management and data warehousingMaster data management and data warehousing
Master data management and data warehousing
 
Introduction to data mining and data warehousing
Introduction to data mining and data warehousingIntroduction to data mining and data warehousing
Introduction to data mining and data warehousing
 
presentationofism-complete-1-100227093028-phpapp01.pptx
presentationofism-complete-1-100227093028-phpapp01.pptxpresentationofism-complete-1-100227093028-phpapp01.pptx
presentationofism-complete-1-100227093028-phpapp01.pptx
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
Traditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A ComparisonTraditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A Comparison
 
Data wirehouse
Data wirehouseData wirehouse
Data wirehouse
 
Data Mining & Data Warehousing
Data Mining & Data WarehousingData Mining & Data Warehousing
Data Mining & Data Warehousing
 
Oracle sql plsql & dw
Oracle sql plsql & dwOracle sql plsql & dw
Oracle sql plsql & dw
 
Day 02 sap_bi_overview_and_terminology
Day 02 sap_bi_overview_and_terminologyDay 02 sap_bi_overview_and_terminology
Day 02 sap_bi_overview_and_terminology
 
Managing Data Warehouse Growth in the New Era of Big Data
Managing Data Warehouse Growth in the New Era of Big DataManaging Data Warehouse Growth in the New Era of Big Data
Managing Data Warehouse Growth in the New Era of Big Data
 

Plus de A P

Social media-workplace-13 september2015
Social media-workplace-13 september2015Social media-workplace-13 september2015
Social media-workplace-13 september2015A P
 
Rote wto-it-diffusion
Rote wto-it-diffusionRote wto-it-diffusion
Rote wto-it-diffusionA P
 
Role of-wto-in-promoting-un-sustainable-development-goals
Role of-wto-in-promoting-un-sustainable-development-goalsRole of-wto-in-promoting-un-sustainable-development-goals
Role of-wto-in-promoting-un-sustainable-development-goalsA P
 
Microsoft nokia-acquisition
Microsoft nokia-acquisitionMicrosoft nokia-acquisition
Microsoft nokia-acquisitionA P
 
Manish tripathi-tcs-financial-management-9 october2016
Manish tripathi-tcs-financial-management-9 october2016Manish tripathi-tcs-financial-management-9 october2016
Manish tripathi-tcs-financial-management-9 october2016A P
 
Manish tripathi-sqa&m-six-sigma-12 feb2017
Manish tripathi-sqa&m-six-sigma-12 feb2017Manish tripathi-sqa&m-six-sigma-12 feb2017
Manish tripathi-sqa&m-six-sigma-12 feb2017A P
 
Manish tripathi-sqa&m-11 feb2017 -
Manish tripathi-sqa&m-11 feb2017 -Manish tripathi-sqa&m-11 feb2017 -
Manish tripathi-sqa&m-11 feb2017 -A P
 
Manish tripathi-se-failure stories-12feb2017
Manish tripathi-se-failure stories-12feb2017Manish tripathi-se-failure stories-12feb2017
Manish tripathi-se-failure stories-12feb2017A P
 
Manish tripathi-se-cmm-5 feb2017
Manish tripathi-se-cmm-5 feb2017Manish tripathi-se-cmm-5 feb2017
Manish tripathi-se-cmm-5 feb2017A P
 
Manish tripathi-principals-of-management-book-review-21 august2015
Manish tripathi-principals-of-management-book-review-21 august2015Manish tripathi-principals-of-management-book-review-21 august2015
Manish tripathi-principals-of-management-book-review-21 august2015A P
 
Manish tripathi-mis-erp-failure-14 august2016
Manish tripathi-mis-erp-failure-14 august2016Manish tripathi-mis-erp-failure-14 august2016
Manish tripathi-mis-erp-failure-14 august2016A P
 
Manish tripathi-itim-incident-management
Manish tripathi-itim-incident-managementManish tripathi-itim-incident-management
Manish tripathi-itim-incident-managementA P
 
Manish tripathi-itim-dr
Manish tripathi-itim-drManish tripathi-itim-dr
Manish tripathi-itim-drA P
 
Manish tripathi-innovation-trends-19 august2016
Manish tripathi-innovation-trends-19 august2016Manish tripathi-innovation-trends-19 august2016
Manish tripathi-innovation-trends-19 august2016A P
 
Manish tripathi-innovation-26 august2016
Manish tripathi-innovation-26 august2016Manish tripathi-innovation-26 august2016
Manish tripathi-innovation-26 august2016A P
 
Manish tripathi-innovation-5 august2016
Manish tripathi-innovation-5 august2016Manish tripathi-innovation-5 august2016
Manish tripathi-innovation-5 august2016A P
 
Manish tripathi-innovation-2 septembre2016
Manish tripathi-innovation-2 septembre2016Manish tripathi-innovation-2 septembre2016
Manish tripathi-innovation-2 septembre2016A P
 
Manish tripathi-i-15-18-19-bia-ppt-oil-gas
Manish tripathi-i-15-18-19-bia-ppt-oil-gasManish tripathi-i-15-18-19-bia-ppt-oil-gas
Manish tripathi-i-15-18-19-bia-ppt-oil-gasA P
 
Manish tripathi-hrm-leadership-6 april2016
Manish tripathi-hrm-leadership-6 april2016Manish tripathi-hrm-leadership-6 april2016
Manish tripathi-hrm-leadership-6 april2016A P
 
Manish tripathi-gst-27 august2016
Manish tripathi-gst-27 august2016Manish tripathi-gst-27 august2016
Manish tripathi-gst-27 august2016A P
 

Manish tripathi-ea-dw-bi

  • 1. Business Intelligence, Data Warehousing Data Marts, Data Mining Presented by Mr. Manish Tripathi ( I – 15-18-19) Thakur Institute of Management Studies & Research (Sunday 26 March, 2017) 1
  • 3. WHAT IS BUSINESS INTELLIGENCE? • BI is a technology-driven process for analyzing data and presenting actionable information to help corporate executives, business managers and other end users make more informed business decisions • BI encompasses a wide variety of tools, applications and methodologies that enable organizations to collect data from internal systems and external sources • Prepare it for analysis, develop and run queries against the data, and create reports, dashboards and data visualizations to make the analytical results available to corporate decision makers as well as operational workers 3
  • 4. WHAT IS BUSINESS INTELLIGENCE? • BI technologies provide historical, current and predictive views of business operations • Identifying new opportunities and implementing an effective strategy based on insights can provide businesses with a competitive market advantage and long-term stability • Business intelligence can be used to support a wide range of business decisions ranging from operational to strategic 4
  • 5. BENEFITS OF BUSINESS INTELLIGENCE • The potential benefits of business intelligence programs include accelerating and improving decision making; optimizing internal business processes; increasing operational efficiency; driving new revenues; and gaining competitive advantages over business rivals. • It removes guesswork • Gives quicker responses to your business-related queries • Obtain important business metrics reports whenever and wherever you need them 5
  • 6. BENEFITS OF BUSINESS INTELLIGENCE • Gain a better understanding of business’ past, present and future • Gain valuable insight into your customer’s behaviour • Pinpoint up-selling as well as cross-selling opportunities • Develop efficiency 6
  • 7. Intelligent Value Chain Networks Results in Business Intelligence 7
  • 9. BUSINESS INTELLIGENCE TOOLS • SAP Crystal Reports • SAS Enterprise BI Server • Oracle Business Intelligence Enterprise Edition Plus • IBM Cognos 8 BI • Microsoft PowerPivot • MicroStrategy Reporting Suite • Salesforce CRM • TIBCO Spotfire Analytics • Information Builders WebFOCUS 9
  • 10. GARTNER 2016 MAGIC QUADRANT FOR BUSINESS INTELLIGENCE 10
  • 12. WHAT IS DATA WAREHOUSING? • A data warehouse is a federated repository for all the data that an enterprise's various business systems collect • It is a collection of corporate information and data derived from operational systems and external data sources • A data warehouse is designed to support business decisions by allowing data consolidation, analysis and reporting at different aggregate levels 12
  • 13. MOST POPULAR DATA WAREHOUSING DEFINITIONS Ralph Kimball • A data warehouse is a copy of transaction data specifically structured for query and analysis Bill Inmon • A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process 13
  • 14. Properties of Data Warehousing 14
  • 15. Subject-Oriented A data warehouse can be used to analyze a particular subject area. For example, "sales" can be a particular subject 15
  • 16. Integrated A data warehouse integrates data from multiple data sources. For example, source A and source B may have different ways of identifying a product, but in a data warehouse, there will be only a single way of identifying a product 16
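A minimal sketch of the integration idea above, assuming two hypothetical sources that identify the same product differently (source A by SKU, source B by a vendor code). The mapping table, keys and amounts are illustrative assumptions, not taken from the slides.

```python
# Hypothetical mapping from (source, source-specific id) to one warehouse key.
PRODUCT_KEY_MAP = {
    ("A", "SKU-1001"): 1,
    ("B", "WIDGET_BLUE"): 1,  # same product, identified differently in source B
    ("A", "SKU-2002"): 2,
}


def integrate(rows):
    """Translate source-specific product ids into the single warehouse key."""
    integrated = []
    for source, source_id, amount in rows:
        key = PRODUCT_KEY_MAP[(source, source_id)]
        integrated.append({"product_key": key, "amount": amount, "source": source})
    return integrated


def total_by_product(rows):
    """Sum amounts per warehouse product key, regardless of source."""
    totals = {}
    for row in rows:
        totals[row["product_key"]] = totals.get(row["product_key"], 0.0) + row["amount"]
    return totals


source_rows = [
    ("A", "SKU-1001", 120.0),
    ("B", "WIDGET_BLUE", 80.0),
    ("A", "SKU-2002", 50.0),
]
warehouse_rows = integrate(source_rows)
print(total_by_product(warehouse_rows))
```

Once both sources resolve to the same key, sales of one product can be totalled without knowing which system recorded them.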
  • 17. Time-Variant Historical data is kept in a data warehouse. For example, one can retrieve data from 3 months, 6 months, 12 months, or even older data from a data warehouse. This contrasts with a transaction system, where often only the most recent data is kept 17
  • 18. Non-volatile Once data is in the data warehouse, it will not change. So, historical data in a data warehouse should never be altered 18
  • 19. Purpose of Data Warehousing 19
  • 20. Purpose of Data Warehousing • Keeping Analysis/Reporting and Production Separate • Information Integration from multiple systems: single point source for information • Data Consistency and Quality • Fast response time: production databases are tuned to the expected transaction load, not to analytical queries • Fast response time: dimensional modeling instead of normalized data • Establish the foundation for Decision Support • Maintain data history, even if the source transaction systems do not 20
  • 21. Difference between Data Warehousing and normal Database 21
  • 22. Data Warehousing vs. normal Database 1- SIZE Data warehouses are potentially much bigger than the databases from where the data is derived. Databases usually store only the data that is currently in active use; older records can be purged and moved to backups, mainly for performance reasons. Data warehouses are used to store much older historical records; it's also common to use data warehouses to store additional information that is bought or captured elsewhere to complement the information that is generated and stored by the internal database system 22
  • 23. Data Warehousing vs. normal Database 2- Normalization Databases are usually normalized, which means that a lot of work is done to guarantee that there's a unique copy of any given bit of information, which is important for performance and consistency reasons. But it's common to store different versions of the same information on a data warehouse, using different structures to compose and access the information. In other words, data warehouses are messier and more irregular, partly by design, as they need to be able to work with so many different sources of information 23
  • 24. Data Warehousing vs. normal Database 3- Access pattern Database records are often retrieved and updated one by one; data warehouses are nearly always accessed by reporting engines that work on entire datasets at a time to generate aggregates and other analytical information. Databases are frequently updated, sometimes only a field or record at a time; data warehouses aren't updated very frequently, and for all practical purposes, never at the field or record level; instead data is appended in large batches 24
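The two access patterns can be sketched side by side with Python's built-in `sqlite3` module. The table names (`accounts`, `sales_history`) and the data are hypothetical; a real warehouse would use a dedicated platform rather than SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# OLTP-style table: individual records are read and updated one at a time.
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO accounts VALUES (1, 100.0)")
conn.execute("UPDATE accounts SET balance = balance - 30.0 WHERE id = 1")
balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]

# Warehouse-style table: rows are appended in batches, not updated in place.
conn.execute("CREATE TABLE sales_history (sale_date TEXT, region TEXT, amount REAL)")
batch = [
    ("2017-01-15", "North", 200.0),
    ("2017-01-20", "South", 150.0),
    ("2017-02-03", "North", 100.0),
]
conn.executemany("INSERT INTO sales_history VALUES (?, ?, ?)", batch)

# Reporting queries scan the whole dataset to produce aggregates.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales_history GROUP BY region ORDER BY region"
).fetchall()
print(balance, totals)
```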
  • 25. Data Warehousing vs. normal Database 4- Use Normal databases are used for OLTP whereas data warehousing is used for OLAP 25
  • 26. Data Warehousing vs. normal Database 5- Performance For a normal database, write performance is important, so it is optimized for write operations. A data warehouse, in contrast, is optimized for read operations, and its performance is less critical. 26
  • 27. Data Warehousing vs. normal Database 6- Table & Joins For a normal database, the tables and joins are complex because they are normalized (for an RDBMS). This is done to reduce redundant data and to save storage space. For a data warehouse, the tables and joins are simple because they are de-normalized. This is done to reduce the response time for analytical queries. 27
  • 28. Data Warehousing vs. normal Database 7- Data source For a normal database, mostly internal data sources are used. For a data warehouse, external data sources may also be used, such as macroeconomic indicators, competitor data, market data, etc. 28
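As an illustration of the normalized versus de-normalized trade-off, the sketch below answers the same question (sales per city) against both layouts. The schema and rows are invented for the example; `sales_flat` stands in for a de-normalized warehouse table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized (OLTP-style): each fact is stored once and joined at query time.
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE products (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                     product_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Mumbai'), (2, 'Pune');
INSERT INTO products VALUES (10, 'Books'), (11, 'Toys');
INSERT INTO orders VALUES (100, 1, 10, 25.0), (101, 2, 10, 15.0), (102, 1, 11, 40.0);

-- De-normalized (warehouse-style): attributes are copied onto every row.
CREATE TABLE sales_flat AS
SELECT o.order_id, c.city, p.category, o.amount
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
JOIN products p ON p.product_id = o.product_id;
""")

# The normalized query needs a join to reach the city attribute.
normalized = conn.execute("""
    SELECT c.city, SUM(o.amount)
    FROM orders o JOIN customers c ON c.customer_id = o.customer_id
    GROUP BY c.city ORDER BY c.city
""").fetchall()

# The de-normalized query reads a single table.
denormalized = conn.execute(
    "SELECT city, SUM(amount) FROM sales_flat GROUP BY city ORDER BY city"
).fetchall()
print(normalized, denormalized)
```

Both queries return the same totals; the flat table trades redundant storage for a simpler, join-free read path.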
  • 29. DATA WAREHOUSING PRODUCTS • Teradata EDW (enterprise data warehouse) • Oracle Exadata • Amazon Redshift • Cloudera Enterprise Data Hub (EDH) • MarkLogic • IBM Netezza data warehouse appliance • SAP Business Warehouse • Microsoft SQL Server Parallel Data Warehouse 29
  • 30. GARTNER 2016 MAGIC QUADRANT FOR DATA WAREHOUSE 30
  • 31. 7 STEPS IN BUILDING DATA WAREHOUSE (MANAGEMENT VIEW) • Step 1: Determine Business Objectives • Step 2: Collect and Analyze Information • Step 3: Identify Core Business Processes • Step 4: Construct a Conceptual Data Model • Step 5: Locate Data Sources and Plan Data Transformations • Step 6: Set Tracking Duration • Step 7: Implement the Plan 31
  • 32. 3 STEPS IN BUILDING DATA WAREHOUSE (TECHNICAL VIEW) • Extract • Transform • Load 32
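The three technical steps can be sketched as a tiny pipeline using only the Python standard library. The CSV content, the cleaning rules (trim whitespace, normalize region names, drop rows without an amount) and the `fact_sales` table are assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from an operational system.
RAW_CSV = """order_id,region,amount
1, north ,100.50
2,SOUTH,200
3,north,
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean values and drop rows that lack an amount."""
    clean = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue
        region = row["region"].strip().title()
        clean.append((int(row["order_id"]), region, float(amount)))
    return clean


def load(conn, rows):
    """Load: append the cleaned rows to the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales (order_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT order_id, region, amount FROM fact_sales ORDER BY order_id"
).fetchall()
print(loaded)
```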
  • 35. DATA MART • The data mart is a subset of the data warehouse and is usually oriented to a specific business line or team • A data mart is a repository of data that is designed to serve a particular community of knowledge workers • Because data marts are optimized to look at data in a unique way, the design process tends to start with an analysis of user needs • Today, data virtualization software can be used to create virtual data marts, pulling data from disparate sources and combining it with other data as necessary to meet the needs of specific business users 35
  • 36. DATA MART • A virtual data mart provides knowledge workers with access to the data they need while preventing data silos and giving the organization's data management team a level of control over the organization's data throughout its lifecycle 36
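At small scale, a SQL view over a warehouse table behaves like a simple virtual data mart: it exposes only the slice one team needs without copying the data. The sketch below assumes a hypothetical `warehouse_facts` table and a `sales_mart` view; real data virtualization products work across many sources.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE warehouse_facts (department TEXT, metric TEXT, value REAL);
INSERT INTO warehouse_facts VALUES
  ('sales', 'revenue', 500.0),
  ('sales', 'orders', 42.0),
  ('hr', 'headcount', 120.0),
  ('finance', 'costs', 310.0);

-- The mart is a subset of the warehouse, oriented to one business line.
CREATE VIEW sales_mart AS
  SELECT metric, value FROM warehouse_facts WHERE department = 'sales';
""")

mart_rows = conn.execute(
    "SELECT metric, value FROM sales_mart ORDER BY metric"
).fetchall()
print(mart_rows)
```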
  • 37. REASONS FOR CREATING A DATA MART • Easy access to frequently needed data • Creates collective view by a group of users • Improves end-user response time • Ease of creation • Lower cost than implementing a full data warehouse • Potential users are more clearly defined than in a full data warehouse • Contains only business essential data and is less cluttered. 37
  • 40. DATA LAKE • A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed • A data lake uses a flat architecture to store data • Each data element in a lake is assigned a unique identifier and tagged with a set of extended metadata tags • When a business question arises, the data lake can be queried for relevant data, and that smaller set of data can then be analyzed to help answer the question • The term data lake is often associated with Hadoop-oriented object storage • In such a scenario, an organization's data is first loaded into the Hadoop platform, and then business analytics and data mining tools are applied to the data where it resides on Hadoop's cluster nodes 40
  • 42. DATA MART VS. DATA WAREHOUSE 1- Data Scope The first and most obvious difference is the scope of information each one stores. On one hand, data warehouses store all kinds of data related to the system. On the other hand, data marts store only subject-specific information and so are much more focused. 42
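The flat architecture described above (raw data, a unique identifier per element, metadata tags used for lookup) can be sketched in memory. The `DataLake` class, its method names and the tags are hypothetical; a production lake would sit on object storage or Hadoop rather than a Python dictionary.

```python
import uuid


class DataLake:
    """Toy flat store: every object gets a unique id and a set of metadata tags."""

    def __init__(self):
        self._objects = {}  # flat namespace: object id -> (raw data, tags)

    def put(self, raw, **tags):
        """Store raw data in its native format and return its unique id."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (raw, dict(tags))
        return object_id

    def find(self, **wanted):
        """Return ids of objects whose tags match every requested tag."""
        return [
            object_id
            for object_id, (_, tags) in self._objects.items()
            if all(tags.get(key) == value for key, value in wanted.items())
        ]

    def get(self, object_id):
        """Return the raw data for one object id."""
        return self._objects[object_id][0]


lake = DataLake()
log_id = lake.put(b"GET /index.html 200", kind="weblog", year=2017)
csv_id = lake.put("id,amount\n1,9.99\n", kind="sales", year=2017)
old_id = lake.put("id,amount\n7,1.50\n", kind="sales", year=2016)

# A business question selects the relevant subset by metadata, then analyzes it.
sales_2017 = lake.find(kind="sales", year=2017)
print(lake.get(sales_2017[0]))
```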
  • 43. DATA MART VS. DATA WAREHOUSE 2- Size We can say that a data warehouse is usually much bigger than data marts, because it keeps a lot more data. 43
  • 44. DATA MART VS. DATA WAREHOUSE 3-Integration A data warehouse usually integrates several sources of data in order to feed its database and the system’s needs. In contrast, a data mart has much less integration to do, since its data is very specific 44
  • 45. DATA MART VS. DATA WAREHOUSE 4- Data Scope The first and most obvious difference is the scope of information each one stores. On one hand, data warehouses store all kinds of data related to the system. On the other hand, data marts store only subject-specific information and so are much more focused. 45
  • 46. DATA MART VS. DATA WAREHOUSE 5- Creation Creating a data warehouse is far more difficult and time-consuming than building a data mart. Building the whole structure and the relationships between data is a long and very important step. We also need to think through and analyse how to integrate all of the information sources. Since data marts are smaller and subject-oriented, these actions tend to be much simpler. 46
  • 47. DATA MART VS. DATA WAREHOUSE 6-Management Like creation, the management of a data warehouse is far more complex than that of a data mart, for the same reasons: with much more data and many more relationships and processes to manage, it becomes a harder task. 47
  • 48. DATA MART VS. DATA WAREHOUSE 7- Cost Overall, data marts are cheaper than a data warehouse. To build and maintain a data warehouse we need significantly more physical resources such as servers, disk space, memory and CPU. Because it is less complex, a data mart requires less time to build and operate. 48
  • 49. DATA MART VS. DATA WAREHOUSE 8- Performance The performance of a system always depends on how it is built, the infrastructure which supports it, the processes, the number of users, etc. Usually a data mart is faster than a data warehouse because of the warehouse's inherent complexity and larger data volume. 49
  • 51. MULTIDIMENSIONAL ANALYSIS • Multi-Dimensional Analysis is an Informational Analysis on data which takes into account many different relationships, each of which represents a dimension • For example, a retail analyst may want to understand the relationships among sales by region, by quarter, by demographic distribution (income, education level, gender), by product • Multi-dimensional analysis will yield results for these complex relationships 51
  • 52. MULTIDIMENSIONAL ANALYSIS • Multi-dimensional Data Analysis (MDDA) refers to the process of summarizing data across multiple levels (called dimensions) and then presenting the results in a multi-dimensional grid format • This process is also referred to as OLAP cube, Data Pivot, Decision Cube, and Crosstab 52
  • 53. OLAP CUBE • An OLAP cube is a multidimensional database that is optimized for data warehouse and online analytical processing (OLAP) applications • An OLAP cube is a method of storing data in a multidimensional form, generally for reporting purposes • In OLAP cubes, data are categorized by dimensions • OLAP cubes are often pre-summarized across dimensions to drastically improve query time over relational databases 53
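The pre-summarization idea can be sketched in a few lines: totals are computed once for every combination of dimensions, so a later query is a dictionary lookup rather than a scan. The dimensions (`region`, `quarter`, `product`) and the sales figures are invented, and real OLAP engines use far more compact storage than this toy cube.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("region", "quarter", "product")

facts = [
    {"region": "North", "quarter": "Q1", "product": "A", "sales": 100},
    {"region": "North", "quarter": "Q2", "product": "A", "sales": 150},
    {"region": "South", "quarter": "Q1", "product": "B", "sales": 80},
    {"region": "South", "quarter": "Q1", "product": "A", "sales": 20},
]


def build_cube(rows):
    """Pre-summarize sales for every subset of dimensions and their values."""
    cube = defaultdict(int)
    for row in rows:
        for size in range(len(DIMENSIONS) + 1):
            for dims in combinations(DIMENSIONS, size):
                key = tuple((dim, row[dim]) for dim in dims)
                cube[key] += row["sales"]
    return cube


def query(cube, **filters):
    """Look up a pre-computed total; dimensions left out are aggregated over."""
    key = tuple((dim, filters[dim]) for dim in DIMENSIONS if dim in filters)
    return cube.get(key, 0)


cube = build_cube(facts)
print(query(cube))                                # grand total
print(query(cube, region="North"))                # one dimension fixed
print(query(cube, region="South", quarter="Q1"))  # two dimensions fixed
```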
  • 56. WHAT IS DATA MINING? • Data mining is the practice of automatically searching large stores of data to discover patterns and trends that go beyond simple analysis • Data mining uses sophisticated mathematical algorithms to segment the data and evaluate the probability of future events • It is the process of finding anomalies, patterns and correlations within large data sets to predict outcomes • The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use • Also known as Knowledge Discovery in Data (KDD) 56
  • 57. The phases, and the iterative nature, of a data mining project. The process flow shows that a data mining project does not stop when a particular solution is deployed. The results of data mining trigger new business questions, which in turn can be used to develop more focused models. 57
  • 58. 1- PROBLEM DEFINITION • This initial phase of a data mining project focuses on understanding the project objectives and requirements. Once we have specified the project from a business perspective, we can formulate it as a data mining problem and develop a preliminary implementation plan. • For example, the business problem might be: "How can I sell more of my product to customers?" You might translate this into a data mining problem such as: "Which customers are most likely to purchase the product?" 58
  • 59. 2- Data Gathering and Preparation The data understanding phase involves data collection and exploration. As you take a closer look at the data, you can determine how well it addresses the business problem. You might decide to remove some of the data or add additional data. This is also the time to identify data quality problems and to scan for patterns in the data. 59
  • 60. 3- Model Building and Evaluation In this phase, you select and apply various modeling techniques and calibrate the parameters to optimal values. If the algorithm requires data transformations, you will need to step back to the previous phase to implement them. 60
  • 61. 4- Knowledge Deployment • Knowledge deployment is the use of data mining within a target environment • In the deployment phase, insight and actionable information can be derived from data 61
  • 63. Data Mining Models • A mining model is created by applying an algorithm to data • It is a set of data, statistics, and patterns that can be applied to new data to generate predictions and make inferences about relationships • A data mining model gets data from a mining structure and then analyzes that data by using a data mining algorithm • The mining structure and mining model are separate objects • The mining structure stores information that defines the data source 63
  • 64. Data Mining Models • A mining model stores information derived from statistical processing of the data, such as the patterns found as a result of analysis • A mining model is empty until the data provided by the mining structure has been processed and analyzed. • After a mining model has been processed, it contains metadata, results, and bindings back to the mining structure • Model contains metadata, patterns, and bindings 64
  • 67. Data Mining Algorithms • An algorithm in data mining is a set of heuristics and calculations that creates a model from data • To create a model, the algorithm first analyzes the data you provide, looking for specific types of patterns or trends • The algorithm uses the results of this analysis over many iterations to find the optimal parameters for creating the mining model • These parameters are then applied across the entire data set to extract actionable patterns and detailed statistics. 67
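One concrete instance of "many iterations to find the optimal parameters" is fitting a straight line by gradient descent. This is only an illustrative stand-in: the slides do not name a specific algorithm, and the data, learning rate and iteration count below are assumptions.

```python
def fit_line(xs, ys, learning_rate=0.05, iterations=5000):
    """Iteratively adjust slope w and intercept b to minimize mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(w, b, x):
    """Apply the fitted parameters to a new input."""
    return w * x + b


# Hypothetical history: period number -> units sold (exactly y = 2x + 1).
periods = [1, 2, 3, 4]
units = [3, 5, 7, 9]

w, b = fit_line(periods, units)
print(round(w, 3), round(b, 3), round(predict(w, b, 5), 3))
```

The fitted pair `(w, b)` plays the role of the mining model: it is derived from the training data and then applied to inputs the algorithm has not seen.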
  • 68. Data Mining Algorithms • The mining model that an algorithm creates from your data can take various forms, including: 1. A set of clusters that describe how the cases in a dataset are related 2. A decision tree that predicts an outcome, and describes how different criteria affect that outcome 3. A mathematical model that forecasts sales 4. A set of rules that describe how products are grouped together in a transaction, and the probabilities that products are purchased together 68
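The fourth model form, rules about products purchased together and their probabilities, can be sketched by counting item pairs in transactions. The baskets are made up, and the measure computed here is the standard rule confidence, P(second item | first item); real tools add support thresholds and larger itemsets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets, one set of products per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def pair_rules(baskets):
    """Return {(a, b): confidence}, the share of baskets with a that also contain b."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        item_counts.update(basket)
        pair_counts.update(combinations(sorted(basket), 2))
    rules = {}
    for (a, b), together in pair_counts.items():
        rules[(a, b)] = together / item_counts[a]
        rules[(b, a)] = together / item_counts[b]
    return rules


rules = pair_rules(transactions)
for (first, second), confidence in sorted(rules.items()):
    print(f"{first} -> {second}: {confidence:.2f}")
```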
  • 70. Data mining tools (Analytics) 70