Slowly Changing Dimension: Categories
By:
Prof. Sunita Sahu
Assistant Prof, VESIT,Mumbai
Slowly Changing Dimension: Categories
 A slowly changing dimension is one whose attributes change
slowly over time rather than on a regular, time-based schedule.
 In a data warehouse, changes in dimension attributes must be
tracked in order to report historical data.
 The usual changes to dimension tables are
classified into three types
 Type 1
 Type 2
 Type 3
Example
 Order fact: Product Key, Time Key, Customer Key, Salesperson Key,
Order Dollars, Cost Dollars, Margin Dollars, Sale Units
 Customer: Customer Key, Customer Name, Customer Code,
Marital Status, Address, State, Zip
 Salesperson: Salesperson Key, Salesperson Name, Territory Name,
Region Name
 Product: Product Key, Product Name, Product Code, Product Line,
Brand
 Time: Time Key, Date, Month, Quarter, Year
Type 1 Changes: Error Correction
 Usually relate to corrections of errors in the source
system.
 For example, in the customer dimension, a name changes
because a spelling mistake was corrected.
Type 1 Changes, cont.
General Principles for Type 1 changes:
 Usually, the changes relate to correction of errors in
the source system
 Sometimes the change in the source system has no
significance
 The old value in the source system needs to be
discarded
 The change in the source system need not be
preserved in the DWH
Applying Type 1 changes
 Overwrite the attribute value in the dimension table
row with the new value
 The old value of the attribute is not preserved
 No other changes are made in the dimension table
row.
 The key of this dimension table or any other key
values are not affected.
 Easiest to implement.
 Before the change:
Customer_ID  Customer_Name  Customer_Type
1            Cust_1         Corporate
 After the change:
Customer_ID  Customer_Name  Customer_Type
1            Cust_1         Retail
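The overwrite shown in the tables above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory dictionary and the helper name `apply_type1` are hypothetical stand-ins for a real dimension table and its load job.

```python
# Hypothetical in-memory customer dimension, keyed by Customer_ID.
customer_dim = {
    1: {"Customer_Name": "Cust_1", "Customer_Type": "Corporate"},
}


def apply_type1(dim, key, attribute, new_value):
    """Type 1: overwrite the attribute in place; the old value is not kept."""
    dim[key][attribute] = new_value


apply_type1(customer_dim, 1, "Customer_Type", "Retail")
print(customer_dim[1])
```

No row is added and the key is untouched, which is why this is the easiest type to implement.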
Type 2 Changes:
 Let’s look at the marital status of a customer.
 One of the DWH’s requirements is to track orders by
marital status
 All orders before 11/10/2004 will be under Marital
Status = Single, and all orders after that date will be
under Marital Status = Married
 We need to aggregate the orders before and after the
marriage separately
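A minimal sketch of why two dimension rows make this aggregation possible. The surrogate keys, customer code, and order amounts below are all hypothetical; the point is only that each order fact carries the key of the dimension row that was current when the order was placed.

```python
from collections import defaultdict

# Two rows for the same customer, one per marital status (hypothetical keys).
customer_dim = {
    101: {"Customer_Code": "C-1", "Marital_Status": "Single"},
    102: {"Customer_Code": "C-1", "Marital_Status": "Married"},
}

# Hypothetical order facts: (customer_key, order_dollars).
order_facts = [(101, 250.0), (101, 100.0), (102, 400.0)]


def dollars_by_status(dim, facts):
    """Sum order dollars per marital status by joining facts to the dimension."""
    totals = defaultdict(float)
    for customer_key, dollars in facts:
        totals[dim[customer_key]["Marital_Status"]] += dollars
    return dict(totals)


totals = dollars_by_status(customer_dim, order_facts)
print(totals)  # {'Single': 350.0, 'Married': 400.0}
```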
Type 2 Changes, cont.
 General Principles for Type 2 changes:
 They usually relate to true changes in source
systems.
 There is a need to preserve history in the DWH.
 This type of change partitions the history in the DWH.
 Every change for the same attributes must be
preserved.
Type 2 Implementation
 The steps:
 Add a new dimension table row with the new value of
the changed attribute
 An effective date will be included in the dimension
table
 There are no changes to the original row in the
dimension table
 The key of the original row is not affected
 The new row is inserted with a new surrogate key
Type 2 Example
 Before the change:
Customer_ID  Customer_Name  Customer_Type  Start_Date  End_Date
1            Cust_1         Corporate      22-07-2010  31-12-9999
 After the change:
Customer_ID  Customer_Name  Customer_Type  Start_Date  End_Date
1            Cust_1         Corporate      22-07-2010  31-12-9999
2            Cust_1         Retail         22-07-2010  31-12-9999
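The Type 2 steps can be sketched as follows. This is a hedged illustration, not a reference implementation: the list of dictionaries, the `OPEN_END` constant, and the helper `apply_type2` are hypothetical, and the effective date is simply the one shown on the slide. The original row is left untouched, as the steps state; some implementations additionally close the End_Date of the superseded row, which is not done here.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # "no end yet" marker, as in the example rows

# Hypothetical in-memory dimension holding the row before the change.
customer_dim = [
    {"Customer_ID": 1, "Customer_Name": "Cust_1", "Customer_Type": "Corporate",
     "Start_Date": date(2010, 7, 22), "End_Date": OPEN_END},
]


def apply_type2(dim, source_row, changes, effective_date):
    """Type 2: insert a new row with a new surrogate key; keep the old row."""
    new_row = dict(source_row)  # copy, so the original row is not modified
    new_row.update(changes)
    new_row["Customer_ID"] = max(r["Customer_ID"] for r in dim) + 1
    new_row["Start_Date"] = effective_date
    new_row["End_Date"] = OPEN_END
    dim.append(new_row)
    return new_row


apply_type2(customer_dim, customer_dim[0],
            {"Customer_Type": "Retail"}, date(2010, 7, 22))
for row in customer_dim:
    print(row)
```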
Type 3 Changes
 In a Type 3 Slowly Changing Dimension, two columns hold the
attribute of interest: one for the original value and one for the
current value.
 A further column indicates when the current value became
active.
 Not common at all
 Time-consuming
 We want to track history without taking on a heavy burden.
 There are many soft changes and we don’t care about the
“far” history
Type 3 Changes
 General Principles:
 They usually relate to “soft” or tentative changes in
the source systems
 There is a need to keep track of history with old and
new values of the changed attribute
 They are used to compare performances across the
transition
 They provide the ability to track forward and backward
Type 3
 No new dimension row is needed
 The existing queries will seamlessly switch to the
current value.
 Any queries that need to use the old value must be
revised accordingly.
 The technique works best for one soft change at a
time.
 If there is a succession of changes, more
sophisticated techniques must be devised
Customer Key  Name      State
1001          Williams  New York
 After Williams moved from New York to Los Angeles, the
original information gets updated, and we have the following
table (assuming the effective date of change is February 20,
2010):
Customer Key  Name      Original State  Current State  Effective Date
1001          Williams  New York        Los Angeles    20-FEB-2010

Type 3
 Advantages
 This does not increase the number of rows in the table, since
the new information overwrites existing columns.
 This allows us to keep some part of history.
 Disadvantages
 Type 3 will not be able to keep all history where an attribute is
changed more than once. For example, if Williams later
moves to Texas on December 15, 2003, the Los Angeles
information will be lost.
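The Williams example, including the loss of the intermediate value, can be sketched like this. The dictionary layout and the helper `apply_type3` are hypothetical; the dates are kept as strings exactly as printed on the slides.

```python
# Hypothetical Type 3 row: original and current value live side by side.
customer = {
    "Customer_Key": 1001,
    "Name": "Williams",
    "Original_State": "New York",
    "Current_State": "New York",
    "Effective_Date": None,
}


def apply_type3(row, new_state, effective_date):
    """Type 3: overwrite only the current-value column and its effective date."""
    row["Current_State"] = new_state
    row["Effective_Date"] = effective_date
    return row


apply_type3(customer, "Los Angeles", "20-FEB-2010")
after_first_move = dict(customer)  # snapshot for comparison

# A second change overwrites Current_State again (date as printed on the slide).
apply_type3(customer, "Texas", "15-DEC-2003")
print(after_first_move)
print(customer)
```

After the second call the row still has New York as the original state and Texas as the current one; Los Angeles no longer appears anywhere, which is the disadvantage described above.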
Large Dimension Table
 A dimension table can be large in two ways.
 Very deep: that is, the dimension has a very large
number of rows.
 Very wide: that is, the dimension has a large
number of attributes or columns.
 In a data warehouse, typically the customer and
product dimensions are likely to be large.
 Such customer dimension tables may have as
many as 100 million rows.
 The product dimension of large retailers is also
quite huge.
Junk Dimension
 The junk dimension is simply a structure that provides a convenient
place to store the junk attributes. It is just a collection of random
transactional codes, flags and/or text attributes that are unrelated to
any particular dimension.
 OLTP tables are often full of flag fields and yes/no attributes, many
of which are used for operational support and have no
documentation except for the column names and the memory banks
of the person who created them. Not only do those types of attributes
not integrate easily into conventional dimensions such as Customer,
Vendor, Time, Location, and Product, but you also don’t want to carry
bad design into the data warehouse. However, some of the
miscellaneous attributes will contain data that has significant
business value, so you have to do something with them.

Junk Dimension
 Advantage of junk dimension:
 It provides a recognizable location for related codes,
indicators and their descriptors in a dimensional
framework.
 This avoids the creation of multiple dimension tables.
 It provides a smaller, quicker point of entry for queries
than when these attributes are stored directly in the
fact table.
 An interesting use for a junk dimension is to capture the
context of a specific transaction. While our common,
conformed dimensions contain the key dimensional
attributes of interest, there are likely attributes about the
transaction that are not known until the transaction is
processed.
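One way to picture a junk dimension is as a small table with one row per combination of low-cardinality flags, so that a fact row stores a single junk key instead of several flag columns. The sketch below is an assumption-laden illustration: the flag names, their values, and the helper `build_junk_dimension` are all hypothetical.

```python
from itertools import product

# Hypothetical miscellaneous flags that belong to no regular dimension.
flag_domains = {
    "Payment_Type": ["Cash", "Card"],
    "Gift_Wrap": ["Y", "N"],
    "Order_Channel": ["Web", "Store"],
}


def build_junk_dimension(domains):
    """Create one junk-dimension row per combination of flag values."""
    names = list(domains)
    rows = {}
    for key, combo in enumerate(product(*domains.values()), start=1):
        rows[key] = dict(zip(names, combo))
    return rows


junk_dim = build_junk_dimension(flag_domains)

# Reverse lookup used at load time: flag combination -> junk key.
lookup = {tuple(row.values()): key for key, row in junk_dim.items()}
junk_key = lookup[("Card", "N", "Web")]
print(len(junk_dim), junk_dim[junk_key])
```

Enumerating every combination up front is only practical when the flags have few distinct values; an alternative is to add rows as new combinations are observed in the source data.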

Rapidly Changing Dimensions
 A dimension is rapidly changing if one or more of its
attributes changes frequently.
 When you deal with a type 2 change, you create an
additional dimension table row with the new value of
the changed attribute. By doing so, you are able to
preserve the history.
 Consider the customer dimension. Here the number of
rows tends to be large, sometimes in the range of
even a million or more rows. But significant attributes
in a customer dimension may change many times in a
year. Rapidly changing large dimensions can be too
problematic for the type 2 approach.
Rapidly Changing Dimensions
 One effective approach is to break the large
dimension table into one or more simpler dimension
tables. How can you accomplish this?
 Obviously, you need to break off the rapidly
changing attributes into another dimension table,
leaving the slowly changing attributes behind in the
original table.
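The split described above can be sketched as follows. Everything here is hypothetical: the attribute names (`Income_Band`, `Credit_Rating`), the sample rows, and the helper `split_dimension`. The sketch keeps the slowly changing attributes in the original dimension and moves each distinct combination of rapidly changing attributes into a second, smaller table with its own key.

```python
def split_dimension(rows, key_name, rapid_attrs):
    """Separate rapidly changing attributes into their own dimension table."""
    slow_dim = []   # original dimension, minus the rapidly changing attributes
    rapid_dim = {}  # distinct combination of rapid attributes -> new key
    links = {}      # original key -> key of the current rapid-attribute row
    for row in rows:
        slow_dim.append({k: v for k, v in row.items() if k not in rapid_attrs})
        combo = tuple(row[a] for a in rapid_attrs)
        rapid_key = rapid_dim.setdefault(combo, len(rapid_dim) + 1)
        links[row[key_name]] = rapid_key
    return slow_dim, rapid_dim, links


customers = [
    {"Customer_Key": 1, "Name": "A", "Income_Band": "High", "Credit_Rating": "Good"},
    {"Customer_Key": 2, "Name": "B", "Income_Band": "Low", "Credit_Rating": "Fair"},
    {"Customer_Key": 3, "Name": "C", "Income_Band": "High", "Credit_Rating": "Good"},
]

slow_dim, rapid_dim, links = split_dimension(
    customers, "Customer_Key", ["Income_Band", "Credit_Rating"])
print(slow_dim)
print(rapid_dim)
print(links)
```

When a rapidly changing attribute takes a new value, only the link to the smaller table changes; no new row is needed in the large customer dimension.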
Solution to rapidly changing dimension
 Large dimensions call for special considerations.
 Because of the sheer size, many data warehouse
functions involving large dimensions may be slow
and inefficient.
 You need to address the following issues by using
effective design methods, by choosing proper
indexes, and by applying other optimizing
techniques:

 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Comparative Analysis of Text Summarization Techniques
Comparative Analysis of Text Summarization TechniquesComparative Analysis of Text Summarization Techniques
Comparative Analysis of Text Summarization Techniques
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 

Slowly changing dimension

  • 1. Slowly Changing Dimension: Categories
      • By: Prof. Sunita Sahu, Assistant Prof, VESIT, Mumbai
  • 2. Slowly Changing Dimension: Categories
      • Dimensions that change slowly over time, rather than on a regular, time-based schedule.
      • In a data warehouse there is a need to track changes in dimension attributes in order to report historical data.
      • The usual changes to dimension tables are classified into three types:
          • Type 1
          • Type 2
          • Type 3
  • 3. Example
      • Order fact: Product Key, Time Key, Customer Key, Salesperson Key, Order Dollars, Cost Dollars, Margin Dollars, Sale Units
      • Customer: Customer Key, Customer Name, Customer Code, Marital Status, Address, State, Zip
      • Salesperson: Salesperson Key, Salesperson Name, Territory Name, Region Name
      • Product: Product Key, Product Name, Product Code, Product Line, Brand
      • Time: Time Key, Date, Month, Quarter, Year
  • 4. Type 1 Changes: Error Correction
      • Usually relate to corrections of errors in the source system.
      • For example, in the customer dimension: a change in name because of a spelling mistake.
  • 5. Type 1 Changes, cont.
      General Principles for Type 1 changes:
      • Usually, the changes relate to correction of errors in the source system.
      • Sometimes the change in the source system has no significance.
      • The old value in the source system needs to be discarded.
      • The change in the source system need not be preserved in the DWH.
  • 6. Applying Type 1 changes
      • Overwrite the attribute value in the dimension table row with the new value.
      • The old value of the attribute is not preserved.
      • No other changes are made in the dimension table row.
      • Neither the key of this dimension table nor any other key value is affected.
      • Easiest to implement.
  • 7.
      • Before the change:
          Customer_ID  Customer_Name  Customer_Type
          1            Cust_1         Corporate
      • After the change:
          Customer_ID  Customer_Name  Customer_Type
          1            Cust_1         Retail
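
The Type 1 overwrite described above can be sketched in a few lines of Python. This is only an illustration, assuming the dimension is held as an in-memory dict keyed by Customer_ID; the function name `apply_type1` is hypothetical and does not come from the slides.

```python
# Customer dimension keyed by Customer_ID (values from the example above).
customer_dim = {
    1: {"Customer_Name": "Cust_1", "Customer_Type": "Corporate"},
}


def apply_type1(dim, key, attribute, new_value):
    """Overwrite one attribute in place; the old value is not preserved."""
    dim[key][attribute] = new_value


apply_type1(customer_dim, 1, "Customer_Type", "Retail")
print(customer_dim[1])
```

The key and the row count stay the same; only the attribute value changes, which is why reports run after the change can no longer see "Corporate".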
  • 8. Type 2 Changes
      • Let's look at the marital status of a customer.
      • One of the DWH's requirements is to track orders by marital status.
      • All orders before 11/10/2004 will be under Marital Status = Single, and all orders after that date will be under Marital Status = Married.
      • We need to aggregate the orders before and after the marriage separately.
  • 9. Type 2 Changes, cont.
      General Principles for Type 2 changes:
      • They usually relate to true changes in source systems.
      • There is a need to preserve history in the DWH.
      • This type of change partitions the history in the DWH.
      • Every change to the same attribute must be preserved.
  • 10. Type 2 Implementation
      The steps:
      • Add a new dimension table row with the new value of the changed attribute.
      • An effective date will be included in the dimension table.
      • There are no changes to the original row in the dimension table.
      • The key of the original row is not affected.
      • The new row is inserted with a new surrogate key.
  • 11. Type 2 Example
      • Before the change:
          Customer_ID  Customer_Name  Customer_Type  Start_Date  End_Date
          1            Cust_1         Corporate      22-07-2010  31-12-9999
      • After the change:
          Customer_ID  Customer_Name  Customer_Type  Start_Date  End_Date
          1            Cust_1         Corporate      22-07-2010  31-12-9999
          2            Cust_1         Retail         22-07-2010  31-12-9999
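
The Type 2 steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the dimension is a plain Python list, the customer code `"C100"` and the function name `add_version` are hypothetical, and the effective date is stored as an opaque string exactly as written on the slide.

```python
from itertools import count

# Surrogate key generator; starting at 1 is an arbitrary choice.
next_key = count(1)

customer_dim = []  # each row is one version of a customer


def add_version(dim, customer_code, marital_status, effective_date):
    """Append a new row with a fresh surrogate key; existing rows are untouched."""
    row = {
        "Customer_Key": next(next_key),
        "Customer_Code": customer_code,
        "Marital_Status": marital_status,
        "Effective_Date": effective_date,
    }
    dim.append(row)
    return row["Customer_Key"]


# The first version has no known effective date in the slides, so None is used.
first_key = add_version(customer_dim, "C100", "Single", None)
second_key = add_version(customer_dim, "C100", "Married", "11/10/2004")

for row in customer_dim:
    print(row)
```

Orders recorded before the change keep pointing at `first_key` and orders recorded afterwards use `second_key`, so the two periods can be aggregated separately while the shared customer code still ties both rows to the same customer.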
  • 12. Type 3 Changes
      • In a Type 3 Slowly Changing Dimension there are two columns for the attribute of interest: one holding the original value and one holding the current value.
      • There is also a column that indicates when the current value became active.
      • Not common at all.
      • Time-consuming.
      • We want to track history without taking on a heavy burden.
      • There are many soft changes, and we do not care about the "far" history.
  • 13. Type 3 Changes
      General Principles:
      • They usually relate to "soft" or tentative changes in the source systems.
      • There is a need to keep track of history with the old and new values of the changed attribute.
      • They are used to compare performance across the transition.
      • They provide the ability to track forward and backward.
  • 14. Type 3
      • No new dimension row is needed.
      • The existing queries will seamlessly switch to the current value.
      • Any queries that need to use the old value must be revised accordingly.
      • The technique works best for one soft change at a time.
      • If there is a succession of changes, more sophisticated techniques must be devised.
  • 15. Type 3
          Customer Key  Name      State
          1001          Williams  New York
      • After Williams moved from New York to Los Angeles, the original information gets updated, and we have the following table (assuming the effective date of change is February 20, 2010):
          Customer Key  Name      Original State  Current State  Effective Date
          1001          Williams  New York        Los Angeles    20-FEB-2010
  • 16. Type 3
      • Advantages
          • This does not increase the size of the table, since the new information is updated in place.
          • This allows us to keep some part of history.
      • Disadvantages
          • Type 3 cannot keep all history when an attribute changes more than once. For example, if Williams later moves to Texas, the Los Angeles information will be lost.
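
The Type 3 behaviour, including the loss of intermediate history, can be illustrated with a short sketch. It assumes a single row held as a Python dict; the function name `apply_type3` is hypothetical, and the date of the second move is left unspecified (`None`).

```python
def apply_type3(row, attribute, new_value, effective_date):
    """Update the 'current' column in place; the 'original' column is never touched."""
    row[f"Current_{attribute}"] = new_value
    row["Effective_Date"] = effective_date


customer = {
    "Customer_Key": 1001,
    "Name": "Williams",
    "Original_State": "New York",
    "Current_State": "New York",
    "Effective_Date": None,
}

apply_type3(customer, "State", "Los Angeles", "20-FEB-2010")
print(customer["Original_State"], "->", customer["Current_State"])

# A second move overwrites Current_State, so Los Angeles is no longer recorded.
apply_type3(customer, "State", "Texas", None)  # date left unspecified in this sketch
print(customer["Original_State"], "->", customer["Current_State"])
```

No row is added at any point, which is the advantage noted above; the cost is that only the original and the latest value survive.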
  • 17. Large Dimension Table
      • A dimension table can be large in two ways:
          • Very deep: the dimension has a very large number of rows.
          • Very wide: the dimension has a large number of attributes or columns.
      • In a data warehouse, the customer and product dimensions are typically the ones likely to be large.
      • Such customer dimension tables may have as many as 100 million rows.
      • The product dimension of large retailers is also quite huge.
  • 18. Junk Dimension
      • The junk dimension is simply a structure that provides a convenient place to store the junk attributes. It is just a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension.
      • OLTP tables are often full of flag fields and yes/no attributes, many of which are used for operational support and have no documentation except for the column names and the memory banks of the person who created them.
      • Those types of attributes do not integrate easily into conventional dimensions such as Customer, Vendor, Time, Location, and Product, and you also don't want to carry bad design into the data warehouse.
      • However, some of the miscellaneous attributes will contain data that has significant business value, so you have to do something with them.
  • 19. Junk Dimension
      Advantages of a junk dimension:
      • It provides a recognizable location for related codes, indicators and their descriptors in a dimensional framework.
      • This avoids the creation of multiple dimension tables.
      • It provides a smaller, quicker point of entry for queries than when these attributes sit directly in the fact table.
      • An interesting use for a junk dimension is to capture the context of a specific transaction. While our common, conformed dimensions contain the key dimensional attributes of interest, there are likely attributes about the transaction that are not known until the transaction is processed.
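
One possible way to build a junk dimension is to pre-generate a row for every combination of the miscellaneous flags and let each fact row store a single key. The sketch below assumes three hypothetical flags (`Payment_Type`, `Is_Gift`, `Is_Rush`) that do not appear in the slides; the function name `junk_key` is likewise an assumption.

```python
from itertools import product

# Hypothetical low-cardinality flags that fit no regular dimension.
flag_domains = {
    "Payment_Type": ["Cash", "Card"],
    "Is_Gift": [False, True],
    "Is_Rush": [False, True],
}

# One junk-dimension row per combination, each with its own surrogate key.
junk_dim = {}
for key, combo in enumerate(product(*flag_domains.values()), start=1):
    junk_dim[combo] = key


def junk_key(payment_type, is_gift, is_rush):
    """Return the single key a fact row stores instead of three flag columns."""
    return junk_dim[(payment_type, is_gift, is_rush)]


print(len(junk_dim))                  # 2 * 2 * 2 = 8 combinations
print(junk_key("Card", True, False))
```

Pre-generating all combinations is practical only while the flags have few distinct values; with many flags, rows could instead be added as new combinations are observed.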
  • 21. Rapidly Changing Dimensions
      • A dimension is rapidly changing if one or more of its attributes changes frequently.
      • When you deal with a type 2 change, you create an additional dimension table row with the new value of the changed attribute. By doing so, you are able to preserve the history.
      • Consider the customer dimension. Here the number of rows tends to be large, sometimes in the range of a million or more rows, yet significant attributes in a customer dimension may change many times in a year. Rapidly changing large dimensions can be too problematic for the type 2 approach.
  • 22. Rapidly Changing Dimensions
      • One effective approach is to break the large dimension table into one or more simpler dimension tables. How can you accomplish this?
      • Break off the rapidly changing attributes into another dimension table, leaving the slowly changing attributes behind in the original table.
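
Breaking off the rapidly changing attributes might look like the sketch below. Which attributes count as rapidly changing is a design decision; the names `Income_Band` and `Credit_Rating`, the function `split_row`, and the choice to give the new table its own key per distinct combination are all assumptions made for illustration.

```python
# Hypothetical set of attributes judged to change rapidly.
RAPID_ATTRIBUTES = ("Income_Band", "Credit_Rating")

rapid_dim = {}  # maps a combination of rapid attribute values to its key


def split_row(row):
    """Return (slow_part, rapid_key) for one wide customer row."""
    slow_part = {k: v for k, v in row.items() if k not in RAPID_ATTRIBUTES}
    combo = tuple(row[name] for name in RAPID_ATTRIBUTES)
    # Reuse the existing key if this combination has been seen before.
    rapid_key = rapid_dim.setdefault(combo, len(rapid_dim) + 1)
    return slow_part, rapid_key


wide_row = {
    "Customer_Key": 1001,
    "Customer_Name": "Williams",
    "State": "New York",
    "Income_Band": "Medium",
    "Credit_Rating": "Good",
}

slow_part, rapid_key = split_row(wide_row)
print(slow_part)
print(rapid_key, rapid_dim)
```

With this split, a change in a rapid attribute selects a different row in the small table rather than adding another row to the large customer dimension.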
  • 23. Solution to rapidly changing dimension
      • Large dimensions call for special considerations.
      • Because of the sheer size, many data warehouse functions involving large dimensions may be slow and inefficient.
      • You need to address the following issues by using effective design methods, by choosing proper indexes, and by applying other optimizing techniques: