SlideShare une entreprise Scribd logo
1  sur  23
Rameswara Reddy.K.V
 Traditional OLTP not designed for DWH
 DWH projects  data model, database schema
 Data model  Analysis but poor performance
 Model schema
 The schema methodology that is gaining wide spread
acceptance for dwh is Star Schema.
Data Layout for Best Access
 All industries have developed considerable expertise in
implementing efficient operational systems such as
payroll, inventory tracking, and purchasing.
 The original objectives in developing an abstract
model known as the relational model
 The data warehouse RDBMS typically needs to process
queries that are large complex, ad hoc, and data
intensive. so data warehouse RDBMS are very different
,it use a database schema for maximizing concurrency
and optimizing insert, update, and delete
performance.
 For solving modern business problems such as market
analysis and financial forecasting requires query-
centric database schemas that are array oriented and
multidimensional in nature.
Multi dimensional data model
 How much revenue did the new product generate?
 How much revenue did the new product generate by
month, in the north eastern division, broken down by
user demographic, by sales office, relative to the
previous version of the product, compare with the
plan?
 The major database technology for DWH is RDBMS.
Star Schema
 The multi dimensional view of data that is expressed
using relational database semantics is provided by the
database schema design called star schema.
Two groups: facts and dimensions
Facts are the core data element being analyzed
e.g.. items sold
dimensions are attributes about the facts
e.g. date of purchase
DBA View point
 A star schema is a relational schema organized around
a central table joined to few smaller tables (dimension
tables)using foreign key references.
 The facts are typically additive and are accessed via
dimensions.
 Dimension tables are heavily indexed
 A star schema created for every industry
Potential performance problem
with star schemas
 Indexing
 Level Indicators
 Star schema design
 Star schema join problem
Indexing
 It improve the performance in the star schema design.
 The table in star schema design contain the entire
hierarchy of attributes(PERIOD dimension this
hierarchy could be day->week->month->quarter-
>year),one approach is to create multi part key of day,
week, month ,quarter ,year .it presents some problems
in the star schema model because it should be in
normalized
problem
 it require multiple metadata definitions
 Since the fact table must carry all key components as
part of its primary key, addition or deletion of levels in
the physical modification of the affected table.
 Carrying all the segments of the compound
dimensional key in the fact table increases the size of
the index, thus impacting both performance and
scalability.
Level Indicator
 Another potential problem with the star schema
design is that in order to navigate the dimensions
successfully.
 The dimensional table design includes a level of
hierarchy indicator for every record.
 Every query that is retrieving detail records from a
table that stores details & aggregates must use this
indicator as an additional constraint to obtain a correct
result.
Pairwise problem:
 The traditional OLAP RDBMS engines are not designed for the set of complex
queries that can be issued against a star schema.
 We need to retrieve related information from several tables in a single query
has a limitation
 Many OLTP RDBMS can join only two tables, so the complex query must be in
the RDBMS needs to break the query into series of pairwise joins. And also
provide the intermediate result. this process will be do up to end of the result
 Thus intermediate results can be large & very costly to create. It affects the
query performance. The number of ways to pairwise join a set of N tables is N!(
N factorial) for example.
 The query has five table 5!=(6x5x4x3x2x1)=120 combinations.
Star schema join problem:
 Because the number of pairwise join combinations is often too
large to fully evaluate many RDBMS optimizers limit the
selection on the basis of a particular criterion. In data
warehousing environment this strategy is very inefficient for star
schemas.
 In a star schema the only table directly related to most other
tables in the fact table this means that the fact table is natural
candidate for the first pairwise join.
 Unfortunately the fact table is typically the very largest table in
the query. A pairwise join order generates very large intermediate
result set. So it affects query performance.
Solution to performance
problems:
 A common optimization provides some relief for the
star schema join problem.
 The basic idea of this optimization is to get around the
pairwise join strategy of selecting only related tables.
 When two tables are joined and no columns ―link‖
the tables every combination of two tables’ rows are
produced. In terms of relational algebra this is called a
Cartesian product.
Contd..
 Normally the RDBMS optimizer logic would never
consider Cartesian product but star schema
considering these Cartesian products. Can sometimes
improve query.
 Alternatively it is a Cartesian product of all dimension
tables is first generated.
 The key cost is the need to generate the Cartesian
product of the dimension tables as long as the cost of
generating the Cartesian product is less than the cost
of generating intermediate results with the fact table.
Star Join and Star Index
A STARjoin is a high-speed, single-pass, parallelizable multitable join and is
introduced by Red Brick’s RDBMS.
 STARindexes to accelerate the join performance
 STARindexes are created in one or more foreign key columns of a fact
table.
 Traditional multicolumn references a single table where as the
STARindex can reference multiple tables
 With multicolumn indexes, if a query’s WHERE Clause does
not contain on all the columns in the composite index, the
index cannot be fully used unless the specified columns are a
leading subset.
 The STARjoin using STARindex could efficiently join
the dimension tables to the fact table without penalty
of generating the full Cartesian product.
 The STARjoin algorithm is able to generate a
Cartesian product in regions where these are rows of
interest and bypass generating Cartesian products over
region where these are no rows.
 10 to 20 times faster than traditional pairwise join
techniques
Bit mapped Indexing
Data is loaded into SYBASE IQ, it converts all data into
a series of bitmaps; which are them highly
compressed and stored in disk.
 SYBASE IQ indexes do not point to data stored
elsewhere all data is contained in the index structure.
Data Cardinality.
 Bitmap indexes are used to queries against low-cardinality data-that is data
in which the total number of potential values is relatively low.
 For example, state code data cardinality is 50 and gender cardinality is only
2(male and female). 
 For low cardinality data, each distinct value has its own bitmap index
consisting of a bit for every row in the table.
 The bit map index representation is a 10000 bit long vector which has its
bits turned ON (value of 1) for every record that satisfies “gender=”M”
condition.
 Bitmap indexes can become cumbersome and even unsuitable for high
cardinality data where the range of potential value is high.
 SYBASE IQ high cardinality index starts at 1000 distinct values.
Index Types.
The SYBASE IQ provides five index techniques. One is
a default index called the Fast projection index and the other is
either a low-or high-cardinality index.
Performance.
SYBASE IQ technology achieves very good performance
in ad hoc queries for several reasons.
 Bitwise Technology. This allows rapid response to queries
containing various data type, supports data aggregation and
grouping.
 Compression. SYBASE IQ uses sophisticated algorithm to
compress data into bitmapping SYBASE IQ can hold more
data in memory minimizing expensive I/O operations.
 Optimized memory-based processing: SYBASE IQ caches data columns in memory
according to the nature of user’s queries.
 Columnwise processing: SYBASE IQ scans columns not rows. For the low selectivity
queries (those that select only a few attributes from a multi attribute row) the technique of
scanning by columns drastically reduces the amount of data the engine has to search.
 Low Overhead: As an engine optimized for decision support, SYBASE IQ does not carry an
overhead associated with traditional OLTP designed RDBMS performance.
 Large Block I/O: Block size high in SYBASE IQ can be tuned from 512 bytes to 64 Kbytes,
so that the system can read as much information as necessary in a single I/O.
 Operating-system-level parallelism: SYBASE IQ breaks low-level operations like sorts,
bitmap manipulation, load and I/O into non blocking operation’s that the operating systems can
schedule independently and in parallel.
 Prejoin and ad hoc join Capabilities: SYBASE IQ allows users to take advantage of know
join relationships between tables by defining them in advance and building indexes between
tables.
Shortcoming of Indexing.
Some of the tradeoffs of the SYBASE IQ are as follows
 No Updates. SYBASE IQ does not support updates, and therefore is
unsuitable
 Lack of core RDBMS features. SYBASE IQ does not support all the
backup and recovery and also does not support stored procedures, data
integrity checker, data replication, complex data types.
 Less advantage for planned queries. SYBASE IQ advantages are
most obvious when running ad hoc queries.
 High memory Usage. SYBASE IQ takes advantage of available system
memory to avoid expensive I/O operations.
Column Local Storage
Performance in the data warehouse environment can be achieved
by storing data in memory in column wise instead to store one row at a
time and each row can be viewed and accessed as single record.
Emp-id Emp-Name Dept Salary
1004
1005
1006
Suresh
Mani
Sara
CSE
MECH
CIVIL
15000
25000
23000
1004 Suresh CSE
15000
1005 Mani
MECH 25000
1006 Sara CIVIL
23000
1004 1005 1006
Suresh Mani Sara
CSE MECH
CIVIL
15000 25000
23000

Contenu connexe

Tendances

Major issues in data mining
Major issues in data miningMajor issues in data mining
Major issues in data miningSlideshare
 
Spatial data mining
Spatial data miningSpatial data mining
Spatial data miningMITS Gwalior
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingankur bhalla
 
Data Integration and Transformation in Data mining
Data Integration and Transformation in Data miningData Integration and Transformation in Data mining
Data Integration and Transformation in Data miningkavitha muneeshwaran
 
Data warehouse architecture
Data warehouse architecture Data warehouse architecture
Data warehouse architecture janani thirupathi
 
Data Reduction
Data ReductionData Reduction
Data ReductionRajan Shah
 
3.5 model based clustering
3.5 model based clustering3.5 model based clustering
3.5 model based clusteringKrish_ver2
 
1.2 steps and functionalities
1.2 steps and functionalities1.2 steps and functionalities
1.2 steps and functionalitiesKrish_ver2
 
Data Preprocessing || Data Mining
Data Preprocessing || Data MiningData Preprocessing || Data Mining
Data Preprocessing || Data MiningIffat Firozy
 
Object Oriented Database Management System
Object Oriented Database Management SystemObject Oriented Database Management System
Object Oriented Database Management SystemAjay Jha
 
Data Mining: Data cube computation and data generalization
Data Mining: Data cube computation and data generalizationData Mining: Data cube computation and data generalization
Data Mining: Data cube computation and data generalizationDataminingTools Inc
 
05 Clustering in Data Mining
05 Clustering in Data Mining05 Clustering in Data Mining
05 Clustering in Data MiningValerii Klymchuk
 
Perception in artificial intelligence
Perception in artificial intelligencePerception in artificial intelligence
Perception in artificial intelligenceMinakshi Atre
 

Tendances (20)

Major issues in data mining
Major issues in data miningMajor issues in data mining
Major issues in data mining
 
Spatial data mining
Spatial data miningSpatial data mining
Spatial data mining
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data Integration and Transformation in Data mining
Data Integration and Transformation in Data miningData Integration and Transformation in Data mining
Data Integration and Transformation in Data mining
 
Data warehouse architecture
Data warehouse architecture Data warehouse architecture
Data warehouse architecture
 
Data Reduction
Data ReductionData Reduction
Data Reduction
 
Spatial Database
Spatial DatabaseSpatial Database
Spatial Database
 
3.5 model based clustering
3.5 model based clustering3.5 model based clustering
3.5 model based clustering
 
Data clustring
Data clustring Data clustring
Data clustring
 
Clustering in Data Mining
Clustering in Data MiningClustering in Data Mining
Clustering in Data Mining
 
1.2 steps and functionalities
1.2 steps and functionalities1.2 steps and functionalities
1.2 steps and functionalities
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data Preprocessing || Data Mining
Data Preprocessing || Data MiningData Preprocessing || Data Mining
Data Preprocessing || Data Mining
 
Object Oriented Database Management System
Object Oriented Database Management SystemObject Oriented Database Management System
Object Oriented Database Management System
 
Data Mining: Data cube computation and data generalization
Data Mining: Data cube computation and data generalizationData Mining: Data cube computation and data generalization
Data Mining: Data cube computation and data generalization
 
05 Clustering in Data Mining
05 Clustering in Data Mining05 Clustering in Data Mining
05 Clustering in Data Mining
 
Data Mining
Data MiningData Mining
Data Mining
 
Perception in artificial intelligence
Perception in artificial intelligencePerception in artificial intelligence
Perception in artificial intelligence
 
Clusters techniques
Clusters techniquesClusters techniques
Clusters techniques
 

Similaire à Dbms schemas for decision support

Designing high performance datawarehouse
Designing high performance datawarehouseDesigning high performance datawarehouse
Designing high performance datawarehouseUday Kothari
 
A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...
A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...
A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...inscit2006
 
When & Why\'s of Denormalization
When & Why\'s of DenormalizationWhen & Why\'s of Denormalization
When & Why\'s of DenormalizationAliya Saldanha
 
Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09guest9d79e073
 
Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Mark Ginnebaugh
 
Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008paulguerin
 
SPL_ALL_EN.pptx
SPL_ALL_EN.pptxSPL_ALL_EN.pptx
SPL_ALL_EN.pptx政宏 张
 
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...IEEEMEMTECHSTUDENTSPROJECTS
 
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...IEEEMEMTECHSTUDENTPROJECTS
 
Dimensional modelling-mod-3
Dimensional modelling-mod-3Dimensional modelling-mod-3
Dimensional modelling-mod-3Malik Alig
 
Multidimensional Database Design & Architecture
Multidimensional Database Design & ArchitectureMultidimensional Database Design & Architecture
Multidimensional Database Design & Architecturehasanshan
 
Whitepaper Performance Tuning using Upsert and SCD (Task Factory)
Whitepaper  Performance Tuning using Upsert and SCD (Task Factory)Whitepaper  Performance Tuning using Upsert and SCD (Task Factory)
Whitepaper Performance Tuning using Upsert and SCD (Task Factory)MILL5
 
Performance By Design
Performance By DesignPerformance By Design
Performance By DesignGuy Harrison
 
World2016_T5_S7_TeradataFunctionalOverview
World2016_T5_S7_TeradataFunctionalOverviewWorld2016_T5_S7_TeradataFunctionalOverview
World2016_T5_S7_TeradataFunctionalOverviewFarah Omer
 
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...Datavail
 

Similaire à Dbms schemas for decision support (20)

Designing high performance datawarehouse
Designing high performance datawarehouseDesigning high performance datawarehouse
Designing high performance datawarehouse
 
A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...
A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...
A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...
 
When & Why\'s of Denormalization
When & Why\'s of DenormalizationWhen & Why\'s of Denormalization
When & Why\'s of Denormalization
 
Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09
 
Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09
 
MULTIMEDIA MODELING
MULTIMEDIA MODELINGMULTIMEDIA MODELING
MULTIMEDIA MODELING
 
Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008
 
Exploring T-SQL Anti-Patterns
Exploring T-SQL Anti-Patterns Exploring T-SQL Anti-Patterns
Exploring T-SQL Anti-Patterns
 
SPL_ALL_EN.pptx
SPL_ALL_EN.pptxSPL_ALL_EN.pptx
SPL_ALL_EN.pptx
 
CS636-olap.ppt
CS636-olap.pptCS636-olap.ppt
CS636-olap.ppt
 
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
 
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
 
Dimensional modelling-mod-3
Dimensional modelling-mod-3Dimensional modelling-mod-3
Dimensional modelling-mod-3
 
Multidimensional Database Design & Architecture
Multidimensional Database Design & ArchitectureMultidimensional Database Design & Architecture
Multidimensional Database Design & Architecture
 
Database aggregation using metadata
Database aggregation using metadataDatabase aggregation using metadata
Database aggregation using metadata
 
Whitepaper Performance Tuning using Upsert and SCD (Task Factory)
Whitepaper  Performance Tuning using Upsert and SCD (Task Factory)Whitepaper  Performance Tuning using Upsert and SCD (Task Factory)
Whitepaper Performance Tuning using Upsert and SCD (Task Factory)
 
Mr bi
Mr biMr bi
Mr bi
 
Performance By Design
Performance By DesignPerformance By Design
Performance By Design
 
World2016_T5_S7_TeradataFunctionalOverview
World2016_T5_S7_TeradataFunctionalOverviewWorld2016_T5_S7_TeradataFunctionalOverview
World2016_T5_S7_TeradataFunctionalOverview
 
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
 

Dernier

Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...KokoStevan
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterMateoGardella
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Shubhangi Sonawane
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17Celine George
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxVishalSingh1417
 

Dernier (20)

Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 

Dbms schemas for decision support

  • 2.  Traditional OLTP not designed for DWH  DWH projects  data model, database schema  Data model  Analysis but poor performance  Model schema  The schema methodology that is gaining wide spread acceptance for dwh is Star Schema.
  • 3. Data Layout for Best Access  All industries have developed considerable expertise in implementing efficient operational systems such as payroll, inventory tracking, and purchasing.  The original objectives in developing an abstract model known as the relational model
  • 4.  The data warehouse RDBMS typically needs to process queries that are large complex, ad hoc, and data intensive. so data warehouse RDBMS are very different ,it use a database schema for maximizing concurrency and optimizing insert, update, and delete performance.  For solving modern business problems such as market analysis and financial forecasting requires query- centric database schemas that are array oriented and multidimensional in nature.
  • 5. Multi dimensional data model  How much revenue did the new product generate?  How much revenue did the new product generate by month, in the north eastern division, broken down by user demographic, by sales office, relative to the previous version of the product, compare with the plan?  The major database technology for DWH is RDBMS.
  • 6. Star Schema  The multi dimensional view of data that is expressed using relational database semantics is provided by the database schema design called star schema. Two groups: facts and dimensions Facts are the core data element being analyzed e.g.. items sold dimensions are attributes about the facts e.g. date of purchase
  • 7. DBA View point  A star schema is a relational schema organized around a central table joined to few smaller tables (dimension tables)using foreign key references.  The facts are typically additive and are accessed via dimensions.  Dimension tables are heavily indexed  A star schema created for every industry
  • 8. Potential performance problem with star schemas  Indexing  Level Indicators  Star schema design  Star schema join problem
  • 9. Indexing  It improve the performance in the star schema design.  The table in star schema design contain the entire hierarchy of attributes(PERIOD dimension this hierarchy could be day->week->month->quarter- >year),one approach is to create multi part key of day, week, month ,quarter ,year .it presents some problems in the star schema model because it should be in normalized
  • 10. problem  it require multiple metadata definitions  Since the fact table must carry all key components as part of its primary key, addition or deletion of levels in the physical modification of the affected table.  Carrying all the segments of the compound dimensional key in the fact table increases the size of the index, thus impacting both performance and scalability.
  • 11. Level Indicator  Another potential problem with the star schema design is that in order to navigate the dimensions successfully.  The dimensional table design includes a level of hierarchy indicator for every record.  Every query that is retrieving detail records from a table that stores details & aggregates must use this indicator as an additional constraint to obtain a correct result.
  • 12. Pairwise problem:  The traditional OLAP RDBMS engines are not designed for the set of complex queries that can be issued against a star schema.  We need to retrieve related information from several tables in a single query has a limitation  Many OLTP RDBMS can join only two tables, so the complex query must be in the RDBMS needs to break the query into series of pairwise joins. And also provide the intermediate result. this process will be do up to end of the result  Thus intermediate results can be large & very costly to create. It affects the query performance. The number of ways to pairwise join a set of N tables is N!( N factorial) for example.  The query has five table 5!=(6x5x4x3x2x1)=120 combinations.
  • 13. Star schema join problem:  Because the number of pairwise join combinations is often too large to fully evaluate many RDBMS optimizers limit the selection on the basis of a particular criterion. In data warehousing environment this strategy is very inefficient for star schemas.  In a star schema the only table directly related to most other tables in the fact table this means that the fact table is natural candidate for the first pairwise join.  Unfortunately the fact table is typically the very largest table in the query. A pairwise join order generates very large intermediate result set. So it affects query performance.
  • 14. Solution to performance problems:  A common optimization provides some relief for the star schema join problem.  The basic idea of this optimization is to get around the pairwise join strategy of selecting only related tables.  When two tables are joined and no columns ―link‖ the tables every combination of two tables’ rows are produced. In terms of relational algebra this is called a Cartesian product.
  • 15. Contd..  Normally the RDBMS optimizer logic would never consider Cartesian product but star schema considering these Cartesian products. Can sometimes improve query.  Alternatively it is a Cartesian product of all dimension tables is first generated.  The key cost is the need to generate the Cartesian product of the dimension tables as long as the cost of generating the Cartesian product is less than the cost of generating intermediate results with the fact table.
  • 16. Star Join and Star Index A STARjoin is a high-speed, single-pass, parallelizable multitable join and is introduced by Red Brick’s RDBMS.  STARindexes to accelerate the join performance  STARindexes are created in one or more foreign key columns of a fact table.  Traditional multicolumn references a single table where as the STARindex can reference multiple tables  With multicolumn indexes, if a query’s WHERE Clause does not contain on all the columns in the composite index, the index cannot be fully used unless the specified columns are a leading subset.
  • 17.  The STARjoin using STARindex could efficiently join the dimension tables to the fact table without penalty of generating the full Cartesian product.  The STARjoin algorithm is able to generate a Cartesian product in regions where these are rows of interest and bypass generating Cartesian products over region where these are no rows.  10 to 20 times faster than traditional pairwise join techniques
  • 18. Bit mapped Indexing Data is loaded into SYBASE IQ, it converts all data into a series of bitmaps; which are them highly compressed and stored in disk.  SYBASE IQ indexes do not point to data stored elsewhere all data is contained in the index structure.
  • 19. Data Cardinality.  Bitmap indexes are used to queries against low-cardinality data-that is data in which the total number of potential values is relatively low.  For example, state code data cardinality is 50 and gender cardinality is only 2(male and female).  For low cardinality data, each distinct value has its own bitmap index consisting of a bit for every row in the table.  The bit map index representation is a 10000 bit long vector which has its bits turned ON (value of 1) for every record that satisfies “gender=”M” condition.  Bitmap indexes can become cumbersome and even unsuitable for high cardinality data where the range of potential value is high.  SYBASE IQ high cardinality index starts at 1000 distinct values.
  • 20. Index Types. The SYBASE IQ provides five index techniques. One is a default index called the Fast projection index and the other is either a low-or high-cardinality index. Performance. SYBASE IQ technology achieves very good performance in ad hoc queries for several reasons.  Bitwise Technology. This allows rapid response to queries containing various data type, supports data aggregation and grouping.  Compression. SYBASE IQ uses sophisticated algorithm to compress data into bitmapping SYBASE IQ can hold more data in memory minimizing expensive I/O operations.
  • 21.  Optimized memory-based processing: SYBASE IQ caches data columns in memory according to the nature of user’s queries.  Columnwise processing: SYBASE IQ scans columns not rows. For the low selectivity queries (those that select only a few attributes from a multi attribute row) the technique of scanning by columns drastically reduces the amount of data the engine has to search.  Low Overhead: As an engine optimized for decision support, SYBASE IQ does not carry an overhead associated with traditional OLTP designed RDBMS performance.  Large Block I/O: Block size high in SYBASE IQ can be tuned from 512 bytes to 64 Kbytes, so that the system can read as much information as necessary in a single I/O.  Operating-system-level parallelism: SYBASE IQ breaks low-level operations like sorts, bitmap manipulation, load and I/O into non blocking operation’s that the operating systems can schedule independently and in parallel.  Prejoin and ad hoc join Capabilities: SYBASE IQ allows users to take advantage of know join relationships between tables by defining them in advance and building indexes between tables.
  • 22. Shortcoming of Indexing. Some of the tradeoffs of the SYBASE IQ are as follows  No Updates. SYBASE IQ does not support updates, and therefore is unsuitable  Lack of core RDBMS features. SYBASE IQ does not support all the backup and recovery and also does not support stored procedures, data integrity checker, data replication, complex data types.  Less advantage for planned queries. SYBASE IQ advantages are most obvious when running ad hoc queries.  High memory Usage. SYBASE IQ takes advantage of available system memory to avoid expensive I/O operations.
  • 23. Column Local Storage Performance in the data warehouse environment can be achieved by storing data in memory in column wise instead to store one row at a time and each row can be viewed and accessed as single record. Emp-id Emp-Name Dept Salary 1004 1005 1006 Suresh Mani Sara CSE MECH CIVIL 15000 25000 23000 1004 Suresh CSE 15000 1005 Mani MECH 25000 1006 Sara CIVIL 23000 1004 1005 1006 Suresh Mani Sara CSE MECH CIVIL 15000 25000 23000