Data Warehousing – Dimensions | Star and Snowflake Schemas




Eric Matthews - DataWithUs
Defining Some Key Terms
 Dimension
    • Data Element
    • Categorizes each item in a data set
    • Provides Structured Labeling/Tagging
    • Dimensions can consist of hierarchies. For example: Date |
      Month, Quarter, Year
    • Dimension tables contain a primary key that fact tables
      reference through foreign keys.
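
A date hierarchy like the one above can be derived directly from a calendar date. The following is a minimal sketch, not the presenter's method: the function name `date_dimension_row` and the choice of ISO week numbering are assumptions made for illustration.

```python
from datetime import date


def date_dimension_row(d: date) -> dict:
    """Derive the hierarchy levels of a date dimension row from one date."""
    # ISO week numbering is an assumption; the slides do not name a convention.
    _iso_year, iso_week, _iso_weekday = d.isocalendar()
    return {
        "date": d.isoformat(),
        "week": iso_week,
        "month": d.month,
        "quarter": (d.month - 1) // 3 + 1,  # months 1-3 -> 1, 4-6 -> 2, ...
        "year": d.year,
    }


row = date_dimension_row(date(2024, 5, 17))
print(row)
```

Each level (week, month, quarter, year) becomes a column that reports can filter or group on without recomputing it.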
 Dimension – Primary Role
    • Data Filtering
    • Data Grouping
    • Data Labeling

 Fact
    • A measured, counted, or aggregated event. For example:
      Sales, Admissions, Blood Pressure, and Inventory can all be
      construed as “facts”
    • Fact tables contain foreign keys that join to the dimension tables
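
The three dimension roles (filtering, grouping, labeling) can be sketched in a few lines. The data, the `patient_dim` and `admission_facts` structures, and the `charges_by` helper below are all hypothetical and exist only to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical dimension: one row per patient, keyed by patient_id.
patient_dim = {
    1: {"gender": "F", "state": "OH"},
    2: {"gender": "M", "state": "OH"},
    3: {"gender": "F", "state": "MI"},
}

# Hypothetical facts: one row per admission, holding a key into the
# dimension and a numeric measure.
admission_facts = [
    {"patient_id": 1, "charges": 1200.0},
    {"patient_id": 2, "charges": 300.0},
    {"patient_id": 3, "charges": 950.0},
    {"patient_id": 1, "charges": 450.0},
]


def charges_by(attribute, state=None):
    """Total the charges measure, grouped by one dimension attribute."""
    totals = defaultdict(float)
    for fact in admission_facts:
        dim_row = patient_dim[fact["patient_id"]]  # resolve the joining key
        if state is not None and dim_row["state"] != state:
            continue  # filtering on a dimension attribute
        totals[dim_row[attribute]] += fact["charges"]  # grouping and labeling
    return dict(totals)


by_gender = charges_by("gender")
ohio_by_gender = charges_by("gender", state="OH")
print(by_gender)       # {'F': 2600.0, 'M': 300.0}
print(ohio_by_gender)  # {'F': 1650.0, 'M': 300.0}
```

The fact rows carry only keys and numbers; every label in the output comes from the dimension.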
Defining Some Key Terms (continued)
 Conformed Dimension
    • Common set of data structures/attributes
    • Can cut across many facts, but…
    • The row headers in an answer must be able to exactly
      match, or…
    • Can be an exact subset



 These definitions will become clearer as we look at some
 examples.
Star Schema



   • Most basic form of dimensional modeling

   • Consists of dimension table(s) modeled around a fact table

   • Optimized for querying large data sets
Star Schema
   [Diagram, logical view: a central Fact Table (Keys, Facts)
   surrounded by four Dimension Tables: Date/Time, Patient
   Demographics, Referring Physician, and Insurance Carrier.]
Star Schema – Talking Points for Next Diagram
Note: Have original table schema as point of reference.


  • Discuss aggregation from source table to fact table rolling
    up totals (How this needed to be done).
  • Discuss the notion of rolling up fact tables to create other
    fact tables (use account type, financial class, and service
    code columns in the fact table for basis of discussion)
  • Discuss some of the pitfalls of dimension tables by using
    the physician dimension as an example (example:
    Physicians can change jobs)
  • Discuss the Date Dimension from the perspective of the
    data in the table… which transitions us to a key point…

  …which is similar to how one needs to resolve foreign keys in
  reporting: the dimension table is a table form of the same
  concept.

  Additionally, if one has well-defined master data, then populating
  the dimension tables can be done using a columnar subset of the
  source master data table.
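
The talking point about rolling up one fact table into another can be sketched with Python's built-in `sqlite3` module. The column names ACCT_TYPE, FC, TOT_TOTAL_CHARGES, and TOT_TOTAL_PAYMENTS come from the deck's fact table; the table names `acct_fin_rollup` and `acct_type_fc_rollup`, the ACCOUNT_COUNT column, and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Account-level fact table (a subset of the deck's columns).
CREATE TABLE acct_fin_rollup (
    ACCT_NUM           TEXT PRIMARY KEY,
    ACCT_TYPE          TEXT,
    FC                 TEXT,
    TOT_TOTAL_CHARGES  REAL,
    TOT_TOTAL_PAYMENTS REAL
);

-- Hypothetical sample rows.
INSERT INTO acct_fin_rollup VALUES
    ('A1', 'IP', 'MEDICARE', 1000.0, 600.0),
    ('A2', 'IP', 'MEDICARE',  500.0, 500.0),
    ('A3', 'OP', 'SELFPAY',   200.0,  50.0);

-- Coarser fact table: one row per account type and financial class.
CREATE TABLE acct_type_fc_rollup AS
SELECT ACCT_TYPE,
       FC,
       COUNT(*)                AS ACCOUNT_COUNT,
       SUM(TOT_TOTAL_CHARGES)  AS TOT_TOTAL_CHARGES,
       SUM(TOT_TOTAL_PAYMENTS) AS TOT_TOTAL_PAYMENTS
FROM acct_fin_rollup
GROUP BY ACCT_TYPE, FC;
""")

rollup_rows = conn.execute(
    "SELECT * FROM acct_type_fc_rollup ORDER BY ACCT_TYPE, FC"
).fetchall()
print(rollup_rows)
```

The rolled-up table answers summary questions with far fewer rows, at the cost of losing account-level detail.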
Fact Table: Acct Fin Rollup
   ACCT_NUM
   ACCT_PTPTR
   ACCT_GUARANTOR_ID
   ACCT_REFERRING_MD
   ACCT_START_DATE
   ACCT_END_DATE
   PLAN_SEQ1
   ACCT_TYPE
   FC
   HOSPITAL_SERVICE_CODE
   TOT_TOTAL_CHARGES
   TOT_TOTAL_PAYMENTS
   TOT_TOTAL_ADJUSTMENTS
   TOT_BALANCE

Dimension Table: Date
   WEEK
   YEAR
   QUARTER
   MONTH

Dimension Table: Patient
   ACCT_PTPTR
   PATIENT_NAME
   CITY
   STATE
   ZIP

Dimension Table: Insurance Plan/Carrier
   PLAN_SEQ1
   PLAN_NAME
   CARRIER
   CITY
   STATE
   ZIP

Dimension Table: Referring Physician
   ACCT_REFERRING_MD
   PHYSICIAN_NAME
   AFFILIATION
   AFFILIATION_CITY
   AFFILIATION_STATE
   AFFILIATION_ZIP
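
A typical query against a star like this one joins the fact table to each dimension it needs and aggregates the measures. The sketch below uses Python's built-in `sqlite3` module with a subset of the deck's columns; the table names `patient_dim` and `referring_physician_dim`, the key types, and all sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient_dim (
    ACCT_PTPTR   INTEGER PRIMARY KEY,
    PATIENT_NAME TEXT, CITY TEXT, STATE TEXT, ZIP TEXT
);
CREATE TABLE referring_physician_dim (
    ACCT_REFERRING_MD INTEGER PRIMARY KEY,
    PHYSICIAN_NAME    TEXT, AFFILIATION TEXT
);
-- The fact table holds the joining keys plus the numeric measures.
CREATE TABLE acct_fin_rollup (
    ACCT_NUM          TEXT PRIMARY KEY,
    ACCT_PTPTR        INTEGER REFERENCES patient_dim (ACCT_PTPTR),
    ACCT_REFERRING_MD INTEGER
        REFERENCES referring_physician_dim (ACCT_REFERRING_MD),
    TOT_TOTAL_CHARGES REAL,
    TOT_BALANCE       REAL
);

-- Hypothetical sample rows.
INSERT INTO patient_dim VALUES
    (1, 'Patient A', 'Columbus', 'OH', '43004'),
    (2, 'Patient B', 'Detroit',  'MI', '48201');
INSERT INTO referring_physician_dim VALUES
    (10, 'Dr. X', 'General Hospital'),
    (11, 'Dr. Y', 'Lakeside Clinic');
INSERT INTO acct_fin_rollup VALUES
    ('A1', 1, 10, 1000.0, 200.0),
    ('A2', 1, 11,  500.0,   0.0),
    ('A3', 2, 10,  250.0, 250.0),
    ('A4', 1, 10,  300.0, 100.0);
""")

# Dimensions supply the labels and groups; the fact table supplies the sums.
charges_by_state_and_affiliation = conn.execute("""
    SELECT p.STATE, d.AFFILIATION, SUM(f.TOT_TOTAL_CHARGES)
    FROM acct_fin_rollup AS f
    JOIN patient_dim AS p
      ON p.ACCT_PTPTR = f.ACCT_PTPTR
    JOIN referring_physician_dim AS d
      ON d.ACCT_REFERRING_MD = f.ACCT_REFERRING_MD
    GROUP BY p.STATE, d.AFFILIATION
    ORDER BY p.STATE, d.AFFILIATION
""").fetchall()
print(charges_by_state_and_affiliation)
```

Every dimension is one join away from the fact table, which is what keeps star schema queries short.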
Snowflake Schema
    • Think Star Schema where the dimension tables are
      normalized

    • Can be used to segregate rows in dimension tables that
      have a high percentage of null data (for faster lookup, since
      some databases do not index null values)
Snowflake Schema

Fact Table
   product_key
   Units
   Cost Per Unit

Dimension Table: Product Info
   product_key
   supplier_key

Dimension Table: Supplier Info
   supplier_key
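
In a snowflake, reaching an outer dimension takes a chain of joins through the normalized tables. The sketch below mirrors the diagram using Python's built-in `sqlite3` module; the table names, the `product_name` and `supplier_name` columns, and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier_info (
    supplier_key  INTEGER PRIMARY KEY,
    supplier_name TEXT
);
-- The product dimension is normalized: supplier attributes live in
-- their own table and are reached through supplier_key.
CREATE TABLE product_info (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    supplier_key INTEGER REFERENCES supplier_info (supplier_key)
);
CREATE TABLE fact_table (
    product_key   INTEGER REFERENCES product_info (product_key),
    units         INTEGER,
    cost_per_unit REAL
);

-- Hypothetical sample rows.
INSERT INTO supplier_info VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO product_info VALUES
    (100, 'Gauze', 1), (101, 'Syringe', 1), (102, 'Glove', 2);
INSERT INTO fact_table VALUES
    (100, 10, 2.0), (101, 5, 1.0), (102, 20, 0.5), (100, 5, 2.0);
""")

# Two joins are needed to get from the fact table to the supplier.
cost_by_supplier = conn.execute("""
    SELECT s.supplier_name, SUM(f.units * f.cost_per_unit)
    FROM fact_table AS f
    JOIN product_info  AS p ON p.product_key  = f.product_key
    JOIN supplier_info AS s ON s.supplier_key = p.supplier_key
    GROUP BY s.supplier_name
    ORDER BY s.supplier_name
""").fetchall()
print(cost_by_supplier)  # [('Acme', 35.0), ('Globex', 10.0)]
```

The extra join is the trade-off for storing each supplier's attributes only once.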
Conformed Dimension
  A conformed dimension is a set of data attributes that has been
  physically implemented in multiple tables using the same structure. A
  conformed dimension can be applied to different fact tables. For
  example:

  [Diagram: one Dimension Table, Patient Demographics (Gender, Age),
  shared by three Fact Tables: Hypertension Studies, Lab Results, and
  Diabetes Assessment.]

  Note: The classic example for a conformed dimension is date. I wanted
  to offer a different example.
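
Because the same dimension serves several fact tables, results computed separately from each fact can be lined up on matching row headers. The sketch below shows this with Python's built-in `sqlite3` module; the table names, the `systolic` and `glucose` measures, the `average_by_gender` helper, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One shared (conformed) dimension.
CREATE TABLE patient_demographics_dim (
    patient_key INTEGER PRIMARY KEY, gender TEXT, age INTEGER
);
-- Two fact tables that both reference it.
CREATE TABLE hypertension_fact (patient_key INTEGER, systolic INTEGER);
CREATE TABLE lab_results_fact  (patient_key INTEGER, glucose  INTEGER);

-- Hypothetical sample rows.
INSERT INTO patient_demographics_dim VALUES
    (1, 'F', 54), (2, 'M', 61), (3, 'F', 47);
INSERT INTO hypertension_fact VALUES (1, 140), (2, 150), (3, 130);
INSERT INTO lab_results_fact  VALUES (1, 100), (1, 110), (2, 90);
""")


def average_by_gender(fact_table, measure):
    """Average one measure per gender for the given fact table."""
    # Table and column names are fixed literals in this sketch, never
    # user input, so building the SQL string here is safe.
    sql = (
        f"SELECT d.gender, AVG(f.{measure}) "
        f"FROM {fact_table} AS f "
        "JOIN patient_demographics_dim AS d "
        "  ON d.patient_key = f.patient_key "
        "GROUP BY d.gender"
    )
    return dict(conn.execute(sql).fetchall())


avg_systolic = average_by_gender("hypertension_fact", "systolic")
avg_glucose = average_by_gender("lab_results_fact", "glucose")

# Merge the two answers on their shared row header (gender).
drill_across = {
    gender: (avg_systolic.get(gender), avg_glucose.get(gender))
    for gender in sorted(set(avg_systolic) | set(avg_glucose))
}
print(drill_across)  # {'F': (135.0, 105.0), 'M': (150.0, 90.0)}
```

The merge only works because both results use exactly the same gender values, which is the "row headers must match" requirement stated earlier.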
Transition to Next Point of Discussion

  Star and Snowflake schemas are optimized for
  querying large data sets.

  They should support:
      • OLAP cubes
      • Business Intelligence and Analytic Applications
      • Ad hoc queries
The End
