Data warehousing logical design

        Mirjana Mazuran
      mazuran@elet.polimi.it



        December 15, 2009




Outline




      Data Warehouse logical design
          ROLAP model
               star schema
               snowflake schema
      Exercise 1: wine company
      Exercise 2: real estate agency




Introduction
Logical design




          Starting from the conceptual design, it is necessary to determine
          the logical schema of the data
          We use ROLAP (Relational On-Line Analytical Processing)
          model to represent multidimensional data
          ROLAP uses the relational data model, which means that
          data is stored in relations
          Given the DFM representation of multidimensional data, two
          schemas are used:
                 star schema
                 snowflake schema




ROLAP
Star schema


         Each dimension is represented by a relation such that:
              the primary key of the relation is the primary key of the
              dimension
              the attributes of the relation describe all aggregation levels of
              the dimension
         A fact is represented by a relation such that:
              the primary key of the relation is the set of primary keys
              imported from all the dimension tables
              the attributes of the relation are the measures of the fact

    Pros and Cons
         few joins are needed during query execution
         dimension tables are denormalized
         denormalization introduces redundancy

ROLAP
Star schema: example


         DFM fact schema:
                FACT Accident
            MEASURES NrOfAccidents, Cost
          DIMENSIONS Time (Day, Month, Year),
                     Policy (IdPolicy, Class, Amount, MaxAmount),
                     Customer (IDCustomer, BYear, Sex, City, Region),
                     Motivation

         Star schema relations:
                Time (IdTime, Day, Month, Year)
              Policy (IdPolicy, Class, Amount, MaxAmount)
            Customer (IdCustomer, Birthday, Gender, City, Region)
          Motivation (IdMotivation, Motivation)
            Accident (IdTime, IdPolicy, IdCustomer, IdMotivation,
                      NrOfAccidents, Cost)
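
    As a rough illustration of the star schema above, the following sketch
    creates the five relations in an in-memory SQLite database and totals
    accidents and cost per region and year. The column types, the sample
    rows and the use of Python's sqlite3 module are assumptions made for the
    example; the slides only give the relation and attribute names. Because
    City and Region are stored directly in Customer (denormalized), the
    query needs only one join per dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Star schema: one denormalized table per dimension, one fact table whose
# primary key is the set of keys imported from all dimension tables.
# Column types are assumed; the slides only list attribute names.
conn.executescript("""
CREATE TABLE Time (IdTime INTEGER PRIMARY KEY, Day INTEGER, Month INTEGER, Year INTEGER);
CREATE TABLE Policy (IdPolicy INTEGER PRIMARY KEY, Class TEXT, Amount REAL, MaxAmount REAL);
CREATE TABLE Customer (IdCustomer INTEGER PRIMARY KEY, Birthday TEXT, Gender TEXT,
                       City TEXT, Region TEXT);
CREATE TABLE Motivation (IdMotivation INTEGER PRIMARY KEY, Motivation TEXT);
CREATE TABLE Accident (
    IdTime INTEGER REFERENCES Time(IdTime),
    IdPolicy INTEGER REFERENCES Policy(IdPolicy),
    IdCustomer INTEGER REFERENCES Customer(IdCustomer),
    IdMotivation INTEGER REFERENCES Motivation(IdMotivation),
    NrOfAccidents INTEGER,
    Cost REAL,
    PRIMARY KEY (IdTime, IdPolicy, IdCustomer, IdMotivation)
);
""")

# Hypothetical sample data. Region is repeated for every customer of the
# same region: this is the redundancy introduced by denormalization.
conn.executemany("INSERT INTO Time VALUES (?, ?, ?, ?)",
                 [(1, 15, 12, 2009), (2, 3, 1, 2010)])
conn.execute("INSERT INTO Policy VALUES (1, 'A', 100.0, 1000.0)")
conn.execute("INSERT INTO Motivation VALUES (1, 'Collision')")
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?, ?, ?)", [
    (1, "1980-01-01", "F", "Milano", "Lombardia"),
    (2, "1975-06-30", "M", "Bergamo", "Lombardia"),
    (3, "1990-03-12", "F", "Roma", "Lazio"),
])
conn.executemany("INSERT INTO Accident VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 1, 1, 1, 1, 500.0),
    (1, 1, 2, 1, 2, 300.0),
    (2, 1, 3, 1, 1, 200.0),
    (2, 1, 1, 1, 1, 100.0),
])

# Aggregate by Region and Year: one join per dimension involved.
rows = conn.execute("""
    SELECT C.Region, T.Year, SUM(A.NrOfAccidents), SUM(A.Cost)
    FROM Accident A, Time T, Customer C
    WHERE A.IdTime = T.IdTime AND A.IdCustomer = C.IdCustomer
    GROUP BY C.Region, T.Year
    ORDER BY C.Region, T.Year
""").fetchall()
print(rows)
```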


ROLAP
Snowflake schema
        Each (primary) dimension is represented by a relation:
             the primary key of the relation is the primary key of the
             dimension
             the attributes of the relation depend directly on the primary key
             a set of foreign keys is used to access information at different
             levels of aggregation. Such information is part of the
             secondary dimensions and is stored in dedicated relations
        A fact is represented by a relation such that:
             the primary key of the relation is the set of primary keys
             imported from all and only the primary dimension tables
             the attributes of the relation are the measures of the fact

    Pros and Cons
        denormalization is reduced
        less storage space is required
        many joins may be required when queries involve attributes in
        secondary dimension tables
ROLAP
Snowflake schema: example



         DFM fact schema:
                FACT Sale
            MEASURES Quantity, Income, Discount
          DIMENSIONS Time (Day, Month, Year),
                     Furniture (IdFurniture, Type, Category, Material),
                     Customer (IDCustomer, BYear, Sex, City, Region, State)

         Snowflake schema relations:
                Time (IdTime, Day, Month, Year)
           Furniture (IdFurniture, Material, Type, Category)
            Customer (IdCustomer, IdPlace, BirthYear, Sex)
               Place (IdPlace, City, Region, State)
                Sale (IdTime, IdFurniture, IdCustomer,
                      Quantity, Income, Discount)
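
    The extra join that the snowflake schema requires can be seen in a small
    sketch. It builds the relations above in an in-memory SQLite database
    and computes the income per state. Column types and sample rows are
    assumptions made for the example. Since State lives in the secondary
    dimension table Place, the query must go from Sale through Customer to
    Place, whereas a star schema would reach it with a single join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Snowflake schema: the geographic attributes of Customer are moved to the
# secondary dimension table Place and reached through the foreign key IdPlace.
# Column types are assumed; the slides only list attribute names.
conn.executescript("""
CREATE TABLE Time (IdTime INTEGER PRIMARY KEY, Day INTEGER, Month INTEGER, Year INTEGER);
CREATE TABLE Furniture (IdFurniture INTEGER PRIMARY KEY, Material TEXT, Type TEXT, Category TEXT);
CREATE TABLE Place (IdPlace INTEGER PRIMARY KEY, City TEXT, Region TEXT, State TEXT);
CREATE TABLE Customer (IdCustomer INTEGER PRIMARY KEY,
                       IdPlace INTEGER REFERENCES Place(IdPlace),
                       BirthYear INTEGER, Sex TEXT);
CREATE TABLE Sale (
    IdTime INTEGER REFERENCES Time(IdTime),
    IdFurniture INTEGER REFERENCES Furniture(IdFurniture),
    IdCustomer INTEGER REFERENCES Customer(IdCustomer),
    Quantity INTEGER, Income REAL, Discount REAL,
    PRIMARY KEY (IdTime, IdFurniture, IdCustomer)
);
""")

# Hypothetical sample data: each place is stored once, however many
# customers live there.
conn.execute("INSERT INTO Time VALUES (1, 15, 12, 2009)")
conn.execute("INSERT INTO Furniture VALUES (1, 'Wood', 'Chair', 'Seating')")
conn.executemany("INSERT INTO Place VALUES (?, ?, ?, ?)", [
    (1, "Milano", "Lombardia", "Italy"),
    (2, "Lyon", "Rhone-Alpes", "France"),
])
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?, ?)", [
    (1, 1, 1980, "F"), (2, 1, 1975, "M"), (3, 2, 1990, "F"),
])
conn.executemany("INSERT INTO Sale VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 1, 1, 2, 100.0, 0.0),
    (1, 1, 2, 1, 50.0, 5.0),
    (1, 1, 3, 4, 200.0, 0.0),
])

# Income per State: Sale -> Customer -> Place, i.e. two joins for one dimension.
income_by_state = conn.execute("""
    SELECT P.State, SUM(S.Income)
    FROM Sale S, Customer C, Place P
    WHERE S.IdCustomer = C.IdCustomer AND C.IdPlace = P.IdPlace
    GROUP BY P.State
    ORDER BY P.State
""").fetchall()
print(income_by_state)
```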



Exercise 1
Wine company

    An online wine retailer needs a data warehouse to record the
    quantity and sales of the wines it sells to its customers. Part of
    the original database is composed of the following tables:
    CUSTOMER (Code, Name, Address, Phone, BDay, Gender)
          WINE (Code, Name, Type, Vintage, BottlePrice, CasePrice,
               Class)
         CLASS (Code, Name, Region)
          TIME (TimeStamp, Date, Year)
        ORDER (Customer, Wine, Time, nrBottles, nrCases)
    Note that the tables represent the main entities of the ER schema,
    thus it is necessary to derive the significant relationships among
    them in order to correctly design the data warehouse.

Exercise 1: A possible solution
Snowflake schema

          FACT Sales
    MEASURES Quantity, Cost
    DIMENSIONS Customer, Area, Time, Wine → Class

          Customer (CustomerCode, Birthday, Gender)
              Wine (WineCode, ClassCode, Type, Vintage)
             Class (ClassCode, Region)
              Time (TimeCode, Date, Year)
        SalesTable (WineCode, CustomerCode, OrderTimeCode,
                    Quantity, Cost)




Exercise 2
Real estate agency
    Let us consider the case of a real estate agency whose database is
    composed of the following tables:
        OWNER (IDOwner, Name, Surname, Address, City, Phone)
        ESTATE (IDEstate, IDOwner, Category, Area, City, Province,
                  Rooms, Bedrooms, Garage, Meters)
    CUSTOMER (IDCust, Name, Surname, Budget, Address, City,
                  Phone)
         AGENT (IDAgent, Name, Surname, Office, Address, City,
                  Phone)
       AGENDA (IDAgent, Data, Hour, IDEstate, ClientName)
           VISIT (IDEstate,IDAgent, IDCust, Date, Duration)
           SALE (IDEstate,IDAgent, IDCust, Date, AgreedPrice,
                  Status)
          RENT (IDEstate,IDAgent, IDCust, Date, Price, Status,
                  Time)
Exercise 2
Real estate agency
          Goal:
               Provide a supervisor with an overview of the situation. The
               supervisor must have a global view of the business, in terms of
               the estates the agency deals with and of the agents’ work.
          Questions:
            1. Design a conceptual schema for the DW.
            2. What facts and dimensions do you consider?
            3. Design a Star Schema or Snowflake Schema for the DW.
          Write the following SQL queries:
               How many customers have visited properties of at least 3
               different categories?
               What is the average duration of visits per property category?
               Who has paid the highest price among the customers that
               have viewed properties of at least 3 different categories?
               Who has bought a flat for the highest price w.r.t. each month?
               What kind of property sold for the highest price w.r.t each city
               and month?
Exercise 2: A possible solution
Facts and dimensions
    Points 1 and 2 are left as homework. In particular it is required to
    discuss the facts of interest (with respect to your point of view)
    and then define, for each fact, the attribute tree, dimensions and
    measures with the corresponding glossary.
    The following ideas will be used during the solution of the exercise:
         supervisors should be able to control the sales of the agency
                FACT Sales
         MEASURES OfferPrice, AgreedPrice, Status
         DIMENSIONS EstateID, OwnerID, CustomerID, AgentID,
                       TimeID
         supervisors should be able to control the work of the agents by
         analyzing the visits to the estates, which the agents are in
         charge of
                FACT Viewing
         MEASURES Duration
         DIMENSIONS EstateID, CustomerID, AgentID, TimeID
Exercise 2: A possible solution
Star schema
             Estate (EstateID, Category, Area, City, Province, Rooms,
                     Bedrooms, Garage, Meters, Sheet, Map)
              Owner (OwnerID, Name, Surname, Address, City, Phone)
           Customer (CustomerID, Name, Surname, Budget, Address, City,
                     Phone)
              Agent (AgentID, Name, Surname, Office, Address, City,
                     Phone)
               Time (TimeID, Day, Month, Year)
         SalesTable (EstateID, OwnerID, CustomerID, AgentID, TimeID,
                     OfferPrice, AgreedPrice, Status)
       ViewingTable (EstateID, CustomerID, AgentID, TimeID, Duration)
Exercise 2: A possible solution
Snowflake schema
             Estate (EstateID, PlaceID, Category, Rooms, Bedrooms,
                     Garage, Meters, Sheet, Map)
              Owner (OwnerID, PlaceID, Name, Surname, Address, Phone)
           Customer (CustomerID, PlaceID, Name, Surname, Budget,
                     Address, Phone)
              Agent (AgentID, PlaceID, Name, Surname, Office, Address,
                     Phone)
              Place (PlaceID, City, Province, Area)
               Time (TimeID, Day, Month, Year)
         SalesTable (EstateID, TimeID, CustomerID, OwnerID, AgentID,
                     OfferPrice, AgreedPrice, Status)
       ViewingTable (TimeID, EstateID, CustomerID, AgentID, Duration)

Exercise 2: A possible solution
SQL queries wrt the star schema



          How many customers have visited properties of at least 3
          different categories?

          CREATE VIEW Cust3Cat AS
             SELECT V.CustomerID
             FROM ViewingTable V, Estate E
             WHERE V.EstateID = E.EstateID
             GROUP BY V.CustomerID
             HAVING COUNT(DISTINCT E.Category) >=3

          SELECT COUNT(*)
          FROM Cust3Cat
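
          To see how the view behaves, the following sketch runs it on a
          few made-up rows in an in-memory SQLite database. Only the
          columns the query touches are populated, and all data values
          are hypothetical. Customer 2 makes three visits but sees only
          two distinct categories, so COUNT(DISTINCT ...) correctly
          leaves that customer out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Estate is trimmed to the two columns the query uses (an assumption made
# to keep the example short); ViewingTable follows the star schema.
conn.executescript("""
CREATE TABLE Estate (EstateID INTEGER PRIMARY KEY, Category TEXT);
CREATE TABLE ViewingTable (EstateID INTEGER, CustomerID INTEGER,
                           AgentID INTEGER, TimeID INTEGER, Duration INTEGER);
CREATE VIEW Cust3Cat AS
    SELECT V.CustomerID
    FROM ViewingTable V, Estate E
    WHERE V.EstateID = E.EstateID
    GROUP BY V.CustomerID
    HAVING COUNT(DISTINCT E.Category) >= 3;
""")

conn.executemany("INSERT INTO Estate VALUES (?, ?)", [
    (1, "flat"), (2, "villa"), (3, "office"), (4, "garage"), (5, "flat"),
])
conn.executemany("INSERT INTO ViewingTable VALUES (?, ?, ?, ?, ?)", [
    # customer 1: flat, villa, office -> 3 categories
    (1, 1, 1, 1, 30), (2, 1, 1, 2, 45), (3, 1, 1, 3, 20),
    # customer 2: flat, flat, villa -> 3 visits but only 2 categories
    (1, 2, 1, 4, 30), (5, 2, 1, 5, 25), (2, 2, 1, 6, 60),
    # customer 3: flat, villa, office, garage -> 4 categories
    (1, 3, 1, 7, 10), (2, 3, 1, 8, 15), (3, 3, 1, 9, 20), (4, 3, 1, 10, 5),
])

members = [row[0] for row in
           conn.execute("SELECT CustomerID FROM Cust3Cat ORDER BY CustomerID")]
count = conn.execute("SELECT COUNT(*) FROM Cust3Cat").fetchone()[0]
print(members, count)
```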



Exercise 2: A possible solution
SQL queries wrt the star schema
          Who has paid the highest price among the customers that
          have viewed properties of at least 3 different categories?
          SELECT C.CustomerID
          FROM Cust3Cat C, SalesTable S
          WHERE C.CustomerID = S.CustomerID AND S.AgreedPrice IN
          (SELECT MAX(S1.AgreedPrice)
                    FROM Cust3Cat C1, SalesTable S1
                    WHERE C1.CustomerID = S1.CustomerID)

          Average duration of visits per property category
          SELECT E.Category, AVG(V.Duration)
          FROM ViewingTable V, Estate E
          WHERE V.EstateID = E.EstateID
          GROUP BY E.Category
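
          Both queries on this slide can be tried on toy data with the
          sketch below (in-memory SQLite, hypothetical rows, tables
          trimmed to the columns used, join column CustomerID as in the
          star schema). Customer 2 pays the highest price overall but
          has viewed a single category, so the answer to the first
          question is customer 3: the maximum is taken only over sales
          to customers in Cust3Cat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Tables are reduced to the columns the two queries need (an assumption
# for brevity); names follow the star schema of the exercise.
conn.executescript("""
CREATE TABLE Estate (EstateID INTEGER PRIMARY KEY, Category TEXT);
CREATE TABLE ViewingTable (EstateID INTEGER, CustomerID INTEGER,
                           AgentID INTEGER, TimeID INTEGER, Duration INTEGER);
CREATE TABLE SalesTable (EstateID INTEGER, CustomerID INTEGER, AgreedPrice REAL);
CREATE VIEW Cust3Cat AS
    SELECT V.CustomerID
    FROM ViewingTable V, Estate E
    WHERE V.EstateID = E.EstateID
    GROUP BY V.CustomerID
    HAVING COUNT(DISTINCT E.Category) >= 3;
""")

conn.executemany("INSERT INTO Estate VALUES (?, ?)", [
    (1, "flat"), (2, "villa"), (3, "office"),
    (4, "flat"), (5, "villa"), (6, "office"),
])
conn.executemany("INSERT INTO ViewingTable VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 1, 1, 30), (2, 1, 1, 2, 60), (3, 1, 1, 3, 20),   # customer 1: 3 categories
    (1, 2, 1, 4, 10),                                       # customer 2: 1 category
    (1, 3, 1, 5, 20), (2, 3, 1, 6, 40), (3, 3, 1, 7, 40),   # customer 3: 3 categories
])
conn.executemany("INSERT INTO SalesTable VALUES (?, ?, ?)", [
    (4, 1, 200000.0), (5, 2, 900000.0), (6, 3, 350000.0),
])

# Highest price among customers who viewed at least 3 categories.
top_buyer = conn.execute("""
    SELECT C.CustomerID, S.AgreedPrice
    FROM Cust3Cat C, SalesTable S
    WHERE C.CustomerID = S.CustomerID AND S.AgreedPrice IN
        (SELECT MAX(S1.AgreedPrice)
         FROM Cust3Cat C1, SalesTable S1
         WHERE C1.CustomerID = S1.CustomerID)
""").fetchall()

# Average duration of visits per property category.
avg_duration = conn.execute("""
    SELECT E.Category, AVG(V.Duration)
    FROM ViewingTable V, Estate E
    WHERE V.EstateID = E.EstateID
    GROUP BY E.Category
    ORDER BY E.Category
""").fetchall()
print(top_buyer, avg_duration)
```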
Exercise 2: A possible solution
SQL queries wrt the star schema


          Who has bought a flat for the highest price w.r.t. each month?

          SELECT S.CustomerID, T.Month, T.Year, S.AgreedPrice
          FROM SalesTable S, Estate E, Time T
          WHERE S.EstateID = E.EstateID AND S.TimeID =
          T.TimeID AND E.Category = 'flat' AND (T.Month,
          T.Year, S.AgreedPrice) IN (
               SELECT T1.Month, T1.Year,
               MAX(S1.AgreedPrice)
               FROM SalesTable S1, Estate E1, Time T1
               WHERE S1.EstateID = E1.EstateID AND
               S1.TimeID = T1.TimeID AND E1.Category =
               'flat'
               GROUP BY T1.Month, T1.Year)
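
          Some SQL engines do not accept a tuple on the left-hand side
          of IN. A possible alternative, sketched here as an assumption
          rather than as part of the original solution, joins the outer
          query with a derived table that holds the maximum price per
          month. The alias M, the column name MaxPrice, the trimmed
          tables and the sample rows are all hypothetical. The villa
          sold in November is more expensive than any flat, but it is
          filtered out by the category condition in both the derived
          table and the outer query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Tables trimmed to the columns the query uses (an assumption for brevity).
conn.executescript("""
CREATE TABLE Estate (EstateID INTEGER PRIMARY KEY, Category TEXT);
CREATE TABLE Time (TimeID INTEGER PRIMARY KEY, Day INTEGER, Month INTEGER, Year INTEGER);
CREATE TABLE SalesTable (EstateID INTEGER, CustomerID INTEGER,
                         TimeID INTEGER, AgreedPrice REAL);
""")
conn.executemany("INSERT INTO Estate VALUES (?, ?)",
                 [(1, "flat"), (2, "flat"), (3, "villa"), (4, "flat")])
conn.executemany("INSERT INTO Time VALUES (?, ?, ?, ?)",
                 [(1, 10, 11, 2009), (2, 20, 11, 2009), (3, 5, 12, 2009)])
conn.executemany("INSERT INTO SalesTable VALUES (?, ?, ?, ?)", [
    (1, 10, 1, 150000.0),   # flat, November 2009
    (2, 11, 2, 180000.0),   # flat, November 2009 (monthly maximum)
    (3, 12, 2, 500000.0),   # villa, November 2009 (ignored: not a flat)
    (4, 13, 3, 120000.0),   # flat, December 2009
])

# Derived table M: highest flat price per (Month, Year).
# The outer query keeps the flat sales whose price equals that maximum.
best_flat_buyers = conn.execute("""
    SELECT S.CustomerID, T.Month, T.Year, S.AgreedPrice
    FROM SalesTable S, Estate E, Time T,
         (SELECT T1.Month AS Month, T1.Year AS Year,
                 MAX(S1.AgreedPrice) AS MaxPrice
          FROM SalesTable S1, Estate E1, Time T1
          WHERE S1.EstateID = E1.EstateID AND S1.TimeID = T1.TimeID
            AND E1.Category = 'flat'
          GROUP BY T1.Month, T1.Year) M
    WHERE S.EstateID = E.EstateID AND S.TimeID = T.TimeID
      AND E.Category = 'flat'
      AND T.Month = M.Month AND T.Year = M.Year
      AND S.AgreedPrice = M.MaxPrice
    ORDER BY T.Year, T.Month
""").fetchall()
print(best_flat_buyers)
```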

Exercise 2: A possible solution
SQL queries wrt the star schema
          What kind of property sold for the highest price w.r.t each city
          and month?
          SELECT E.Category, E.City, T.Month, T.Year,
                 S.AgreedPrice
          FROM SalesTable S, Time T, Estate E
          WHERE S.TimeID = T.TimeID AND E.EstateID =
          S.EstateID AND (S.AgreedPrice, E.City, T.Month,
          T.Year) IN (
                    SELECT MAX(S1.AgreedPrice), E1.City,
                    T1.Month, T1.Year
                    FROM SalesTable S1, Time T1, Estate E1
                    WHERE S1.TimeID = T1.TimeID AND
                    E1.EstateID = S1.EstateID
                    GROUP BY T1.Month, T1.Year, E1.City)


Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 

Dwlogical0910

  • 1. Data warehousing logical design. Mirjana Mazuran, mazuran@elet.polimi.it. December 15, 2009.
  • 2. Outline. Data Warehouse logical design: ROLAP model (star schema, snowflake schema). Exercise 1: wine company. Exercise 2: real estate agency.
  • 3. Introduction: logical design. Starting from the conceptual design, it is necessary to determine the logical schema of the data. We use the ROLAP (Relational On-Line Analytical Processing) model to represent multidimensional data. ROLAP uses the relational data model, which means that data is stored in relations. Given the DFM (Dimensional Fact Model) representation of multidimensional data, two schemas are used: the star schema and the snowflake schema.
  • 4. ROLAP: star schema.
      Each dimension is represented by a relation such that:
          the primary key of the relation is the primary key of the dimension
          the attributes of the relation describe all aggregation levels of the dimension
      A fact is represented by a relation such that:
          the primary key of the relation is the set of primary keys imported from all the dimension tables
          the attributes of the relation are the measures of the fact
      Pros and cons:
          few joins are needed during query execution
          dimension tables are denormalized
          denormalization introduces redundancy
  • 5. ROLAP: star schema, example. [Figure: DFM fact schema Accident (measures NrOfAccidents, Cost) and the corresponding star schema.]
      Fact table:
          ACCIDENT (IdTime, IdPolicy, IdCustomer, IdMotivation, NrOfAccidents, Cost)
      Dimension tables:
          TIME (IdTime, Day, Month, Year)
          POLICY (IdPolicy, Class, Amount, MaxAmount)
          CUSTOMER (IdCustomer, BYear, Sex, City, Region)
          MOTIVATION (IdMotivation, Motivation)
  • 6. ROLAP: snowflake schema.
      Each (primary) dimension is represented by a relation such that:
          the primary key of the relation is the primary key of the dimension
          the attributes of the relation depend directly on the primary key
          a set of foreign keys gives access to information at different levels of aggregation; this information belongs to the secondary dimensions and is stored in dedicated relations
      A fact is represented by a relation such that:
          the primary key of the relation is the set of primary keys imported from all, and only, the primary dimension tables
          the attributes of the relation are the measures of the fact
      Pros and cons:
          denormalization is reduced
          less storage space is required
          many joins may be required when a query involves attributes stored in secondary dimension tables
  • 7. ROLAP: snowflake schema, example. [Figure: DFM fact schema Sale (measures Quantity, Income, Discount) and the corresponding snowflake schema.]
      Fact table:
          SALE (IdTime, IdFurniture, IdCustomer, Quantity, Income, Discount)
      Primary dimension tables:
          TIME (IdTime, Day, Month, Year)
          FURNITURE (IdFurniture, Material, Type, Category)
          CUSTOMER (IdCustomer, BYear, Sex, IdPlace)
      Secondary dimension table:
          PLACE (IdPlace, City, Region, State)
  • 8. Exercise 1: wine company. An online wine retailer needs a data warehouse that records the quantity and sales of its wines to its customers. Part of the original database is composed of the following tables:
          CUSTOMER (Code, Name, Address, Phone, BDay, Gender)
          WINE (Code, Name, Type, Vintage, BottlePrice, CasePrice, Class)
          CLASS (Code, Name, Region)
          TIME (TimeStamp, Date, Year)
          ORDER (Customer, Wine, Time, nrBottles, nrCases)
      Note that the tables represent the main entities of the ER schema, so the significant relationships among them must be derived in order to design the data warehouse correctly.
  • 9. Exercise 1: a possible solution (snowflake schema).
      FACT Sales
      MEASURES Quantity, Cost
      DIMENSIONS Customer, Area, Time, Wine → Class
      Fact table:
          SALESTABLE (WineCode, CustomerCode, OrderTimeCode, Quantity, Cost)
      Dimension tables:
          CUSTOMER (CustomerCode, Birthday, Gender)
          WINE (WineCode, ClassCode, Type, Vintage)
          CLASS (ClassCode, Region)
          TIME (TimeCode, Date, Year)
  • 10. Exercise 2: real estate agency. Consider a real estate agency whose database is composed of the following tables:
          OWNER (IDOwner, Name, Surname, Address, City, Phone)
          ESTATE (IDEstate, IDOwner, Category, Area, City, Province, Rooms, Bedrooms, Garage, Meters)
          CUSTOMER (IDCust, Name, Surname, Budget, Address, City, Phone)
          AGENT (IDAgent, Name, Surname, Office, Address, City, Phone)
          AGENDA (IDAgent, Data, Hour, IDEstate, ClientName)
          VISIT (IDEstate, IDAgent, IDCust, Date, Duration)
          SALE (IDEstate, IDAgent, IDCust, Date, AgreedPrice, Status)
          RENT (IDEstate, IDAgent, IDCust, Date, Price, Status, Time)
  • 11. Exercise 2: real estate agency.
      Goal: provide a supervisor with an overview of the situation. The supervisor must have a global view of the business, in terms of the estates the agency deals with and of the agents' work.
      Questions:
          1. Design a conceptual schema for the DW.
          2. What facts and dimensions do you consider?
          3. Design a star schema or snowflake schema for the DW.
      Write the following SQL queries:
          How many customers have visited properties of at least 3 different categories?
          What is the average duration of visits per property category?
          Who has paid the highest price among the customers that have viewed properties of at least 3 different categories?
          Who has bought a flat for the highest price in each month?
          What kind of property sold for the highest price in each city and month?
  • 12. Exercise 2: a possible solution (facts and dimensions). Points 1 and 2 are left as homework. In particular, you are required to discuss the facts of interest (from your point of view) and then define, for each fact, the attribute tree, dimensions and measures, with the corresponding glossary. The following ideas are used in the solution of the exercise:
      Supervisors should be able to monitor the sales of the agency.
          FACT Sales
          MEASURES OfferPrice, AgreedPrice, Status
          DIMENSIONS EstateID, OwnerID, CustomerID, AgentID, TimeID
      Supervisors should be able to monitor the work of the agents by analyzing the visits to the estates that the agents are in charge of.
          FACT Viewing
          MEASURES Duration
          DIMENSIONS EstateID, CustomerID, AgentID, TimeID
  • 13. Exercise 2: a possible solution (star schema).
      Fact tables:
          SALESTABLE (EstateID, OwnerID, CustomerID, AgentID, TimeID, OfferPrice, AgreedPrice, Status)
          VIEWINGTABLE (EstateID, CustomerID, AgentID, TimeID, Duration)
      Dimension tables:
          ESTATE (EstateID, Category, Area, City, Province, Rooms, Bedrooms, Garage, Meters, Sheet, Map)
          OWNER (OwnerID, Name, Surname, Address, City, Phone)
          AGENT (AgentID, Name, Surname, Office, Address, City, Phone)
          CUSTOMER (CustomerID, Name, Surname, Budget, Address, City, Phone)
          TIME (TimeID, Day, Month, Year)
  • 14. Exercise 2: a possible solution (snowflake schema).
      Fact tables:
          SALESTABLE (EstateID, TimeID, CustomerID, OwnerID, AgentID, OfferPrice, AgreedPrice, Status)
          VIEWINGTABLE (TimeID, EstateID, CustomerID, AgentID, Duration)
      Primary dimension tables:
          ESTATE (EstateID, PlaceID, Category, Rooms, Bedrooms, Garage, Meters, Sheet, Map)
          OWNER (OwnerID, PlaceID, Name, Surname, Address, Phone)
          CUSTOMER (CustomerID, PlaceID, Name, Surname, Budget, Address, Phone)
          AGENT (AgentID, PlaceID, Name, Surname, Office, Address, Phone)
          TIME (TimeID, Day, Month, Year)
      Secondary dimension table:
          PLACE (PlaceID, City, Province, Area)
  • 15. Exercise 2: a possible solution (SQL queries on the star schema).
      How many customers have visited properties of at least 3 different categories?
          CREATE VIEW Cust3Cat AS
          SELECT V.CustomerID
          FROM ViewingTable V, Estate E
          WHERE V.EstateID = E.EstateID
          GROUP BY V.CustomerID
          HAVING COUNT(DISTINCT E.Category) >= 3;

          SELECT COUNT(*)
          FROM Cust3Cat;
  • 16. Exercise 2: a possible solution (SQL queries on the star schema).
      Who has paid the highest price among the customers that have viewed properties of at least 3 different categories?
          SELECT C.CustomerID
          FROM Cust3Cat C, SalesTable S
          WHERE C.CustomerID = S.CustomerID
            AND S.AgreedPrice IN (
                  SELECT MAX(S1.AgreedPrice)
                  FROM Cust3Cat C1, SalesTable S1
                  WHERE C1.CustomerID = S1.CustomerID);
      What is the average duration of visits per property category?
          SELECT E.Category, AVG(V.Duration)
          FROM ViewingTable V, Estate E
          WHERE V.EstateID = E.EstateID
          GROUP BY E.Category;
  • 17. Exercise 2: a possible solution (SQL queries on the star schema).
      Who has bought a flat for the highest price in each month?
          SELECT S.CustomerID, T.Month, T.Year, S.AgreedPrice
          FROM SalesTable S, Estate E, Time T
          WHERE S.EstateID = E.EstateID
            AND S.TimeID = T.TimeID
            AND E.Category = 'flat'
            AND (T.Month, T.Year, S.AgreedPrice) IN (
                  SELECT T1.Month, T1.Year, MAX(S1.AgreedPrice)
                  FROM SalesTable S1, Estate E1, Time T1
                  WHERE S1.EstateID = E1.EstateID
                    AND S1.TimeID = T1.TimeID
                    AND E1.Category = 'flat'
                  GROUP BY T1.Month, T1.Year);
  • 18. Exercise 2: a possible solution (SQL queries on the star schema).
      What kind of property sold for the highest price in each city and month?
          SELECT E.Category, E.City, T.Month, T.Year, S.AgreedPrice
          FROM SalesTable S, Time T, Estate E
          WHERE S.TimeID = T.TimeID
            AND E.EstateID = S.EstateID
            AND (S.AgreedPrice, E.City, T.Month, T.Year) IN (
                  SELECT MAX(S1.AgreedPrice), E1.City, T1.Month, T1.Year
                  FROM SalesTable S1, Time T1, Estate E1
                  WHERE S1.TimeID = T1.TimeID
                    AND E1.EstateID = S1.EstateID
                  GROUP BY T1.Month, T1.Year, E1.City);
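
The star-schema queries of Exercise 2 can be tried out on a small in-memory database. The following is only a sketch, not part of the original slides: it uses Python's standard sqlite3 module, keeps only the columns the queries touch, and loads a handful of invented sample rows (all IDs, cities, prices and durations are assumptions made for illustration). The helper function names are hypothetical as well.

```python
import sqlite3

# Trimmed-down star schema: only the columns needed by the queries.
SCHEMA = """
CREATE TABLE Estate (EstateID INTEGER PRIMARY KEY, Category TEXT, City TEXT);
CREATE TABLE Time (TimeID INTEGER PRIMARY KEY, Day INTEGER, Month INTEGER, Year INTEGER);
CREATE TABLE ViewingTable (
    EstateID INTEGER REFERENCES Estate(EstateID),
    CustomerID INTEGER,
    AgentID INTEGER,
    TimeID INTEGER REFERENCES Time(TimeID),
    Duration INTEGER,
    PRIMARY KEY (EstateID, CustomerID, AgentID, TimeID)
);
CREATE TABLE SalesTable (
    EstateID INTEGER REFERENCES Estate(EstateID),
    OwnerID INTEGER,
    CustomerID INTEGER,
    AgentID INTEGER,
    TimeID INTEGER REFERENCES Time(TimeID),
    AgreedPrice INTEGER,
    PRIMARY KEY (EstateID, OwnerID, CustomerID, AgentID, TimeID)
);
-- Customers who have viewed estates of at least 3 different categories.
CREATE VIEW Cust3Cat AS
SELECT V.CustomerID
FROM ViewingTable V JOIN Estate E ON V.EstateID = E.EstateID
GROUP BY V.CustomerID
HAVING COUNT(DISTINCT E.Category) >= 3;
"""


def build_demo_db():
    """Create the in-memory database and load invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO Estate VALUES (?, ?, ?)", [
        (1, "flat", "Milan"), (2, "villa", "Milan"),
        (3, "office", "Como"), (4, "flat", "Como"),
    ])
    conn.executemany("INSERT INTO Time VALUES (?, ?, ?, ?)", [
        (1, 10, 1, 2009), (2, 20, 1, 2009), (3, 5, 2, 2009),
    ])
    # Customer 100 views three categories; customer 200 views flats only.
    conn.executemany("INSERT INTO ViewingTable VALUES (?, ?, ?, ?, ?)", [
        (1, 100, 1, 1, 30), (2, 100, 1, 2, 50), (3, 100, 2, 3, 40),
        (1, 200, 1, 2, 20), (4, 200, 2, 3, 10),
    ])
    conn.executemany("INSERT INTO SalesTable VALUES (?, ?, ?, ?, ?, ?)", [
        (1, 10, 100, 1, 2, 200000),  # flat, January 2009
        (4, 11, 200, 2, 1, 150000),  # flat, January 2009
        (2, 12, 100, 1, 3, 480000),  # villa, February 2009
    ])
    return conn


def customers_with_3_categories(conn):
    return conn.execute("SELECT COUNT(*) FROM Cust3Cat").fetchone()[0]


def avg_duration_per_category(conn):
    return conn.execute("""
        SELECT E.Category, AVG(V.Duration)
        FROM ViewingTable V JOIN Estate E ON V.EstateID = E.EstateID
        GROUP BY E.Category
        ORDER BY E.Category
    """).fetchall()


def top_flat_buyer_per_month(conn):
    # Row-value IN needs SQLite 3.15 or later.
    return conn.execute("""
        SELECT S.CustomerID, T.Month, T.Year, S.AgreedPrice
        FROM SalesTable S
        JOIN Estate E ON S.EstateID = E.EstateID
        JOIN Time T ON S.TimeID = T.TimeID
        WHERE E.Category = 'flat'
          AND (T.Month, T.Year, S.AgreedPrice) IN (
                SELECT T1.Month, T1.Year, MAX(S1.AgreedPrice)
                FROM SalesTable S1
                JOIN Estate E1 ON S1.EstateID = E1.EstateID
                JOIN Time T1 ON S1.TimeID = T1.TimeID
                WHERE E1.Category = 'flat'
                GROUP BY T1.Month, T1.Year)
    """).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(customers_with_3_categories(db))
    print(avg_duration_per_category(db))
    print(top_flat_buyer_per_month(db))
```

The sketch writes the joins with explicit JOIN ... ON rather than the comma-separated FROM lists used in the slides; the two forms are equivalent for these inner joins. Every query needs at most two joins against the fact table, which is the "few joins" advantage the star schema is credited with; on the snowflake variant, any query filtering on City would need an additional join through Place.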