Real World Business Intelligence
     and Data Warehousing
           Dr. Thomas Zurek
            January 2012
Agenda

1. Business Intelligence and Data Warehouses

      definition

      examples

2. What are the Challenges?

3. SQL and OLAP

4. What SAP does …

5. Take Aways
Examples of Business Intelligence Scenarios

 fraud detection
 •   retail company
 •   point-of-sales data & given discounts
 •   huge amounts of data
 •   a prototypical BI question
 •   screencam
 production analysis
 • solar power production
 long tail analysis
 • e-commerce companies like Amazon, eBay, iTunes, Netflix, …
 • translate sales of popular products into (additional) sales in the long tail
 • BI integrated into operational processes
Long Tail Analysis (1) – An Example from Amazon
Long Tail Analysis (2)
 •   Source: Chris Anderson, The Long Tail, Wired, October 2004, http://www.wired.com/wired/archive/12.10/tail.html
Long Tail Analysis (3)
 •   Source: Chris Anderson, The Long Tail, Wired, October 2004, http://www.wired.com/wired/archive/12.10/tail.html
Business Intelligence and Data Warehouses

• Business Intelligence
  An environment in which business users conduct analyses that yield overall
  understanding of where
        the business has been,
        where it is now, and
        where it will be in the near future (i.e. planning, predictive).



• Data Warehouse
     An implementation of an informational database used to collect, integrate
      and provide sharable data sourced from multiple operational databases for
      analyses.
     Provides data that is reliable, consistent, and understandable.
     It typically serves as the foundation for a business intelligence system.
A Typical Data Warehouse Architecture

End-user access / Presentation

Reporting / Analyses / Planning
Main Service   : Make data available for reporting & planning tools
Transform      : Application specific / (dis-)aggregate / lookup
Content        : Application specific
History        : Application specific
Store          : IC, DSO, Info Set, Virtual Provider, Multi Provider

Data Propagation
Main Service   : Spot for apps / Delta to app / App recovery
Transform      : Enriched || General business logic
Content        : Data source || Business domain specific
History        : Determined by rebuild requirements of apps
Store          : DSO (can be logically partitioned)

Harmonization
Main Service   : Integrated, harmonized
Transform      : Harmonize, quality assure (in flow || lookup)
Content        : Defined fields
History        : Short or not at all || Long term
Store          : Info source || IO / DSO / Z-table

Data Acquisition
Main Service   : Decouple, fast load and distribute
Transform      : 1:1
Content        : 1 data source, all fields
History        : 4 weeks
Store          : PSA, DSO-WO

Source 1     Source 2   Source 3   Source 4   Source 5

Other diagram labels: BI Layer, Data Warehouse, ODS, Corp. Memory, Business transform, Provide data, Project Governance, IT Governance
Main Challenges in the Data Warehousing Layer
 physical connectivity to source systems
 •   many protocols
 •   many formats, code pages, unicode / non-unicode
 •   network quality
 •   source system dependency (down times, peak times, …)
 transformation, cleansing, scrubbing
 •   Jun 1, 2011 = 1.6.2011 = 06/01/11 = …
 •   VW Touareg = VW TOUAREG = *product+ 87654 = …
 •   currency and unit conversions: e.g. box → kg
 •   resolve ID clashes: e.g. same product no. used in different subsidiaries
 •   enrich data: add attributes from source A to data from source B
 consistency, integrity, compliance
 • create one version of the truth
 • track data flows; know where the data originated ("data provenance")
 • keep log and other change information for audits
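The cleansing examples above (date spellings, product name variants, ID clashes) can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular ETL tool. The function names are hypothetical, and the format list encodes an assumption: "06/01/11" is read US-style as month/day/two-digit year.

```python
from datetime import date, datetime

# Candidate formats, tried in order.
# Assumption: "06/01/11" is US-style month/day/2-digit-year.
# Note: "%b" (month abbreviation) depends on the active locale.
FORMATS = ("%b %d, %Y", "%d.%m.%Y", "%m/%d/%y")


def normalize_date(raw: str) -> date:
    """Parse a source-specific date string into one canonical date."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")


def normalize_product(raw: str) -> str:
    """Collapse whitespace and upper-case so spelling variants compare equal."""
    return " ".join(raw.split()).upper()


def global_product_id(subsidiary: str, local_no: str) -> str:
    """Resolve ID clashes by qualifying a local product no. with its source."""
    return f"{subsidiary}:{local_no}"
```

With these helpers, "Jun 1, 2011", "1.6.2011" and "06/01/11" all map to the same date value, and the same product number from two subsidiaries yields two distinct warehouse keys.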
Main Challenges in the BI Layer
 calculations
 •   aggregation of facts: SUM, MIN, MAX, AVG, COUNT, COUNT DISTINCT, …
 •   formulas: e.g. revenue per employee, profitability, …
 •   multi-dimensionality: e.g. time – region – product – sales org
 •   hierarchies: versioning, logic, various types of hierarchies
 •   currency and unit conversions
 •   exceptions: e.g. "good": revenue > 1 million, "bad": revenue < 500000
 security
 performance
 • use efficient data structures
 • caching
 • precalculation
 planning
 • actuals (read-only) vs plan data
 • planning session / transaction
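Two of the calculation challenges above, formulas and exceptions, can be sketched as follows. The thresholds come from the slide; the "neutral" label for the middle band, the function names, and the sample figures are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

GOOD_ABOVE = 1_000_000  # "good": revenue > 1 million
BAD_BELOW = 500_000     # "bad": revenue < 500000


def classify_revenue(revenue: float) -> str:
    """Exception rule; the middle band is unnamed on the slide (assumed "neutral")."""
    if revenue > GOOD_ABOVE:
        return "good"
    if revenue < BAD_BELOW:
        return "bad"
    return "neutral"


def revenue_per_employee(revenue: float, employees: int) -> Optional[float]:
    """Formula key figure; undefined when there are no employees."""
    return revenue / employees if employees else None


# Hypothetical sales orgs: (revenue, employees)
orgs: List[Tuple[float, int]] = [(1_200_000, 10), (300_000, 5)]

# A formula on a total must be applied AFTER aggregating its inputs ...
total = revenue_per_employee(sum(r for r, _ in orgs), sum(e for _, e in orgs))
# ... which differs from averaging the per-org ratios.
avg_of_ratios = sum(revenue_per_employee(r, e) for r, e in orgs) / len(orgs)
```

The design point is the order of operations: the total row of a formula key figure is computed from aggregated inputs, so it cannot be derived by aggregating the detail rows of the formula itself.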
Main Challenges in the BI Frontend Layer
The frontend layer exposes the rich functionality of the platform.
 many user groups
 • casual user
 • advanced user
 • expert user: familiar w/ domain, data model, technology
 many contexts
 • operational: any employee supervising operations, processes
 • tactical: managers
 • strategic: higher management, board
 many technologies
 •   web: browser, portals, …
 •   Office (esp. Excel)
 •   specific tools
 •   dissemination via email, collaboration spaces, …
SQL and OLAP: Example of a Simple Query

   Quantity          : (standard) key figure aggregated by SUM
   No. of Customers  : COUNT DISTINCT key figure
   Share per Country : calculated key figure, normalizing to the subtotal

   Country       Material     Quantity   No. of Customers   Share per Country
   DE            Pencil       10         5                  67% (10/15)
   DE            Paper        5          3                  33% (5/15)
   DE            Subtotal     15         6                  100%
   US            Pencil       7          3                  39% (7/18)
   US            Glue         11         5                  61% (11/18)
   US            Subtotal     18         7                  100%
   Grand Total                33         11                 100%
SQL and OLAP: Data to Calculate the Query Result
            SELECT Country, Material, Customer, SUM(Quantity), 1 FROM …

Country   Material   Customer     Quantity   No. of Customers
DE        Pencil     Aral         2          1
DE        Pencil     BP           3          1
DE        Pencil     Esso         1          1
DE        Pencil     Shell        2          1
DE        Pencil     Texaco       2          1
DE        Paper      BP           1          1
DE        Paper      Esso         1          1
DE        Paper      Jet          3          1
US        Pencil     Agip         1          1
US        Pencil     Chevron      3          1
US        Pencil     Texaco       3          1
US        Glue       Agip         3          1
US        Glue       Elf          3          1
US        Glue       Exxon        1          1
US        Glue       Repsol       2          1
US        Glue       Shell        2          1

 This is what can be retrieved by SQL.
 This is the starting point for further calculations.
 16 rows
 •   imagine a retailer
     o 10000s of materials
     o 10000s of customers
 •   imagine a utilities or mobile phone company
     o millions of customers
 •   combinatorics let this result explode
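To see why the customer-level rows are needed as a starting point, the 16 rows can be loaded into an in-memory SQLite database and aggregated at each level. This is a sketch only: the table name `sales` and the column names are assumptions, since the slide elides the FROM clause.

```python
import sqlite3

# The 16 customer-level rows from the slide: (country, material, customer, quantity)
ROWS = [
    ("DE", "Pencil", "Aral", 2), ("DE", "Pencil", "BP", 3),
    ("DE", "Pencil", "Esso", 1), ("DE", "Pencil", "Shell", 2),
    ("DE", "Pencil", "Texaco", 2),
    ("DE", "Paper", "BP", 1), ("DE", "Paper", "Esso", 1), ("DE", "Paper", "Jet", 3),
    ("US", "Pencil", "Agip", 1), ("US", "Pencil", "Chevron", 3),
    ("US", "Pencil", "Texaco", 3),
    ("US", "Glue", "Agip", 3), ("US", "Glue", "Elf", 3), ("US", "Glue", "Exxon", 1),
    ("US", "Glue", "Repsol", 2), ("US", "Glue", "Shell", 2),
]

con = sqlite3.connect(":memory:")
# "sales" is a hypothetical table name; the slide does not give one.
con.execute(
    "CREATE TABLE sales (country TEXT, material TEXT, customer TEXT, quantity INTEGER)"
)
con.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", ROWS)

# Detail rows of the report: one row per country and material.
per_material = con.execute(
    "SELECT country, material, SUM(quantity), COUNT(DISTINCT customer) "
    "FROM sales GROUP BY country, material ORDER BY country, material"
).fetchall()

# Subtotals: one row per country.
per_country = con.execute(
    "SELECT country, SUM(quantity), COUNT(DISTINCT customer) "
    "FROM sales GROUP BY country ORDER BY country"
).fetchall()

# Grand total.
grand_total = con.execute(
    "SELECT SUM(quantity), COUNT(DISTINCT customer) FROM sales"
).fetchone()
```

Quantity is additive (10 + 5 = 15 for DE), but the distinct customer count is not: DE has 5 Pencil customers and 3 Paper customers, yet only 6 distinct customers in total. Subtotals and totals of such key figures therefore cannot be derived from the detail rows of the report alone.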
SQL and OLAP: Layer Definition for Example Query

 LQ: Coun, Mat, Cust, SUM(Quan), 1

 derived from LQ:
 L1: Coun, SUM(Quan)
 L5: Coun, Cust, 1
 L6: Cust, 1

 derived from LQ join L1:
 L2: LQ.Coun, LQ.Mat, SUM(LQ.Quan)/SUM(L1.Quan) from LQ join L1
 L3: LQ.Coun, SUM(LQ.Quan)/SUM(L1.Quan) from LQ join L1
 L4: SUM(LQ.Quan)/SUM(L1.Quan) from LQ join L1
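One possible reading of these layers, written as SQLite views, is sketched below. The slide's notation is shorthand, so this is an interpretation rather than the author's implementation: the table name `sales`, the column aliases, and the use of MAX(L1.quan) to pick the single country subtotal after the join are all assumptions.

```python
import sqlite3

# The 16 customer-level rows: (country, material, customer, quantity)
ROWS = [
    ("DE", "Pencil", "Aral", 2), ("DE", "Pencil", "BP", 3),
    ("DE", "Pencil", "Esso", 1), ("DE", "Pencil", "Shell", 2),
    ("DE", "Pencil", "Texaco", 2),
    ("DE", "Paper", "BP", 1), ("DE", "Paper", "Esso", 1), ("DE", "Paper", "Jet", 3),
    ("US", "Pencil", "Agip", 1), ("US", "Pencil", "Chevron", 3),
    ("US", "Pencil", "Texaco", 3),
    ("US", "Glue", "Agip", 3), ("US", "Glue", "Elf", 3), ("US", "Glue", "Exxon", 1),
    ("US", "Glue", "Repsol", 2), ("US", "Glue", "Shell", 2),
]

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sales (country TEXT, material TEXT, customer TEXT, quantity INTEGER);

-- LQ: the base result that plain SQL delivers
CREATE VIEW LQ AS
  SELECT country, material, customer, SUM(quantity) AS quan, 1 AS one
  FROM sales GROUP BY country, material, customer;

-- L1: quantity subtotal per country
CREATE VIEW L1 AS
  SELECT country, SUM(quan) AS quan FROM LQ GROUP BY country;

-- L5 / L6: one row per distinct customer (per country / overall)
CREATE VIEW L5 AS SELECT DISTINCT country, customer, 1 AS one FROM LQ;
CREATE VIEW L6 AS SELECT DISTINCT customer, 1 AS one FROM LQ;

-- L2: share of each material in its country subtotal
CREATE VIEW L2 AS
  SELECT LQ.country AS country, LQ.material AS material,
         SUM(LQ.quan) * 1.0 / MAX(L1.quan) AS share
  FROM LQ JOIN L1 ON LQ.country = L1.country
  GROUP BY LQ.country, LQ.material;
""")
con.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", ROWS)

shares = {
    (c, m): round(s, 2)
    for c, m, s in con.execute("SELECT country, material, share FROM L2")
}
customers_per_country = dict(
    con.execute("SELECT country, SUM(one) FROM L5 GROUP BY country")
)
customers_total = con.execute("SELECT SUM(one) FROM L6").fetchone()[0]
```

Summing the constant 1 over L5 and L6 yields the distinct customer counts for the subtotal and grand total rows, and L2 reproduces the "Share per Country" column of the example query.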
SQL and OLAP: Assemble Query Result

  Country       Material   Quantity                   No. of Customers        Share per Country
  …             …          LQ: Coun, Mat, SUM(Quan)   LQ: Coun, Mat, SUM(1)   L2
                Subtotal   L1                         L5: Coun, SUM(1)        L3
  Grand Total              L1: SUM(Quan)              L6: SUM(1)              L4




© SAP AG 2009. All rights
What SAP Offers in this Context
 SAP Business Objects portfolio
    o     frontend tools
    o     data quality and extraction
    o     modeling tools
    o     analytic applications (EPM)
 SAP Sybase portfolio
    o     databases (ASE, IQ, …)
    o     modeling tools
 SAP Business Warehouse
    o     DW application on top of DB
    o     best practice approach
    o     built-in SAP semantics
 SAP HANA
    o     in-memory DB appliance
SAP HANA + SAP Business Warehouse (BW)
• In general:
          DW = DB + X     e.g. with X = BW

• Now:
          DB → HANA

• Thus:
          DW = HANA + Y   with Y = BW optimized for HANA
SAP Business Warehouse: the X or Y in more detail

• Data Warehouse
 o modeling of
     data flows
     transformations
     data containers
 o data movement and transformation processes
     design tools for such processes
     scheduling
     monitoring
     archiving
 o connectivity and extraction
     native connectivity to SAP systems and extractors
     first-class integration of Data Services (ETL)

• BI Layer
 o analytic modeling
     shared dimensions
     hierarchies
     measures + KPIs
     currency and unit handling
     time dependency / versioning
     formulas
 o dimensional data containers (cubes)
 o planning infrastructure
     modeling
     planning session concept
     planning functions
 o security
SAP HANA: Key Impacts on Modern DBMS

Advances in Technology
• column-store
• in-memory
• multi-core processors
• data compression
• InfiniBand
• hard- and software bundling
• NoSQL (i.e. no-ACID)
• …

Application-Awareness
• DB tailored towards the applications
• providing generic operations
  •   frequently used by those applications
  •   not in standard SQL (or else)
• examples
  •   currency conversion
  •   unit of measure conversion
  •   hierarchy logic
  •   delta management → BW's DSO
  •   calculation engine
  •   planning engine
SAP HANA: In-Memory Computing
                  Programming Against a New Scarce Resource…

   Type of Memory   Size               Latency (~)
   L1 CPU Cache     64K                1 ns
   L2 CPU Cache     256K               5 ns
   L3 CPU Cache     8M                 20 ns
   Main Memory      GBs up to TBs      100 ns
   Disk             TBs                >1,000,000 ns

 need cache-conscious data structures and algorithms!
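The idea behind cache-conscious data structures can be illustrated with the difference between a row layout and a column layout. The sketch below uses hypothetical fact data and Python's standard `array` module; it only shows the layout idea and is not a benchmark, since interpreter overhead in Python hides most cache effects.

```python
from array import array

# Hypothetical fact rows: (customer_id, material_id, quantity)
rows = [(i, i % 7, i % 5) for i in range(1000)]


def sum_quantity_rows(records):
    """Row layout: every record is visited even though one field is needed."""
    return sum(r[2] for r in records)


# Column layout: one contiguous, typed array per column.
columns = {
    "customer_id": array("i", (r[0] for r in rows)),
    "material_id": array("i", (r[1] for r in rows)),
    "quantity": array("i", (r[2] for r in rows)),
}


def sum_quantity_columns(cols):
    """Column layout: the scan touches only the contiguous quantity values."""
    return sum(cols["quantity"])


# Number of bytes a full scan of the quantity column has to read.
quantity_bytes = columns["quantity"].itemsize * len(columns["quantity"])
```

Both functions return the same sum. In a column layout the scanned values sit next to each other in memory, so each cache line that is loaded contains only data the aggregation needs.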
SAP HANA™
 in-memory software + hardware
  (HP, IBM, Fujitsu, Cisco, Dell, Hitachi)
 data modeling and data management
 data acquisition

Current Scenarios
 stand-alone data marts
 • operational data marts
 • analytic data marts
 accelerator for ERP scenarios
 • e.g. controlling & profitability analysis (CO-PA)
 • transparent, i.e. consumption stays with ERP
 DB for Business Warehouse (BW)
 • BW optimized for HANA
 • HANA optimizations for BW

Diagram labels
 SAP Business Objects tools (SQL, BICS)
 Other query tools / apps (SQL, MDX)
 SAP HANA
 • SAP In-Memory Computing Studio
 • SAP In-Memory Database: Calculation and Planning Engine, Row & Column Storage
 • Real-Time Data Replication
 • SAP Business Objects Data Services
 SAP Business Suite, SAP NetWeaver Business Warehouse, Other data sources
Take Aways

1. What are Business Intelligence and Data Warehousing?

2. What are some of the challenges?

3. SAP's efforts and products in that space.
Real World Business Intelligence and Data Warehousing

Contenu connexe

Tendances

Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecture
pcherukumalla
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing concepts
pcherukumalla
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data Warehouse
Shanthi Mukkavilli
 
DATA Warehousing & Data Mining
DATA Warehousing & Data MiningDATA Warehousing & Data Mining
DATA Warehousing & Data Mining
cpjcollege
 

Tendances (20)

Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data Warehousing Overview
Data Warehousing OverviewData Warehousing Overview
Data Warehousing Overview
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecture
 
Warehousing dimension star-snowflake_schemas
Warehousing dimension star-snowflake_schemasWarehousing dimension star-snowflake_schemas
Warehousing dimension star-snowflake_schemas
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Basic Introduction of Data Warehousing from Adiva Consulting
Basic Introduction of  Data Warehousing from Adiva ConsultingBasic Introduction of  Data Warehousing from Adiva Consulting
Basic Introduction of Data Warehousing from Adiva Consulting
 
Ppt
PptPpt
Ppt
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing concepts
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data Warehouse
 
TDWI Roundtable: The HANA EDW
TDWI Roundtable: The HANA EDWTDWI Roundtable: The HANA EDW
TDWI Roundtable: The HANA EDW
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Mining
 
1.4 data warehouse
1.4 data warehouse1.4 data warehouse
1.4 data warehouse
 
Data warehousing and Data mining
Data warehousing and Data mining Data warehousing and Data mining
Data warehousing and Data mining
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
DATA MART APPROCHES TO ARCHITECTURE
DATA MART APPROCHES TO ARCHITECTUREDATA MART APPROCHES TO ARCHITECTURE
DATA MART APPROCHES TO ARCHITECTURE
 
DATA Warehousing & Data Mining
DATA Warehousing & Data MiningDATA Warehousing & Data Mining
DATA Warehousing & Data Mining
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.Data ware housing - Introduction to data ware housing process.
Data ware housing - Introduction to data ware housing process.
 

Similaire à Real World Business Intelligence and Data Warehousing

Dimensional Modelling Session 2
Dimensional Modelling Session 2Dimensional Modelling Session 2
Dimensional Modelling Session 2
akitda
 
Distributed Data Analysis with Hadoop and R - Strangeloop 2011
Distributed Data Analysis with Hadoop and R - Strangeloop 2011Distributed Data Analysis with Hadoop and R - Strangeloop 2011
Distributed Data Analysis with Hadoop and R - Strangeloop 2011
Jonathan Seidman
 

Similaire à Real World Business Intelligence and Data Warehousing (20)

Hadoop World 2011: Hadoop’s Life in Enterprise Systems - Y Masatani, NTTData
Hadoop World 2011: Hadoop’s Life in Enterprise Systems - Y Masatani, NTTDataHadoop World 2011: Hadoop’s Life in Enterprise Systems - Y Masatani, NTTData
Hadoop World 2011: Hadoop’s Life in Enterprise Systems - Y Masatani, NTTData
 
INTERFACE by apidays 2023 - API Green Score, Yannick Tremblais, Groupe Rocher
INTERFACE by apidays 2023 - API Green Score, Yannick Tremblais, Groupe RocherINTERFACE by apidays 2023 - API Green Score, Yannick Tremblais, Groupe Rocher
INTERFACE by apidays 2023 - API Green Score, Yannick Tremblais, Groupe Rocher
 
SURENDRANATH GANDLA4
SURENDRANATH GANDLA4SURENDRANATH GANDLA4
SURENDRANATH GANDLA4
 
NaliniProfile
NaliniProfileNaliniProfile
NaliniProfile
 
Big Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-onBig Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-on
 
Database Shootout: What's best for BI?
Database Shootout: What's best for BI?Database Shootout: What's best for BI?
Database Shootout: What's best for BI?
 
Application Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and FutureApplication Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and Future
 
Application Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and FutureApplication Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and Future
 
Human in the Loop AI for Building Knowledge Bases
Human in the Loop AI for Building Knowledge Bases Human in the Loop AI for Building Knowledge Bases
Human in the Loop AI for Building Knowledge Bases
 
Enterprise Data Lakes
Enterprise Data LakesEnterprise Data Lakes
Enterprise Data Lakes
 
Dwh faqs
Dwh faqsDwh faqs
Dwh faqs
 
DoneDeal - AWS Data Analytics Platform
DoneDeal - AWS Data Analytics PlatformDoneDeal - AWS Data Analytics Platform
DoneDeal - AWS Data Analytics Platform
 
Python business intelligence (PyData 2012 talk)
Python business intelligence (PyData 2012 talk)Python business intelligence (PyData 2012 talk)
Python business intelligence (PyData 2012 talk)
 
Running Cognos on Hadoop
Running Cognos on HadoopRunning Cognos on Hadoop
Running Cognos on Hadoop
 
Dimensional Modelling Session 2
Dimensional Modelling Session 2Dimensional Modelling Session 2
Dimensional Modelling Session 2
 
Introduction to Bigdata and HADOOP
Introduction to Bigdata and HADOOP Introduction to Bigdata and HADOOP
Introduction to Bigdata and HADOOP
 
Times ten 18.1_overview_meetup
Times ten 18.1_overview_meetupTimes ten 18.1_overview_meetup
Times ten 18.1_overview_meetup
 
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin AmbardDelta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
 
Distributed Data Analysis with Hadoop and R - Strangeloop 2011
Distributed Data Analysis with Hadoop and R - Strangeloop 2011Distributed Data Analysis with Hadoop and R - Strangeloop 2011
Distributed Data Analysis with Hadoop and R - Strangeloop 2011
 
GeoKettle: A powerful open source spatial ETL tool
GeoKettle: A powerful open source spatial ETL toolGeoKettle: A powerful open source spatial ETL tool
GeoKettle: A powerful open source spatial ETL tool
 

Dernier

Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 

Dernier (20)

TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 

Real World Business Intelligence and Data Warehousing

  • 1. Real World Business Intelligence and Data Warehousing Dr. Thomas Zurek January 2012
  • 2. Agenda 1. Business Intelligence and Data Warehouses  definition  examples 2. What are the Challenges? 3. SQL and OLAP 4. What SAP does … 5. Take Aways
  • 3. Agenda 1. Business Intelligence and Data Warehouses  definition  examples 2. What are the Challenges? 3. SQL and OLAP 4. What SAP does … 5. Take Aways
  • 4. Examples of Business Intelligence Scenarios  fraud detection • retail company • point-of-sales data & given discounts • huge amounts of data • a prototypical BI question • screencam  production analysis • solar power production  long tail analysis • e-commerce companies like Amazon, Ebay, iTunes, Netflix, … • translate sales of popular products into (additional) sales in the long tail • BI integrated into operational processes
  • 5. Long Tail Analysis (1) – An Example from Amazon
  • 6. Long Tail Analysis (2) Source: Chris Anderson, The Long Tail, Wired, October 2004, http://www.wired.com/wired/archive/12.10/tail.html
  • 7. Long Tail Analysis (3) • Source: Chris Anderson, The Long Tail, Wired, October 2004, http://www.wired.com/wired/archive/12.10/tail.html
  • 8. Business Intelligence and Data Warehouses
      • Business Intelligence: An environment in which business users conduct analyses that yield overall understanding of where
          o the business has been,
          o where it is now, and
          o where it will be in the near future (i.e. planning, predictive).
      • Data Warehouse
          o An implementation of an informational database used to collect, integrate and provide sharable data sourced from multiple operational databases for analyses.
          o Provide data that is reliable, consistent, understandable.
          o It typically serves as the foundation for a business intelligence system.
  • 9. A Typical Data Warehouse Architecture (layers listed top to bottom)
      • End-user access / Presentation
      • BI Layer: Reporting / Analyses / Planning
          o Main Service: Make data available for reporting & planning tools
          o Transform: Application specific / (dis-)aggregate / lookup
          o Content: Application specific
          o History: Application specific
          o Store: IC, DSO, Info Set, Virtual Provider, Multi Provider
      • Data Warehouse: Data Propagation / Corp. Memory
          o Main Service: Spot for apps / Delta to app / App recovery
          o Transform: Enriched || General Business logic
          o Content: Data source || Business domain specific
          o History: Determined by rebuild requirements of apps
          o Store: DSO (can be logical partitioned)
      • Harmonization transform
          o Main Service: Integrated, harmonized
          o Transform: Harmonize quality assure (in flow || lookup)
          o Content: Defined fields
          o History: Short or not at all || Long term
          o Store: Info source || IO/DSO/Z-table
      • Data Acquisition
          o Main Service: Decouple, Fast load and distribute
          o Transform: 1:1
          o Content: 1 data source, All fields
          o History: 4 weeks
          o Store: PSA, DSO-WO
      • Provide data: Source 1, Source 2, Source 3, Source 4, Source 5
      • Side labels in the diagram: Project Governance, Business IT Governance, ODS
  • 10. Agenda 1. Business Intelligence and Data Warehouses  definition  examples 2. What are the Challenges? 3. SQL and OLAP 4. What SAP does … 5. Take Aways
  • 11. Main Challenges in the Data Warehousing Layer
      • physical connectivity to source systems
          o many protocols
          o many formats, code pages, unicode / non-unicode
          o network quality
          o source system dependency (down times, peak times, …)
      • transformation, cleansing, scrubbing
          o Jun 1, 2011 = 1.6.2011 = 06/01/11 = …
          o VW Touareg = VW TOUAREG = *product+ 87654 = …
          o currency and unit conversions: e.g. box → kg
          o resolve ID clashes: e.g. same product no. used in different subsidiaries
          o enrich data: add attributes from source A to data from source B
      • consistency, integrity, compliance
          o create one version of the truth
          o track data flows; know where the data originated ("data provenance")
          o keep log and other change information for audits
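The cleansing bullets on slide 11 can be illustrated with a minimal Python sketch. It is not taken from the slides. The function names `normalize_date` and `normalize_product` are hypothetical, and reading `06/01/11` as month/day/year is an assumption (the slide's equation implies it, but a real source system would have to document its convention). Parsing `Jun` relies on an English or C locale.

```python
from datetime import date, datetime

# Candidate source formats, one per spelling shown on the slide.
# Assumption: 06/01/11 is month/day/year with a two-digit year.
KNOWN_FORMATS = ("%b %d, %Y", "%d.%m.%Y", "%m/%d/%y")


def normalize_date(raw: str) -> date:
    """Return the calendar date for a raw string, trying each known format."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"unrecognized date format: {raw!r}")


def normalize_product(raw: str) -> str:
    """Case-fold and collapse whitespace so spelling variants compare equal."""
    return " ".join(raw.split()).casefold()


if __name__ == "__main__":
    for text in ("Jun 1, 2011", "1.6.2011", "06/01/11"):
        print(text, "->", normalize_date(text).isoformat())
    print(normalize_product("VW Touareg") == normalize_product("VW  TOUAREG"))
```

Mapping a product number such as 87654 to the same product would need a lookup table from the source systems, which this sketch deliberately leaves out.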
  • 12. Main Challenges in the BI Layer
      • calculations
          o aggregation of facts: SUM, MIN, MAX, AVG, COUNT, COUNT DISTINCT, …
          o formulas: e.g. revenue per employee, profitability, …
          o multi-dimensionality: e.g. time – region – product – sales org
          o hierarchies: versioning, logic, various types of hierarchies
          o currency and unit conversions
          o exceptions: e.g. "good": revenue > 1 million, "bad": revenue < 500000
      • security
      • performance
          o use efficient data structures
          o caching
          o precalculation
      • planning
          o actuals (read-only) vs plan data
          o planning session / transaction
  • 13. Main Challenges in the BI Frontend Layer
      The frontend layer exposes the rich functionality of the platform.
      • many user groups
          o casual user
          o advanced user
          o expert user: familiar with domain, data model, technology
      • many contexts
          o operational: any employee supervising operations, processes
          o tactical: managers
          o strategic: higher management, board
      • many technologies
          o web: browser, portals, …
          o Office (esp. Excel)
          o specific tools
          o dissemination via email, collaboration spaces, …
  • 14. Agenda 1. Business Intelligence and Data Warehouses  definition  examples 2. What are the Challenges? 3. SQL and OLAP 4. What SAP does … 5. Take Aways
  • 15. SQL and OLAP: Example of a Simple Query
      Key figures:
          o Quantity: (standard) key figure aggregated by SUM
          o No. of Customers: COUNT DISTINCT key figure
          o Share per Country: calculated key figure, normalizing to the subtotal

      Country      Material   Quantity   No. of Customers   Share per Country
      DE           Pencil           10                  5   67% (10/15)
      DE           Paper             5                  3   33% (5/15)
      DE           Subtotal         15                  6   100%
      US           Pencil            7                  3   39% (7/18)
      US           Glue             11                  5   61% (11/18)
      US           Subtotal         18                  7   100%
      Grand Total                   33                 11   100%
  • 16. SQL and OLAP: Data to Calculate the Query Result
      SELECT Country, Material, Customer, SUM(Quantity), 1 FROM …

      Country   Material   Customer   Quantity   No. of Customers
      DE        Pencil     Aral              2                  1
      DE        Pencil     BP                3                  1
      DE        Pencil     Esso              1                  1
      DE        Pencil     Shell             2                  1
      DE        Pencil     Texaco            2                  1
      DE        Paper      BP                1                  1
      DE        Paper      Esso              1                  1
      DE        Paper      Jet               3                  1
      US        Pencil     Agip              1                  1
      US        Pencil     Chevron           3                  1
      US        Pencil     Texaco            3                  1
      US        Glue       Agip              3                  1
      US        Glue       Elf               3                  1
      US        Glue       Exxon             1                  1
      US        Glue       Repsol            2                  1
      US        Glue       Shell             2                  1

      • This is what can be retrieved by SQL.
      • This is the starting point for further calculations.
      • 16 rows →
          o imagine a retailer: 10000s of materials, 10000s of customers
          o imagine a utilities or mobile phone company: millions of customers
          o combinatorics let this result explode
  • 17. SQL and OLAP: Layer Definition for Example Query
      • LQ: Coun, Mat, Cust, SUM(Quan), 1
      • L1: Coun, SUM(Quan)
      • L5: Coun, Cust, 1
      • L6: Cust, 1
      • L2: LQ.Coun, LQ.Mat, SUM(LQ.Quan)/SUM(L1.Quan) from LQ join L1
      • L3: LQ.Coun, SUM(LQ.Quan)/SUM(L1.Quan) from LQ join L1
      • L4: SUM(LQ.Quan)/SUM(L1.Quan), from LQ join L1
  • 18. SQL and OLAP: Assemble Query Result
      Which layer fills which cell of the result:

      Country / Material   Quantity                   No. of Customers        Share per Country
      … (detail rows)      LQ: Coun, Mat, SUM(Quan)   LQ: Coun, Mat, SUM(1)   L2
      Subtotal             L1                         L5: Coun, SUM(1)        L3
      Grand Total          L1: SUM(Quan)              L6: SUM(1)              L4

      © SAP AG 2009. All rights
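The layering on slides 16 to 18 can be tried out with Python's built-in `sqlite3` module. This is only a sketch under stated assumptions, not SAP's implementation. The table name `sales`, the view names and the column aliases are hypothetical, and the 16 rows are copied from slide 16. One deliberate deviation from the slide's shorthand: the share divides by `MAX(l1.quan)` rather than `SUM(L1.Quan)`, because after the join the country subtotal repeats once per customer row and a literal SUM would count it several times.

```python
import sqlite3

# The 16 detail rows from slide 16: (country, material, customer, quantity).
ROWS = [
    ("DE", "Pencil", "Aral", 2), ("DE", "Pencil", "BP", 3),
    ("DE", "Pencil", "Esso", 1), ("DE", "Pencil", "Shell", 2),
    ("DE", "Pencil", "Texaco", 2),
    ("DE", "Paper", "BP", 1), ("DE", "Paper", "Esso", 1), ("DE", "Paper", "Jet", 3),
    ("US", "Pencil", "Agip", 1), ("US", "Pencil", "Chevron", 3),
    ("US", "Pencil", "Texaco", 3),
    ("US", "Glue", "Agip", 3), ("US", "Glue", "Elf", 3), ("US", "Glue", "Exxon", 1),
    ("US", "Glue", "Repsol", 2), ("US", "Glue", "Shell", 2),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (coun TEXT, mat TEXT, cust TEXT, quan INTEGER)")
con.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", ROWS)

# LQ: Coun, Mat, Cust, SUM(Quan), 1
con.execute("""
    CREATE VIEW lq AS
    SELECT coun, mat, cust, SUM(quan) AS quan, 1 AS cnt
    FROM sales GROUP BY coun, mat, cust""")
# L1: Coun, SUM(Quan)
con.execute("CREATE VIEW l1 AS SELECT coun, SUM(quan) AS quan FROM lq GROUP BY coun")
# L5: Coun, Cust, 1 (each customer once per country)
con.execute("CREATE VIEW l5 AS SELECT DISTINCT coun, cust, 1 AS cnt FROM lq")
# L6: Cust, 1 (each customer once overall)
con.execute("CREATE VIEW l6 AS SELECT DISTINCT cust, 1 AS cnt FROM lq")

# Detail rows: quantity and customer count from LQ, share as in L2.
detail = con.execute("""
    SELECT lq.coun, lq.mat, SUM(lq.quan), SUM(lq.cnt),
           SUM(lq.quan) * 1.0 / MAX(l1.quan)
    FROM lq JOIN l1 ON lq.coun = l1.coun
    GROUP BY lq.coun, lq.mat
    ORDER BY lq.coun, lq.mat""").fetchall()

# Subtotal and grand total of the COUNT DISTINCT key figure (L5 and L6).
subtotal_customers = dict(
    con.execute("SELECT coun, SUM(cnt) FROM l5 GROUP BY coun").fetchall())
total_customers = con.execute("SELECT SUM(cnt) FROM l6").fetchone()[0]

if __name__ == "__main__":
    for coun, mat, quan, customers, share in detail:
        print(coun, mat, quan, customers, f"{share:.0%}")
    print(subtotal_customers, total_customers)
```

The separate L5 and L6 layers are needed because a COUNT DISTINCT key figure is not additive: summing the detail counts for DE (5 + 3) would give 8 customers, while the slide's subtotal is 6.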
  • 19. Agenda 1. Business Intelligence and Data Warehouses  definition  examples 2. What are the Challenges? 3. SQL and OLAP 4. What SAP does … 5. Take Aways
  • 20. What SAP Offers in this Context (shown next to the architecture diagram from slide 9)
      • SAP Business Objects portfolio
          o frontend tools
          o data quality and extraction
          o modeling tools
          o analytic applications (EPM)
      • SAP Sybase portfolio
          o databases (ASE, IQ, …)
          o modeling tools
      • SAP Business Warehouse
          o DW application on top of DB
          o best practice approach
          o SAP semantics built-in
      • SAP HANA
          o in-memory DB appliance
  • 21. SAP HANA + SAP Business Warehouse (BW)
      • In general: DW = DB + X, e.g. with X = BW
      • Now: DB → HANA
      • Thus: DW = HANA + Y with Y = BW optimized for HANA
  • 22. SAP Business Warehouse: the X or Y in more detail
      • Data Warehouse
          o modeling of: data flows, transformations, data containers
          o data movement and transformation processes: design tools for such processes, scheduling, monitoring, archiving
          o connectivity and extraction: native connectivity to SAP systems and extractors, first-class integration of Data Services (ETL)
      • BI Layer
          o analytic modeling: shared dimensions, hierarchies, measures + KPIs, currency and unit handling, time dependency / versioning, formulas
          o dimensional data containers (cubes)
          o planning infrastructure: modeling, planning session concept, planning functions
          o security
  • 23. SAP HANA: Key Impacts on Modern DBMS
      • Advances in Technology
          o column-store
          o in-memory
          o multi-core processors
          o data compression
          o infiniband
          o hard- and software bundling
          o NoSQL (i.e. no-ACID)
          o …
      • Application-Awareness
          o DB tailored towards the applications
          o providing generic operations
          o frequently used by those applications
          o not in standard SQL (or else)
          o examples: currency conversion, unit of measure conversion, hierarchy logic, delta management → BW's DSO, calculation engine, planning engine
  • 24. SAP HANA: In-Memory Computing
      Programming Against a New Scarce Resource…

      Type of Memory   Size             Latency (~)
      L1 CPU Cache     64K              1 ns
      L2 CPU Cache     256K             5 ns
      L3 CPU Cache     8M               20 ns
      Main Memory      GBs up to TBs    100 ns
      Disk             TBs              >1.000.000 ns

      → need cache-conscious data-structures and algorithms!
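A small back-of-the-envelope calculation makes the latency table on slide 24 more tangible. The numbers are the approximate ones from the slide. Disk is listed as more than 1.000.000 ns, so the disk figure here is a lower bound. The helper names `slowdown` and `total_seconds` are hypothetical, and the model assumes strictly sequential, dependent accesses with no prefetching or overlap.

```python
# Approximate latencies from the slide, in nanoseconds.
# The disk value is the slide's lower bound (> 1.000.000 ns).
LATENCY_NS = {
    "L1 CPU cache": 1,
    "L2 CPU cache": 5,
    "L3 CPU cache": 20,
    "main memory": 100,
    "disk": 1_000_000,
}


def slowdown(slower: str, faster: str) -> float:
    """How many times longer one access takes at `slower` than at `faster`."""
    return LATENCY_NS[slower] / LATENCY_NS[faster]


def total_seconds(level: str, accesses: int) -> float:
    """Time for `accesses` dependent, non-overlapping accesses at one level."""
    return accesses * LATENCY_NS[level] / 1e9


if __name__ == "__main__":
    print("disk vs main memory:", slowdown("disk", "main memory"))
    print("main memory vs L1:", slowdown("main memory", "L1 CPU cache"))
    for level in LATENCY_NS:
        print(level, total_seconds(level, 10_000_000), "s for 10 million accesses")
```

Under these assumptions, moving data from disk to main memory removes a factor of at least 10,000 per access. A further factor of 100 separates main memory from the L1 cache, which is the gap that cache-conscious data structures target.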
  • 25. SAP HANA™
      • SAP HANA™
          o in-memory software + hardware (HP, IBM, Fujitsu, Cisco, Dell, Hitachi)
          o data modeling and data management
          o data acquisition
      • Current Scenarios
          o stand-alone data marts: operational data marts, analytic data marts
          o accelerator for ERP scenarios: e.g. controlling & profitability analysis (CO-PA); transparent, i.e. consumption stays with ERP
          o DB for Business Warehouse (BW): BW optimized for HANA, HANA optimizations for BW
      • Diagram (top to bottom)
          o consumers: SAP Business Objects tools, other query tools / apps
          o interfaces shown: SQL, BICS, SQL, MDX
          o SAP HANA: SAP In-Memory Computing Studio; SAP In-Memory Database with Calculation and Planning Engine and Row & Column Storage
          o loading: Real-Time Data Replication, SAP Business Objects Data Services
          o sources: SAP Business Suite, SAP NetWeaver Business Warehouse, other data sources
  • 26. Agenda 1. Business Intelligence and Data Warehouses  definition  examples 2. What are the Challenges? 3. SQL and OLAP 4. What SAP does … 5. Take Aways
  • 27. Take Aways
      1. What are Business Intelligence and Data Warehousing?
      2. What are some of the challenges?
      3. SAP's efforts and products in that space.

Editor's Notes

  1. So, what’s inside HANA? This architecture diagram explains the main components and capabilities. …So, I keep throwing around words like ‘massive’ amounts of data and ‘amazing’ speed. What kinds of scale, speed and improvement are customers seeing?