Real World Business Intelligence and Data Warehousing
1. Real World Business Intelligence and Data Warehousing
Dr. Thomas Zurek
January 2012
2. Agenda
1. Business Intelligence and Data Warehouses
definition
examples
2. What are the Challenges?
3. SQL and OLAP
4. What SAP does …
5. Take Aways
4. Examples of Business Intelligence Scenarios
fraud detection
• retail company
• point-of-sales data & given discounts
• huge amounts of data
• a prototypical BI question
• screencam
production analysis
• solar power production
long tail analysis
• e-commerce companies like Amazon, Ebay, iTunes, Netflix, …
• translate sales of popular products into (additional) sales in the long tail
• BI integrated into operational processes
6. Long Tail Analysis (2)
• Source: Chris Anderson, The Long Tail, Wired, October 2004, http://www.wired.com/wired/archive/12.10/tail.html
7. Long Tail Analysis (3)
• Source: Chris Anderson, The Long Tail, Wired, October 2004, http://www.wired.com/wired/archive/12.10/tail.html
8. Business Intelligence and Data Warehouses
• Business Intelligence
An environment in which business users conduct analyses that yield overall
understanding of where
the business has been,
where it is now, and
where it will be in the near future (i.e. planning, predictive).
• Data Warehouse
An implementation of an informational database used to collect, integrate
and provide sharable data sourced from multiple operational databases for
analyses.
It provides data that is reliable, consistent, and understandable.
It typically serves as the foundation for a business intelligence system.
9. A Typical Data Warehouse Architecture
End-user access / Presentation
BI Layer (Reporting / Analyses / Planning)
Main Service : Make data available for reporting & planning tools
Transform : Application specific / (dis-)aggregate / lookup
Content : Application specific
History : Application specific
Store : IC, DSO, Info Set, Virtual Provider, Multi Provider
Data Warehouse: Data Propagation
Main Service : Spot for apps / Delta to app / App recovery
Transform : Enriched || General business logic
Content : Data source || Business domain specific
History : Determined by rebuild requirements of apps
Store : DSO (can be logically partitioned)
Data Warehouse: Harmonization
Main Service : Integrated, harmonized
Transform : Harmonize, quality assure (in flow || lookup)
Content : Defined fields
History : Short or not at all || Long term
Store : Info source || IO / DSO / Z-table
Data Warehouse: Data Acquisition
Main Service : Decouple, fast load and distribute
Transform : 1:1
Content : 1 data source, all fields
History : 4 weeks
Store : PSA, DSO-WO
Sources: Source 1, Source 2, Source 3, Source 4, Source 5
Other labels in the diagram: Project Governance, IT Governance, ODS, Corp. Memory, Business, transform, Provide data
11. Main Challenges in the Data Warehousing Layer
physical connectivity to source systems
• many protocols
• many formats, code pages, unicode / non-unicode
• network quality
• source system dependency (down times, peak times, …)
transformation, cleansing, scrubbing
• Jun 1, 2011 = 1.6.2011 = 06/01/11 = …
• VW Touareg = VW TOUAREG = *product+ 87654 = …
• currency and unit conversions: e.g. box → kg
• resolve ID clashes: e.g. same product no. used in different subsidiaries
• enrich data: add attributes from source A to data from source B
consistency, integrity, compliance
• create one version of the truth
• track data flows; know where the data originated ("data provenance")
• keep log and other change information for audits
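As an illustration of the cleansing step, here is a minimal Python sketch that maps the date and product-name variants listed above to one canonical value. The format list, the function names and the choice to read 06/01/11 as month/day/year are assumptions made for this example; a real warehouse would configure such rules per source system.

```python
from datetime import date, datetime

# Assumed list of source formats; in practice this is configured per source.
KNOWN_FORMATS = ("%b %d, %Y", "%d.%m.%Y", "%m/%d/%y")


def normalize_date(raw: str) -> date:
    """Try each known format in turn and return the first successful parse."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")


def normalize_name(raw: str) -> str:
    """Collapse whitespace and case so that spelling variants compare equal."""
    return " ".join(raw.split()).casefold()


# All variants from the slide collapse to a single value each.
dates = {normalize_date(s) for s in ("Jun 1, 2011", "1.6.2011", "06/01/11")}
names = {normalize_name(s) for s in ("VW Touareg", "VW TOUAREG")}
print(dates, names)
```

Matching a product number such as 87654 to a product name needs a lookup against master data and is not covered by this sketch.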
12. Main Challenges in the BI Layer
calculations
• aggregation of facts: SUM, MIN, MAX, AVG, COUNT, COUNT DISTINCT, …
• formulas: e.g. revenue per employee, profitability, …
• multi-dimensionality: e.g. time – region – product – sales org
• hierarchies: versioning, logic, various types of hierarchies
• currency and unit conversions
• exceptions: e.g. "good": revenue > 1,000,000, "bad": revenue < 500,000
security
performance
• use efficient data structures
• caching
• precalculation
planning
• actuals (read-only) vs plan data
• planning session / transaction
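The formula and exception items in the calculations list can be sketched in a few lines of Python. The thresholds are the ones from the slide; the "neutral" band between them, the function names and the sample figures are assumptions for illustration only.

```python
from typing import Optional


def classify_revenue(revenue: float) -> str:
    """Apply the exception thresholds named on the slide."""
    if revenue > 1_000_000:
        return "good"
    if revenue < 500_000:
        return "bad"
    return "neutral"  # assumption: the slide does not name the middle band


def revenue_per_employee(revenue: float, employees: int) -> Optional[float]:
    """Formula key figure; undefined when there are no employees."""
    if employees == 0:
        return None
    return revenue / employees


# Hypothetical sales organisations: (revenue, number of employees).
sales_orgs = {"North": (1_200_000, 8), "South": (450_000, 5), "West": (700_000, 0)}
for name, (revenue, employees) in sales_orgs.items():
    print(name, classify_revenue(revenue), revenue_per_employee(revenue, employees))
```

Note that a formula like revenue per employee has to be evaluated after aggregation: summing the per-org ratios does not give the ratio of the sums.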
13. Main Challenges in the BI Frontend Layer
The frontend layer exposes the rich functionality of the platform.
many user groups
• casual user
• advanced user
• expert user: familiar w/ domain, data model, technology
many contexts
• operational: any employee supervising operations, processes
• tactical: managers
• strategic: higher management, board
many technologies
• web: browser, portals, …
• Office (esp. Excel)
• specific tools
• dissemination via email, collaboration spaces, …
15. SQL and OLAP: Example of a Simple Query
Country      Material  Quantity  No. of Customers  Share per Country
DE           Pencil    10        5                 67% (10/15)
DE           Paper     5         3                 33% (5/15)
DE           Subtotal  15        6                 100%
US           Pencil    7         3                 39% (7/18)
US           Glue      11        5                 61% (11/18)
US           Subtotal  18        7                 100%
Grand Total            33        11                100%
• Quantity: (standard) key figure aggregated by SUM
• No. of Customers: COUNT DISTINCT key figure
• Share per Country: calculated key figure, normalizing to the subtotal
16. SQL and OLAP: Data to Calculate the Query Result
SELECT Country, Material, Customer, SUM(Quantity), 1 FROM …
Country  Material  Customer  Quantity  No. of Customers
DE       Pencil    Aral      2         1
DE       Pencil    BP        3         1
DE       Pencil    Esso      1         1
DE       Pencil    Shell     2         1
DE       Pencil    Texaco    2         1
DE       Paper     BP        1         1
DE       Paper     Esso      1         1
DE       Paper     Jet       3         1
US       Pencil    Agip      1         1
US       Pencil    Chevron   3         1
US       Pencil    Texaco    3         1
US       Glue      Agip      3         1
US       Glue      Elf       3         1
US       Glue      Exxon     1         1
US       Glue      Repsol    2         1
US       Glue      Shell     2         1
• This is what can be retrieved by SQL. This is the starting point for further calculations.
• 16 rows
• imagine a retailer
o 10000s of materials
o 10000s of customers
• imagine a utilities or mobile phone company
o millions of customers
• combinatorics let this result explode
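The reason the query needs customer-level rows can be shown with a short Python sketch: quantities add up across levels, but distinct customer counts do not. The rows are the 16 rows above; the helper name aggregate is an assumption for this example.

```python
from collections import defaultdict

# (country, material, customer, quantity): the 16 rows from the slide.
ROWS = [
    ("DE", "Pencil", "Aral", 2), ("DE", "Pencil", "BP", 3),
    ("DE", "Pencil", "Esso", 1), ("DE", "Pencil", "Shell", 2),
    ("DE", "Pencil", "Texaco", 2), ("DE", "Paper", "BP", 1),
    ("DE", "Paper", "Esso", 1), ("DE", "Paper", "Jet", 3),
    ("US", "Pencil", "Agip", 1), ("US", "Pencil", "Chevron", 3),
    ("US", "Pencil", "Texaco", 3), ("US", "Glue", "Agip", 3),
    ("US", "Glue", "Elf", 3), ("US", "Glue", "Exxon", 1),
    ("US", "Glue", "Repsol", 2), ("US", "Glue", "Shell", 2),
]


def aggregate(rows, key):
    """Return {group: (sum of quantity, number of distinct customers)}."""
    quantity = defaultdict(int)
    customers = defaultdict(set)
    for country, material, customer, qty in rows:
        group = key(country, material)
        quantity[group] += qty
        customers[group].add(customer)
    return {g: (quantity[g], len(customers[g])) for g in quantity}


by_material = aggregate(ROWS, lambda c, m: (c, m))
by_country = aggregate(ROWS, lambda c, m: c)
grand_total = aggregate(ROWS, lambda c, m: "total")

print(by_material[("DE", "Pencil")], by_material[("DE", "Paper")])
print(by_country["DE"], grand_total["total"])
```

For DE, the material-level customer counts are 5 and 3, yet the subtotal is 6 rather than 8, because BP and Esso buy both materials. The subtotal therefore cannot be derived from the material-level result alone.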
17. SQL and OLAP: Layer Definition for Example Query
LQ: Coun, Mat, Cust, SUM(Quan), 1
L1: Coun, SUM(Quan)
L2: LQ.Coun, LQ.Mat, SUM(LQ.Quan)/SUM(L1.Quan) from LQ join L1
L3: LQ.Coun, SUM(LQ.Quan)/SUM(L1.Quan) from LQ join L1
L4: SUM(LQ.Quan)/SUM(L1.Quan) from LQ join L1
L5: Coun, Cust, 1
L6: Cust, 1
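A rough sketch of the layering idea, using Python's built-in sqlite3 module: LQ is the base result, L1 aggregates it per country, and the share per country comes from joining the two. The table name sales, the column names and the rounding are assumptions for this example; it is not how an OLAP engine generates its SQL.

```python
import sqlite3

# (country, material, customer, quantity): the 16 rows of the example query.
ROWS = [
    ("DE", "Pencil", "Aral", 2), ("DE", "Pencil", "BP", 3),
    ("DE", "Pencil", "Esso", 1), ("DE", "Pencil", "Shell", 2),
    ("DE", "Pencil", "Texaco", 2), ("DE", "Paper", "BP", 1),
    ("DE", "Paper", "Esso", 1), ("DE", "Paper", "Jet", 3),
    ("US", "Pencil", "Agip", 1), ("US", "Pencil", "Chevron", 3),
    ("US", "Pencil", "Texaco", 3), ("US", "Glue", "Agip", 3),
    ("US", "Glue", "Elf", 3), ("US", "Glue", "Exxon", 1),
    ("US", "Glue", "Repsol", 2), ("US", "Glue", "Shell", 2),
]

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE sales (country TEXT, material TEXT, customer TEXT, quantity INTEGER)"
)
con.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", ROWS)

SHARE_SQL = """
WITH LQ AS (  -- base layer: one row per country, material, customer
    SELECT country, material, customer, SUM(quantity) AS quan
    FROM sales
    GROUP BY country, material, customer
),
L1 AS (       -- subtotal layer: quantity per country
    SELECT country, SUM(quan) AS quan
    FROM LQ
    GROUP BY country
)
SELECT LQ.country, LQ.material, SUM(LQ.quan) AS quantity,
       ROUND(100.0 * SUM(LQ.quan) / L1.quan) AS share_pct
FROM LQ JOIN L1 ON L1.country = LQ.country
GROUP BY LQ.country, LQ.material, L1.quan
ORDER BY LQ.country, LQ.material
"""

shares = con.execute(SHARE_SQL).fetchall()
for row in shares:
    print(row)
```

The distinct customer counts would need further layers on top of LQ, which is what L5 and L6 on the slide stand for.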
20. What SAP Offers in this Context
(shown on top of the architecture diagram from slide 9)
• SAP Business Objects portfolio
o frontend tools
o data quality and extraction
o modeling tools
o analytic applications (EPM)
• SAP Sybase portfolio
o databases (ASE, IQ, …)
o modeling tools
• SAP Business Warehouse
o DW application on top of DB
o best practice approach
o built-in SAP semantics
• SAP HANA
o in-memory DB appliance
21. SAP HANA + SAP Business Warehouse (BW)
• In general:
DW = DB + X e.g. with X = BW
• Now:
DB → HANA
• Thus:
DW = HANA + Y with Y = BW optimized for HANA
22. SAP Business Warehouse: the X or Y in more detail
• Data Warehouse
o modeling of data flows, transformations, data containers
o data movement and transformation processes: design tools for such processes, scheduling, monitoring
o archiving
o connectivity and extraction: native connectivity to SAP systems and extractors, first-class integration of Data Services (ETL)
• BI Layer
o analytic modeling: shared dimensions, hierarchies, measures + KPIs, currency and unit handling, time dependency / versioning, formulas
o dimensional data containers (cubes)
o planning infrastructure: modeling, planning session concept, planning functions
o security
23. SAP HANA: Key Impacts on Modern DBMS
Advances in Technology
• column-store
• in-memory
• multi-core processors
• data compression
• infiniband
• hard- and software bundling
• NoSQL (i.e. no-ACID)
• …
Application-Awareness
• DB tailored towards the applications
• providing generic operations
o frequently used by those applications
o not in standard SQL (or else)
• examples
o currency conversion
o unit of measure conversion
o hierarchy logic
o delta management (BW's DSO)
o calculation engine
o planning engine
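To make the column-store and data compression items more concrete, here is a minimal Python sketch of dictionary encoding, one common compression scheme for column stores. It is a generic illustration and not a description of HANA's internals; the sample column values and the function name are assumptions.

```python
def dictionary_encode(values):
    """Replace each value by its index in a sorted dictionary of distinct values."""
    dictionary = sorted(set(values))
    position = {value: index for index, value in enumerate(dictionary)}
    codes = [position[value] for value in values]
    return dictionary, codes


# Two columns of a hypothetical sales table, stored column-wise.
materials = ["Pencil", "Paper", "Pencil", "Glue", "Pencil", "Glue"]
quantities = [10, 5, 7, 11, 2, 3]

dictionary, codes = dictionary_encode(materials)
print(dictionary)  # distinct values, stored once
print(codes)       # small integers instead of repeated strings

# A filter compares integer codes and touches only the two columns it needs.
pencil = dictionary.index("Pencil")
pencil_quantity = sum(q for code, q in zip(codes, quantities) if code == pencil)
print(pencil_quantity)
```

With few distinct values per column, the integer codes need far fewer bits than the original strings, and scans over them run on compact, contiguous data.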
24. SAP HANA: In-Memory Computing
Programming Against a New Scarce Resource…
Type of Memory  Size           Latency (~)
L1 CPU Cache    64K            1 ns
L2 CPU Cache    256K           5 ns
L3 CPU Cache    8M             20 ns
Main Memory     GBs up to TBs  100 ns
Disk            TBs            >1,000,000 ns
need cache-conscious data structures and algorithms!
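The gaps between the levels are easier to judge as ratios. The following Python sketch takes the approximate latencies from the table as given; the disk figure is a lower bound, so the disk ratios are lower bounds as well. The dictionary and function names are assumptions for this example.

```python
# Approximate latencies in nanoseconds, as listed on the slide.
LATENCY_NS = {
    "L1 CPU cache": 1,
    "L2 CPU cache": 5,
    "L3 CPU cache": 20,
    "main memory": 100,
    "disk": 1_000_000,  # the slide states "more than" this value
}


def slowdown(slower: str, faster: str) -> float:
    """How many times longer one access to `slower` takes than one to `faster`."""
    return LATENCY_NS[slower] / LATENCY_NS[faster]


print(slowdown("disk", "main memory"))          # at least this factor
print(slowdown("main memory", "L1 CPU cache"))
```

Moving data from disk to main memory removes a factor of at least 10,000, but a cache miss still costs about 100 times an L1 hit. That remaining gap is why main memory becomes the new scarce resource to program against.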
25. SAP HANA™
SAP HANA
• in-memory software + hardware (HP, IBM, Fujitsu, Cisco, Dell, Hitachi)
• data modeling and data management
• data acquisition
Current Scenarios
• stand-alone data marts
o operational data marts
o analytic data marts
• accelerator for ERP scenarios
o e.g. controlling & profitability analysis (CO-PA)
o transparent, i.e. consumption stays with ERP
• DB for Business Warehouse (BW)
o BW optimized for HANA
o HANA optimizations for BW
Components in the diagram
• consumers: SAP Business Objects tools, other query tools / apps
• interfaces: SQL, BICS, SQL, MDX
• SAP In-Memory Computing Studio
• SAP In-Memory Database: Calculation and Planning Engine, Row & Column Storage
• loading: Real-Time Data Replication, SAP Business Objects Data Services
• sources: SAP Business Suite, SAP NetWeaver Business Warehouse, other data sources
27. Take Aways
1. What are Business Intelligence and Data Warehousing?
2. What are some of the challenges?
3. SAP's efforts and products in that space.
Editor's notes
So, what’s inside HANA? This architecture diagram explains the main components and capabilities. …So, I keep throwing around words like ‘massive’ amounts of data and ‘amazing’ speed. What kinds of scale, speed and improvement are customers seeing?