Data Warehouses: A Whistle-Stop Tour
Cade Roux
How did I get here?
Typical Business "Design"
Typical Goal Scenario
What happened?
What success still looks like - version 1
What success still looks like - version 2
What success should look like
Dimensional Modeling
Normal Form
The intuitive resolution of contemporary design problems simply lies beyond the reach of a single individual’s integrative grasp… … there are bounds to man’s cognitive and creative capacity… … the very frequent failure of individual designers to produce well organized forms suggests strongly that there are limits to the individual designer’s capacity. Christopher Alexander – Notes on the Synthesis of Form, Introduction: The Need for Rationality
Facts and Dimensions
 
Best Practices
Worst Practices
Links
Q&A
Glossary
Glossary (2)
Glossary (3)
Glossary (4)
Glossary (5)
Glossary (6)
Glossary (7)
Dimensional Modelling
Topics
Conformed Dimensions
Some things to keep in mind
NULLs
Performance Issues
Application Logic
 
 

Data Warehousing for Gnocode

Editor's notes

  1. We're going to start out looking at some diagrams of systems as they usually are before a data warehouse initiative is put into place. And we'll see what the target looks like, as well as some typical outcomes. We'll do a quick review of normalized relational database theory. We'll look at dimensional models and their advantages for data warehousing. I’ve got an example which shows how complex reporting can get in even a simple traditional normalized model, and then a possible dimensional model for some reporting. Then I’ll cover a couple best practices/worst practices and we’ll hit the Q&A.
  2. Typically business processes have evolved and there is no coherent data strategy in place. Individual business units are responsible for systems; people need things tied together, and they get it built. I’ve categorized the data reporting needs in four ways:
     - Reporting: no interpretation, effectively sending data/information to people to interpret
     - Analysis: forward-looking or ad-hoc, usually designed to answer questions and what-if scenarios
     - Tactics: operational requirements, business-as-usual, batch processes, dashboards
     - Strategy: dashboards, status, monitoring
     Pros:
     - You only build something if it's necessary
     Cons:
     - Changes to systems are hard to make since there are lots of downstream dependencies
     - Data is fragmented, systems differ, formats differ, conventions differ, retention varies, processes can interfere with production
     In all these cases, I’m assuming regular business processes are creating/updating this data in their own applications. We’re not trying to bring everything into one application, but simply to tie the systems together reliably for read-only usage.
  3. Solution! Make a bottleneck. DEFINE Data Warehouse. DEFINE ETL. Everything in one place (although it might not really be one place in the physical world, there is one logical clearinghouse): standards, data cleansing, consistency, unified security, data management, consolidated storage management. Pros: What? Cons: It's a bottleneck.
  4. It would be nice to think you always get to the promised land, but I'm going to start off by showing the scenarios you still end up getting a lot of the time.
  5. In this scenario, you've got important data which isn't in the warehouse (yet), and you end up building a process which goes around the warehouse for that data.  And it would be nice to think that eventually that data gets in the warehouse and you redefine your process and you're all homogeneous again. But the fact is, these things stick around.  For. a. long. time.
  6. A lot of times, you're still in love with reports.  Because that's what users always ask for. For some reason users like to work.  This is not a problem that programmers have. A programmer would rather spend 40 hours writing a program and never do any more work, than have to spend 1 hour each year consolidating 25 reports into a TPS report for a boss. On the other hand, users will insist on asking for reports and spend 10 hours a week using them (to do what? - well, that's the question that's never asked) instead of having a program written in 20 hours which does it for them with a click (or no-click). The fact is that the question not asked is “what is the user's goal?”  By users and programmers being focused on the requests and not the goal, the system is never going to get insight into the actual user and  business goals.  Why would you expect to get a system to help the enterprise achieve its goals if questions are never asked in terms of goals and only in terms of requirements or current usage patterns?
  7. What stops this from happening? For one, users asking for "things" instead of everyone understanding their goals. Another is having a process which isn't responsive enough to get the users to their changing goals - so users end up using the "things" they have in new and different ways instead of getting to their new goals most efficiently. Also, the old “things” become the new goals instead of understanding why the old “things” were necessary, seeing if they are still necessary, seeing if the needs for them have changed. So in data warehousing, we look for ways to make things easier by accommodating user needs without having to anticipate them. Two big areas where we want to make some progress are accessibility and performance. Accessibility means making the data more accessible for self-reporting and for data analysts who aren’t necessarily programmers. Performance means you want to be able to get very powerful results with modest computing power and without sacrificing the application which is generating and managing all your operational data. Dimensional modeling has certain advantages related to exactly this.
  8. When we talk about dimensional modeling, this is modeling the data in a completely different way than traditional Entity-Relationship Modeling (E-R, or ERM).
  9. Traditional normal forms in relational databases can be summed up as "the key, the whole key, and nothing but the key". Data is organized in relations based on sharing a key.  Relation means that all the data is related to the key. Keys between tables can be used to join the data together. Relational databases are now dominant, but before the 80s, they competed with a number of database systems such as network databases, hierarchical databases, and various special file-based databases. A normalized database is optimized for transactions and aims for a low intellectual distance from the real world, but extracting data from it requires a great deal of effort in all but the most trivial models. I won't dwell on the entire relational theory, but just point out a problem when reporting off normalized data - multiplicity of paths in many-to-many relationships can be hard for ordinary report-writing users to get their heads around.
  10. In 1964, Christopher Alexander made a case for decomposing architectural designs. His argument rested on the need to measure goodness of fit and to solve design problems at different scales in an iterative and composable way, treating context and solution as two sides of the same boundary and measuring the fitness of a design by the absence of misfit. This became the basis for his Pattern Language. This was borrowed for the Design Patterns movement in software. Today we recognize not only the power of design patterns in the software industry, but also the sound software engineering principles of cohesion and coupling (Constantine and Yourdon). The impact of this on data warehousing, especially on data warehouses which rely on dimensional modeling, is that dimensional modeling is born out of pragmatism in the face of data complexity. In all cases, dimensional modeling is a move away from a normalized database to a database optimized for data analysis. I want to make clear that the techniques I’m going to discuss are data modeling for a specific usage pattern – high-performance read-only data analysis. This is not a transactional model and I do not advocate designing any application against a dimensional model unless it’s an analytical platform.
  11. In transforming from a normal form to a dimensional model, the facts which comprise a traditional normal form are allocated to several simple star schemas. Star schemas are particularly simple to query and to optimize for. Typically, the star model means that each fact has a foreign key to every dimension.  There is no possibility for multiplication during joins. It is usually possible to represent the star in a flattened form which is universally equivalent and without any loss of data or need for separate interpretation. A data warehouse system will have multiple fact tables determined by the grain (usually the time) and subject matter. Different stars may share some dimensions. When these dimensions are system-wide or enterprise-wide, they are called conformed dimensions. Date or time are very good examples. Customer might also be an example if you have relatively homogeneous customers. Account might be an example.
  12. I’ve modeled a relatively simple system in a normalized form. This system is a conference/meeting management system. Several times a year, a conference is held, so you need to track classes, venues, sections, attendees, attendance, hotels, invoices etc. For several years we had a system like this to manage training meetings for office managers and in looking for an example for this presentation, I threw this together from memory – it’s only a partial model, I haven’t included hotel accommodations, room sharing or details like employee-office relationships.
  13. What you'll find is that the terms are typically used very vaguely, and it is often difficult to get a lot out of the terminology when applied to a specific system without looking at the actual architecture. The key things about a data warehouse that are always common: - it's a copy of the data - it never changes. Integration is always part of the goal; subject-orientation depends on the compartmentalization mindset. The top-down and bottom-up associations are largely false.
  14. There are a number of tools out there.  Nothing is going to do it off the shelf. There are huge problems dealing with data which can sometimes be relatively freeform, like Excel.
  15. Kimball is largely responsible for popularizing this approach, which relies on remodelling the data to a star model (or sometimes a snowflake model) to simplify reporting and eliminate common user mistakes. Many approaches keep the data in its original normalized form and use parallelism (Teradata), or keep it in a more explicit object/dynamic form and use parallelism (Hadoop, Map/Reduce).
  16. Something especially true about data warehouses is that they don't submit well to top-down development. The users won't know what they need until they can use the data warehouse to learn about the data. So it's a chicken and egg situation and it is perfect for incremental delivery. As long as users will have Excel, they will generate contradictory reporting. Total control is impossible - users WILL work around controls to produce information. The dimensional model works to our advantage here.
  17. When people say bottom-up, they expect that you will handle each department individually and then somehow tie it all up together at the end.  In reality, no one works that way. People in an enterprise know the common data and they know that it needs to be shared.  Modelling the data without knowing how it is going to be used will miss key factors that are best handled in the modelling or in the ETL - for instance, pre-calculating logic. When people say top-down, there is an expectation that you take all the reports and from that deduce all the needs, and that no new insight is going to be gained from the departmental level of work. In practice, you always have to burn the candle at every end - feed back and forth in order to be successful.
  18. You may hear these terms. Like I said, they don't really matter until you've seen a system's architecture and can tell whether it is using these concepts and what they represent.
  19. In this relatively simple example, a calendar entry has a foreign key to a meeting entity. A meeting entity contains references to a meal and a venue. Attendees are linked to meetings (a typical many-to-many relationship). Some data queries can be awkward for reporting purposes because of the larger network of tables and especially the many-to-many or cascading many-to-one relationships which can cause multiplicity of results.
  20. While this example is rather contrived, it shows that although a given meeting attended had only one certain meal and one venue it could be held at, you do not have to go through the entity relationships in order to traverse the entity-relationship-model.
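
The multiplicity problem in the normalized meeting model, and the way a star schema avoids it, can be sketched in a few lines. This is only an illustration under stated assumptions, not the model from the slides: every table and column name here (`meeting`, `calendar_entry`, `attendance`, `fact_attendance`, `dim_date`, `dim_venue`, `dim_attendee`) is hypothetical, the data is made up, and the fact grain is assumed to be one row per attendee per meeting, keyed to the meeting's first day. It uses Python's built-in `sqlite3` module with an in-memory database.

```python
import sqlite3


def build_demo():
    """Create an in-memory database holding both hypothetical models."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
    -- Normalized model: one meeting, two calendar entries, three attendees.
    CREATE TABLE meeting (meeting_id INTEGER PRIMARY KEY, venue TEXT, meal TEXT);
    CREATE TABLE calendar_entry (
        entry_id INTEGER PRIMARY KEY,
        meeting_id INTEGER REFERENCES meeting(meeting_id),
        day TEXT);
    CREATE TABLE attendance (
        meeting_id INTEGER REFERENCES meeting(meeting_id),
        attendee TEXT);
    INSERT INTO meeting VALUES (1, 'Main Hall', 'Lunch');
    INSERT INTO calendar_entry VALUES (1, 1, '2024-05-01'), (2, 1, '2024-05-02');
    INSERT INTO attendance VALUES (1, 'Ann'), (1, 'Bob'), (1, 'Cy');

    -- Star model: one fact row per attendee per meeting (assumed grain).
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, day TEXT);
    CREATE TABLE dim_venue (venue_key INTEGER PRIMARY KEY, venue TEXT);
    CREATE TABLE dim_attendee (attendee_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_attendance (
        date_key INTEGER REFERENCES dim_date(date_key),
        venue_key INTEGER REFERENCES dim_venue(venue_key),
        attendee_key INTEGER REFERENCES dim_attendee(attendee_key));
    INSERT INTO dim_date VALUES (20240501, '2024-05-01');
    INSERT INTO dim_venue VALUES (1, 'Main Hall');
    INSERT INTO dim_attendee VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO fact_attendance VALUES
        (20240501, 1, 1), (20240501, 1, 2), (20240501, 1, 3);
    """)
    return con


def naive_attendee_count(con):
    """Join two one-to-many children of meeting: rows multiply (2 x 3)."""
    row = con.execute("""
        SELECT COUNT(*)
        FROM meeting m
        JOIN calendar_entry c ON c.meeting_id = m.meeting_id
        JOIN attendance a ON a.meeting_id = m.meeting_id
    """).fetchone()
    return row[0]


def star_attendee_count(con):
    """Every join is fact-to-dimension (many-to-one), so rows never multiply."""
    return con.execute("""
        SELECT v.venue, COUNT(*)
        FROM fact_attendance f
        JOIN dim_date d ON d.date_key = f.date_key
        JOIN dim_venue v ON v.venue_key = f.venue_key
        JOIN dim_attendee a ON a.attendee_key = f.attendee_key
        GROUP BY v.venue
    """).fetchall()


if __name__ == "__main__":
    con = build_demo()
    print(naive_attendee_count(con))  # 6, although only 3 people attended
    print(star_attendee_count(con))   # [('Main Hall', 3)]
```

In the normalized query the report writer has to know that calendar entries and attendance are independent children of a meeting and must not be joined in one pass. In the star query the grain of the fact table fixes what one row means, and joining any number of dimensions leaves the row count unchanged.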