Recipes of Data Warehouse and
Business Intelligence
How to think agile
Introduction
A Data Warehouse and Business Intelligence project is long and complex work: it takes many months, often years,
before it sees the light, especially in the case of an Enterprise Data Warehouse.
Indeed, I think we should stop calling it a project and call it a process. And not just any process: it is the
process that transforms data into knowledge, knowledge into prediction, and prediction into action.
Take, for example, the world of CRM (Customer Relationship Management).
The raw customer data, coming from different systems, is transformed into greater knowledge of customers and
their preferences. From that knowledge we can predict their future behavior.
Knowledge of the future allows us to act, changing or adapting the business strategy. This is what the Data
Warehouse and Business Intelligence process makes possible.
You can well imagine how essential this process is for any company that wants to compete in the global market.
Unfortunately, what most discourages investment is time. The time factor is crucial: in today's busy life nobody
wants to wait, and everybody looks for shortcuts to get the desired results in the shortest possible time.
That's why we talk about Agile Data Warehouse.
Agile Data Warehouse and Business Intelligence
What is and what is not the Agile Data Warehouse
So let us start with what it should not be.
It should not be a commercial product or a solution sold by some company.
It should not be a database or a different design of the logical and physical structure. It must be a methodology,
conceived as a design philosophy, to be applied to the entire life cycle of the process.
Build
Ideally, and quite simply, we can divide the Data Warehouse process into three main phases.
I stress "simply" because behind these phases there are several design steps that we know well (requirements
gathering, analysis, programming, ...).
[Figure: the three phases: Build, Test, Maintenance and iterative evolution]
Build: all the activity that leads up to the test phase.
Test: all the verification and control activity, before and after the deployment to production, ending with the
acceptance of the system by the end users.
Maintenance and iterative evolution: all the activity related to the management and growth of the Data Warehouse.
To implement an Agile Data Warehouse process successfully, we have to be "agile" in each of these
components.
We need to be agile in the Build phase. There is little to explain here: the need is easily understood. We must try
to minimize the time taken by the ETL process, which historically is the most time-consuming part of the whole process.
Test
We need to be agile in the Test phase. This step is critical because it is when end users start to see the data
and begin to evaluate the result. Being agile here means giving end users fast response times.
Look out: I am not talking about the response time of a report or a query against the Data Warehouse (I take that
for granted), but about how quickly we respond to the causes of faults and problems. Let me explain.
As stated at the beginning, we have to be agile across the whole life cycle of the Data Warehouse. Many of you will
think that "agile" only means reaching deployment in production quickly: in practice, accelerating the ETL process
as much as possible in order to give end users the Data Mart for their analysis. But this is only part of the story.
In my opinion, the most important moment in which we must be "agile" is AFTER the build phase has been
completed.
The real success of the Data Warehouse will depend on how quickly we answer the end users' questions and their
challenges to the displayed data; on how quickly we identify problems in the loading process, knowing where they
occur and why. And we have to be fast in solving them.
Maintenance and iterative evolution
Finally, we need to be agile in maintenance and iterative evolution. This means responding quickly to requests
to modify the system and, above all, to evolve it. Do not forget that it is a process.
Do not forget that, starting from an initial Data Warehouse, new dimensions of analysis and new Data Marts will be
added little by little over time. Most likely, new information will also have to be added to the dimensions and
facts already built.
I hope it is now clear what we want to achieve when we speak of Agile Data Warehouse. But the essential point
is how to reach these objectives. As mentioned above, you do not need a product, only a good methodology.
Here is some personal advice based on my experience.
We can act on various aspects, many of which I have already discussed on my blog or on my Slideshare.
Agile in the Build - Naming Convention
I never tire of emphasizing the importance of setting a precise naming convention for all the objects of the project.
We must do this at the very start, before creating any kind of structure. It gives us clear and simplified
management of all the logical and physical components (tables, sequences, views, files, documents, etc.) that make
up the Data Warehouse.
And not only that: following a specific naming convention lets us build configuration, creation and control
mechanisms very quickly.
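As a minimal sketch of how a naming convention makes such mechanisms cheap to build, assume (purely for illustration) that every object is named `<AREA>_<SUBJECT>_<KIND>`. The area and kind codes below are hypothetical and are not the rules from the author's naming convention recipes. Once names follow a pattern like this, a few lines of code can classify objects and generate control statements from the names alone.

```python
# Hypothetical convention: <AREA>_<SUBJECT>_<KIND>, e.g. STA_CUSTOMER_T.
# The codes below are illustrative assumptions, not the author's actual rules.
AREAS = {"STA": "staging area", "DIM": "dimension", "FAC": "fact"}
KINDS = {"T": "table", "V": "view", "S": "sequence"}


def parse_name(object_name: str) -> dict:
    """Split an object name into its parts, or raise if it breaks the convention."""
    parts = object_name.upper().split("_")
    if len(parts) < 3 or parts[0] not in AREAS or parts[-1] not in KINDS:
        raise ValueError(f"{object_name!r} does not follow the naming convention")
    return {
        "area": AREAS[parts[0]],
        "subject": "_".join(parts[1:-1]),
        "kind": KINDS[parts[-1]],
    }


def row_count_checks(object_names: list) -> list:
    """Generate one row-count control query per table, driven only by the names."""
    return [
        f"SELECT COUNT(*) FROM {name}"
        for name in object_names
        if parse_name(name)["kind"] == "table"
    ]


if __name__ == "__main__":
    names = ["STA_CUSTOMER_T", "DIM_CUSTOMER_T", "DIM_CUSTOMER_V"]
    for query in row_count_checks(names):
        print(query)
```

The same parsing step could just as well drive the creation of objects or the population of configuration tables. The point is only that nothing has to be looked up by hand.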
Agile in the Build - Reduction of the computing chain
Another point to consider is the modelling philosophy of the Data Warehouse. Indeed, it is probably the first
thing to consider. I will not enter the historical debate of Inmon versus Kimball. Both approaches are valid,
with their strengths and weaknesses.
But if we speak of "agile", for me the choice of the Kimball philosophy is crucial. Anything that shortens the
computational and structural chain of the ETL process is undoubtedly an important factor.
I think that having an ODS (Operational Data Store), which is basically a historicized duplicate of almost all the
structures already present in the Staging Area, placed before the structures dedicated to analysis, is an activity
that costs time and money.
Agile in the Build - Simplification of data types
Another way to be "agile" follows from the general rule of always thinking in a simplified way.
We need to reduce to a minimum the data types (in the database sense) used in the Data Warehouse. An
RDBMS such as Oracle, and the same goes for the other vendors, has more than 30 different data types
(NUMBER of various kinds, CHAR, VARCHAR, DATE, etc.): we cannot afford this variety of types inside
the Data Warehouse. Handling and converting them brings too many complications.
Think of the simplicity of the source files: except for some special cases, they are all text files.
Whether their records are fixed-length or terminated by a delimiter, they are always streams of data that you can
easily open with any text editor.
The ultimate in simplicity. My advice is to keep this simplicity almost intact inside the Data Warehouse, using only
two data types:
• Numeric - only to represent amounts, quantities, percentages, etc.
• Alphanumeric - for all other data.
We can use the DATE data type only for technical fields, such as insertion date, last update date, etc.
Even if in the source systems the data representing codes, indicators, flags, etc. are numeric, we must treat
them as alphanumeric inside the Data Warehouse.
Transform all the data that represent dates into alphanumeric values in the standard format YYYYMMDD.
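The two-type rule can be sketched as three small conversion functions. This is only an illustration: the function names and the `DD/MM/YYYY` source format are assumptions, and a real load would apply the equivalent conversions in the database language.

```python
from datetime import datetime
from decimal import Decimal


def to_alphanumeric_date(value: str, source_format: str) -> str:
    """Convert a source date string to the alphanumeric YYYYMMDD form."""
    return datetime.strptime(value, source_format).strftime("%Y%m%d")


def to_alphanumeric_code(value) -> str:
    """Store codes, indicators and flags as text, even when the source is numeric."""
    return str(value).strip()


def to_numeric(value: str) -> Decimal:
    """Keep a numeric type only for amounts, quantities and percentages."""
    return Decimal(value.strip())


if __name__ == "__main__":
    print(to_alphanumeric_date("31/12/2014", "%d/%m/%Y"))  # 20141231
    print(to_alphanumeric_code(7))                          # 7
    print(to_numeric(" 1250.50 "))                          # 1250.50
```

One practical advantage of the YYYYMMDD form is that the plain string ordering of the values is also their chronological ordering.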
Agile in the Build - Sequentiality
We must try to think (and in 90% of cases we can) of every component of the process as connected to the
next, so that their sequential execution leads to the final loading of the Data Warehouse.
Mind you, I am not saying that you cannot work in parallel. But identifying which components are so completely
independent of each other that they can run in parallel is not an easy task, not to mention all the care needed
to synchronize them.
Parallelism also requires specific hardware configurations and specific database settings to actually deliver a
performance boost which, I speak from experience, is not guaranteed.
Certainly, the dimension tables can be loaded in parallel (if there are no logical connections between them),
but in an "agile" world we must try to think in a simple and sequential way.
Do not forget that the ETL process is, by its nature, inherently sequential.
You cannot load a Level 1 Data Mart before you have loaded those of Level 0. You cannot load a Level 0 Data Mart
if you have not loaded the dimensions, which, in turn, cannot be loaded until you have loaded the
staging area tables, and so on.
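The loading order described above can be sketched as a plain ordered list of steps executed one after the other. The step names and the placeholder functions are hypothetical; each would stand for a real loading module. The sketch assumes the simplest policy, namely stopping the chain at the first failure so that later steps never run on incomplete data.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_sequentially(steps: List[Step]) -> List[str]:
    """Run each step in order and stop at the first failure.

    Returns the names of the steps that completed.
    """
    completed = []
    for name, action in steps:
        try:
            action()
        except Exception as error:
            print(f"{name} failed ({error}); later steps are skipped")
            break
        completed.append(name)
    return completed


def load_ok() -> None:
    """Placeholder for a real loading module."""


def load_broken() -> None:
    """Placeholder for a module that fails."""
    raise RuntimeError("source file missing")


if __name__ == "__main__":
    chain = [
        ("staging_area", load_ok),
        ("dimensions", load_ok),
        ("data_mart_level_0", load_ok),
        ("data_mart_level_1", load_ok),
    ]
    print(run_sequentially(chain))
```

Because the order of the list is the order of execution, there is no synchronization logic to write or debug, which is the simplicity the sequential approach is after.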
Agile in the Build - Reduction of the external tools
Whether to use a tool for the implementation of the Data Warehouse, and which one, is a design choice that
depends on many factors.
Each company has its own rules and, above all, a budget. If you have plenty of money available to buy the tools
(and, especially, a lot of time to learn how to use them), there is no problem.
If your budget is low, my advice is to use as few tools as possible. Often we tend to look for
specific tools for specific jobs, such as reconciliation, process control, quality control, job scheduling, etc.
Do not forget that each of them has its own structures, which then need to communicate with all the other
structures, increasing the complexity of the entire system.
My advice is to invest much more in a very good knowledge of the database programming language,
a good editor and a good interface to access the database. These three elements will save us a lot of
time.
Agile in the Test – Configuration and log
To be agile at this stage, we have to build a very accurate control architecture. I have already written a lot about
how to report system faults automatically and how to keep control of the modules of an ETL process. My
advice is to always have this magical pair of structures (tables): configuration and log. At a minimum:
Configuration tables of the Staging Area - Logging tables of the Staging Area loading.
Configuration tables of the dimensions - Logging tables of the dimensions loading.
Configuration tables of the facts - Logging tables of the facts loading.
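A minimal sketch of one such pair, for the Staging Area only, is shown here with Python's built-in `sqlite3` module standing in for the real database. The table names (`sta_cfg`, `sta_log`), their columns and the loader function are hypothetical and are not taken from the author's recipes. The configuration table says what should run, and the log table records what actually happened.

```python
import sqlite3
from datetime import datetime


def create_structures(connection):
    """Create one configuration/log pair (hypothetical names, staging area only)."""
    connection.executescript("""
        CREATE TABLE sta_cfg (module_name TEXT PRIMARY KEY, source_file TEXT, enabled INTEGER);
        CREATE TABLE sta_log (module_name TEXT, started_at TEXT, status TEXT, message TEXT);
    """)


def run_enabled_modules(connection, loaders):
    """Run every module enabled in the configuration table and log the outcome."""
    rows = connection.execute(
        "SELECT module_name, source_file FROM sta_cfg WHERE enabled = 1 ORDER BY module_name"
    ).fetchall()
    for module_name, source_file in rows:
        started_at = datetime.now().isoformat(timespec="seconds")
        try:
            loaders[module_name](source_file)
            status, message = "OK", ""
        except Exception as error:
            status, message = "ERROR", str(error)
        connection.execute(
            "INSERT INTO sta_log VALUES (?, ?, ?, ?)",
            (module_name, started_at, status, message),
        )


def failing_loader(source_file):
    """Placeholder loader that simulates a missing source file."""
    raise FileNotFoundError(f"{source_file} not found")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_structures(conn)
    conn.execute("INSERT INTO sta_cfg VALUES ('load_customers', 'customers.txt', 1)")
    conn.execute("INSERT INTO sta_cfg VALUES ('load_orders', 'orders.txt', 0)")
    run_enabled_modules(conn, {"load_customers": failing_loader})
    print(conn.execute("SELECT module_name, status, message FROM sta_log").fetchall())
```

When an end user questions the data, the log table answers "where and why" with a single query instead of a search through the code.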
Agile in the Test – Data Lineage
Having a Data Lineage structure means being able to follow the whole path of a piece of information, as seen by
the end user, back to the origin of the data. It is complicated and not always possible (think of calculated data),
but it is essential to prove the correctness of the loading process. To put it simply, we must be able to prove
that a problem was already present in the feeding source. So you need some metadata tables to manage the data
lineage.
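As a rough sketch of what such metadata makes possible, assume a lineage structure in which each target column records the column it was loaded from. Every object name below is hypothetical, and a dictionary stands in for the metadata tables. Walking the links backwards gives the path from the Data Mart to the feeding source.

```python
# Hypothetical lineage metadata: each target column points to the column it was loaded from.
# All object names are illustrative assumptions; a real system would keep this in metadata tables.
LINEAGE = {
    "DM_SALES.AMOUNT": "FAC_SALES.AMOUNT",
    "FAC_SALES.AMOUNT": "STA_ORDERS.ORDER_AMOUNT",
    "STA_ORDERS.ORDER_AMOUNT": "orders.txt:field 7",
}


def trace_to_source(column: str, lineage: dict) -> list:
    """Follow the lineage links from a reported column back to the feeding source."""
    path = [column]
    while path[-1] in lineage:
        next_step = lineage[path[-1]]
        if next_step in path:  # guard against a cycle in the metadata
            raise ValueError(f"cycle in lineage metadata at {next_step}")
        path.append(next_step)
    return path


if __name__ == "__main__":
    for step in trace_to_source("DM_SALES.AMOUNT", LINEAGE):
        print(step)
```

Calculated data would need more than one source per target, which this single-link sketch deliberately leaves out.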
Agile in the Maintenance and Iterative evolution – Modularity (and uncertainty)
To be agile at this stage we have to be modular. It is uncertainty that forces us to be modular.
Uncertainty not in the sense that we are allowed to be unsure how to proceed, but in the sense of being aware
that anything may change. Let me explain.
In a Data Warehouse process, it is rare for all the logic to be well defined from the start.
This is not necessarily due to shortcomings in the analysis (which sometimes exist) or to errors in the
requirements gathering.
The problem is that the logic evolves as the work progresses. I think it is a natural process, linked to the
complexity of the system, and we have to live with it without drama. There is no guarantee that the data provided
by the source systems is exactly what the analysis expected, in both size and content.
This is often discovered late, when the data begins to be analyzed (and therefore after it has been loaded).
Business users change their minds, and sometimes the business strategy changes. It turns out, later, that other
data, not foreseen by the analysis, was also needed. Users want to make comparisons with other data that were
not foreseen, and so on.
There is a very eloquent saying about the needs of end users: "I will know when I will see." In other words: I'll
know what I want when I see it. Absolutely true.
This requires us to continuously modify the programs to meet the new design requirements.
Logic (and programs) to add, to change, to remove; logic to be added now but removed again in two months.
In short, anyone with a little experience will certainly have faced these situations.
To limit the consequences of uncertainty, the principle of modularity is essential. That's why every
business need must correspond to a single processing unit, however simple or complex it is.
If I load a Staging Area table, there must be modules that do it, and do only that.
If I have to run a reconciliation check between three key performance indicators, there must be a module that does
it, and does just that.
When it turns out that there are 4 KPIs to check, we will add new modules. If I have to add the calculation of the
price of a derivative financial product, there must be a module that does it; it does not matter if I have that
module developed by a programmer who lives in another part of the world. The important thing is how immediately I
can insert it into the system. In this way I do not pretend to eliminate uncertainty but, with modularity, I manage it better.
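The "one business need, one module" rule can be sketched with a small registry. The KPI names and the check rules are invented for illustration only. Each check is registered under the need it serves, so adding a fourth KPI check means adding one function, and withdrawing a rule two months later means removing one entry.

```python
from typing import Callable, Dict

# One entry per business need; each module does that one thing only.
MODULES: Dict[str, Callable[[dict], bool]] = {}


def module(name: str):
    """Register a processing unit under the business need it serves."""
    def register(function):
        MODULES[name] = function
        return function
    return register


@module("check_kpi_total")
def check_kpi_total(kpis: dict) -> bool:
    """Hypothetical reconciliation: revenue must equal margin plus cost."""
    return kpis["revenue"] == kpis["margin"] + kpis["cost"]


@module("check_kpi_discount")
def check_kpi_discount(kpis: dict) -> bool:
    """A fourth KPI arrives later: a new module is added, the others are untouched."""
    return kpis["discount"] <= kpis["revenue"]


def run_all(kpis: dict) -> Dict[str, bool]:
    """Run every registered module and report the result of each one."""
    return {name: check(kpis) for name, check in MODULES.items()}


if __name__ == "__main__":
    print(run_all({"revenue": 100, "margin": 30, "cost": 70, "discount": 5}))
```

Because the modules share nothing but the registry, one of them can be written by someone else, elsewhere, and dropped in without touching the rest.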
Agile in the Maintenance and Iterative evolution – Separation between business and infrastructure
The last tip is a clear separation between business and infrastructure.
You have already seen it in action in some of my previous articles.
The simple techniques presented there for messaging and control are independent of the context. They are
infrastructure, not business.
Whether the business behind the Data Warehouse is finance, automotive or large retail chains does not affect the
use of those techniques in any way.
We must use the configuration and log tables in a way that is completely independent of the context in which they work.
This allows us, for example, to add a new Data Mart by focusing exclusively on the business logic of that Data Mart,
reusing the infrastructure software for process monitoring.
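The separation between business and infrastructure can be illustrated with a monitoring wrapper that knows nothing about what it monitors. This is a sketch under assumptions: the in-memory `LOG` list stands in for the log tables, and the two loading functions are invented examples from unrelated business contexts.

```python
import functools

# Infrastructure: knows nothing about the business it monitors.
LOG = []


def monitored(function):
    """Record the outcome of any processing unit, whatever its business content."""
    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        try:
            result = function(*args, **kwargs)
        except Exception as error:
            LOG.append((function.__name__, "ERROR", str(error)))
            raise
        LOG.append((function.__name__, "OK", ""))
        return result
    return wrapper


# Business: hypothetical loading rules for two unrelated contexts.
@monitored
def load_bank_positions(rows):
    """Keep only the positions with a non-zero amount."""
    return [row for row in rows if row["amount"] != 0]


@monitored
def load_retail_sales(rows):
    """Keep only the sales with a positive quantity."""
    return [row for row in rows if row["quantity"] > 0]


if __name__ == "__main__":
    load_bank_positions([{"amount": 0}, {"amount": 5}])
    load_retail_sales([{"quantity": 2}])
    print(LOG)
```

Adding a new Data Mart then means writing only the new business function. The monitoring code is reused unchanged.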
[Summary figure: the techniques grouped by phase]
Agile in the Build: Naming Convention, Reduction of the computing chain, Simplification of data types, Sequentiality, Reduction of the external tools
Agile in the Test: Configuration and log, Data Lineage
Agile in the Maintenance and Iterative evolution: Modularity (and uncertainty), Separation between business and infrastructure
Conclusion
Being agile in a Data Warehouse and Business Intelligence process (or project) is possible. You just have to be
guided by a sound methodology, which I have tried to summarize in the points described above.
References
http://www.slideshare.net/jackbim/recipe-9-techniques-to-control-the-processing-units-in-the-etl-process
http://www.slideshare.net/jackbim/recipes-6-of-data-warehouse-naming-convention-techniques
http://www.slideshare.net/jackbim/recipes-8-the-naming-convention-part-2
http://www.slideshare.net/jackbim/recipe-7-of-data-warehouse-a-messaging-system-for-oracle-dwh-1
http://www.slideshare.net/jackbim/recipe-7-of-data-warehouse-a-messaging-system-for-oracle-dwh-2

Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisiNote di Data Warehouse e Business Intelligence - Le Dimensioni di analisi
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi
 

Dernier

Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMKumar Satyam
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAnitaRaj43
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 

Dernier (20)

Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by Anitaraj
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 

Recipes 11 of Data Warehouse and Business Intelligence - How to think agile

  • 1. Recipes of Data Warehouse and Business Intelligence: How to think agile
  • 2. Agile Data Warehouse and Business Intelligence
    Introduction
    A Data Warehouse and Business Intelligence project is long, complex work that takes many months, often years, to see the light, especially in the case of an Enterprise Data Warehouse. Indeed, I think we should stop calling it a project and call it a process. And not just any process: it is the process that transforms data into knowledge, knowledge into prediction, and prediction into action.
    Take CRM (Customer Relationship Management) as an example. The raw customer data, coming from different systems, is transformed into a deeper knowledge of the customers and their preferences. From that knowledge we can predict their future behavior, and that prediction lets us act, changing or adapting the business strategy. This is what the Data Warehouse and Business Intelligence process makes possible.
    You can well imagine how essential this process is for any company that wants to compete in the global market. Unfortunately, what scares investors most is time. The time factor is crucial: in today's busy life nobody wants to wait, and everybody looks for shortcuts to get the desired results as quickly as possible. That is why we talk about the Agile Data Warehouse.
    What the Agile Data Warehouse is and what it is not
    Let us start with what it should not be. It should not be a commercial product or a solution sold by some company. It should not be a database, or a different design of the logical and physical structures. It must be a methodology, conceived as a design philosophy, to be applied to the entire life cycle of the process.
  • 3. Build
    Ideally, and quite simply, we can divide the Data Warehouse process into three main phases. I stress "simply" because behind these phases there are several design steps that we know well (requirements gathering, analysis, programming, ...).
    Build: all the activity that leads up to the test phase.
    Test: all the verification and control activity, before and after the deployment to production, ending with the acceptance of the system by the end users.
    Maintenance and iterative evolution: all the activity relating to the management and growth of the Data Warehouse.
    To implement an Agile Data Warehouse process successfully, we have to be "agile" in each of these components.
    We need to be agile in the Build phase. There is little to explain; the need is easy to understand. We must try to minimize the time taken by the ETL process which, historically, is the most time-consuming phase of the whole process.
  • 4. Test
    We need to be agile in the Test phase. This step is critical because it is when end users start to see the data and to evaluate the result. Being agile here means giving end users fast response times. Watch out: I am not talking about the response time of a report or of a query against the Data Warehouse (I take that for granted), but about the time needed to respond to faults and problems and to find their causes.
    Let me explain. As stated at the beginning, we have to be agile throughout the whole life cycle of the Data Warehouse. Many of you will think that "agile" only means reaching the deployment to production quickly; in practice, speeding up the ETL process as much as possible in order to give end users the Data Marts for their analysis. But this is only part of the story. In my opinion, the most important moment to be "agile" is AFTER the build phase is over.
    The real success of the Data Warehouse will depend on how quickly we answer the end users' questions and their objections to the data displayed; on how quickly we identify problems in the loading process, knowing where they occur and why; and on how quickly we solve them.
    Maintenance and iterative evolution
    Finally, we need to be agile in maintenance and iterative evolution. This means responding quickly to requests to modify the system and, above all, to evolve it. Do not forget that it is a process. Starting from an initial Data Warehouse, new dimensions of analysis and new Data Marts will be added little by little over time, and it will most likely be necessary to add new information to the dimensions and facts already built.
  • 5. I hope it is now clear what we want to achieve when we speak of the Agile Data Warehouse. The essential point is how to reach these objectives. As mentioned above, you do not need a product, only a good methodology. Here is some personal advice based on my experience. We can act on various aspects, many of which I have already discussed on my blog or on my SlideShare.
    Agile in the Build - Naming Convention
    I never tire of emphasizing the importance of setting a precise naming convention for all the objects of the project. We must do this at the very start, before creating any kind of structure. It gives us clear and simplified management of all the logical and physical components (tables, sequences, views, files, documents, etc.) that make up the Data Warehouse. Not only that: following a specific naming convention lets us build configuration, creation and control mechanisms very quickly.
    Agile in the Build - Reduction of the computing chain
    Another point to consider is the modelling philosophy of the Data Warehouse; indeed, it is probably the first thing to consider. I will not enter the historical Inmon-versus-Kimball debate. Both approaches are valid, with their strengths and weaknesses. But if we speak of "agile", for me the choice of the Kimball philosophy is crucial. Anything that shortens the computational and structural chain of the ETL process is undoubtedly an important factor. I think that building an ODS (Operational Data Store), which is basically a historicised duplicate of almost all the structures already present in the Staging Area, placed before the structures dedicated to analysis, costs time and money.
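As a rough illustration of the naming-convention point above, the following Python sketch shows how object names that follow a fixed pattern can be parsed and reused by control mechanisms. The pattern itself (`<project>_<area>_<name>_<type>`), the area codes `STA`/`DIM`/`FCT` and the helper names are assumptions invented for this example; they are not the convention defined in the author's other recipes.

```python
import re

# Hypothetical convention (an assumption for this sketch, not the author's):
#   <project>_<area>_<name>_<type>
#   area: STA = staging, DIM = dimension, FCT = fact
#   type: T = table, V = view, S = sequence
NAME_PATTERN = re.compile(
    r"^(?P<project>[A-Z]{3})_"
    r"(?P<area>STA|DIM|FCT)_"
    r"(?P<name>[A-Z0-9]+(?:_[A-Z0-9]+)*)_"
    r"(?P<kind>T|V|S)$"
)


def parse_object_name(name):
    """Split a conforming object name into its parts, or return None."""
    match = NAME_PATTERN.match(name)
    return match.groupdict() if match else None


def log_table_for(name):
    """Derive the name of the log table for the area an object belongs to."""
    parts = parse_object_name(name)
    if parts is None:
        raise ValueError(f"{name!r} does not follow the naming convention")
    return f"{parts['project']}_{parts['area']}_LOG_T"
```

Because every name carries its project, area and object type, a control routine can locate related objects (here, a log table) without any extra lookup table.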
  • 6. Agile in the Build - Simplification of data types
    Another way to be "agile" follows from the general rule of always thinking in a simplified way. We need to reduce to a minimum the data types (in the database sense) used in the Data Warehouse. An RDBMS such as Oracle, and the same goes for other vendors, has more than 30 different data types (NUMBER of various kinds, CHAR, VARCHAR, DATE, etc.). We cannot afford this variety of types inside the Data Warehouse: it brings too many complications in handling and conversion.
    Think of the simplicity of the source files: except for some special cases, they are all text files. Whether they have fixed-length records or delimiters, they are always streams of data that you can open with any text editor. The ultimate in simplicity. My advice is to keep this simplicity almost intact inside the Data Warehouse by using only two data types:
    • Numeric - only for amounts, quantities, percentages, etc.
    • Alphanumeric - for all other data.
    We can use the DATE data type only for technical fields, such as insertion date, last update date, etc. Even if the data representing codes, indicators, flags, etc. are numeric in the source systems, we must treat them as alphanumeric inside the Data Warehouse. Transform all data that represent dates into alphanumeric values in the standard format YYYYMMDD.
    Agile in the Build - Sequentiality
    We must try to think, and in 90% of cases we can, of every component of the process as connected to the next, so that their sequential execution leads to the final loading of the Data Warehouse.
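The two-type rule from the data types section above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the field names and the source date format `%d/%m/%Y` are hypothetical, and a real load would apply the same conversions in the database language.

```python
from datetime import date, datetime
from decimal import Decimal


def to_dwh_date(value, source_format="%d/%m/%Y"):
    """Return a business date as an alphanumeric YYYYMMDD string.

    source_format is an assumption; use whatever the feeding system sends.
    """
    if isinstance(value, date):  # datetime is a subclass of date
        return value.strftime("%Y%m%d")
    return datetime.strptime(value.strip(), source_format).strftime("%Y%m%d")


def to_dwh_code(value):
    """Codes, flags and indicators become alphanumeric, even if numeric at source."""
    return str(value).strip()


def simplify_row(row, date_fields, numeric_fields):
    """Reduce a source row to the two types: numeric and alphanumeric."""
    simplified = {}
    for field, value in row.items():
        if field in date_fields:
            simplified[field] = to_dwh_date(value)
        elif field in numeric_fields:
            # Only amounts, quantities and percentages stay numeric.
            simplified[field] = Decimal(str(value))
        else:
            simplified[field] = to_dwh_code(value)
    return simplified
```

For instance, a numeric customer code `42` is stored as the string `"42"`, while an amount such as `"10.50"` remains a number.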
  • 7. Mind you, I am not saying that you cannot work in parallel. But identifying which components are so independent of each other that they can run in parallel is not an easy task, not to mention everything needed to synchronize them. Parallelism also requires specific hardware configurations and specific database settings to actually deliver a performance boost which, I speak from experience, is not guaranteed. Certainly, the dimension tables may be loaded in parallel (if there are no logical dependencies between them), but in an "agile" world we must try to think in a simple and sequential way.
    Do not forget that the ETL process is inherently sequential. You cannot load a level 1 Data Mart before you have loaded the level 0 ones. You cannot load a level 0 Data Mart if you have not loaded the dimensions, which in turn cannot be loaded until the Staging Area tables have been loaded, and so on.
    Agile in the Build - Reduction of the external tools
    Whether to use a tool for the implementation of the Data Warehouse, and which one, is a design choice that depends on many factors. Each company has its own rules and, above all, its own budget. If you have plenty of money to buy the tools (and, above all, plenty of time to learn how to use them), there is no problem. If your budget is low, my advice is to use as few tools as possible. We often tend to look for specific tools to do specific jobs such as reconciliation, process control, quality control, job scheduling, etc. Do not forget that each of them has its own structures, which then need to communicate with all the other structures, increasing the complexity of the entire system. In my opinion it is better to invest much more in a very good knowledge of the programming language of the database, a good editor and a good interface to access the database. These three elements will save us a lot of time.
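The loading order described above (Staging Area, then dimensions, then level 0 Data Marts, then level 1) can be expressed as a minimal sequential runner. The step names and the `run_sequentially` function are hypothetical, chosen only for this sketch; the point is that a failed step stops everything that depends on it.

```python
# Loading order taken from the text; the identifiers are assumptions.
LOAD_ORDER = [
    "staging_area",
    "dimensions",
    "data_mart_level_0",
    "data_mart_level_1",
]


def run_sequentially(loaders, order=LOAD_ORDER):
    """Run each loader in order and stop at the first failure.

    loaders maps a step name to a callable that performs the load.
    Returns (completed step names, error text or None).
    """
    completed = []
    for name in order:
        try:
            loaders[name]()
        except Exception as exc:
            # Later steps depend on this one, so nothing else is run.
            return completed, f"{name}: {exc}"
        completed.append(name)
    return completed, None
```

With this shape there is no synchronization to reason about: if the dimensions fail to load, the Data Marts are simply never attempted.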
  • 8. Agile in the Test – Configuration and log
    To be agile at this stage, we have to build a very accurate control architecture. I have already written a lot about how to report system faults automatically and how to keep control of the modules of an ETL process. My advice is to always have this magical pair of structures (tables): configuration and log. At a minimum:
    • Configuration tables of the Staging Area - Logging tables of the Staging Area loading.
    • Configuration tables of the dimensions - Logging tables of the dimensions loading.
    • Configuration tables of the facts - Logging tables of the facts loading.
    Agile in the Test – Data Lineage
    Having a Data Lineage structure means being able to trace the information seen by the end user all the way back to the origin of the data. It is complicated and not always possible (think of calculated data), but it is essential to prove the correctness of the loading process. To put it simply, we must be able to prove that a problem was already present in the feeding source. So you need some metadata tables to manage the data lineage.
    Agile in the Maintenance and Iterative evolution – Modularity (and uncertainty)
    To be agile at this stage we have to be modular. It is uncertainty that forces us to be modular. Uncertainty not in the sense that we are allowed to be unsure how to proceed, but in the sense of being aware that anything may change. Let me explain. In a Data Warehouse process, it is rare for all the logic to be well defined from the start.
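To make the configuration-and-log pair described above more concrete, here is a minimal sketch for the Staging Area using Python's built-in `sqlite3` module. The table names (`sta_cfg`, `sta_log`), their columns and the function names are assumptions made for illustration; the author's own structures, described in his other recipes, target Oracle and may look quite different.

```python
import sqlite3
from datetime import datetime, timezone


def create_control_tables(conn):
    """Create a hypothetical configuration/log pair for the Staging Area."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS sta_cfg (
            module_name TEXT PRIMARY KEY,
            run_order   INTEGER NOT NULL,
            enabled     TEXT NOT NULL DEFAULT 'Y'
        );
        CREATE TABLE IF NOT EXISTS sta_log (
            module_name TEXT NOT NULL,
            status      TEXT NOT NULL,
            message     TEXT,
            logged_at   TEXT NOT NULL
        );
        """
    )


def run_configured_modules(conn, modules):
    """Run the enabled modules in configured order, logging every outcome.

    modules maps a configured module name to a callable.
    """
    rows = conn.execute(
        "SELECT module_name FROM sta_cfg WHERE enabled = 'Y' ORDER BY run_order"
    ).fetchall()
    for (name,) in rows:
        try:
            modules[name]()
            status, message = "OK", None
        except Exception as exc:
            # The failure is recorded so its cause can be found quickly.
            status, message = "ERROR", str(exc)
        conn.execute(
            "INSERT INTO sta_log VALUES (?, ?, ?, ?)",
            (name, status, message, datetime.now(timezone.utc).isoformat()),
        )
    conn.commit()
```

When an end user disputes a figure, the log table answers "which module ran, when, and with what result" without anyone having to read the ETL code. Nothing in these two tables refers to a specific business area, so the same pair could be repeated for dimensions and facts.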
  • 9. We should not necessarily think of shortcomings in the analysis (which sometimes exist) or of errors in the requirements gathering. The problem is that the logic evolves as the work progresses. I think it is a natural process, linked to the complexity of the system, which we have to live with without drama.
    The source systems provide data that may not be exactly what the analysis expected, in both size and content. This is often discovered later, when the data begin to be analyzed (that is, after loading them). Business users change their minds; sometimes the business strategy changes. It turns out, later, that another piece of data not foreseen by the analysis was also needed. Users want to make comparisons with other data that were not planned, etc.
    There is a very eloquent saying about the needs of end users: "I will know when I will see." That is: I'll know what I want when I see it. Absolutely true.
    This forces us to modify the programs continuously to meet the new requirements. Logic (and programs) to be added, changed, removed; logic to be added now but removed in two months. In short, anyone with a little experience will certainly have faced these situations.
    To limit the consequences of uncertainty, the principle of modularity is essential. That is why every business need must correspond to a single processing unit, however simple or complex. If I load a Staging Area table, there must be modules that do it, and do only that. If I have to run a reconciliation check between three key performance indicators, there must be a module that does it, and does just that. When it turns out that there are 4 KPIs to check, we will add new modules.
    If I have to add the calculation of the price of a derivative financial product, there must be a module that does it; it does not matter if I have that module developed by a programmer who lives in another part of the world. The important thing is how immediately I can plug it into the system. In this way I do not claim to eliminate uncertainty but, with modularity, I manage it better.
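One possible way to picture "one business need, one processing unit" is a small module registry, sketched below in Python. The registry, the decorator and the KPI check names are all hypothetical; the idea is only that a new requirement is met by adding a module, not by editing the existing ones.

```python
# Hypothetical registry: one entry per business need.
MODULES = {}


def module(name):
    """Register a processing unit under a unique name."""
    def register(func):
        if name in MODULES:
            raise ValueError(f"module {name!r} is already registered")
        MODULES[name] = func
        return func
    return register


@module("check_kpi_revenue")
def check_kpi_revenue(data):
    """Does one thing only: reconcile revenue between source and DWH."""
    return data["revenue_dwh"] == data["revenue_source"]


@module("check_kpi_orders")
def check_kpi_orders(data):
    """Does one thing only: reconcile the order count."""
    return data["orders_dwh"] == data["orders_source"]


def run_modules(names, data):
    """Run the requested modules and collect each result by name."""
    return {name: MODULES[name](data) for name in names}
```

When a further KPI has to be checked, a new function is registered under a new name and added to the list of modules to run; the existing checks are left untouched, and a check that is no longer needed is simply dropped from the list.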
  • 10. Agile in the Maintenance and Iterative evolution – Separation between business and infrastructure
    The last tip is a clear separation between business and infrastructure. You have already seen it in action in some of my previous articles. The simple techniques presented there for messaging and control are independent of the context. They are infrastructure, not business. Whether the business behind the Data Warehouse is finance, automotive or large retail chains does not affect the use of those techniques in any way. We must use configuration and log tables that are absolutely independent of the context in which they work. This allows us, for example, to add a new Data Mart while focusing exclusively on the business related to that Data Mart, reusing the infrastructure software for process monitoring.
  • 11. Agile in the Build: Naming Convention, Reduction of the computing chain, Simplification of data types, Sequentiality, Reduction of the external tools
    Agile in the Test: Configuration and log, Data Lineage
    Agile in the Maintenance and Iterative evolution: Modularity (and uncertainty), Separation between business and infrastructure
  • 12. Conclusion
    Being agile in a Data Warehouse and Business Intelligence process (or project) is possible. You just have to be guided by a sound methodology, which I have tried to summarize in the points described.
    References
    http://www.slideshare.net/jackbim/recipe-9-techniques-to-control-the-processing-units-in-the-etl-process
    http://www.slideshare.net/jackbim/recipes-6-of-data-warehouse-naming-convention-techniques
    http://www.slideshare.net/jackbim/recipes-8-the-naming-convention-part-2
    http://www.slideshare.net/jackbim/recipe-7-of-data-warehouse-a-messaging-system-for-oracle-dwh-1
    http://www.slideshare.net/jackbim/recipe-7-of-data-warehouse-a-messaging-system-for-oracle-dwh-2