Recipes of Data Warehouse and
Business Intelligence
Naming convention techniques
(part 2)
Introduction (1)
• In part 1, the naming convention techniques were applied to tables, which are certainly the
basic entities of an information system. We gave a name to these entities in their simplest form
(table), in their aggregate form (materialized view or summary table) and in their logical form
(view).
• It was emphasized that these techniques can be applied to any logical/physical entity of a Data
Warehouse. So, I wish to complete these thoughts with three goals in mind:
– Completeness: Tables are basic but, alone, they do not constitute a Data Warehouse. There
must be access rights to view them, programs to load them, indexes to speed up access to them,
and constraints to ensure data integrity. Programs, rights, indexes and constraints must also be
created in accordance with the naming convention. Tables are made of attributes, and
attributes have names too. We will discuss them as well.
– Pragmatism: Only by seeing the techniques applied to a real case can we recognize their
usefulness, so we will examine and name all the other entities involved, using the sample
Data Warehouse.
– Knowledge: Some of the entities to be named are specific to Oracle, and this is a good
opportunity to describe them briefly.
• The choices made for the naming convention are only guidelines. They are not a dogma. The
convention can be discussed and changed according to our needs and to our particular view of
the system.
Introduction (2)
• The main objective was to draw attention to the importance and usefulness of the Naming
Convention.
• Another point I wish to emphasize is that the convention is "Database Administrator oriented"
and not "Business oriented". This means that the names chosen, for example, for the tables are
physical names, which only the DBA sees. The "rest of the world" should not see those names,
but rather the "logical" names exposed through synonyms and/or views.
The users of the Naming Convention
• Based on some useful questions I received, I want to clarify this point. Take for example the
EDW_COM_CDI_CUST_DIT entity. This entity represents the customers (CUST) dimension
table (DIT) of the conformed dimensions section (CDI) of the area common to all entities (COM)
of our Data Warehouse (EDW). The content of this entity is clear to any DBA, even to someone
who, for example, inherits the management of a Data Warehouse they do not know. (Try to
imagine if the table had been named A01DWCST.)
• The EDW_COM_CDI_CUST_DIT entity is seen and handled only by the DBA. In my view, the only
other users who can see the entities (and only those they need, which are usually facts and
dimensions) are the business-area builders, by means of an administration module that is part of
the front-end tool (e.g. Oracle Business Intelligence).
• These users do not need to see the EDW_COM_CDI_CUST_DIT physical name, but a
view/synonym (logical name) such as, for example, CUSTOMER_DIT. If we have had the foresight
to make the last two components of the name unique, the rest of the world will see a much
shorter, simpler entity name that is closer to its business logic.
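The decomposition of a physical name described above can be sketched in a few lines of Python. This is only an illustration, assuming every name has exactly five underscore-separated components; the helper names and the second table name in the usage note (EDW_DM0_SLS_CUST_DIT) are hypothetical and do not come from the sample Data Warehouse.

```python
from typing import NamedTuple


class EntityName(NamedTuple):
    project: str    # e.g. EDW
    area: str       # e.g. COM
    section: str    # e.g. CDI
    logical: str    # e.g. CUST
    type_code: str  # e.g. DIT


def parse_entity_name(physical_name):
    """Split a five-part physical name into its components (assumed layout)."""
    parts = physical_name.split("_")
    if len(parts) != 5:
        raise ValueError(f"expected 5 components, got {len(parts)}: {physical_name}")
    return EntityName(*parts)


def short_name(physical_name):
    """Return the last two components, a candidate for the logical name."""
    name = parse_entity_name(physical_name)
    return f"{name.logical}_{name.type_code}"


def find_short_name_clashes(physical_names):
    """Return the short names shared by more than one physical name."""
    seen = {}
    for physical in physical_names:
        seen.setdefault(short_name(physical), []).append(physical)
    return {short: names for short, names in seen.items() if len(names) > 1}
```

For example, short_name("EDW_COM_CDI_CUST_DIT") gives "CUST_DIT", and find_short_name_clashes would report a clash if a hypothetical EDW_DM0_SLS_CUST_DIT also existed. The name actually given to the view or synonym (CUSTOMER_DIT in the text) remains a separate design choice.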
The users of the Naming Convention
[Figure: A01DWCST (without Naming Convention) compared with EDW_COM_CDI_CUST_DIT (with Naming Convention). Labels shown: CUSTOMER_DIT, DBA, Architect.]
The Naming Convention of the table attributes
• As we know from the theory of relational databases, the table attributes are a set of specific
characteristics of the various entities that define the logical model. Since we have defined a
Naming Convention for the entities, it is also necessary to define a Naming Convention for attributes.
• The paradigm that underlies the Naming Convention of the table attributes can be summarized in
the following formula:
<attribute name> = <logical name>_<type code>
• For attributes, the name is much simpler because their logical context is already structurally
defined by the table to which they belong.
• In the data dictionary tables of an RDBMS such as Oracle, you can locate all the attributes and
their associated tables. An effective Naming Convention is therefore very useful when searching
for all the attributes with certain common characteristics. Here are some examples from my
personal experience.
• In a Data Warehouse for a bank, the need arose to change the scale of the currency amount
fields from two to six decimal places. The need was clearly linked to rounding problems. Hundreds
of tables with different columns were involved in the modification. Having adopted the Naming
Convention of identifying all currency amount columns with "*_AMT" was decisive. It allowed us
to generate a script that, by querying the data dictionary tables, dynamically changed the
structure of all, and only, the affected columns.
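A script of this kind might look like the following Python sketch. It is not the script used on the bank project: the function name, the default precision of 18 and the sample rows are assumptions for illustration. In Oracle the input rows could be read from a data dictionary view such as USER_TAB_COLUMNS (TABLE_NAME, COLUMN_NAME, DATA_TYPE); here they are passed in as plain tuples, and the generated statements are only built as strings, not executed.

```python
def build_amount_alter_statements(columns, precision=18, scale=6):
    """Build one ALTER TABLE statement per numeric column ending in _AMT.

    `columns` is an iterable of (table_name, column_name, data_type) tuples,
    for example rows fetched from a data dictionary view.
    The default precision is an assumption, not a value from the text.
    """
    statements = []
    for table_name, column_name, data_type in columns:
        # The naming convention does the selection: only *_AMT columns match.
        if data_type == "NUMBER" and column_name.endswith("_AMT"):
            statements.append(
                f"ALTER TABLE {table_name} "
                f"MODIFY ({column_name} NUMBER({precision},{scale}))"
            )
    return statements


# Hypothetical dictionary rows, for illustration only.
sample_rows = [
    ("EDW_DM0_SLS_FAT", "NET_AMT", "NUMBER"),
    ("EDW_DM0_SLS_FAT", "SOLD_QTY", "NUMBER"),
    ("EDW_COM_CDI_CUST_DIT", "CUST_DSC", "VARCHAR2"),
]
statements = build_amount_alter_statements(sample_rows)
```

With the sample rows above, only NET_AMT is selected. Without the "_AMT" suffix there would be no reliable way to tell amounts from quantities or other numeric columns.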
The Naming Convention of the table attributes (2)
• Some ETL and reporting tools can automatically identify all the descriptive columns of the
alphanumeric codes that will be displayed in the output: the interface of the tool uses a "like"
clause to locate the fields. If your standard is to name the description of every code column with
"*_DSC", you can take advantage of this feature of the tool and will not need to specify all the
fields one by one. Let us now look at some examples of the <type code>.
• COD - Code - Alphanumeric code: This is the classic code that is associated with a description and
a domain. It may be a customer number, an order type, an account status, etc. I suggest treating
all numeric codes as alphanumeric codes.
• DSC - Description - The description associated with a code, as used in the reporting tools and in
the front end. It is a design choice whether a single description is sufficient, or whether to define
a short description (SDS) and a long description (LDS). Users often require the concatenation of
the code with the short description (CSD, which stands for code plus short description).
• AMT - Amount - Always indicates an amount.
• QTY - Quantity - Indicates a quantity in pieces, weight or in some other unit of measure.
• KEY - Always indicates an artificial key. A column with this type must exist in all dimension tables
and in the corresponding columns of the fact tables.
The Naming Convention of the table attributes (3)
• DTS - Date Stamp - Indicates a date in the Oracle format, that is, including the time (hours,
minutes, seconds).
• FLG - Flag - Always a binary field, i.e. it may only be 0 or 1.
• TXT - Text - Field of generic text
• YMD - Day in the YYYYMMDD format
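The formula <attribute name> = <logical name>_<type code> and the type codes listed above can be checked mechanically. The following Python sketch shows one possible validator; the function name and the sample column names in the usage note are hypothetical.

```python
# Type codes taken from the list above.
TYPE_CODES = {
    "COD": "alphanumeric code",
    "DSC": "description",
    "SDS": "short description",
    "LDS": "long description",
    "CSD": "code plus short description",
    "AMT": "amount",
    "QTY": "quantity",
    "KEY": "artificial key",
    "DTS": "date stamp",
    "FLG": "flag",
    "TXT": "text",
    "YMD": "day in YYYYMMDD format",
}


def split_attribute_name(attribute_name):
    """Return (logical_name, type_code), or raise ValueError if non-compliant."""
    logical_name, separator, type_code = attribute_name.rpartition("_")
    if not separator or not logical_name:
        raise ValueError(f"not of the form <logical name>_<type code>: {attribute_name}")
    if type_code not in TYPE_CODES:
        raise ValueError(f"unknown type code {type_code!r} in {attribute_name}")
    return logical_name, type_code
```

For instance, split_attribute_name("ORDER_TYPE_COD") returns ("ORDER_TYPE", "COD"), while a name such as "CUSTNAME" is rejected. A check like this could be run against the column list of the data dictionary to find attributes that escaped the convention.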
The other entities of a Data Warehouse
• Identifying all the main entities or structures of a Data Warehouse is not an easy job, bearing
in mind that in Oracle there are over 30 different types of structures.
• Each RDBMS has its own requirements and peculiarities, and it would be long-winded and
pointless to try to give a Naming Convention for all of them. So we will focus on the main entities,
which are almost always present, leaving it to the reader to apply the techniques learned to the
remaining ones. Here is the list of entities covered by the next guidelines.
• Index
• Tablespace
• Datafile
• Integrity Constraint
• Role
• Package
The Naming Convention of the indexes
• As the Naming Convention is linked to the type of index, I will give a brief overview of the most
common types of indexes. They typically cover 90% of the needs of a Data Warehouse.
• As everyone knows, indexes are data structures created on one or more columns of a table to
optimize the performance of data access; the goal of an index is therefore to provide immediate
physical access to the rows of the table that contain the indexed values.
• In Oracle, as in other RDBMSs, the most used indexes are the classic B-tree indexes, the local or
global bitmap indexes, and the function-based indexes. The paradigm that underlies the
Naming Convention of the indexes can be summarized in the following formula:
<index name> = <project code>_<area code>_<section code>_<logical name>_<index type>
• The Naming Convention of the indexes therefore has the same syntax as that of the entities;
only the type code changes. In practice, an index name is identical to that of the table on which
it is created, except for the suffix. What follows is a list of the indexes applicable to a sales fact
table, where x indicates a progressive number.
EDW_DM0_SLS_LBx: Represents a local bitmap index.
EDW_DM0_SLS_GBx: Represents a global bitmap index.
EDW_DM0_SLS_NUx: Represents a generic non-unique B-tree index.
EDW_DM0_SLS_UIx: Represents a generic unique B-tree index.
EDW_DM0_SLS_FUx: Represents a function-based index.
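The rule that an index name equals the table name with a different suffix can be sketched as follows. This is a minimal Python illustration; the function name is hypothetical, and it assumes the table name always ends with an underscore followed by its type code.

```python
# Index type codes taken from the list above.
INDEX_TYPES = {
    "LB": "local bitmap",
    "GB": "global bitmap",
    "NU": "non-unique B-tree",
    "UI": "unique B-tree",
    "FU": "function-based",
}


def index_name(table_name, index_type, sequence):
    """Replace the table's type suffix with <index type><progressive number>."""
    if index_type not in INDEX_TYPES:
        raise ValueError(f"unknown index type: {index_type}")
    if sequence < 1:
        raise ValueError("the progressive number starts at 1")
    base, separator, _table_type = table_name.rpartition("_")
    if not separator or not base:
        raise ValueError(f"table name has no type suffix: {table_name}")
    return f"{base}_{index_type}{sequence}"
```

For the sales fact table EDW_DM0_SLS_FAT, index_name("EDW_DM0_SLS_FAT", "LB", 1) yields EDW_DM0_SLS_LB1.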
The Naming Convention of integrity constraints
• Integrity constraints allow us to associate rules with the Data Warehouse tables, in order to
prevent the introduction of outliers or non-compliant values.
• There is no need to emphasize the importance that these rules have in the design of the system.
Let us immediately dispel a myth we often hear: that constraints on tables encumber data
manipulation operations. Nothing could be further from the truth. Let's make things clear.
1. It is obvious that introducing an integrity constraint slows down the manipulation of the
table, but its overhead is minimal and, as a percentage, its weight in the loading process is
negligible. Remember, however, that constraints can be turned off before the data loading
and reactivated immediately afterwards.
2. If implemented programmatically in your application, the constraints will never be as
complete, secure and manageable as those enforced automatically by the RDBMS.
3. Always define the integrity constraints, even if the source systems are themselves RDBMSs
with active constraints. It is better not to trust: try to think of what it means to discover
duplicate keys after loading a few months of data and being in production.
4. Integrity constraints are necessary to enable query rewrite in Oracle, i.e. the internal
functionality that can rewrite a query on the fact table and redirect it to a materialized
view. Without the integrity constraints between the fact table and its dimension tables
this mechanism will never work.
The Naming Convention of integrity constraints (2)
• The paradigm that underlies the Naming Convention of the integrity constraints can be
summarized in the following formula:
<constr. name> = <project code>_<area code>_<section code>_<logical name>_<constr. type>
• The following is a list of integrity constraints applicable to a sales fact table. x indicates a
progressive number.
EDW_DM0_SLS_Nxx: Indicates the requirement that a field always has a non-null value. xx is
a sequential number for each column in the table that requires the constraint.
EDW_DM0_SLS_PK1: Indicates the primary key.
EDW_DM0_SLS_UKx: Specifies a unique key.
EDW_DM0_SLS_FKx: Specifies a foreign key. If you think that the number of foreign keys may
be higher than 9, use the convention Fxx.
EDW_DM0_SLS_CKx: Indicates a more complex constraint based on some conditions (for
example, a start date should always be prior to the end date).
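The constraint suffixes listed above can be generated with a small helper. The Python sketch below is an illustration only; the function name and the wide_fk flag (which switches from FKx to the two-digit Fxx form) are hypothetical, and the table name is assumed to end with an underscore followed by its type code.

```python
def constraint_name(table_name, kind, sequence=1, wide_fk=False):
    """Build a constraint name from the table name and a constraint kind.

    kind is one of "N" (not null), "PK", "UK", "FK", "CK".
    wide_fk=True uses the two-digit Fxx form for more than 9 foreign keys.
    """
    if sequence < 1:
        raise ValueError("the progressive number starts at 1")
    base, separator, _table_type = table_name.rpartition("_")
    if not separator or not base:
        raise ValueError(f"table name has no type suffix: {table_name}")
    if kind == "N":
        suffix = f"N{sequence:02d}"   # Nxx, one per constrained column
    elif kind == "PK":
        suffix = "PK1"                # a table has a single primary key
    elif kind == "FK" and wide_fk:
        suffix = f"F{sequence:02d}"   # Fxx
    elif kind in ("UK", "FK", "CK"):
        suffix = f"{kind}{sequence}"  # UKx, FKx, CKx
    else:
        raise ValueError(f"unknown constraint kind: {kind}")
    return f"{base}_{suffix}"
```

For the sales fact table, constraint_name("EDW_DM0_SLS_FAT", "N", 3) gives EDW_DM0_SLS_N03 and constraint_name("EDW_DM0_SLS_FAT", "FK", 12, wide_fk=True) gives EDW_DM0_SLS_F12.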
The Naming Convention of the tablespace
• Tablespaces are logical storage units that group objects with common logical characteristics.
Each table, materialized view or index always has a tablespace that contains it, either stated
explicitly in the creation script or implied, that is, the default tablespace of the user who
created the object.
• In turn, each tablespace is associated with one or more datafiles. The paradigm that underlies
the Naming Convention of the tablespace can be summarized in the following formula:
<tbs name> = <project code>_<area code>_<section code>_<tbs type>
where the section code and type code are optional. In fact, the technique to apply in this
case is not unique, but depends on the size of the objects that make up the tablespace.
Referring to our sales example, we have the following:
EDW_COM: Tablespace of the common entities. The area that we have defined as COM
certainly contains tables and indexes that are small compared with the data of other areas, so
the project code plus the area code is sufficient.
EDW_STA: Tablespace for temporary objects. In this case too, the staging tables, which are only
transient and small, may stay in a single tablespace.
The Naming Convention of the tablespace (2)
EDW_DM0_SLS: Tablespace for the objects of the sales data mart. If the total space occupied by
these objects is limited, a single tablespace may be sufficient (limited, for me, is under 8 GB). If
the volumes are higher, the type codes DFT, IFT, DMT and IMT can be used, i.e. fact table, fact
table indexes, materialized view and materialized view indexes.
In the case of a VLDW (Very Large Data Warehouse), it is conceivable to have one tablespace for
the indexes and one tablespace for the data of each table.
The Naming Convention of the datafiles
• The following considerations are valid if you are not using the Automatic Storage Management
feature of Oracle.
• As stated in the previous paragraph, the tablespace is made up of datafiles. At the time the
tablespace is created, you must already know roughly the total space occupied by the objects
that will stay in the tablespace, because you will be asked to allocate physical space.
• Let's forget about manually steering the location of the datafiles onto particular disks of the
database server. Storage virtualization techniques now let us see a single disk. My advice is to
split the space occupied by the objects of the tablespace across several files of moderate size,
so that they are easier to manage.
• The paradigm that underlies the Naming Convention of the datafiles can be summarized in the
following formula:
<datafile name> = <tablespace name>_XX.<file type>
• In this case, XX is a progressive number of the form 01, 02, ..., while the file type is usually fixed
to DBF (Data Base File). Of course, you can use other acronyms instead of DBF; what matters is
that all the datafiles follow the same logic.
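The datafile formula, together with the advice to split a tablespace across several moderately sized files, can be sketched in Python as follows. The function names and the sizes used in the example (7000 MB in total, 2000 MB per file) are hypothetical figures chosen for illustration.

```python
import math


def files_needed(total_mb, max_file_mb):
    """Number of datafiles needed so that no file exceeds max_file_mb."""
    if total_mb <= 0 or max_file_mb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(total_mb / max_file_mb)


def datafile_names(tablespace_name, count, file_type="DBF"):
    """Apply <datafile name> = <tablespace name>_XX.<file type>."""
    if not 1 <= count <= 99:
        raise ValueError("XX is a two-digit progressive number (01 to 99)")
    return [f"{tablespace_name}_{n:02d}.{file_type}" for n in range(1, count + 1)]


# Hypothetical sizing: 7000 MB of objects, files of at most 2000 MB each.
names = datafile_names("EDW_DM0_SLS", files_needed(7000, 2000))
```

With these figures the sketch produces four files, EDW_DM0_SLS_01.DBF to EDW_DM0_SLS_04.DBF.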
The Naming Convention of the roles
• In a Data Warehouse, tables and their structures must be grouped in order to be made
accessible to users for data selection. At the beginning I spoke about the users of the Data
Warehouse. I am aware that reality is often more complicated, and there will always be users
who access, or wish to access, the data directly. For this reason I speak about roles.
• Providing access means granting privileges on the entities. Because users generally have access
to one or more data marts, the best way to simplify the management of access rights is to group
all accesses to a data mart using roles. (When I speak about a Data Mart, which is a logical
concept, I mean the fact table and the related dimension tables.)
• So the grant does not associate a user with a structure, but a user with a role. It is
immediately clear that the Naming Convention of the roles is closely connected to the data marts,
i.e. to the logical partitioning at the section level. The paradigm that underlies the Naming
Convention of roles can be summarized in the following formula:
<role name> = <project code>_<area code>_<section code>_<type code>
• The type code may be optional, as users of the Data Warehouse will always access it with
"SELECT" queries (I hope!); this does not mean that we cannot use "_SEL" to indicate the role for
read-only access, and "_UPD" for the role allowing insert, update and delete. The next figure
shows a summary of the techniques applied so far.
[Figure: Summary of the naming techniques, within the Enterprise Data Warehouse (EDW).
Sales Data Mart (SLS): Sales Fact Table EDW_DM0_SLS_FAT, with local bitmap index EDW_DM0_SLS_LBx, primary key constraint EDW_DM0_SLS_PKx and foreign key constraints EDW_DM0_SLS_FKx; Monthly Sales EDW_DM0_SLS_MONTH_FMV, with local bitmap index EDW_DM0_SLS_MONTH_LBx, primary key constraint EDW_DM0_SLS_MONTH_PKx and foreign key constraints EDW_DM0_SLS_MONTH_FKx; access role to the Sales Data Mart EDW_DM0_SLS_SEL; tablespace EDW_DM0_SLS with datafiles EDW_DM0_SLS_01.DBF, EDW_DM0_SLS_02.DBF, EDW_DM0_SLS_03.DBF and EDW_DM0_SLS_04.DBF.
Common Area (COM): common objects and indexes in tablespace EDW_COM with datafile EDW_COM_01.DBF.]
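The role convention, with a grant per table to the role and a grant of the role to each user, can be sketched in Python as a generator of SQL statements. The function names and the user name ANALYST_1 are hypothetical; the statements are only built as strings and use standard CREATE ROLE and GRANT syntax.

```python
def role_name(project, area, section, type_code="SEL"):
    """Apply <role name> = <project>_<area>_<section>_<type code>.

    Pass type_code=None to omit the optional type code.
    """
    parts = [project, area, section]
    if type_code:
        parts.append(type_code)
    return "_".join(parts)


def grant_statements(role, tables, users, privilege="SELECT"):
    """Grant the privilege on each table to the role, then the role to each user."""
    statements = [f"CREATE ROLE {role}"]
    statements += [f"GRANT {privilege} ON {table} TO {role}" for table in tables]
    statements += [f"GRANT {role} TO {user}" for user in users]
    return statements


sales_role = role_name("EDW", "DM0", "SLS")
sales_grants = grant_statements(
    sales_role,
    tables=["EDW_DM0_SLS_FAT", "EDW_DM0_SLS_MONTH_FMV"],
    users=["ANALYST_1"],  # hypothetical user name
)
```

The users never receive a privilege on a table directly: adding a table to the data mart means one more grant to EDW_DM0_SLS_SEL, with no change for the individual users.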
The Naming Convention of the packages
• Packages are libraries of PL/SQL code. In Oracle, PL/SQL (Procedural Language/SQL) is the
internal database language, although you can write programs in Java, C, or other programming
languages that are callable from PL/SQL modules.
• These modules may be procedures or functions. In Oracle, using packages is crucial: I highly
recommend that all modules necessary for the loading process be contained in packages.
• The advantages of their use are numerous, and I will mention only two:
1. Modularity: organizing your programs in an orderly manner, according to the context in which
they operate, is essential for anyone who works, or will work, on the project.
2. Performance: when you call a module of a package for the first time, the entire package is loaded
into memory. Subsequent calls to other modules of the package do not require disk access.
• Returning to the Naming Convention, this means that the procedure used to load the sales fact
table, or the monthly aggregate one, must be contained in the package that has the
same name (if possible) as the target table.
• If this procedure uses functions or procedures that are generic to the sales Data Mart, those
modules should be contained in the package that has the same name as the corresponding
section. The same logic continues up to the procedures common to the entire Data
Warehouse (for example, a function that returns the difference between two dates in order to
calculate a delta).
• The next figure shows an example of such encapsulation.
[Figure: Package encapsulation. The Daily Sales package EDW_DM0_SLSD_FAT contains the loading modules that load Daily Sales (EDW_DM0_SLSD_FAT); the Monthly Sales package EDW_DM0_SLSM_FMV contains the loading modules that load Monthly Sales (EDW_DM0_SLSM_FMV). The loading modules call the common modules of three packages: the common package for the Data Mart of Sales (EDW_DM0_SLS), the common package for all Data Marts of level 0 (EDW_DM0) and the common package for the whole Data Warehouse (EDW).]
The Naming Convention of the packages (2)
• The Naming Convention to be adopted for the packages is very flexible. It may take its most
extensive form:
<pkg name> = <project code>_<area code>_<section code>_<logical name>_<type code>
as well as its simplest form:
<project code>
• The logical name and the type code can be useful in complex systems where
the number of packages tends to be very high.
• Do not forget that the type code must add value to the semantics of the name. Adding
_PKG as a type code does not add value, as this information can be obtained from the Oracle
catalog with a simple select statement.
• If you decide that all modules that call Java procedures in a certain section belong in a specific
package, then "_PKG" and "_JPK" will definitely be effective choices.
• In cases where, as in Oracle, it is not possible to have the same name for a package and a table,
the use of "_PKG" will be mandatory.
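The range of package names, from the most extensive form down to the project code alone, can be illustrated with a short Python sketch. The function name is hypothetical, and the idea of listing the names from most specific to most general is only one way to read the encapsulation described above.

```python
def package_chain(project, area, section, logical_name=None, type_code=None):
    """List candidate package names, from most specific to most general."""
    chain = []
    if logical_name and type_code:
        # Most extensive form: same name as the target table.
        chain.append("_".join([project, area, section, logical_name, type_code]))
    chain.append("_".join([project, area, section]))  # modules common to the section
    chain.append("_".join([project, area]))           # modules common to the area
    chain.append(project)                             # modules common to everything
    return chain
```

For the monthly sales aggregate, package_chain("EDW", "DM0", "SLS", "MONTH", "FMV") returns EDW_DM0_SLS_MONTH_FMV, EDW_DM0_SLS, EDW_DM0 and EDW. A module belongs in the first package of the list whose scope covers all of its callers.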
Conclusions
• We have now reached the end of this short journey through the Naming Convention techniques.
What has been covered is certainly not exhaustive of the many possible applications of these
techniques.
• Each of us, on the basis of our own experience, can partition and codify according to our own
needs and intuition. Indeed, the particular choice by which you partition or codify the system is
not what matters; what matters is to follow a method of standardization as rigorously as
possible.
• An effective Naming Convention certainly provides the tools necessary to bring the system
quickly under control, in terms of knowledge, management and maintenance.

Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi (pa...
 
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisiNote di Data Warehouse e Business Intelligence - Le Dimensioni di analisi
Note di Data Warehouse e Business Intelligence - Le Dimensioni di analisi
 

Recipes 8 of Data Warehouse and Business Intelligence - Naming convention techniques (part2)

  • 1. Recipes of Data Warehouse and Business Intelligence Naming convention techniques (part 2)
  • 2. Introduction (1) • In part 1, the naming convention techniques were applied to tables, the basic entities of an information system. We gave a name to these entities in their simplest form (table), in their aggregate form (materialized view or summary table) and in their logical form (view). • It was emphasized that these techniques can be applied to any logical or physical entity of a Data Warehouse. I therefore wish to complete the discussion with three targets in mind: – Completeness: Tables are fundamental, but alone they do not constitute a Data Warehouse. There must be access rights to view them, programs to load them, indexes to speed up access to them, and constraints to ensure data integrity. Programs, rights, indexes and constraints must also be created according to the naming convention. Tables are made of attributes, and attributes have names too; we will discuss them as well. – Pragmatism: Only by seeing the techniques applied to a real case can we recognize their usefulness, so we will examine and name all the other entities involved, using the sample Data Warehouse. – Knowledge: Some of the entities to be named are specific to Oracle, and this is a good opportunity to describe them briefly. • The choices made for the naming convention are only guidelines, not dogma. The convention can be discussed and changed according to our needs and our particular view of the system.
  • 3. Introduction (2) • The main objective was to draw attention to the importance and usefulness of the Naming Convention. • Another point I wish to emphasize is that the convention is "Database Administrator oriented" and not "Business oriented". This means that the names chosen, for example for the tables, are physical names that only the DBA sees. The "rest of the world" should not see those names, but the "logical" names exposed through synonyms and/or views.
  • 4. The users of the Naming Convention • Based on some useful questions I received, I want to clarify this point. Take for example the EDW_COM_CDI_CUST_DIT entity. This entity represents the customers (CUST) dimension table (DIT) of the conformed dimensions section (CDI) of the common area (COM), shared by all entities, of our Data Warehouse (EDW). The content of this entity is clear to any DBA, including someone who inherits the management of a Data Warehouse they do not know. (Imagine if the table had been named A01DWCST.) • The EDW_COM_CDI_CUST_DIT entity is seen and handled only by the DBA. In my view, the only other users who can see the entity (and only what they need, which is usually facts and dimensions) are the business-area builders, by means of an administration module that is part of the front-end tool (e.g. Oracle Business Intelligence). • These users do not need to see the physical name EDW_COM_CDI_CUST_DIT, but a view or synonym (logical name) such as CUSTOMER_DIT. If we have the foresight to make the last two components of the name unique, the rest of the world will see an entity name that is much shorter, simpler and closer to its business logic.
  • 5. The users of the Naming Convention [Figure: the same table named A01DWCST without a Naming Convention and EDW_COM_CDI_CUST_DIT with a Naming Convention, plus the logical name CUSTOMER_DIT. Actors shown: DBA, Architect.]
  • 6. The Naming Convention of the table attributes • As we know from relational database theory, table attributes are the specific characteristics of the various entities that define the logical model. Since we have defined a Naming Convention for the entities, we also need to define one for the attributes. • The paradigm that underlies the Naming Convention of the table attributes can be summarized in the following formula: <attribute name> = <logical name>_<type code> • The name is much simpler here because the logical context of an attribute is already structurally defined by the table to which it belongs. • In the data dictionary tables of an RDBMS such as Oracle, you can find all the attributes and their associated tables. An effective Naming Convention is therefore very useful when searching for all the attributes that share certain characteristics. Here are some examples from my personal experience. • In a Data Warehouse for a bank, the need arose to change the scale of the currency amount fields from two to six decimal places. The need was clearly linked to rounding problems. Hundreds of tables with different columns were involved in the change. Having adopted a Naming Convention that identifies all currency amount columns with "*_AMT" was decisive. It allowed us to generate a script that, by reading the data dictionary tables, dynamically changed the structure of all and only the affected columns.
  • 7. The Naming Convention of the table attributes (2) • Some ETL and reporting tools can automatically identify all the descriptive columns of alphanumeric codes that will be displayed in the output: the tool's interface uses a "like" clause to locate the fields. If your standard names the description of every code column with "*_DSC", you can take advantage of this feature and will not need to specify the fields one by one. Now let us see some examples of the <type code>. • COD - Code - Alphanumeric code: the classic code associated with a description and a domain. It may be a customer number, an order type, an account status, etc. I suggest treating all numeric codes as alphanumeric codes. • DSC - Description - The description associated with the code, used in the reporting tools and in the front end. It is a design choice whether a single description is sufficient or whether to define a short description (SDS) and a long description (LDS). Users often require the concatenation of the code with the short description (CSD, code plus short description). • AMT - Amount - Always indicates an amount. • QTY - Quantity - Indicates a quantity in pieces, weight or some other unit of measure. • KEY - Always indicates an artificial key. A column of this type must exist in all dimension tables and in the corresponding columns of the fact tables.
  • 8. The Naming Convention of the table attributes (3) • DTS - Date Stamp - Indicates a date in the Oracle format, that is, including the time (hours, minutes, seconds). • FLG - Flag - Always a binary field, i.e. it can only be 0 or 1. • TXT - Text - A generic text field. • YMD - A day in the YYYYMMDD format.
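The bank example on the attribute slides (finding every "*_AMT" column through the data dictionary and generating the change script) can be sketched roughly as follows. This is only an illustration, not the author's script: the rows stand in for what a query on an Oracle dictionary view such as USER_TAB_COLUMNS might return, the column names are hypothetical, and the precision of 18 is an assumption (only the six decimal places come from the text).

```python
# Hypothetical (TABLE_NAME, COLUMN_NAME) pairs, shaped like the result of a
# query on an Oracle dictionary view such as USER_TAB_COLUMNS.
DICTIONARY_ROWS = [
    ("EDW_DM0_SLS_FAT", "CUST_KEY"),
    ("EDW_DM0_SLS_FAT", "SALES_AMT"),
    ("EDW_DM0_SLS_FAT", "SALES_QTY"),
    ("EDW_DM0_SLS_MONTH_FMV", "DISCOUNT_AMT"),
    ("EDW_COM_CDI_CUST_DIT", "CUST_DSC"),
]


def columns_of_type(rows, type_code):
    """Return the rows whose column name ends with _<type code>."""
    suffix = "_" + type_code
    return [(table, column) for table, column in rows if column.endswith(suffix)]


def widen_amount_statements(rows, precision=18, scale=6):
    """Build one ALTER TABLE statement per *_AMT column.

    precision=18 is an assumption; only scale=6 comes from the text.
    """
    return [
        f"ALTER TABLE {table} MODIFY ({column} NUMBER({precision},{scale}))"
        for table, column in columns_of_type(rows, "AMT")
    ]


for statement in widen_amount_statements(DICTIONARY_ROWS):
    print(statement)
```

The same `columns_of_type` helper works for any other type code, for example "DSC" to list every description column.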
  • 9. The other entities of a Data Warehouse • Identifying all the main entities or structures of a Data Warehouse is not an easy job, bearing in mind that in Oracle there are over 30 different types of structures. • Each RDBMS has its own requirements and peculiarities, and it would be long-winded and pointless to try to give a Naming Convention to all of them. So we will focus on the main entities, which are almost always present, leaving it to the reader to apply the techniques learned to the remaining ones. Here is the list of entities covered by the following guidelines. • Index • Tablespace • Datafile • Integrity Constraint • Role • Package
  • 10. The Naming Convention of the indexes • As the Naming Convention is linked to the type of index, I will give a brief overview of the most common types of indexes. They typically cover 90% of the needs of a Data Warehouse. • Indexes are data structures created on one or more columns of a table to optimize data access performance; the goal of an index is therefore to provide fast physical access to the table rows that contain the indexed values. • In Oracle, as in other RDBMSs, the most used indexes are the classic B-tree indexes, the local or global bitmap indexes, and the function-based index. The paradigm that underlies the Naming Convention of the indexes can be summarized in the following formula: <index name> = <project code>_<area code>_<section code>_<logical name>_<index type> • The Naming Convention of the indexes therefore has the same syntax as that of the entities; only the type changes. In practice, the index name is identical to that of the table on which it is created, except for the suffix. What follows is a list of the indexes applicable to a sales fact table; x indicates a progressive number. EDW_DM0_SLS_LBx: a local bitmap index. EDW_DM0_SLS_GBx: a global bitmap index. EDW_DM0_SLS_NUx: a generic non-unique B-tree index. EDW_DM0_SLS_UIx: a generic unique B-tree index. EDW_DM0_SLS_FUx: a function-based index.
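Since the index name is the table name with a different suffix, the rule can be illustrated with a small helper. This is a minimal sketch under the assumption that the table name always ends with an underscore followed by its type code (such as _FAT or _FMV); the function name `index_name` is hypothetical.

```python
# Index type codes taken from the list on the slide.
INDEX_TYPES = {
    "LB": "local bitmap",
    "GB": "global bitmap",
    "NU": "non-unique B-tree",
    "UI": "unique B-tree",
    "FU": "function-based",
}


def index_name(table_name, index_type, number):
    """Replace the table's type suffix with <index type><progressive number>."""
    if index_type not in INDEX_TYPES:
        raise ValueError(f"unknown index type: {index_type}")
    # Split off the last component (the table type code, e.g. FAT).
    base, _, _table_type = table_name.rpartition("_")
    if not base:
        raise ValueError(f"table name has no type suffix: {table_name}")
    return f"{base}_{index_type}{number}"


print(index_name("EDW_DM0_SLS_FAT", "LB", 1))
print(index_name("EDW_DM0_SLS_MONTH_FMV", "UI", 2))
```

Rejecting unknown type codes keeps ad hoc suffixes out of the schema, which is the point of having a closed list.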
  • 11. The Naming Convention of integrity constraints • Integrity constraints allow us to attach rules to the Data Warehouse tables in order to prevent the introduction of outliers or non-compliant values. • There is no need to stress the importance of these rules in the design of the system. Let us immediately dispel a myth we often hear: that constraints on tables weigh down data manipulation operations. Nothing could be further from the truth. Let us make things clear. 1. Obviously an integrity constraint slows down the manipulation of the table, but its overhead is minimal and, as a percentage, its weight in the loading process is negligible. Remember, moreover, that constraints can be disabled before the data load and re-enabled immediately afterwards. 2. Constraints implemented programmatically in your application will never be as complete, secure and manageable as those enforced automatically by the RDBMS. 3. Always define the integrity constraints, even if the source systems are themselves RDBMSs with active constraints. It is better not to trust: imagine discovering duplicate keys in production after loading a few months of data. 4. Integrity constraints play a key role in query rewrite in Oracle, i.e. the internal feature that can rewrite a query on the fact table and redirect it to a materialized view. Without the integrity constraints between the fact table and its dimension tables, the rewrites that depend on those joins cannot take place.
  • 12. The Naming Convention of integrity constraints (2) • The paradigm that underlies the Naming Convention of the integrity constraints can be summarized in the following formula: <constr. name> = <project code>_<area code>_<section code>_<logical name>_<constr. type> • The following is a list of integrity constraints applicable to a sales fact table; x indicates a progressive number. EDW_DM0_SLS_Nxx: a NOT NULL constraint on a field; xx is a sequential number for each column in the table that requires the constraint. EDW_DM0_SLS_PK1: the primary key. EDW_DM0_SLS_UKx: a unique key. EDW_DM0_SLS_FKx: a foreign key. If you think the number of foreign keys may exceed 9, use the convention Fxx. EDW_DM0_SLS_CKx: a more complex check constraint based on some condition (for example, a start date must always precede the end date).
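A convention is easiest to keep when names can be checked mechanically. The sketch below classifies a constraint name by its suffix using the type codes listed on the slide (Nxx, PKx, UKx, FKx or Fxx, CKx). The regular expression and the function name `classify_constraint` are illustrative assumptions, not part of the original material.

```python
import re

# <base>_<type>, where <type> is one of the codes listed on the slide.
CONSTRAINT_NAME = re.compile(
    r"^(?P<base>[A-Z0-9]+(?:_[A-Z0-9]+)*)_"
    r"(?P<type>N\d{2}|PK\d|UK\d|FK\d|F\d{2}|CK\d)$"
)

KINDS = {
    "N": "not null",
    "PK": "primary key",
    "UK": "unique key",
    "FK": "foreign key",
    "F": "foreign key",  # Fxx form, for more than 9 foreign keys
    "CK": "check",
}


def classify_constraint(name):
    """Return (base name, constraint kind), or None if the name does not conform."""
    match = CONSTRAINT_NAME.match(name)
    if match is None:
        return None
    # Strip the progressive number to recover the type code.
    code = match.group("type").rstrip("0123456789")
    return match.group("base"), KINDS[code]


for name in ["EDW_DM0_SLS_PK1", "EDW_DM0_SLS_N01", "EDW_DM0_SLS_F12", "A01DWCST"]:
    print(name, classify_constraint(name))
```

A check like this could be run against the names returned by the data dictionary to spot constraints that were created outside the convention.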
  • 13. The Naming Convention of the tablespace • Tablespaces are logical storage units that group objects with common logical characteristics. Every table, materialized view or index is always stored in a tablespace, either specified explicitly in the creation script or implied, in which case it is the default tablespace of the user who created the object. • In turn, each tablespace is associated with one or more datafiles. The paradigm that underlies the Naming Convention of the tablespace can be summarized in the following formula: <tbs name> = <project code>_<area code>_<section code>_<tbs type> where the section code and the type code are optional. The technique to apply in this case is not unique, but depends on the size of the objects that make up the tablespace. Referring to our sales example, we have the following: EDW_COM: tablespace of the common entities. The area we have called COM certainly contains tables and indexes that are small compared to the data of other areas, so the project code plus the area code is sufficient. EDW_STA: tablespace for temporary objects. Here too, the staging tables, which are only transient and small, can stay in a single tablespace.
  • 14. The Naming Convention of the tablespace (2) EDW_DM0_SLS: tablespace of the objects of the sales data mart. If the total space occupied by these objects is limited, a single tablespace may be sufficient (for me, limited means under 8 GB). If the volumes are higher, you can use the type codes DFT, IFT, DMT and IMT, i.e. fact table, fact table index, materialized view and materialized view index. In a VLDW (Very Large Data Warehouse) it is conceivable to have one tablespace for the indexes and one for the data of each table.
  • 15. The Naming Convention of the datafiles • The following considerations apply if you are not using Oracle's Automatic Storage Management feature. • As stated in the previous section, a tablespace is made up of datafiles. When you create the tablespace, you must already know roughly the total space occupied by the objects that will reside in it, because you will be asked to allocate physical space. • Let us forget about steering the location of the datafiles onto particular disks of the database server: storage virtualization techniques now let us see a single disk. My advice is to divide the space occupied by the objects of the tablespace among several files of moderate size, to make them easier to manage. • The paradigm that underlies the Naming Convention of the datafiles can be summarized in the following formula: <datafile name> = <tablespace name>_XX.<file type> • Here XX is a progressive number (01, 02, ...), while the file type is usually fixed to DBF (Data Base File). Of course, you can use other acronyms instead of DBF; what matters is that all the datafiles follow the same logic.
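The advice to split a tablespace into several moderately sized files, named <tablespace name>_XX.<file type>, can be sketched as below. The sizes in the example calls (8000 MB in total, 2000 MB per file) are assumptions chosen for illustration, and `datafile_names` is a hypothetical helper.

```python
import math


def datafile_names(tablespace, total_mb, max_file_mb, file_type="DBF"):
    """Split total_mb into files of at most max_file_mb and name each one.

    Names follow <tablespace name>_XX.<file type>, with XX = 01, 02, ...
    """
    if total_mb <= 0 or max_file_mb <= 0:
        raise ValueError("sizes must be positive")
    count = math.ceil(total_mb / max_file_mb)
    if count > 99:
        raise ValueError("a two-digit XX allows at most 99 datafiles")
    return [f"{tablespace}_{n:02d}.{file_type}" for n in range(1, count + 1)]


# Assumed sizes: an 8000 MB sales data mart split into 2000 MB files.
print(datafile_names("EDW_DM0_SLS", 8000, 2000))
# A small common area fits in a single file.
print(datafile_names("EDW_COM", 500, 2000))
```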
  • 16. The Naming Convention of the roles • In a Data Warehouse, tables and their related structures must be grouped in order to be made accessible to users for data selection. At the beginning I spoke about the users of the Data Warehouse. I am aware that reality is often more complicated, and there will always be users who access, or wish to access, the data directly. This is why I discuss roles. • Providing access means granting privileges on the entities. Because users generally have access to one or more data marts, the best way to simplify the management of access rights is to group all the accesses to a data mart using roles. (By Data Mart, in the logical sense, I mean the fact table and the related dimension tables.) • The grant therefore does not associate a user with a structure, but a user with a role. It is immediately clear that the Naming Convention of the roles is closely connected to the data marts, i.e. to the logical partitioning at the section level. The paradigm that underlies the Naming Convention of roles can be summarized in the following formula: <role name> = <project code>_<area code>_<section code>_<type code> • The type code may be optional, as users of the Data Warehouse should always access it with SELECT queries (I hope!); this does not prevent us from using "_SEL" to indicate the read-only role and "_UPD" to indicate the role for insert, update and delete. The next figure shows a summary of the techniques applied so far.
  • 17. [Figure: summary of the Naming Convention for the Enterprise Data Warehouse (EDW). Sales Data Mart (SLS): Sales Fact Table EDW_DM0_SLS_FAT, with local bitmap indexes EDW_DM0_SLS_LBx, primary key constraint EDW_DM0_SLS_PKx and foreign key constraints EDW_DM0_SLS_FKx; Monthly Sales EDW_DM0_SLS_MONTH_FMV, with local bitmap indexes EDW_DM0_SLS_MONTH_LBx, primary key constraint EDW_DM0_SLS_MONTH_PKx and foreign key constraints EDW_DM0_SLS_MONTH_FKx; access role to the Sales Data Mart EDW_DM0_SLS_SEL; tablespace EDW_DM0_SLS with datafiles EDW_DM0_SLS_01.DBF to EDW_DM0_SLS_04.DBF. Common Area (COM): common objects and indexes in tablespace EDW_COM with datafile EDW_COM_01.DBF.]
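The role convention, where privileges on the objects of a data mart go to a role and users receive only the role, might be scripted along these lines. This is a sketch under stated assumptions: the user name ANALYST_1 is hypothetical, the function names are invented for illustration, and the generated statements are plain strings that a DBA would still review before running.

```python
# Privileges implied by the two type codes mentioned in the text.
PRIVILEGES = {"SEL": "SELECT", "UPD": "INSERT, UPDATE, DELETE"}


def role_name(project, area, section, type_code="SEL"):
    """<role name> = <project code>_<area code>_<section code>_<type code>"""
    return "_".join([project, area, section, type_code])


def role_grants(project, area, section, objects, users, type_code="SEL"):
    """Build the statements that create the role, grant object privileges
    to it, and grant the role (never the objects) to each user."""
    role = role_name(project, area, section, type_code)
    statements = [f"CREATE ROLE {role}"]
    statements += [f"GRANT {PRIVILEGES[type_code]} ON {obj} TO {role}" for obj in objects]
    statements += [f"GRANT {role} TO {user}" for user in users]
    return statements


# ANALYST_1 is a hypothetical user name.
for statement in role_grants(
    "EDW", "DM0", "SLS",
    objects=["EDW_DM0_SLS_FAT", "EDW_DM0_SLS_MONTH_FMV"],
    users=["ANALYST_1"],
):
    print(statement)
```

Because users are attached only to the role, adding a table to the data mart means one extra grant to the role rather than one per user.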
  • 18. The Naming Convention of the packages • Packages are libraries of PL/SQL code. In Oracle, PL/SQL (Procedural Language/SQL) is the internal database language, although you can also write programs in Java, C or other programming languages and call them from PL/SQL modules. • These modules may be procedures or functions. In Oracle, using packages is crucial: I highly recommend that all the modules needed by the loading process be contained in packages. • The advantages are numerous; I will mention only two: 1. Modularity: organizing your programs in an orderly manner, according to the context in which they operate, is essential for anyone who works, or will work, on the project. 2. Performance: when you call a module of a package for the first time, the entire package is loaded into memory. Subsequent calls to other modules of the package do not require disk access. • Returning to the Naming Convention, this means that the procedure used to load the sales fact table, or the monthly aggregate, must be contained in a package with the same name (if possible) as the target table. • If this procedure uses functions or procedures that are generic to the sales Data Mart, those modules should be contained in a package with the same name as the corresponding section. The same logic continues up to the procedures common to the entire Data Warehouse (for example, a function that returns the difference between two dates to calculate a delta). • The next figure shows an example of this encapsulation.
  • 19. [Figure: package encapsulation. Daily Sales EDW_DM0_SLSD_FAT and Monthly Sales EDW_DM0_SLSM_FMV are each loaded (Load) by the loading modules of the package with the same name. Common modules sit in the common package for the Data Mart of Sales (EDW_DM0_SLS), the common package for all Data Marts of level 0 (EDW_DM0) and the common package for the whole Data Warehouse (EDW), connected by Call arrows.]
  • 20. The Naming Convention of the packages (2) • The Naming Convention for packages is very flexible. It may take its most extensive form: <pkg name> = <project code>_<area code>_<section code>_<logical name>_<type code> or its simplest form: <project code> • The logical name and the type code are useful in complex systems where the number of packages tends to be very high. • Do not forget that the type code must add value to the semantics of the name. Adding _PKG as a type code adds no value by itself, as this information can be obtained from the Oracle catalog with a simple select statement. • If instead you decide that all the modules that call Java procedures in a certain section go in a specific package, then "_PKG" and "_JPK" are definitely effective choices. • Where, as in Oracle, a package and a table in the same schema cannot have the same name, the use of "_PKG" is mandatory.
  • 21. Conclusions • We have reached the end of this short journey through the Naming Convention techniques. What has been covered is certainly not exhaustive of the many possible applications of these techniques. • Each of us, on the basis of our own experience, can partition and codify according to our own needs and intuition. Indeed, the particular choice by which you partition or codify the system is not what matters; what matters is following a method of standardization as rigorously as possible. • An effective Naming Convention provides the tools needed to quickly bring the system under control in terms of knowledge, management and maintenance.