Raising data management standards
www.etlsolutions.com
E&P Data Standardisation
Implementing corporate data standards in the field using metadata-centric tools
The challenge (1)
• Many companies are today engaged in standardising the data they hold in corporate data stores. In the Geoscience sector, where ETL Solutions is active, a great variety of data stores are in use, such as OpenWorks, Corporate Data Store, Seabed, Finder and PPDM.
• These data stores usually define formal catalogue-controlled values to help enforce standards. However, this reference data, usually held in database tables, has often been extended or adapted for regional circumstances.
• This causes two day-to-day issues:
1. Corporate reports assembled from such data can easily be misleading. An attempt to discover something as simple as the relative number of 'on shore' and 'off shore' wells may fail if the staff in one region have decided to classify wells as 'tidal', 'shallow water' and 'deep water' - or just as 'on' and 'off'.
2. Engineers trained in one region, or working for one subcontractor, can find it difficult to work
in another region where standard reference values are chosen from a different set.
• While these are awkward issues, it is typically difficult to assign a direct cost to any accounting centre and hence to calculate the benefit of fixing them. A more pressing reason often arises when stores are to be upgraded to newer (and more consistent) versions, or when data held in legacy stores needs to be cleaned sufficiently that it can be migrated into corporate standard stores. Often, the day-to-day issues are tolerated until a migration is required by the business.
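The reporting failure in point 1 is easy to reproduce. Below is a minimal sketch in Python, using invented well records and an invented field name rather than the actual OpenWorks or PPDM schema, of how a corporate 'on shore' versus 'off shore' count silently loses wells once one region substitutes its own vocabulary:

```python
from collections import Counter

# Hypothetical well records from two regions. Region A follows the
# corporate standard; region B has extended the vocabulary locally.
wells = [
    {"uwi": "A-001", "onshore_offshore": "ON SHORE"},
    {"uwi": "A-002", "onshore_offshore": "OFF SHORE"},
    {"uwi": "B-001", "onshore_offshore": "TIDAL"},
    {"uwi": "B-002", "onshore_offshore": "SHALLOW WATER"},
    {"uwi": "B-003", "onshore_offshore": "DEEP WATER"},
]

STANDARD_VALUES = {"ON SHORE", "OFF SHORE"}

# The corporate report sees five categories instead of two.
print(Counter(w["onshore_offshore"] for w in wells))

# Region B's wells fall outside the expected buckets entirely.
unclassified = [w["uwi"] for w in wells
                if w["onshore_offshore"] not in STANDARD_VALUES]
print("Wells invisible to the corporate report:", unclassified)
```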
The challenge (2)
• What makes this a tricky problem is the mismatch between the tools available to the average engineer - usually not much more than Excel and SQL*Plus - and the extreme complexity of a modern data store: PPDM 3.8, for example, contains over 1,700 tables.
• A hand-tool approach to such a problem is going to be very
expensive, tying up valuable staff for weeks on end and quite
possibly failing.
• It is in these circumstances that metadata tools in the right
domain specialist’s hands come into their own. ETL Solutions'
Transformation Manager, for example, is capable of introspecting
and interpreting the metadata from any one of the stores in less
than a minute.
• With Transformation Manager, it is possible to see the structure of
the data in an easily understood Graphical User Interface (GUI)
and indeed to navigate the data (or instance values) within this
same user interface. However, much more usefully than merely
viewing the data, it is possible to use the metadata to directly
discover - and then manipulate - all the data that will be affected
by changes in standards values.
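Transformation Manager's introspection engine is proprietary and not shown here, but the underlying idea is ordinary schema reflection. A minimal sketch, using Python's built-in sqlite3 module as a stand-in for the Oracle-hosted stores named above, reads the complete table and column inventory from the database catalogue rather than from documentation:

```python
import sqlite3

# Toy schema standing in for a real store; PPDM 3.8 would expose
# 1,700+ tables, but the reflection pattern is identical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE well (uwi TEXT PRIMARY KEY, onshore_offshore TEXT);
    CREATE TABLE r_onshore_offshore (code TEXT PRIMARY KEY, description TEXT);
""")

# Discover every table, then every column, from metadata alone.
tables = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
for table in tables:
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    print(f"{table}: {columns}")
```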
Process
• The client’s pilot project uses a metadata-driven method to support the data standardisation and migration of legacy databases to the client’s standards.
• Transformation Manager has
been used in a once-only
‘Prepare Phase’ to
automatically create
transforms that will be used
during every project migration.
• These generic transforms will be run during the Extract, Verify and Update phases of the migration process (see the process diagram summarised below).
[Process diagram: Standards → Extract → Review → Verify → Update, with mapping spreadsheets passing between Extract and Review, error reports from Verify, and OW data dictionary changes applied at Update.]
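As a reading aid, the four phases in the diagram can be sketched as a pipeline. This is an outline of the flow described in this deck, not Transformation Manager code; all function names and bodies are invented placeholders:

```python
# Outline of the migration flow described above; the phase names come
# from the deck, the function bodies are placeholder stubs.

def extract(project_db, standards):
    """Read-only scan: collect non-standard values into mapping files."""
    return {"R_ONSHORE_OFFSHORE": {"TIDAL": ""}}  # placeholder result

def review(mappings):
    """Manual phase: a user fills in the replacement standard values."""
    for table in mappings.values():
        for old in table:
            table[old] = table[old] or "OFF SHORE"  # stand-in for user input

def verify(project_db, standards, mappings):
    """Read-only check: return the remaining errors (empty when clean)."""
    return [old for t in mappings.values() for old, new in t.items() if not new]

def update(project_db, mappings):
    """Write phase: the only step needing exclusive access, briefly."""
    print("applied:", mappings)

mappings = extract("project.db", "standards.xlsx")
review(mappings)
while verify("project.db", "standards.xlsx", mappings):
    review(mappings)
update("project.db", mappings)
```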
Standard values
• The client’s standards are defined using two spreadsheets; these are the starting point for each migration of a legacy project.
• The standard values spreadsheet contains a worksheet for each table that holds standard values. These tables include a number of verification code tables (_VC), a number of reference data tables (_R), and other tables, e.g. OW_DATA_SOURCE, that contain standards data. Each worksheet consists of a header row specifying the fields defined by the company’s standards, followed by rows which provide the set of permitted values.
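A sketch of how such a workbook might be loaded, assuming the third-party openpyxl package and an invented file name; the worksheet and table names would in practice come from the client's spreadsheet:

```python
from openpyxl import load_workbook  # third-party: pip install openpyxl

# Hypothetical file name; one worksheet per table, a header row of
# field names, then rows of permitted values, as described above.
wb = load_workbook("standard_values.xlsx", read_only=True)

standards = {}  # table name -> {field name -> set of permitted values}
for ws in wb.worksheets:
    rows = ws.iter_rows(values_only=True)
    header = next(rows)
    permitted = {field: set() for field in header}
    for row in rows:
        for field, value in zip(header, row):
            if value is not None:
                permitted[field].add(value)
    standards[ws.title] = permitted

# e.g. inspect one verification code table (invented name):
print(standards.get("WELL_STATUS_VC"))
```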
Data dictionary
• The data dictionary spreadsheet defines data to be entered into the OW Data_Dict table, specifying the default values and data ranges covered by the company standards.
• It is used to set default values, which may be null, and a range of values that each field must conform to, expressed either as a numeric value range, e.g. ‘0..10e+4’ for CASING_SIZE, or as a set of permitted value strings.
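The two rule forms can be captured in a few lines. In this sketch the '0..10e+4' range syntax is quoted from the deck, while treating value-string sets as comma-separated is an assumption about the spreadsheet layout:

```python
def parse_rule(spec):
    """Turn a data-dictionary rule string into a validator function.

    Numeric ranges use the '0..10e+4' form quoted above; anything else is
    treated as a set of permitted value strings (assumed comma-separated).
    """
    if ".." in spec:
        low, high = (float(part) for part in spec.split("..", 1))
        return lambda v: low <= float(v) <= high
    allowed = {s.strip() for s in spec.split(",")}
    return lambda v: v in allowed

casing_ok = parse_rule("0..10e+4")
print(casing_ok("9.625"))   # True: within the permitted range
print(casing_ok("200000"))  # False: above 10e+4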
Mapping files
• Any project can be brought up to standard by running a command line script during the Extract phase to determine the non-standard values used by the project. This automatic process reviews the project tables, identifying any values currently in use which must be updated to conform to the client’s standards.
• A mapping file (.CSV) is created for each table and its natural key field, containing the values currently used which are not present in the client’s standards.
• An absent mapping file indicates that all current values are already consistent with the standard values. A mapping file initially contains the set of existing values used in the project and forms the interface for a user to enter the standard value to be used instead. When mapping spreadsheets have already been created for a previous project, these can be provided and will be updated during the Extract process; in this case, the spreadsheet will already contain some standard values as recommendations for the project migration.
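A minimal sketch of the Extract output for one table, using invented values and an invented file name; the real script also carries the natural key field and merges in any earlier mapping spreadsheets:

```python
import csv

# Hypothetical inputs: values found in a project table versus the
# permitted set taken from the standards workbook.
project_values = {"TIDAL", "SHALLOW WATER", "OFF SHORE"}
standard_values = {"ON SHORE", "OFF SHORE"}

non_standard = sorted(project_values - standard_values)
if non_standard:
    # One mapping file per table/field; the user fills in the second
    # column during the Review phase.
    with open("R_ONSHORE_OFFSHORE.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["current_value", "standard_value"])
        for value in non_standard:
            writer.writerow([value, ""])
else:
    print("No mapping file needed: all values already conform.")
```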
Review reports
• One of the principal activities the user undertakes, referred to as the Review phase, is to check the mapping spreadsheets and provide the standard values that are to be used in place of the non-standard values presently used.
• Once this has been completed, the user again runs a command line script, which executes Transformation Manager transforms to verify that, once the client’s standard values are applied, the project will be consistent with the standards.
• The output is a set of reports, which can be reviewed in Excel.
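The verification step amounts to checking each completed mapping file against the standard value sets. A sketch, assuming the two-column CSV layout used in the earlier sketch (itself an assumption):

```python
import csv

def verify_mappings(mapping_path, standard_values):
    """Check a completed mapping file: every non-standard value must map
    to a value that exists in the standards. Returns error rows suitable
    for an Excel-readable report."""
    errors = []
    with open(mapping_path, newline="") as f:
        for row in csv.DictReader(f):
            target = row["standard_value"].strip()
            if not target:
                errors.append((row["current_value"], "no replacement given"))
            elif target not in standard_values:
                errors.append((row["current_value"],
                               f"'{target}' is not a standard value"))
    return errors

# e.g. verify_mappings("R_ONSHORE_OFFSHORE.csv", {"ON SHORE", "OFF SHORE"})
```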
Process to completion
• During the Verify stage, the user corrects all non-standard values and data dictionary errors until the reports indicate that a completely consistent project will be produced if the migration proceeds.
• Up to this stage, the project data has been accessed in read-only mode and the project has remained available to all other users. It is not until the final Update phase of the project migration that exclusive access to the project is required.
• Once verification has been completed, the final Update phase is again run from the command line and completes the migration process.
• The Extract, Verify and Update phases implemented using Transformation Manager are completely automatic, and provide warnings if the migration process is not completely compatible with the client’s standards. The main user input is in the Review phase, where a user with knowledge of the project specifies which standard values shall be used to replace any non-standard values presently in use.
• Throughout the migration phases, the user operates in a simple user environment, with minimal involvement of system or database administrators. The project remains online until the final Update phase, which requires exclusive access to the project database for a relatively short duration.
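The Update phase's need for brief exclusive access suggests applying all reviewed mappings in a single transaction. A sketch against SQLite (the client's stores are Oracle-hosted, so this only illustrates the shape of the step; the table and field names are invented):

```python
import sqlite3

def apply_mappings(db_path, table, field, mapping):
    """Update phase sketch: apply the reviewed value replacements in one
    transaction, so the project is either fully migrated or untouched.
    Table/field identifiers come from trusted mapping metadata."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on any exception
            for old, new in mapping.items():
                conn.execute(
                    f"UPDATE {table} SET {field} = ? WHERE {field} = ?",
                    (new, old))
    finally:
        conn.close()

# e.g. apply_mappings("project.db", "well", "onshore_offshore",
#                     {"TIDAL": "OFF SHORE", "DEEP WATER": "OFF SHORE"})
```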
Conclusions
• A business that wants the benefits of corporate data conforming to a defined set of standards can use metadata-driven standardisation tools to implement and maintain those standards in practice.
• Such a programme can also be delivered with a fraction of the effort, time and risk that often undermine even well-supported initiatives.
• Transformation Manager delivers corporate data standards, ensuring they are implemented and enforced while remaining flexible enough to meet changing needs.
• Finally, the migration process can readily be extended in future to other Geoscience areas of interest, and to other types of database store in virtually any sector.
Our offerings: E&P data management
• Transformation Manager software
• Transformation Manager data loader developer kits
• Support, training and mentoring services
• Data loader and connector development
• Data migration packaged services
Why Transformation Manager? For the user:
• Everything under one roof
• Greater control and transparency
• Identify and test against errors iteratively
• Greater understanding of the transformation requirement
• Automatic documentation
• Re-use and change management
• Uses domain-specific terminology in the mapping
Why Transformation Manager? For the business:
• Reduces cost and effort
• Reduces risk in the project
• Delivers higher quality and reduces error
• Increases control and transparency in the development
• Single product
• Reduces time to market
Raising data management standards
www.etlsolutions.com
Contact information
Karl Glenn
kg@etlsolutions.com
+44 (0) 1912 894040