Paper: Testing & Quality Assurance in Data Migration Projects
Authors: Klaus Haller, Florian Matthes, Christopher Schulz
Session: Industry Track Session 3: Evolution and migration
Industry - Testing & Quality Assurance in Data Migration Projects
1. Fakultät für Informatik
Technische Universität München
Testing & Quality Assurance in
Data Migration Projects
Williamsburg, 26th of September 2011
Klaus Haller (2)
Florian Matthes (1)
Christian Neubert (1)
Christopher Schulz (1)
(1) Lehrstuhl I19 (sebis), Fakultät für Informatik, TU München, Garching, Germany
(2) Swisscom IT Services Finance, Testing & Quality Assurance, Zurich, Switzerland
110816_CS_ICSM 2011 1
2. The author team
Software Engineering for Business Information Systems (sebis)
Prof. Dr. Florian Matthes is holder of the chair Software Engineering for
Business Information Systems (sebis) at the TU München, Germany
Research areas in Enterprise Architecture Management & Social Software
Swisscom IT Services Finance
Design, implementation, and operations of IT systems (customer-specific and
standard software) and BPO services for ~190 banking & insurance institutions
The Testing & QA group offers management and technical consulting, test
automation, and testing as a service
Authors
Dr. Klaus Haller (Swisscom IT Services Finance)
Prof. Florian Matthes (TU München)
Christopher Schulz (TU München)
Christian Neubert
PhD student at sebis, primary research area: Web 2.0 Tools, Hybrid Wikis,
Model-driven development
Professional working experience as software engineer in the area of logistics
3. Mastering data migration projects is a
challenging task
"83% of data migrations fail outright or exceed their allotted budgets
and implementation schedules."
[Gartner Group, 2005]
"...current success rate for the data migration portion of projects (that is
those that were delivered on time and on budget) is just 16%."
[Bloor Research, 2007]
"Few companies have the necessary skills to manage, build and
implement a successful data migration."
[Endava, 2007]
4. Definition, drivers, and characteristics of data
migration projects
Data migration
Tool-supported one-time process that migrates formatted data
from a source data structure to a target data structure,
where the two structures differ on a conceptual
and/or technical level
Drivers
Corporate events like mergers and acquisitions or carve-outs
Implementation of novel business models and processes
Technological progress and upgrades
New statutory and regulatory requirements
Characteristics
Recurring replacement or consolidation of existing
business applications
Permanent, although infrequently performed, discipline
Constantly underestimated in size and complexity
5. Research focus
What does a comprehensive process model for migrating data to
relational databases look like?
Which risks frequently occur in the context of data migration
projects, and is there an appropriate classification scheme
that helps to structure them?
Which dedicated testing and risk mitigation techniques address
these risks from a technical and organizational point of
view?
6. Migration programs rest on an architecture
Source staging database
Copy of source database to
uncouple both databases
Transformation database
Stores intermediate results of
the data migration programs
Target staging database
Stores the result of the transformation,
ready for the upload
Data migration program
Transforms and moves the data & its representation from source to target database
Comprises the subprograms extract & pre-filter, transform, and upload
Orchestration component
Ensures the correct starting order of the programs using a timetable-like mechanism
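The components above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the in-memory SQLite databases stand in for real database servers, and the table names, columns, sample rows, and the integer time slots of the timetable are all assumptions made for this example.

```python
import sqlite3


def make_db(ddl):
    """Create an in-memory database with one table (stand-in for a real server)."""
    db = sqlite3.connect(":memory:")
    db.execute(ddl)
    return db


# Hypothetical schemas: the source stores cents, the target a decimal amount.
source = make_db("CREATE TABLE customer (id INTEGER, name TEXT, balance_cents INTEGER, active INTEGER)")
source_staging = make_db("CREATE TABLE customer (id INTEGER, name TEXT, balance_cents INTEGER)")
transformation = make_db("CREATE TABLE customer_clean (id INTEGER, name TEXT, balance_cents INTEGER)")
target_staging = make_db("CREATE TABLE client (client_id INTEGER, full_name TEXT, balance REAL)")
target = make_db("CREATE TABLE client (client_id INTEGER, full_name TEXT, balance REAL)")

source.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [(1, " alice meier ", 12050, 1), (2, "bob kuhn", 990, 0), (3, "carla roth", 500, 1)],
)


def extract_and_prefilter():
    """Copy in-scope rows (here: active customers) into the source staging database."""
    rows = source.execute(
        "SELECT id, name, balance_cents FROM customer WHERE active = 1"
    ).fetchall()
    source_staging.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)


def transform():
    """Clean names, keep the intermediate result, then convert to the target structure."""
    rows = source_staging.execute("SELECT id, name, balance_cents FROM customer").fetchall()
    cleaned = [(i, name.strip().title(), cents) for i, name, cents in rows]
    transformation.executemany("INSERT INTO customer_clean VALUES (?, ?, ?)", cleaned)
    intermediate = transformation.execute(
        "SELECT id, name, balance_cents FROM customer_clean"
    ).fetchall()
    converted = [(i, name, cents / 100) for i, name, cents in intermediate]
    target_staging.executemany("INSERT INTO client VALUES (?, ?, ?)", converted)


def upload():
    """Move the prepared rows from the target staging database into the target."""
    rows = target_staging.execute("SELECT client_id, full_name, balance FROM client").fetchall()
    target.executemany("INSERT INTO client VALUES (?, ?, ?)", rows)


# Timetable-like orchestration: each program is assigned a (hypothetical) time slot.
TIMETABLE = [(3, upload), (1, extract_and_prefilter), (2, transform)]


def run_migration(timetable):
    """Start the programs in slot order and return the names in execution order."""
    executed = []
    for _slot, program in sorted(timetable, key=lambda entry: entry[0]):
        program()
        executed.append(program.__name__)
    return executed


executed = run_migration(TIMETABLE)
migrated = target.execute(
    "SELECT client_id, full_name, balance FROM client ORDER BY client_id"
).fetchall()
```

Because every program reads from and writes to a staging or transformation database, a single program can be rerun or inspected without touching the source or target application.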
7. Migrating data in a stepwise & iterative style
Practice-proven process model
consists of 4 main stages which are
subdivided into 14 distinct phases
1. Initialization, prepares the
necessary infrastructure and
organization
2. Development, implements the
actual data migration programs
3. Testing, validates the
correctness, stability, and
execution time of both the data and
the migration programs
4. Cut-Over, finally switches to the
target application by executing the
migration programs
8. A risk model helps to turn vague migration
fears into concrete risks
Shaped like a house, the model is subdivided into
• business risks often articulated by the customer,
• IT management risks with a technical focus, and
• data migration risks covering issues associated with migration programs
Business and IT management risks are abstract but map onto data migration risks
9. Different testing techniques mitigate the risks
that often emerge in data migration projects
Concrete testing techniques, their explicit mapping onto risks, and
dedicated testing phases assure the quality of data migration projects
10. Systematize the testing-based quality
assurance techniques
Data validation
Combination of automated and manual comparisons to validate completeness,
semantic correctness, and consistency on the structure & data level
Completeness and type correspondence tests
Automated comparison of all data to identify new or missing business objects
Appearance tests
Manual comparison of a selection of business objects on GUI level
Integration tests
Semi-automated tests dedicated to the proper functioning of the target application
with the migrated data in the context of its interlinked applications
Processability test
Test focusing on the coordinated interplay of the target business application and the new data
Partial/full migration run test
Semi-automated validation of the data migration programs in part or entirety
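As an illustration of the automated comparison behind the completeness and type correspondence tests, the sketch below compares business objects by key. The slides do not define the check in detail, so several points are assumptions: business objects are represented as dictionaries keyed by an identifier, "type correspondence" is read as "each migrated object has the target type expected for its source type", and the type names and sample data are invented.

```python
def completeness_test(source_objects, target_objects):
    """Report business objects missing from, or newly appearing in, the target."""
    source_keys, target_keys = set(source_objects), set(target_objects)
    return {
        "missing": sorted(source_keys - target_keys),
        "new": sorted(target_keys - source_keys),
    }


def type_correspondence_test(source_objects, target_objects, expected_type_map):
    """Return keys of migrated objects whose target type is not the expected one."""
    mismatches = []
    for key in sorted(set(source_objects) & set(target_objects)):
        expected = expected_type_map.get(source_objects[key]["type"])
        if target_objects[key]["type"] != expected:
            mismatches.append(key)
    return mismatches


# Hypothetical sample data: key -> business object.
source_objects = {
    101: {"type": "private_customer"},
    102: {"type": "private_customer"},
    103: {"type": "corporate_customer"},
}
target_objects = {
    101: {"type": "retail_client"},
    103: {"type": "retail_client"},   # wrong type: should be business_client
    999: {"type": "retail_client"},   # has no counterpart in the source
}
# Hypothetical mapping from source types to target types.
expected_type_map = {
    "private_customer": "retail_client",
    "corporate_customer": "business_client",
}

completeness = completeness_test(source_objects, target_objects)
type_mismatches = type_correspondence_test(source_objects, target_objects, expected_type_map)
```

With this sample data, object 102 is reported as missing, object 999 as new, and object 103 as a type mismatch. In a real project the key sets would come from queries against the source and target databases rather than from in-memory dictionaries.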
11. Each data migration risk is covered by a
different set of testing techniques
Risk                Testing technique
Stability           Partial/full migration run test
Corruption          Appearance test, processability test, integration test
Semantics           Appearance test, processability test, integration test
Completeness        Completeness & type correspondence test
Execution risk      Full migration run test
Orchestration risk  Partial/full migration run test
Dimensioning        Partial/full migration run test
Interference        Operational risk, no testing
Parameterization    Appearance test, processability test, integration test, completeness & type correspondence test
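The risk-to-technique mapping can also be kept as a small lookup structure, for example to check which risks a planned test phase covers. The sketch below is one possible encoding, not part of the original material: the dictionary layout and function names are assumptions, and "partial/full migration run test" is split into two separate entries for illustration.

```python
# Risk -> set of testing techniques, transcribed from the table above.
RISK_TO_TESTS = {
    "Stability": {"partial migration run test", "full migration run test"},
    "Corruption": {"appearance test", "processability test", "integration test"},
    "Semantics": {"appearance test", "processability test", "integration test"},
    "Completeness": {"completeness & type correspondence test"},
    "Execution risk": {"full migration run test"},
    "Orchestration risk": {"partial migration run test", "full migration run test"},
    "Dimensioning": {"partial migration run test", "full migration run test"},
    "Interference": set(),  # operational risk, not addressed by testing
    "Parameterization": {
        "appearance test",
        "processability test",
        "integration test",
        "completeness & type correspondence test",
    },
}


def risks_covered_by(technique):
    """Return the risks (sorted by name) that a given testing technique mitigates."""
    return sorted(risk for risk, tests in RISK_TO_TESTS.items() if technique in tests)


def uncovered_risks(planned_techniques):
    """Return the risks for which none of the planned techniques applies."""
    planned = set(planned_techniques)
    return sorted(risk for risk, tests in RISK_TO_TESTS.items() if not tests & planned)


appearance_risks = risks_covered_by("appearance test")
full_run_risks = risks_covered_by("full migration run test")
gaps = uncovered_risks(["appearance test", "full migration run test"])
```

For a test plan containing only appearance tests and full migration runs, `gaps` lists Completeness and Interference, the latter because the table assigns it no testing technique at all.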
12. Project management-based quality assurance
Involve an external data migration team
Experienced specialists bring in methodologies, tool support, and know-how
Reduce IT management risks of extended delays and overspends
Exercise due diligence while performing project scoping
Careful scoping in strategy phase applying source-push or target-pull principles
Eliminate risk of data and transformation loss
Apply a data migration platform
Scalable and reusable platform ensures independence from source & target
database while providing increased migration leeway for testing measures
Mitigate risk of corruption and instability
Prevent budget and time overruns
Reduce risk of interference between the migration teams
Reduce parameterization risk
13. Project management-based quality assurance
Thoroughly analyze and cleanse data
In-depth analysis helps to understand the data's semantics & structure and to
assess the migration project's characteristics more accurately
Prevent project delays and budget overruns
Mitigate the risks of corruption
Reduce performance and stability risk for target application
Bring down the risk of unstable data migration programs
Migrate in an incremental and iterative manner
Early and regular generation of migration results ensures high project
traceability and allows for frequent adjustments
Reduce risk of project failure
14. Summary and outlook
To deliver a data migration project on time and on budget, a stringent approach,
proactive risk mitigation techniques, and distinct test activities are required
This contribution…
outlines a practice-proven process model describing how to proceed when
shifting data from a source to a target database
introduces and classifies dedicated risk mitigation techniques and project
management practices helping to assure the quality in data migration projects
Future directions
Empirically evaluate process model, risk mitigation, and project management
techniques in practice
Examine the case where several source databases have to be consolidated
resulting in data migration series
Enhance process model with additional data harmonization activities
Identify alternative versions of the model and techniques for NoSQL databases
15. Thank you for your attention!
Any questions?
Contact
Christian.Neubert@in.tum.de
Christopher.Schulz@in.tum.de
Klaus.Haller@swisscom.com
Further information
http://wwwmatthes.in.tum.de/wikis/sebis/mergers-and-acquisitions
http://finance.swisscom.com/