Building a mind map for test data management.
In the era of big data and cloud computing, most applications under test (AUT) contain huge
amounts of interrelated data. Test data plays a major role in the success of any project, and
extracting the right data from a large volume is always a tedious task. Test Data Management
(TDM) consists of provisioning the required test data efficiently and effectively while
ensuring compliance with regulatory and organizational standards. Managing test data
effectively not only reduces redundancy but also helps to improve the entire testing process
in an organized and planned manner.
Sometimes we need to deliver really quickly. To achieve reliable test results in a short time
frame, I have applied different techniques. A mind map is a good fit here because it lets us
visualize the most important tasks to be done. This article is all about creating an
effective mind map for TDM.
From my personal experience, apart from creating the test plan, test cases and test scripts,
we spend a considerable amount of time preparing test data. In fact, it takes around 55-60%
of QA effort. Building on the traditional approach to test data generation, we followed a
hybrid approach to minimize the effort spent on creating and generating test data.
• Identify the challenges and constraints behind managing and maintaining high volumes of
test data.
• Analysis of the best practices which can be followed to ensure effective maintenance of
test data.
• Limitations of the solutions being suggested.
• Recommendations on future scope of improvement in Test Data Management.
Software Testing Life Cycle (STLC) and its dependency on test data
Test data is closely related to every stage of the STLC. Just think: if we don't have
appropriate data for test planning, test case development or automated scripts, how reliable
and effective will the entire testing process be?
Beyond test planning and creation, test data plays a vital role in test execution. This
also includes QA report generation and regression testing of any future iteration of the
application.
Challenges for Effective TDM
• In most applications, an extremely high volume of data is used.
• Different modules of the same application may require different sets of data.
• In most scenarios, many applications interface with each other, which increases their
dependency on the same sets of data. This in turn increases the volume of data to be
generated.
• Amount of effort required to create data that have to be used across multiple releases and
• Centralized database is generally used for many applications. So maintenance becomes
• Defect injection ratio goes up due to manual errors.
• Faulty selection of test data.
• Increase in rework effort as a result of lack of proper TDM.
• Increase in cost to the client as rework effort increases.
• Dissatisfied clients.
Best practices followed in the industry for effective TDM
1. Test data source
2. Extract or create data
3. Transform data
4. Provision
5. Final test data
1. Test data source - The first step in generating appropriate test data is to collect
relevant information about the test data source. This can be achieved with the following
techniques:
• Data identification - Accurate test data can be identified based on different attributes,
such as physical, relational, location, value and conceptual attributes.
• Data dependencies - A data dependency is a situation in which a program statement refers
to the data of a preceding statement. The types to consider are true data dependency,
anti-dependency and output dependency.
• Data types - Choose different types of data, including integer, Boolean, character and
floating-point data types.
• Data requirements - It is important to understand the requirements thoroughly before
generating test data. Important data sets include the valid data set, no data, the invalid
data set, illegal data format, the boundary condition data set and the data set for load
testing.
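The data set categories listed above can be sketched for a single integer input field. This is a minimal illustration, not a method from the case study: `IntField`, `build_data_sets` and `is_valid` are hypothetical names, and the field is assumed to accept whole numbers within an inclusive range of at least two values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntField:
    """Hypothetical spec for an integer input field with inclusive bounds."""
    name: str
    minimum: int
    maximum: int


def build_data_sets(field: IntField, load_size: int = 1000) -> dict:
    """Return one list of raw input values per data set category."""
    lo, hi = field.minimum, field.maximum
    span = hi - lo + 1
    return {
        "valid": [lo + span // 2],               # a typical in-range value
        "no_data": [None, ""],                   # missing or empty input
        "invalid": [lo - 1, hi + 1],             # well-formed but out of range
        "illegal_format": ["abc", "12.5x", "1e", " "],  # not parseable as int
        "boundary": [lo, lo + 1, hi - 1, hi],    # edges of the allowed range
        "load": [lo + (i % span) for i in range(load_size)],  # bulk valid data
    }


def is_valid(field: IntField, raw) -> bool:
    """Reference check used to confirm each category behaves as intended."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return False
    return field.minimum <= value <= field.maximum


age = IntField("age", 18, 65)
data_sets = build_data_sets(age, load_size=100)
```

Keeping the categories in one dictionary makes it easy to check that every category is covered before test cases are written against the field.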
2. Extract/Create - Once we have identified the source of test data, the next step is to
extract the data. If the project is starting from scratch, we should create the test data
instead. The following methods can be applied:
• Data selection - Before developing new data, identify the data sets that already exist.
Select data from the production database and data within the test scope. To reduce human
error, it is better to automate the data selection process.
• Data mapping - While creating test data, we can use data mapping to create data element
mappings between two distinct data models. Tasks associated with data mapping are
identification of data relationships, discovery of hidden data and elimination of redundant
data.
• Data mining - It allows the user to analyze data from different perspectives using anomaly
detection, dependency modeling, clustering, classification, regression and sequential
pattern mining. It is beneficial to use automation to correlate data among dozens of fields
in large relational databases.
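As a rough sketch of the data mapping step, the snippet below renames fields from a source model to a target model and drops redundant rows. The column names in `FIELD_MAP` and both helper functions are assumptions made for illustration only.

```python
# Hypothetical mapping from source column names to target column names.
FIELD_MAP = {"cust_id": "customer_id", "fname": "first_name", "lname": "last_name"}


def map_record(source: dict, field_map: dict) -> dict:
    """Rename mapped fields; source fields without a mapping are dropped."""
    return {target: source[src] for src, target in field_map.items() if src in source}


def map_and_deduplicate(rows, field_map, key):
    """Map every row and keep only the first row seen for each key value."""
    seen = set()
    result = []
    for row in rows:
        mapped = map_record(row, field_map)
        key_value = mapped.get(key)
        if key_value in seen:
            continue  # redundant row: same key already emitted
        seen.add(key_value)
        result.append(mapped)
    return result


source_rows = [
    {"cust_id": 1, "fname": "Ann", "lname": "Lee", "tmp_flag": "x"},
    {"cust_id": 1, "fname": "Ann", "lname": "Lee", "tmp_flag": "y"},
    {"cust_id": 2, "fname": "Bob", "lname": "Ray"},
]
target_rows = map_and_deduplicate(source_rows, FIELD_MAP, key="customer_id")
```

Here the second row is treated as redundant because it shares a key with the first; a real project would choose the deduplication key from the data relationships it has identified.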
3. Transform - The next step is to transform the extracted data into a suitable form. The
following techniques should be applied during data transformation:
• Data security - Apply data security rules to minimize the data collected and limit access
levels. Use encryption methods. Always create a backup of test data and set up data recovery.
• Data masking - It is also advisable to mask data fields for additional security.
Substitution, number and date variance, shuffling, data encryption, nulling out/deletion and
masking out are some of the techniques that can be applied to mask test data.
• Data preparation - Real-world data is inconsistent. Before using it as test data, we
should prepare it by applying data discretization, data cleaning, data integration, data
transformation and data reduction techniques.
• Data privatization - Because data in the test environment is less secure, we can add
table-based randomization and character-based randomization to make it more secure.
• Data subsetting - Define and create a subset of a large database by importing and
exporting subset templates and dumps for use in the test database.
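Several of the masking techniques named above (substitution, number and date variance, shuffling, nulling out and masking out) can be sketched with the Python standard library. The function names are hypothetical, and the variance percentages and visible-character counts are assumptions for illustration; data encryption is left out because it needs a vetted cryptography library.

```python
import random
from datetime import date, timedelta


def substitute(value, replacements, rng):
    """Substitution: replace the real value with one from a lookup list."""
    return rng.choice(replacements)


def vary_number(value: float, pct: float, rng) -> float:
    """Number variance: shift the value by a random fraction within +/- pct."""
    return value * (1 + rng.uniform(-pct, pct))


def vary_date(value: date, max_days: int, rng) -> date:
    """Date variance: shift the date by up to max_days in either direction."""
    return value + timedelta(days=rng.randint(-max_days, max_days))


def shuffle_column(values, rng):
    """Shuffling: reorder a column so values no longer match their rows."""
    shuffled = list(values)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled


def null_out(value):
    """Nulling out: remove the value entirely."""
    return None


def mask_out(value: str, visible: int = 4, mask_char: str = "X") -> str:
    """Masking out: hide all but the last `visible` characters."""
    hidden = max(len(value) - visible, 0)
    return mask_char * hidden + value[hidden:]


rng = random.Random(42)  # fixed seed so masked test data is reproducible
masked_card = mask_out("4111111111111111")
masked_names = shuffle_column(["Ann", "Bob", "Cid", "Dee"], rng)
masked_birthday = vary_date(date(1990, 5, 17), 30, rng)
```

Passing a seeded `random.Random` instance keeps the masked output repeatable between test runs, which helps when defects need to be reproduced.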
4. Provision - When data transformation is complete, we should provision the data,
delivering polished data for use in the staging environment. Provisioning can be done by
applying data migration and data validation techniques.
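One simple way to validate a migration is to compare a row count and a checksum of the same table in the source and the target. The sketch below uses SQLite from the Python standard library purely for illustration; `table_fingerprint` and `validate_migration` are hypothetical helpers and not the validation tool used in the case study.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table: str, key_column: str):
    """Return (row_count, SHA-256 digest) over all rows ordered by key_column.

    Table and column names are interpolated into the SQL text, so they must
    come from trusted code, never from user input.
    """
    query = f'SELECT * FROM "{table}" ORDER BY "{key_column}"'
    rows = conn.execute(query).fetchall()
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return len(rows), digest.hexdigest()


def validate_migration(source, target, table: str, key_column: str) -> bool:
    """True when both databases hold identical rows for the given table."""
    return (table_fingerprint(source, table, key_column)
            == table_fingerprint(target, table, key_column))


# Usage: two in-memory databases loaded in a different order still match.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for conn, rows in ((source, [(1, "IC-100"), (2, "IC-200")]),
                   (target, [(2, "IC-200"), (1, "IC-100")])):
    conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, sku TEXT)")
    conn.executemany("INSERT INTO parts VALUES (?, ?)", rows)
migration_ok = validate_migration(source, target, "parts", "id")
```

Ordering by a key column before hashing makes the comparison independent of insertion order, so only real differences in content cause a mismatch.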
5. Final test data - This step includes data refresh, data maintenance and actionable test
data. Test data should be refreshed two or three times in each monthly or quarterly release
cycle. The project should also have a configuration management plan, which preserves the
data for future reference. It includes:
• Data refresh - Move test data to the target, load data in production, back up and notify.
• Data maintenance - Create reusable data, maintain version tracking, actionable test data.
• Configuration management - Crawl through data, test data for future referral.
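Version tracking for reusable data sets can be as light as recording a content hash each time a data set changes. The sketch below is one possible approach under that assumption; `dataset_version` and `DataSetRegistry` are hypothetical names, and records are assumed to be JSON-serializable dictionaries.

```python
import hashlib
import json


def dataset_version(records) -> str:
    """Derive a short version id from the canonical JSON form of the records."""
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


class DataSetRegistry:
    """Keeps the ordered list of versions seen for each named data set."""

    def __init__(self):
        self._history = {}  # data set name -> list of version ids

    def register(self, name: str, records) -> bool:
        """Record a new version; return False if the content is unchanged."""
        version = dataset_version(records)
        history = self._history.setdefault(name, [])
        if history and history[-1] == version:
            return False
        history.append(version)
        return True

    def history(self, name: str):
        """Return a copy of the version ids recorded for the data set."""
        return list(self._history.get(name, []))


registry = DataSetRegistry()
registry.register("customers", [{"id": 1, "name": "Ann"}])
registry.register("customers", [{"id": 1, "name": "Ann"}])   # unchanged, ignored
registry.register("customers", [{"id": 1, "name": "Anna"}])  # new version
```

Because the version id depends only on content, a refresh that reloads identical data does not create a new version, which keeps the history meaningful across release cycles.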
Benefits of TDM, and how a mind map made it simpler!
•Cuts down the amount of work that needs to be done
•See the whole picture of TDM at once
•Improved data quality and thus improved testing quality
•Considerable effort reduction
•Improved defect detection due to correct data
•Reliable and better quality
•Focus on what is required for better result
•Accurate test data sets
•Up to 80% reduction in test environment storage capacity requirements.
•Up to 50% reduction in test environment CPU requirements.
TDM Mindmap: A case study
Process of manually selecting the test data
•Many large databases of integrated circuits were provided by the client.
•We had to select the test data manually from those databases.
•Integrated circuit specifications were selected based on SKU, model number, frequency
range and other parameters.
•These product specifications, along with product descriptions, images and some unique
parameters, needed to be put into the unified database.
•All products had to be indexed accurately, so that it can be accessed through parametric
Using a mind map to create the TDM layout
We used a mind map to research and work out the correct process for managing test data.
Because of the nature and complexity of the project, we automated test data generation and
management with DBUnit and our customized data migration validation tool. In our mind map,
all the steps and associated branches are added for easier reference.
•With our hybrid approach, we were able to identify defects early in the development stage.
It not only saved a great deal of time and resources but also helped us accelerate the
entire process.
•The automated process and correct TDM process reduced cost to the organization by 40-50%.
•Only the necessary test data was created, so redundancy and confusion were avoided.
•Data security was increased marginally with well-defined data security policy.
•Overall improvement of testing process.