The document discusses QuerySurge, an automated data testing solution that helps verify data quality and find errors. It notes that traditional data quality tools focus on profiling, cleansing and monitoring data, while QuerySurge also enables data testing through easy-to-use query wizards and comparison of source and target data without SQL coding. QuerySurge allows collaborative testing across teams and platforms, integrates with development tools, and can significantly reduce testing time and improve data quality.
3. Business Intelligence (BI) software
CxOs are using Business Intelligence & Analytics to make critical business decisions
– with the assumption that the underlying data is fine.
“The average organization loses $8.2 million annually through poor Data Quality.” - Gartner
The Executive Office and Critical Data
[Diagram: Data Architecture, with ETL among the potential problem areas]
4. Current Business Case for Data Testing
built by
QuerySurge™
“46% of companies cite data quality as a barrier for adopting Business Intelligence products” - InformationWeek
“On average, U.S. organizations believe 32% of their data is inaccurate” - Experian Data Quality research report
“Poor data quality is a primary reason for 40% of all business initiatives failing to achieve their targeted benefits” - analyst firm Gartner
“90% of U.S. companies have some sort of data quality solution in place today” - Experian Data Quality research report
Data quality solutions are not enough!
5. Data Quality tools vs. Data Testing tool
Primary Characteristics of Data Quality tools
(courtesy of Gartner’s “Magic Quadrant for Data Quality Tools”)
o Profiling
o Parsing and standardization
o Generalized Cleansing
o Matching
o Monitoring
o Enrichment
o Subject-area-specific support
o Metadata management
o Configuration environment
Data Verification & Validation?
Primary Characteristics of Data Testing tools
(courtesy of the book "Testing the Data Warehouse Practicum")
o Data Completeness
o Data Transformation
o Regression Testing
Data Verification & Validation?
8. Method #1: Stare & Compare
• Review Business Rules (i.e. Mapping Document: data flow mapping, data movement requirements)
• Write Tests in SQL editor
• Execute 2 Tests: 1 at Source & 1 at Target
• Dump results to 2 Excel files
• Compare results by eye (‘Stare & Compare’ or ‘sampling’)
Issue with Stare & Compare:
Impossible to visually compare billions of data sets.
Result: usually less than 1% of data is compared
Example:
Current QuerySurge customer has:
• a single test with 100 million rows & 200 columns
• = 20 billion data sets
• the client has > 7,000 total tests
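The arithmetic behind this example can be checked with a short sketch. The row and column counts come from the slide; the number of rows a tester might inspect by eye (`rows_checked_by_eye`) is a made-up figure used only to illustrate how small the coverage becomes.

```python
# Figures from the slide: one test with 100 million rows and 200 columns.
rows = 100_000_000
columns = 200

# Every row/column intersection is one value that would need to be compared.
total_values = rows * columns
print(f"{total_values:,} values in a single test")

# Hypothetical assumption: a tester manages to eyeball 10,000 full rows.
rows_checked_by_eye = 10_000
coverage = (rows_checked_by_eye * columns) / total_values
print(f"share of the data actually compared: {coverage:.4%}")
```

Even with a generous manual sample, the share of data compared stays far below the 1% quoted on the slide.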
9. Method #2: Minus Queries
MINUS QUERIES subtract one result set from another result set to show the difference
• Write 2 MINUS queries in SQL editor
• Execute MINUS queries 2x
Minus Query #1: Table_1 MINUS Table_2 (Result Set #1)
Minus Query #2: Table_2 MINUS Table_1 (Result Set #2)
ISSUES with MINUS QUERIES
• MINUS QUERIES need to be executed 2x (Source MINUS Target; Target MINUS Source)
• Result sets may not be accurate when dealing with duplicate rows of data
• No historical data from past testing – audit and regulatory issues
• Processing of minus queries puts pressure on the servers
• Double execution means 2x testing time and resource utilization
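The two-direction minus query and its duplicate-row blind spot can be illustrated with a minimal sketch. It uses an in-memory SQLite database as a stand-in for real source and target systems; SQLite spells MINUS as EXCEPT. The table contents are invented for illustration, and the `Counter` comparison at the end is just one possible way to surface duplicates, not a method described in the slides.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
# table_1 stands in for the source, table_2 for the target (sample data is made up).
conn.executescript("""
    CREATE TABLE table_1 (id INTEGER, amount INTEGER);
    CREATE TABLE table_2 (id INTEGER, amount INTEGER);
    INSERT INTO table_1 VALUES (1, 100), (2, 200), (2, 200), (3, 300);
    INSERT INTO table_2 VALUES (1, 100), (2, 200), (3, 999);
""")

# Minus Query #1: rows in table_1 that are not in table_2.
result_set_1 = conn.execute(
    "SELECT * FROM table_1 EXCEPT SELECT * FROM table_2").fetchall()
# Minus Query #2: rows in table_2 that are not in table_1.
result_set_2 = conn.execute(
    "SELECT * FROM table_2 EXCEPT SELECT * FROM table_1").fetchall()
print(result_set_1)  # [(3, 300)]
print(result_set_2)  # [(3, 999)]

# EXCEPT works on distinct rows, so the extra copy of (2, 200) in table_1
# appears in neither result set. Counting rows exposes it.
source_counts = Counter(conn.execute("SELECT * FROM table_1").fetchall())
target_counts = Counter(conn.execute("SELECT * FROM table_2").fetchall())
source_only = source_counts - target_counts
print(source_only[(2, 200)])  # 1
```

Both queries have to run to see both sides of the mismatch on `id = 3`, and neither reports the duplicated row, which matches the issues listed on the slide.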
10. Data Testing Compare Methods: 2 issues
1) There is a fundamental issue with both current methods:
The assumption that all team members can write SQL/HQL code
2) Neither method fully satisfies any of the conditions below:
Data Completeness
Data Transformation
Regression Testing
12. What is QuerySurge™?
The collaborative Data Testing solution that finds bad data & provides a holistic view of your data’s health
13. the QuerySurge advantage
Automate the entire testing cycle
Automate the launch, tests, comparison, auto-emailed results
Create Tests easily with no SQL programming
Query Wizards ensure minimal time & effort to create tests
Test across different platforms
Data Warehouse, Hadoop, NoSQL, database, flat file, XML
Collaborate with team
Data Health dashboard, shared tests & auto-emailed reports
Verify more data & do it quickly
Verifies up to 100% of all data, up to 1,000x faster
Integrate for Continuous Delivery (DevOps)
Integrates with most Build, ETL & QA management software
15. How QuerySurge Works
• QS pulls data from data sources
• QS pulls data from target data store
• QS compares data quickly
• QS generates reports, audit trails
Reports, Data Health Dashboard, auto emails
[Diagram: Source Data and Target Data, queried with SQL / HQL]
Data Stores
• Databases
• Data Warehouses
• Data Marts
Flat Files
• Fixed Width
• Delimited
• Excel
Big Data stores
• Hadoop
• NoSQL
Data Warehouses
XML
Web Services
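The pull-and-compare flow on this slide can be sketched in a few lines. This is an illustration of the general idea only, not QuerySurge's implementation: two in-memory SQLite databases stand in for the source and target stores, and the table names, sample rows, and the helpers `pull` and `compare_by_key` are all hypothetical.

```python
import sqlite3

def pull(conn, query):
    """Run a query and return its rows keyed by the first column."""
    return {row[0]: row[1:] for row in conn.execute(query)}

def compare_by_key(source_rows, target_rows):
    """Report keys missing on either side and rows whose values differ."""
    report = {"missing_in_target": [], "missing_in_source": [], "mismatched": []}
    for key, values in source_rows.items():
        if key not in target_rows:
            report["missing_in_target"].append(key)
        elif target_rows[key] != values:
            report["mismatched"].append((key, values, target_rows[key]))
    report["missing_in_source"] = [k for k in target_rows if k not in source_rows]
    return report

# Stand-ins for a source system and a target data warehouse.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT, balance INTEGER);
    INSERT INTO customers VALUES (1, 'Ann', 100), (2, 'Bob', 200), (3, 'Cy', 300);
""")
target.executescript("""
    CREATE TABLE dim_customer (id INTEGER, name TEXT, balance INTEGER);
    INSERT INTO dim_customer VALUES (1, 'Ann', 100), (2, 'Bob', 250), (4, 'Dee', 400);
""")

# One query per side, then a single comparison pass over the results.
report = compare_by_key(
    pull(source, "SELECT id, name, balance FROM customers"),
    pull(target, "SELECT id, name, balance FROM dim_customer"),
)
print(report)
```

Unlike the minus-query approach, each side is queried once and the comparison happens outside the database servers; the resulting report could then be stored or emailed.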
16. Data Process: Developer & Tester
Developer: Codes data movement based on Business Requirements
Tester: Tests data movement based on Business Requirements
[Diagram: Source Data and Big Data, ETL Process, Target DWH, Business Intelligence]
17. Collaboration
Testers
- functional testing
- regression testing
- result analysis
Developers / DBAs
- unit testing
- result analysis
Data Analysts
- review, analyze data
- verify mapping failures
Operations teams
- monitoring
- result analysis
Managers
- oversight
- result analysis
Share information on the
19. QuerySurge™ Modules
Design Library
• Create Query Pairs (source & target SQLs)
• Great for team members skilled with SQL
Scheduling
• Build groups of Query Pairs
• Schedule Test Runs
20. QuerySurge™ Modules
Deep-Dive Reporting
• Examine and automatically email test results
Run Dashboard
• View real-time execution
• Analyze real-time results
21. QuerySurge™ Modules
• View data reliability & pass rate
• Add, move, filter, zoom-in on any data widget & underlying data
• Verify build success or failure
22. QuerySurge™ Modules
Fast and Easy. No programming needed.
• Perform 80% of all data tests - no SQL coding needed
• Opens up testing to novices & non-technical team members
• Speeds up testing for skilled SQL coders
• Provides a huge Return-On-Investment
23. QuerySurge Test Management Connectors
Integration with leading Test Management Solutions
• HP ALM (Quality Center)
• Microsoft Team Foundation Server
• IBM Rational Quality Manager
Drive QuerySurge execution from your Test Management Solution
See QuerySurge Pass/Fail results in your Test Management solution
Click link to drill into detailed results in QuerySurge
24. QuerySurge & DevOps: Continuous Delivery & Integration
Automated Launch, Automated Testing, Automated Reporting
[Diagram: QuerySurge™ alongside Automated Build solutions, Data Integration/ETL solutions, Test Management solutions, and many others…, each with an email report]
25. QuerySurge’s Impact
• Reduce your costs & risks
• Improve your data quality
• Accelerate your testing cycles
• Share information with your team
• Realize a huge ROI (like 1,600%)
27. About
RTTS is the parent company of QuerySurge and is the premier pure-play QA & Testing organization that specializes in test automation
FACTS
Founded: 1996
Headquarters: Manhattan, New York
Customer profile:
• Fortune 1000
• 600+ customers
Strategic Partners: IBM, Microsoft, HP, Oracle, Teradata, HortonWorks, Cloudera, MongoDB
Software Division: QuerySurge
QuerySurge provides insight into the health of your data across your organization through BI dashboards and reporting. It is a collaborative tool that distributed teams can share, and it provides a holistic view of your data’s health and of the maturity of your organization’s data management.
QuerySurge finds bad data by natively connecting to any data source, whether a database, flat file, or XML, and to any data target, whether a database, file, XML, data warehouse, or Hadoop implementation.
QuerySurge pulls data from the source and the target and compares them very quickly (typically in a few minutes) and then produces reports that show every data difference, even if there are millions of rows and hundreds of columns in the test. These reports can be automatically emailed to your team.
You can pick from a multitude of reports or export the results so that you can build your own reports.
Your distributed team from around the world can use any of these web browsers: Internet Explorer, Chrome, Firefox and Safari.
Installs on Windows and Linux.
QS connects to any JDBC-compliant data source, even if it is not listed here.
QuerySurge can be used by active practitioners such as testers & developers to create and launch tests, or by managers, analysts and operations to view data test results and the overall health of the data. QuerySurge facilitates this by providing 2 types of licenses: (1) full user & (2) participant user.
(1) Full User – This type of user has unlimited access to create QueryPairs, Suites, and Scenarios. This user can also schedule and run tests, see results, run and export reports, and export data. Perfect for anyone creating and/or running data tests while performing analysis of results.
(2) Participant User – This user cannot create or run tests, but has access to all other information - including viewing all query pairs, results, and reports, receiving email notifications, and exporting test results and reports. Perfect for managers, analysts, architects, DBAs, developers, and operations users who need to know the health of their data.
QuerySurge helps your team coordinate your data quality initiatives while speeding up your development and testing cycles and finding your bad data. Why risk having your team identify trends and develop strategic initiatives when the underlying data is incorrect? QuerySurge reduces this risk.