SlideShare une entreprise Scribd logo
1  sur  29
1
Presenter
Bill Hayduk
Founder / President
Presenter
Jeff Bocarsly, Ph.D.
Senior Architect
built bybuilt by
QuerySurge™
built by
The average organization loses $8.2 million annually
through poor Data Quality.
- Gartner
46% of companies cite Data Quality as a barrier
for adopting Business Intelligence products.
- InformationWeek
The cost per patient data of Phase 3 clinical studies of
new pharmaceuticals exceeds $26,000.
- Journal of Clinical Research Best Practices
built by
QuerySurge™
Pharma’s
2 Largest
Data Warehousing Concerns
built by
QuerySurge™
Pharma’s Largest Data Warehouse Concerns
(1) Data Integrity (2) Compliance
built by
QuerySurge™
(1) Data Integrity
high risk of defects that are not readily visible
Missing Data
Truncation of Data
Data Type Mismatch
Null Translation errors
Incorrect Type Translation
Misplaced Data
Extra Records
Transformation Logic Errors/Holes
Simple/Small Errors
Sequence Generator errors
Undocumented Requirements
Not Enough Records
built by
Pharma’s Data Warehouse Concerns
QuerySurge™
Pharma’s Data Warehouse Concerns
(2) Compliance
Need to comply with Part 11 mandates
historical test information test version history
test execution data:
who, what & when
test cycle information
visibility of assets archived test results
built by
QuerySurge™
Why is this Important?
 Periodic data reporting to FDA
 Periodic data reporting to int’l
bodies
(1) Data Integrity (2) Compliance
 FDA announced audits
 Unannounced FDA audits
Consequences
Severe financial and
business
built by
QuerySurge™
Pharma’s
Testing & Reporting
Needs
built by
QuerySurge™
 automate the manual testing of data
 compare millions of rows of data quickly
 flag mismatches and inconsistencies in data sets
 provide flexibility in scheduling test runs
 generate informative reports that can easily be shared
with the team
 validate up to 100% of all of all data, mitigating the risk
Data Integrity needs
Need a testing solution that can…
built by
QuerySurge™
Part 11 Reporting needs
 track test history
 provide reporting on test version history
 record all test execution by testing owner’s name
and date
 deliver auditable reports of test cycles
 store all test outcomes and test data
 offer a read-only user for reviewing test assets
 support archiving of results
Need a testing solution that can…
built by
QuerySurge™
built by
The solution…
built by
QuerySurge™
What is QuerySurge™?
the collaborative
Data Testing solution that
finds bad data & provides
a holistic view of your
data’s health
built by
QuerySurge™
• Reduce your costs & risks
• Improve your data quality
• Accelerate your testing cycles
• Share information with your team
with QuerySurge™ you can:
built by
QuerySurge™
• Provides huge ROI (i.e. 1,300%)*
*based on client’s calculation of Return on Investment
Finding Bad Data
SQL
HQL
SQL
HQL
SQL
SQL
 QS pulls data from data sources
 QS pulls data from target data store
 QS compares data quickly
 QS generates reports, audit trails
How?
Reports, Data Health Dashboard
built by
QuerySurge™
QuerySurge™ Architecture
Web-based…
Installs on...
Linux
Connects to…
…or any other JDBC compliant data source
built by
QuerySurge™
QuerySurge
Controller
QuerySurge
Server
QuerySurge
Agents
Flat Files
Collaboration
Testers
- functional testing
- regression testing
- result analysis
Developers / DBAs
- unit testing
- result analysis
Data Analysts
- review, analyze data
- verify mapping failures
Operations teams
- monitoring
- result analysis
Managers
- oversight
- result analysis
Share information on the
built by
QuerySurge™
the QuerySurge advantage
built by
QuerySurge™
Automate the entire testing cycle
 Automate kickoff, tests, comparison, auto-emailed results
Create Tests easily with no SQL programming
 ensures minimal time & effort to create tests / obtain results
Test across different platforms
 data warehouse, Hadoop, NoSQL, database, flat file, XML
Collaborate with team
 Data Health dashboard, shared tests & auto-emailed reports
Verify more data & do it quickly
 verifies up to 100% of all data up to 1,000 x faster
Integrate for Continuous Delivery
 Integrates with most Build, ETL & QA management software
built by
QuerySurge™
QuerySurge™ Modules
Design Tests
SchedulingDeep-Dive Reporting
Run Dashboard
Query WizardsData Health Dashboard
Fast and Easy.
No programming needed.
built by
QuerySurge™
QuerySurge™ Modules
Compare by Table, Column & Row
• Perform 80% of all data tests
•Automatically generates SQL code
• Opens up testing to novice & non-
technical team members
• Speeds up testing for skilled SQL coders
• provides a huge Return-On-Investment
built by
QuerySurge™
QuerySurge™ Modules
3 Types of Data Comparison Wizards:
The also provide you with automated features for:
o filtering (‘Where’ clause) and
o sorting (‘Order By’ clause)
Column-Level Comparison:
This is great for Big Data stores and Data Warehouses where tables will have some columns
containing transformations and some columns with no transformations. Many tables and
columns can be compared simultaneously and quickly.
Table-Level Comparison:
This comparator is great for Data Migrations and Database Upgrades with no
transformations at all. Many tables can be compared simultaneously and quickly.
Row Count Comparison:
Great for all - Big Data stores, Data Warehouses, Data Migrations and Database Upgrades.
Many tables and rows can be compared simultaneously and quickly.
Design Library
 Create custom Query Pairs (source & target SQLs)
 Great for team members skilled with SQL
QuerySurge™ Modules
Scheduling
 Build groups of Query Pairs
 Schedule Test Runs for:
• immediately
• at a specific date/time
• automatically after build or
ETL process
built by
QuerySurge™
Deep-Dive Reporting
 Examine and automatically
email test results
Run Dashboard
 View real-time execution
 Analyze real-time results
QuerySurge™ Modules
built by
QuerySurge™
built by
QuerySurge™
• view data reliability & pass rate
• add, move, filter, zoom-in on any data
widget & underlying data
• verify build success or failure
Test Management Connectors
built by
QuerySurge™
 Drive QuerySurge execution from your Test Management Solution
 Outcome results (Pass/Fail/etc.) are returned from QuerySurge to your Test Management Solution
 Results are linked in your Test Management Solution so that you can click directly into detailed QuerySurge
results
• HP ALM (Quality Center)
• Microsoft Team Foundation Server
• IBM Rational Quality Manager
Integration with leading
Test Management Solutions
Case Study
Fortune 500 firm:
Clinical Trial Data
built by
QuerySurge™
Case Study: Fortune 500 Pharma
Challenge
How can a Data Warehouse team assure data
integrity over multiple builds when the cost per patient
data of Phase 3 clinical studies exceeds $26,000 and
volume of live case data is > 1 TB?
Strategy
Implement QuerySurge™ to dramatically increase
coverage of data that is verified for each build.
Implementation
• 1,000 SQL queries written to compare case data from
the source systems to the DWH after ETL.
• QuerySurge™automated the scheduling, test runs,
comparisons and reporting for each build.
built by
QuerySurge™
Metrics
 500 mappings
 2.5 million data items
 1.25 billion verifications
 Complete run finished in 7 days
 45% of data was covered.
 14 builds were deployed
 115 defects were discovered and
remediated
Case Study: Fortune 500 Pharma
Benefits
• 10-fold increase in the speed of testing.
• Huge increase in coverage of data (from less than 1/10 % to 45%)
• Production defects discovered that were missed in previous cycles
• Huge savings on clean records (115 defects x $26,000/record)
• A huge time savings (3.6 years x 10 people)
• Avoidance of lawsuits and FDA fines
built by
QuerySurge™
(1) a Trial in the Cloud of QuerySurge, including self-learning
tutorial that works with sample data for 3 days or
(2) a Downloaded Trial of QuerySurge, including self-learning
tutorial with sample data or your data for 15 days or
(3) a Proof of Concept of QuerySurge, including a kickoff &
setup meeting and weekly meetings with our team of experts
for 30 days
http://www.querysurge.com/compare-trial-optionsfor more information, Go here
TRIAL
IN THE CLOUD
built by
QuerySurge™
Free Trials
built by
QuerySurge™
QuerySurge
For more on the Pharma & QuerySurge, go to
www.querysurge.com/solutions/pharmaceutical-industry

Contenu connexe

Tendances

Automating Data Quality Processes at Reckitt
Automating Data Quality Processes at ReckittAutomating Data Quality Processes at Reckitt
Automating Data Quality Processes at Reckitt
Databricks
 
Master Data Management
Master Data ManagementMaster Data Management
Master Data Management
Sabir Akhtar
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing concepts
pcherukumalla
 
Achieving a Single View of Business – Critical Data with Master Data Management
Achieving a Single View of Business – Critical Data with Master Data ManagementAchieving a Single View of Business – Critical Data with Master Data Management
Achieving a Single View of Business – Critical Data with Master Data Management
DATAVERSITY
 

Tendances (20)

Automating Data Quality Processes at Reckitt
Automating Data Quality Processes at ReckittAutomating Data Quality Processes at Reckitt
Automating Data Quality Processes at Reckitt
 
Master Data Management
Master Data ManagementMaster Data Management
Master Data Management
 
Difference between fact tables and dimension tables
Difference between fact tables and dimension tablesDifference between fact tables and dimension tables
Difference between fact tables and dimension tables
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
 
Oracle Machine Learning Overview and From Oracle Data Professional to Oracle ...
Oracle Machine Learning Overview and From Oracle Data Professional to Oracle ...Oracle Machine Learning Overview and From Oracle Data Professional to Oracle ...
Oracle Machine Learning Overview and From Oracle Data Professional to Oracle ...
 
Oracle product mdm pim data hub
Oracle product mdm   pim data hubOracle product mdm   pim data hub
Oracle product mdm pim data hub
 
Oracle Cloud
Oracle CloudOracle Cloud
Oracle Cloud
 
data warehouse , data mart, etl
data warehouse , data mart, etldata warehouse , data mart, etl
data warehouse , data mart, etl
 
Building Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureBuilding Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft Azure
 
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
 
Dimensional Modelling
Dimensional ModellingDimensional Modelling
Dimensional Modelling
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing concepts
 
ETL Testing Training Presentation
ETL Testing Training PresentationETL Testing Training Presentation
ETL Testing Training Presentation
 
Big Data Case study - caixa bank
Big Data Case study - caixa bankBig Data Case study - caixa bank
Big Data Case study - caixa bank
 
Achieving a Single View of Business – Critical Data with Master Data Management
Achieving a Single View of Business – Critical Data with Master Data ManagementAchieving a Single View of Business – Critical Data with Master Data Management
Achieving a Single View of Business – Critical Data with Master Data Management
 
The data quality challenge
The data quality challengeThe data quality challenge
The data quality challenge
 
Example data specifications and info requirements framework OVERVIEW
Example data specifications and info requirements framework OVERVIEWExample data specifications and info requirements framework OVERVIEW
Example data specifications and info requirements framework OVERVIEW
 
Chapter 2 - Retail Sales
Chapter 2 - Retail Sales Chapter 2 - Retail Sales
Chapter 2 - Retail Sales
 
Admin Webinar—An Admin's Guide to Profiles & Permissions
Admin Webinar—An Admin's Guide to Profiles & PermissionsAdmin Webinar—An Admin's Guide to Profiles & Permissions
Admin Webinar—An Admin's Guide to Profiles & Permissions
 

En vedette

What is a Data Warehouse and How Do I Test It?
What is a Data Warehouse and How Do I Test It?What is a Data Warehouse and How Do I Test It?
What is a Data Warehouse and How Do I Test It?
RTTS
 

En vedette (7)

Query Wizards - data testing made easy - no programming
Query Wizards - data testing made easy - no programmingQuery Wizards - data testing made easy - no programming
Query Wizards - data testing made easy - no programming
 
What is a Data Warehouse and How Do I Test It?
What is a Data Warehouse and How Do I Test It?What is a Data Warehouse and How Do I Test It?
What is a Data Warehouse and How Do I Test It?
 
Leveraging HPE ALM & QuerySurge to test HPE Vertica
Leveraging HPE ALM & QuerySurge to test HPE VerticaLeveraging HPE ALM & QuerySurge to test HPE Vertica
Leveraging HPE ALM & QuerySurge to test HPE Vertica
 
Improve the Health of Your Data
Improve the Health of Your DataImprove the Health of Your Data
Improve the Health of Your Data
 
QuerySurge - the automated Data Testing solution
QuerySurge - the automated Data Testing solutionQuerySurge - the automated Data Testing solution
QuerySurge - the automated Data Testing solution
 
Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...
Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...
Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...
 
Testing Big Data: Automated Testing of Hadoop with QuerySurge
Testing Big Data: Automated  Testing of Hadoop with QuerySurgeTesting Big Data: Automated  Testing of Hadoop with QuerySurge
Testing Big Data: Automated Testing of Hadoop with QuerySurge
 

Similaire à Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Requirements

Data Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical IndustryData Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical Industry
RTTS
 
How to Automate your Enterprise Application / ERP Testing
How to Automate your  Enterprise Application / ERP TestingHow to Automate your  Enterprise Application / ERP Testing
How to Automate your Enterprise Application / ERP Testing
RTTS
 
Reveal - An Enterprise Clinical Data Search Solution
Reveal - An Enterprise Clinical Data Search SolutionReveal - An Enterprise Clinical Data Search Solution
Reveal - An Enterprise Clinical Data Search Solution
d-Wise Technologies
 
Test data documentation ss
Test data documentation ssTest data documentation ss
Test data documentation ss
AshwiniPoloju
 
Maven and google pharma r&d (1)
Maven and google pharma r&d  (1)Maven and google pharma r&d  (1)
Maven and google pharma r&d (1)
Matt Barnes
 

Similaire à Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Requirements (20)

Data Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical IndustryData Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical Industry
 
Deliver Trusted Data by Leveraging ETL Testing
Deliver Trusted Data by Leveraging ETL TestingDeliver Trusted Data by Leveraging ETL Testing
Deliver Trusted Data by Leveraging ETL Testing
 
How to Automate your Enterprise Application / ERP Testing
How to Automate your  Enterprise Application / ERP TestingHow to Automate your  Enterprise Application / ERP Testing
How to Automate your Enterprise Application / ERP Testing
 
Etl testing strategies
Etl testing strategiesEtl testing strategies
Etl testing strategies
 
Completing the Data Equation: Test Data + Data Validation = Success
Completing the Data Equation: Test Data + Data Validation = SuccessCompleting the Data Equation: Test Data + Data Validation = Success
Completing the Data Equation: Test Data + Data Validation = Success
 
Leveraging Automated Data Validation to Reduce Software Development Timeline...
Leveraging Automated Data Validation  to Reduce Software Development Timeline...Leveraging Automated Data Validation  to Reduce Software Development Timeline...
Leveraging Automated Data Validation to Reduce Software Development Timeline...
 
Creating a Project Plan for a Data Warehouse Testing Assignment
Creating a Project Plan for a Data Warehouse Testing AssignmentCreating a Project Plan for a Data Warehouse Testing Assignment
Creating a Project Plan for a Data Warehouse Testing Assignment
 
Qa what is_clinical_data_management
Qa what is_clinical_data_managementQa what is_clinical_data_management
Qa what is_clinical_data_management
 
Clinical data management
Clinical data management Clinical data management
Clinical data management
 
Clinical Data Management
Clinical Data ManagementClinical Data Management
Clinical Data Management
 
Building a Robust Big Data QA Ecosystem to Mitigate Data Integrity Challenges
Building a Robust Big Data QA Ecosystem to Mitigate Data Integrity ChallengesBuilding a Robust Big Data QA Ecosystem to Mitigate Data Integrity Challenges
Building a Robust Big Data QA Ecosystem to Mitigate Data Integrity Challenges
 
Data Warehouse (ETL) testing process
Data Warehouse (ETL) testing processData Warehouse (ETL) testing process
Data Warehouse (ETL) testing process
 
Reveal - An Enterprise Clinical Data Search Solution
Reveal - An Enterprise Clinical Data Search SolutionReveal - An Enterprise Clinical Data Search Solution
Reveal - An Enterprise Clinical Data Search Solution
 
Xcellerate® Data Review
Xcellerate® Data Review Xcellerate® Data Review
Xcellerate® Data Review
 
End User Informatics
End User InformaticsEnd User Informatics
End User Informatics
 
Test data documentation ss
Test data documentation ssTest data documentation ss
Test data documentation ss
 
The Real Value of Oracle Health Checks
The Real Value of Oracle Health ChecksThe Real Value of Oracle Health Checks
The Real Value of Oracle Health Checks
 
How to Improve Quality and Efficiency Using Test Data Analytics
How to Improve Quality and Efficiency Using Test Data AnalyticsHow to Improve Quality and Efficiency Using Test Data Analytics
How to Improve Quality and Efficiency Using Test Data Analytics
 
Maven and google pharma r&d (1)
Maven and google pharma r&d  (1)Maven and google pharma r&d  (1)
Maven and google pharma r&d (1)
 
DataOps , cbuswaw April '23
DataOps , cbuswaw April '23DataOps , cbuswaw April '23
DataOps , cbuswaw April '23
 

Plus de RTTS

QuerySurge for DevOps
QuerySurge for DevOpsQuerySurge for DevOps
QuerySurge for DevOps
RTTS
 
RTTS - the Software Quality Experts
RTTS - the Software Quality ExpertsRTTS - the Software Quality Experts
RTTS - the Software Quality Experts
RTTS
 

Plus de RTTS (15)

Automated Testing of Microsoft Power BI Reports
Automated Testing of Microsoft Power BI ReportsAutomated Testing of Microsoft Power BI Reports
Automated Testing of Microsoft Power BI Reports
 
QuerySurge AI webinar
QuerySurge AI webinarQuerySurge AI webinar
QuerySurge AI webinar
 
State of the Market - Data Quality in 2023
State of the Market - Data Quality in 2023State of the Market - Data Quality in 2023
State of the Market - Data Quality in 2023
 
TestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data TestingTestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data Testing
 
RTTS Postman and API Testing Webinar Slides.pdf
RTTS Postman and API Testing Webinar  Slides.pdfRTTS Postman and API Testing Webinar  Slides.pdf
RTTS Postman and API Testing Webinar Slides.pdf
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing Webinar
 
Webinar - QuerySurge and Azure DevOps in the Azure Cloud
 Webinar - QuerySurge and Azure DevOps in the Azure Cloud Webinar - QuerySurge and Azure DevOps in the Azure Cloud
Webinar - QuerySurge and Azure DevOps in the Azure Cloud
 
Creating a Data validation and Testing Strategy
Creating a Data validation and Testing StrategyCreating a Data validation and Testing Strategy
Creating a Data validation and Testing Strategy
 
Implementing Azure DevOps with your Testing Project
Implementing Azure DevOps with your Testing ProjectImplementing Azure DevOps with your Testing Project
Implementing Azure DevOps with your Testing Project
 
the Data World Distilled
the Data World Distilledthe Data World Distilled
the Data World Distilled
 
QuerySurge for DevOps
QuerySurge for DevOpsQuerySurge for DevOps
QuerySurge for DevOps
 
Whitepaper: Volume Testing Thick Clients and Databases
Whitepaper:  Volume Testing Thick Clients and DatabasesWhitepaper:  Volume Testing Thick Clients and Databases
Whitepaper: Volume Testing Thick Clients and Databases
 
Case study: Open Source Automation Framework using Selenium WebDriver
Case study: Open Source Automation Framework using Selenium WebDriverCase study: Open Source Automation Framework using Selenium WebDriver
Case study: Open Source Automation Framework using Selenium WebDriver
 
Enterprise Business Intelligence & Data Warehousing: The Data Quality Conundrum
Enterprise Business Intelligence & Data Warehousing: The Data Quality ConundrumEnterprise Business Intelligence & Data Warehousing: The Data Quality Conundrum
Enterprise Business Intelligence & Data Warehousing: The Data Quality Conundrum
 
RTTS - the Software Quality Experts
RTTS - the Software Quality ExpertsRTTS - the Software Quality Experts
RTTS - the Software Quality Experts
 

Dernier

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Dernier (20)

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Requirements

  • 1. 1 Presenter Bill Hayduk Founder / President Presenter Jeff Bocarsly, Ph.D. Senior Architect built bybuilt by QuerySurge™
  • 2. built by The average organization loses $8.2 million annually through poor Data Quality. - Gartner 46% of companies cite Data Quality as a barrier for adopting Business Intelligence products. - InformationWeek The cost per patient data of Phase 3 clinical studies of new pharmaceuticals exceeds $26,000. - Journal of Clinical Research Best Practices built by QuerySurge™
  • 3. Pharma’s 2 Largest Data Warehousing Concerns built by QuerySurge™
  • 4. Pharma’s Largest Data Warehouse Concerns (1) Data Integrity (2) Compliance built by QuerySurge™
  • 5. (1) Data Integrity high risk of defects that are not readily visible Missing Data Truncation of Data Data Type Mismatch Null Translation errors Incorrect Type Translation Misplaced Data Extra Records Transformation Logic Errors/Holes Simple/Small Errors Sequence Generator errors Undocumented Requirements Not Enough Records built by Pharma’s Data Warehouse Concerns QuerySurge™
  • 6. Pharma’s Data Warehouse Concerns (2) Compliance Need to comply with Part 11 mandates historical test information test version history test execution data: who, what & when test cycle information visibility of assets archived test results built by QuerySurge™
  • 7. Why is this Important?  Periodic data reporting to FDA  Periodic data reporting to int’l bodies (1) Data Integrity (2) Compliance  FDA announced audits  Unannounced FDA audits Consequences Severe financial and business built by QuerySurge™
  • 9.  automate the manual testing of data  compare millions of rows of data quickly  flag mismatches and inconsistencies in data sets  provide flexibility in scheduling test runs  generate informative reports that can easily be shared with the team  validate up to 100% of all of all data, mitigating the risk Data Integrity needs Need a testing solution that can… built by QuerySurge™
  • 10. Part 11 Reporting needs  track test history  provide reporting on test version history  record all test execution by testing owner’s name and date  deliver auditable reports of test cycles  store all test outcomes and test data  offer a read-only user for reviewing test assets  support archiving of results Need a testing solution that can… built by QuerySurge™
  • 11. built by The solution… built by QuerySurge™
  • 12. What is QuerySurge™? the collaborative Data Testing solution that finds bad data & provides a holistic view of your data’s health built by QuerySurge™
  • 13. • Reduce your costs & risks • Improve your data quality • Accelerate your testing cycles • Share information with your team with QuerySurge™ you can: built by QuerySurge™ • Provides huge ROI (i.e. 1,300%)* *based on client’s calculation of Return on Investment
  • 14. Finding Bad Data SQL HQL SQL HQL SQL SQL  QS pulls data from data sources  QS pulls data from target data store  QS compares data quickly  QS generates reports, audit trails How? Reports, Data Health Dashboard built by QuerySurge™
  • 15. QuerySurge™ Architecture Web-based… Installs on... Linux Connects to… …or any other JDBC compliant data source built by QuerySurge™ QuerySurge Controller QuerySurge Server QuerySurge Agents Flat Files
  • 16. Collaboration Testers - functional testing - regression testing - result analysis Developers / DBAs - unit testing - result analysis Data Analysts - review, analyze data - verify mapping failures Operations teams - monitoring - result analysis Managers - oversight - result analysis Share information on the built by QuerySurge™
  • 17. the QuerySurge advantage built by QuerySurge™ Automate the entire testing cycle  Automate kickoff, tests, comparison, auto-emailed results Create Tests easily with no SQL programming  ensures minimal time & effort to create tests / obtain results Test across different platforms  data warehouse, Hadoop, NoSQL, database, flat file, XML Collaborate with team  Data Health dashboard, shared tests & auto-emailed reports Verify more data & do it quickly  verifies up to 100% of all data up to 1,000 x faster Integrate for Continuous Delivery  Integrates with most Build, ETL & QA management software
  • 18. built by QuerySurge™ QuerySurge™ Modules Design Tests SchedulingDeep-Dive Reporting Run Dashboard Query WizardsData Health Dashboard
  • 19. Fast and Easy. No programming needed. built by QuerySurge™ QuerySurge™ Modules Compare by Table, Column & Row • Perform 80% of all data tests •Automatically generates SQL code • Opens up testing to novice & non- technical team members • Speeds up testing for skilled SQL coders • provides a huge Return-On-Investment
  • 20. built by QuerySurge™ QuerySurge™ Modules 3 Types of Data Comparison Wizards: The also provide you with automated features for: o filtering (‘Where’ clause) and o sorting (‘Order By’ clause) Column-Level Comparison: This is great for Big Data stores and Data Warehouses where tables will have some columns containing transformations and some columns with no transformations. Many tables and columns can be compared simultaneously and quickly. Table-Level Comparison: This comparator is great for Data Migrations and Database Upgrades with no transformations at all. Many tables can be compared simultaneously and quickly. Row Count Comparison: Great for all - Big Data stores, Data Warehouses, Data Migrations and Database Upgrades. Many tables and rows can be compared simultaneously and quickly.
  • 21. Design Library  Create custom Query Pairs (source & target SQLs)  Great for team members skilled with SQL QuerySurge™ Modules Scheduling  Build groups of Query Pairs  Schedule Test Runs for: • immediately • at a specific date/time • automatically after build or ETL process built by QuerySurge™
  • 22. Deep-Dive Reporting  Examine and automatically email test results Run Dashboard  View real-time execution  Analyze real-time results QuerySurge™ Modules built by QuerySurge™
  • 23. built by QuerySurge™ • view data reliability & pass rate • add, move, filter, zoom-in on any data widget & underlying data • verify build success or failure
  • 24. Test Management Connectors built by QuerySurge™  Drive QuerySurge execution from your Test Management Solution  Outcome results (Pass/Fail/etc.) are returned from QuerySurge to your Test Management Solution  Results are linked in your Test Management Solution so that you can click directly into detailed QuerySurge results • HP ALM (Quality Center) • Microsoft Team Foundation Server • IBM Rational Quality Manager Integration with leading Test Management Solutions
  • 25. Case Study Fortune 500 firm: Clinical Trial Data built by QuerySurge™
  • 26. Case Study: Fortune 500 Pharma Challenge How can a Data Warehouse team assure data integrity over multiple builds when the cost per patient data of Phase 3 clinical studies exceeds $26,000 and volume of live case data is > 1 TB? Strategy Implement QuerySurge™ to dramatically increase coverage of data that is verified for each build. Implementation • 1,000 SQL queries written to compare case data from the source systems to the DWH after ETL. • QuerySurge™automated the scheduling, test runs, comparisons and reporting for each build. built by QuerySurge™
  • 27. Metrics  500 mappings  2.5 million data items  1.25 billion verifications  Complete run finished in 7 days  45% of data was covered.  14 builds were deployed  115 defects were discovered and remediated Case Study: Fortune 500 Pharma Benefits • 10-fold increase in the speed of testing. • Huge increase in coverage of data (from less than 1/10 % to 45%) • Production defects discovered that were missed in previous cycles • Huge savings on clean records (115 defects x $26,000/record) • A huge time savings (3.6 years x 10 people) • Avoidance of lawsuits and FDA fines built by QuerySurge™
  • 28. (1) a Trial in the Cloud of QuerySurge, including self-learning tutorial that works with sample data for 3 days or (2) a Downloaded Trial of QuerySurge, including self-learning tutorial with sample data or your data for 15 days or (3) a Proof of Concept of QuerySurge, including a kickoff & setup meeting and weekly meetings with our team of experts for 30 days http://www.querysurge.com/compare-trial-optionsfor more information, Go here TRIAL IN THE CLOUD built by QuerySurge™ Free Trials
  • 29. built by QuerySurge™ QuerySurge For more on the Pharma & QuerySurge, go to www.querysurge.com/solutions/pharmaceutical-industry

Notes de l'éditeur

  1. Other Pharmaceutical Industry Complexities ------------------------------------------------------------------------ Industry consolidation causing massive integration of data FDA CFR Part 11 compliance A broad variety of data types and sources may be fed into a data warehouse. general Pharma-specific information exchange formats (e.g., HL7 feeds, CDISC feeds, other XML grammars) multiple proprietary and internal data formats, which may have been acquired in the process of industry consolidation.
  2. QuerySurge can automate the comparison of all data from source files and databases through different legs of the ETL process to the target data warehouse. QuerySurge can be scheduled to run immediately, next Monday at 11:00pm or when an event, such as the current ETL process ends. QuerySurge will execute tests that automate the comparison of target data to source data very quickly, comparing millions of rows of data in minutes. On completion of the run, QuerySurge will produce informative summary and detailed reports that can be viewed immediately or shared with the team via the automated email scheduler. QuerySurge will validate 100% of all of your data, providing full coverage and mitigating the risk while providing reports highlighting every data difference, down to the individual character.
  3. - tracks test history (user, date, each test version) - provides reporting on test version history for convenient auditing - supports tracking of deviations from approved tests - records all test execution owners by name and date - delivers auditable results reporting of test cycles - stores all test outcomes and test data for post-facto review or audit - offers a read-only user type for reviewing test assets - supports off-database archiving of results (for future restore) for effective long-term results data management
  4. QuerySurge provides insight into the health of your data throughout your organization through BI dashboards and reporting at your fingertips. It is a collaborative tool that allows for distributed use of the tool throughout your organization and provides for a sharable, holistic view of your data’s health and your organization’s level of maturity of your data management.
  5. QuerySurge helps your team coordinate your data quality initiatives while speeding up your development and testing cycles and finding your bad data. Why risk having your team identify trends and develop strategic initiatives when the underlying data is incorrect? QuerySurge reduces this risk.
  6. QuerySurge finds bad data by natively connecting to: any data source, whether it is any type of database, flat file or xml and can connect to any data target, whether it is a db, file, xml, data warehouse or hadoop implementation. QuerySurge pulls data from the source and the target and compares them very quickly (typically in a few minutes) and then produces reports that show every data difference, even if there are millions of rows and hundreds of columns in the test. These reports can be automatically emailed to your team. You can pick from a multitude of reports or export the results so that you can build your own reports.
  7. Your distributed team from around the world can use any of these web browsers: Internet Explorer, Chrome, Firefox and Safari. Installs on operating systems: Windows & Linux. QS connects to any JDBC-compliant data source. Even if it is not listed here.
  8. QuerySurge can utilized by active practitioners such as testers & developers to create and launch tests, or by managers, analysts and operations to view data test results and the overall health of the data. QuerySurge facilitates this by providing 2 types of licenses: (1) full user & (2) participant user. (1) Full User – This type of user has unlimited access to create QueryPairs, Suites, and Scenarios. This user can also schedule and run tests, see results, run and export reports, and export data. Perfect for anyone creating and/or running data tests while performing analysis of results. (2) Participant User – This user cannot create or run tests, but has access to all other information - including viewing all query pairs, results, and reports, receiving email notifications, and exporting test results and reports. Perfect for managers, analysts, architects, DBAs, developers, and operations users who need to know the health of their data.