Understanding the gaps between Data Quality Checks and Research Capabilities in a Pediatric Data Research Network
Ritu Khare, Hanieh Razzaghi, Levon Utidjian, Matthew Miller, L. Charles Bailey
The Children’s Hospital of Philadelphia
PEDSnet CDRN = 5.7M pediatric patients
• Phase 1 (18 months): Initial Infrastructure
• Phase 2 (9 months): Conduct Science Queries
Data Quality Assessment in PEDSnet
• Is the data ready for research use?
• PEDSnet data quality workflow
  • Design data quality checks
    • https://github.com/PEDSnet/Data-Quality-Analysis
  • Identify data quality issues
    • Rate of extract-transform-load (ETL) errors reduced from >50% to <10% (Khare et al., JAMIA in press)

Type of Check | Issue Example
Missing data | Gestational age missing for 70% of patients
Invalid value | Race outside the acceptable values in PEDSnet conventions
Implausible event | Encounter start date after the end date
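The three check types in the table above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in the PEDSnet repository: the record layout (plain dictionaries), the field names, and the `VALID_RACE_CODES` set are all hypothetical.

```python
from datetime import date

# Hypothetical set of accepted race codes; the real PEDSnet conventions
# define their own value list.
VALID_RACE_CODES = {"A", "B", "C"}


def missing_rate(records, field):
    """Missing data: fraction of records whose `field` is absent or None."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) is None)
    return missing / len(records)


def invalid_values(records, field, allowed):
    """Invalid value: records whose non-missing `field` is outside `allowed`."""
    return [r for r in records
            if r.get(field) is not None and r[field] not in allowed]


def implausible_encounters(encounters):
    """Implausible event: encounters whose start date is after the end date."""
    return [e for e in encounters
            if e.get("start") is not None and e.get("end") is not None
            and e["start"] > e["end"]]
```

Each function returns either a rate or the offending records, so a reviewer can compare the rate against a site-specific threshold or inspect the flagged rows directly.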
PEDSnet Phase 1: Data Quality Assessment
[Figure: number of data quality checks (y-axis, 0 to 700) per data cycle (x-axis, cycles 1 to 9, Jan 2015 to May 2016)]
• THEORY-DRIVEN: frameworks and methods in the literature (Brown et al. 2013; Weiskopf and Weng 2013; Kahn et al. 2015)
• DEVELOPER-DRIVEN
  • 50 members in informatics team
  • Data and issue review
PEDSnet Phase 2: Conducting Science Queries
• >30 scientific studies: computable phenotypes, feasibility queries, association studies, etc.

Site | % children with CT scan during ED visits, 2013-2016
A | 3.32%
B | 4.87%
C | 3.58%
D | 2.98%
E | 0.11%
F | 3.62%
G | 5.11%
H | 5.92%

Possible explanations for an outlying rate:
• Incorrect mapping of CT-scan procedure
• Invalid coding of ED visits
• Bug in the query
• True anomaly
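One way to surface a site like E automatically is a robust outlier score across sites. The sketch below uses the median and the median absolute deviation (MAD), a technique chosen here for illustration and not one the slides name. The function name and the `threshold` of 2.0 are assumptions. A flagged site only warrants review: the score cannot tell a mapping error from a query bug or a true anomaly.

```python
from statistics import median

# Site rates (%) as shown on the slide.
SITE_RATES = {"A": 3.32, "B": 4.87, "C": 3.58, "D": 2.98,
              "E": 0.11, "F": 3.62, "G": 5.11, "H": 5.92}


def flag_outlier_sites(rates, threshold=2.0):
    """Return sites whose robust z-score exceeds `threshold` (hypothetical cutoff)."""
    values = list(rates.values())
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        return []  # no spread across sites, nothing to flag
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # when the data are roughly normal.
    return sorted(site for site, v in rates.items()
                  if 0.6745 * abs(v - center) / mad > threshold)


print(flag_outlier_sites(SITE_RATES))  # ['E']
```

With only eight sites the score is sensitive to the cutoff, which is why the threshold is left as a parameter.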
PEDSnet Phase 2: Data Quality Assessment
• USER-DRIVEN: >75 new issues, and 8 new check types
Check Type | Issue Example
Outliers in derived values | Average length of inpatient stays
Inconsistency between similar concepts captured in different tables | Specialty data in provider vs. care_site tables
Incorrect mapping from EHR to PEDSnet | Mapping of labs to LOINC
Missing expected facts | GI providers, creatinine labs, etc.
PEDSnet Phase 2: Data Quality Assessment
Check Type | Issue Example
Unexpected facts | Procedures recorded in the condition table
Variability in coding | Different concepts used to represent the same lab or vitals
Unexpected most frequent values | “shooting pain” identified as top inpatient visit condition
Face validity issues | Tables with unexpectedly low number of records
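The "unexpected most frequent values" check can be sketched as a comparison between the top values observed in a field and a list a clinician would expect to see there. This is a minimal illustration; the function name, the `expected` set, and the sample values are hypothetical.

```python
from collections import Counter


def unexpected_top_values(values, expected, n=1):
    """Return the n most frequent values that are not in `expected`."""
    top = [value for value, _count in Counter(values).most_common(n)]
    return [value for value in top if value not in expected]


# Hypothetical inpatient visit conditions for one site.
conditions = ["shooting pain"] * 3 + ["asthma"] * 2 + ["fever"]
print(unexpected_top_values(conditions, expected={"asthma", "fever"}))
# ['shooting pain']
```

The expected list has to come from domain experts, which is what makes this a user-driven check.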
PEDSnet Phase 2: Check Design Challenges
• Determine the combination of fields / tables
  • ~100 fields in PEDSnet data model
• Determination of outlier
  • Differentiate between true anomaly and real data quality issue
• Determination of thresholds
  • Experimentation with datasets
• Automatic review of ETL mappings
  • labs, organisms, specialty, route, race, ethnicity, drugs, language, procedure, smoking history
  • 1000s of manually derived mappings
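A small part of mapping review can be automated by looking for source terms that were mapped to more than one target concept. The sketch below assumes mappings are available as `(source_term, target_concept)` pairs. That layout, the function name, and the placeholder concept identifiers are hypothetical, and the check says nothing about whether a single consistent mapping is clinically correct.

```python
from collections import defaultdict


def inconsistent_mappings(mappings):
    """Return source terms that map to more than one target concept.

    `mappings` is an iterable of (source_term, target_concept) pairs.
    """
    targets = defaultdict(set)
    for source, target in mappings:
        # Normalize case and whitespace so trivially different spellings
        # of the same source term are grouped together.
        targets[source.strip().lower()].add(target)
    return {s: sorted(t) for s, t in targets.items() if len(t) > 1}


# Placeholder identifiers, not real LOINC codes.
sample = [("Creatinine", "CONCEPT-1"),
          ("creatinine ", "CONCEPT-2"),
          ("Glucose", "CONCEPT-3")]
print(inconsistent_mappings(sample))
# {'creatinine': ['CONCEPT-1', 'CONCEPT-2']}
```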
Conclusions
• A new (user-driven) perspective on data quality
• Usability evaluation of PEDSnet data quality assessment program
  • 20% increase in types of checks
• Future work
  • Investigate the Phase 2 check design challenges
  • Reverse engineering of checks from issues identified in science queries
Acknowledgments
• PEDSnet Teams
• Leadership and governance
• Informatics
• Pilot studies
• PCORnet Governance Committees and DRN OC
• OHDSI Consortium
• Patients and Families
• This work was supported by PCORI Contract CDRN-1306-01556.
• PEDSnet Data Quality Scripts: https://github.com/PEDSnet/Data-Quality-Analysis


Editor's notes

  1. Talk about the scale and range of PEDSnet. More about pediatrics and CDRNs.
  2. Implemented during development, in tandem, over two years: developing checks and identifying/documenting issues. Check vs. issue graph. Number of issues logged through May 2016, etc.