AMIA 2013: From EHRs to Linked Data: representing and mining encounter data for clinical expertise evaluation
1. From EHRs to Linked Data: representing and mining encounter data for clinical expertise evaluation
Carlo Torniai
Shahim Essaid, Chris Barnes, Mike Conlon, Stephen Williams,
Janos Hajagos, Erich Bremer, Jon Corson-Rikert, Melissa Haendel
2. CTSAConnect Project
Goals:
– Identify potential collaborators, relevant resources, and
expertise across scientific disciplines
– Assemble translational teams of scientists to address specific
research questions
Approach:
Create a semantic representation of clinician and basic science
researcher expertise to enable
– Broad and computable representation of translational
expertise
– Publication of expertise as Linked Data (LD) for use in other
applications
www.ctsaconnect.org CTSAconnect
Reveal Connections. Realize Potential.
3. Merging VIVO and eagle-i
[Diagram labels: VIVO (People), eagle-i (Resources), Semantic Coordination, Clinical activities]
eagle-i is an ontology-driven application . . . for collecting and
searching research resources.
VIVO is an ontology-driven application . . . for collecting and
displaying information about people.
Both publish Linked Data. Neither addresses clinical expertise.
CTSAconnect will produce a single Integrated Semantic Framework (ISF), a modular
collection of ontologies that also covers clinical expertise.
3/26/2013
4. ISF Clinical module
ARG: Agents, Resources, Grants ontology
CM: Clinical module
IAO: Information Artifact Ontology
OBI: Ontology for Biomedical Investigations
OGMS: Ontology for General Medical Science
FOAF: Friend of a Friend vocabulary
BFO: Basic Formal Ontology
5. ISF Clinical module: encounter
ARG: Agents, Resources, Grants ontology
CM: Clinical module
OGMS: Ontology for General Medical Science
FOAF: Friend of a Friend vocabulary
6. ISF Clinical module: encounter output
CM: Clinical module
OBI: Ontology for Biomedical Investigations
OGMS: Ontology for General Medical Science
7. ISF: Clinical expertise representation
Leveraging billing codes to represent clinical expertise
- expertise is expressed as “weights” associated with billing codes
8. Computing and publishing clinical expertise
Step 1: Aggregate Clinical Data
Step 2: Compute Expertise
Step 3: Map Data to ISF
Step 4: Publish Linked Data
9. Aggregate clinical data
Provider ID | ICD Code Value | Code Count | Unique Patient Count | Code Label
1234567 | 552.00 | 1 | 1 | Unilateral or unspecified femoral hernia with obstruction (ICD9CM 552.00)
1234567 | 553.02 | 8 | 6 | Bilateral femoral hernia without mention of obstruction or gangrene (ICD9CM 553.02)
1234567 | 555.1 | 4 | 1 | Regional enteritis of large intestine (ICD9CM 555.1)
1234568 | 745.12 | 10 | 5 | Corrected transposition of great vessels (ICD9CM 745.12)
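The aggregation behind a table like this can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the input layout (one row per code assignment, as a provider/patient/code tuple), the function name `aggregate_encounters`, and the patient identifiers are all assumptions made for the example.

```python
from collections import defaultdict

def aggregate_encounters(rows):
    """Summarize code assignments per (provider, ICD code).

    rows: iterable of (provider_id, patient_id, icd_code) tuples,
    one tuple per time a code was assigned (assumed input layout).
    Returns {(provider_id, icd_code): (code_count, unique_patient_count)}.
    """
    counts = defaultdict(int)
    patients = defaultdict(set)
    for provider_id, patient_id, icd_code in rows:
        key = (provider_id, icd_code)
        counts[key] += 1                 # every assignment counts
        patients[key].add(patient_id)    # each patient counts once
    return {key: (counts[key], len(patients[key])) for key in counts}

# Hypothetical assignments chosen to reproduce two rows of the table.
rows = [
    ("1234567", "p1", "555.1"),
    ("1234567", "p1", "555.1"),
    ("1234567", "p1", "555.1"),
    ("1234567", "p1", "555.1"),
    ("1234567", "p2", "552.00"),
]
summary = aggregate_encounters(rows)
print(summary[("1234567", "555.1")])   # (code count, unique patient count)
print(summary[("1234567", "552.00")])
```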
10. Compute expertise: weighting the codes
Code weight = code frequency * percentage of patients
Example: a provider with 500 patients has used the Syndactyly code (ICD9: 755.12)
75 times for 30 unique patients.
Percentage of patients with code: 30/500 = 6%
Code frequency: 75/30 = 2.5
Code weight: 6 * 2.5 = 15
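The weighting formula can be written as a small function. This is a sketch under the definitions on this slide; the function and parameter names are illustrative and do not come from the CTSAconnect code base.

```python
def code_weight(total_patients, unique_patients_with_code, times_code_used):
    """Code weight = code frequency * percentage of patients (slide definition)."""
    if total_patients <= 0 or unique_patients_with_code <= 0:
        raise ValueError("patient counts must be positive")
    # Percentage of the provider's patients who received the code.
    percentage = 100.0 * unique_patients_with_code / total_patients
    # Average number of times the code was used per patient who received it.
    frequency = times_code_used / unique_patients_with_code
    return percentage * frequency

# The Syndactyly example: 500 patients, 30 with the code, 75 uses.
weight = code_weight(500, 30, 75)
print(weight)  # 15.0
```

Note that the unique-patient count cancels out of the product, so the weight as defined here equals 100 * uses / total patients (100 * 75 / 500 = 15).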
11. Compute expertise: footprint
We group the codes by their top-level ICD code, sum the weights within each
group, and take the top 10 codes to generate the expertise footprint for each
practitioner.

Code weights before grouping:
ICD code | Weight
366.1 | 24.42
250 | 24
366.9 | 18.4
250.2 | 19.2
…. | ….

Weights after grouping by top-level code:
ICD code | Weight
250 | 43.2
366 | 42.82
…. | ….
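The grouping step can be sketched as below. Two assumptions are made here: the top-level code is taken to be the part of the ICD-9 code before the decimal point, and weights within a group are summed, which is consistent with the figures on this slide (24 + 19.2 = 43.2 and 24.42 + 18.4 = 42.82). The function name is illustrative.

```python
from collections import defaultdict

def expertise_footprint(code_weights, top_n=10):
    """Group weights by top-level ICD code and keep the top_n groups."""
    grouped = defaultdict(float)
    for code, weight in code_weights.items():
        top_level = code.split(".")[0]   # "366.1" -> "366", "250" -> "250"
        grouped[top_level] += weight
    # Highest total weight first.
    ranked = sorted(grouped.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]

# The four weights listed on the slide.
weights = {"366.1": 24.42, "250": 24.0, "366.9": 18.4, "250.2": 19.2}
footprint = expertise_footprint(weights)
for code, total in footprint:
    print(code, round(total, 2))
```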
12. Mapping Expertise to the ISF
13. Publish Linked Data
Several means to access and query data: SPARQL endpoints, other APIs, …
[Diagram labels: Triple Stores, Linked Data cloud]
14. What can be done with the published dataset
    # Assumed prefix (not shown on the slide): the standard OBO PURL base.
    PREFIX obo: <http://purl.obolibrary.org/obo/>

    SELECT ?expertise ?label ?weight
    WHERE
    {
      # Select the expertise for provider
      # http://ohsu.dev.eagle-i.net/i/1235281379
      <http://ohsu.dev.eagle-i.net/i/1235281379> obo:BFO_0000086 ?expertise .

      # Select the weight and the label for measurements
      # relative to the expertise
      ?expertise_measurement obo:IAO_0000221 ?expertise .
      ?expertise_measurement obo:ARG_2000012 ?label .
      ?expertise_measurement obo:IAO_0000004 ?weight .
    }
The information is enough to represent clinical expertise as a tag cloud.
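As an illustration of the tag cloud idea, the label and weight pairs returned by such a query could be scaled linearly to font sizes. This is only a sketch: the labels, weights, size range, and function name below are hypothetical and are not taken from the published dataset.

```python
def tag_cloud_sizes(label_weights, min_size=10, max_size=36):
    """Linearly scale expertise weights to font sizes (in points)."""
    if not label_weights:
        return {}
    lowest = min(label_weights.values())
    highest = max(label_weights.values())
    span = highest - lowest
    sizes = {}
    for label, weight in label_weights.items():
        if span == 0:
            # All weights equal: render every tag at the largest size.
            sizes[label] = max_size
        else:
            scaled = (weight - lowest) / span * (max_size - min_size)
            sizes[label] = round(min_size + scaled)
    return sizes

# Hypothetical (label, weight) results for one provider.
results = {"Syndactyly": 15.0, "Femoral hernia": 5.0, "Regional enteritis": 10.0}
sizes = tag_cloud_sizes(results)
for label, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{label}: {size}pt")
```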
15. Sample encounter data published as LOD
[Screenshot of a Health Care Encounter instance, with callouts: Instance URI, Annotations and Properties, Inferred Types]
16. Querying the sample encounter data
17. Next steps: enhance expertise representation by mapping ICD9 to MeSH
18. Next steps: enhance expertise calculation
• More sophisticated algorithm leveraging the MeSH hierarchy
19. Beyond expertise
Expertise linked to MeSH will enable meaningful connections
between clinicians, basic researchers, and biomedical knowledge
20. Team Resources
Team
OHSU: Melissa Haendel, Carlo Torniai, Nicole Vasilevsky, Shahim Essaid, Eric Orwoll
Stony Brook University: Moises Eisenberg, Erich Bremer, Janos Hajagos
Harvard University: Daniela Bourges-Waldegg, Sophia Cheng
Cornell University: Jon Corson-Rikert, Dean Krafft, Brian Lowe
Share Center: Chris Kelleher, Will Corbett, Ranjit Das, Ben Sharma
University of Florida: Mike Conlon, Chris Barnes, Nicholas Rejack
University at Buffalo: Barry Smith, Dagobert Soergel

Resources
CTSAconnect project: ctsaconnect.org
The clinical module source: http://bit.ly/clinical-isf
CTSAconnect ontology source: http://code.google.com/p/connect-isf/
Dataset and queries documentation: https://code.google.com/p/ctsaconnect/w/list

Support: NCATS through Booz Allen Hamilton
CTSA 10-001: 100928SB23
PROJECT #: 00921-0001
Editor's Notes
Synostosis: Abnormal union between bones or parts of bones.

Syndactyly: A congenital anomaly of the hand or foot, marked by the webbing between adjacent fingers or toes. Syndactylies are classified as complete or incomplete by the degree of joining. Syndactylies can also be simple or complex. Simple syndactyly indicates joining of only skin or soft tissue; complex syndactyly marks joining of bony elements.

Craniosynostoses: Premature closure of one or more CRANIAL SUTURES. The sutures are the joints that exist between the skull bones after birth but later close or fuse together.

Antley-Bixler Syndrome: An inherited condition characterized by multiple malformations of CARTILAGE and bone including CRANIOSYNOSTOSIS; midface hypoplasia; radiohumeral SYNOSTOSIS; CHOANAL ATRESIA; femoral bowing; neonatal fractures; and multiple joint CONTRACTURES and, occasionally, urogenital, gastrointestinal or cardiac defects. In utero exposure to FLUCONAZOLE, as well as mutations in at least two separate genes, are associated with this condition: POR (encoding P450 (cytochrome) oxidoreductase (NADPH-FERRIHEMOPROTEIN REDUCTASE)) and FGFR2 (encoding FIBROBLAST GROWTH FACTOR RECEPTOR 2).

The figure attempts to show how a weight for a specific concept could be partially passed up the inheritance hierarchy and merged with other values passed up the hierarchy from other concepts. The concept “syndactyly”, which is the mapping of the ICD9 code from the previous slide, is given a weight of 15 by considering the percentage of a clinician’s patients that have that code assigned and by augmenting that percentage with the frequency of use of this code. In other words, if the code is assigned more than once to a patient, the frequency will be more than 1, and this increased frequency should be used as an indication of a provider’s expertise in this area.

The next step is to pass the weight up the hierarchy, but only part of it, so that scores do not stay high along the whole path to the root concept. The figure shows one way of doing this, where the fraction of the weight passed up is related to the number of sibling concepts. The fraction passed up is 1/3 for the concept “synostosis” because there are only two other siblings in MeSH, but the fraction passed to the other, more general concept is 1/10 because of the 9 siblings under that part of the hierarchy. This choice appears to be correct in this case: we would not want to assume that a clinician who is specialized in “syndactyly” is also specialized in all the various “congenital limb deformities”, but the provider can be considered an expert in “synostosis” since “synostosis” is closer to “syndactyly”. The assumption is that the closeness of a subconcept is related to the number of siblings; the more siblings there are, the broader or more distant the parent concept is assumed to be.
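The sibling-based propagation described in these notes can be sketched as follows. The notes give only the two fractions (1/3 and 1/10), so several things here are assumptions: the fraction is computed as 1 divided by the number of children of the parent (the concept plus its siblings), propagation is applied recursively toward the root, and the sibling names in the example hierarchy are placeholders rather than real MeSH headings.

```python
from collections import defaultdict

def propagate_weights(concept_weights, parents, children):
    """Pass each concept's weight up the hierarchy.

    The share handed to a parent is weight / (number of that parent's
    children), so parents with many children receive a smaller share.
    """
    totals = defaultdict(float)

    def push(concept, weight):
        totals[concept] += weight
        for parent in parents.get(concept, []):
            push(parent, weight / len(children[parent]))

    for concept, weight in concept_weights.items():
        push(concept, weight)
    return dict(totals)

# Placeholder hierarchy: only the names mentioned in the notes are real.
children = {
    "Synostosis": ["Syndactyly", "sibling_a", "sibling_b"],
    "Congenital limb deformities":
        ["Syndactyly"] + [f"sibling_{i}" for i in range(1, 10)],
}
parents = {"Syndactyly": ["Synostosis", "Congenital limb deformities"]}

scores = propagate_weights({"Syndactyly": 15.0}, parents, children)
for concept, score in scores.items():
    print(concept, score)
```

With a weight of 15 on “Syndactyly”, “Synostosis” receives 15/3 and “Congenital limb deformities” receives 15/10, matching the fractions in the notes.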
“Syndactyly” is a variable fusion of digits (fingers or toes) with or without the fusion of bones. The original ICD code is specific to the fingers with fusion of bones. MeSH doesn’t have that level of specificity, so there is no direct mapping to MeSH. SNOMED-CT does provide this level of specificity but, as with the ICD code, there is no mapping of this SNOMED code to MeSH.

We can find mappings to the MeSH heading “Syndactyly” when we use more general (parent) ICD or SNOMED codes where the concept is “any fusion of fingers or toes with or without fusion of bones”. The figure shows two ways of reaching this more general concept, either by using a parent ICD code or by using SNOMED. The indirect mapping through SNOMED becomes more necessary when the original coding system does not have a hierarchy or relations that enable navigation to a more general concept. CPT codes are an example: they do not have a native hierarchy, so an alternative hierarchy will be needed.
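The parent-code route described above can be sketched as a lookup that falls back to more general codes. This is an illustration only: the mapping table is hypothetical, and deriving an ICD-9 parent by dropping the last digit after the decimal point is an assumption about the code structure, not the project's mapping method. The SNOMED route is not covered.

```python
def icd9_parent(code):
    """Return the next more general ICD-9 code, or None at the top level.

    Assumes the parent is obtained by dropping the last digit after the
    decimal point, e.g. "755.12" -> "755.1" -> "755".
    """
    if "." not in code:
        return None
    head, tail = code.split(".", 1)
    tail = tail[:-1]
    return f"{head}.{tail}" if tail else head

def map_to_mesh(code, icd_to_mesh):
    """Walk up from code until a MeSH mapping is found.

    Returns (matched_icd_code, mesh_heading), or None if nothing maps.
    """
    current = code
    while current is not None:
        if current in icd_to_mesh:
            return current, icd_to_mesh[current]
        current = icd9_parent(current)
    return None

# Hypothetical mapping table: only the parent code maps to MeSH.
icd_to_mesh = {"755.1": "Syndactyly"}
match = map_to_mesh("755.12", icd_to_mesh)
print(match)
```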