Health care special interest-i2b2

Published in: Technology

  1. Extracting information from clinical notes
     H. Yang, I. Spasic, F. Sarafraz, John A. Keane, Goran Nenadic
     School of Computer Science, University of Manchester
  2. Motivation & aim
     - Electronic clinical notes
       - electronic medical/health records
       - hospital discharge summaries
     - Extract information on
       - individual patients and their diseases
       - clinical practice
       - treatments, drugs used, etc.
     - Aim: support data analytics
       - e.g. monitoring quality
     - Huge interest locally and internationally
  3. Clinical notes
     - Highly condensed text
       - sometimes without proper sentences
       - hospital discharge summaries are more structured
       - list of medications, symptoms, etc.
     - Terminological variability
       - orthographic, acronyms, local conventions
     - Various sections
       - previous history, social/family background
  4. NLP challenges in clinical data
     - A series of international challenges in information extraction from clinical narratives
       - organisers: Informatics for Integrating Biology & the Bedside (i2b2)
       - 3 shared tasks so far
         - De-identification of medical records and identification of smokers from their clinical records (2007)
         - Identification of obesity & related diseases in patients from hospital discharge documents (2008)
         - Extraction of medications and related information from patients’ discharge documents (2009)
     - 2010 challenge
       - concept, assertions, relations
  5. i2b2 2008
     - Extract status of diseases in patients
       - obesity, diabetes mellitus, hypercholesterolemia, hypertriglyceridemia, hypertension, heart failure (16 in total)
       - status: yes, no, unmentioned, questionable
       - on textual and “intuitive” level
     - 28 teams worldwide
       - UoM ranked 1st in textual and 7th in intuitive
     - Our methodology
       - Term-based exact and approximate matching
       - Context-based pattern- and rule-based matching
       - Machine learning approach
     Yang, H., Spasic, I., Keane, J., Nenadic, G.: A Text Mining Approach to the Prediction of a Disease Status from Clinical Discharge Summaries, JAMIA 16(4):596-600
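The term-based step listed on slide 5 can be illustrated with a minimal Python sketch. It assumes a small hypothetical lexicon (`DISEASE_TERMS`) and uses the standard library's `difflib` for approximate matching. The slides do not say which similarity measure or cutoff the authors used, so the `cutoff` value and the function name `match_term` are assumptions for illustration only.

```python
import difflib

# Hypothetical mini-lexicon built from the diseases named on the slide;
# the authors' actual term lists are not given.
DISEASE_TERMS = [
    "obesity",
    "diabetes mellitus",
    "hypercholesterolemia",
    "hypertriglyceridemia",
    "hypertension",
    "heart failure",
]


def match_term(candidate, terms=DISEASE_TERMS, cutoff=0.85):
    """Return (term, kind), where kind is 'exact', 'approximate', or None."""
    text = candidate.strip().lower()
    if text in terms:
        return text, "exact"
    # Approximate match: best lexicon entry above the similarity cutoff.
    close = difflib.get_close_matches(text, terms, n=1, cutoff=cutoff)
    if close:
        return close[0], "approximate"
    return None, None
```

Exact lookup runs first so that approximate matching is only a fallback for spelling variants such as "hypertention".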
  6. Methodology
     - Linguistic pre-processing
       - section splitting, sentence splitting, chunking, POS tagging, parsing
     - Information extraction (rules, machine learning)
       - textual evidence extraction, section filtering, morphological clues (e.g. drug/disease name affixes)
     - Medical resources
       - Disease names
       - Drug names
       - Body parts
       - Symptoms
       - Abbreviations
       - Synonyms
     - Constructing results
       - Template filling, filtering negative results, relations and heuristics:
         Organ : Symptom, Symptom : Disease, Disease : Drug, Drug : Mode of application
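As a rough illustration of the section-splitting and sentence-splitting steps named on slide 6, the Python sketch below uses regular expressions only. It assumes that section headers appear as upper-case lines ending in a colon, which is a common but not universal layout for discharge summaries. The slides do not describe the authors' actual splitter, and the function names are hypothetical.

```python
import re

# Assumption: a section header is an upper-case line ending in a colon.
SECTION_HEADER = re.compile(r"^([A-Z][A-Z /]+):[ \t]*$", re.M)


def split_sections(note):
    """Map each 'HEADER:' line to the text that follows it."""
    parts = SECTION_HEADER.split(note)
    # parts = [preamble, header1, body1, header2, body2, ...]
    return {
        parts[i].strip(): parts[i + 1].strip()
        for i in range(1, len(parts) - 1, 2)
    }


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
```

Section filtering then amounts to keeping only the dictionary keys of interest. A naive splitter like this will break on abbreviations such as "b.i.d.", which is one reason clinical text usually needs a dedicated tokenizer.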
  7. Rule-based IE
     - Disease status patterns
       - context-based patterns
         [N] negative for CHF
         [Q] question of asthma
         [U] no known diagnosis of CAD
         [U] we should consider further asthma studies as an outpatient
       - semantics-based patterns
         [N] normal coronaries, a thin black man
     - Clinical resources used in sentence extraction
       - clinical inference rules, e.g., weight>90kg, LDL>160mg/dl, HDL<35mg/dl
       - medications, e.g., ‘anti-depressant’
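The context-based patterns on slide 7 lend themselves to a small regular-expression sketch in Python. The three patterns below are modelled directly on the slide's examples and are not the authors' rule set. The thresholds in `threshold_flags` are copied from the slide, but the slide does not say which disease each threshold supports, so the function only reports which thresholds are exceeded. All names are hypothetical.

```python
import re

# Illustrative patterns only, modelled on the slide's examples.
STATUS_PATTERNS = [
    ("N", re.compile(r"\bnegative for (?P<disease>[A-Za-z ]+)", re.I)),
    ("Q", re.compile(r"\bquestion of (?P<disease>[A-Za-z ]+)", re.I)),
    ("U", re.compile(r"\bno known diagnosis of (?P<disease>[A-Za-z ]+)", re.I)),
]


def disease_status(sentence):
    """Return (status_label, disease_text) for the first matching pattern."""
    for label, pattern in STATUS_PATTERNS:
        match = pattern.search(sentence)
        if match:
            return label, match.group("disease").strip()
    return None


def threshold_flags(weight_kg=None, ldl_mg_dl=None, hdl_mg_dl=None):
    """List which of the slide's example thresholds a patient exceeds."""
    flags = []
    if weight_kg is not None and weight_kg > 90:
        flags.append("weight>90kg")
    if ldl_mg_dl is not None and ldl_mg_dl > 160:
        flags.append("LDL>160mg/dl")
    if hdl_mg_dl is not None and hdl_mg_dl < 35:
        flags.append("HDL<35mg/dl")
    return flags
```

Patterns are tried in list order, so a real rule set would need an explicit priority when several cues occur in the same sentence.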
  8. Textual Annotation Results
     - Performance on Disease Status (Ranked 1st)
       Micro-average: Accuracy (0.9723)
       Macro-average: P (0.8482), R (0.7737), F-score (0.8052)

            #Eval  #Corr  #Gold  Precision  Recall  F-score
       Y     2267   2132   2192     0.9404  0.9726   0.9562
       N       56     40     65     0.7142  0.6153   0.6611
       Q       12      9     17     0.7500  0.5294   0.6206
       U     5709   5640   5770     0.9879  0.9774   0.9826
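The relationship between the per-class counts and the reported averages can be checked with a short Python sketch. It assumes the usual definitions: precision = #Corr / #Eval, recall = #Corr / #Gold, F-score as their harmonic mean, macro-averages as unweighted means over the four classes, and micro-averaged accuracy as total #Corr over total #Gold. The counts are copied from the table; the function names are illustrative.

```python
# Counts from the textual-annotation table: (evaluated, correct, gold).
COUNTS = {
    "Y": (2267, 2132, 2192),
    "N": (56, 40, 65),
    "Q": (12, 9, 17),
    "U": (5709, 5640, 5770),
}


def prf(evaluated, correct, gold):
    """Precision, recall and F-score for one class."""
    p = correct / evaluated if evaluated else 0.0
    r = correct / gold if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def macro_average(counts):
    """Unweighted mean of per-class precision, recall and F-score."""
    scores = [prf(*c) for c in counts.values()]
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))


def micro_accuracy(counts):
    """Total correct decisions over total gold-standard decisions."""
    total_correct = sum(c[1] for c in counts.values())
    total_gold = sum(c[2] for c in counts.values())
    return total_correct / total_gold
```

Under these definitions the counts reproduce the slide's figures to four decimal places. The gap between the micro-average (0.9723) and the macro F-score (0.8052) comes from the rare N and Q classes, which weigh as much as Y and U in the macro-average.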
  9. Intuitive Annotation Results
     - Performance on Disease Status (Ranked 7th)
       Micro-average: Accuracy (0.9572)
       Macro-average: P (0.6383), R (0.6294), F-score (0.6336)

            #Eval  #Corr  #Gold  Precision  Recall  F-score
       Y     2160   2068   2285     0.9574  0.9050   0.9304
       N     5236   5014   5100     0.9576  0.9831   0.9702
       Q        3      0     14     0       0        0
  10. i2b2 2009
      - Extract mentions of medication and related information
        - drugs the patient takes
        - dose, mode of application, frequency, duration, etc. (for each mention)
      - 19 teams worldwide
        - UoM ranked 3rd
      - Our approach was based on combining
        - extensive dictionaries
        - morphological and derivational patterns
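The combination of dictionaries and morphological patterns described on slide 10 might look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the three-entry `DRUG_LEXICON`, the drug-name suffixes, and the dosage, mode and frequency patterns are hypothetical stand-ins, not the authors' resources.

```python
import re

# Hypothetical resources; the authors' dictionaries are not given in the slides.
DRUG_LEXICON = {"aspirin", "metformin", "lisinopril"}
# Morphological clue: common drug-name suffixes.
DRUG_SUFFIX = re.compile(r"[a-z]+(?:statin|pril|olol)", re.I)
DOSAGE = re.compile(r"\b\d+(?:\.\d+)?\s?(?:mg|mcg|g|ml|units?)\b", re.I)
MODE = re.compile(r"\b(?:po|iv|oral(?:ly)?|intravenous(?:ly)?)\b", re.I)
FREQUENCY = re.compile(r"\b(?:daily|once a day|twice a day|bid|tid|qid)\b", re.I)


def first_match(pattern, text):
    """Return the first substring matching pattern, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None


def extract_medication(sentence):
    """Fill a flat template for the first drug mention in the sentence."""
    tokens = re.findall(r"[A-Za-z]+", sentence)
    drug = next(
        (t for t in tokens
         if t.lower() in DRUG_LEXICON or DRUG_SUFFIX.fullmatch(t)),
        None,
    )
    if drug is None:
        return None
    return {
        "medication": drug,
        "dosage": first_match(DOSAGE, sentence),
        "mode": first_match(MODE, sentence),
        "frequency": first_match(FREQUENCY, sentence),
    }
```

The suffix pattern lets the sketch recognise drug names missing from the lexicon, such as "Atorvastatin", which is the kind of gap that morphological clues are meant to cover.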
  11. Evaluation (F-measure)

      Medication   83.59%
      Dosage       82.67%
      Frequency    83.49%
      Mode         85.33%
      Duration     51.00%
      Reason       38.81%
      All fields   78.47%

      Spasić I, Sarafraz F, Keane JA, Nenadic G: “Medication Information Extraction with Linguistic Pattern Matching and Semantic Rules”, JAMIA (to appear)
  12. Summary
      - NLP and text mining techniques are useful for extraction of clinical data
        - disease status extraction: 95-97% accuracy
        - medication information extraction: 80% F-measure
      - Construction of reliable and sufficient resources
        - clinical terms and abbreviations (e.g., disease synonyms, symptoms, drugs)
        - context patterns related to diseases, medication, etc.
      - Domain knowledge required
        - construction of domain- and task-specific resources
        - complex clinical facts and conditions for inference
        - more comprehensive knowledge representation needed
