  1. i2b2 Challenge 2009 and Our Participation
     Irena Spasic, Farzaneh Sarafraz, Goran Nenadic
     Summer 2009
  2. About i2b2
     Informatics for Integrating Biology & the Bedside
     Related to NIH
     3 shared tasks so far
  3. The task: Medication Extraction
     Given
       Discharge reports
     Wanted
       Medication mention
       Dose
       Mode of application
       Frequency
       Duration
       Reason
       List/narrative
     Other
       Event
       Temporal
       Certainty
  4. Example
     Ciprofloxacin 500 mg q.6h. for remaining four doses baby aspirin 81 mg daily , Lasix 40 mg b.i.d. , for three days along with potassium chloride slow release 20 mEq b.i.d. for three days , Motrin 400 mg q.8h. p.r.n. Pain
     The patient had received a total of five units of packed red blood cells due to blood loss
  5. Regulations/requirements
     Medical requirements
       Drug taken by patient
       No allergies
       No food, water, diet, tobacco, alcohol, illicit drugs
     Linguistic requirements
       The most informative base adjective phrase or the longest base noun phrase as reason
  6. Required output
     Event-based annotation
     Repeat individual mention for each event
       “Aspirin for headache and for leg pain”
       Aspirin … headache
       Aspirin … leg pain
     Semantic-level expectations
       NITROGLYCERIN 1/150 ( 0.4 MG ) 1 TAB SL q5min x 3
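The event-based requirement on slide 6 can be illustrated with a minimal sketch. The `MedicationEvent` class and the `expand_events` function are hypothetical names introduced here for illustration; they are not part of the i2b2 specification or of the authors' system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MedicationEvent:
    """One output entry: a medication mention paired with at most one reason."""
    medication: str
    reason: Optional[str] = None


def expand_events(medication: str, reasons: List[str]) -> List[MedicationEvent]:
    """Repeat the medication mention once per reason, giving one entry per event."""
    if not reasons:
        # No reason found: still emit a single entry for the medication.
        return [MedicationEvent(medication)]
    return [MedicationEvent(medication, reason) for reason in reasons]


# "Aspirin for headache and for leg pain" yields two separate entries.
for event in expand_events("Aspirin", ["headache", "leg pain"]):
    print(event)
```

The point of the sketch is only that a single mention in the text may produce several output entries, one per event.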
  7. Training and test data
     Ground Truth, 27 records
       Manually annotated by “PG students”
       Scrutinised by the community
       Relative f-score: ~60%
     Unannotated training data: 620
     Test data: 260
  8. Our system
     Linguistic Preprocessing
       Input: plain ASCII
       Output: XML
     Rules
       MinorThird
     Template Filling
  9. Preprocessing
     Split sentences
       A sentence and paragraph breaker
       NaCTeM: sptoolkit.jar
     POS tagging
       A part-of-speech tagger for English
       Tsujii: postagger
     Parsing (chunking)
       CFG parser
       Tsujii: chunkparser
  10. Rules
      Medication Dictionary (> 1000)
      Morphological: medication affix (> 100)
        -bicine, -caine, etc.
      Precedes a mode
        Inhaler, supplement, etc.
      Medication type
        Cardiac, cardiovascular (~100)
      Symptoms (~100)
        Chest discomfort, etc.
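As a rough illustration of the first two rule types on slide 10 (dictionary lookup and medication affixes), the sketch below flags candidate medication tokens. The tiny word list and the function names are assumptions made for this example; the actual system used MinorThird rules with a dictionary of over 1000 entries and over 100 affixes, which are not reproduced here.

```python
# Hypothetical miniature stand-ins for the real dictionary and affix list.
MEDICATION_DICTIONARY = {"ciprofloxacin", "aspirin", "lasix", "motrin"}
MEDICATION_SUFFIXES = ("bicine", "caine")  # the two suffixes named on the slide


def is_medication_candidate(token: str) -> bool:
    """Return True if the token is in the dictionary or ends with a known suffix."""
    word = token.lower().strip(".,;:")
    if word in MEDICATION_DICTIONARY:
        return True
    return word.endswith(MEDICATION_SUFFIXES)


def find_candidates(text: str) -> list:
    """Whitespace-tokenise the text and keep tokens that look like medications."""
    return [token for token in text.split() if is_medication_candidate(token)]


print(find_candidates("Ciprofloxacin 500 mg q.6h. , Lasix 40 mg b.i.d. , lidocaine patch"))
```

A suffix rule lets the system catch names that are missing from the dictionary, at the cost of occasional false positives.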
  11. Word lists and regular expressions
      Dosage, mode, frequency
      Duration (While, for, etc.)
      Reason
        Head
          Diseases
          Symptoms (pain, agitation, etc.) ~20
          Affixes (hyper-, -emia, etc.)
        Modifier (acute, chronic, etc.) <100
        Time phrases, Body parts
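The slides do not show the actual expressions, so the following is only a sketch of how dosage and frequency might be picked out with regular expressions. Both patterns and the `extract_fields` helper are assumptions written for this example and cover only a handful of forms seen in the slide 4 text.

```python
import re

# Assumed pattern: a number followed by a unit such as mg, mEq or units.
DOSAGE = re.compile(r"\b\d+(?:\.\d+)?\s?(?:mg|meq|units?)\b", re.IGNORECASE)

# Assumed pattern: a few common frequency abbreviations (q.6h., b.i.d., t.i.d., q.d.) and "daily".
FREQUENCY = re.compile(r"\b(?:q\.?\d+h\.?|b\.i\.d\.|t\.i\.d\.|q\.d\.|daily)", re.IGNORECASE)


def extract_fields(text: str) -> dict:
    """Return all dosage and frequency matches in order of appearance."""
    return {
        "dosage": DOSAGE.findall(text),
        "frequency": FREQUENCY.findall(text),
    }


print(extract_fields("Lasix 40 mg b.i.d. , Motrin 400 mg q.8h. p.r.n. pain"))
```

A real rule set would also need mode, duration, and many more abbreviation variants; this sketch only shows the general shape of such patterns.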
  12. Producing output
      Remove allergies
      Remove laboratory results
      Merge labels
        <m>INSULIN</m> <m>GLARGINE</m>
        <f>after dialysis</f> on <f>Monday</f>-<f>Wednesday</f>-<f>Friday</f>
      Remove negated medications
        “patient instructed not to take Viagra.” etc.
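The "merge labels" step on slide 12 can be sketched as follows, assuming that only adjacent spans with the same tag and nothing but whitespace between them are joined. The function name is hypothetical, and how the system treated other separators (such as "on" or "-" in the frequency example) is not stated in the slides, so that case is left untouched.

```python
import re

# A closing tag, whitespace, then an opening tag with the same name.
ADJACENT_SAME_LABEL = re.compile(r"</(\w+)>(\s+)<\1>")


def merge_adjacent_labels(text: str) -> str:
    """Join neighbouring spans that share a label, keeping the whitespace between them."""
    return ADJACENT_SAME_LABEL.sub(r"\2", text)


print(merge_adjacent_labels("<m>INSULIN</m> <m>GLARGINE</m>"))
```

Spans with different labels, for example a medication followed by a dosage, are left as they are because the back-reference requires the same tag name on both sides.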
  13. Evaluation process
      Small training data (27)
        Organisers
        Community
      Gold standard test data (260)
        Annotated by participants
        Merge and tie-break
        Community
      Silver data (620)
        Voting
  14. Evaluation on ground truth
      Match    Direction   Level          Field  Score
      inexact  horizontal  system-level   X      0.8776
      inexact  horizontal  patient-level  X      0.8928
      inexact  vertical    system-level   do     0.9150
      inexact  vertical    patient-level  do     0.9160
      inexact  vertical    system-level   f      0.9172
      inexact  vertical    patient-level  f      0.9197
      inexact  vertical    system-level   mo     0.9441
      inexact  vertical    patient-level  mo     0.9471
      inexact  vertical    system-level   m      0.9544
      inexact  vertical    patient-level  m      0.9519
      inexact  vertical    system-level   r      0.5260
      inexact  vertical    patient-level  r      0.3876
      inexact  vertical    system-level   du     0.7958
      inexact  vertical    patient-level  du     0.5846
  15. Preliminary evaluation on test data
      Match    Direction   Level          Field  Score
      inexact  horizontal  system-level   X      0.7847
      inexact  horizontal  patient-level  X      0.7755
      inexact  vertical    system-level   do     0.8267
      inexact  vertical    patient-level  do     0.8155
      inexact  vertical    system-level   f      0.8349
      inexact  vertical    patient-level  f      0.8289
      inexact  vertical    system-level   mo     0.8359
      inexact  vertical    patient-level  mo     0.8256
      inexact  vertical    system-level   m      0.8533
      inexact  vertical    patient-level  m      0.8541
      inexact  vertical    system-level   r      0.3881
      inexact  vertical    patient-level  r      0.3883
      inexact  vertical    system-level   du     0.51
      inexact  vertical    patient-level  du     0.4969
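To make the gap between the two evaluations easier to see, the short sketch below subtracts the system-level test scores from the system-level ground-truth scores reported on slides 14 and 15. The field codes are copied as they appear; reading them as X (all fields), m (medication), do (dosage), mo (mode), f (frequency), du (duration) and r (reason) is an assumption based on the task description, since the slides give no legend.

```python
# System-level "inexact" scores copied from slides 14 and 15.
GROUND_TRUTH = {"X": 0.8776, "do": 0.9150, "f": 0.9172, "mo": 0.9441,
                "m": 0.9544, "r": 0.5260, "du": 0.7958}
TEST = {"X": 0.7847, "do": 0.8267, "f": 0.8349, "mo": 0.8359,
        "m": 0.8533, "r": 0.3881, "du": 0.51}

# Difference per field: positive means the score fell on the test data.
drops = {field: GROUND_TRUTH[field] - TEST[field] for field in GROUND_TRUTH}

for field, drop in sorted(drops.items(), key=lambda item: item[1], reverse=True):
    print(f"{field:>2}  {GROUND_TRUTH[field]:.4f} -> {TEST[field]:.4f}  (drop {drop:.4f})")
```

Every field scores lower on the test data, with the largest drops in duration and reason, the two fields that were already weakest on the ground truth.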