The Language Diversity of Computing
Or, how to talk with a computer.
Jeremy Yang
(Mgr., Systems & Programming)
Translational Informatics Div.
Dept. of Internal Medicine
University of New Mexico
UNM D2K Day -- Feb. 20, 2015
1
Language Diversity Examples
Python, Perl, Natural Language, Icons, Menus
C++, Java, APIs, SQL, SPARQL
XML, XSD, XPath, URLs, bash
HTML, HTTP, ASCII, UTF-8, regex, TCP/IP
Controlled Vocabularies, REST, OWL, RDF
2
A Working Definition of “Language”
● Coherent symbology (symbolic system)
3
Languages: Some major advances
● FORTRAN (1953)
● COBOL (1960)
● C (1969)
● SQL (1979)
● WWW (~1990)
● Semantic Web (~2000)
○ RDF
○ SPARQL
○ OWL
● C++ (1979)
● Perl (1987)
● Python (1989)
● Java (1995)
4
Language merit vs. elitism
5
Why do we care about languages?
● Compatibility
● Efficiency
● Usability
● Knowledge representation
● Intelligence
● Evolution
Of course we care. Duh!
6
Q: So what is the problem?
A: Language gaps
CODE
JARGON
MEANING
“Interpretation”
MATH
7
Q: So what is the problem?
A: Standards (so many!)
“Why can’t my iPhone talk to my ...”
● TV
● Audio system
● Car
● Medical records
8
Domain study: Languages of Biomedical Knowledge
Computers learning and talking medicine
9
Our project:
Illuminating the Druggable Genome (IDG)
$4.9M
10
Illuminating the Druggable Genome
Knowledge Management Center (IDG-KMC)
Translational Informatics Division
Chief: Tudor Oprea, MD, PhD
IDG-KMC Workflow
11
Slide ℅ Tudor Oprea
12
IDG-KMC Collaborator Network
13
IDG-KMC Language Challenge:
Case #1: Drug Nomenclature
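One way to picture the drug nomenclature challenge is as a lookup problem: the same drug appears under generic, regional, brand, and abbreviated names, and software has to map them all to one entry. The sketch below is only an illustration, not the IDG-KMC method. The `SYNONYMS` table, the choice of canonical names, and the helpers `normalize`, `build_index`, and `canonical_name` are all hypothetical.

```python
# Hypothetical synonym table for illustration only; a real resource would
# hold many more names per drug and use stable identifiers, not names.
SYNONYMS = {
    "acetaminophen": ["acetaminophen", "paracetamol", "Tylenol", "APAP"],
    "aspirin": ["aspirin", "acetylsalicylic acid", "ASA"],
}


def normalize(name):
    """Fold case and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.casefold().split())


def build_index(synonyms):
    """Map every normalized synonym to its canonical name."""
    index = {}
    for canonical, names in synonyms.items():
        for name in names:
            index[normalize(name)] = canonical
    return index


def canonical_name(name, index):
    """Return the canonical name for `name`, or None if it is unknown."""
    return index.get(normalize(name))


INDEX = build_index(SYNONYMS)
print(canonical_name("Paracetamol", INDEX))
print(canonical_name("  ACETYLSALICYLIC   acid ", INDEX))
print(canonical_name("not-a-drug", INDEX))
```

Exact-match lookup after normalization is the simplest possible design. It cannot handle misspellings, salts, or combination products, which is part of why nomenclature remains a challenge.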
14
IDG-KMC Language Challenge:
Case #2: Disease Ontology
17k codes in ICD-9.
155k codes in ICD-10.
Medical “coding” is a huge $ item.
15
TCRD - Target Central Research Db
+------------+------------+--------+------+--------------------------------------------------------------------+--------+--------+
| doid | Disease | zscore | conf | Protein | idgfam | tdl |
+------------+------------+--------+------+--------------------------------------------------------------------+--------+--------+
| DOID:13189 | Gout | 3.512 | 1.8 | alpha-kinase 1 | Kinase | Tmacro |
| DOID:13189 | Gout | 3.214 | 1.6 | salt-inducible kinase 1 | Kinase | Tclin |
| DOID:13189 | Gout | 2.922 | 1.5 | melanocortin 3 receptor | GPCR | Tclin |
| DOID:13189 | Gout | 3.036 | 1.5 | olfactory receptor, family 2, subfamily M, member 3 | GPCR | Tdark |
| DOID:13189 | Gout | 2.797 | 1.4 | taste receptor, type 2, member 30 | GPCR | Tdark |
| DOID:13189 | Gout | 2.576 | 1.3 | taste receptor, type 2, member 16 | GPCR | Tmacro |
| DOID:13189 | Gout | 2.379 | 1.2 | hepatocyte nuclear factor 4, gamma | NR | Tgray |
| DOID:13189 | Gout | 2.441 | 1.2 | spleen tyrosine kinase | Kinase | Tclin |
| DOID:13189 | Gout | 1.948 | 1.0 | protein kinase, cGMP-dependent, type II | Kinase | Tclin |
| DOID:13189 | Gout | 1.798 | 0.9 | pannexin 1 | IC | Tmacro |
| DOID:13189 | Gout | 1.531 | 0.8 | transient receptor potential cation channel, subfamily V, member 1 | IC | Tclin+ |
| DOID:13189 | Gout | 1.517 | 0.8 | taste receptor, type 2, member 38 | GPCR | Tmacro |
| DOID:13189 | Gout | 1.565 | 0.8 | transient receptor potential cation channel, subfamily A, member 1 | IC | Tclin+ |
| DOID:13189 | Gout | 1.375 | 0.7 | transient receptor potential cation channel, subfamily M, member 3 | IC | Tmacro |
| DOID:13189 | Gout | 1.427 | 0.7 | interleukin-1 receptor-associated kinase 1 | Kinase | Tclin |
| DOID:13189 | Gout | 1.388 | 0.7 | adenosine kinase | Kinase | Tclin |
| DOID:1156 | Pseudogout | 2.410 | 1.2 | chloride channel, voltage-sensitive Kb | IC | Tmacro |
| DOID:1156 | Pseudogout | 2.423 | 1.2 | purinergic receptor P2Y, G-protein coupled, 11 | GPCR | Tchem |
| DOID:1156 | Pseudogout | 1.742 | 0.9 | calcium-sensing receptor | GPCR | Tclin+ |
| DOID:1156 | Pseudogout | 1.767 | 0.9 | activin A receptor, type I | Kinase | Tclin |
| DOID:1156 | Pseudogout | 1.454 | 0.7 | purinergic receptor P2Y, G-protein coupled, 2 | GPCR | Tclin+ |
+------------+------------+--------+------+--------------------------------------------------------------------+--------+--------+
Result of a sample query for the disease substring “gout” (table above).
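A substring query of this kind can be sketched with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions, not the actual TCRD query: the single-table schema `disease_target` and the function `find_by_disease_substring` are hypothetical, and the slide does not show how TCRD stores these columns. Three rows are copied from the result table, and one clearly labeled placeholder row is added so the filter has something to exclude.

```python
import sqlite3

# Hypothetical, simplified single-table schema; the slide shows only the
# result columns, not how TCRD actually stores them.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE disease_target (
           doid TEXT, disease TEXT, zscore REAL, conf REAL,
           protein TEXT, idgfam TEXT, tdl TEXT)"""
)

rows = [
    # Three rows copied from the slide's result table.
    ("DOID:13189", "Gout", 3.512, 1.8, "alpha-kinase 1", "Kinase", "Tmacro"),
    ("DOID:13189", "Gout", 3.214, 1.6, "salt-inducible kinase 1", "Kinase", "Tclin"),
    ("DOID:1156", "Pseudogout", 2.423, 1.2,
     "purinergic receptor P2Y, G-protein coupled, 11", "GPCR", "Tchem"),
    # Made-up placeholder row (not real data) so the filter excludes something.
    ("DOID:PLACEHOLDER", "Placeholder disease", 0.0, 0.0,
     "placeholder protein", "NA", "NA"),
]
conn.executemany("INSERT INTO disease_target VALUES (?, ?, ?, ?, ?, ?, ?)", rows)


def find_by_disease_substring(connection, substring):
    """Return rows whose disease name contains `substring`, ignoring case."""
    pattern = f"%{substring.lower()}%"
    cursor = connection.execute(
        "SELECT doid, disease, zscore, conf, protein, idgfam, tdl "
        "FROM disease_target "
        "WHERE LOWER(disease) LIKE ? "
        "ORDER BY disease, conf DESC",
        (pattern,),
    )
    return cursor.fetchall()


hits = find_by_disease_substring(conn, "gout")
for hit in hits:
    print(hit)
```

The `ORDER BY disease, conf DESC` clause is a guess at the ordering seen in the table (grouped by disease, highest `conf` first). The search term is passed as a bound parameter rather than pasted into the SQL string. Because the match is on a substring, “gout” also returns Pseudogout, a different disease with its own DOID. That is the kind of gap between a string and its meaning that the talk is about.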
16
Language Diversity of Computers
Final Thought:
“Can we talk?”*
℅ Joan Rivers, 1933-2014
17
