Alex M. Clark & Leah R. McEwen
Mixtures InChI
A story of how standards drive upstream products
alex@collaborativedrug.com
It's always a mixture
✤ The pure molecule approximation has value... but in the real world:
State of mixture informatics
✤ Bespoke formats exist
✤ Generally in silos:
‣ low machine readability
‣ low interoperability
✤ Machine readable molecules ~ ½ century ago, but mixtures limited to text
Phenolphthalein, 1% (w/v) Indicator Solution in 95% Ethanol
Mixtures InChI
molmatinf.com/minchidemo
MInChI=0.00.1S/ & & /n{{ & }& }/g{{ &}& }
CH2O/c1-2/h1H2 1 37wf-2
CH4O/c1-2/h2H,1H3 3 10:15pp0
H2O/h1H2 2
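The concentration tokens on this slide (37wf-2, 10:15pp0) look like a number or low:high range, a two-letter unit code, and a power-of-ten exponent. That reading is inferred from the two examples shown here, not quoted from the MInChI specification, and `parseConcentration` is a hypothetical helper rather than part of any MInChI tool. A minimal sketch under those assumptions:

```typescript
// Hypothetical helper: splits a concentration token such as "37wf-2" or
// "10:15pp0" into its parts. The token layout is an assumption based on
// the slide's examples, not a quote from the MInChI specification.
interface ConcentrationToken {
    low: number;      // lower bound (equals high for a single value)
    high: number;     // upper bound
    unitCode: string; // two-letter unit code, e.g. "wf" or "pp"
    exponent: number; // signed power-of-ten exponent
}

function parseConcentration(token: string): ConcentrationToken | null {
    // number, optional ":number", two lowercase letters, signed integer
    const pattern = /^(\d+(?:\.\d+)?)(?::(\d+(?:\.\d+)?))?([a-z]{2})(-?\d+)$/;
    const match = pattern.exec(token);
    if (!match) return null;
    const low = parseFloat(match[1]);
    const high = match[2] !== undefined ? parseFloat(match[2]) : low;
    return { low, high, unitCode: match[3], exponent: parseInt(match[4], 10) };
}
```

Returning `null` for anything that does not match keeps the helper safe to run over free text, where most strings will not be concentration tokens.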
One brick at a time...
✤ Have:
☑︎ use cases (talking to people: demand is real)
☑︎ standard (MInChI specification, IUPAC endorsed)
☒ tools (... custom or limited purpose)
☒ data (... not machine readable)
☒ community (... not informatics oriented)
✤ What to do? Chicken vs. egg problems...
‣ no community without data
‣ no data without tools
‣ no tools without community
have to build simultaneously
Tools
✤ 2018: NIH SBIR grant awarded to Collaborative Drug Discovery
✤ First step: open source mixture editor and software libraries
✤ Coded in TypeScript, cross-compiled to JavaScript, for:
‣ web
‣ desktop (via Electron)
‣ server (via NodeJS)
✤ Operates on "Mixfile" which is ELN-like, JSON-based, mixture analog of Molfile
github.com/cdd/mixtures
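To make "JSON-based, mixture analog of Molfile" concrete, here is a rough sketch of a nested, Mixfile-like record for the phenolphthalein indicator solution quoted earlier in the deck. The `MixComponent` interface and its field names are illustrative assumptions, not taken from the slides, so the repository above remains the reference for the actual format.

```typescript
// Illustrative only: a Mixfile-like hierarchy. Field names are assumptions
// made for this sketch and are not quoted from the slides.
interface MixComponent {
    name?: string;
    quantity?: number;
    units?: string;
    contents?: MixComponent[]; // nested sub-components
}

const indicator: MixComponent = {
    name: "Phenolphthalein, 1% (w/v) Indicator Solution in 95% Ethanol",
    contents: [
        { name: "phenolphthalein", quantity: 1, units: "% (w/v)" },
        { name: "95% ethanol", contents: [{ name: "ethanol", quantity: 95, units: "%" }] },
    ],
};

// Collect the names of components that have no sub-components.
function leafNames(component: MixComponent): string[] {
    const children = component.contents || [];
    if (children.length === 0) return [component.name || "(unnamed)"];
    const names: string[] = [];
    for (const child of children) names.push(...leafNames(child));
    return names;
}

// The record is plain JSON, so it serialises and parses without custom code.
const serialised: string = JSON.stringify(indicator, null, 2);
```

Because the hierarchy is ordinary JSON, the same record can move between web, desktop and server code without a dedicated parser.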
Upstream/Downstream
[Diagram. Labels: Mixfile, MInChI; commercial, open source, standards, use cases; customers; searching, indexing, labelling, categorising; axis: time]
ELN integration
✤ Mixture creation is also part of a commercial product
✤ Scientists use the ELN already...
✤ ... machine readable data is a side effect of normal use
✤ First class citizens:
‣ molecules
‣ reactions
‣ mixtures
collaborativedrug.com
Data
✤ Bootstrap from text sources
✤ Proprietary deep learning algorithm
✤ ~30K mixtures marked up, public release
✤ Substantial body of exemplars, and upstream test data for MInChI generation
✤ Can rapidly mark up inventories and vendor catalogs
✤ Integrated into software-as-a-service products
Support resources
✤ Looking up known content speeds up data creation...
✤ INCI and UNII collections available for quick search
Demidigital
✤ Partially marked up data can be upgraded by document-wide options...
✤ Currently in design phase
[Figure: labels MATERIAL, MATERIAL, QUANTITY; items A, B, C]
Community
✤ Creating technology is easy, getting everyone to use it is hard...

✤ Requires concurrent strategies
✤ Endorsement by respected standards organisations is a good start

✤ InChI derivatives have enthusiastic champions (that's us!)
Engagement
✤ Code it up: using MInChI notation is easy enough

✤ Got use cases? Let us know

✤ Spread the word: data resources need to be digitised
Further viewing
✤ Peer reviewed literature:
✤ Webinars:
2019: www.youtube.com/watch?v=PcAJ4HoRnFU

Capturing mixtures — bringing informatics to the world of practical chemistry
2021: www.youtube.com/watch?v=0ILc0owuEzQ (starts at 1:05:00)

Mixtures as first class citizens in the realm of informatics
2020: www.youtube.com/watch?v=aSQEVKKnrWw (starts at 4:13:00)

Mixtures: informatics for formulations and consumer products
github.com/cdd/mixtures
Further work
✤ Finalising MInChI 1.0 specification, reference implementation, validation

✤ MInChI needs to extend to less well defined chemical entities

‣ variable structures

‣ polymers

‣ biologics

‣ nanomaterials

‣ reaction products

✤ Properties and metadata: ontology based / IUPAC Gold Book

✤ Implementation at scale: registration systems
MInChI Open Meeting: 20 April 11am-1pm (EDT)
Questions?
✤ Contact:

‣ Leah R. McEwen lrm1@cornell.edu (Cornell University, IUPAC/InChI Trust)

‣ Alex M. Clark alex@collaborativedrug.com (Collaborative Drug Discovery)

✤ Thanks to the MInChI team
