Accelerating Scientific Research Through Machine Learning and Graph


Miroculus is a molecular diagnostics company that leverages the potential of microRNAs as biomarkers and has created the easiest-to-use automated platform for their detection. MicroRNAs are small non-coding RNA molecules whose primary role is to regulate the expression of our genes. Since their discovery circulating in body fluids such as blood plasma/serum, urine and saliva, a multitude of studies have provided evidence that detecting specific microRNA molecules can give clues about a person’s health status, so they may be used as biomarkers for various conditions.

Loom is an up-to-date snapshot of the scientific literature on microRNAs that we built to expedite our own research. As of today, there is no compelling way to access much of the microRNA research. With Loom's easy-to-use, interactive UI, a researcher can quickly locate the relevant sentences, across many publications, that relate specific microRNAs to her disease or gene of interest. Our objective with this tool is to provide a visually compelling and complete overview of how microRNAs relate to specific diseases and genes.

On the backend, Loom comprises four microservices. The first is a listener that fetches, every day, the new publications available in the NCBI databases: PubMed for abstracts and PMC for full-text, open-access publications. Next, a natural language processor scans each publication, breaking it down into its constituent sentences and detecting mentions of microRNAs, genes and diseases. Within each sentence, a machine learning scorer evaluates the strength and type of relationship on a scale from 0 to 1 and writes the results to a graph database. The UI then queries the resulting graph database in real time to retrieve the sentences and relationships the user is interested in.
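As a rough illustration of the listener step, the sketch below builds (but does not send) a query URL for NCBI's public E-utilities `esearch` endpoint, limited to records added in the last day. The search term, the one-day window and the function name are assumptions made for this example; the description does not say how Loom's listener actually queries NCBI.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_daily_query(db: str, term: str = "microRNA", days: int = 1,
                      retmax: int = 100) -> str:
    """Return an esearch URL for records added to `db` in the last `days` days.

    `term` is an assumed search expression, not Loom's actual query.
    """
    params = {
        "db": db,            # "pubmed" for abstracts, "pmc" for full text
        "term": term,
        "reldate": days,     # look back this many days from today
        "datetype": "edat",  # Entrez date; allowed values vary by database
        "retmax": retmax,    # maximum number of IDs to return
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# One query per database the listener watches.
urls = {db: build_daily_query(db) for db in ("pubmed", "pmc")}
for db, url in urls.items():
    print(db, url)
```

A scheduler could run this once a day and pass the returned IDs on to the fetching and NLP steps.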

Accelerating scientific research through Machine Learning & Graph
Jorge Soto
CTO, Miroculus
Antonio Molins
VP Data Science, Miroculus
SAN FRANCISCO
13-14 OCTOBER 2016

microRNA
1993: lin-4 in C. elegans
2000: let-7 in H. sapiens

microRNAs are tissue specific

microRNA expression across different cancer types
gastrointestinal tract samples
epithelial origin samples
Jun Lu et al. MicroRNA expression profiles classify human cancers. Nature 435, 834-838 (9 June 2005)
2002: 1st link to cancer

microRNAs found cell-free in biofluids
2008: plasma

microRNA as an ideal biomarker
Highly stable
Organ/tissue specific
Detectable in blood

microRNAs reflect your physiology
Red blood cells
Liver
Muscle
Heart

3 simple steps
Sample collection (20 mins)
Insert sample in a cartridge device (60 mins)
Automated workflow and data analysis (real time)

Tissue vs. circulating microRNA-related publications
2000: let-7 in H. sapiens
2008: plasma
Adapted from Nair et al., Am J Epid, 2014

Search Choose
Retrieve Read Learn

What has the elephant learnt so far?
Retrieved 1,000,000+ articles
192,496,883 lines
199,639,090 sentences
111,382,775 concept mentions

“As shown in Fig. 3, DADS inhibited breast cancer growth by up-regulating MiR-34A expression.”
DADS
Breast cancer
miR-34A
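To make the entity-detection step concrete, here is a minimal sketch that tags the example sentence using a simplified regular expression for microRNA names and two tiny hand-written term lists. The term lists, the pattern and the function name are assumptions for illustration only; the slides do not describe how Loom's NLP service recognises entities.

```python
import re

# Toy lexicons chosen only to cover the example sentence (assumptions).
DISEASES = ["breast cancer"]
OTHER_TERMS = ["DADS"]
# Simplified pattern for names such as miR-34a, MiR-34A or let-7.
MIRNA_PATTERN = re.compile(r"\b(?:mir-\d+[a-z]?|let-\d+[a-z]?)\b", re.IGNORECASE)


def find_mentions(sentence: str) -> list[tuple[str, str]]:
    """Return (entity_type, matched_text) pairs found in one sentence."""
    mentions = []
    for match in MIRNA_PATTERN.finditer(sentence):
        mentions.append(("microRNA", match.group(0)))
    for term in DISEASES:
        # Disease names are matched case-insensitively.
        if re.search(rf"\b{re.escape(term)}\b", sentence, re.IGNORECASE):
            mentions.append(("disease", term))
    for term in OTHER_TERMS:
        # Abbreviations are matched case-sensitively to limit false hits.
        if re.search(rf"\b{re.escape(term)}\b", sentence):
            mentions.append(("other", term))
    return mentions


sentence = ("As shown in Fig. 3, DADS inhibited breast cancer growth "
            "by up-regulating MiR-34A expression.")
mentions = find_mentions(sentence)
print(mentions)
```

Dictionary and pattern matching of this kind only finds mentions; deciding whether the sentence actually asserts a relationship between them is left to the scoring step.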

Distant supervision for relationship classification
Blog post in MSFT dev site
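The general idea of distant supervision can be sketched as follows: instead of hand-labelling sentences, every sentence that co-mentions a pair already listed in a seed knowledge base is treated as a (noisy) positive training example, and co-mentions of unlisted pairs as negatives. The seed pair, the placeholder name `miR-X` and the function name below are hypothetical; this is not the pipeline from the referenced blog post.

```python
from itertools import product

# Hypothetical seed knowledge base of (microRNA, disease) pairs assumed to be
# related. In practice this would come from a curated database.
KNOWN_RELATIONS = {("mir-34a", "breast cancer")}


def label_sentence(mirnas, diseases, known=KNOWN_RELATIONS):
    """Yield (mirna, disease, label) for every co-mentioned pair.

    label is 1 when the pair appears in the seed knowledge base and 0
    otherwise. The labels are noisy: co-mention does not guarantee that the
    sentence itself expresses the relation.
    """
    for mirna, disease in product(mirnas, diseases):
        key = (mirna.lower(), disease.lower())
        yield mirna, disease, int(key in known)


# Entity mentions per sentence, as an upstream NLP step might emit them.
# "miR-X" is a placeholder name, not a real microRNA.
sentences = [
    {"mirnas": ["MiR-34A"], "diseases": ["breast cancer"]},
    {"mirnas": ["miR-X"], "diseases": ["breast cancer"]},
]
training_rows = [row for s in sentences
                 for row in label_sentence(s["mirnas"], s["diseases"])]
print(training_rows)
```

The resulting rows, paired with features extracted from each sentence, could then be used to train a relationship classifier.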

Loom architecture
Listener: I can...
- connect to NCBI databases (PubMed and PMC) and fetch new publications
NLP: I can...
- identify when microRNAs are mentioned in relationship to genes or diseases
- split the results into sentences
Scorer: I can...
- score from 0 to 1 the accuracy of the relations between the entities using machine learning
Graph: I can...
- store the relationships and their score in a graph database
- be queried about each node and their relationships
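The Graph service's two responsibilities, storing scored relationships and answering queries about a node, can be illustrated with a small in-memory stand-in. This is only a sketch: the class and field names are hypothetical, the scores are made-up placeholders, and a real deployment would use a graph database and its query language rather than a Python dictionary.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One scored sentence linking two entities."""
    source: str
    target: str
    score: float      # scorer output in [0, 1]
    sentence: str


class EvidenceGraph:
    """Tiny adjacency-list stand-in for a graph database (illustrative only)."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add(self, ev: Evidence) -> None:
        if not 0.0 <= ev.score <= 1.0:
            raise ValueError("score must be between 0 and 1")
        # Index the edge under both endpoints so either node can be queried.
        self._edges[ev.source].append(ev)
        self._edges[ev.target].append(ev)

    def neighbours(self, node: str, min_score: float = 0.0) -> list[Evidence]:
        """Return evidence touching `node`, strongest first."""
        hits = [ev for ev in self._edges.get(node, []) if ev.score >= min_score]
        return sorted(hits, key=lambda ev: ev.score, reverse=True)


graph = EvidenceGraph()
# Scores below are made-up placeholders, not Loom outputs.
graph.add(Evidence("miR-34a", "breast cancer", 0.9, "example sentence A"))
graph.add(Evidence("miR-34a", "gene-X", 0.4, "example sentence B"))
strong = graph.neighbours("miR-34a", min_score=0.5)
print([(ev.target, ev.score) for ev in strong])
```

Filtering by `min_score` mirrors how a UI might hide weakly supported relationships while still keeping them in the store.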

Weiland et al, RNA biology, 2012
When discovery > validation

“Most clinical research therefore fails to be
useful not because of its findings but because of
its design” - JPA Ioannidis, PLOS Medicine, 2016

Unmet clinical need for stomach cancer patients

Clinical study design
In collaboration with:
Inclusion criteria: Individuals suspected of stomach cancer eligible for endoscopies.
Collection: All samples collected from 2010 to 2013.
Multi-center: Samples collected in Chile, Lithuania and Latvia.
Cohort distribution: 650 samples including the entire cascade of the disease.
Machine-learned model: Samples split 50/50 in two groups doubly balanced per country, gender, diagnosis, subtype and stage.
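One simple way to obtain a 50/50 split that stays balanced across several attributes is to group samples by the combination of those attributes and deal each group alternately into the two halves. The sketch below shows that idea on synthetic records; the field names, the function name and the data are assumptions, and the slides do not state which balancing procedure the study actually used.

```python
import random
from collections import defaultdict


def stratified_halves(samples, keys, seed=0):
    """Split `samples` (dicts) into two groups balanced within each stratum.

    A stratum is the tuple of values for `keys`. Within each stratum the
    samples are shuffled and dealt alternately to the two groups, so the
    group sizes differ by at most one per stratum.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for sample in samples:
        strata[tuple(sample[k] for k in keys)].append(sample)

    group_a, group_b = [], []
    for members in strata.values():
        rng.shuffle(members)
        group_a.extend(members[0::2])
        group_b.extend(members[1::2])
    return group_a, group_b


# Synthetic records with hypothetical field names; not study data.
samples = []
for country in ("Chile", "Lithuania", "Latvia"):
    for gender in ("F", "M"):
        for diagnosis in ("cancer", "control"):
            for _ in range(4):
                samples.append({"id": len(samples), "country": country,
                                "gender": gender, "diagnosis": diagnosis})

group_a, group_b = stratified_halves(samples, keys=("country", "gender", "diagnosis"))
print(len(group_a), len(group_b))
```

With odd-sized strata this simple version always gives the extra sample to the first group, so a production split would want to alternate which group receives the remainder.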

Proprietary 7-microRNA diagnostic signature
Decision boundary set to maximize accuracy for the observed prevalence
Robust regardless of stage
Good performance across ethnicities
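Setting a decision boundary to maximize accuracy can be illustrated by scanning candidate thresholds over classifier scores and keeping the one with the most correct calls. The scores and labels below are made up, and the function is a generic sketch rather than the procedure used for the signature.

```python
def best_threshold(scores, labels):
    """Return (threshold, accuracy) maximising accuracy on the given data.

    A sample is called positive when score >= threshold. Because accuracy is
    computed on the cohort as observed, each class is implicitly weighted by
    its prevalence in that cohort.
    """
    # float("inf") represents "call every sample negative".
    candidates = sorted(set(scores)) + [float("inf")]
    best = (None, -1.0)
    for t in candidates:
        correct = sum((s >= t) == bool(y) for s, y in zip(scores, labels))
        acc = correct / len(labels)
        if acc > best[1]:  # strict comparison keeps the lowest tied threshold
            best = (t, acc)
    return best


# Made-up classifier scores and labels for illustration only.
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1]
threshold, accuracy = best_threshold(scores, labels)
print(threshold, round(accuracy, 3))
```

If the deployment population has a different prevalence from the study cohort, the accuracy-maximizing threshold generally shifts, which is why the slide ties the boundary to the observed prevalence.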

Miroculus test compared to gold standard
Without Miroculus / With Miroculus (- / +)
NPV = 99.8%
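For readers unfamiliar with the metric, negative predictive value (NPV) is the probability that a person with a negative test result is truly disease-free. The sketch below computes it from sensitivity, specificity and prevalence using Bayes' rule. The input values are illustrative only; the slides do not report the sensitivity, specificity or prevalence behind the 99.8% figure.

```python
def negative_predictive_value(sensitivity, specificity, prevalence):
    """NPV = P(no disease | negative test), via Bayes' rule."""
    true_neg = specificity * (1.0 - prevalence)     # healthy and test-negative
    false_neg = (1.0 - sensitivity) * prevalence    # diseased but test-negative
    return true_neg / (true_neg + false_neg)


# Illustrative inputs only, not values from the study.
npv = negative_predictive_value(sensitivity=0.9, specificity=0.8, prevalence=0.05)
print(f"{npv:.4f}")
```

Because prevalence enters the formula directly, NPV tends to be high whenever the disease is uncommon in the tested population, so it is best read alongside sensitivity and specificity.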

Future of diagnostics
Ideal biomarker
Enabling technology
Advanced data analysis
Cost effective, simple and accurate detection

jorge@miroculus.com 
antonio@miroculus.com
http://loom.bio

1. Accelerating scientific research through Machine Learning & Graph. Jorge Soto, CTO, Miroculus. Antonio Molins, VP Data Science, Miroculus. San Francisco, 13-14 October 2016

3. microRNAs

4. DNA

5. mRNA DNA

6. mRNA PROTEIN DNA

7. mRNA PROTEIN DNA microRNA

8. microRNA: 1993 lin-4 in C. elegans; 2000 let-7 in H. sapiens

9. microRNAs are tissue specific. 1993 lin-4 in C. elegans; 2000 let-7 in H. sapiens

10. microRNA expression across different cancer types: gastrointestinal tract samples, epithelial origin samples. Jun Lu et al. MicroRNA expression profiles classify human cancers. Nature 435, 834-838 (9 June 2005). 1993 lin-4 in C. elegans; 2000 let-7 in H. sapiens; 2002 1st link to cancer

11. microRNAs found cell-free in biofluids. 1993 lin-4 in C. elegans; 2000 let-7 in H. sapiens; 2002 1st link to cancer; 2008 plasma

12. Highly Stable Organ/Tissue Specific Detectable in blood microRNA as an ideal biomarker

13. microRNAs reflect your physiology Red blood cells Liver Muscle Heart

20. 3 simple steps Sample collection 20 mins

22. Assay sensitivity

23. 3 simple steps Sample collection Insert sample in a cartridge device 20 mins 60 mins

24. Digital microfluidics technology

25. 3 simple steps Sample collection Automated workflow and data analysis real time Insert sample in a cartridge device 20 mins 60 mins

28. Tissue vs. circulating microRNA-related publications. 2000 let-7 in H. sapiens; 2008 plasma. Adapted from Nair et al., Am J Epid, 2014

39. Breast Cancer miR-34a DADS

40. Search Choose Retrieve Read Learn

41. Search Choose Retrieve Read Learn Retrieve Read Learn Retrieve Read Learn Retrieve Read Learn Retrieve Read Learn Retrieve Read Learn

42. What has the elephant learnt so far? Retrieved 1,000,000+ articles; 192,496,883 lines; 199,639,090 sentences; 111,382,775 concept mentions

43. What has the elephant learnt so far? “As shown in Fig. 3, DADS inhibited breast cancer growth by up-regulating MiR-34A expression.”

44. What has the elephant learnt so far? “As shown in Fig. 3, DADS inhibited breast cancer growth by up-regulating MiR-34A expression.” DADS

45. What has the elephant learnt so far? “As shown in Fig. 3, DADS inhibited breast cancer growth by up-regulating MiR-34A expression.” DADS, Breast Cancer

46. What has the elephant learnt so far? “As shown in Fig. 3, DADS inhibited breast cancer growth by up-regulating MiR-34A expression.” DADS, Breast Cancer, miR-34A

47. Distant supervision for relationship classification Blog post in MSFT dev site

52. [cypher]

53. [cypher] [cypher]

54. www.loom.bio

55. Loom architecture. Listener, I can... connect to NCBI databases (PubMed and PMC) and fetch new publications. NLP, I can... identify when microRNAs are mentioned in relationship to genes or diseases; split the results into sentences. Scorer, I can... score from 0 to 1 the accuracy of the relations between the entities using machine learning. Graph, I can... store the relationships and their score in a graph database; be queried about each node and their relationships.

57. Weiland et al, RNA biology, 2012 When discovery > validation

58. “Most clinical research therefore fails to be useful not because of its findings but because of its design” - JPA Ioannidis, PLOS Medicine, 2016

60. Unmet clinical need for stomach cancer patients

62. Clinical study design. Inclusion criteria: Individuals suspected of stomach cancer eligible for endoscopies. Collection: All samples collected from 2010 to 2013. Multi-center: Samples collected in Chile, Lithuania and Latvia. Cohort distribution: 650 samples including the entire cascade of the disease. Machine-learned model: Samples split 50/50 in two groups doubly balanced per country, gender, diagnosis, subtype and stage. In collaboration with:

63. Proprietary 7-microRNA diagnostic signature

64. Proprietary 7-microRNA diagnostic signature Decision boundary set to maximize accuracy for the observed prevalence

65. Robust regardless of stage Good performance across ethnicities Decision boundary set to maximize accuracy for the observed prevalence Proprietary 7-microRNA diagnostic signature

66. - + Without Miroculus With Miroculus NPV = 99.8% Miroculus test compared to gold standard

67. Ideal biomarker Cost effective, simple and accurate detection Future of diagnostics Enabling technology Advanced data analysis

68. jorge@miroculus.com  antonio@miroculus.com http://loom.bio
