Linked Data
for improved organization
of research data
Farmbio BioScience Seminar May 18, 2018
Samuel Lampa @smllmp
PhD Student in Pharm. Bioinformatics @ pharmb.io / farmbio.uu.se
Research interests
● Large datasets
● Automation
● Scientific workflows
● Machine Learning
● Semantic data
● Reasoning
● Query systems
● Something user friendly
● And hopefully usable
● “Answer all the (computational) research questions”
What’s the problem?
● Data in different formats
● Different data schemas
● Losing track of what data means
(meaning available only in context)
A database to the rescue?
● Same problems with losing data identity on export
● So, put all data in the same database?
● One database can’t fit all the world’s data!
● What to do?
What to do?
What if all data could be:
● Easy to share
● Self-described
● Use the same (underlying) format
● Be easy to integrate with other data
(In other words: FAIR – Findable, Accessible, Interoperable, Re-usable)
Linked Data!
Linked data – Basic ideas
● Use URIs (“https://”) to identify things
● Make URIs into dereferenceable links
(So one can visit them to find relevant data)
● Refer to other data using their links
What about the linking?
Triple model*:
– Subject (URI), Predicate (URI), Object (URI or literal value)
@prefix ex: <http://example.org/myontology/> .
ex:Sweden ex:hasPopulation 9000000 .
ex:Sweden ex:hasCapital ex:Stockholm .
* For more info: Check “RDF: Resource Description Framework”
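The triple model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are plain tuples, and the names `EX` and `match` are hypothetical and do not come from the slides. A real project would more likely use an RDF library.

```python
# Minimal sketch of the triple model: each statement is a
# (subject, predicate, object) tuple. EX and match() are hypothetical
# names used only for this illustration.
EX = "http://example.org/myontology/"

triples = {
    (EX + "Sweden", EX + "hasPopulation", 9000000),        # object is a literal
    (EX + "Sweden", EX + "hasCapital", EX + "Stockholm"),  # object is a URI
}


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Look up the capital of Sweden by fixing subject and predicate.
capital = match(triples, s=EX + "Sweden", p=EX + "hasCapital")
print(capital[0][2])
```

Because the object of one triple can be the subject of another, a set of such tuples forms a graph that can be followed from link to link.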
Web links vs. Data links
Example data
Willighagen EL, Alvarsson J, Andersson A, Eklund M, Lampa S, Lapins M, Spjuth O, Wikberg JES.
Linking the Resource Description Framework to cheminformatics and proteochemometrics. J Biomed Semantics. 2011;2(Suppl 1):S6. doi:10.1186/2041-1480-2-S1-S6.
Lampa S. SWI-Prolog as a Semantic Web Tool for semantic querying in Bioclipse: Integration and performance benchmarking. 2010. bit.ly/mscreport
<http://[...]/nmrshiftdb/?moleculeId=234>
    dc:title "warburganal";
    chem:casnumber "62994-47-2";
    nmr:moleculeId "234";
    nmr:hasSpectrum <http://[...]/nmrshiftdb/?spectrumId=4735>.
<http://[...]/nmrshiftdb/?spectrumId=4735>
    nmr:field "50";
    nmr:hasPeak <http://[...]/nmrshiftdb/?s4735p0>,
        <http://[...]/nmrshiftdb/?s4735p1>,
        <http://[...]/nmrshiftdb/?s4735p2>,
        <http://[...]/nmrshiftdb/?s4735p3>;
    nmr:solvent "Chloroform-D1 (CDCl3)";
    nmr:spectrumId "4735";
    nmr:spectrumType "13C";
    nmr:temperature "298".
<http://[...]/nmrshiftdb/?s4735p1>
    nmr:hasShift 18.3;
    a nmr:peak.
<http://[...]/nmrshiftdb/?s4735p2>
    nmr:hasShift 22.6;
    a nmr:peak.
<http://[...]/nmrshiftdb/?s4735p3>
    nmr:hasShift 26.5;
    a nmr:peak.
Powerful querying with SPARQL
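The query shown on this slide is not part of the transcript, so the following is only a sketch of the idea behind a SPARQL query: several triple patterns with shared variables are matched against the data and joined. It is written in plain Python rather than with a SPARQL engine. The short identifiers are hypothetical stand-ins for the elided URIs in the example data, and `is_var`, `unify` and `query` are illustrative names, not an existing API.

```python
# Sketch of SPARQL-style graph pattern matching in plain Python.
# The short identifiers stand in for the elided URIs of the example data.
# Roughly equivalent SPARQL (prefix declarations omitted):
#   SELECT ?shift WHERE {
#     ?spectrum nmr:hasPeak ?peak .
#     ?peak nmr:hasShift ?shift .
#   }
TRIPLES = [
    ("spectrum4735", "nmr:hasPeak", "s4735p1"),
    ("spectrum4735", "nmr:hasPeak", "s4735p2"),
    ("spectrum4735", "nmr:hasPeak", "s4735p3"),
    ("s4735p1", "nmr:hasShift", 18.3),
    ("s4735p2", "nmr:hasShift", 22.6),
    ("s4735p3", "nmr:hasShift", 26.5),
]


def is_var(term):
    """Variables are strings starting with '?', as in SPARQL."""
    return isinstance(term, str) and term.startswith("?")


def unify(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    new = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            if p_term in new and new[p_term] != t_term:
                return None  # variable already bound to something else
            new[p_term] = t_term
        elif p_term != t_term:
            return None  # constant term does not match
    return new


def query(patterns, triples):
    """Join all patterns, returning one variable binding per solution."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in triples:
                extended = unify(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


results = query(
    [("?spectrum", "nmr:hasPeak", "?peak"), ("?peak", "nmr:hasShift", "?shift")],
    TRIPLES,
)
shifts = sorted(r["?shift"] for r in results)
print(shifts)
```

The shared variable `?peak` is what joins the two patterns; a real SPARQL engine does the same kind of join, but with indexes and a query optimizer.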
What to do? - Linked Data!
What if all data could be:
● Easy to share – Yep, RDF is a web based format
● Self-described – Yes, links in the data describe the data
● Use the same (underlying) format – Yes, RDF triples
● Be easy to integrate with other data - Yes, just create links
(In other words: FAIR – Findable, Accessible, Interoperable, Re-usable)
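As a rough illustration of "just create links": when two datasets share the triple format, merging them is a set union, and a single extra triple connects them. All URIs and values below are hypothetical placeholders, and `objects` is an illustrative helper; only the `owl:sameAs` property URI is a real, standard term.

```python
# Sketch: integrating two datasets by adding a link between them.
# All example.org URIs and values below are hypothetical placeholders.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

dataset_a = {
    ("http://a.example.org/sample/1", "http://a.example.org/hasCompound",
     "http://a.example.org/compound/warburganal"),
}
dataset_b = {
    ("http://b.example.org/cmpd/42", "http://b.example.org/label", "warburganal"),
}

# One link triple states that both URIs identify the same thing.
link = ("http://a.example.org/compound/warburganal", OWL_SAME_AS,
        "http://b.example.org/cmpd/42")

# Because both datasets share the triple format, merging is a set union.
merged = dataset_a | dataset_b | {link}


def objects(triples, subject, predicate):
    """Return all objects for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# Follow the link from dataset A into dataset B.
compound = objects(merged, "http://a.example.org/sample/1",
                   "http://a.example.org/hasCompound")[0]
same = objects(merged, compound, OWL_SAME_AS)[0]
label = objects(merged, same, "http://b.example.org/label")[0]
print(label)
```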
But how to actually use this in practice?
What we did (1/3):
Willighagen EL, Alvarsson J, Andersson A, Eklund M, Lampa S, Lapins M, Spjuth O, Wikberg JES.
Linking the Resource Description Framework to cheminformatics and proteochemometrics. J Biomed Semantics. 2011;2(Suppl 1):S6. doi:10.1186/2041-1480-2-S1-S6.
Lampa S. SWI-Prolog as a Semantic Web Tool for semantic querying in Bioclipse: Integration and performance benchmarking. 2010. bit.ly/mscreport
← SWI-Prolog for querying
… Integrated into Bioclipse
Pros / Cons:
+ Powerful querying
+ Easy to integrate into other software
=> Powerful interactive environment
+ Excellent performance
- No support for really large datasets
(exceeding RAM size)
What we did (2/3):
Lampa S, Willighagen E, Kohonen P, King A, Vrandečić D, Grafström R, Spjuth O.
RDFIO: Extending Semantic MediaWiki for interoperable biomedical data management. 2017;8(35):1-13. doi: 10.1186/s13326-017-0136-y.
Semantic MediaWiki as a collaborative and
interactive platform for playing around with
data, summarizing and visualizing using SMW’s
Ask query language →
Pros / Cons:
+ Collaboration supported
+ Versioned data storage
+ UI generation included in SMW
- Performance concerns
- Lack of expressiveness and power
in the SMW “Ask” query language
What we did (3/3): urisolve
● A simple web server to resolve, or “dereference” URIs
● Returns any data / triples for the URI in question
● Based on data in a triplestore (semantic database)
or an RDF-HDT file (compressed, indexed file format)
● Source code: github.com/pharmbio/urisolve
Lapins M, Arvidsson S, Lampa S, Berg A, Schaal W, Alvarsson J, Spjuth O.
A confidence predictor for logD using conformal regression and a support-vector machine. J Cheminform. 2018;10(1):17. doi: 10.1186/s13321-018-0271-1
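The idea of dereferencing can be sketched with Python's standard library. This is not the urisolve implementation (see the repository for that); it is a minimal illustration that assumes an in-memory list in place of a triplestore or RDF-HDT file and returns only triples where the requested URI is the subject. `BASE`, `TRIPLES`, `triples_for`, `to_ntriples`, `ResolveHandler` and `serve` are all hypothetical names.

```python
# Sketch of a URI-dereferencing server; NOT the urisolve implementation.
# BASE, TRIPLES and all function names are hypothetical.
from http.server import BaseHTTPRequestHandler, HTTPServer

BASE = "http://localhost:8080"

# Stand-in for a triplestore or RDF-HDT file.
TRIPLES = [
    (BASE + "/Sweden", BASE + "/hasCapital", BASE + "/Stockholm"),
    (BASE + "/Sweden", BASE + "/hasPopulation", 9000000),
    (BASE + "/Stockholm", BASE + "/isCapitalOf", BASE + "/Sweden"),
]


def triples_for(uri):
    """Return every triple in which the URI occurs as the subject."""
    return [t for t in TRIPLES if t[0] == uri]


def to_ntriples(triples):
    """Serialize triples in a simplified N-Triples-like form."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, str) and o.startswith("http"):
            obj = f"<{o}>"   # object is a URI
        else:
            obj = f'"{o}"'   # object is a literal (datatype omitted)
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


class ResolveHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Rebuild the full URI from the request path and look it up.
        found = triples_for(BASE + self.path)
        body = to_ntriples(found).encode("utf-8")
        self.send_response(200 if found else 404)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Start the server (blocks until interrupted)."""
    HTTPServer(("localhost", port), ResolveHandler).serve_forever()
```

With `serve()` running, visiting `http://localhost:8080/Sweden` would return the two triples about that URI, which is what makes the URI a dereferenceable link.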
Conclusions
● Linked Data makes data self-describing
● It is extremely flexible to work with
● Lowers the barriers to data entry
Vision: A central workbench for Linked Data
SWISH: SWI-Prolog Notebook: swish.swi-prolog.org
… to access all data sources, and
“answer all the (computational) research questions”
Thank you
Samuel Lampa @smllmp
PhD Student in Pharm. Bioinformatics @ pharmb.io / farmbio.uu.se
