BRAINOMICS
A management system for exploring and merging
heterogeneous brain mapping data based on CubicWeb
Vincent Michel
Logilab
CrEDIBLE 2013 - 3/10/2013
1/36 Vincent Michel (Logilab) Brainomics CrEDIBLE 2013 - 3/10/2013 1 / 3
Plan
1 Introduction
2 Cubicweb and Brainomics
3 Data Model
4 Querying data
5 Future and Conclusions
Logilab
computer science and knowledge management ;
Founded in 2000 ;
20 experts in IT technologies ;
Public and private clients (CEA, EDF R&D, EADS, Arcelor Mittal, etc.).
Knowledge management :
http://collections.musees-haute-normandie.fr/collections/
http://data.bnf.fr/
Brainomics
Goals
Software solution for integration of neuroimaging and genomic data ;
Conception/optimization (GPU) of algorithms for analysing these data ;
Collaborative R&D project
NeuroSpin laboratory of CEA
Supelec ;
UMR 894 of INSERM ;
UMR CNRS 8203 of IGR ;
Logilab ;
Alliance Services Plus (AS+) ;
Keosys.
This work was supported by grants from the French National Research Agency (ANR GENIM ;
ANR-10-BLAN-0128) and (ANR IA BRAINOMICS ; ANR-10-BINF-04).
Context
Brain mapping data
Large datasets for brain mapping :
http://openfmri.org/data-sets
http://fcon_1000.projects.nitrc.org/indi/abide/
Neuroimaging + clinical data + genetics data ;
Brain mapping databases
Neuroimaging and genomics databases are dedicated to their own field of
research ;
XNAT - Neuroimaging ;
BASE - Genetics ;
SHANOIR - Neuroimaging ;
What is Cubicweb
A semantic open-source web framework written in Python
An efficient knowledge management system
Entity-relationship data-model ;
RQL (Relational Query Language) ;
Separate query and display (HTML UI, JSON, RDF, CSV,...) ;
Conforms to the Semantic Web standards ;
Fine-grained security system coupled to the data model definition ;
Migration mechanisms control model version and ensure data integrity ;
Industrial use : large databases, many users, security, logging ;
Used in production environments since 2005 ; LGPL since 2008 ;
http://www.cubicweb.org/
What Cubicweb is not
Cubicweb is not
a pure Web application framework - Web/HTML interface is only one
possible output ;
a triple store - data is structured ;
a CMS - allows complex business data modeling ;
Cubicweb is a framework
Used to build applications, with reusable components called cubes :
data model : persons, addressbook, billing...
displays (a.k.a views) : d3js, workflow, maps, threejs...
full applications : forge, intranet, erp...
open databases : dbpedia, diseasome, pubmed...
Overview of the framework
Well established core technologies : SQL, Python, HTML5, Javascript ;
Application
Based on a relational database (e.g. PostgreSQL) for storing information ;
Web server for HTTP access to the data ;
Integration with existing LDAP for high-level access management ;
Code - written in Python
Schema - defines the data model.
Business Logic - extends the behaviour of the data beyond the scope
of the data model, using specific functions and adapters.
Views - specific display rules, from HTML to binary content.
CubicWeb and Semantics
Not using a triple store does not mean “not semantic web compliant”
CubicWeb conforms to the Semantic Web standards
One entity = a unique URI ;
One request = a unique URI :
http://localhost:8080/?rql=MYRQLREQUEST&vid=MYVIEWID
HTTP content negotiation (HTML, RDF, JSON, etc.) ;
Import/export to/from RDF data, based on a specific mapping :
    xy.add_equivalence('Person given_name', 'foaf:givenName')
Use RDF as a standard I/O format for data, but stick to a relational database
for storage.
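The "one request = one URI" idea can be sketched with the Python standard library alone. This is a minimal illustration, not CubicWeb code: the host comes from the slide, while the RQL text and the view identifier `csvexport` are assumptions chosen for the example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_query_url(base_url, rql, vid):
    """Return the URL addressing one RQL request rendered with one view."""
    # urlencode percent-escapes the spaces, quotes and commas of the RQL text.
    return base_url.rstrip("/") + "/?" + urlencode({"rql": rql, "vid": vid})


url = build_query_url(
    "http://localhost:8080",
    'Any S WHERE S is Subject, S gender "male"',  # assumed query
    "csvexport",                                  # assumed view identifier
)
print(url)

# The two parameters can be recovered unchanged from the URL.
params = parse_qs(urlsplit(url).query)
print(params["rql"][0], params["vid"][0])
```

Because the query and the view travel in the URL itself, the same address can be bookmarked, shared or called from a script.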
Visualization and interactions
Cubicweb views
A view is applied on the result of a query ;
The same result may be visualized using different views ;
Views are selected based on the types of the resulting data ;
Exploring the data
Many different possible views ;
Auto-completion RQL query form ;
Filtering facets ;
Using data in scripts with the URL vid/rql parameters ;
And also : forms, widgets, ...
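The separation between a query result and its views can be pictured with a toy registry. This is a plain-Python sketch of the principle only. The class names, the `accepts` attribute and `select_view` are hypothetical and are not CubicWeb's view or selector API.

```python
class CSVView:
    regid = "csv"
    accepts = {"Subject", "Scan"}  # entity types this view can display

    def render(self, rows):
        return "\n".join(",".join(str(cell) for cell in row) for row in rows)


class CountView:
    regid = "count"
    accepts = {"Subject"}

    def render(self, rows):
        return "%d result(s)" % len(rows)


VIEWS = [CSVView(), CountView()]


def select_view(regid, entity_type):
    """Pick the view with this identifier that accepts this entity type."""
    for view in VIEWS:
        if view.regid == regid and entity_type in view.accepts:
            return view
    raise LookupError("no view %r for %s" % (regid, entity_type))


# One result set, two renderings.
rows = [("S001", "male"), ("S002", "female")]
as_csv = select_view("csv", "Subject").render(rows)
as_count = select_view("count", "Subject").render(rows)
print(as_csv)
print(as_count)
```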
Brainomics
Brings together brain imaging and genetics data
A solution based on CubicWeb
Modeling of Scans, Questionnaires, Genomics results, Behavioural
results, Subjects and Studies information, ...
Can deal with large volumes (> 10000 subjects) ;
Tested with several datasets (openfmri, abide, imagen, localizer) ;
Specific views : ZIP, XCEDE XML, CSV ;
Open source solution
http://www.brainomics.net/demo
Data model and Schema
CubicWeb schema
Based on a Python library : http://www.logilab.org/project/yams
Defined in a Python file (schema.py) ;
Allows creating entity types, relations and constraints ;
Security can be tightly included in the data model ;
from yams.buildobjs import EntityType, RelationType, String, Date

class Subject(EntityType):
    identifier = String(required=True, indexed=True, maxsize=64)
    gender = String(vocabulary=('male', 'female', 'unknown'))
    date_of_birth = Date()
    ...

class related_studies(RelationType):
    subject = 'Subject'
    object = 'Study'
    cardinality = '1*'
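The two-character cardinality string can be read as follows: the first character constrains the subject side of the relation, the second the object side. The helper below is a standalone sketch written for this explanation. `BOUNDS`, `describe_cardinality` and `check_count` are hypothetical names and not part of yams.

```python
# Meaning of each cardinality character: (minimum, maximum), None = unbounded.
BOUNDS = {
    "1": (1, 1),     # exactly one
    "?": (0, 1),     # zero or one
    "+": (1, None),  # one or more
    "*": (0, None),  # any number
}


def describe_cardinality(cardinality):
    """Split a two-character cardinality into (subject side, object side)."""
    if len(cardinality) != 2:
        raise ValueError("expected two characters, got %r" % cardinality)
    subject_side, object_side = cardinality
    return BOUNDS[subject_side], BOUNDS[object_side]


def check_count(bounds, count):
    """Tell whether a number of related entities respects the bounds."""
    low, high = bounds
    return count >= low and (high is None or count <= high)


subject_bounds, object_bounds = describe_cardinality("1*")
print(subject_bounds, object_bounds)
```

With `'1*'`, each Subject is related to exactly one Study, while a Study may be related to any number of Subjects.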
Data model - Subject
Where are the reference models ?
Reference models
Reference models may be implemented as Cubicweb schema
additional modeling refinements for application features
(sortability, readability, ...)
already existing cubes for some ontologies/taxonomies (e.g. FRBR) ;
Be pragmatic : if you need a new attribute, add it in the model → deal with the
reference model in the I/O formats ;
... or you could use your own specific schema
Ontologies do not exist for all fields ;
Define your schema and only code the mappings (I/O) to reference
models when they go out.
Schema evolution
http://xkcd.com/
Standards are moving
Existing migration mechanisms make it possible to follow the evolution of
the reference models ;
Easy to modify for modeling other applicative fields ;
Data - Input
Keep the application data model as pivot model → convert the data to
this model during the insertion
Datafeed - Periodic input
a URL ;
some Python logic for integrating/updating information ;
an interval of synchronization ;
→ used for almost any possible type of data, e.g. RSS feeds, files, RDF data,
other CW instances...
Stores - Bulk loading
Tools similar to ETL
allow importing huge amounts of structured data ;
principally used for bulk loading ;
different levels of security and data checks ;
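The pivot-model idea (convert every input to the application model at insertion time) can be sketched in plain Python. Both source formats and all function names below are hypothetical. Only the `identifier` and `gender` attributes and the gender vocabulary come from the Subject schema shown earlier.

```python
def from_csv_row(row):
    """Hypothetical source A: a CSV row with 'id' and 'sex' columns."""
    gender = {"M": "male", "F": "female"}.get(row.get("sex", ""), "unknown")
    return {"identifier": row["id"], "gender": gender}


def from_json_record(record):
    """Hypothetical source B: a JSON record with a nested 'subject' object."""
    subject = record["subject"]
    return {"identifier": subject["code"],
            "gender": subject.get("gender", "unknown")}


CONVERTERS = {"csv": from_csv_row, "json": from_json_record}
GENDERS = ("male", "female", "unknown")


def to_pivot(kind, raw):
    """Convert one raw record to the pivot Subject representation."""
    pivot = CONVERTERS[kind](raw)
    if pivot["gender"] not in GENDERS:
        raise ValueError("unexpected gender %r" % pivot["gender"])
    return pivot


subjects = [
    to_pivot("csv", {"id": "S001", "sex": "F"}),
    to_pivot("json", {"subject": {"code": "S002"}}),
]
print(subjects)
```

Each new input format then only needs its own converter; everything downstream of `to_pivot` sees a single shape.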
Data - Output
Design a specific view for outputting in any reference model
If one reference model changes, change the I/O, not the internal model ;
Avoid data redundancy, if different reference models are based on the
same information (e.g. dc:title, foaf:name) ;
Example of View - Export a Scan in XCEDE format
class ScanXcedeItemView(XMLItemView):
    __select__ = XMLItemView.__select__ & is_instance('Scan')
    __regid__ = 'xcede-item'

    def entity_call(self, entity):
        self.w(u'<acquisition ID="%(id)s" projectID="%(p)s" '
               'subjectID="%(s)s" visitID="%(a)s" '
               'studyID="%(a)s" episodeID="%(la)s"/>\n'
               % {'id': entity.identifier,
                  'p': entity.related_study[0].name,
                  ...
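Outside CubicWeb, the same kind of element can be produced and checked with the standard library. The sketch below is an illustration under assumptions: `acquisition_element` and its three parameters are hypothetical, only a subset of the attributes is shown, and attribute escaping with `quoteattr` is an addition that the slide's view does not show.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def acquisition_element(identifier, project, subject):
    """Render one <acquisition/> element with escaped attribute values."""
    # quoteattr adds the surrounding quotes and escapes &, < and >.
    return "<acquisition ID=%s projectID=%s subjectID=%s/>\n" % (
        quoteattr(identifier), quoteattr(project), quoteattr(subject))


line = acquisition_element("scan-1", "A&B", "S001")
print(line, end="")

# The output is well-formed XML and the original value is recovered.
element = ET.fromstring(line)
print(element.get("projectID"))
```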
Data Output - Example of XCEDE
Data Output - RDF specific case
Existing tools in CubicWeb for RDF mapping
RDF mapping
xy.register_prefix('foaf', 'http://xmlns.com/foaf/0.1/')
xy.add_equivalence('Subject', 'foaf:Subject')
xy.add_equivalence('MedicalCenter', 'foaf:Organization')
xy.add_equivalence('Subject given_name', 'foaf:givenName')
xy.add_equivalence('Subject family_name', 'foaf:familyName')
xy.add_equivalence('* same_as *', 'owl:sameAs')
xy.add_equivalence('* see_also *', 'foaf:page')
Also possible to plug specific Python functions for non-trivial mapping.
Used for RDF import/export and SPARQL endpoint
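What such an equivalence table does on export can be pictured with a toy stand-in. This is not the `xy` implementation. `PREFIXES`, `EQUIVALENCES`, `expand`, `to_triples` and the entity URI are hypothetical, and triples are plain tuples rather than objects from an RDF library.

```python
PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}

# (entity type, attribute) -> RDF property, mirroring two mappings above.
EQUIVALENCES = {
    ("Subject", "given_name"): "foaf:givenName",
    ("Subject", "family_name"): "foaf:familyName",
}


def expand(curie):
    """Turn 'prefix:local' into a full URI using the registered prefixes."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def to_triples(entity_type, uri, attributes):
    """Export the mapped attributes of one entity as (s, p, o) tuples."""
    triples = []
    for name, value in sorted(attributes.items()):
        curie = EQUIVALENCES.get((entity_type, name))
        if curie is not None:  # attributes without an equivalence are skipped
            triples.append((uri, expand(curie), value))
    return triples


triples = to_triples(
    "Subject",
    "http://localhost:8080/subject/1",  # hypothetical entity URI
    {"given_name": "Ada", "family_name": "Lovelace", "handedness": "left"},
)
for triple in triples:
    print(triple)
```

The internal attribute names never leave the application. Only the mapped vocabulary appears in the output, which is the point of handling reference models at the I/O boundary.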
RQL - Relational Query Language
Features
Similar to W3C’s SPARQL, but less verbose ;
Supports the basic operations (select, insert, etc.), subquerying, ordering,
counting, ...
Tightly integrated with SQL, but abstracts the details of the tables and
the joins ;
Uses the schema for data type inference, based on a syntactic analysis of
the request.
→ A query returns a result set (a list of results), that can be displayed using
specific views.
RQL - Example
Query all the Cmap scans of left-handed male subjects that have a score
greater than 4.0 for the "algebre" question of the Localizer questionnaire
↓
Any SA WHERE S handedness "left", S gender "male",
X concerns S, A questionnaire_run X,
A question Q, Q text "algebre", A value > 4,
SA concerns S, SA is Scan, SA type "c map"
and the SQL translation ...
SELECT _SA.cw_eid FROM cw_Answer AS _A, cw_Question AS _Q,
cw_QuestionnaireRun AS _X, cw_Scan AS _SA, cw_Subject AS _S
WHERE _S.cw_handedness='left' AND _S.cw_gender='male'
AND _X.cw_concerns=_S.cw_eid
AND _A.cw_questionnaire_run=_X.cw_eid
AND _A.cw_question=_Q.cw_eid AND _Q.cw_text='algebre'
AND _A.cw_value>4 AND _SA.cw_concerns=_S.cw_eid
AND _SA.cw_type='c map'
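To see what the translated query selects, it can be run against a toy database. The sketch below uses Python's built-in `sqlite3` as a stand-in for PostgreSQL. The table layout is inferred from the column names in the SQL above, and the inserted rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cw_Subject (cw_eid INTEGER PRIMARY KEY,
                         cw_handedness TEXT, cw_gender TEXT);
CREATE TABLE cw_QuestionnaireRun (cw_eid INTEGER PRIMARY KEY,
                                  cw_concerns INTEGER);
CREATE TABLE cw_Question (cw_eid INTEGER PRIMARY KEY, cw_text TEXT);
CREATE TABLE cw_Answer (cw_eid INTEGER PRIMARY KEY,
                        cw_questionnaire_run INTEGER,
                        cw_question INTEGER, cw_value REAL);
CREATE TABLE cw_Scan (cw_eid INTEGER PRIMARY KEY,
                      cw_concerns INTEGER, cw_type TEXT);

-- Invented data: subject 1 matches every condition, subject 2 does not.
INSERT INTO cw_Subject VALUES (1, 'left', 'male'), (2, 'right', 'male');
INSERT INTO cw_QuestionnaireRun VALUES (10, 1), (11, 2);
INSERT INTO cw_Question VALUES (20, 'algebre');
INSERT INTO cw_Answer VALUES (30, 10, 20, 5.0), (31, 11, 20, 6.0);
INSERT INTO cw_Scan VALUES (40, 1, 'c map'), (41, 1, 't map'), (42, 2, 'c map');
""")

# Same joins as the translation above, with literals passed as parameters.
QUERY = """
SELECT _SA.cw_eid FROM cw_Answer AS _A, cw_Question AS _Q,
     cw_QuestionnaireRun AS _X, cw_Scan AS _SA, cw_Subject AS _S
WHERE _S.cw_handedness = ? AND _S.cw_gender = ?
  AND _X.cw_concerns = _S.cw_eid
  AND _A.cw_questionnaire_run = _X.cw_eid
  AND _A.cw_question = _Q.cw_eid AND _Q.cw_text = ?
  AND _A.cw_value > ? AND _SA.cw_concerns = _S.cw_eid
  AND _SA.cw_type = ?
"""
rows = conn.execute(QUERY, ("left", "male", "algebre", 4, "c map"))
scan_eids = [row[0] for row in rows]
print(scan_eids)
```

Only scan 40 is returned: it is the single "c map" scan of the left-handed male subject whose answer exceeds 4.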
RQL for end users
No perfect dashboard / no perfect search form
↓
Put an expressive query language in the hands of the end users
Exploring the data model
Explore the schema
http://localhost:8080/schema
RQL completion form ;
Learning RQL by showing the RQL for each page/each filter ;
Deep exploration of the data by the end users
RQL vs SPARQL
Developed in parallel with SPARQL years ago, with a focus on SQL databases
and support of SET / UPDATE / DELETE.
Why we should support SPARQL ...
SPARQL is a W3C standard, and a reference in the Semantic Web
community.
To improve interoperability of the CubicWeb application with other
Semantic Web technologies.
... and why we don’t want to use only SPARQL
Extensive in-house knowledge of RDBMS (vs triple stores) ;
SPARQL is quite verbose, RQL is more intuitive and elegant (“Syntax
matters”) ;
A basic translation from SPARQL to RQL exists (selection queries only).
Federated databases in CW
PostgreSQL federated databases
Let PostgreSQL do what it does best
Since PostgreSQL 9.3 ;
Based on Foreign Data Wrapper (FDW) ;
Could be used for federated queries in CW (WIP) ;
FROM clause (WIP)
Similar to SPARQL SERVICE
Any P, S WHERE GEN is GenomicMeasure, GEN concerns S,
GEN platform P, P related_snps SN, SN in_gene G, G name GN
WITH P BEING (Any X WHERE X is Paper, X keywords GN)
FROM http://pubmed:8080
Eventually, also allow SPARQL subqueries ;
Or use CubicWeb SPARQL endpoint with SERVICE.
Conclusion
Brainomics
Open source solution to manage brain imaging datasets and associated
metadata ;
Powerful querying and reporting tool, customized for emerging multimodal
studies.
Feedback from Brainomics
Do not store raw data in database ;
Try to interact with existing reference databases ;
Using CubicWeb :
Easy modeling of other applicative fields in the schema (e.g. Histology) ;
Security, migrations are already included ;
Many different views, and existing API to define your own ;
Future work
Future of Brainomics
How to transfer large files (>10 GB for genotype files) ?
Need Content Delivery Network (CDN) in CubicWeb ;
Integration to reference databases (pubmed, refseq, ...) ;
Future of Cubicweb
Extended support of SPARQL ;
Finish the work on federated queries ;
REST support, Python’s Web Server Gateway Interface ;
Better integration with Bootstrap ;
Questions ?
http://www.cubicweb.org/project/cubicweb-brainomics
http://www.brainomics.net/demo/
vincent.michel@logilab.fr
brainomics@logilab.fr
cubicweb@lists.cubicweb.org
RQL/SPARQL - Example
Cities of Île-de-France with more than 100,000 inhabitants ?
RQL
Any X WHERE X region Y, X population > 100000,
Y uri "http://fr.dbpedia.org/resource/Île-de-France"
SPARQL
select ?ville where {
?ville db-owl:region <http://fr.dbpedia.org/resource/Île-de-F
?ville rdf:type db-owl:Settlement .
?ville db-owl:populationTotal ?population .
FILTER (?population > 100000)
}
This is NOT big data !
Small databases ...
No need to store raw data in the database ;
Structured data : metadata (patient information, ...), chosen scores
(some statistical values on genes of interest, ...) ;
10 MB / subject * 10,000 subjects = 100 GB ;
Classical SQL databases (e.g. PostgreSQL) work great up to a few TB !
http://www.chrisstucchio.com/blog/2013/hadoop_hatred.html
http://www.vitavonni.de/blog/201309/2013092701-big-data-madness-and-reality.html
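The size estimate on this slide is a one-line calculation. A minimal check, assuming decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes):

```python
MB = 10**6  # decimal megabyte, an assumption about the slide's units
GB = 10**9

per_subject_bytes = 10 * MB   # structured data kept per subject
subjects = 10_000

total_bytes = per_subject_bytes * subjects
print(total_bytes // GB, "GB")
```

At this scale the whole structured dataset is well within the few-terabyte range quoted above for a classical SQL database.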
Global architecture of CubicWeb
What’s needed
Efficient data model to integrate all the measures ;
Easy access to the relevant information (query language + UI) ;
Import / Export in several formats, for merging heterogeneous studies ;
Adaptable to the evolutions of various dynamic applicative fields.
CubicWeb, a semantic data management framework
Data model - Assessment
36/36 Vincent Michel (Logilab) Brainomics CrEDIBLE 2013 - 3/10/2013 36 /

Contenu connexe

En vedette

En vedette (12)

Ghana Medical Banking Institute
Ghana Medical Banking InstituteGhana Medical Banking Institute
Ghana Medical Banking Institute
 
Health Bit Webinar 612010
Health Bit Webinar 612010Health Bit Webinar 612010
Health Bit Webinar 612010
 
Presentación Ricardo Renteria - eRetail Day México 2016
Presentación Ricardo Renteria - eRetail Day México 2016Presentación Ricardo Renteria - eRetail Day México 2016
Presentación Ricardo Renteria - eRetail Day México 2016
 
openscan.ro
openscan.roopenscan.ro
openscan.ro
 
Interactive Quiz About Parts Of The Body
Interactive Quiz About Parts Of The BodyInteractive Quiz About Parts Of The Body
Interactive Quiz About Parts Of The Body
 
Ignite eCommerce growth with AWS
Ignite eCommerce growth with AWSIgnite eCommerce growth with AWS
Ignite eCommerce growth with AWS
 
Transição para a vida Pós Escolar
Transição para a vida Pós EscolarTransição para a vida Pós Escolar
Transição para a vida Pós Escolar
 
What is Pediatric Cardiac Surgery?
What is Pediatric Cardiac Surgery?What is Pediatric Cardiac Surgery?
What is Pediatric Cardiac Surgery?
 
The Mediastinum Including the Pericardium Dr. Muhammad Bin Zulfiqar
The Mediastinum Includingthe Pericardium Dr. Muhammad Bin ZulfiqarThe Mediastinum Includingthe Pericardium Dr. Muhammad Bin Zulfiqar
The Mediastinum Including the Pericardium Dr. Muhammad Bin Zulfiqar
 
Voorbeeld zakelijke bedrijfspresentatie, power point professional
Voorbeeld zakelijke bedrijfspresentatie, power point professionalVoorbeeld zakelijke bedrijfspresentatie, power point professional
Voorbeeld zakelijke bedrijfspresentatie, power point professional
 
Exhibitor Insights: Uniting In-Store and Digital Channels: 7 Success Stories
Exhibitor Insights: Uniting In-Store and Digital Channels: 7 Success StoriesExhibitor Insights: Uniting In-Store and Digital Channels: 7 Success Stories
Exhibitor Insights: Uniting In-Store and Digital Channels: 7 Success Stories
 
Social and the Art of the Influencer
Social and the Art of the InfluencerSocial and the Art of the Influencer
Social and the Art of the Influencer
 

Similaire à Brainomics - CrEDIBLE 2013

Managing the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflowManaging the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflow
Databricks
 
b.1-best-practices-in-threat-intelligence.pdf
b.1-best-practices-in-threat-intelligence.pdfb.1-best-practices-in-threat-intelligence.pdf
b.1-best-practices-in-threat-intelligence.pdf
farhan941
 
Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...
Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...
Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...
Karen Thompson
 

Similaire à Brainomics - CrEDIBLE 2013 (20)

Discover BigQuery ML, build your own CREATE MODEL statement
Discover BigQuery ML, build your own CREATE MODEL statementDiscover BigQuery ML, build your own CREATE MODEL statement
Discover BigQuery ML, build your own CREATE MODEL statement
 
databricks ml flow demonstration using automatic features engineering
databricks ml flow demonstration using automatic features engineeringdatabricks ml flow demonstration using automatic features engineering
databricks ml flow demonstration using automatic features engineering
 
Multi datastores - CLOSER'14
Multi datastores - CLOSER'14Multi datastores - CLOSER'14
Multi datastores - CLOSER'14
 
On the relation between Model View Definitions (MVDs) and Linked Data technol...
On the relation between Model View Definitions (MVDs) and Linked Data technol...On the relation between Model View Definitions (MVDs) and Linked Data technol...
On the relation between Model View Definitions (MVDs) and Linked Data technol...
 
The Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
The Enterprise Guide to Building a Data Mesh - Introducing SpecMeshThe Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
The Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
 
MLOps pipelines using MLFlow - From training to production
MLOps pipelines using MLFlow - From training to productionMLOps pipelines using MLFlow - From training to production
MLOps pipelines using MLFlow - From training to production
 
Managing the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflowManaging the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflow
 
TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform
 
WSO2 Machine Learner - Product Overview
WSO2 Machine Learner - Product OverviewWSO2 Machine Learner - Product Overview
WSO2 Machine Learner - Product Overview
 
Machine learning at scale challenges and solutions
Machine learning at scale challenges and solutionsMachine learning at scale challenges and solutions
Machine learning at scale challenges and solutions
 
A Gen3 Perspective of Disparate Data
A Gen3 Perspective of Disparate DataA Gen3 Perspective of Disparate Data
A Gen3 Perspective of Disparate Data
 
Facilitating Data Curation: a Solution Developed in the Toxicology Domain
Facilitating Data Curation: a Solution Developed in the Toxicology DomainFacilitating Data Curation: a Solution Developed in the Toxicology Domain
Facilitating Data Curation: a Solution Developed in the Toxicology Domain
 
Data science and OSS
Data science and OSSData science and OSS
Data science and OSS
 
Wei ding(resume)
Wei ding(resume)Wei ding(resume)
Wei ding(resume)
 
Data modeling star schema
Data modeling star schemaData modeling star schema
Data modeling star schema
 
Be a database professional
Be a database professionalBe a database professional
Be a database professional
 
Be a database professional
Be a database professionalBe a database professional
Be a database professional
 
b.1-best-practices-in-threat-intelligence.pdf
b.1-best-practices-in-threat-intelligence.pdfb.1-best-practices-in-threat-intelligence.pdf
b.1-best-practices-in-threat-intelligence.pdf
 
Interoperability of Meta-Modeling Tools
Interoperability of Meta-Modeling ToolsInteroperability of Meta-Modeling Tools
Interoperability of Meta-Modeling Tools
 
Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...
Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...
Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...
 

Dernier

Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Dernier (20)

Call Girls Vadodara Just Call 8617370543 Top Class Call Girl Service Available
Call Girls Vadodara Just Call 8617370543 Top Class Call Girl Service AvailableCall Girls Vadodara Just Call 8617370543 Top Class Call Girl Service Available
Call Girls Vadodara Just Call 8617370543 Top Class Call Girl Service Available
 
Call Girls Kurnool Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Kurnool Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Kurnool Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Kurnool Just Call 8250077686 Top Class Call Girl Service Available
 
Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...
Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...
Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...
 
Top Rated Hyderabad Call Girls Chintal ⟟ 9332606886 ⟟ Call Me For Genuine Se...
Top Rated  Hyderabad Call Girls Chintal ⟟ 9332606886 ⟟ Call Me For Genuine Se...Top Rated  Hyderabad Call Girls Chintal ⟟ 9332606886 ⟟ Call Me For Genuine Se...
Top Rated Hyderabad Call Girls Chintal ⟟ 9332606886 ⟟ Call Me For Genuine Se...
 
Most Beautiful Call Girl in Bangalore Contact on Whatsapp
Most Beautiful Call Girl in Bangalore Contact on WhatsappMost Beautiful Call Girl in Bangalore Contact on Whatsapp
Most Beautiful Call Girl in Bangalore Contact on Whatsapp
 
Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
 
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
 
Pondicherry Call Girls Book Now 9630942363 Top Class Pondicherry Escort Servi...
Pondicherry Call Girls Book Now 9630942363 Top Class Pondicherry Escort Servi...Pondicherry Call Girls Book Now 9630942363 Top Class Pondicherry Escort Servi...
Pondicherry Call Girls Book Now 9630942363 Top Class Pondicherry Escort Servi...
 
Trichy Call Girls Book Now 9630942363 Top Class Trichy Escort Service Available
Trichy Call Girls Book Now 9630942363 Top Class Trichy Escort Service AvailableTrichy Call Girls Book Now 9630942363 Top Class Trichy Escort Service Available
Trichy Call Girls Book Now 9630942363 Top Class Trichy Escort Service Available
 
Call Girls Visakhapatnam Just Call 8250077686 Top Class Call Girl Service Ava...
Call Girls Visakhapatnam Just Call 8250077686 Top Class Call Girl Service Ava...Call Girls Visakhapatnam Just Call 8250077686 Top Class Call Girl Service Ava...
Call Girls Visakhapatnam Just Call 8250077686 Top Class Call Girl Service Ava...
 
Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...
Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...
Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...
 
Call Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service Available
Call Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service AvailableCall Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service Available
Call Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service Available
 
Call Girls Kakinada Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Kakinada Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Kakinada Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Kakinada Just Call 9907093804 Top Class Call Girl Service Available
 
Call Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service Available
 
Call Girls Guntur Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Guntur  Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Guntur  Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Guntur Just Call 8250077686 Top Class Call Girl Service Available
 
Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...
Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...
Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...
 
(Low Rate RASHMI ) Rate Of Call Girls Jaipur ❣ 8445551418 ❣ Elite Models & Ce...
(Low Rate RASHMI ) Rate Of Call Girls Jaipur ❣ 8445551418 ❣ Elite Models & Ce...(Low Rate RASHMI ) Rate Of Call Girls Jaipur ❣ 8445551418 ❣ Elite Models & Ce...
(Low Rate RASHMI ) Rate Of Call Girls Jaipur ❣ 8445551418 ❣ Elite Models & Ce...
 
💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...
💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...
💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...
 
Independent Call Girls In Jaipur { 8445551418 } ✔ ANIKA MEHTA ✔ Get High Prof...
Independent Call Girls In Jaipur { 8445551418 } ✔ ANIKA MEHTA ✔ Get High Prof...Independent Call Girls In Jaipur { 8445551418 } ✔ ANIKA MEHTA ✔ Get High Prof...
Independent Call Girls In Jaipur { 8445551418 } ✔ ANIKA MEHTA ✔ Get High Prof...
 
Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...
Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...
Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...
 

Brainomics - CrEDIBLE 2013

  • 1. BRAINOMICS A management system for exploring and merging heterogeneous brain mapping data based on CubicWeb Vincent Michel Logilab CrEDIBLE 2013 - 3/10/2013 1/36 Vincent Michel (Logilab) Brainomics CrEDIBLE 2013 - 3/10/2013 1 / 3
  • 2. Plan 1 Introduction 2 Cubicweb and Brainomics 3 Data Model 4 Querying data 5 Future and Conclusions 2/36 Vincent Michel (Logilab) Brainomics CrEDIBLE 2013 - 3/10/2013 2 / 3
  • 3. Logilab computer science and knowledge management ; Founded in 2000 ; 20 experts in IT technologies ; Public and private clients (CEA, EDF R&D, EADS, Arcelor Mittal, etc.). Knowledge management : http://collections.musees-haute-normandie.fr/collections/ http://data.bnf.fr/ 3/36 Vincent Michel (Logilab) Brainomics CrEDIBLE 2013 - 3/10/2013 3 / 3
  • 4. Brainomics Goals Software solution for integration of neuroimaging and genomic data ; Conception/optimization (GPU) of algorithms for analysing these data ; Collaborative R&D project NeuroSpin laboratory of CEA Supelec ; UMR 894 of INSERM ; UMR CNRS 8203 of IGR ; Logilab ; Alliance Services Plus (AS+) ; Keosys. This work was supported by grants from the French National Reseach Agency (ANR GENIM ; ANR-10-BLAN-0128) and (ANR IA BRAINOMICS ; ANR-10-BINF-04). 4/36 Vincent Michel (Logilab) Brainomics CrEDIBLE 2013 - 3/10/2013 4 / 3
  • 5. Context Brain mapping data Large datasets for brain mapping : http://openfmri.org/data-sets http://fcon_1000.projects.nitrc.org/indi/abide/ Neuroimaging + clinical data + genetics data ; Brain mapping databases Neuroimaging and genomics databases are dedicated to their own field of research ; XNAT - Neuroimaging ; BASE - Genetics ; SHANOIR - Neuroimaging ; 5/36 Vincent Michel (Logilab) Brainomics CrEDIBLE 2013 - 3/10/2013 5 / 3
  • 6. Plan 1 Introduction 2 Cubicweb and Brainomics 3 Data Model 4 Querying data 5 Future and Conclusions 6/36 Vincent Michel (Logilab) Brainomics CrEDIBLE 2013 - 3/10/2013 6 / 3
  • 7. What is Cubicweb A semantic open-source web framework written in Python An efficient knowledge management system Entity-relationship data-model ; RQL (Relational Query Language) ; Separate query and display (HTML UI, JSON, RDF, CSV,...) ; Conform to the Semantic Web standards ; Fine-grained security system coupled to the data model definition ; Migration mechanisms control model version and ensure data integrity ; Industrial use : large databases, many users, security, logging ; Used in production environments since 2005 ; LGPL since 2008 ; http://www.cubicweb.org/ 7/36 Vincent Michel (Logilab) Brainomics CrEDIBLE 2013 - 3/10/2013 7 / 3
  • 8. What Cubicweb is not Cubicweb is not a pure Web application framework - Web/HTML interface is only one possible output ; a triple store - data is structured ; a CMS - allows complex business data modeling ; Cubicweb is a framework Used to build applications, with reusable components called cubes : data model : persons, addressbook, billing... displays (a.k.a views) : d3js, workflow, maps, threejs... full applications : forge, intranet, erp... open databases : dbpedia, diseasome, pubmed... 8/36 Vincent Michel (Logilab) Brainomics CrEDIBLE 2013 - 3/10/2013 8 / 3
  • 9. Overview of the framework Well established core technologies : SQL, Python, HTML5, Javascript ; Application Based on a relational database (e.g. PostgreSQL) for storing information ; Web server for HTTP access to the data ; Integration with existing LDAP for high-level access management ; Code - written in Python Schema - define the data model. Business Logic - allow to increase the logic of the data beyond the scope of the data model, using specific functions and adapters. Views - specific display rules, from HTML to binary content. 9/36 Vincent Michel (Logilab) Brainomics CrEDIBLE 2013 - 3/10/2013 9 / 3
  • 10. CubicWeb and Semantics. Not using a triple store does not mean “not Semantic Web compliant”. CubicWeb conforms to the Semantic Web standards: one entity = a unique URI; one request = a unique URI: http://localhost:8080/?rql=MYRQLREQUEST&vid=MYVIEWID ; HTTP content negotiation (HTML, RDF, JSON, etc.); import/export to/from RDF data, based on a specific mapping: xy.add_equivalence('Person given_name', 'foaf:givenName'). Use RDF as a standard I/O format for data, but stick to a relational database for storage.
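The "one request = a unique URI" idea can be sketched with the Python standard library alone. This is a minimal illustration, not CubicWeb code: the helper name `rql_url`, the sample query and the view identifier `csvexport` are assumptions made for the example; only the `rql` and `vid` parameter names and the base URL come from the slide.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def rql_url(base_url, rql, vid=None):
    """Build a URL carrying an RQL query and, optionally, a view id.

    `rql_url` is a hypothetical helper; `rql` and `vid` are the URL
    parameter names shown on the slide.
    """
    params = {"rql": rql}
    if vid is not None:
        params["vid"] = vid
    # urlencode percent-escapes spaces, quotes and commas in the query.
    return base_url.rstrip("/") + "/?" + urlencode(params)


# Sample query and view id are assumptions for illustration only.
query = 'Any S WHERE S is Subject, S gender "male"'
url = rql_url("http://localhost:8080", query, vid="csvexport")

# Round trip: decoding the URL gives back the original query and view id.
decoded = parse_qs(urlsplit(url).query)
print(url)
print(decoded["rql"][0], decoded["vid"][0])
```

Because the whole request lives in the URL, the same address can be bookmarked, shared, or fetched from a script.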
  • 11. Visualization and interactions. CubicWeb views: a view is applied to the result of a query; the same result may be visualized using different views; views are selected based on the types of the resulting data. Exploring the data: many different possible views; RQL query form with auto-completion; filtering facets; using data in scripts through the URL vid/rql parameters; and also forms, widgets, ...
  • 12. Brainomics: brings together brain imaging and genetics data. A solution based on CubicWeb: modeling of Scans, Questionnaires, Genomics results, Behavioural results, Subjects and Studies information, ...; can deal with large volumes (> 10000 subjects); tested with several datasets (openfmri, abide, imagen, localizer); specific views: ZIP, XCEDE XML, CSV. Open source solution: http://www.brainomics.net/demo
  • 13. Plan 1 Introduction 2 Cubicweb and Brainomics 3 Data Model 4 Querying data 5 Future and Conclusions
  • 14. Data model and Schema. CubicWeb schema: based on a Python library (http://www.logilab.org/project/yams); defined in a Python file (schema.py); allows creating entity types, relations and constraints; security can be tightly included in the data model.

      # import path as commonly used in CubicWeb schema files
      from yams.buildobjs import EntityType, RelationType, String, Date

      class Subject(EntityType):
          identifier = String(required=True, indexed=True, maxsize=64)
          gender = String(vocabulary=('male', 'female', 'unknown'))
          date_of_birth = Date()
          ...

      class related_studies(RelationType):
          subject = 'Subject'
          object = 'Study'
          cardinality = '1*'
  • 15. Data model - Subject
  • 16. Where are the reference models? Reference models may be implemented as a CubicWeb schema: additional modeling refinements for application features (sortability, readability, ...); already existing cubes for some ontologies/taxonomies (e.g. FRBR). Be pragmatic: if you need a new attribute, add it to the model → deal with the reference model in the I/O formats. ... or you could use your own specific schema: ontologies do not exist for all fields; define your schema and only code the mappings (I/O) to reference models when the data goes out.
  • 17. Schema evolution (http://xkcd.com/). Standards are moving: existing migration mechanisms to follow the evolution of the reference models; easy modification for modeling other applicative fields.
  • 18. Data - Input. Keep the application data model as the pivot model → convert the data to this model during insertion. Datafeed (periodic input): a URL; some Python logic for integrating/updating information; a synchronization interval; → used for almost any possible type of data, e.g. RSS feeds, files, RDF data, other CW instances... Stores (bulk loading): tools similar to ETL that allow importing huge amounts of structured data; mainly used for bulk loading; different levels of security and data checking.
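The pivot-model idea can be illustrated in plain Python, outside of CubicWeb. In this rough sketch the two input layouts (`source A`, `source B`), their field names and the converter functions are all hypothetical; only the `Subject` attributes (`identifier`, `gender`, `date_of_birth`) and the gender vocabulary follow the schema shown on the data model slide.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Subject:
    """Pivot representation: every input is converted to this shape."""
    identifier: str
    gender: str  # one of 'male', 'female', 'unknown'
    date_of_birth: Optional[date]


# Hypothetical short codes used by one of the imaginary sources.
GENDER_CODES = {"m": "male", "f": "female"}


def normalize_gender(raw):
    """Map source-specific spellings onto the pivot vocabulary."""
    value = (raw or "").strip().lower()
    if value in ("male", "female"):
        return value
    return GENDER_CODES.get(value, "unknown")


def from_source_a(row):
    """Hypothetical source A: keys 'id', 'sex', 'birth' (ISO date)."""
    birth = date.fromisoformat(row["birth"]) if row.get("birth") else None
    return Subject(row["id"], normalize_gender(row.get("sex")), birth)


def from_source_b(record):
    """Hypothetical source B: keys 'subject_id', 'gender', no birth date."""
    return Subject(record["subject_id"], normalize_gender(record.get("gender")), None)


subjects = [
    from_source_a({"id": "s01", "sex": "M", "birth": "1980-05-17"}),
    from_source_b({"subject_id": "s02", "gender": "Female"}),
]
print(subjects)
```

Each new source only needs its own small converter; everything downstream works on the single pivot shape.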
  • 19. Data - Output. Design a specific view for output in any reference model: if one reference model changes, change the I/O, not the internal model; avoid data redundancy if different reference models are based on the same information (e.g. dc:title, foaf:name). Example of a view: export a Scan in XCEDE format.

      class ScanXcedeItemView(XMLItemView):
          __select__ = XMLItemView.__select__ & is_instance('Scan')
          __regid__ = 'xcede-item'

          def entity_call(self, entity):
              self.w(u'<acquisition ID="%(id)s" projectID="%(p)s" '
                     u'subjectID="%(s)s" visitID="%(a)s" '
                     u'studyID="%(a)s" episodeID="%(la)s"/>\n'
                     % {'id': entity.identifier,
                        'p': entity.related_study[0].name,
                        ...
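The view above writes an `acquisition` element by string formatting. As a standalone sketch of the same output step, the function below builds such an element from a dictionary and escapes attribute values with the standard library's `quoteattr`, which the slide's code does not do. The function name `acquisition_element` and the sample values are assumptions; this is not the Brainomics view itself.

```python
from xml.sax.saxutils import quoteattr
import xml.etree.ElementTree as ET


def acquisition_element(attributes):
    """Serialize a dict as a self-closing <acquisition .../> element.

    quoteattr() escapes &, < and > and picks a quote character that is
    safe for the value, so arbitrary study names cannot break the XML.
    """
    parts = ["%s=%s" % (name, quoteattr(str(value)))
             for name, value in attributes.items()]
    return "<acquisition %s/>\n" % " ".join(parts)


# Sample values are made up; the attribute names come from the slide.
xml_text = acquisition_element({"ID": "scan-001", "projectID": 'Study "A" & B'})

# Parsing the output back checks that it is well-formed XML.
parsed = ET.fromstring(xml_text)
print(xml_text, parsed.attrib)
```

Keeping the serialization in one small function means a change in the reference format touches only the output code, not the internal model.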
  • 20. Data Output - Example of XCEDE
  • 21. Data Output - RDF specific case. Existing tools in CubicWeb for RDF mapping:

      xy.register_prefix('foaf', 'http://xmlns.com/foaf/0.1/')
      xy.add_equivalence('Subject', 'foaf:Subject')
      xy.add_equivalence('MedicalCenter', 'foaf:Organization')
      xy.add_equivalence('Subject given_name', 'foaf:givenName')
      xy.add_equivalence('Subject family_name', 'foaf:familyName')
      xy.add_equivalence('* same_as *', 'owl:sameAs')
      xy.add_equivalence('* see_also *', 'foaf:page')

    It is also possible to plug in specific Python functions for non-trivial mappings. Used for RDF import/export and the SPARQL endpoint.
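What an equivalence table does on export can be sketched in a few lines of plain Python. This is not the `xy` implementation: the dictionaries, the `expand` and `entity_to_triples` helpers, the entity URI and the sample values are assumptions for illustration. Only the FOAF prefix and the two attribute equivalences are taken from the slide.

```python
# Prefix and attribute equivalences copied from the slide's mapping.
PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}
ATTRIBUTE_EQUIVALENCES = {
    ("Subject", "given_name"): "foaf:givenName",
    ("Subject", "family_name"): "foaf:familyName",
}


def expand(curie):
    """Turn a prefixed name such as 'foaf:givenName' into a full URI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def entity_to_triples(etype, uri, attributes):
    """Return (subject, predicate, object) triples for mapped attributes."""
    triples = []
    for name, value in sorted(attributes.items()):
        curie = ATTRIBUTE_EQUIVALENCES.get((etype, name))
        if curie is None:
            continue  # no equivalence declared: attribute is not exported
        triples.append((uri, expand(curie), value))
    return triples


# Hypothetical entity URI and values.
subject_uri = "http://localhost:8080/subject/1"
triples = entity_to_triples(
    "Subject", subject_uri,
    {"given_name": "Ada", "family_name": "Lovelace", "handedness": "left"},
)
for triple in triples:
    print(triple)
```

The internal attribute names never appear in the output, so the storage model can evolve independently of the exported vocabulary.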
  • 22. Plan 1 Introduction 2 Cubicweb and Brainomics 3 Data Model 4 Querying data 5 Future and Conclusions
  • 23. RQL - Relational Query Language. Features: similar to W3C’s SPARQL, but less verbose; supports the basic operations (select, insert, etc.), subqueries, ordering, counting, ...; tightly integrated with SQL, but abstracts the details of the tables and the joins; uses the schema for data type inference, based on a syntactic analysis of the request. → A query returns a result set (a list of results) that can be displayed using specific views.
  • 24. RQL - Example. Query all the Cmap scans of left-handed male subjects that have a score greater than 4.0 for the "algebre" question of the Localizer questionnaire ↓

      Any SA WHERE S handedness "left", S gender "male",
          X concerns S, A questionnaire_run X, A question Q,
          Q text "algebre", A value > 4,
          SA concerns S, SA is Scan, SA type "c map"

    and the SQL translation ...

      SELECT _SA.cw_eid
      FROM cw_Answer AS _A, cw_Question AS _Q, cw_QuestionnaireRun AS _X,
           cw_Scan AS _SA, cw_Subject AS _S
      WHERE _S.cw_handedness='left' AND _S.cw_gender='male'
        AND _X.cw_concerns=_S.cw_eid
        AND _A.cw_questionnaire_run=_X.cw_eid
        AND _A.cw_question=_Q.cw_eid AND _Q.cw_text='algebre'
        AND _A.cw_value>4
        AND _SA.cw_concerns=_S.cw_eid AND _SA.cw_type='c map'
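To see what the SQL translation computes, it can be run against a toy database. The sketch below uses an in-memory SQLite database with a simplified table layout that mirrors the column names in the slide's SQL; the real CubicWeb storage layout may differ, and all rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified tables named after the slide's SQL; the rows are invented.
conn.executescript("""
CREATE TABLE cw_Subject (cw_eid INTEGER PRIMARY KEY, cw_handedness TEXT, cw_gender TEXT);
CREATE TABLE cw_QuestionnaireRun (cw_eid INTEGER PRIMARY KEY, cw_concerns INTEGER);
CREATE TABLE cw_Question (cw_eid INTEGER PRIMARY KEY, cw_text TEXT);
CREATE TABLE cw_Answer (cw_eid INTEGER PRIMARY KEY, cw_questionnaire_run INTEGER,
                        cw_question INTEGER, cw_value REAL);
CREATE TABLE cw_Scan (cw_eid INTEGER PRIMARY KEY, cw_concerns INTEGER, cw_type TEXT);

INSERT INTO cw_Subject VALUES (1, 'left', 'male'), (2, 'right', 'male');
INSERT INTO cw_QuestionnaireRun VALUES (10, 1), (11, 2);
INSERT INTO cw_Question VALUES (20, 'algebre');
INSERT INTO cw_Answer VALUES (30, 10, 20, 5.0), (31, 11, 20, 6.0);
INSERT INTO cw_Scan VALUES (40, 1, 'c map'), (41, 1, 't map'), (42, 2, 'c map');
""")

# Same joins and filters as the SQL translation on the slide.
QUERY = """
SELECT _SA.cw_eid
FROM cw_Answer AS _A, cw_Question AS _Q, cw_QuestionnaireRun AS _X,
     cw_Scan AS _SA, cw_Subject AS _S
WHERE _S.cw_handedness = 'left' AND _S.cw_gender = 'male'
  AND _X.cw_concerns = _S.cw_eid
  AND _A.cw_questionnaire_run = _X.cw_eid
  AND _A.cw_question = _Q.cw_eid AND _Q.cw_text = 'algebre'
  AND _A.cw_value > 4
  AND _SA.cw_concerns = _S.cw_eid AND _SA.cw_type = 'c map'
"""
scan_eids = [row[0] for row in conn.execute(QUERY)]
print(scan_eids)  # → [40]
```

Only scan 40 matches: it is the single "c map" scan of the left-handed subject, whose "algebre" answer is above 4. The one-line RQL query expresses the same five-table join without naming any table or join column.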
  • 25. RQL for end users. No perfect dashboard / no perfect search form ↓ Put an expressive query language in the hands of the end users. Exploring the data model: explore the schema at http://localhost:8080/schema ; RQL completion form; learning RQL by showing the RQL for each page/each filter. Deep exploration of the data by the end users.
  • 26. RQL vs SPARQL. Developed in parallel with SPARQL years ago, with a focus on SQL databases and support for SET / UPDATE / DELETE. Why we should support SPARQL: SPARQL is a W3C standard and a reference in the Semantic Web community; it improves the interoperability of CubicWeb applications with other Semantic Web technologies. ... and why we don’t want to use only SPARQL: huge internal company knowledge of RDBMS (vs triple stores); SPARQL is quite verbose, RQL is more intuitive and elegant (“syntax matters”); there exists a basic translation from SPARQL to RQL (selection queries only).
  • 27. Federated databases in CW. PostgreSQL federated databases (let PostgreSQL do what it does best): since PostgreSQL 9.3; based on Foreign Data Wrappers (FDW); could be used for federated queries in CW (WIP). FROM clause (WIP), similar to SPARQL SERVICE:

      Any P, S WHERE GEN is GenomicMeasure, GEN concerns S,
          GEN platform P, P related_snps SN, SN in_gene G, G name GN
          WITH P BEING (Any X WHERE X is Paper, X keywords GN)
          FROM http://pubmed:8080

    Eventually, also allow SPARQL subqueries; or use the CubicWeb SPARQL endpoint with SERVICE.
  • 28. Plan 1 Introduction 2 Cubicweb and Brainomics 3 Data Model 4 Querying data 5 Future and Conclusions
  • 29. Conclusion. Brainomics: open source solution to manage brain imaging datasets and associated metadata; powerful querying and reporting tool, customized for emerging multimodal studies. Feedback from Brainomics: do not store raw data in the database; try to interact with existing reference databases. Using CubicWeb: easy modeling of other applicative fields in the schema (e.g. histology); security and migrations are already included; many different views, and an existing API to define your own.
  • 30. Future work. Future of Brainomics: how to transfer large files (> 10 GB for genotype files)? Need for a Content Delivery Network (CDN) in CubicWeb; integration with reference databases (pubmed, refseq, ...). Future of CubicWeb: extended support of SPARQL; finish the work on federated queries; REST support, Python’s Web Server Gateway Interface (WSGI); better integration with Bootstrap.
  • 32. RQL/SPARQL - Example. Cities of Île-de-France with more than 100 000 inhabitants?

    RQL:

      Any X WHERE X region Y, X population > 100000,
          Y uri "http://fr.dbpedia.org/resource/Île-de-France"

    SPARQL:

      select ?ville where {
          ?ville db-owl:region <http://fr.dbpedia.org/resource/Île-de-F
          ?ville rdf:type db-owl:Settlement .
          ?ville db-owl:populationTotal ?population .
          FILTER (?population > 100000)
      }
  • 33. This is NOT big data! Small databases ... No need to store raw data in the database; structured data: metadata (patient information, ...), chosen scores (some statistical values on genes of interest, ...); 10 MB / subject × 10,000 subjects = 100 GB; classical SQL databases (e.g. PostgreSQL) work great up to a few TB! http://www.chrisstucchio.com/blog/2013/hadoop_hatred.html http://www.vitavonni.de/blog/201309/2013092701-big-data-madness-and-reality.html
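The size estimate on this slide is a one-line calculation, spelled out here with decimal units (1 GB = 1000 MB). The variable names are illustrative; the per-subject size and the subject count are the slide's figures.

```python
MB_PER_SUBJECT = 10   # structured data per subject, from the slide
SUBJECTS = 10_000     # number of subjects, from the slide
MB_PER_GB = 1000      # decimal units

total_mb = MB_PER_SUBJECT * SUBJECTS
total_gb = total_mb / MB_PER_GB
print(total_mb, "MB =", total_gb, "GB")  # → 100000 MB = 100.0 GB
```

At 100 GB the dataset sits well below the "few TB" range that the slide cites as comfortable for a classical SQL database.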
  • 34. Global architecture of CubicWeb
  • 35. What’s needed: an efficient data model to integrate all the measures; easy access to the relevant information (query language + UI); import/export in several formats, for merging heterogeneous studies; adaptability to the evolution of various dynamic applicative fields. CubicWeb, a semantic data management framework.
  • 36. Data model - Assessment