



       The open source ISA metadata tracking
  framework: from data curation and management at
        the source, to the linked data universe

     BOSC, Long Beach, July 13-14, 2012

     Philippe Rocca-Serra (Ph. D)
     ISA Team
     twitter: @isatools.org




                                          philippe.rocca-serra@oerc.ox.ac.uk
                                                     http://www.isa-tools.org

Friday, 13 July 2012




                                MAIN THEME:
         It is all about structuring experimental information to make it
          available to computer and software agents to enable mining.

                         But let’s proceed gradually…




                       Notes in Lab Books       Spreadsheets and Tables   Facts as RDF statements
                       (information for humans) ( the compromise)         (information for machines)








                               Observations
         • Experiments are expensive and often publicly funded, yet
           many never see the light of day.
         • Spreadsheets are the most common vehicle for tracking
           so-called ‘omics’ (functional genomics) experimental metadata.
         • Technology-centric repositories form de facto silos.
         • Conversions are required to allow deposition to public
           databases.
         • Submitting common information to a series of
           repositories is inefficient.







                       Case Study






                       Many ontologies, Many Formats, Many
                                Requirements…


                                        Grr…Where are the
                                        tools!?!




                                Credits: http://liverpoolsolfed.wordpress.com/resources/image-bank/demonstration/






                       ISA framework overview




Why ISA format and Tools?

           – Supporting data provenance tracking
           – Node/Edge underlying concept
           – Tabular as a compromise: a presentation layer inspired by object
             models (FuGE, MAGE-OM)
           – A Generic representation, applied to:
              • microarray based experiments (MAGE)
              • sequencing based experiments (SRA)
              • flow cytometry based experiments (FuGE-Flow Cyt)
              • mass spectrometry and NMR spectroscopy experiments




Why ISA format and Tools?


     investigation
       high-level concept to link related studies
     study
       the central unit, containing information on the subject under study,
       its characteristics and any treatments applied.
       a study has associated assays
     assay
       test performed either on material taken from the subject or on the
       whole initial subject, which produces qualitative or quantitative
       measurements (data)
     assay(s)
       pointers to data file names/location
     data
       external files in native or other formats

     Example:

     H1   H. Sapiens   35   Years   H1.sample1   Labeling   H1.sample1.labeled   h1-s1.cel
     H1   H. Sapiens   35   Years   H1.sample2                                   h1-s2.cel
     H2   H. Sapiens   33   Years   H2.sample1   Labeling   H2.sample1.labeled   h2-s1.cel

     ISA metadata specifications:
       • workflow and process orientated
       • compatible with checklist enforcement
       • compatible with external vocabulary resources
       • compatible by design with existing schemas: MAGE-Tab, Pride-xml, SRA-xml

     Currently finalizing conversion to RDF to explore the growing Linked Data
     universe, in collaboration with the W3C HCLSIG and the Toxbank Consortium.
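
The node/edge idea behind the tabular view can be sketched in a few lines of Python. This is only an illustration, not ISA tooling: the three data rows are the ones on the slide, but the column headers are assumptions (the slide shows cell values only), and `rows_to_graph` is a hypothetical helper.

```python
import csv
import io

# Header names are assumed for illustration; the slide shows only the cell values.
TABLE = (
    "Source Name\tCharacteristics[organism]\tCharacteristics[age]\tUnit\t"
    "Sample Name\tProtocol REF\tLabeled Extract Name\tArray Data File\n"
    "H1\tH. Sapiens\t35\tYears\tH1.sample1\tLabeling\tH1.sample1.labeled\th1-s1.cel\n"
    "H1\tH. Sapiens\t35\tYears\tH1.sample2\t\t\th1-s2.cel\n"
    "H2\tH. Sapiens\t33\tYears\tH2.sample1\tLabeling\tH2.sample1.labeled\th2-s1.cel\n"
)

# Columns whose cells name a node (material or data file); the rest are annotations.
NODE_COLUMNS = ["Source Name", "Sample Name", "Labeled Extract Name", "Array Data File"]


def rows_to_graph(text):
    """Read a tab-delimited table and return (nodes, edges) as sets."""
    nodes = set()
    edges = set()
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    for row in reader:
        # Keep the non-empty node cells in left-to-right order.
        path = [row[column] for column in NODE_COLUMNS if row[column]]
        nodes.update(path)
        # Each consecutive pair along the row is one edge.
        edges.update(zip(path, path[1:]))
    return nodes, edges


nodes, edges = rows_to_graph(TABLE)
for start, end in sorted(edges):
    print(start, "->", end)
```

Because nodes are stored in a set, the source H1 appears once even though it occupies two rows, which is how the repeated spreadsheet rows collapse back into a single branching graph.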




ISA syntax and Table definition

• Material Transformations:
     – Inputs and Outputs of Protocols are Material Nodes (Source Name, Sample Name, Extract Name, Labeled Extract Name).

     Material Node (input or output of a protocol):
        Characteristics[…]
        Factor Value[…] (independent variables)
        Material Type
        Comment[…]

     Protocol REF:
        Parameter Value[…]
        Performer (operator effect)
        Date (day effect)

     Data File Node:
        Comment[…]
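
As a rough sketch of how such a header row could be read, the Python below attaches each attribute column to the closest node or Protocol REF column on its left. That left-attachment rule is an assumption made for illustration, `group_columns` is a hypothetical helper, and the bracketed terms `organism`, `dose` and `dye` are made up. Data file columns are left out because the slide does not name their headers.

```python
import re

# Node and attribute header names as listed on the slide.
NODE_HEADERS = {"Source Name", "Sample Name", "Extract Name", "Labeled Extract Name"}
BRACKETED_ATTRIBUTES = {"Characteristics", "Factor Value", "Parameter Value", "Comment"}
PLAIN_ATTRIBUTES = {"Material Type", "Performer", "Date"}


def group_columns(headers):
    """Group attribute columns under the node or Protocol REF column to their left."""
    groups = []
    for header in headers:
        # Strip a trailing "[...]" qualifier, e.g. "Characteristics[organism]".
        base = re.sub(r"\[.*\]$", "", header).strip()
        if header in NODE_HEADERS or header == "Protocol REF":
            groups.append({"column": header, "attributes": []})
        elif base in BRACKETED_ATTRIBUTES or base in PLAIN_ATTRIBUTES:
            if not groups:
                raise ValueError(f"attribute column {header!r} has no owner")
            groups[-1]["attributes"].append(header)
        else:
            raise ValueError(f"unrecognised column {header!r}")
    return groups


HEADERS = [
    "Source Name", "Characteristics[organism]",
    "Sample Name", "Factor Value[dose]",
    "Protocol REF", "Parameter Value[dye]", "Performer", "Date",
    "Labeled Extract Name", "Material Type",
]

groups = group_columns(HEADERS)
for group in groups:
    print(group["column"], "<-", ", ".join(group["attributes"]))
```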



                       ISAconfigurator Tables








             How do ISA tools access Ontology servers?




The ISAcreator...




                              isacreator
  Developed to be a user-friendly way to
  enter standards-compliant metadata: it
  has lots of features...


    But these are just some of them... we
    also have a data entry wizard and an
    import utility...







                       Select and Annotate in ISAcreator




Extending ISAcreator
                           The Plugin Architecture




Plugins in ISAcreator

     In ISAcreator, we use the Apache Felix implementation of the OSGi framework... it’s really good.

     • Plugins can be developed for 3 different purposes:
        • Search (adds extra search space for ontology tool)
        • Custom cell editors (for spreadsheet)
        • Extra general functionality (which appears in a plugin menu)

     • 2 examples of ISA plugins:
        • Access to local metadata stores: Novartis Plugin to Ontology Widget
        • Annotation of findings: Metabolite Identification Plugin (Metabolights Repository contribution to ISA project).



Plugins Example 1 - Novartis Metastore Search




                           Search function on the Novartis
                           Metastore... integrates search results
                           on the metastore in the Ontology
                           search tool.

                           So, with the Novartis plugin in your
                           Plugin directory, you’ll be able to
                           search the Novartis metastore
                           directly within ISAcreator, and it will
                           handle all the tasks involved with
                           recording term source, etc.




Plugins Example 2 - Metabolite Identification plugin




     Credits: Kenneth Haug: Metabolights






                       Potential Issues and known hurdles


         • The problem of conflicting versions
           – especially acute when working with big consortia
           – distributed, decentralized groups of users
         • Lack of version control and history
         • Absence of collaborative features

               – Looking for new solutions while retaining the features!
                  • OntoMaton: bringing Google Docs, NCBO BioPortal and
                    ISA-TAB together!


OntoMaton: Searching




OntoMaton: Tagging




OntoMaton
                       • Public release: http://goo.gl/2OKFV
                       • Can be used in any Google Spreadsheet
                         document

                       • Application:
                        • Annotating data records
                        • Supporting ontology development (see OBI
                           Quick Term Templates)






                             ISA2RDF work in progress
         • Use case on W3C HCLS scientific discourse list
               – deciding on the granularity of representation
               – building on previous experience
               – evaluating alternative representations
         • Participation in the Biohackathon 2011
               – http://blogs.openaccesscentral.com/blogs/bmcblog/entry/biohackathon_2011_number_1
               – Discussing best practices
                       • PURL URIs and identifiers.org as identifiers
                       • Openphacts guidelines (http://www.nanopub.org/guidelines/OpenPHACTS_Nanopublication_Guidlines_v1.8.1.pdf)
Preparing for Linked Open Data
                   ✴   ISA2RDF (Toxbank collaboration) contribution to an
                       ecosystem of software tools supporting the ISA syntax
                   ✴   reliance on internet-resolvable identifiers
                   ✴   W3C bio/life science Note on Gene Expression RDF -
                       (PMID: 22449719)
                   ✴   TODO:
                       ✴   Specify comparator groups + analysis methods and
                           resulting measurements and statistical measures
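
To make the reliance on resolvable identifiers concrete, here is a minimal Python sketch that writes one study subject as N-Triples. It is not ISA2RDF output. The `http://example.org/isa/` namespace and both predicate names are hypothetical placeholders, and the sketch assumes the `http://identifiers.org/taxonomy/` URI pattern with NCBI Taxonomy identifier 9606 for Homo sapiens.

```python
# Hypothetical namespace for subjects and predicates (not part of any ISA release).
EX = "http://example.org/isa/"
# Assumed identifiers.org URI pattern for NCBI Taxonomy entries.
TAXON = "http://identifiers.org/taxonomy/"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def source_to_ntriples(name, taxon_id, age_years):
    """Describe one source with a resolvable organism identifier and an age."""
    subject = iri(EX + "source/" + name)
    return [
        f"{subject} {iri(EX + 'hasOrganism')} {iri(TAXON + taxon_id)} .",
        f"{subject} {iri(EX + 'ageInYears')} {literal(str(age_years))} .",
    ]


# H1 is the 35-year-old H. Sapiens source used as the example earlier in the talk.
triples = source_to_ntriples("H1", "9606", 35)
print("\n".join(triples))
```

Pointing the organism at a shared, resolvable URI rather than the free-text string "H. Sapiens" is what lets independent datasets link to the same resource.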






                                    ISA2RDF: work in progress




                       jeliazkova.nina
                       [toxbank project]
Friday, 13 July 2012
32




                                    ISA2RDF: work in progress




                       jeliazkova.nina
                       [toxbank project]
Friday, 13 July 2012
ISA2OWL

                       • OWLAPI
                       • ISA Parser (in memory BII object store objects)

                       • Mapping ISA syntax into target Ontological Space
                       • Decoupling Mapping from Conversion Engine
                        • avoid being tied to a specific semantic framework
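
The decoupling idea can be illustrated with a small Python sketch, assuming the mapping is plain data handed to a generic conversion function. This is not the ISA2OWL implementation (which the slide says builds on the OWLAPI): `convert`, the two mapping tables and every `example.org` IRI are hypothetical.

```python
# Two interchangeable mapping tables: ISA concept -> class IRI in a target ontology.
# All IRIs are placeholders, not real ontology terms.
MAPPING_A = {
    "Source Name": "http://example.org/onto-a/MaterialEntity",
    "Protocol REF": "http://example.org/onto-a/Process",
}
MAPPING_B = {
    "Source Name": "http://example.org/onto-b/Sample",
    "Protocol REF": "http://example.org/onto-b/Procedure",
}


def convert(records, mapping):
    """Generic engine: types each record using whatever mapping it is given."""
    triples = []
    for concept, identifier in records:
        target = mapping.get(concept)
        if target is None:
            # Gap in coverage: this sketch simply skips unmapped concepts.
            continue
        triples.append((identifier, "rdf:type", target))
    return triples


RECORDS = [("Source Name", "H1"), ("Protocol REF", "Labeling"), ("Comment", "note")]

triples_a = convert(RECORDS, MAPPING_A)
triples_b = convert(RECORDS, MAPPING_B)
print(triples_a)
print(triples_b)
```

Switching the target framework then means supplying a different table while the engine stays untouched.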

ISA2OWL: mapping in the
                       BFO space as starting point




ISA2OWL: mapping issues

                       • Stability over time
                       • Keeping track of resource versions
                       • Gaps in coverage
                           • Use of local extensions
                           • Direct requests/contributions

ISA2OWL: development

                       • include graph metadata (graph provenance to aid
                         indexing)

                       • extend semantic validation of ISA archive
                       • augment annotation by suggesting additions
                       • facilitate curation work
                       • create new mappings to other frameworks
                         (OPML model, SIO)






        Publication...



                       ISA software suite: supporting standards-compliant
                       experimental annotation and enabling curation at the
                       community level
                       Philippe Rocca-Serra; Marco Brandizi; Eamonn Maguire; Nataliya Sklyar; Chris Taylor; Kimberly Begley; Dawn Field; Stephen Harris;
                       Winston Hide; Oliver Hofmann; Steffen Neumann; Peter Sterk; Weida Tong; Susanna-Assunta Sansone
                       Bioinformatics 2010 26: 2354-2356








            Acknowledgements

         Groups and individuals participating in:
         MIBBI http://mibbi.org
         ISA-Tab format http://isatab.sf.net
         OBO Foundry http://obofoundry.org
         OBI: http://obi-ontology.org/page/Main_Page

         ISA Infrastructure Team:
         Alejandra Gonzalez-Beltran (Oxford)
         Eamonn Maguire (Oxford)
         Philippe Rocca-Serra (Oxford)

         Collaborators at:
         Cambridge University
         EuNuGO
         Harvard School for Public Health
         FDA's NCTR
         Leibniz Plant Institute
         NERC's NEBC
         SIDR, INIST
         Metabolights, EMBL-EBI

         Funders:
         EU Carcinogenomics Project
         UK BBSRC





                       Groups and individuals participating in:
                       Winston Hide: HSPH
                       Oliver Hofmann: HSPH
                       Shannan Ho Sui : HSPH
                       Brad Chapman: HSPH
                       Christoph Steinbeck: Metabolights
                       Kenneth Haug: Metabolights
                       Paula de Matos: Metabolights
                       Magali Roux: INIST
                       Florian Mazur: INIST
                       Alain Zasadzinki: INIST
                       Marie Christine Jacquemot: INIST
                       Nina Jeliazkova: ToxBank

                       And many more who have to forgive us!






                       Questions:




 
B Temperton - The Bioinformatics Testing Consortium
B Temperton - The Bioinformatics Testing ConsortiumB Temperton - The Bioinformatics Testing Consortium
B Temperton - The Bioinformatics Testing ConsortiumJan Aerts
 
J Goecks - The Galaxy Visual Analysis Framework
J Goecks - The Galaxy Visual Analysis FrameworkJ Goecks - The Galaxy Visual Analysis Framework
J Goecks - The Galaxy Visual Analysis FrameworkJan Aerts
 
S Cain - GMOD in the cloud
S Cain - GMOD in the cloudS Cain - GMOD in the cloud
S Cain - GMOD in the cloudJan Aerts
 
B Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysisB Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysisJan Aerts
 
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...Jan Aerts
 
S Cheng - eagle-i: development and expansion of a scientific resource discove...
S Cheng - eagle-i: development and expansion of a scientific resource discove...S Cheng - eagle-i: development and expansion of a scientific resource discove...
S Cheng - eagle-i: development and expansion of a scientific resource discove...Jan Aerts
 
A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...
A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...
A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...Jan Aerts
 
A Kalderimis - InterMine: Embeddable datamining components
A Kalderimis - InterMine: Embeddable datamining componentsA Kalderimis - InterMine: Embeddable datamining components
A Kalderimis - InterMine: Embeddable datamining componentsJan Aerts
 
E Afgan - Zero to a bioinformatics analysis platform in four minutes
E Afgan - Zero to a bioinformatics analysis platform in four minutesE Afgan - Zero to a bioinformatics analysis platform in four minutes
E Afgan - Zero to a bioinformatics analysis platform in four minutesJan Aerts
 
B Kinoshita - Creating biology pipelines with BioUno
B Kinoshita - Creating biology pipelines with BioUnoB Kinoshita - Creating biology pipelines with BioUno
B Kinoshita - Creating biology pipelines with BioUnoJan Aerts
 
D Baker - Galaxy Update
D Baker - Galaxy UpdateD Baker - Galaxy Update
D Baker - Galaxy UpdateJan Aerts
 

Plus de Jan Aerts (20)

VIZBI 2014 - Visualizing Genomic Variation
VIZBI 2014 - Visualizing Genomic VariationVIZBI 2014 - Visualizing Genomic Variation
VIZBI 2014 - Visualizing Genomic Variation
 
Visual Analytics in Omics - why, what, how?
Visual Analytics in Omics - why, what, how?Visual Analytics in Omics - why, what, how?
Visual Analytics in Omics - why, what, how?
 
Visual Analytics in Omics: why, what, how?
Visual Analytics in Omics: why, what, how?Visual Analytics in Omics: why, what, how?
Visual Analytics in Omics: why, what, how?
 
Visual Analytics talk at ISMB2013
Visual Analytics talk at ISMB2013Visual Analytics talk at ISMB2013
Visual Analytics talk at ISMB2013
 
Visualizing the Structural Variome (VMLS-Eurovis 2013)
Visualizing the Structural Variome (VMLS-Eurovis 2013)Visualizing the Structural Variome (VMLS-Eurovis 2013)
Visualizing the Structural Variome (VMLS-Eurovis 2013)
 
Humanizing Data Analysis
Humanizing Data AnalysisHumanizing Data Analysis
Humanizing Data Analysis
 
Intro to data visualization
Intro to data visualizationIntro to data visualization
Intro to data visualization
 
L Fu - Dao: a novel programming language for bioinformatics
L Fu - Dao: a novel programming language for bioinformaticsL Fu - Dao: a novel programming language for bioinformatics
L Fu - Dao: a novel programming language for bioinformatics
 
J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...
J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...
J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...
 
B Temperton - The Bioinformatics Testing Consortium
B Temperton - The Bioinformatics Testing ConsortiumB Temperton - The Bioinformatics Testing Consortium
B Temperton - The Bioinformatics Testing Consortium
 
J Goecks - The Galaxy Visual Analysis Framework
J Goecks - The Galaxy Visual Analysis FrameworkJ Goecks - The Galaxy Visual Analysis Framework
J Goecks - The Galaxy Visual Analysis Framework
 
S Cain - GMOD in the cloud
S Cain - GMOD in the cloudS Cain - GMOD in the cloud
S Cain - GMOD in the cloud
 
B Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysisB Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysis
 
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...
 
S Cheng - eagle-i: development and expansion of a scientific resource discove...
S Cheng - eagle-i: development and expansion of a scientific resource discove...S Cheng - eagle-i: development and expansion of a scientific resource discove...
S Cheng - eagle-i: development and expansion of a scientific resource discove...
 
A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...
A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...
A Kanterakis - PyPedia: a python crowdsourcing development environment for bi...
 
A Kalderimis - InterMine: Embeddable datamining components
A Kalderimis - InterMine: Embeddable datamining componentsA Kalderimis - InterMine: Embeddable datamining components
A Kalderimis - InterMine: Embeddable datamining components
 
E Afgan - Zero to a bioinformatics analysis platform in four minutes
E Afgan - Zero to a bioinformatics analysis platform in four minutesE Afgan - Zero to a bioinformatics analysis platform in four minutes
E Afgan - Zero to a bioinformatics analysis platform in four minutes
 
B Kinoshita - Creating biology pipelines with BioUno
B Kinoshita - Creating biology pipelines with BioUnoB Kinoshita - Creating biology pipelines with BioUno
B Kinoshita - Creating biology pipelines with BioUno
 
D Baker - Galaxy Update
D Baker - Galaxy UpdateD Baker - Galaxy Update
D Baker - Galaxy Update
 

Dernier

Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 

Dernier (20)

Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 

P Rocca-Serra - The open source ISA metadata tracking framework: from data curation and management at the source, to the linked data universe

  • 1. The open source ISA metadata tracking framework: from data curation and management at the source, to the linked data universe. BOSC, Long Beach, July 13-14, 2012. Philippe Rocca-Serra (Ph.D.), ISA Team. Twitter: @isatools.org. Email: philippe.rocca-serra@oerc.ox.ac.uk. Web: http://www.isa-tools.org. Friday, 13 July 2012
  • 5. MAIN THEME: It is all about structuring experimental information to make it available to computers and software agents to enable mining. But let's proceed gradually: notes in lab books (information for humans); spreadsheets and tables (the compromise); facts as RDF statements (information for machines).
  • 6. Observations • Experiments are expensive and often publicly funded, yet many never see the light of day. • Spreadsheets are the most common vehicle for tracking experimental metadata in so-called 'omics' (functional genomics). • Technology-centric repositories form de facto silos. • Conversions are required to allow deposition to public databases. • Submitting common information to a series of repositories is inefficient.
  • 7. Case Study
  • 8. Many ontologies, many formats, many requirements… Grr… Where are the tools!?! Credits: http://liverpoolsolfed.wordpress.com/resources/image-bank/demonstration/
  • 9. ISA framework overview
  • 10. Why ISA format and Tools? – Supporting data provenance tracking – Node/edge underlying concept – Tabular as a compromise: a presentation layer inspired by object models (FuGE, MAGE-OM) – A generic representation, applied to: • microarray-based experiments (MAGE) • sequencing-based experiments (SRA) • flow cytometry-based experiments (FuGE-Flow Cyt) • mass spectrometry and NMR spectroscopy experiments
  • 11. Why ISA format and Tools?
      – investigation: high-level concept to link related studies
      – study: the central unit, containing information on the subject under study, its characteristics and any treatments applied; a study has associated assays
      – assay: test performed either on material taken from the subject or on the whole initial subject, which produces qualitative or quantitative measurements (data)
      – assay(s): pointers to data file names/location
      – data: external files in native or other formats
      – Example rows: H1, H. Sapiens, 35 Years, H1.sample1, Labeling, H1.sample1.labeled, h1-s1.cel; H1, H. Sapiens, 35 Years, H1.sample2, h1-s2.cel; H2, H. Sapiens, 33 Years, H2.sample1, Labeling, H2.sample1.labeled, h2-s1.cel
      – ISA metadata specifications: workflow and process orientated; compatible with checklist enforcement; compatible with external vocabulary resources; compatible by design with existing schemas (MAGE-Tab, Pride-xml, SRA-xml)
      – Currently finalizing conversion to RDF to explore the growing Linked Data universe, in collaboration with the W3C HCLSIG and the Toxbank Consortium
  • 16. ISA syntax and Table definition
      – Material transformations: inputs and outputs of Protocols are Material Nodes (Source Name, Sample Name, Extract Name, Labeled Extract Name).
      – Node types shown: Material Node (input), Protocol REF, Material Node (output), Data File Node
      – Material Node fields: Characteristics[…], Factor Value[…] (independent variables), Material Type, Comment[…]
      – Protocol REF fields: Parameter Value[…], Performer (operator effect), Date (day effect)
      – Data File Node fields: Comment[…]
  • 17. ISAconfigurator Tables
  • 18. ISAconfigurator Tables
  • 19. How do ISA tools access Ontology servers?
  • 20. The ISAcreator: developed to be a user-friendly way to enter standards-compliant metadata. It has lots of features, and these are just some of them; there is also a data entry wizard and an import utility.
  • 21. Select and Annotate in ISAcreator
  • 22. Extending ISAcreator: the plugin architecture
  • 23. Plugins in ISAcreator: ISAcreator uses the Apache Felix implementation of the OSGi framework (it's really good). • Plugins can be developed for 3 different purposes: search (adds extra search space for the ontology tool); custom cell editors (for the spreadsheet); extra general functionality (which appears in a plugin menu). • 2 examples of ISA plugins: access to local metadata stores (Novartis plugin to the ontology widget); annotation of findings (Metabolite Identification plugin, a Metabolights repository contribution to the ISA project).
  • 24. Plugins, example 1: Novartis Metastore search. The search function integrates results from the Novartis Metastore into the ontology search tool. So, with the Novartis plugin in your plugin directory, you'll be able to search the Novartis Metastore directly within ISAcreator, and it will handle all the tasks involved with recording the term source, etc.
  • 25. Plugins, example 2: Metabolite Identification plugin. Credits: Kenneth Haug, Metabolights
  • 26. Potential issues and known hurdles • The problem of conflicting versions – especially high when working with big consortia – distributed, decentralized groups of users • Lack of version control and history • Absence of collaborative features – looking for new solutions while retaining the features! • OntoMaton: bringing Google Docs, NCBO BioPortal and ISA-TAB together!
  • 30. OntoMaton • Public release: http://goo.gl/2OKFV • Can be used in any Google Spreadsheet document • Application: • annotating data records • supporting ontology development (see OBI Quick Term Templates)
  • 31. ISA2RDF: work in progress • Use case on the W3C HCLS scientific discourse list – deciding on the granularity of representation – building on previous experience – evaluating alternative representations • Participation in the Biohackathon 2011 – http://blogs.openaccesscentral.com/blogs/bmcblog/entry/biohackathon_2011_number_1 – discussing best practices • PURL URIs and identifiers.org as identifiers • OpenPHACTS guidelines (http://www.nanopub.org/guidelines/OpenPHACTS_Nanopublication_Guidlines_v1.8.1.pdf)
  • 32. Preparing for Linked Open Data ✴ ISA2RDF (Toxbank collaboration): contribution to an ecosystem of software tools supporting the ISA syntax ✴ Reliance on internet-resolvable identifiers ✴ W3C bio/life science Note on Gene Expression RDF (PMID: 22449719) ✴ TODO: specify comparator groups, analysis methods, and resulting measurements and statistical measures
  • 35. ISA2RDF: work in progress. jeliazkova.nina [toxbank project]
  • 37. ISA2OWL • OWLAPI • ISA parser (in-memory BII object store objects) • Mapping ISA syntax into a target ontological space • Decoupling the mapping from the conversion engine • Avoid being tied to a single semantic framework
  • 38. ISA2OWL: mapping in the BFO space as starting point
  • 40. ISA2OWL: mapping issues • Stability over time • Keeping track of resource versions • Gaps in coverage • Use of local extensions • Direct requests/contributions
  • 41. ISA2OWL: development • include graph metadata (graph provenance to aid indexing) • extend semantic validation of ISA archives • augment annotation by suggesting additions • facilitate curation work • create new mappings to other frameworks (OPML model, SIO)
  • 42. Publication: "ISA software suite: supporting standards-compliant experimental annotation and enabling curation at the community level". Philippe Rocca-Serra; Marco Brandizi; Eamonn Maguire; Nataliya Sklyar; Chris Taylor; Kimberly Begley; Dawn Field; Stephen Harris; Winston Hide; Oliver Hofmann; Steffen Neumann; Peter Sterk; Weida Tong; Susanna-Assunta Sansone. Bioinformatics 2010 26: 2354-2356
  • 43. Acknowledgements
      – Groups and individuals participating in: MIBBI (http://mibbi.org); ISA-Tab format (http://isatab.sf.net); OBO Foundry (http://obofoundry.org); OBI (http://obi-ontology.org/page/Main_Page)
      – Collaborators at: Cambridge University; EuNuGO; Harvard School for Public Health; FDA's NCTR; Leibniz Plant Institute; NERC's NEBC; SIDR, INIST; Metabolights, EMBL-EBI
      – ISA Infrastructure Team: Alejandra Gonzalez-Beltran (Oxford); Eamonn Maguire (Oxford); Philippe Rocca-Serra (Oxford)
      – Funders: EU Carcinogenomics Project; UK BBSRC
  • 44. Groups and individuals participating in: Winston Hide (HSPH); Oliver Hoffman (HSPH); Shannan Ho Sui (HSPH); Brad Chapman (HSPH); Christoph Steinbeck (Metabolights); Kenneth Haug (Metabolights); Paula de Matos (Metabolights); Magali Roux (INIST); Florian Mazur (INIST); Alain Zasadzinki (INIST); Marie Christine Jacquemot (INIST); Nina Jeliazkova (ToxBank). And many more who have to forgive us!
  • 45. Questions
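
The node/edge reading of the ISA table syntax, and the idea of keeping the mapping separate from the conversion engine (the ISA2OWL slide), can be illustrated with a small Python sketch. This is not the ISA team's code or the ISA2RDF/ISA2OWL implementation. It assumes a simplified single table built from the example rows on the "Why ISA format and Tools?" slide (real ISA-Tab splits this across study and assay files), an assumed `Array Data File` heading for the `.cel` files, and a hypothetical `ex:` vocabulary. Attribute columns are simply attached to the preceding node.

```python
import csv
import io

# Simplified, hypothetical ISA-Tab-like table using the example rows from the slides.
# The "Array Data File" heading and the single-table layout are assumptions.
TABLE = "\n".join([
    "\t".join(["Source Name", "Characteristics[organism]", "Characteristics[age]",
               "Sample Name", "Protocol REF", "Labeled Extract Name", "Array Data File"]),
    "\t".join(["H1", "H. Sapiens", "35 Years", "H1.sample1", "Labeling",
               "H1.sample1.labeled", "h1-s1.cel"]),
    "\t".join(["H1", "H. Sapiens", "35 Years", "H1.sample2", "", "", "h1-s2.cel"]),
    "\t".join(["H2", "H. Sapiens", "33 Years", "H2.sample1", "Labeling",
               "H2.sample1.labeled", "h2-s1.cel"]),
])

# Columns whose heading ends this way are treated as nodes (materials or data files).
NODE_SUFFIXES = (" Name", " File")

# Hypothetical mapping from ISA syntax to predicates. Replacing this dictionary
# retargets the output vocabulary without touching the conversion code below.
MAPPING = {
    "Characteristics[organism]": "ex:organism",
    "Characteristics[age]": "ex:age",
    "derives": "ex:derivedFrom",
    "protocol": "ex:viaProtocol",
}


def parse_row(header, row):
    """Turn one table row into a list of nodes and a list of edges."""
    nodes, edges = [], []
    pending_protocol = None
    for column, value in zip(header, row):
        if not value:
            continue  # empty cell: this step does not apply to the row
        if column.endswith(NODE_SUFFIXES):
            if nodes:
                # Edge from the previous node, labelled with the protocol if any.
                edges.append((nodes[-1]["id"], pending_protocol, value))
            nodes.append({"type": column, "id": value, "attributes": {}})
            pending_protocol = None
        elif column == "Protocol REF":
            pending_protocol = value
        elif "[" in column and nodes:
            # Simplification: attach every bracketed attribute to the latest node.
            nodes[-1]["attributes"][column] = value
    return nodes, edges


def to_triples(nodes, edges, mapping):
    """Convert nodes and edges to (subject, predicate, object) tuples."""
    triples = []
    for node in nodes:
        for column, value in node["attributes"].items():
            triples.append((node["id"], mapping[column], value))
    for source, protocol, target in edges:
        triples.append((target, mapping["derives"], source))
        if protocol:
            triples.append((target, mapping["protocol"], protocol))
    return triples


reader = csv.reader(io.StringIO(TABLE), delimiter="\t")
header = next(reader)
triples = []
for row in reader:
    row_nodes, row_edges = parse_row(header, row)
    triples.extend(to_triples(row_nodes, row_edges, MAPPING))

for triple in triples:
    print(triple)
```

Because the predicates live only in `MAPPING`, the same parsed nodes and edges could be emitted against a different vocabulary by passing another dictionary to `to_triples`, which is the kind of decoupling the ISA2OWL slide describes.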