x-omics Data Integration Challenges
                                    Dr. Michael Lappe, Ph.D.
                                Senior Bioinformatics Scientist -
                            Functional Genomics and Systems Biology

                                       CLCbio, Denmark
Thursday, February 14, 13
Michael’s Social Network (partial)




No more cargo-cult




  http://en.wikipedia.org/wiki/Cargo_cult_science
  http://en.wikipedia.org/wiki/Cargo_cult
Form follows function


             http://www.youtube.com/watch?v=pQHX-SjgQvQ




                  Do not follow empty ancient rituals that do not serve a useful purpose anymore!
              Do NOT confuse the container with its content. Database systems are NOT the DATA!
Data Integration
               • involves combining data
               • residing in different sources and
               • providing users with a unified view [...]

               (combining research
               results from different
               bioinformatics repositories,
               for example)

               http://en.wikipedia.org/wiki/Data_integration

Different Levels of Resolution

                                       •   Ecosystem

                                       •   Population

                                       •   Organism

                                       •   Organ

                                       •   Tissue

                                       •   Cell

                                       •   Organelle

                                       •   Complexes

                                       •   Assemblies

                                       •   Molecule

                                       •   Atoms
                                       www.sciencephoto.com
Different experimental sources




  Kühner et al. “Proteome organization in a genome-reduced bacterium.”
  Science (2009) vol. 326 (5957) pp. 1235
www.abcam.com/cancer




               What are the typical mechanisms at the structural level
                  that cause the de/activation of cancer genes?




 Henning Stehr*, Seon-Hi J. Jang*, Jose M. Duarte, Christoph Wierling, Hans Lehrach, Michael Lappe, Bodo M.H. Lange
(2011) "The structural impact of cancer-associated mutations in oncogenes and tumor suppressors" Molecular Cancer
Mapping mutations to (modelled) structures
                             ERBB2            MLH1




Structural Analysis
                     surface vs. core - binding site - stability - clustering ...




ERBB2                                                                               MLH1

A simple yet robust classification
                            IN-




• Oncogenes: activating gain-of-function mutations (surface, near functional/binding sites)
• Tumor-suppressor genes: de-activating loss-of-function mutations (in the core, destabilising the structure)




ERBB2                                                  MLH1
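
The surface-versus-core contrast on this slide can be read as a simple decision rule. Below is a minimal sketch in Python, assuming each mutation has already been mapped onto a structure and annotated with a relative solvent accessibility and a distance to the nearest functional site. The field names, the thresholds and the example values are hypothetical and are not taken from the cited study.

```python
from dataclasses import dataclass


@dataclass
class MappedMutation:
    """A mutation mapped onto a (modelled) structure; all fields are illustrative."""
    label: str
    relative_accessibility: float  # 0.0 = fully buried, 1.0 = fully exposed
    distance_to_site: float        # distance in Å to the nearest functional/binding site


def likely_mechanism(mutation, buried_below=0.2, near_site_within=8.0):
    """Apply the surface-vs-core rule of thumb with hypothetical thresholds."""
    if mutation.relative_accessibility < buried_below:
        # Core position: expected to destabilise the fold (loss of function).
        return "loss-of-function (core, destabilising)"
    if mutation.distance_to_site <= near_site_within:
        # Exposed and close to a functional site: candidate gain of function.
        return "gain-of-function (surface, near functional site)"
    return "unclassified"


# Made-up example values, for illustration only.
examples = [
    MappedMutation("surface_near_site", 0.65, 4.5),
    MappedMutation("buried_core", 0.05, 15.0),
    MappedMutation("surface_far_from_site", 0.80, 25.0),
]
results = {m.label: likely_mechanism(m) for m in examples}
```

A rule this coarse is only a starting point; real analyses would also consider stability estimates and clustering of mutations, as listed on the "Structural Analysis" slide.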

biological Networks - getting to grips with COMPLEXITY

Complex (biological) Systems as Networks of Interacting Elements.

Life is a graph! G=(V, E)

[Diagram: a Graph records Nodes (Vertices) and Relationships (Edges); Relationships organize Nodes; Nodes and Relationships have Properties]
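
The graph model on this slide (nodes and relationships, both carrying properties) can be sketched in a few lines of Python. This is an illustrative in-memory structure, not the API of any particular graph database; the class name, method names and the placeholder node names are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyGraph:
    """G = (V, E) where nodes and relationships both carry properties."""
    nodes: dict = field(default_factory=dict)          # node id -> properties
    relationships: list = field(default_factory=list)  # (source, kind, target, properties)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def relate(self, source, kind, target, **properties):
        # Relationships may only connect nodes that already exist.
        for node_id in (source, target):
            if node_id not in self.nodes:
                raise KeyError(f"unknown node: {node_id}")
        self.relationships.append((source, kind, target, properties))

    def neighbours(self, node_id):
        """Nodes reachable over one relationship, ignoring direction."""
        found = set()
        for source, _kind, target, _props in self.relationships:
            if source == node_id:
                found.add(target)
            elif target == node_id:
                found.add(source)
        return found


graph = PropertyGraph()
graph.add_node("gene_A", kind="gene")        # placeholder names, not real data
graph.add_node("disease_X", kind="disease")
graph.relate("gene_A", "ASSOCIATED_WITH", "disease_X", evidence="placeholder")
```

Keeping properties on relationships as well as on nodes is what lets evidence, sources or scores travel with each link instead of living in a separate table.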

The human disease network.
Goh KI, Cusick ME, Valle D, Childs B, Vidal M, Barabási AL.
Proc Natl Acad Sci U S A.
2007 May 22;104(21):8685-90.




Graph Databases
           Think of Graphs not as a visualization
           but as a DATA STRUCTURE




           http://en.wikipedia.org/wiki/Graph_database
           http://nosql-database.org/
           http://www.neo4j.org/learn#graphs
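
Treating a graph as a data structure means it can be queried by traversal rather than only drawn. As a minimal sketch in plain Python (not tied to any graph database), a breadth-first search over an adjacency mapping finds a shortest path between two nodes; the node names are placeholders.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest path, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Placeholder network: each key lists the nodes it is connected to.
network = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
    "F": [],
}
path = shortest_path(network, "A", "E")
```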


Proteins as ResidueInteractionGraphs - capturing dynamics

1a1m - (Cα 8 Å)
Anisotropic Network Model, eigen-mode 3
1a1m (X-ray), 1jnj (NMR, 20 models)

"oGNM: A protein dynamics online calculation engine using the Gaussian Network Model" Yang, L.-W., Rader, A.J., Liu, X., Jursa, C.J., Chen S.C., Karimi, H, Bahar, I. Nucleic Acids Res, 34, W24-31, 2006
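
A residue interaction graph of the kind labelled on this slide connects two residues when their Cα atoms lie within a distance cutoff (8 Å here). The following is a minimal sketch in Python (3.8 or later, for `math.dist`); the function name is hypothetical and the coordinates are made-up toy values, not taken from 1a1m or 1jnj.

```python
from itertools import combinations
from math import dist


def residue_interaction_graph(ca_coords, cutoff=8.0):
    """Return the set of residue index pairs whose Cα atoms lie within `cutoff` Å."""
    return {
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(ca_coords), 2)
        if dist(a, b) <= cutoff
    }


# Toy Cα coordinates in Å, spaced along a line for easy checking.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
contacts = residue_interaction_graph(toy_coords)
```

The sketch keeps contacts between sequence neighbours; whether to exclude them is a modelling choice left open here.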
Geometry & Structure




PDB: 1KX5                                   http://vimeo.com/24047115
   S.Daujat, T. Weiss, F.Mohn, U.C.Lange, C.Ziegler-Birling, U.Zeissler, M.Lappe, D.Schubeler, M.E.Torres-Padilla, R.Schneider (2009). "H3K64 trimethylation
             marks heterochromatin and is dynamically remodeled during developmental reprogramming" Nature Structural and Molecular Biology
x-omics =
                Proteomics
                Metabolomics
                Regulation
                [...] +
                x-Seq Data

                x = ChIP, RNA, BS, ...
some challenges ...
                            different experiments, protocols, samples, coverage ...




                            isolated information silos
                            different data formats
                            mapping & identifier chaos
                            error propagation / annotation bottleneck
                            statistical criteria for (dis-)similarity
                            knowledge lock-up, literature access
                            redundancy / implicit co-ordination
                            TMI & essential info ?
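
To make the "mapping & identifier chaos" item concrete, here is a minimal sketch in Python of merging records from two sources that name the same gene differently. The tiny alias table is hand-written for illustration (HER2 and NEU are common aliases of ERBB2); the function names, source names and record values are placeholders, and a real pipeline would draw on a curated mapping resource instead.

```python
# Hand-written alias table, for illustration only.
ALIASES = {
    "ERBB2": "ERBB2",
    "HER2": "ERBB2",
    "NEU": "ERBB2",
    "MLH1": "MLH1",
}


def canonical(identifier, aliases=ALIASES):
    """Map a source-specific identifier to one canonical symbol, or None if unknown."""
    return aliases.get(identifier.strip().upper())


def integrate(*sources):
    """Merge records from several (name, records) sources under canonical identifiers.

    Returns (merged, unmapped) so that unknown identifiers are reported
    instead of being dropped silently.
    """
    merged, unmapped = {}, []
    for source_name, records in sources:
        for identifier, value in records.items():
            key = canonical(identifier)
            if key is None:
                unmapped.append((source_name, identifier))
                continue
            merged.setdefault(key, {})[source_name] = value
    return merged, unmapped


expression = ("expression", {"HER2": "placeholder_1", "mlh1": "placeholder_2"})
mutations = ("mutations", {"ERBB2": "placeholder_3", "unknown_id": "placeholder_4"})
merged, unmapped = integrate(expression, mutations)
```

Returning the unmapped identifiers alongside the merged view is a deliberate choice: silent loss at the mapping step is one way errors propagate through an integration pipeline.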
                            ...
"Blind monks examining an elephant" by Itcho Hanabusa
    Title: 「衆瞽探象之圖」. A work by Hanabusa Itchō (英一蝶, 1652 – 1724).

Let’s move on ...
http://5stardata.info/

                                   5★ Open Data




                   Tim Berners-Lee, the inventor of the Web and Linked Data initiator,
                         suggested a 5 star deployment scheme for Open Data.

                  ★ make your stuff available on the Web (whatever format) under an Open License

                  ★ ★ make it available as structured data (machine REadable, e.g. Excel*)
                  * http://dontuseexcel.wordpress.com/2013/02/07/dont-use-excel-for-biological-data/

                  ★ ★ ★ use non-proprietary Open Formats (e.g. CSV instead of Excel)

                  ★ ★ ★ ★ use URIs to denote things, so that people can point at your stuff

                  ★ ★ ★ ★ ★ Link your Data to other data to provide (networked)
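
The upper steps of the scheme (an open format, URIs, links to other data) can be sketched with the Python standard library. In this illustration the records are written as CSV and each record is linked to an external resource with one N-Triples-style statement using `rdfs:seeAlso`. The `http://example.org/` namespace, the function names and the record values are hypothetical placeholders.

```python
import csv
import io

RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
BASE = "http://example.org/dataset/"  # hypothetical namespace, not a real resource


def to_csv(rows, fieldnames):
    """Serialise records in a non-proprietary open format (three stars)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_link_statements(rows):
    """Give each record a URI (four stars) and link it to other data (five stars)."""
    return [
        f"<{BASE}{row['id']}> <{RDFS_SEE_ALSO}> <{row['see_also']}> ."
        for row in rows
    ]


# Placeholder record, for illustration only.
rows = [
    {"id": "item1", "label": "first item", "see_also": "http://example.org/other/item1"},
]
csv_text = to_csv(rows, ["id", "label", "see_also"])
links = to_link_statements(rows)
```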

Giant Global Graph
     An important related concept that overlaps with the GGG is the
     "Semantic Web", which relates to decentralized information. (≄Web3.0)




The next Web of open, linked data:
                       Tim Berners-Lee on TED.com
                            http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html

         http://www.ted.com/talks/tim_berners_lee_the_year_open_data_went_worldwide.html




Web of biological Data




                            linked open scientific data grass-roots movement

Protein Interaction Networks: scale-free, small-world

 Park, J., M. Lappe, et al. (2001). "Mapping protein family interactions: intramolecular and intermolecular
 protein family interaction repertoires in the PDB and yeast." Journal of Molecular Biology 307(3): 929-38
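
"Scale-free" refers to a heavy-tailed degree distribution: most nodes have few interaction partners while a few hubs have many. As a first look at that property, the sketch below tabulates how many nodes have each degree in an edge list. It is plain Python on a made-up star-shaped toy network and does not fit or test a power law.

```python
from collections import Counter


def degree_distribution(edges):
    """Count how many nodes have each degree in an undirected edge list."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return Counter(degree.values())


# Toy network: one hub connected to four leaves.
star = [("hub", f"leaf{i}") for i in range(4)]
distribution = degree_distribution(star)
```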
modelling information gain: Tandem-Affinity Purifications in-silico




                              Michael Lappe and Liisa Holm
                              "Unraveling protein interaction networks
                              with near-optimal efficiency." (2004)
                              Nature Biotechnology 22(1): 98-103
Toward interoperable bioscience data
  Susanna-Assunta Sansone et al., Nature Genetics, Feb 2012

                                             “to make full use of research data, the
                                             bioscience community needs to adopt
                                             technologies and reward mechanisms that
                                             support interoperability and promote the
                                             growth of an open ‘data commoning’ culture.”

                                             The open source ISA metadata tracking tools
                                             facilitate standards-compliant collection,
                                             curation, local management and reuse of
                                             datasets in an increasingly diverse set of life
                                             science domains.
                                             http://www.isa-tools.org/

                                             http://www.nature.com/ng/journal/v44/n2/pdf/ng.1054.pdf
Free your data ...
                            Biology and BioInformatics are data-driven sciences

                            think beyond your own hard drive and the current paper
                            evaluate and embrace new technologies (LOD, GraphDBs)
                            rethink current incentive systems: no more cargo-cult

                            make it useful, reusable
                            and sustainable

                            Open Access, Open Source
                            Open Linked Data Mash-Ups

                            focus on your science
Thank you!




                                wood engraving by an unknown artist, in “L'atmosphère:
                                 météorologie populaire” (1888) Camille Flammarion




Hubble Space Telescope / NASA

