2. Carole Goble, Stuart Owen, Jacky Snoep, Wolfgang Mueller, Olga Krebs, Quyen Nguyen, Natalie Stanford, Katy Wolstencroft, Peter Kunszt, Bernd Rinn
also contributing: VLN SEEK team
also contributing: UK SEEK team
9. This all leads to issues with conveying what a project has achieved to funders.
• Papers?
• Data produced?
• Discoveries?
• Presentations?
• Workshops?
• Tutorials?
Defining success and impact of project.
10. We need better ways of
formatting, storing, and sharing
data and models.
11. SEEK is a commons originally designed for
centralizing information and assets for large
consortia projects.
13. …and their data and models are uploaded
to projects within the SEEK database.
14. SEEK has varied functionality.
• Yellow pages, manage SOPs and link to investigations, studies, assays, specimens and samples.
• Find my peers.
• Creating and sharing SOPs across projects.
• Track my specimens.
• Track different versions of my model.
• Data viewing functionality; ISA framework for linking studies to data, models, SOPs, samples, publications.
• Browse experimental data without downloading them.
• How data, models and SOPs fit together.
• Which data belong with which publication.
15. It works as an aggregated asset manager,
allowing storage on SEEK or linking to assets
held in disparate databases.
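The stored-or-linked distinction can be illustrated with a minimal sketch. This is not SEEK's data model: the `Asset` class, its field names and the example URL are hypothetical and only show one way to represent an asset that is either held locally or referenced elsewhere.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Asset:
    """Hypothetical asset record: stored locally OR linked, never both."""
    title: str
    content: Optional[bytes] = None    # set when the file is stored in the commons
    remote_url: Optional[str] = None   # set when the asset lives in another database

    def __post_init__(self) -> None:
        # Exactly one of the two locations must be given.
        if (self.content is None) == (self.remote_url is None):
            raise ValueError("an asset must be either stored or linked")

    @property
    def location(self) -> str:
        return "stored" if self.content is not None else "linked"


stored = Asset("HK kinetics", content=b"time,rate\n0,0.1\n")
linked = Asset("HK model", remote_url="https://example.org/models/hk")  # placeholder URL
```

Both kinds of asset can then be listed side by side in one catalogue, which is the point of an aggregated manager.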
16. Investigation: Glucose metabolism in P. falciparum trophozoites
Study:
Model construction
Study:
Model validation
Assay: LDH
Assay: PK
Assay: ENO
Assay: PGM
Assay: PGK
Assay: GAPDH
Assay: TPI
Assay: ALD
Assay: PFK
Assay: PGI
Assay: HK
Assay: GLCtr
Assay: PYRtr
Assay: LACtr
Assay: G3PDH
Assay: GLYtr
Assay: ATPase
Data: GLCtr
Model: GLCtr
Data: HK
Model: HK
Steady state
Incubation
penkler1
Validation data
penkler2
Validation data
...
...
SOP: GLCtr
SOP: HK
...
SOP: Validation
Assay: Culturing
Assay: Lysate prep.
SOP: Culturing
SOP: Lysate prep.
It allows published work and all associated
data and files to be organised in an ISA
(Investigation, Study, Assay) format.
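As a rough illustration of the ISA hierarchy, the sketch below nests assays inside studies inside an investigation, with data, models and SOPs attached to each assay. The class and field names are hypothetical and are not SEEK's schema. Placing the GLCtr and HK assays under the "Model construction" study is an assumption made for the example; the flattened slide text does not show which study each assay belongs to.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    name: str
    data: List[str] = field(default_factory=list)
    models: List[str] = field(default_factory=list)
    sops: List[str] = field(default_factory=list)


@dataclass
class Study:
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    studies: List[Study] = field(default_factory=list)

    def assets(self, kind: str) -> List[str]:
        """Collect every asset of one kind ('data', 'models' or 'sops')."""
        return [
            asset
            for study in self.studies
            for assay in study.assays
            for asset in getattr(assay, kind)
        ]


# Assumed grouping: both assays are filed under "Model construction".
inv = Investigation(
    "Glucose metabolism in P. falciparum trophozoites",
    studies=[
        Study(
            "Model construction",
            assays=[
                Assay("GLCtr", data=["GLCtr"], models=["GLCtr"], sops=["GLCtr"]),
                Assay("HK", data=["HK"], models=["HK"], sops=["HK"]),
            ],
        ),
        Study("Model validation"),
    ],
)

print(inv.assets("sops"))
```

Because every data file, model and SOP hangs off an assay, questions such as "which SOPs belong to this investigation?" become a simple walk down the tree.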
22. There are many Systems Biology
standards available.
Minimal Information
Models
Standard Formats
Ontologies
Data Models Simulation Results
[Nicolas Le Novere]
MAGE-TAB
Standard Formats
RDF annotations
23. …But the barrier to researchers using
standard formats and annotation
can seem great.
26. We use it to generate templates for different
types of assay data.
Excel workbook loaded into
RightField with multiple
worksheets
27. Suitable ontologies are selected and used
to annotate cells for associated data input.
Selected parent term
from the ontology
Methods for specifying
ontology terms
Term lists for
selected cells
Value Type
and Property
28. Scientists are able to use the templates in
Excel, where the annotations take the form
of drop down menus or data entry cells.
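The idea of an annotated cell can be sketched in a few lines. This is a conceptual model only, not RightField's implementation or file format: the `CellAnnotation` class, the ontology IRI, the property name and the term labels are all made up for illustration. A cell with a list of allowed terms behaves like a dropdown, and a cell with no list behaves like a free-text data entry cell.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class CellAnnotation:
    """Hypothetical record of what an annotated cell may contain."""
    ontology_iri: str               # reference to the originating ontology
    ontology_version: str
    property_name: str              # what the cell value describes
    allowed_terms: FrozenSet[str]   # empty set means a free-text (literal) cell

    def accepts(self, value: str) -> bool:
        # Free-text cells accept anything; dropdown cells accept listed terms only.
        return not self.allowed_terms or value in self.allowed_terms


def invalid_cells(values: Dict[str, str],
                  annotations: Dict[str, CellAnnotation]) -> List[str]:
    """Return the references of filled cells whose value is not an allowed term."""
    return sorted(
        ref for ref, ann in annotations.items()
        if ref in values and not ann.accepts(values[ref])
    )


annotations = {
    "B2": CellAnnotation(
        "http://example.org/assay-ontology",   # placeholder IRI
        "1.0",
        "assay type",
        frozenset({"enzyme kinetics", "metabolite transport"}),
    ),
    "B3": CellAnnotation("http://example.org/assay-ontology", "1.0",
                         "comment", frozenset()),
}
values = {"B2": "enzyme kinetcs", "B3": "measured at 37 C"}

print(invalid_cells(values, annotations))
```

In this run the misspelt term in B2 is reported and the free-text cell B3 is accepted, which is the kind of error a dropdown prevents at data-entry time.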
29. Tools like RightField are
reducing the uptake barriers for generating
formatted and annotated data and models.
31. “Ruin is the destination toward which
all men rush, each pursuing his own
best interest in a society that believes
in the freedom of the commons.”
- Garrett Hardin, The Tragedy of the Commons.
32. To find out more about FAIRdom
please visit our website.
www.fair-dom.org
Editor's notes
What doesn’t this data tell us? Whether it is experimental data or model data. What the reactions/species mean. If it is an experiment, what type of experiment was used. Whether there was an SOP associated with it. Etc.
Linking methods with data and linking models with data
We adopted the ISA framework – Investigation, Study and Assay – which provides a scaffold and experimental context for linking data and models.
We also include and link Standard Operating Procedures (SOPs). We currently don’t do any RDF generation from the contents of these, but hope to in the future.
Lots of use of spreadsheets
“Schema.org for Systems Biology”
RightField is an administrator’s tool, to be used by an informatician. The users of the spreadsheets need never see it or the scary ontology stuff.
Uses OWL API
There is an upper memory limit on the size of an ontology RightField can open, due to limitations in the OWL API we use. However, once the ontology is open in RightField, its size has no impact on the saved spreadsheet. RightField doesn't store the entire ontology inside the spreadsheet, just the sets of terms used for the annotations and a reference to the originating ontology and version. When the list of terms for a given cell becomes very long, we do have a problem with the dropdown box becoming unmanageable. To solve this we are looking at adding a feature that converts it into an auto-completion type cell rather than a dropdown box. Since this would involve a macro or plugin (which we like to avoid), we would make it an optional feature that the user has to enable explicitly.
Multiple ontologies
The scientist never sees RightField, just a normal, basic Excel spreadsheet with dropdown boxes for controlled terms (or text boxes for literals). By default we highlight the marked-up cells in yellow, but this can be changed, and the cells can even be moved about without affecting the tracking of the ontologies or terms used.
Value proposition to users