Federated microarray gene expression repository using MOLGENIS and MAGE-TAB
1. Towards a federated microarray gene expression repository using MOLGENIS and MAGE-TAB. Alexandros Kanterakis, Tomasz Adamusiak, Juha Muilu, Helen Parkinson, Despoina Antonakaki, Morris A. Swertz
2. About BBMRI-NL. Biobank research infrastructure. Exploit the wealth of information in microarray and GWAS data, currently fragmented between individual biobanks (>6500 samples).
3. Objectives (1/2). Establish: a web-based national repository for microarray gene expression data. Populate: with well-annotated microarray experiments. Share: the software as a ‘microarray database in-a-box’ such that all BBMRI biobanks can reuse it locally. Requirements: user interface; programmatic interfaces for analysis protocols and data federation; extendable to diverging local needs.
4. Objectives (2/2). Combine gene expression data from multi-platform microarray experiments with GWAS studies in order to create novel eQTL datasets for complex diseases.
5. MAGE-TAB (1/2). MAGE-TAB (2006): simple, human-readable, tab-delimited. Consists of 4 parts: Investigation Description Format (IDF): general information, contact details, bibliographic references, ... Array Design Format (ADF): what sequence is located at each position on an array and what the annotation of this sequence is. Raw and processed data files: ASCII or binary files.
6. MAGE-TAB (2/2) Sample and Data Relationship Format (SDRF). Relationships between samples, arrays, extracts, hybridizations and other objects used in the investigation.
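Because IDF and SDRF are plain tab-delimited text, a minimal reader needs only a standard CSV parser. The sketch below is illustrative and is not the parser built in this project: the function names `parse_idf` and `parse_sdrf` and the example content are assumptions. It also assumes SDRF column names are unique, which real SDRF files do not guarantee (columns such as `Protocol REF` may repeat).

```python
import csv
import io

def parse_idf(text):
    """Read IDF text: each row is a tag followed by zero or more values."""
    idf = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        # Skip blank rows and comment rows.
        if not row or not row[0].strip() or row[0].startswith("#"):
            continue
        idf[row[0].strip()] = [v.strip() for v in row[1:] if v.strip()]
    return idf

def parse_sdrf(text):
    """Read SDRF text: a header row, then one row per sample-to-data path.

    Simplification: rows become dicts keyed by column name, so repeated
    column names would overwrite each other.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header = [h.strip() for h in rows[0]]
    return [
        dict(zip(header, (cell.strip() for cell in row)))
        for row in rows[1:]
        if any(cell.strip() for cell in row)
    ]

# Made-up example content, for illustration only.
IDF_EXAMPLE = (
    "Investigation Title\tExample study\n"
    "Person Last Name\tDoe\tRoe\n"
    "SDRF File\texample.sdrf.txt\n"
)
SDRF_EXAMPLE = (
    "Source Name\tSample Name\tHybridization Name\tArray Data File\n"
    "source1\tsample1\thyb1\tdata1.txt\n"
    "source2\tsample2\thyb2\tdata2.txt\n"
)

idf = parse_idf(IDF_EXAMPLE)
sdrf = parse_sdrf(SDRF_EXAMPLE)

# Follow the SDRF relationships from each sample to its data file.
sample_to_file = {r["Sample Name"]: r["Array Data File"] for r in sdrf}
```

Here `idf["SDRF File"]` names the SDRF that belongs to the investigation, and `sample_to_file` shows how the SDRF links samples to data files through the hybridizations.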
7. MAGE-TAB Object Model. From the MAGE-TAB specification we created a data model* in XML format and parsers for MAGE-TAB files. http://www.mged.org/mage-tab/MAGE-TABv1.0.pdf http://magetab-om.sourceforge.net/magetab_idf.xml *A data model is the set of definitions of classes, elements and properties of the data.
9. MOLGENIS MAGE-TAB. From the MAGE-TAB Object Model we created a web environment for managing microarray experiments: 850 lines of maintainable code, 60K lines of automatically generated code.
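The ratio of hand-written to generated code comes from model-driven generation: a compact XML model is expanded mechanically into repetitive source code. The toy sketch below illustrates the idea only. The XML element and attribute names, the `TYPE_MAP` table and the `generate_classes` function are hypothetical; this is neither the MOLGENIS model schema nor its generator, which produces a full web application rather than plain Python classes.

```python
import xml.etree.ElementTree as ET

# Hypothetical model, loosely in the spirit of an XML data model.
MODEL_XML = """
<model name="magetab_example">
  <entity name="Investigation">
    <field name="title" type="string"/>
    <field name="pubmed_id" type="int"/>
  </entity>
  <entity name="Sample">
    <field name="name" type="string"/>
  </entity>
</model>
"""

# Assumed mapping from model types to Python type names.
TYPE_MAP = {"string": "str", "int": "int"}

def generate_classes(model_xml):
    """Return Python source text with one class per <entity> in the model."""
    root = ET.fromstring(model_xml.strip())
    lines = []
    for entity in root.findall("entity"):
        fields = entity.findall("field")
        args = ", ".join(
            f'{f.get("name")}: {TYPE_MAP[f.get("type")]}' for f in fields
        )
        lines.append(f'class {entity.get("name")}:')
        lines.append(f"    def __init__(self, {args}):")
        for f in fields:
            lines.append(f'        self.{f.get("name")} = {f.get("name")}')
        lines.append("")
    return "\n".join(lines)

source = generate_classes(MODEL_XML)

# Load the generated source and use one of the generated classes.
namespace = {}
exec(source, namespace)
investigation = namespace["Investigation"]("Example study", 12345)
```

Adding a field to the model and regenerating updates every derived class at once, which is why the maintained part can stay small while the generated part grows.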
The envisioned system should include suitable user interfaces for researchers and programmatic interfaces for analysis protocols and data federation, and should be easy to extend to accommodate diverging local needs. None of the available (open source) systems seemed to provide this. Meanwhile, GEN2PHEN [2] started ‘database-in-a-box’ projects, including a microarray system based on the MAGE-TAB file format [3] and the MOLGENIS [4,5] biosoftware platform. BBMRI-NL chose to sponsor this project, with the following results: