SADI for GMOD is a collection of ready-made SADI services for accessing sequence feature data in RDF form. The services were developed as an add-on for the GMOD (Generic Model Organism Database) project, which is a popular toolkit for building model organism databases and their associated websites (e.g. FlyBase).
SADI for GMOD: Bringing Model Organism Data onto the Semantic Web
1. SADI for GMOD: Bringing Model Organism Databases onto the Semantic Web
Ben Vandervalk, Luke McCarthy, Edward Kawas, Mark Wilkinson
James Hogg Research Centre, Heart + Lung Institute
University of British Columbia
http://code.google.com/p/sadi/wiki/SADIforGMOD
2. SADI for GMOD: Background
SADI (Semantic Automated Discovery and Integration)
• Standard for Web services that consume/generate RDF
• Motivation: automated integration of bioinformatics data and software
GMOD (Generic Model Organism Database)
• Toolkit for building a model organism database and website
• Collection of related open source projects: e.g. Chado, Gbrowse, Pathway Tools
• Many sites use GMOD components: FlyBase, BeetleBase, DictyBase, etc.
3. SADI in a Nutshell
• to invoke a SADI service:
o HTTP POST an RDF document to the service URI
o e.g. $ curl --data-binary @input.rdf http://sadiframework.org/examples/hello
• to get service metadata:
o HTTP GET on service URL
o returns an RDF document with service name, description, etc.
o e.g. $ curl http://sadiframework.org/examples/hello
• structure of input/output data is described in OWL
o service provider specifies one input OWL class and one output OWL class
• strengths of SADI
o no framework-specific messaging formats or ontologies
o supports batch processing of inputs
o supports long-running services (asynchronous services)
more info: http://sadiframework.org/
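The two HTTP interactions above can be sketched in Python with only the standard library. This is a minimal sketch, not part of the SADI distribution: the function names are hypothetical, and the `application/rdf+xml` content type is an assumption (the curl example on the slide sets no content type, so adjust it to match the serialization of your input document).

```python
import urllib.request

# Example service URI taken from the slide.
HELLO_SERVICE = "http://sadiframework.org/examples/hello"


def build_invocation(service_uri, rdf_bytes, content_type="application/rdf+xml"):
    """Build the HTTP POST that invokes a SADI service with an RDF document.

    The content type is an assumption; it should match how rdf_bytes is serialized.
    """
    return urllib.request.Request(
        service_uri,
        data=rdf_bytes,
        headers={"Content-Type": content_type, "Accept": content_type},
        method="POST",
    )


def build_metadata_request(service_uri):
    """Build the HTTP GET that fetches the service's RDF metadata."""
    return urllib.request.Request(service_uri, method="GET")


def send(request, timeout=30):
    """Send a prepared request and return the raw response body as bytes."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Typical use would be `send(build_invocation(HELLO_SERVICE, open("input.rdf", "rb").read()))` to invoke the service and `send(build_metadata_request(HELLO_SERVICE))` to read its description. Building the request separately from sending it keeps the request shape easy to inspect without network access.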
4. SADI for GMOD
• SADI services for accessing sequence feature data
• implemented as Perl CGI scripts
Service Name                    | Input               | Relationship              | Output
get_feature_info                | database identifier | is about                  | feature description
get_features_overlapping_region | genomic coordinates | overlaps                  | collection of feature descriptions
get_sequence_for_region         | genomic coordinates | is represented by         | DNA, RNA, or amino acid sequence
get_child_features              | feature description | has part / derives into   | collection of feature descriptions
get_parent_feature              | feature description | is part of / derives from | collection of feature descriptions
5. SADI for GMOD: Structure of Service Input/Output RDF

Input RDF (N3), sent by HTTP POST to get_feature_info:

    @prefix lsrn: <http://purl.oclc.org/SADI/LSRN/> .
    @prefix GeneID: <http://lsrn.org/GeneID:> .

    GeneID:49962
        a lsrn:GeneID_Record ;
        sio:SIO_000008 [                  # p = 'has attribute'
            a lsrn:GeneID_Identifier ;
            sio:SIO_000300 "49962"        # p = 'has value'
        ] .

Output RDF (N3), returned by get_feature_info:

    @prefix lsrn: <http://purl.oclc.org/SADI/LSRN/> .
    @prefix GeneID: <http://lsrn.org/GeneID:> .
    @prefix FlyBase: <http://flybase.org/cgi-bin/sadi.gmod/feature?id=> .
    @prefix GenBank: <http://lsrn.org/GB:> .

    # p = 'is about'
    GeneID:49962 sio:SIO_000332 FlyBase:FBgn0040037 .

    # feature
    FlyBase:FBgn0040037
        a SO:SO_0000704 ;                 # o = 'gene'
        range:position [
            a range:RangedSequencePosition ;
            sio:SIO_000053                # p = 'has proper part'
                [ a range:StartPosition ; sio:SIO_000300 26994 ] ;
            sio:SIO_000053                # p = 'has proper part'
                [ a range:EndPosition ; sio:SIO_000300 32391 ] ;
            range:in_relation_to _:minus_strand_seq
        ] .

    _:minus_strand_seq
        sio:SIO_000011 [                  # p = 'represents'
            a strand:MinusStrand ;
            sio:SIO_000093 GenBank:AE014135   # p = 'is proper part of'
        ] .

    # reference feature (chromosome)
    FlyBase:4                             # chromosome 4
        a SO:SO_0000105 .                 # o = 'chromosome arm'
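The input document for get_feature_info follows a fixed pattern, so it can be generated from a GeneID. The sketch below is illustrative only: the function name `gene_id_input_n3` is hypothetical, the `lsrn` and `GeneID` prefixes are copied from the slide, and the `sio` namespace URI is an assumption because the slide does not declare it.

```python
# lsrn and GeneID prefixes come from the slide; the sio namespace is an
# assumption (the slide uses the sio: prefix without declaring it).
PREFIXES = """\
@prefix lsrn: <http://purl.oclc.org/SADI/LSRN/> .
@prefix GeneID: <http://lsrn.org/GeneID:> .
@prefix sio: <http://semanticscience.org/resource/> .
"""


def gene_id_input_n3(gene_id):
    """Return an N3 input document for get_feature_info for one GeneID."""
    # Reject anything that is not a plain ASCII number, so the value is safe
    # to place in both the prefixed name and the string literal.
    if not (gene_id.isascii() and gene_id.isdigit()):
        raise ValueError("GeneID must be a non-empty string of digits")
    return (
        PREFIXES
        + "\n"
        + f"GeneID:{gene_id}\n"
        + "    a lsrn:GeneID_Record ;\n"
        + "    sio:SIO_000008 [                # 'has attribute'\n"
        + "        a lsrn:GeneID_Identifier ;\n"
        + f'        sio:SIO_000300 "{gene_id}"      # \'has value\'\n'
        + "    ] .\n"
    )


if __name__ == "__main__":
    # Prints an input document for the GeneID used on the slide.
    print(gene_id_input_n3("49962"))
```

The resulting text is what would be sent in the body of the HTTP POST to the service.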
8. Acknowledgements
Team
Mark Wilkinson: Principal Investigator
Luke McCarthy: Lead Programmer, SADI & SHARE
Edward Kawas: Perl Programmer, SADI
Funding
Microsoft Research
http://sadiframework.org/