Swoogle is a search engine and crawler for ontologies, documents, terms and data published on the Semantic Web. It crawls and indexes documents written in RDF and OWL and provides search services through a web interface and web services. Its objective is to organize the physically distributed Semantic Web documents in a systematic way so that humans and agents can easily search and query the repository. It allows users to search for existing ontologies matching their needs and domains before creating new ones.
2. What is Swoogle?
• Started as a research project of the Ebiquity research group in
University of Maryland
• Swoogle is a search engine for Semantic Web ontologies, documents,
terms and data published on the Web
• A distributed online repository of Semantic Web documents (SWDs)
• A crawler-based indexing and retrieval system for the Semantic Web
• Crawls and discovers documents written in RDF and OWL
• provides services to human users through a browser interface and to
software agents via RESTful web services
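Since agents reach Swoogle through RESTful web services, a query is just a GET URL. The sketch below shows how an agent might build such a URL; the host, path, and parameter names (`queryType`, `searchString`) are illustrative assumptions, not the documented Swoogle API.

```python
from urllib.parse import urlencode

# Hypothetical sketch of a Swoogle-style REST query URL.
# Endpoint and parameter names are assumptions for illustration.
def build_query_url(base, query_type, search_string):
    """Build the GET URL an agent would fetch for a search."""
    params = urlencode({"queryType": query_type,
                        "searchString": search_string})
    return f"{base}?{params}"

url = build_query_url("http://swoogle.example.org/q",
                      "search_swd_ontology", "person")
```

An agent would fetch this URL and parse the returned result list; the same service backs the human-facing browser interface.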
3. Objective of Swoogle
• More and more SWDs, both ontologies and instance data, are
physically distributed all over the web
• A retrieval system that organizes these documents in a systematic
way
• Both humans and agents can easily conduct searches and queries
against this repository
4. Why use Swoogle?
• Avoid creating new ontologies
• Need for reuse
5. Services
• Search Semantic Web ontologies
• Search Semantic Web instance data
• Search Semantic Web terms, i.e., URIs that have been defined as
classes and properties
• Provide metadata of Semantic Web documents and support browsing
the Semantic Web
• Archive different versions of Semantic Web documents
6. What does Swoogle search?
• Find out whether suitable ontologies matching the user’s needs
already exist in the domain of interest
• User inputs a specific term
• Swoogle replies with existing ontologies that use the entered
term
• Follow the link and see whether the provided ontology satisfies the
need
• Query SWDs with constraints on classes and properties used by them
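The term-to-ontology lookup described on this slide can be pictured as a simple index lookup. The sketch below uses a small in-memory index; its contents are invented examples, not real Swoogle data.

```python
# Illustrative sketch: map term local names to the ontologies that
# define or use them, as a stand-in for Swoogle's term index.
# The entries below are invented for illustration.
TERM_INDEX = {
    "Person": ["http://xmlns.com/foaf/0.1/",
               "http://example.org/hr.owl"],
    "Project": ["http://example.org/hr.owl"],
}

def find_ontologies(term):
    """Return URIs of ontologies that use the given term."""
    return TERM_INDEX.get(term, [])
```

The user would then follow each returned URI and inspect whether that ontology satisfies the need.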
8. Swoogle Architecture
• SWD discovery component
• Metadata creation component
• Data analysis component
• Indexation and retrieval component
• User interface
9. Swoogle Crawler
• The crawler visits the web to collect SWDs, ignoring all other
documents (HTML, PDF, image files)
• For each SWD discovered, Swoogle extracts metadata from the
document and indexes it into an information retrieval system for later
searches and queries
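The metadata-extraction step can be sketched roughly as follows: parse an RDF/XML document and collect the (namespace, localname) pairs of the terms it uses, in line with the note later in this deck that Swoogle indexes namespaces and localnames rather than full triples. The real system uses a proper RDF parser (Jena); this stdlib version is only an approximation for RDF/XML input.

```python
import xml.etree.ElementTree as ET

# Rough sketch of metadata extraction from an RDF/XML document:
# collect (namespace, localname) pairs of the element tags used.
def extract_terms(rdf_xml):
    terms = set()
    for elem in ET.fromstring(rdf_xml).iter():
        if elem.tag.startswith("{"):        # Clark notation: {ns}local
            ns, local = elem.tag[1:].split("}", 1)
            terms.add((ns, local))
    return terms

sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                     xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://example.org/alice"/>
</rdf:RDF>"""
```

These pairs are what a later indexing stage would store for term search.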
10. How does Swoogle crawl the semantic web?
• Manual submission
• Google-based meta-crawling
• Bounded HTML crawling
• RDF crawling
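The meta-crawling and bounded HTML crawling strategies above both need to decide which discovered links are worth fetching as candidate SWDs. A minimal filtering sketch, assuming a heuristic based on typical RDF file extensions (the speaker notes mention .rdf and .owl; the fuller list here is an assumption):

```python
from urllib.parse import urlparse

# Heuristic URL filter a Swoogle-style crawler might apply:
# keep links with RDF-bearing extensions, skip html/pdf/images.
# The extension list is an illustrative assumption.
RDF_EXTENSIONS = (".rdf", ".owl", ".n3", ".nt")

def is_candidate_swd(url):
    """Heuristically decide whether a URL likely points at an SWD."""
    return urlparse(url).path.lower().endswith(RDF_EXTENSIONS)
```

Extension matching alone is only a first pass; documents that survive the filter still have to be parsed (e.g., by Jena) to confirm they really are SWDs.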
11. Check URL
• A Semantic Web archive service that helps users
• check whether a URL has been indexed
• track previous versions of the Semantic Web document retrieved from the
URL
13. Submit URL
• Submit a new URL or a web page containing hyperlinks to the user’s
Semantic Web documents
• Swoogle will run regular crawling starting from the provided URL
15. • The index is updated regularly
• Already-indexed or outdated URL submissions are ignored
• Documents that are not accessible will eventually be removed from
the database
Because if everyone comes up with a new ontology, there is no shared understanding about anything and no interoperability between two agents.
The entire Swoogle website is built on these web services as well.
Swoogle uses Google to find SWDs; Google provides an API to add constraints to a search, e.g., files with .rdf or .owl extensions.
Jena confirms that the discovered documents are SWDs.
Software agents and services can use the Swoogle APIs.
There are two crawlers, a focused crawler and SwoogleBot, which keep the information about SWDs up to date.
Swoogle creates metadata for each SWD for the necessary computations and navigation.
It calculates relationships among SWDs and their ranks (OntoRank and TermRank).
Currently, Swoogle indexes only some metadata (namespace and localname) about Semantic Web documents. It neither stores nor searches all the triples in a Semantic Web document the way a triple store does.
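The OntoRank/TermRank note above can be illustrated with a toy PageRank-style iteration over a small SWD reference graph. The graph, damping factor, and iteration count are invented for illustration; the actual OntoRank weighting scheme is not reproduced here.

```python
# Toy sketch in the spirit of OntoRank/TermRank: iterative
# PageRank-style ranking over a graph of SWD references.
def rank(graph, damping=0.85, iters=50):
    """graph: dict mapping each SWD to the SWDs it references."""
    nodes = list(graph)
    r = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for n, outs in graph.items():
            if outs:
                share = damping * r[n] / len(outs)
                for m in outs:
                    new[m] += share
            else:
                # dangling node: distribute its mass uniformly
                for m in nodes:
                    new[m] += damping * r[n] / len(nodes)
        r = new
    return r
```

An SWD referenced by many others ends up with a higher rank, which is the intuition behind ordering search results by OntoRank.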