Libraries play a critical role in the dissemination of knowledge and serve as its repositories. The Internet has been instrumental in delivering digital information worldwide. With the advent of the Semantic Web, there has been a paradigm shift from digital libraries (DL) to semantic digital libraries (SemDL) to address the issues and challenges of DLs. The Semantic Web aims at transforming the current web, dominated by unstructured and semi-structured documents, into a "web of data". A SemDL allows the system to assist end users in retrieving the most relevant content with respect to a description of their information needs. This presentation considers semantic techniques from the perspective of DLs and how they enhance the functioning of DLs. The key semantic technologies in the context of DLs are XML, XML Schema, RDF, RDF Schema, and the Web Ontology Language (OWL). JeromeDL is one success story that shows the potential of semantic techniques to address the challenges of DLs and to improve browsing and searching of resources. Future work tends to focus more keenly on the sharing of user knowledge and not merely on information retrieval. One outcome is the advent of the Social Semantic Digital Library (SSDL), which improves user benefits through richer user interfaces and social networking.
3. “My two favourite things in life are libraries and
bicycles. They both move people forward
without wasting anything”
~Peter Golkin~
4.
5. What is a library?
A building or room containing
collections of
books, periodicals, and sometimes
films and recorded music for use
or borrowing by the public or the
members of an institution.
6. An electronic library (colloquially referred
to as a digital library) is a focused
collection of digital objects that can
include text, visual, audio, video
materials, stored as electronic
media formats along with means for
organizing, storing, and retrieving them.
10. The Semantic Web is an extension of the current
web in which information is given well-defined
meaning, better enabling computers and people to
work in co-operation. [Tim Berners-Lee, 2001]
11.
12. How is semantics related to the Web?
The Semantic Web aims at converting the current
web, dominated by unstructured and semi-structured
documents, into a "web of data".
18. The content of these sites is fine, but there
are no linkages between the data.
So visitors find it difficult to get all the
information they need quickly and easily.
20. Using code we can create
relationships between
websites, people and
events…
These can then be
understood by the browser
and interpreted in a
helpful way
21. The semantic web
With all this data being
able to be displayed
simply it provides a much
richer user experience and
offers information that
previously might not have
been exposed.
22. Well, that's exciting stuff. But how do
we go about getting on board with the
Semantic Web?
There are a few different ways; we'll
have a brief look at them.
26. World Wide Web (WWW or Web), as defined in Wikipedia
It is a system of interlinked hypertext documents
accessed via the Internet.
Who proposed it? Tim Berners-Lee proposed the WWW in 1989.
INTERNET: a network of computers
WEB: a service which runs on the network
27. Web 1.0
[Figure: screenshot of msn.com from the year 1995]
Most of the pages were static
There were only images (mostly animated GIFs) and hyperlinks
Readers or users were unable to contribute to the site
28. Web 2.0
Pages are made dynamic
Users or readers are allowed to participate with the website
and contribute their views to web.
Technologies widely used in Web 2.0
30. Towards a Semantic Web
-Tasks often require combining data on the Web
-Humans combine this information easily
-Sorting catalogues in the Web environment
-Making the web more meaningful
32. How HTML5 helps to make a semantic web
Part of a normal webpage:
<dl>
<dt>Name</dt>
<dd>Mark Zuckerberg</dd>
<dt>Position</dt>
<dd><span>CEO</span> for <span>Facebook</span></dd>
</dl>
Make it more meaningful with microdata attributes:
<dl itemscope>
<dt>Name</dt>
<dd itemprop="name">Mark Zuckerberg</dd>
<dt>Position</dt>
<dd><span itemprop="title">CEO</span> for <span
itemprop="company">Facebook</span></dd>
</dl>
33.
34. Towards Semantic Digital Library
Parallels between library services and the Semantic Web:
Response to information abundance
• Library services: Libraries developed into digital libraries as the abundance of information increased.
• Semantic Web: The Semantic Web was initiated as a means to more effectively manage and take advantage of the increased amount of digital data.
Missions grounded in service, information access, and knowledge discovery
• Library services: Objectives and goals serve the purpose of facilitating access to information.
• Semantic Web: The Semantic Web strives to allow data to be shared and reused across applications, enterprises, and community boundaries. It is a collaborative effort led by W3C and partners, based on the Resource Description Framework (RDF).
Part of society's fabric
• Library services: Part of life for people from all walks of life, in all types, physically and virtually.
• Semantic Web: If the current Web is any indication of the Semantic Web's reach, which seems quite logical, the Semantic Web will surely impact millions of people's lives daily.
35. Parallels between library services and the Semantic Web (contd.)
Advancement via international and national standards
• Library services: Libraries consolidated the development of cataloging codes; formalized classificatory and verbal systems; and encoding/communication standards: International Standard Bibliographic Description (ISBD), Machine Readable Cataloging (MARC), many metadata schemes, Functional Requirements for Bibliographic Records (1998), and Resource Description and Access (RDA).
• Semantic Web: The Semantic Web has followed a similar path, as evidenced by a collection of information standards: eXtensible Markup Language (XML), RDF, OWL, Friend Of A Friend (FOAF), and Simple Knowledge Organization System (SKOS).
Collaborative spirit
• Library services: In the American Library Association, Association for Library Collections and Technical Services, Cataloging and Classification Section (ALA/ALCTS/CCS), committees review cataloging policies and standards, and interact with international organizations (e.g., IFLA and the Dublin Core Metadata Initiative).
• Semantic Web: All of the enabling technologies/standards listed above (RDF, OWL, FOAF, and SKOS) have been developed through working groups and public calls for comment. The World Wide Web Consortium (W3C), the home of the Semantic Web, involves academic, research, and industry members.
36. Semantic Web Development
Traditional services and their Semantic Web counterparts:
• Collection development → Semantic Web selection
• Cataloging → "Semantic metadata" representation
• Reference → Semantic Web reference service
• Classification → Knowledge representation
38. Two characteristics for the construction of the semantic web-
Downward compatibility
Agents fully aware of a layer should also be able to interpret and use
information written at lower levels. For example, agents aware of the
semantics of OWL can take full advantage of information written in RDF
and RDF Schema.
Upward partial understanding
On the other hand, agents fully aware of a layer should take at
least partial advantage of information at higher levels. For example, an agent
aware only of the RDF and RDF Schema semantics can interpret knowledge
written in OWL partly, by disregarding those elements that go beyond RDF and
RDF Schema.
Semantic web architecture
39. Semantic Web Stack
Illustrates the architecture of the semantic web
Shows the hierarchy of languages used to create the semantic web
Shows how the technologies are organised to make the
semantic web possible
41. UNICODE
Unicode provides a unique number for every character
no matter what the platform.
no matter what the program.
no matter what the language.
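A minimal sketch of the "unique number for every character" idea, using Python's built-in ord() and chr(). The sample characters are chosen only for illustration and are not from the slides.

```python
# Each character maps to exactly one code point, whatever the platform or program.
# Escapes are used so the sample does not depend on the file's own encoding.
samples = ["A", "\u00e9", "\u0905", "\u4e2d"]  # Latin, accented Latin, Devanagari, CJK

code_points = [ord(ch) for ch in samples]

for ch, cp in zip(samples, code_points):
    # Conventional notation is U+ followed by at least four hex digits.
    print(f"{ch!r} -> U+{cp:04X} (decimal {cp})")

# chr() is the inverse mapping: code point back to character.
round_trip = [chr(cp) for cp in code_points]
```

The round trip through chr() shows that the mapping is one-to-one in both directions.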
42.
43. URI
Uniform Resource Identifier (URI) is an Internet Standard. It's a string of
characters used to identify a name or a resource on the Internet. Such
identification enables interaction with representations of the resource over a
network (typically the World Wide Web) using specific protocols.
Schemes specifying a concrete syntax and associated protocols
define each URI.
It has several component parts:
A scheme name (http)
A domain name (www.xxx.com)
A path (/sa/edu/yuc/index.html)
A URI identifies a resource either by location, or by name, or both.
A URI has two specializations, known as URL and URN.
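The component parts listed above can be pulled apart with the standard library's urllib.parse.urlsplit. This is only an illustrative sketch: the URL is the placeholder from the slide, and the URN is a made-up-for-illustration ISBN identifier, not one cited in this deck.

```python
from urllib.parse import urlsplit

# A URL: identifies the resource by location.
url = "http://www.xxx.com/sa/edu/yuc/index.html"
url_parts = urlsplit(url)
print("scheme:", url_parts.scheme)  # the scheme name
print("domain:", url_parts.netloc)  # the domain name
print("path:  ", url_parts.path)    # the path

# A URN: identifies the resource by name (illustrative value only).
urn_parts = urlsplit("urn:isbn:0451450523")
print("scheme:", urn_parts.scheme)
print("name:  ", urn_parts.path)
```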
44. XML: Extensible Markup Language
It is a general-purpose markup language for creating
special-purpose markup languages
Follows the SGML standard (Standard Generalized
Markup Language)
With XML, users can create their own tags
(which is not possible with HTML)
45.
46. RDF: Resource Description Framework
RDF is a general-purpose language for representing information on the web
Useful for representing metadata about Web resources
RDF describes resources (both abstract and concrete subjects) identifiable
via a URI
The RDF/XML syntax of RDF is based on XML
RDF/XML documents are written as XML documents with the root tag rdf:RDF
47. RDF Statements
An RDF statement is described by a triple (S, P, O)
S = Subject of the statement (it's a URIref)
P = Property (Predicate) of the statement (URIref)
O = Object
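As a rough sketch, a triple can be modelled as a three-field record. The subject and object below are the page and creator used in this deck's graph example; writing the predicate "creator" as the Dublin Core URI is an assumption made for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: (subject, predicate, object)."""
    subject: str    # URIref of the resource being described
    predicate: str  # URIref of the property
    obj: str        # a URIref or a literal value


statement = Triple(
    subject="https://www.facebook.com/SudhirShivaramPhotography",
    # Assumption: "creator" is expressed with the Dublin Core element URI.
    predicate="http://purl.org/dc/elements/1.1/creator",
    obj="Sudhir Shivaram",
)

s, p, o = statement  # a triple unpacks into its three parts
print(f"({s}, {p}, {o})")
```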
48. Graphical Representation of an RDF statement
(subject, predicate, object)
[Figure: graph with three labelled parts]
Resource (subject): https://www.facebook.com/SudhirShivaramPhotography
Property Type (predicate): creator
Property Value (object): Sudhir Shivaram
49. RDF-Schema
RDF Schema provides a way of building an object model from which the
actual data is referenced and which tells us what things really mean.
RDFS allows users to define resources with classes, properties and
values.
This allows resources to be defined as instances of classes, and
classes as subclasses of other classes.
52. Ontology?
● Study of existence in philosophy.
● Ontology is the philosophical study of the
nature of being, becoming, existence, or
reality, as well as the basic categories of
being and their relations.
● A data model that represents knowledge as a
set of concepts within a domain and the
relationship between these concepts.
● It is concerned with the fundamental
questions of “what is being?” and “what kinds
of things are there?”
53. Example
● Ontology of the people of DRTC
PEOPLE
  FACULTY
    Prof
    A.Prof
  RES. SCH.
    JRF
    SRF
  MASTER STU.
    1st Year
    2nd Year
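The DRTC example can be sketched as a subclass table with a transitive "is a" check. The grouping of leaf classes under FACULTY, RES. SCH. and MASTER STU. is read off the slide's diagram and is therefore an assumption; the helper names are hypothetical.

```python
# child class -> parent class, using the labels from the slide
SUBCLASS_OF = {
    "FACULTY": "PEOPLE",
    "RES. SCH.": "PEOPLE",
    "MASTER STU.": "PEOPLE",
    "Prof": "FACULTY",
    "A.Prof": "FACULTY",
    "JRF": "RES. SCH.",
    "SRF": "RES. SCH.",
    "1st Year": "MASTER STU.",
    "2nd Year": "MASTER STU.",
}


def ancestors(cls):
    """Walk up the hierarchy and return all superclasses, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_a(cls, target):
    """Subclass relations are transitive: a JRF is a RES. SCH. and also a PEOPLE member."""
    return cls == target or target in ancestors(cls)


print(ancestors("JRF"))
print(is_a("Prof", "PEOPLE"), is_a("JRF", "FACULTY"))
```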
54. Ontology Vocabulary
● Vocabulary ==> context-less list of terms,
with no defined interrelationships.
● Ontology Vocabulary used to describe (a
particular view of) some domain.
-how concepts should be classified.
● Examples:-
1.Man
2.Vegetarian
3. Non-veg
55. SPARQL
● SPARQL Protocol and RDF Query Language
(a recursive acronym)
● SPARQL is an SQL-like language, but it uses
RDF triples and resources both for the matching
part of the query and for returning the results of
the query.
● Since both RDFS and OWL are built on
RDF, SPARQL can be used for querying
ontologies and knowledge bases directly as
well. Note that SPARQL is not only a query
language; it is also a protocol for accessing
RDF data.
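To make the "triples for matching" idea concrete, here is a toy matcher for triple patterns with ?variables. It is not a SPARQL engine and uses no SPARQL library; the ex: and dc: prefixes, the data, and the function names are illustrative assumptions. The query it evaluates corresponds roughly to: SELECT ?title WHERE { ?book dc:creator "S. R. Ranganathan" . ?book dc:title ?title }

```python
TRIPLES = [
    ("ex:book1", "dc:title", "Prolegomena to Library Classification"),
    ("ex:book1", "dc:creator", "S. R. Ranganathan"),
    ("ex:book2", "dc:title", "Colon Classification"),
    ("ex:book2", "dc:creator", "S. R. Ranganathan"),
]


def match(pattern, triple, binding):
    """Unify one triple pattern with one triple; return the extended binding or None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def query(patterns, triples):
    """Evaluate a list of triple patterns by joining them one after another."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match(pattern, triple, binding)) is not None
        ]
    return bindings


results = query(
    [("?book", "dc:creator", "S. R. Ranganathan"), ("?book", "dc:title", "?title")],
    TRIPLES,
)
titles = sorted(b["?title"] for b in results)
print(titles)
```

Both the pattern and the result are expressed in terms of triples and resources, which is the property the slide highlights.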
56. The Proof/Rule layer
● rule: informal notion
● rules are used to perform inference
over ontologies
● rules as a tool for capturing further
knowledge
(not expressible in OWL ontologies)
57. Rule Layer
● Rules that reflect the notion of consequence
are a natural form of expressing knowledge in
some domain of interest.
● Rules come in the form of IF-THEN
constructs and allow various kinds
of complex statements to be expressed.
● The IF part is also called the body of a
rule, while the THEN part is also called its
head.
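A small sketch of IF-THEN rules with a body and a head, applied by forward chaining until nothing new can be derived. The facts and rules are invented for illustration and are propositional only; real rule languages also support variables.

```python
# Each rule is (body, head): IF every fact in the body holds THEN the head holds.
RULES = [
    ({"is_research_scholar"}, "is_library_member"),
    ({"is_library_member", "has_signed_in"}, "may_annotate"),
]


def forward_chain(facts, rules):
    """Repeatedly fire rules whose body is satisfied until a fixed point is reached."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived


result = forward_chain({"is_research_scholar", "has_signed_in"}, RULES)
print(sorted(result))
```

The second rule can only fire after the first has added its head, which is why the loop runs until no rule produces a new fact.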
58. Logic layer
● Rules have been formalised using logic to give
them precise semantics.
● Without such a precise formalisation they are
vague and ambiguous, and thus problematic
for computational purposes.
● The most prominent and fundamental logical
formalism classically used for knowledge
representation is the “first-order predicate
calculus”, or first-order logic.
59. Cryptography layer
● Cryptography (from the Greek word kryptos, which
means hidden or secret)
● It is the practice and study of techniques for
secure communication in the presence of third
parties.
● To ensure that inputs are reliable, cryptographic
means are to be used, such as digital signatures for
verifying the origin of the sources.
60. The Trust layer
● SW top layer: support for provenance/trust
● Provenance: where does the information
come from?
● how this information has been obtained?
● can I trust this information?
64. BRICKS
● Building Resources for Integrated Cultural
Knowledge Services (BRICKS) is an open-source
software framework for the management of
distributed digital assets.
65. The Fedora repository system is an open source, digital
object repository system using public APIs exposed as
web services.
Fedora
66. SIMILE
SIMILE = Semantic Interoperability of
Metadata and Information in unLike Environments
Motivated by DSpace, a repository for
storing, indexing, preserving, and
redistributing digital assets,
jointly developed by Hewlett-Packard
Research Labs and the MIT Libraries.
67. JeromeDL
Joint effort of DERI, National University of
Ireland, Galway, and Gdansk University of
Technology (GUT).
Distributed under the BSD open source license.
A digital library built on semantic web
technologies to answer requirements from
librarians, scientists and communities.
68. USERS
• JeromeDL has been installed in a
number of locations; the two most
used are:
DERI Galway library
WBSS at Gdansk University of Technology
• They serve their communities of users in
everyday activities.
69.
70. Services in JeromeDL
JeromeDL allows librarians to maintain and use the
following controlled vocabularies:
authority files - with a list of authors, editors
and publishers;
Classification taxonomies- such as DMoz or DDC,
for annotating resources with topics;
WordNet dictionary, for specifying keywords
71. Structure Ontology
Modern digital library systems not only store
bibliographic metadata.
They also manage an electronic
representation of the content itself.
The structure of the content
might, however, depend on the type of the
resource.
72. The key feature of every digital library
system is-
making bibliographic resources
accessible which involves-
• domain (topic) categorization of a
resource from the WordNet dictionary.
Support for Legacy Information
73. Resource management
Each resource is described by semantic
descriptions according to the JeromeDL core
ontology.
Additionally, a full-text index of the resource's
content and MARC21 and BibTeX
bibliographic descriptions are provided.
75. MultiBeeBrowse (MBB)
MultiBeeBrowse allows users to browse
unstructured metadata represented as an
RDF graph.
It consists of access to resources, search
services, a filter service, a similar service, a related
service, and a combination service
(conjunction, sum, difference, and binding on two
given sets of results).
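The combination service can be pictured with ordinary set operations on two result sets. This is only an analogy under assumptions: the document identifiers are invented, the variable names are hypothetical, and the "binding" operation is left out because the slides do not describe how it works.

```python
# Two hypothetical result sets returned by earlier searches.
results_a = {"doc1", "doc2", "doc3"}
results_b = {"doc2", "doc3", "doc4"}

conjunction = results_a & results_b  # resources present in both sets
result_sum = results_a | results_b   # resources present in either set
difference = results_a - results_b   # resources in the first set only

print(sorted(conjunction), sorted(result_sum), sorted(difference))
```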
76.
77. Communication link
The content of the JeromeDL database can
be searched not only through the web pages
of the digital library,
but also from other digital libraries
and other web applications.
78. Social bookmarking
Users can allow others to see their
bookmarks and annotations and share their
knowledge within a social network.
JeromeDL can also treat a single library
resource as a blog post.
Users can comment on the content of the
resource and reply to others' comments, and
in this way create new knowledge.
79. Delicious is the leading bookmarking service to save,
organize, and discover interesting links on the web.
80. Architecture
The bottom layer provides a service for a flexible
and extendable electronic representation of
objects.
The middle layer offers information retrieval
and identity management services.
The top layer in the semantic digital library stack
benefits from engaging the community
of users in annotating and filtering
resources.
83. Bottom layer
● The bottom layer handles the physical representation of
resources, their structure and provenance.
● It provides a flexible and extendable electronic representation
of objects with its structure ontology.
85. Bottom layer (contd.)
● For knowledge organisation, JeromeDL
provides
- authority files, with a list of
authors, editors and publishers;
- classification taxonomies, such as DMoz or
DDC, for annotating resources with topics;
- WordNet thesaurus, for specifying keywords
(domain categorisation).
96. MIDDLE LAYER
• Lifts legacy bibliographic
descriptions up to the semantic level;
• A mediation standard such as the MarcOnt
Ontology is used to resolve the
problem of heterogeneity among
different standards (MARC 21,
BibTeX, Dublin Core).
97. Example of BibTeX format
@article{ahu61,
author={Arrow, Kenneth J. and Leonid Hurwicz
and Hirofumi Uzawa},
title={Constraint qualifications in maximization
problems},
journal={Naval Research Logistics Quarterly},
volume={8},
year = 1961,
pages = {175-191}
}
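A rough sketch of what "lifting" a legacy record to triples might look like for the entry above. It is not the MarcOnt mediation process: the tiny regex parser only handles flat fields like those shown, and the ex: prefix, the mapping to Dublin Core terms, and all function names are assumptions for illustration.

```python
import re

BIBTEX = """@article{ahu61,
  author={Arrow, Kenneth J. and Leonid Hurwicz and Hirofumi Uzawa},
  title={Constraint qualifications in maximization problems},
  journal={Naval Research Logistics Quarterly},
  volume={8},
  year = 1961,
  pages = {175-191}
}"""

# name = {value} or name = 1234 (no nested braces; enough for this entry only)
FIELD = re.compile(r"(\w+)\s*=\s*(?:\{([^{}]*)\}|(\d+))")

# Assumed mapping from BibTeX fields to Dublin Core properties.
DC_MAP = {"author": "dc:creator", "title": "dc:title", "year": "dc:date"}


def parse_entry(text):
    """Return (entry type, citation key, field dict) for one simple BibTeX entry."""
    header = re.match(r"@(\w+)\{([^,]+),", text)
    fields = {}
    for name, braced, number in FIELD.findall(text):
        fields[name.lower()] = braced if braced else number
    return header.group(1), header.group(2), fields


def to_triples(key, fields):
    """Turn the mapped fields into (subject, predicate, object) triples."""
    subject = f"ex:{key}"
    triples = []
    for name, value in fields.items():
        if name not in DC_MAP:
            continue
        # BibTeX separates multiple authors with the word "and".
        values = value.split(" and ") if name == "author" else [value]
        triples.extend((subject, DC_MAP[name], v.strip()) for v in values)
    return triples


entry_type, key, fields = parse_entry(BIBTEX)
triples = to_triples(key, fields)
for triple in triples:
    print(triple)
```

Once records from different formats are expressed as triples over a shared vocabulary, they can be queried together, which is the point of a mediation ontology.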
101. Middle layer (contd.)
● JeromeDL delivers an RDF query service so that
it can act as a mash-up service.
● Protocols like Z39.50 and OAI-PMH are used
for communication.
● Natural language query
“show me ... written by ...”
Regular Expression
● Tagstree map
112. Social Services in JeromeDL
• Involve users into sharing knowledge
– Blogs – comments and discussions about
documents and resources
– Tagging – collaborative classification
– Wikis – collaboratively edited additional
descriptions, such as summaries and
interesting facts.
113. FOAF - Describing Social Networks
• FOAF - Stands for Friend-of-a-Friend
• Defines properties for a person (but it does not have
to be a person, can be an “agent”)
• FOAFRealm
115. Identity management with
FOAFRealm
Distance between owner and requester
Friendship level between owner and requester, calculated
across digraph of social network
• Support for single registration and sign on
• Distributed identity management with HyperCuP (“D-FOAF”)
• FOAFRealm is currently implemented as a plugin for Tomcat
(Realm/Valve implementation), with PHP and .NET versions
coming soon
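The speaker notes mention that Dijkstra's algorithm is used for the distance between two friends. The sketch below shows that algorithm on a small foaf:knows-style digraph. The people, the edge costs, and the idea that a lower cost means a closer friendship are assumptions; the slides do not say how FOAFRealm turns friendship levels into weights.

```python
import heapq

# Directed "knows" edges with an assumed cost (lower cost = closer friendship).
KNOWS = {
    "owner": {"alice": 1.0, "bob": 2.0},
    "alice": {"requester": 2.0},
    "bob": {"requester": 0.5},
    "requester": {},
}


def social_distance(graph, source, target):
    """Dijkstra's shortest-path search; returns None if target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbour, cost in graph.get(node, {}).items():
            candidate = dist + cost
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return None


distance = social_distance(KNOWS, "owner", "requester")
print(distance)
```

An access policy could then compare this distance with a threshold set by the resource owner; because the graph is directed, the distance from requester to owner may differ or not exist at all.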
116. Social Networks in Digital Libraries
[Figure: graph linking a Resource, an xfoaf:Annotation and an xfoaf:Directory to creator_A, creator_B, user_C and user_D; edges are labelled foaf:knows, marcont:hasCreator, xfoaf:owns, xfoaf:linksTo and xfoaf:isIn]
117. JeromeDL – Delivering Semantic Content
• Providing semantic annotations during uploading process:
– open module for handling any taxonomies
– keywords based on WordNet and free tagging.
– defining structure of resources in the JeromeDL ontology
• Social Semantic Collaborative Filtering (SSCF):
• Catalogues can include (by transclusion) friends' catalogues
• Access to catalogues can be restricted with social networking-
based policies
• SSCF delivers:
– Community-oriented, semantically-rich taxonomies
– Information about a user's interest
– Flows of expertise from the domain expert
– Recommendations based on users' previous actions
118. JeromeDL – Semantic Information In Use
• Searching:
– Keyword-based search with semantic query expansion
– Semantic search:
• Direct RDF querying
• Natural language templates
• Browsing
– Exhibit
• Sharing:
– Social Semantic Collaborative Filtering
– Semantically Interlinked Online Communities
• Heterogeneous communication
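Keyword search with semantic query expansion can be sketched as follows. The hand-written synonym table stands in for a resource such as WordNet, and the documents, terms, and function names are all invented for illustration; this is not JeromeDL's implementation.

```python
# Assumed stand-in for a thesaurus lookup.
SYNONYMS = {
    "library": {"repository", "archive"},
    "book": {"monograph"},
}

DOCUMENTS = {
    "doc1": "a digital repository of theses",
    "doc2": "ontology tools for the semantic web",
    "doc3": "the library catalogue and its books",
}


def expand(terms):
    """Add the known synonyms of every query term to the query."""
    expanded = set(terms)
    for term in terms:
        expanded |= SYNONYMS.get(term, set())
    return expanded


def search(terms, use_expansion=True):
    """Return ids of documents that contain at least one wanted term."""
    wanted = expand(terms) if use_expansion else set(terms)
    return sorted(doc for doc, text in DOCUMENTS.items() if wanted & set(text.split()))


plain_hits = search(["library"], use_expansion=False)
expanded_hits = search(["library"])
print(plain_hits, expanded_hits)
```

Expansion retrieves the document that talks about a "repository" even though the user typed "library".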
119. Information Retrieval in JeromeDL
[Figure: retrieval architecture. Components shown: Fulltext Index, Structure Repository, MarcOnt Repository and FOAFRealm Repository (the RDF repositories), resources' content, secure snapshot; query inputs: (typed) keywords, RDF and NL query; outputs: OpenSearch, RSS, collaborative filtering; processing steps: types translation, semantic query expansion; local and distributed interfaces]
120. Networks of Digital Libraries
• ELP (Extensible Library Protocol) implementation
– communication within JeromeDL network
– adapters for communication with other networks
• D-FOAF integration (distributed user profile
management)
– single sign on and single registration within D-FOAF
network
• HyperCuP integration (scalable P2P network)
122. URI
• Digital libraries on the Web are identified by URIs.
• DOIs are a specific type of URI and are similar in role to the
ISBN
• DOIs can be used to retrieve metadata
Source : http://www.tomw.net.au/2003/domains.html
123. UNICODE
The Unicode Standard was designed to be:
Universal
Efficient
Uniform
Unambiguous
The version cited here, Unicode 3.0, covers 49,194
characters from the world's scripts, plus many
other symbols
Problems with Unicode:
Operating systems do not yet support all the scripts covered by Unicode
Source : http://michaelseiler.net/2013/08/05/unicode-characters-in-html/
124. XML
• HTML lets us visualize the information on the web,
• but it doesn't describe that information
• In XML we can use our own tags
• For example, a resource “book” titled “Prolegomena to
Library Classification”, authored by “S. R. Ranganathan”,
can be represented in an XML document as
<book>
<title> Prolegomena to Library Classification</title>
<author>S. R. Ranganathan</author>
</book>
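Because the tags describe the content, a program can read the record without guessing. A minimal sketch using Python's standard xml.etree.ElementTree on the document above; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

xml_text = """<book>
  <title>Prolegomena to Library Classification</title>
  <author>S. R. Ranganathan</author>
</book>"""

book = ET.fromstring(xml_text)

# Build a simple field -> value record from the child elements.
record = {child.tag: child.text.strip() for child in book}
print(book.tag, record)
```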
125. RDF
• To represent the knowledge in a web page
• To provide better search engine capabilities
• In cataloging for describing the content
• For describing intellectual property rights (IPR)
• For expressing the privacy preferences
126. RDF schema
• Describe RDF
• Provides a data-modelling vocabulary for
RDF data
• Describing groups of related resources.
127. RDF Storage : Sesame
• For querying and analyzing RDF data
• Features :
Highly scalable RDF storage
High query performance
Support for several RDF query
languages
128. RDF Storage : Jena
• Jena provides persistent storage of RDF.
• The Jena layout enables faster insertion
and retrieval for fine-grained API operations.
• Aims to reduce storage consumption
129. Web Ontology Language (OWL)
• Designed to meet the needs of the WWW
• Its syntax is nearly identical to RDF's.
• Three variants of OWL
OWL Lite
OWL DL
OWL Full
• An ontology written in OWL DL could be used extensively
in digital libraries.
Source : http://www.mycutegraphics.com/graphics/owl/polka-dot-owl-
books.html
130. Tools for Building Ontologies
• Many ontology tools are available
at present, such as
Protégé, OntoEdit, Ontolingua, OilEd, pOWL,
etc.
Source : http://www.riversmead.org.uk/your-home/shared-owners/repairs-
and-maintenance/
131. Ontology editors : Protégé
• Protégé is a free open-source ontology
editor.
• Created by Stanford Center for Biomedical
Informatics Research.
• Support creation, visualization
• Export Ontology into different languages.
Source :
http://protege.stanford.edu/download/protege/4.1/installanywhere/Web_Installers/
135. Benefits of SW technology
• open model
• interoperability
• data integration
• share and re-use data
• multilinguality
• service reuse
• rapid response to change
136. LIST OF CASE STUDIES
• Online resource for information on
aquatic sciences/Spain July 2009.
• Enriching and sharing cultural heritage
data in Europeana, June 2012.
• Use of SWT in Natural language interface
to Business Applications, April 2007.
• Publishing STW thesaurus for
Economics as linked open data, Germany
June 2009.
137. Case Study
Publishing STW Thesaurus for
Economics as Linked Open Data
in German National Library of Economics
(ZBW), Germany.
by Timo Borst and Joachim Neubert
June 2009
138.
139. Facets
•Activity area:
• library, public institution and publishing
•Application area of SW technologies:
• semantic annotation, improved search, content
management, domain modeling, and data integration
•SW technologies used:
• RDF, SPARQL, RDFa, and SKOS
•SW technology benefits:
• open model, rapid response to change, service reuse, and share
and re-use data
140. Features of ZBW
• Provides a high-level taxonomy of subject
categories.
• Thousands of keywords (“descriptors”)
and tens of thousands of both synonyms
and links between the thesaurus
concepts.
• The media items are indexed with
descriptors from this thesaurus. They can
be retrieved by these descriptors through
the library catalog ECONIS.
141. Challenges
• First, to improve web-based presentation of
STW.
• Second, to foster precision of search results
by actively suggesting preferred terms from
STW.
• Third, to support the integration of STW into
other indexing or retrieval environments.
• Fourth, to induce third-party reuse of the STW
data, e.g. for customizing the vocabulary.
• Finally, to establish anchor points for linking
to other vocabularies and datasets.
142. Solutions
• The solution uses SKOS, the Simple Knowledge Organization System,
built within the Semantic Web community by
vocabulary experts and targeting
thesauri, classifications, and folksonomies.
• Since SKOS is inherently multi-lingual, preferred and
alternate labels (synonyms) in English could be attached
to concepts as easily as their German equivalents.
• “Related”, “narrower” and “broader” relations
mapped nicely to the corresponding SKOS properties.
• Metadata such as publisher, version and licensing information was added
seamlessly through the use of other RDF vocabularies
(e.g., Dublin Core).
143. Solutions (contd…)
• From the RDF file, they generated an XHTML page for each
concept in the thesaurus and embedded all of the data into this
page using RDFa.
• They assigned a persistent, language- and version-independent
URI to each page.
• Thus, the set of pages forms a highly interlinked network of
semantic relations, usable for both humans and machines.
• Web server content negotiation is used to deliver the format
(RDF/XML or XHTML, English or German) most appropriate to the
request.
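Content negotiation on the format dimension can be sketched as choosing among available media types based on the request's Accept header. This is a simplified illustration, not the ZBW server's logic: wildcards such as */* are not handled, language negotiation is omitted, and the function name and fallback choice are assumptions.

```python
# Media types assumed to be available for each concept page.
AVAILABLE = ["application/rdf+xml", "application/xhtml+xml"]


def negotiate(accept_header, available=AVAILABLE):
    """Pick the available media type with the highest q-value in the Accept header."""
    preferences = []
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type, q = parts[0], 1.0  # q defaults to 1.0 when absent
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        preferences.append((q, media_type))
    # sorted() is stable, so the header order breaks ties between equal q-values.
    for q, media_type in sorted(preferences, key=lambda pref: -pref[0]):
        if q > 0 and media_type in available:
            return media_type
    return available[-1]  # assumed fallback: the human-readable page


for_machine = negotiate("application/rdf+xml")
for_browser = negotiate("text/html, application/xhtml+xml;q=0.9")
by_weight = negotiate("application/xhtml+xml;q=0.5, application/rdf+xml;q=0.8")
print(for_machine, for_browser, by_weight)
```

The same URI can therefore serve RDF/XML to a harvesting agent and XHTML to a person's browser.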
149. PROBLEMS DUE TO LIMITATION
Digital libraries should not be for librarians
only, but for average people
Concentration on delivering
content/information, not on knowledge
sharing within a community of users
Digital libraries have lost human-part of
their predecessors
150. SOLUTION
Making users/readers involved in the content
annotation process
Allowing users/readers to share their
knowledge within a community
Providing better communication between users
in and across communities
Achieved through SSDL (SOCIAL SEMANTIC
DIGITAL LIBRARY)
151. SSDL
The social semantic digital library is an attempt to
restore the collaborative approach to sharing
knowledge.
The semantic services help
# to enhance search and browsing features
# to interconnect different systems and exchange data
The social services help
# to gather relevant information from expertise of others
# to improve high rank knowledge sharing in a digital
library
153. HELPFULNESS
The social semantic digital library will help digital library
to build heterogeneous networks of Semantic Web.
It may deliver more robust, user-friendly, adaptable
search and browsing interfaces empowered by
semantics.
156. Social Semantic Digital Library
Services in e-Learning
Introducing Personal Learning Environments (PLEs)
Addressing some of the open research challenges
in Technology Enhanced Learning (TEL)
Enabling effective and reliable
mechanisms for managing various types of knowledge
relevant for providing personalized learning
experiences in online learning environments
Ability to preserve the semantics of this knowledge
while sharing
Interaction with during the learning process
158. QUESTIONS ARISE…
Do the social and semantic services
increase the quality of the answers
provided by the users in response to
given problems?
Do the social and semantic services
increase the accuracy of the
references provided by the users to
answer given questions?
Do the social and semantic services
increase overall satisfaction of using
the digital library?
Which services, i.e., semantic, social, or
recommendations, are found to be
most useful by the end users?
159. FUTURE WORK
Future research on semantic features should concentrate more on
improving accuracy of automated recommendations services
and usability of existing solutions.
Our future work in the domain of semantic digital libraries, and
JeromeDL in particular, will focus on adapting research on the
semantic web, web 2.0 and adaptive hypermedia to our
system; we are working on delivering wiki-like and faceted navigation
features for JeromeDL.
160. PERORATION...
Semantic Web digital library would contain features like
semantic blogs
semantic wikis
semantic search
social semantic digital libraries
semantic social networks
semantic social information spaces etc.
These will have
open access - open information - open source
(OPEN MANTRA)
161. PERORATION...
Some other factors constrain
librarians from initiating and adopting the Social
Semantic Web, such as:
Communication barriers
Absence of metadata representations
Absence of user-friendly applications
Limited available literature
But, librarians need to participate in
ontologies and social semantic-based conferences
to explore the technology more and
also to give wider coverage to their skills and talent.
PROBLEMS OF THE PRESENT WEB
• High recall, low precision
• Results are highly sensitive to vocabulary
• Results are single web pages
• Web pages are written in HTML
• HTML describes the structure of information
• HTML describes the syntax, not the semantics
If computers can understand the meaning behind information:
• they can learn what we are interested in
• they can help us better find what we want
This is really what the Semantic Web is all about.
The Semantic Web provides a common framework that allows data to be shared and reused. The term was coined by Tim Berners-Lee for a web of data that can be processed by machines.
Unstructured: the type of information and its location are unknown.
Web of data: the Semantic Web is a web of data.
A folksonomy arises from collaboratively creating and translating tags to annotate and categorize content. Folksonomies can be broad or narrow. A broad folksonomy is one in which multiple users tag particular content with a variety of terms from a variety of vocabularies, thus creating a greater amount of metadata for that content. A narrow folksonomy, on the other hand, occurs when a few users, primarily the content creator, tag an object with a limited number of terms. While both broad and narrow folksonomies enable the searchability of content by adding textual descriptions, or access points, to an object, a narrow folksonomy does not have the same benefits as a broad folksonomy, which allows for the tracking of emerging trends in tag usage and developing vocabularies. Folksonomies became popular in Web applications such as social bookmarking and photograph annotation.
FOAF is based on RDF. Friendship evaluation is based on reification. The foaf:knows property is used to evaluate friendship, and Dijkstra's algorithm is used to compute the distance between two friends.
Digital libraries on the Web are identified by URIs, allowing persistent and unique identification of a publication independently of its location. DOIs can be used to retrieve metadata for a given publication using a DOI resolver such as CrossRef.
• Universal: The repertoire must be large enough to encompass all characters that are likely to be used in general text interchange, including those in major international, national, and industry character sets.
• Efficient: Plain text is simple to parse: software does not have to maintain state or look for special escape sequences, and character synchronization from any point in a character stream is quick and unambiguous.
• Uniform: A fixed character code allows for efficient sorting, searching, display, and editing of text.
• Unambiguous: Any given code value always represents the same character.
Although present operating systems support a partial multi-lingual approach, they do not support all the scripts supported by Unicode.
HTML allows us to visualize information on the web, but it doesn't provide much capability to describe the information in ways that help software programs find or interpret it.
To represent the knowledge in a web page, a document description framework known as RDF (Resource Description Framework) was developed. It is used:
• In resource discovery, to provide better search engine capabilities
• In cataloging, for describing the content and content relationships available at a particular Web site, page, or digital library
• By intelligent software agents, to facilitate knowledge sharing and exchange
• For describing the intellectual property rights of Web pages
• For expressing the privacy preferences of a user as well as the privacy policies of a Web site
RDF with digital signatures will be key to building the "Web of Trust" for electronic commerce, collaboration, and other applications.
RDFS (RDF Schema) is a language for describing RDF vocabularies. It is complemented by several companion documents that describe the basic concepts and abstract syntax of RDF. It provides mechanisms for describing groups of related resources and the relationships between these resources.
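One way RDFS describes groups of related resources is through class hierarchies (rdfs:subClassOf). The following sketch, with a hypothetical hierarchy and hypothetical helper names, shows how a resource declared as a member of one class can be inferred to belong to its superclasses as well. Single inheritance is assumed here only to keep the example short.

```python
# Hypothetical class hierarchy (rdfs:subClassOf) and instance data (rdf:type).
SUBCLASS_OF = {
    "ex:Thesis": "ex:Publication",
    "ex:Article": "ex:Publication",
    "ex:Publication": "ex:Resource",
}
TYPE_OF = {"ex:doc1": "ex:Thesis", "ex:doc2": "ex:Article"}


def superclasses(cls):
    """Follow rdfs:subClassOf links upward (single inheritance assumed)."""
    result = []
    seen = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        if cls in seen:
            break  # guard against accidental cycles in the hierarchy
        seen.add(cls)
        result.append(cls)
    return result


def inferred_types(resource):
    """Declared class of a resource plus every class it inherits from."""
    declared = TYPE_OF[resource]
    return [declared] + superclasses(declared)
```

A search for all instances of `ex:Publication` can then return `ex:doc1` even though it was only declared to be an `ex:Thesis`.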
Sesame is an open-source framework for querying and analyzing RDF data. It was created by the Dutch software company Aduna as part of "On-To-Knowledge", a semantic web project that ran from 1999 to 2002. It contains implementations of an in-memory triplestore and an on-disk triplestore, along with two separate Servlet packages that can be used to manage and provide access to these triplestores on a permanent server. The Sesame Rio (RDF Input/Output) package contains a simple API for Java-based RDF parsers and writers. Parsers and writers for popular RDF serialisations are distributed along with Sesame, and users can easily extend the list by putting their own parsers and writers on the Java classpath when running their application. Sesame is a fast and scalable RDF database and serves as one of the building blocks of the Semantic Web. It is based on open standards developed by the W3C and is available under a liberal open-source license. It supports several RDF query languages, including SPARQL and SeRQL.
Jena is an open-source Semantic Web framework for Java. It provides an API to extract data from and write to RDF graphs. The graphs are represented as an abstract "model". A model can be sourced with data from files, databases, URLs, or a combination of these. A model can also be queried through SPARQL and updated through SPARUL. Jena provides persistent storage of RDF data in relational databases. Fine-grained API operations come at the cost of storage over a normalized triples and nodes schema. It reduces storage consumption.
OWL (Web Ontology Language) is designed specifically to meet the needs of the World Wide Web. Its syntax is nearly identical to RDF's, but OWL is intended to better describe the semantics (such as relationships, definitions, and rules) of concepts and to allow inference of new data based on these semantics.
OWL Lite: originally intended to support those users primarily needing a classification hierarchy and simple constraints. For example, while it supports cardinality constraints, it only permits cardinality values of 0 or 1. It was hoped that it would be simpler to provide tool support for OWL Lite than for its more expressive relatives, allowing a quick migration path for systems using thesauri and other taxonomies.
OWL DL: supports inference and additional relationships. OWL DL was designed to provide the maximum expressiveness possible while retaining computational completeness.
OWL Full: provides the ability for the user to extend the language. OWL Full is based on a different semantics from OWL Lite or OWL DL, and was designed to preserve some compatibility with RDF Schema.
An ontology can be expressed in structured languages such as XML or RDF, but as the ontology develops and grows larger, maintaining the structure of these knowledge sources quickly becomes overwhelming and difficult to manage. Tools for building ontologies attempt to simplify the task of creating and using an ontology. Most tools provide some ability to visualize the relationships among concepts, and nearly all can generate the ontology in two or more ontology languages.
Protégé is one of the most well-known tools, created by the Stanford Center for Biomedical Informatics Research and supported by grants from the National Library of Medicine, the Defense Advanced Research Projects Agency (DARPA), eBay Inc., and others. It supports the creation, visualization, and management of ontologies in a number of formats. It allows the user to export ontologies into OWL, RDF Schema, XML, and other language formats.
Content management is the set of processes and technologies that support the collection, managing, and publishing of information in any form or medium.
Data integration involves combining data residing in different sources and providing users with a unified view of these data.
Customization: to modify, make, or build according to individual specifications or preference.
A domain model in problem solving and software engineering is a conceptual model of all the topics related to a specific problem. It describes the various entities, their attributes, roles, and relationships, plus the constraints that govern the problem domain.
Semantic annotation helps to bridge the ambiguity of natural language when expressing notions and their computational representation in a formal language. By telling a computer how data items are related and how these relations can be evaluated automatically, it becomes possible to process complex filter and search operations. Imagine your search engine understands that "Barcelona" is a city in "Europe": it can answer a search query on "IT Companies in Europe" with a link to a document about the Yahoo office in Barcelona, although the exact words "Barcelona" or "Yahoo" never occur in your search query.
A web portal is most often one specially designed Web page at a website which brings information together from diverse sources in a uniform way.
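The Barcelona example above can be sketched in a few lines. Everything here is an assumption made for illustration: the small located-in knowledge base, the annotated documents, and the helper names `is_within` and `search`. It is not how any particular search engine works, only a demonstration of evaluating a relation automatically.

```python
# Hypothetical knowledge base of located-in relations.
LOCATED_IN = {
    "Barcelona": "Spain",
    "Spain": "Europe",
    "Bangalore": "India",
    "India": "Asia",
}

# Hypothetical documents carrying semantic annotations (topic and place).
DOCUMENTS = [
    {"title": "Yahoo office in Barcelona", "topic": "IT company", "place": "Barcelona"},
    {"title": "Software park in Bangalore", "topic": "IT company", "place": "Bangalore"},
    {"title": "Museums of Barcelona", "topic": "museum", "place": "Barcelona"},
]


def is_within(place, region):
    """True if place equals region or lies inside it via located-in links."""
    seen = set()
    while place is not None and place not in seen:
        if place == region:
            return True
        seen.add(place)
        place = LOCATED_IN.get(place)
    return False


def search(topic, region):
    """Titles of documents about the topic whose place lies in the region."""
    return [
        doc["title"]
        for doc in DOCUMENTS
        if doc["topic"] == topic and is_within(doc["place"], region)
    ]
```

The query `search("IT company", "Europe")` finds the Barcelona document even though neither "Barcelona" nor "Yahoo" appears in the query, because the annotation links Barcelona to Europe through Spain.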
Interoperability: the ability of a system (such as a weapons system) to work with or use the parts or equipment of another system.
Aquaring is a portal on European resources of the aquatic world, offering a multilingual semantic search engine, a semantic tag cloud adapted to user context, a map, exhibitions, etc. The initial aim was to offer a unique, multilingual access point to a heterogeneous and distributed collection of digital resources.
Application area of SW technologies: portal, semantic annotation, domain modeling, improved search, customization, and data integration.
SW technologies used: public vocabularies, public datasets, RDF, SPARQL, OWL, and OWL DL.
SW technology benefits: rapid response to change, service reuse, multilinguality, and share and re-use data.
Europeana provides access to millions of objects gathered from hundreds of libraries, archives, museums, and other cultural institutions throughout Europe. To do so, it gathers descriptive metadata and links to Web resources from all of these institutions. The result is a set of highly heterogeneous metadata.
Application area of SW technologies: content management, improved search, data integration, domain modeling, and portal.
SW technologies used: RDFa, RDF, SPARQL, RDFS, SKOS, public datasets, public vocabularies, and in-house vocabularies.
SW technology benefits: open mode, data integration, rapid response to change, share and re-use data, and interoperability.
Tata Consultancy Services Limited (TCS) has research labs across the globe that focus on a wide range of domains of interest. One of the areas of research at TCS is natural language interfaces to business applications. A framework called NATAS has been developed that enables users to interact with a business application in natural language by posing questions and invoking tasks. NATAS analyses, interprets, and evaluates user input and responds appropriately. The architecture of NATAS uses a Semantic Web based ontology of the domain to aid in the retrieval of relevant data and concepts from the application.
The main advantage of such a system is that the user is free to enter any information in a raw form. It is then the job of the system to process the raw information and get whatever else is required.
In order to provide a standard format for publishing the thesaurus as a whole, and also to decouple the publication process from the highly proprietary thesaurus maintenance application, we looked for a standardized, highly expressive intermediate format. It turned out that no common serialization format for thesauri yet exists. The mapping of the thesaurus concepts and relations to SKOS proved to be quite straightforward. Simple Knowledge Organization System (SKOS) is a W3C recommendation designed for representation of thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. SKOS is part of the Semantic Web family of standards built upon RDF and RDFS, and its main objective is to enable easy publication and use of such vocabularies as linked data.
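The mapping of thesaurus concepts and relations to SKOS can be sketched as below. The thesaurus entries, the `ex:` prefix, and the helper names are hypothetical; only the SKOS terms (`skos:Concept`, `skos:prefLabel`, `skos:altLabel`, `skos:broader`) come from the W3C vocabulary. The output is Turtle-like text with prefix declarations omitted, so it is an illustration rather than a complete serialization.

```python
# Hypothetical thesaurus entries; labels and identifiers are invented.
THESAURUS = {
    "fish": {"label": "Fish", "broader": None, "alt": ["Fishes"]},
    "shark": {"label": "Shark", "broader": "fish", "alt": []},
}


def to_skos_triples(thesaurus, base="ex:"):
    """Map each thesaurus entry to SKOS statements (subject, predicate, object)."""
    triples = []
    for key in sorted(thesaurus):
        entry = thesaurus[key]
        subject = base + key
        triples.append((subject, "rdf:type", "skos:Concept"))
        triples.append((subject, "skos:prefLabel", '"%s"@en' % entry["label"]))
        for alt in entry["alt"]:
            # Non-preferred terms become alternative labels.
            triples.append((subject, "skos:altLabel", '"%s"@en' % alt))
        if entry["broader"]:
            # The broader-term relation maps directly onto skos:broader.
            triples.append((subject, "skos:broader", base + entry["broader"]))
    return triples


def to_turtle_lines(triples):
    """Render triples as Turtle-like lines (prefix declarations omitted)."""
    return ["%s %s %s ." % triple for triple in triples]


SKOS_TRIPLES = to_skos_triples(THESAURUS)
SKOS_LINES = to_turtle_lines(SKOS_TRIPLES)
```

Keeping the mapping in a separate step like this reflects the decoupling described above: the maintenance application only has to supply the entries, and publication works from the SKOS form.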
Publishing the data on the Web was one of the main goals of the project. To achieve this goal, they used these steps:
RDFa (Resource Description Framework in Attributes) is a W3C Recommendation that adds a set of attribute-level extensions to HTML, XHTML, and various XML-based document types for embedding rich metadata within Web documents.