STRAYER UNIVERSITY
NETWORK ARCHITECTURE AND ANALYSIS
SUBMITTED TO
DR. BRYANT PAYDEN
BY
ADELMAR ESPLANA
GRADUATE STUDENT IN
MASTER OF SCIENCE IN INFORMATION SYSTEMS
MARCH 2009
I. Overview
“Knowledge is power (But only if you know how to acquire it).” (The Economist, 2003) Knowledge
is awareness and understanding of facts, truth or information gained through experience and learning. Today,
one of the quickest ways to acquire knowledge is through the World Wide Web (WWW). Using
search engines, we can look up almost any kind of information. The internet has become a world-wide
network of information resources and a powerful communication tool, and the more specific the words we
type into a search engine, the more accurate the results we get. The internet may be used in a number of
ways, such as taking online classes, paying bills, booking vacations, scheduling
medical appointments, finding the latest news and even taking virtual tours of vacation destinations. In
spite of all these capabilities that the Web offers, the principle that search engines use to date
in the analysis of material is still textual indexing. Although search engines have proven
remarkably useful in finding information rapidly, they have also proven remarkably useless in
providing quality information. “The biggest problem facing users of web search engines today is the quality
of the results they get back.” (Brin S. & Page L.) Currently, search engines can only account for the
vocabulary of the documents and have little concept of document quality, thereby producing a lot of junk.
Since its inception in 1989, the Web has grown, initially as a medium for the broadcast of read-only
material from heavily loaded corporate servers to the mass of internet-connected consumers. Recently,
the rise of digital video, blogging, podcasting, and social media has revolutionized the way people
interact socially through the Web. Other promising developments include the increasingly interactive nature of the
interface to the user, and the increasing use of machine-readable information with defined semantics
allowing more advanced machine processing of global information, including machine-readable signed
assertions (Berners T., 1996a).
A lot of the data that we use every day is not part of the Web. This data sits in various silos,
residing in different software applications and different places that cannot be connected easily, like bank
statements, photographs and calendar appointments. To see bank statements, we have to use
online banking; to manage our calendar appointments and organize our photo collection, we have to use
social-oriented websites such as MySpace and Facebook. It would be interesting to see photos displayed
on a calendar on the Web, showing exactly what we were doing when we took them, alongside bank statements
showing the transactions that we incurred. Because of these silos, in which both personal and business
information are managed by different software, we are forced to learn how to use different programs all the
time. In addition, since data is controlled by applications, each application keeps its data to itself.
According to the W3C, what we need is a “web of data” (Herman I., 2009). This is supported by Fielding’s
dissertation, which states that what we need is a way to store and structure our own information, whether
permanent or ephemeral in nature, both for our own use and for others, and to be able to reference and structure
the information stored by others so that it is not necessary for everyone to keep and maintain local
copies (Fielding R., 2000).
II. Purpose of the Paper
Understanding the underlying concepts and principles behind the Web is essential to current and
future implementation initiatives. For this reason, the objective of this paper is to uncover the roots of the
Web and to examine the following fundamental design principles: independent
specification design, Hypertext Transfer Protocol (HTTP), Uniform Resource Identifier (URI) and Hypertext
Markup Language (HTML). This study also aims to develop a better understanding of the emerging web
standards, such as REST, SOA, and the Semantic Web. The paper discusses some of the misconceptions about
URI, HTTP and XML and the following issues: a) From the REST and Semantic Web points of view, there is no
difference between slash-based and parameter-based URI references; b) HTTP is not a data transfer protocol;
it is an application protocol (or a coordination language, if you swing that way). REST does not "run on top
of HTTP" but rather HTTP is a protocol that displays many of the traits of the REST architectural style; c)
What is the function of Extensible Markup Language (XML) in Representational State Transfer (REST) and
the Semantic Web? Is it true that most REST services in deployment do not return XML but rather HTML? Is
it true that REST has no preference for XML?
III. Foundation of the World Wide Web
According to Tim Berners-Lee, “The goal of the Web was to be a shared information space through
which people (and machines) could communicate.” (Berners T., 1996a) The original intent was also that
this so-called “space” ought to span all sorts of information, from different sources, in a wide array of
formats, and from highly valued designed material to spontaneous ideas. For the original design of the Web,
he stated the following fundamental design criteria (Berners T., 1996a):
a) An information system must be able to record random associations between any arbitrary objects,
unlike most database systems
Database systems are purposely designed to facilitate the storage and retrieval of structured data
and the generation of information from it. The Web, by contrast, is conceived as “one universal space of
information”, based on the principle that almost anything on the Web can be linked
to any arbitrary object. The power of the Web is that a link can be established to any document (or,
more generally, resource) of any kind in the universe of information, whereas in a database system
one has to understand the data structure to establish a relationship.
b) If two sets of users started to use the system independently, to make a link from one system to another
should be an incremental effort, not requiring unscalable operations such as the merging of link
databases
In a business environment, integrating two different systems requires
some degree of integration effort, such as merging, importing or linking databases. By
contrast, the idea of the Web was to be able to integrate systems easily. Most systems built in
the past involved a great deal of integration effort because of information silos. For this reason, the idea
that machines can talk to each other holds the promise of seamless integration.
c) Any attempt to constrain users as a whole to the use of particular languages or operating systems was
always doomed to fail. Information must be available on all platforms, including future ones.
Platform and language interoperability supports the principle of universality of access
irrespective of hardware or software platform, network infrastructure, language, culture, geographical
location, or physical or mental impairment.
d) Any attempt to constrain the mental model users have of data into a given pattern was always doomed to
fail. If information within an organization is to be accurately represented in the system, entering or
correcting it must be trivial for the person directly knowledgeable.
If the interaction between person and hypertext could be so intuitive that the machine-
readable information space gave an accurate representation of the state of people's thoughts,
interactions, and work patterns, then machine analysis could become a very powerful management
tool, seeing patterns in our work and facilitating our working together through the typical problems
which beset the management of large organizations.
Independent specification design
The basic principles of the Web, proposed in 1989 to meet these design criteria, follow
the well-known software design principle of “independent specification design”. This design is based
on modularity: when a system is modular, the interfaces between the modules
hinge on simplicity and abstraction. This allows existing content to keep working with
new implementations. As technologies evolve and disappear, the specifications for the Web’s languages and
protocols should be able to adapt to new hardware and software. Built on this basic principle
are the three main components of the Web: URI, HTTP and HTML.
URI or Uniform Resource Identifier
A URI is a compact string of characters for identifying an abstract or physical resource. It is a simple and
extensible means of identifying a resource. A URI can be further classified as a locator, a name, or both. The
term "Uniform Resource Locator" (URL) refers to the subset of URIs that identify resources via a
representation of their primary access mechanism, rather than identifying the resource by name or by some
other attribute(s) of that resource (Berners T., Fielding R. & Masinter L., 2005). URNs (Uniform Resource
Names) are used for identification; URCs (Uniform Resource Characteristics), for including meta-information;
and URLs, for locating or finding resources.
REST treats a URI as the identifier of a resource, on the simple premise that identifiers should change as
infrequently as possible (Fielding R., 2000). The Semantic Web uses URIs to identify not just Web documents
but also real-world objects like people and cars, and even abstract ideas; all of these are called real-world
objects or things (W3C, 2008b). The definition of URI can be derived from the meaning of each letter
(U for Uniform, R for Resource and I for Identifier), as described below:
Uniform allows consistent usage, even when the internal mechanism for accessing the
resource changes. It allows a common semantic interpretation of syntactic conventions across different
types of resources, so that new types can work with the existing identifiers.
Resources are, in general, anything that can be identified by a URI,
for example, an electronic document, an image, a service, or a source of
information with a consistent purpose. A resource need not be accessible via the internet: abstract concepts
such as mathematical equations, types of relationship (e.g., “parent” or “employee”) and numeric values (e.g.,
zero, one, and infinity) can also be resources.
Identifier pertains to the information required to distinguish what is being identified from all other things
within its scope of identification. The terms “identify” and “identifying” mean distinguishing one resource
from all others, regardless of how that purpose is accomplished. One of the capabilities the Web popularized is the
ability of a document to link to any kind of resource in the universe of information. With this in mind, the concept of
“identity” is concerned with identifying objects generically. For example, one URI
can represent a book that is available in several languages and several data formats.
HTTP and URIs are the basis of the World Wide Web, yet they are often misunderstood, and their
implementations and uses are sometimes incomplete or incorrect (W3C, 2003).
a) A common mistake, responsible for many implementation problems, is to think that a URI is
equivalent to a filename within a computer system. This is wrong as URIs have, conceptually,
nothing to do with a file system.
b) A URI should not show the underlying technology (server-side content generation engine,
script written in such or such language) used to serve the resource. Using URIs to show the
specific underlying technology means one is dependent on the technology used, which, in turn,
means that the technology cannot be changed without either breaking URIs or going through
the hassle of "fixing" them.
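The second point can be illustrated with a minimal sketch in which a stable URI path is mapped to whatever currently generates the resource. The path /reports/2009 and both renderer functions are hypothetical names introduced only for this illustration; they are not taken from the W3C note.

```python
def render_report_legacy(year):
    # Hypothetical stand-in for an older server-side generator.
    return f"report for {year} (legacy generator)"

def render_report_new(year):
    # Hypothetical stand-in for the technology that replaces it.
    return f"report for {year} (new generator)"

# The public URI carries no file extension and no hint of the technology.
ROUTES = {"/reports/2009": lambda: render_report_legacy(2009)}

def serve(uri_path):
    """Return the representation for a URI path, or None if it is unknown."""
    handler = ROUTES.get(uri_path)
    return handler() if handler else None

before = serve("/reports/2009")
# Swapping the underlying technology changes the handler, not the URI.
ROUTES["/reports/2009"] = lambda: render_report_new(2009)
after = serve("/reports/2009")
```

Because clients only ever see /reports/2009, bookmarks and links keep working after the swap, which is the property the W3C note asks for.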
HTTP
According to the HTTP 1.0 specification, the Hypertext Transfer Protocol (HTTP) is an application-level
protocol with the lightness and speed necessary for distributed, collaborative, hypermedia information
systems. It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name
servers and distributed object management systems, through extension of its request methods (Berners T.,
Fielding R. & Frystyk H., 1996b).
HTTP messages are generic, and communication follows the client/server
paradigm of request and response. All messages comply with the generic message format. Clients
usually send requests and receive responses, while servers receive requests and send responses. HTTP is
stateless: in HTTP 1.0, after the server has responded to the client's request, the connection
between client and server is dropped and forgotten. There is no "memory" between client connections.
Basically, when you type a URL into the browser, the client opens a connection to the server over
TCP/IP. The URL is converted into a request for the server to process; when the server has finished
processing, it sends a response message back to the client and the connection between
both parties is closed. The downside is that this may decrease network performance: because the state of a
request is not stored in a shared context on the server, each request must carry more overhead data.
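To make the generic message format concrete, the following is a minimal sketch that serializes an HTTP/1.0 request and splits a raw response into its parts. It works on plain strings only and ignores many details of the specification (for example header folding and binary bodies); the function names are assumptions made for this illustration.

```python
def build_request(method, path, headers):
    """Serialize a request: request line, header lines, then a blank line."""
    lines = [f"{method} {path} HTTP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"

def parse_response(raw):
    """Split a raw response into (status code, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body

request = build_request("GET", "/resource", {"Accept": "text/html"})
status, headers, body = parse_response(
    "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<p>hello</p>"
)
```

Every request built this way is self-describing: nothing in it refers to an earlier exchange, which is what statelessness means in practice.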
The design is patterned on the idea of object orientation. In general, the objects used
for each request are as follows: HTTP Messages, Request/Response, Entity, Method Definitions,
Status Code Definitions, and Header Field Definitions (based on the HTTP 1.0
specification).
The Method field indicates the method to be performed on the resource identified by the URI.
The methods defined by the HTTP 1.1 specification are OPTIONS, GET, HEAD, POST, PUT, DELETE,
TRACE, and CONNECT (Fielding R., 1999). The GET method means retrieve whatever is identified by
the URI. HEAD is the same as GET, but it returns only the HTTP headers and no document body. The
POST method is used to request that the destination server accept the entity enclosed in the request as a new
subordinate of the resource identified by the Request-URI in the Request-Line. POST is designed to allow a
uniform method to cover the following functions: 1) annotation of existing resources; 2) posting a message
to a bulletin board, newsgroup, mailing list, or similar group of articles; 3) providing a block of data, such as
the result of submitting a form, to a data-handling process; and 4) extending a database through an append
operation (Berners T., Fielding R. & Frystyk H., 1996b).
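The difference between GET, HEAD and POST can be observed with a small local experiment. The sketch below is only an illustration built on Python's standard http.server and http.client modules; the /notes resource, the NOTES list and the NotesHandler class are hypothetical, and the server listens only on the loopback address.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

NOTES = ["first note"]  # hypothetical in-memory resource state

class NotesHandler(BaseHTTPRequestHandler):
    def _send(self, status, text, include_body=True):
        data = text.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        if include_body:
            self.wfile.write(data)

    def do_GET(self):   # retrieve whatever the URI identifies
        self._send(200, "\n".join(NOTES))

    def do_HEAD(self):  # same headers as GET, but no body
        self._send(200, "\n".join(NOTES), include_body=False)

    def do_POST(self):  # append the enclosed entity as a new subordinate
        length = int(self.headers.get("Content-Length", 0))
        NOTES.append(self.rfile.read(length).decode("utf-8"))
        self._send(201, "created")

    def log_message(self, *args):  # keep the sketch quiet
        pass

def call(port, method, body=None):
    """Open a fresh connection, send one request, return (status, body)."""
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(method, "/notes", body=body)
    response = conn.getresponse()
    result = (response.status, response.read().decode("utf-8"))
    conn.close()
    return result

server = HTTPServer(("127.0.0.1", 0), NotesHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

head_result = call(port, "HEAD")
post_result = call(port, "POST", body="second note")
get_result = call(port, "GET")
server.shutdown()
server.server_close()
```

Each call uses its own connection, so the only thing linking the POST to the later GET is the state of the resource on the server, not any memory of the client.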
HTML
The Hypertext Markup Language (HTML) is a markup language used to create hypertext documents
that are platform independent (Yergeau F. et al., 1997). The difference between XML and HTML is that
HTML has a fixed set of tags, defined by the W3C, that every browser is expected to support, whereas XML is
extensible: its markup is customizable. XML is typically used to hold and describe data.
IV. Future of the World Wide Web
Say you had some lingering back pain: a program might determine a specialist's availability, check
an insurance site's database for in-plan status, consult your calendar, and schedule an appointment. Another
program might look up restaurant reviews, check a map database, cross-reference open table times with your
calendar, and make a dinner reservation. Tim Berners-Lee and others describe this as a “web of data”. This will be
the new Web, capable of supporting software agents that are able not only to locate data but also to
“understand” it in ways that will allow computers to perform meaningful tasks with data automatically on the
fly (Updegrove, 2001).
The Semantic Web is a web of data. It is about common formats for the integration and combination of
data drawn from diverse sources, whereas the original Web mainly concentrated on the interchange of
documents. It is also about a language for recording how the data relates to real-world objects. That allows a
person, or a machine, to start off in one database, and then move through an unending set of databases which
are connected not by wires but by being about the same thing (Herman I., 2009).
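As a rough sketch of data sets being connected "by being about the same thing", the example below joins two independent sources purely through a shared identifier. The URIs, the predicate names and the two data sets are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

BOB = "http://www.example.com/people/bob"  # hypothetical shared identifier

# Two silos that never heard of each other but name the same thing.
CALENDAR_DATA = [(BOB, "hasAppointment", "2009-03-02T10:00")]
CLINIC_DATA = [(BOB, "seesSpecialist", "http://www.example.com/doctors/7")]

def merge_by_subject(*sources):
    """Group (subject, predicate, object) triples from all sources by subject."""
    merged = defaultdict(lambda: defaultdict(list))
    for source in sources:
        for subject, predicate, obj in source:
            merged[subject][predicate].append(obj)
    return {subject: dict(props) for subject, props in merged.items()}

combined = merge_by_subject(CALENDAR_DATA, CLINIC_DATA)
```

No schema had to be merged beforehand; agreeing on the identifier was enough to combine the two descriptions.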
Representational State Transfer (REST) is an architectural style for distributed hypermedia systems.
Fielding's dissertation describes the software engineering principles guiding REST and the interaction
constraints chosen to retain those principles, while contrasting them to the constraints of other
architectural styles (Fielding R., 2000).
The fundamental difference between the two is this: the Semantic Web is an integration solution (a
solution to information silos), while REST is a set of state transfer operations universal to any data storage
and retrieval system (Battle R. & Benson E., 2007). The Semantic Web provides ways to semantically describe
and align data from disparate sources, while REST offers resource data access operations commonly known
as CRUD (Create, Read, Update and Delete).
Moving from the traditional “web of pages” to a “web of data”, the goal of the Semantic Web is to provide a
cost-efficient way of sharing machine-readable data. The business of sharing machine-readable data in general
has been around for quite some time, and information silos have always been a challenge that researchers and IT
practitioners are keen to solve.
Service-oriented architecture (SOA) solutions have been created to satisfy business goals that
include easy and flexible integration with legacy systems, streamlined business processes, reduced costs,
innovative service to customers, and agile adaptation and reaction to opportunities and competitive threats.
SOA is a popular architecture paradigm for designing and developing distributed systems (Bianco P. et al.,
2007). In spite of the popularity of SOA and Web services, confusion among software developers is
prevalent. To shed some light: SOA is an architectural style, whereas Web services are a technology used to
implement SOAs.
Web services provide a standard means of interoperating between different software applications,
running on a variety of platforms and/or frameworks (W3C, 2004). Web services technology consists of
several published standards, the most important ones being SOAP, XML (Extensible Markup Language) and
WSDL (Web Services Description Language). There are other technologies, such as CORBA and
Jini, but to limit our discussion we are concerned only with Web services, as the others do not apply to the Web
domain.
At the heart of the Service Oriented Architecture is the service contract. It answers the question,
"what service is delivered to the customer?" In the current web-services stack, WSDL is used to define this
contract. However, WSDL defines only the operational signature of the service interface and is too brittle to
support discovery in a scalable way. “SOAP” is no longer an acronym. A SOAP message represents the
information needed to invoke a service or reflect the results of a service invocation, and contains the
information specified in the service interface definition (W3C, 2004). Extensible Markup Language (XML)
documents are made up of storage units called entities, which contain either parsed or unparsed data (Cowan
J., 2008). SOAP and WSDL are good examples of XML documents.
As mentioned earlier, Web services are just one road to a Service Oriented Architecture.
The concept of a “web of data” was also introduced as a solution to information silos, and it establishes
the rationale for Web-accessible APIs (Application Programming Interfaces). Technically speaking, a Web
service is a Web-accessible API. So why is there a need for REST and the Semantic Web?
There is a great amount of data available through REST and SOAP Web services published by the
private and public sectors; however, these data carry no markup that conforms to semantic standards. It is
important to provide markup in a form that Semantic Web applications understand, to make the
services compatible and to make semantic query operations feasible (Battle R. & Benson E., 2007).
In traditional database systems, we have SQL (Structured Query Language), the language
used to interact with the database. In the Semantic Web world, there is a technology called SPARQL
(SPARQL Protocol and RDF Query Language). SPARQL can be used to express queries across diverse data
sources, whether the data is stored natively as RDF or viewed as RDF via middleware. SPARQL contains
capabilities for querying required and optional graph patterns along with their conjunctions and disjunctions.
SPARQL also supports extensible value testing and constraining queries by source RDF graph. The results
of SPARQL queries can be result sets or RDF graphs (W3C, 2008a).
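The idea of required and optional graph patterns can be sketched without any RDF library. The toy matcher below is not SPARQL and not a real query engine; it only imitates how a query of the form SELECT ?name ?mbox WHERE { ?person foaf:name ?name . OPTIONAL { ?person foaf:mbox ?mbox } } would bind variables. The triples and all function names are assumptions made for this illustration.

```python
TRIPLES = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:bob", "foaf:name", "Bob"),
]

def match(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):                      # a variable
            if result.setdefault(term, value) != value:
                return None
        elif term != value:                           # a constant
            return None
    return result

def solve(patterns, triples, binding=None):
    """Yield every binding that satisfies all patterns (a conjunction)."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    for triple in triples:
        extended = match(patterns[0], triple, binding)
        if extended is not None:
            yield from solve(patterns[1:], triples, extended)

def query(required, optional, triples):
    """Required patterns must match; optional ones add bindings when they can."""
    rows = []
    for row in solve(required, triples):
        extras = list(solve(optional, triples, row))
        rows.extend(extras or [row])
    return rows

rows = query(
    required=[("?person", "foaf:name", "?name")],
    optional=[("?person", "foaf:mbox", "?mbox")],
    triples=TRIPLES,
)
```

Alice comes back with a mailbox bound and Bob without one, which is the behaviour an optional pattern is meant to give: missing data does not eliminate the row.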
It is getting more interesting now that developers have to learn not only SQL but also SPARQL.
Thanks are due to Edgar Frank Codd, the inventor of the relational database model; without his invention we
would still be using the filing cabinet and sorting and searching records manually.
As noted above, services become compatible and semantic queries become feasible only when the markup is in a
form that the Semantic Web can understand. With the existing SOAP Web services,
what needs to be done is to add semantic information to the service descriptions, using technologies such as
OWL-S and SAWSDL. They provide details for each Web service parameter that describe how its value is derived
from an ontology (Battle R. & Benson E., 2007). The OWL-S document maps each operation and message defined in
the WSDL definition to an ontology. The problem with SPARQL, however, is that it is a query language designed
for RDF. Since most Web services return plain old XML, a conversion process from XML data to RDF is
needed.
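A minimal sketch of such a conversion is given below. It assumes a flat XML response whose root element carries an id attribute, and it maps each child element to one triple with a literal object. The sample XML, the BASE and VOCAB namespaces and the function names are hypothetical; real mappings (and proper escaping of literals) are considerably more involved.

```python
import xml.etree.ElementTree as ET

# Hypothetical "plain old XML" as a Web service might return it.
XML_RESPONSE = """
<book id="42">
  <title>Example Title</title>
  <author>Example Author</author>
</book>
"""

BASE = "http://www.example.com/"          # hypothetical namespace for subjects
VOCAB = "http://www.example.com/vocab#"   # hypothetical property vocabulary

def xml_to_triples(xml_text):
    """Map each child of the root element to (subject, predicate, literal)."""
    root = ET.fromstring(xml_text.strip())
    subject = f"{BASE}{root.tag}/{root.get('id')}"
    return [
        (subject, VOCAB + child.tag, (child.text or "").strip())
        for child in root
    ]

def to_ntriples(triples):
    """Write the triples in N-Triples syntax (literal escaping is omitted)."""
    return "\n".join(f'<{s}> <{p}> "{o}" .' for s, p, o in triples)

triples = xml_to_triples(XML_RESPONSE)
ntriples = to_ntriples(triples)
```

Once the data is in triple form, it can be loaded into an RDF store and queried with SPARQL alongside data from other sources.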
So where does REST come into play? REST is another road to SOA, and a principle that is being
applied in quite a few Web service implementations. While SOAP-based services have a WSDL
document that defines their operations, there is no standard equivalent for REST services. This is something
that companies adopting REST early must be aware of. Who knows: if companies become
convinced that this is the right way of doing Web services, then maybe in the future there will be a standard
way of implementing them.
V. Discussion
a) From the REST and Semantic Web points of view, there is no difference between slash-based and
parameter-based URI references.
There are two major requirements that the naming of resources on the Semantic Web must
meet. First, a description of the identified resource should be retrievable with standard Web
technologies. Second, a naming scheme should not confuse things and the documents representing them
(Battle R. & Benson E., 2007). Both REST and the Semantic Web support the idea that “Cool URIs don't
change”. Tim Berners-Lee explained that the best resource identifiers don't just provide descriptions for
people and machines, but are designed with simplicity, stability and manageability in mind. According to
the generic URI syntax (Berners T., Fielding R. & Masinter L., 2005), a URI consists of a hierarchical
sequence of components: scheme, authority, path, query, and fragment.
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
The following are two example URIs and their component parts:
         foo://example.com:8042/over/there?name=ferret#nose
         \_/   \______________/\_________/ \_________/ \__/
          |           |            |            |        |
       scheme     authority       path        query   fragment
          |   _____________________|__
         / \ /                        \
         urn:example:animal:ferret:nose
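The same decomposition can be reproduced with Python's standard urllib.parse module, as a quick check of the component names. The helper uri_components is a name introduced for this sketch; note that urlsplit calls the authority component netloc.

```python
from urllib.parse import urlsplit

def uri_components(uri):
    """Split a URI into the five generic components named in the syntax."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }

hierarchical = uri_components("foo://example.com:8042/over/there?name=ferret#nose")
name_only = uri_components("urn:example:animal:ferret:nose")
```

For the URN, everything after the scheme is reported as the path, and the authority, query and fragment are empty, which matches the lower half of the diagram.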
The W3C recommends the use of standard session mechanisms instead of session-based
URIs (W3C, 2003). What does this mean? HTTP/1.1 provides a number of mechanisms for
identification, authentication and session management. Using these mechanisms instead of user-based
or session-based URIs guarantees that the URIs used to serve resources are truly universal
(allowing, for example, people to share, send, or copy them).
For example: Bob tries to visit http://www.example.com/resource, but since it's a rainy Monday
morning, he gets redirected to http://www.example.com/rainymondaymorning/resource. The day
after, when Bob tries to access the resource he had bookmarked earlier, the server answers that
Bob has made a bad request and serves http://www.example.com/error/thisisnotmondayanymore.
Had the server served back http://www.example.com/resource because the Monday session had
expired, it would have been, if not acceptable, at least harmless. The problem with session-based URIs is
that they do not guarantee that the URIs used are truly universal. The acceptable practice in this
situation is to use modifiers such as "?", used to pass arguments for CGI, or ";", used to pass other
kinds of arguments or context information.
Roy Fielding's dissertation does not say that URIs may not use parameterized references.
Similarly, under the Semantic Web requirements, as long as an identifier conforms to the
two major requirements above and to the URI specification, its use is acceptable.
Both REST and the Semantic Web consistently stress the need for URIs to be abstractions.
The key abstraction of information in REST is a resource. Any information that can be
named can be a resource: a document or image, a temporal service (e.g. “today’s weather in Los
Angeles”), a collection of other resources, a non-virtual object (e.g. a person), and so on (Fielding
R., 2000).
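A short sketch can show why the two URI styles are equivalent from the point of view of resource identification: both can be resolved to the same resource key. The /books collection, the id parameter and the resource_key helper are hypothetical names chosen for this illustration.

```python
from urllib.parse import parse_qs, urlsplit

def resource_key(uri):
    """Return (collection, item id) for either URI style, or None."""
    parts = urlsplit(uri)
    segments = [segment for segment in parts.path.split("/") if segment]
    query = parse_qs(parts.query)
    if len(segments) == 2:                      # slash-based: /books/42
        return segments[0], segments[1]
    if len(segments) == 1 and "id" in query:    # parameter-based: /books?id=42
        return segments[0], query["id"][0]
    return None

slash_style = resource_key("http://www.example.com/books/42")
parameter_style = resource_key("http://www.example.com/books?id=42")
```

What matters is that whichever form is published stays stable and always maps to the same resource; the choice of separator is a matter of convention.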
It was mentioned previously that the biggest challenge for search engines today is the
quality of results. Search engine spiders do not presently crawl many types of “dynamic” web
pages. Typical examples of dynamic pages are the internal and external web applications that
companies use to do their business, as well as emerging Web 2.0 sites. For this reason,
it is important that we identify the types of resource and map each to its underlying entity.
Conceptual representation means that a resource is an abstraction of some arbitrary concept.
Once the concept “resource” has been mapped to a physical resource, the mapping should remain stable
for as long as possible. Think about pages that have the .asp extension: links to
those pages may no longer work, since a lot of companies are now moving to .aspx.
b) HTTP is not a data transfer protocol; it is an application protocol (or a coordination language, if you
swing that way). REST does not "run on top of HTTP" but rather HTTP is a protocol that displays
many of the traits of the REST architectural style.
HTTP is not designed to be a transport protocol. It is a transfer protocol in which the
messages reflect the semantics of the Web architecture by performing actions on resources through
the transfer and manipulation of representations of those resources. It is possible to achieve a wide
range of functionality using this very simple interface, but following the interface is required in order
for HTTP semantics to remain visible to intermediaries (Fielding R., 2000).
Conceivably, it is easy to get the wrong idea that REST sits between the application protocol and
the transport protocol when HTTP, the protocol most associated with REST, is described as a “transfer
protocol”. Leveraging the HTTP methods and headers to provide request context around CRUD operations
(POST for Create, GET for Read, PUT for Update and DELETE for Delete) allows developers to overlay the
programmatic API for a website directly on top of the site exposed to web users, and reduces the cost and
complexity of providing multi-format access to a site’s underlying data.
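The mapping between HTTP methods and CRUD operations can be sketched as a small dispatcher over an in-memory store. This is only an illustration under simplifying assumptions: the /items collection and the ResourceStore class are hypothetical, no real HTTP is involved, and the status codes are one reasonable choice rather than the only one.

```python
class ResourceStore:
    """Toy resource collection addressed as /items and /items/<id>."""

    def __init__(self):
        self._items = {}
        self._next_id = 1

    def handle(self, method, path, body=None):
        """Return (status code, payload) for one self-contained request."""
        parts = [part for part in path.split("/") if part]
        if method == "POST" and parts == ["items"]:       # Create
            item_id = str(self._next_id)
            self._next_id += 1
            self._items[item_id] = body
            return 201, f"/items/{item_id}"
        if len(parts) == 2 and parts[0] == "items":
            item_id = parts[1]
            if method == "PUT":                           # Update (or create)
                created = item_id not in self._items
                self._items[item_id] = body
                return (201 if created else 200), body
            if item_id not in self._items:
                return 404, None
            if method == "GET":                           # Read
                return 200, self._items[item_id]
            if method == "DELETE":                        # Delete
                del self._items[item_id]
                return 204, None
        return 405, None

store = ResourceStore()
created = store.handle("POST", "/items", "draft")
read = store.handle("GET", "/items/1")
updated = store.handle("PUT", "/items/1", "final")
deleted = store.handle("DELETE", "/items/1")
missing = store.handle("GET", "/items/1")
```

The same URIs that a browser would visit double as the programmatic interface; only the method changes the meaning of the request.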
c) What is the function of Extensible Markup Language (XML) in REST and the Semantic Web? Is it true that
most REST services in deployment do not return XML but rather HTML? Is it true that REST has no
preference for XML?
REST’s data elements are summarized in Table 1.
Table 1 REST Data Elements
Data Element              Modern Web Examples
resource                  the intended conceptual target of a hypertext reference
resource identifier       URL, URN
representation            HTML document, JPEG image
representation metadata   media type, last-modified time
resource metadata         source link, alternates, vary
control data              if-modified-since, cache-control
It is true that Fielding did not list XML among the examples in Table 1; on
the other hand, he did mention the representation media type, which is the data format of the
representation. He described a representation as consisting of data, metadata describing the data, and,
on occasion, metadata to describe the metadata. As mentioned previously, XML is an open standard for
describing data. What, then, of the claim that “REST has no preference for XML”? It does not hold if
you are building Web services, but it does hold if you are just creating pages that are not designed for
machine interpretation: why would you care about returning XML if you don’t need it in the first place?
Likewise, if you want to share your data, how do you want your data represented? As delimited
text?
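The point that a representation's media type is a choice separate from the resource itself can be sketched as follows. One hypothetical resource is rendered as XML, HTML or delimited text depending on a simplified Accept header; the BOOK data and all function names are assumptions made for this illustration, and quality values (q=) in the header are ignored.

```python
import xml.etree.ElementTree as ET
from html import escape

BOOK = {"title": "Example Title", "author": "Example Author"}  # hypothetical state

def as_xml(resource):
    root = ET.Element("book")
    for name, value in resource.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")

def as_html(resource):
    items = "".join(
        f"<li>{escape(name)}: {escape(value)}</li>"
        for name, value in resource.items()
    )
    return f"<ul>{items}</ul>"

def as_delimited(resource):
    return ",".join(resource.values())

RENDERERS = {
    "application/xml": as_xml,
    "text/html": as_html,
    "text/plain": as_delimited,
}

def represent(resource, accept):
    """Use the first listed media type that we can produce, else None."""
    for part in accept.split(","):
        media_type = part.split(";")[0].strip()
        if media_type in RENDERERS:
            return media_type, RENDERERS[media_type](resource)
    return None

for_machines = represent(BOOK, "application/xml")
for_browsers = represent(BOOK, "text/html, application/xml;q=0.9")
```

The resource and its URI stay the same in every case; XML is simply the representation that suits clients meant to process the data rather than display it.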
The idea of REST and the Semantic Web is to coexist with the existing web standards and not to
disqualify any of them; this is, in fact, the idea of platform and language interoperability.
VI. Conclusion
Acquiring data is becoming easier than ever before, and with the latest technology
breakthroughs there is no doubt that in time the internet will be “all in one”. Along the way, there will
be adjustments and corrections to make and misconceptions to address (intentional or
not) before we reach this so-called “web of data”. Support from the industry players is crucial. Data
security has always been a primary concern. There is a plethora of questions that must
be addressed, such as the following: a) Who will annotate the data? b) What is the advantage of giving
away data, when we all know that data is a valuable commodity? c) Without any centralized control, how
will all this data be connected? d) Will the existing AI techniques be sufficient to
process this huge amount of data? In addition, is it even practical to pursue this route?
On the other hand, the Web has exploded rapidly from a beginning in which the only concern
was the sharing of documents. Giving credit to the early contributors who brought us this far, I
firmly believe that the simple approach of the existing implementation of the Web, particularly the URL (which
was originally designed as the URI), opened the door to everybody (computer savvy or not).
Explaining URIs to ordinary people and getting it right the first time is not simple. Philosophically
speaking, isn’t that also what the concept of “universality of access” is about? In the same way, when you
write software, it is not always right the first time. Writing software is an evolving process.
To reconcile both views, I believe that there is not much we can do about why things were
done the URL way, but I do recognize that there is always room for improvement. With a better
understanding of what went before and what is about to come, moving towards the future of the Semantic
Web is up to us to consider.
VII. References
Berners T., Fielding R. & Masinter L. (2005) Uniform Resource Identifier (URI): Generic Syntax.
Retrieved Feb 13, 2009 from http://labs.apache.org/webarch/uri/rfc/rfc3986.html#URLvsURN
Berners T. (2002). What do HTTP URIs Identify?
Retrieved Feb 20, 2009 from http://www.w3.org/DesignIssues/Overview.html
Berners T. (1996a). The World Wide Web: Past, Present and Future. Retrieved on Feb 20, 2009 from
http://www.w3.org/People/Berners-Lee/1996/ppf.html
Berners T., Fielding R & H. Frystyk (1996b) Hypertext Transfer Protocol -- HTTP/1.0. Retrieved Feb
13, 2009 from http://www.ietf.org/rfc/rfc1945.txt
Battle R. Benson E, (2007). Bridging the semantic Web and Web 2.0 with Representational.
State Transfer (REST). Retrieved Feb 13,2009 from
http://omescigil.etu.edu.tr/semanticweb/papers/sw_4.pdf
Bianco P., Kotermanski R. Merson P. (2007 ) Evaluating a Service-Oriented Architecture. Retrieved
Feb 13,2009 from http://www.sei.cmu.edu/pub/documents/07.reports/07tr015.pdf
Brin S. & Page L. The Anatomy of a Large-Scale Hypertextual Web Search Engine. Retrieved Feb 20,
2009 from http://infolab.stanford.edu/~backrub/google.html
Cowan J., Fang A., Grosso P., Lanz K., Marcy G., Thompson H., Tobin R., Veillard D., Walsh N.,
Yergeau F. (2008) Extensible Markup Language (XML) 1.0 (Fifth Edition) Retrieved Feb
13,2009 from http://www.w3.org/TR/REC-xml/#sec-intro
Fielding R. (2000). Architectural Styles and the Design of Network-based Software Architectures.
Retrieved Feb 20, 2009 from http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm
Fielding R., Mogul J., Gettys J., Frystyk H., Masinter L., Leach P. & Berners T. (1999) Hypertext
Transfer Protocol HTTP/1.1. Retrieved Feb 13,2009 from http://www.ietf.org/rfc/rfc2616.txt
Herman I. (2009). W3C Semantic Web Activity. Retrieved Feb 20, 2009
from http://www.w3.org/2001/sw/
The Economist (2003). Knowledge is power. Retrieved Feb 20, 2009 from
http://www.economist.com/business/globalexecutive/education/displayStory.cfm?story_id=17626
Updegrove A. (2001). The Semantic Web: An Interview with Tim Berners-Lee.
Retrieved Feb 13, 2009 from http://www.consortiuminfo.org/bulletins/semanticweb.php
W3C (2008a). SPARQL Query Language for RDF. Retrieved Feb 13, 2009 from
http://www.w3.org/TR/rdf-sparql-query/
W3C (2008b). Cool URIs for the Semantic Web. Retrieved Feb 13, 2009 from
http://www.w3.org/TR/2008/WD-cooluris-20080321/
W3C (2004). Web Services Architecture. Retrieved Feb 13, 2009 from
http://www.w3.org/TR/ws-arch/#id2260892
W3C (2003). Common HTTP Implementation Problems.
http://www.w3.org/TR/2003/NOTE-chips-20030128/
Yergeau F., Nicol G., Adams G. & Duerst M. (1997). Internationalization of the Hypertext Markup
Language. http://www.rfc-editor.org/rfc/rfc2070.txt

Semantic Web and REST

  • 1. STRAYER UNIVERSTY NETWORK ARCHITECTURE AND ANALYSIS SUBMITTED TO DR. BRYANT PAYDEN BY ADELMAR ESPLANA GRADUATE STUDENT IN MASTER OF SCIENCE IN INFORMATION SYSTEMS MARCH 2009
  • 2. 2 I. Overview “Knowledge is power (But only if you know how to acquire it)." (The Economist, 2003) Knowledge is awareness and understanding of facts, truth or information in the form of experiences and learning. Today, one of the quickest ways to acquire knowledge is through the Web or WWW (World Wide Web). Using search engine facilities, we can search almost any kind of information. The internet has become a world- wide network of information resource and a powerful communication tool. And the more specific word you type in, in the search engine, the more accurate results you would get. The internet is a powerful tool which may be used in a number of ways such as having online classes, bills payment, booking vacation, scheduling medical appointments, finding the latest news and even taking a virtual tour on vacation destinations. In spite of all these tremendous capabilities that the Web offers, to date, the principle that is being used by search engines in the analysis of material is based on textual indexing. Although search engines have proven remarkably useful in finding information rapidly, they have also been proven remarkably useless in providing quality information. “The biggest problem facing users of web search engines today is the quality of the results they get back.”(Brin S. & Page L.) Currently, search engines can only account for the vocabulary of the documents and has a little concept of document quality thereby producing a lot of junk. Since its inception in 1989, the internet has grown initially as a medium for the broadcast of read- only material, from heavily loaded corporate servers, to the mass of internet-connected consumers. Recently, the rise of digital video, blogging, podcasting, and social media has revolutionized the way people socially interact through the web. 
Other promising developments include the increasing interactive nature of the interface to the user, and the increasing use of machine-readable information with defined semantics allowing more advanced machine processing of global information, including machine-readable signed assertions. (Berners T., 1996a) A lot of the data that we use everyday are not part of the web. These data which are in various silos, reside in different software applications and different places that cannot be connected easily like bank statements, photographs and calendar appointments. In order for us to see bank statements, we have to use online banking; to manage our calendar appointments and organize our photo collection, we have to use
  • 3. 3 social-oriented websites such as MySpace and Facebook. It is interesting to see on the web photos displayed on a calendar, showing exactly what we were doing when we took them; and bank statement, showing the list of transactions that we incurred. Due to the silo of information where both personal and business information are managed by different software, we are forced to learn how to use different programs all the time. In addition, since data are controlled by applications, each application must keep the data to itself. According to W3C Group, what we need is a “web of data” (Herman I., 2009). Supported by Fielding’s dissertation, he mentioned that what we need is a way to store and structure our own information, whether permanent or temporal in nature, both for our own necessity and for others, be able to reference and structure the information stored by others so that it would not be necessary for everyone to keep and maintain local copies. (Fielding R., 2000) II. Purpose of the Paper Understanding the underlying concepts and principles behind the web is essential to current and future implementation initiatives. For this reason, it is the objective of this paper to uncover the root of its existence, and to examine the fundamental design notion of the following design principles: Independent specification design, Hypertext Transfer Protocol (HTTP), Uniform Resource Identifier (URI) and Hypertext Markup Language (HTML). This study also aims to develop a better understanding of the emerging web standards, such as REST, SOA, and Semantic Web. The paper discusses some of the misconceptions about URI, HTTP and XML and the following issues: a) In REST and Semantic point of view, there is no difference between slash based and parameter based URI reference; b) HTTP is not a data transfer protocol; it is an application protocol (or a coordination language, if you swing that way). 
REST does not "run on top of HTTP" but rather HTTP is a protocol that displays many of the traits of the REST architectural style; c) What is Extensible Markup Language (XML) function in Representational state transfer (REST) and Semantic Web? Is it true that most REST services in deployment do not return XML but rather HTML? Is it true that REST has no preference for XML?
  • 4. 4 III. Foundation of the World Wide Web According to Tim Berners, “The goal of the Web was to be a shared information space through which people (and machines) could communicate.” (Berners T., 1996a) It was also the original intent that this so called “space” ought to span all sorts of information, from different sources to a wide array of formats, and from highly valued designed material to a spontaneous idea. In the original design of the web, he stated the following fundamental design criteria: (Berners T., 1996a) a) An information system must be able to record random associations between any arbitrary objects, unlike most database systems The concept of database systems has been purposely utilized to facilitate storage, retrieval and information generation of structured data. Unlike, the web concept of “one universal space of information” which is based on the principle that almost anything on the web could be possibly linked to any arbitrary objects. The power of the Web is that linkage can be established to any document (or, more generally, resource) of any kind in the universe of information, whereas in the database systems, one has to understand the data structure to establish the relationship. b) If two sets of users started to use the system independently, to make a link from one system to another should be an incremental effort, not requiring unscalable operations such as the merging of link databases In the business environment, to integrate two different types of systems, it is necessary to perform some degrees of integration efforts such as merging, importing or linking of databases. On the contrary, the idea of web was to be able integrate systems easily. Most of the systems done in the past involve a great deal of integration effort due to the information silo. For this reason, the idea where machine can talk to each other set forth the promise of seamless integration. 
c) Any attempt to constrain users as a whole to the use of particular languages or operating systems was always doomed to fail. Information must be available on all platforms, including future ones. Platform and language interoperability support the principles of universality of access
  • 5. 5 irrespective of hardware or software platform, network infrastructure, language, culture, geographical location, or physical or mental impairment. d) Any attempt to constrain the mental model users of data into a given pattern was always doomed to fail. If information within an organization is to be accurately represented in the system, entering or correcting it must be trivial for the person directly knowledgeable If the interaction between person and hypertext could be so intuitive that the machine- readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations. Independent specification design The basic principles of the Web proposed in 1989 to meet the design criteria were adopted based on the well-known software design principles called “independent specification design”. This design was based on the principle of modularity. Meaning when it is modular in nature, the interfaces between the modules hinge on simplicity and abstraction. This allows seamless compatibility of the existing content, to work with the new implementation. As technology evolves and disappears, specifications for the Web’s languages and protocols should be able to adapt to the new hardware and software changes. Along with this basic principle are the three main components such as URI, HTTP and HTML. URI or Universal Resource Identifier URI is a compact string of characters for identifying abstract or physical resource. It is a simple and extensible means of identifying a resource. A URI can be further classified as a locator, a name, or both. 
The term "Uniform Resource Locator" (URL) refers to the subset of URI that identifies resources via a representation of their primary access mechanism, rather than identifying the resource by name or by some other attribute(s) of that resource (Berners T., Fielding R. & L. Masinter, 2005). URNs (Uniform Resource Names) are used for identification; URCs (Uniform Resource Characteristics), for including meta- information; and URLs, for locating or finding resources.
  • 6. 6 REST defines URI as a resource based on a simple premise that identifiers should change as infrequently as possible (Fielding R., 2000). While Semantic Web identifies URI’s not just Web documents, but rather real-world objects like people, cars, and abstract ideas. They call all these as real-world objects or things (W3C, 2008b). Deriving URI definitions from the meaning of each letters U -Uniform, R-Resource and I-Identifier as listed below: Uniform allows consistency of its usage, even when the internal mechanism of accessing the resources has changed. It allows common semantic interpretation of syntactic conventions, across different type of resources to work with the existing identifiers. Resources are, in general, any real world “thing” such as electronic documents, images and services, recognized by URI to represent something, for example, electronic document, an image, or a source of information with consistent purpose. Other resources that are not accessible via internet are representation of the abstract concepts, mathematical equations, correlation (e.g., “parent” or “employee”) and values (e.g., zero, one, and infinity). Identifier pertains to information required to distinguish what is being identified from all other things within its scope of identification. The terms “identify” and “identifying” means distinguishing one resource from the other regardless how that purpose is accomplished. One of the capabilities web popularized is the ability of documents to link to any kind, in the universe of information. With this in mind, the concept of “identity” is concerned with the conceptual scheme of identifying objects generically. For example, one URI can represent a book which is available in several languages and several data format. HTTP and URIs are the basis of the World Wide Web, yet they are often misunderstood, and their implementations and uses are sometimes incomplete or incorrect (W3C, 2003). 
a) A common mistake, responsible for many implementation problems, is to think that a URI is equivalent to a filename within a computer system. This is wrong as URIs have, conceptually, nothing to do with a file system.
  • 7. 7 b) A URI should not show the underlying technology (server-side content generation engine, script written in such or such language) used to serve the resource. Using URIs to show the specific underlying technology means one is dependent on the technology used, which, in turn, means that the technology cannot be changed without either breaking URIs or going through the hassle of "fixing" them. HTTP According to the HTTP 1.0 specification, The Hypertext Transfer Protocol (HTTP) is an application- level protocol with the lightness and speed necessary for distributed, collaborative, hypermedia information systems. It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name servers and distributed object management systems, through extension of its request methods (Berners T., Fielding R. & Frystyk H., 1996b). HTTP messages are generic and communication takes place operationally based on the client/server paradigm of request/response. Messages are all created to comply with the generic message format. Clients usually send requests and receive responses, while servers receive requests and send responses. It is stateless and connectionless in nature because after the server has responded to the client's request, the connection between client and server is dropped and forgotten. There is no "memory" between client connections. Basically, when you type in the URL in the browser, the client and server Connection takes place over TCP/IP. This URL internally gets converted into a Request for server to process, after the server finished processing, and then the server sends the message Response back to the client and Closes the connection of both parties. The downside of it is that, it may decrease the network-performance due to the increasing amount of overhead data per request, the fact that the state of request is not stored in a shared context. The design is patterned and implemented with the idea of object-orientation. 
In general, objects used internally for each request are as follows: HTTP messages, Request/Response, Entity, Method Definitions, Status Code Definitions, Status Code Definitions, and Header Field Definitions (based HTTP 1.0 specification).
  • 8. 8 The Method field indicates the method to be performed on the object identified by the URL. Methods supported by HTTP 1.1 specification are OPTIONS, GET, HEAD, POST, PUT, DELETE, TRACE, and CONNECT (Fielding R., 1999). The GET method means to retrieve whatever is identified by the URI. The HEAD is the same as GET but it returns only HTTP headers and no document body. The POST method is used to request that the destination server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line. POST is designed to allow a uniform method to cover the following functions: 1) annotation of existing resources; 2) posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles; 3) providing a block of data, such as the result of submitting a form, to a data-handling process; and 4) extending a database through an append operation. (Berners T., Fielding R & H. Frystyk, 1996b). HTML The Hypertext Markup Language (HTML) is a markup language used to create hypertext documents that are platform independent. (Yergeau F. et.al, 1997) The difference between XML and HTML is that, HTML is defined by W3C that must be followed by every possible browser. XML is an extension, its markup is customizable. It is typically used as storage to hold and describe data. IV. Future of the World Wide Web Say you had some lingering back pain: a program might determine a specialist's availability, check an insurance site's database for in-plan status, consult your calendar, and schedule an appointment. Another program might look up restaurant reviews, check a map database, cross-reference open table times with your calendar, and make a dinner reservation. Tim Berners and others describe this as “web of data”. 
This will be the new Web capable of supporting software agents that are able not only to locate data, but also to “understand” in ways that will allow computers to perform meaningful tasks with data automatically on the fly (Updegrove, 2001). The Semantic Web is a web of data. It is about common formats for integration and combination of data drawn from diverse sources, where on the original Web mainly concentrated on the interchange of
  • 9. 9 documents. It is also about language for recording how the data relates to real world objects that allows a person, or a machine, to start off in one database, and then move through an unending set of databases which are connected not by wires but by being about the same thing (Herman I., 2009). Representational State Transfer (REST) is an architectural style for distributed hypermedia systems, describing the software engineering principles guiding REST and the interaction constraints chosen to retain those principles, while contrasting them to the constraints of other architectural styles (Fielding R., 2000). The fundamental differences between the two are: Semantic Web is an integration solution (a solution to information silo), while REST is a set of state transfer operations universal to any data storage and retrieval system (Battle R. & Benson E., 2007). Semantic Web provides ways to semantically describe and align data from desperate sources while REST offers resource data access operations commonly known as CRUD (Create, Read, Update and Delete). From the traditional “web of pages” to a “web of data”, the Semantic Web goal is to provide a cost- efficient way of sharing machine-readable data. The business of sharing machine-readable data in general has been around for quite some time. Information silo has always been a challenge that researchers and IT practitioners are keen about. Service-oriented architecture (SOA) solutions have been created to satisfy business goals that include easy and flexible integration with legacy systems, streamlined business processes, reduced costs, innovative service to customers, and agile adaptation and reaction to opportunities and competitive threats. SOA is a popular architecture paradigm for designing and developing distributed systems (Bianco P. et al., 2007). In spite of the popularity of SOA and Web Services, confusion among software developers is prevalent. 
To shed a light, SOA is an architectural style, whereas Web Services is a technology used to implement SOA’s. Web services provide a standard means of interoperating between different software applications, running on a variety of platforms and/or frameworks (W3C, 2004). The Web services technology consists of several published standards, the most important ones being SOAP, XML (Extensible Markup Language) and WSDL (Web Services Description Language). Although there are some other technologies like CORBA and
  • 10. 10 Jini but to limit our discussion, we are only concerned with the Web Services as other do not apply to Web domain. At the heart of the Service Oriented Architecture is the service contract. It answers the question, "what service is delivered to the customer?" In the current web-services stack, WSDL is used to define this contract. However, WSDL defines only the operational signature of the service interface and is too brittle to support discovery in a scalable way. "SOAP” is no longer an acronym, A SOAP message represents the information needed to invoke a service or reflect the results of a service invocation, and contains the information specified in the service interface definition (W3C, 2004). Extensible Markup Language (XML) documents are made up of storage units called entities, which contain either parsed or unparsed data. (Cowan J., 2008). SOAP and WSDL are good examples of XML documents. As mentioned earlier, Web services way is just another roadmap of Service Oriented Architecture. The concept of “web of data” was also introduced as a solution for information silo and, was able to establish the rationale for Web-accessible API (Application Programming Interface). Technically speaking, a Web service is a Web-accessible API. So, why is there a need for REST and Web Semantics? There is a great amount of data available through REST and SOAP Web Services, published by private and public sectors however these data carry no markup that conforms to semantic standards. It is important to provide markup in a manner where Semantic Web application suite understands to make the services compatible and to allow semantic query operations feasible (Battle R. & Benson E, 2007). In the traditional Database Systems, we have SQL (Structured Query Language). It is the language used to interact with the database. In the Semantic world, there is this technology called SPARQL (SPARQL Protocol and RDF Query Language). 
SPARQL can be used to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware. SPARQL contains capabilities for querying required and optional graph patterns along with their conjunctions and disjunctions. SPARQL also supports extensible value testing and constraining queries by source RDF graph. The results of SPARQL queries can be results sets or RDF graphs (W3C, 2008a).
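To make the idea of data "viewed as RDF via middleware" and of required and optional graph patterns more concrete, the following is a minimal sketch in Python. It is not a SPARQL engine and uses no RDF library. The XML payload, the BASE namespace and the helper names xml_to_triples and match are all assumptions made for illustration. In SPARQL itself, the optional part would be written with an OPTIONAL clause.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML payload, standing in for a plain-XML Web service response.
XML_DATA = """
<people>
  <person id="alice"><name>Alice</name><email>alice@example.org</email></person>
  <person id="bob"><name>Bob</name></person>
</people>
"""

BASE = "http://example.org/person/"  # assumed namespace for subject URIs


def xml_to_triples(xml_text):
    """Middleware step: view each child of <person> as a (subject, predicate, object) triple."""
    triples = []
    for person in ET.fromstring(xml_text).findall("person"):
        subject = BASE + person.get("id")
        for field in person:
            triples.append((subject, field.tag, field.text))
    return triples


def match(triples, required, optional=()):
    """Return one row per subject that has every required predicate.

    Optional predicates are bound when present and left as None otherwise,
    loosely mirroring required and optional graph patterns.
    """
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, {})[p] = o
    results = []
    for s, props in by_subject.items():
        if all(p in props for p in required):
            row = {"subject": s}
            row.update({p: props[p] for p in required})
            row.update({p: props.get(p) for p in optional})
            results.append(row)
    return results


triples = xml_to_triples(XML_DATA)
rows = match(triples, required=["name"], optional=["email"])
for row in rows:
    print(row)
```

Both people are returned because both have a name; the record without an email address simply carries an unbound (None) value for that optional field instead of being dropped.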
It is now getting more interesting, in that developers have to learn not only SQL but also SPARQL. Thanks are due to Edgar Frank Codd, the inventor of the relational database model; without his invention we would still be using filing cabinets and sorting and searching records by hand. With the existing SOAP Web services, what needs to be done is to add semantic information to the services, using technologies such as OWL-S and SAWSDL. They provide details for each Web service parameter that describe how the value is derived from an ontology (Battle R. & Benson E., 2007). The OWL-S document maps each operation and message defined in the WSDL definition to an ontology. The problem with SPARQL, however, is that it is a query language designed for RDF. Since most Web services return plain old XML, a conversion process from XML data to RDF is needed.

So where does REST come into play? REST is another roadmap to SOA and a principle that is being applied in quite a few Web services implementations. While SOAP-based services have a WSDL document that defines their operations, there is no standard equivalent for REST services. This is an area that companies which adopted REST early must be aware of. If companies become convinced that this is the right way of doing Web services, then perhaps in the future there will be a standard way of implementation.

V. Discussion

a) From the REST and Semantic Web point of view, there is no difference between slash-based and parameter-based URI references. There are two major requirements on the Semantic Web that the naming of resources must follow. First, a description of the identified resource should be retrievable with standard Web technologies. Second, a naming scheme should not confuse things and the documents representing them (Battle R. & Benson E., 2007).
Both REST and the Semantic Web support the idea that "Cool URIs don't change". Tim Berners-Lee explained that the best resource identifiers don't just provide descriptions for people and machines, but are designed with simplicity, stability and manageability in mind. Based on the URI standard (RFC 3986), the generic URI syntax consists of a hierarchical sequence of components referred to as the scheme, authority, path, query, and fragment.

    URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

The following are two example URIs and their component parts:

      foo://example.com:8042/over/there?name=ferret#nose
      \_/   \______________/\_________/ \_________/ \__/
       |           |            |            |        |
    scheme     authority       path        query   fragment
       |   _____________________|__
      / \ /                        \
      urn:example:animal:ferret:nose

The W3C recommends the use of standard session mechanisms instead of session-based URIs (W3C, 2003). What does this mean? HTTP/1.1 provides a number of mechanisms for identification, authentication and session management. Using these mechanisms instead of user-based or session-based URIs guarantees that the URIs used to serve resources are truly universal (allowing, for example, people to share, send, or copy them). For example: Bob tries to visit http://www.example.com/resource, but since it's a rainy Monday morning, he gets redirected to http://www.example.com/rainymondaymorning/resource. The day after, when Bob tries to access the resource he had bookmarked earlier, the server answers that Bob has made a bad request and serves http://www.example.com/error/thisisnotmondayanymore. Had the server served back http://www.example.com/resource because the Monday session had expired, it would have been, if not acceptable, at least harmless. The problem with session-based URIs is that they do not guarantee that the URIs used are truly universal. The acceptable practice in this situation is to use modifiers such as "?", used to pass arguments for CGI, or ";", used to pass other kinds of arguments or context information.
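The component breakdown of the two example URIs can be checked with a short Python sketch built on the standard library's urllib.parse.urlsplit. The helper name components is an assumption made for illustration, and urlsplit is used here only as a convenient approximation of the generic syntax, not as a full validator.

```python
from urllib.parse import urlsplit


def components(uri):
    """Split a URI into the five generic-syntax components."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


hierarchical = components("foo://example.com:8042/over/there?name=ferret#nose")
urn = components("urn:example:animal:ferret:nose")

for label, parsed in (("foo", hierarchical), ("urn", urn)):
    print(label)
    for name, value in parsed.items():
        print(f"  {name:9} {value!r}")
```

Note that the URN has no authority, query or fragment: everything after the scheme is treated as the path, which matches the lower half of the diagram above.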
Roy Fielding did not state in his dissertation that URIs may not use parameterized references. Similarly, the Semantic Web requirements indicate that as long as the identifier conforms to the two major requirements above and to the W3C standard specifications, its use is acceptable. Both REST and the Semantic Web consistently raise the need for abstraction in URIs. The key abstraction of information in REST is a resource. Any information that can be named can be a resource: a document or image, a temporal service (e.g. "today's weather in Los Angeles"), a collection of other resources, a non-virtual object (e.g. a person), and so on (Fielding R., 2000). It was mentioned previously that the biggest challenge for search engines today is the quality of results. Search engine spiders do not presently crawl many types of "dynamic" web pages. Typical examples of dynamic pages are the internal and external web applications that companies use to do their business, as well as emerging Web 2.0 sites. Accordingly, it is important that we identify the types of resource and map each to its underlying entity. Conceptual representation means that a resource is an abstraction of some arbitrary concept. Once the mapping of the concept "resource" to a physical resource is done, it should remain that way for as long as possible. Think about pages that have the .asp extension. Links to those pages may no longer work, since many companies are now moving to .aspx.

b) HTTP is not a data transfer protocol; it is an application protocol (or a coordination language, if you swing that way). REST does not "run on top of HTTP"; rather, HTTP is a protocol that displays many of the traits of the REST architectural style. HTTP is not designed to be a transport protocol.
It is a transfer protocol in which the messages reflect the semantics of the Web architecture by performing actions on resources through the transfer and manipulation of representations of those resources. It is possible to achieve a wide range of functionality using this very simple interface, but following the interface is required in order for HTTP semantics to remain visible to intermediaries (Fielding R., 2000).
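As a rough illustration of "performing actions on resources through the transfer and manipulation of representations", the sketch below models a tiny in-memory server in Python. It follows the conventional mapping of POST, GET, PUT and DELETE onto create, read, update and delete. The class name ResourceStore, the handle method and the choice of status codes are assumptions made for this example; no real HTTP traffic is involved.

```python
class ResourceStore:
    """In-memory stand-in for a server: keys are URI paths, values are representations."""

    def __init__(self):
        self._resources = {}
        self._next_id = 1

    def handle(self, method, path, body=None):
        """Dispatch on the HTTP method name and return (status_code, payload)."""
        if method == "POST":      # Create: the server assigns the new URI
            new_path = f"{path.rstrip('/')}/{self._next_id}"
            self._next_id += 1
            self._resources[new_path] = body
            return 201, new_path
        if method == "GET":       # Read: return the current representation
            if path in self._resources:
                return 200, self._resources[path]
            return 404, None
        if method == "PUT":       # Update, or create at a URI chosen by the client
            created = path not in self._resources
            self._resources[path] = body
            return (201 if created else 200), body
        if method == "DELETE":    # Delete: remove the resource if it exists
            if path in self._resources:
                del self._resources[path]
                return 204, None
            return 404, None
        return 405, None          # Any other method is not allowed in this sketch


store = ResourceStore()
status, uri = store.handle("POST", "/weather", "sunny")
print(status, uri)
print(store.handle("GET", uri))
print(store.handle("PUT", uri, "rainy"))
print(store.handle("DELETE", uri))
print(store.handle("GET", uri))
```

The same four verbs apply to every resource, so a client or intermediary needs no per-service operation list, which is the point of the uniform interface.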
Conceivably, it is easy to get the wrong idea that REST sits between the application protocol and the transport protocol when the term "transfer protocol" is cited in connection with REST. Leveraging the HTTP headers to provide request context around CRUD operations (POST for Create, GET for Read, PUT for Update and DELETE for Delete) allows developers to overlay the programmatic API for a website directly on top of the site exposed to web users, and reduces the cost and complexity of providing multi-format access to a site's underlying data.

c) What is the function of Extensible Markup Language (XML) in REST and the Semantic Web? Is it true that most REST services in deployment do not return XML but rather HTML? Is it true that REST has no preference for XML? REST's data elements are summarized in Table 1.

Table 1. REST Data Elements

    Data Element               Modern Web Examples
    resource                   the intended conceptual target of a hypertext reference
    resource identifier        URL, URN
    representation             HTML document, JPEG image
    representation metadata    media type, last-modified time
    resource metadata          source link, alternates, vary
    control data               if-modified-since, cache-control

It is true that Fielding did not specify XML as an example for resource or resource metadata; on the other hand, he did mention the representation media type, which is the data format of the representation. He described a representation as consisting of data, metadata describing the data, and, on occasion, metadata to describe the metadata. As mentioned previously, XML is an open standard for describing data. Therefore, regarding the claim that "REST has no preference for XML": it is not true if you are building Web services, but it is true if you are just creating pages that are not designed for machine interpretation. Why should you care about returning XML if you don't need it in the first place? Likewise, if you want to share your data, how do you want it represented? As delimited text files? The idea of REST and the Semantic Web is to coexist with the existing web standards and not to disqualify any of them; in fact, it is the idea of platform and language interoperability.

V. CONCLUSION

Acquiring data has become easier than ever before, and with the latest technology breakthroughs there is no doubt that in time the internet will be "all in one". Along the way, there will be adjustments and corrections to make and misconceptions to address (intentional or not) to reach this so-called "web of data". Support from the industry players is crucial. Data security has always been a primary concern. There are a plethora of questions that must be addressed, such as the following: a) Who will annotate the data? b) What is the advantage of giving the data away, given that data is a valuable commodity? c) Without any centralized control, how will all this data be connected to one another? d) Will the existing AI techniques be sufficient to process this huge amount of data? In addition, is it even practical to pursue this route?

On the other hand, the web has exploded rapidly, whereas in the beginning all we were concerned about was the sharing of documents. Giving credit to the early contributors who brought us this far, I firmly believe that the simple approach of the existing implementation of the web, particularly the URL (which was originally designed as the URI), opened the door to everybody (computer savvy or not). Explaining the URI alone to ordinary people, and doing it right the first time, is not simple. Philosophically speaking, isn't that also what the concept of "universality of access" is about? In the same way, when you write software, it is not always right the first time. Writing software is an evolving process.
Weighing both sides, I believe that there is not much we can do concerning why things are done the URL way, but I do recognize that there is always room for improvement. With a better understanding of what went before and what is about to come, moving towards the future of the Semantic Web is up to us to consider.
VI. REFERENCES

Battle R. & Benson E. (2007). Bridging the Semantic Web and Web 2.0 with Representational State Transfer (REST). Retrieved Feb 13, 2009 from http://omescigil.etu.edu.tr/semanticweb/papers/sw_4.pdf

Berners-Lee T. (1996a). The World Wide Web: Past, Present and Future. Retrieved Feb 20, 2009 from http://www.w3.org/People/Berners-Lee/1996/ppf.html

Berners-Lee T., Fielding R. & Frystyk H. (1996b). Hypertext Transfer Protocol -- HTTP/1.0. Retrieved Feb 13, 2009 from http://www.ietf.org/rfc/rfc1945.txt

Berners-Lee T. (2002). What do HTTP URIs Identify? Retrieved Feb 20, 2009 from http://www.w3.org/DesignIssues/Overview.html

Berners-Lee T., Fielding R. & Masinter L. (2005). Uniform Resource Identifier (URI): Generic Syntax. Retrieved Feb 13, 2009 from http://labs.apache.org/webarch/uri/rfc/rfc3986.html#URLvsURN

Bianco P., Kotermanski R. & Merson P. (2007). Evaluating a Service-Oriented Architecture. Retrieved Feb 13, 2009 from http://www.sei.cmu.edu/pub/documents/07.reports/07tr015.pdf

Brin S. & Page L. The Anatomy of a Large-Scale Hypertextual Web Search Engine. Retrieved Feb 20, 2009 from http://infolab.stanford.edu/~backrub/google.html

Cowan J., Fang A., Grosso P., Lanz K., Marcy G., Thompson H., Tobin R., Veillard D., Walsh N. & Yergeau F. (2008). Extensible Markup Language (XML) 1.0 (Fifth Edition). Retrieved Feb 13, 2009 from http://www.w3.org/TR/REC-xml/#sec-intro

Fielding R. (2000). Architectural Styles and the Design of Network-based Software Architectures. Retrieved Feb 20, 2009 from http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm

Fielding R., Mogul J., Gettys J., Frystyk H., Masinter L., Leach P. & Berners-Lee T. (1999). Hypertext Transfer Protocol -- HTTP/1.1. Retrieved Feb 13, 2009 from http://www.ietf.org/rfc/rfc2616.txt

Herman I. (2009). W3C Semantic Web Activity. Retrieved Feb 20, 2009 from http://www.w3.org/2001/sw/

The Economist (2003). Knowledge is power. Retrieved Feb 20, 2009 from http://www.economist.com/business/globalexecutive/education/displayStory.cfm?story_id=17626

Updegrove A. (2001). The Semantic Web: An Interview with Tim Berners-Lee. Retrieved Feb 13, 2009 from http://www.consortiuminfo.org/bulletins/semanticweb.php

W3C (2003). Common HTTP Implementation Problems. http://www.w3.org/TR/2003/NOTE-chips-20030128/

W3C (2004). Web Services Architecture. Retrieved Feb 13, 2009 from http://www.w3.org/TR/ws-arch/#id2260892

W3C (2008a). SPARQL Query Language for RDF. Retrieved Feb 13, 2009 from http://www.w3.org/TR/rdf-sparql-query/

W3C (2008b). Cool URIs for the Semantic Web. Retrieved Feb 13, 2009 from http://www.w3.org/TR/2008/WD-cooluris-20080321/

Yergeau F., Nicol G., Adams G. & Duerst M. (1997). Internationalization of the Hypertext Markup Language. http://www.rfc-editor.org/rfc/rfc2070.txt