Using OpenCalais API in the context of Linked Data

                               Eldorina Andreea Alergus

                     Faculty of Computer Science, Distributed Systems
                        eldorina.alergus@info.uaic.ro



       Abstract. In this paper we discuss the OpenCalais API in the context of
       Linked Data. With the growth of linked datasets, automating tasks such as
       discovering or interlinking data becomes more and more important. In this
       work we survey what OpenCalais offers for linking data.

       Keywords: OpenCalais, linked data, Web of Data



1      Introduction
The OpenCalais Web Service automatically generates rich semantic metadata for the
submitted content. OpenCalais analyses the content using methods such as natural
language processing (NLP) and machine learning, and finds the entities (Company,
Country, City, Product, Movie, etc.) within it. It also finds events (person P was
hired at company C) and facts (person P works for company C) within the text. The
metadata returned in the response is an RDF construct that is also centrally stored.
    The metadata makes it possible to build maps, networks or graphs by
linking documents to people, geographies, places, companies, etc. Those maps can be
used to verify that our content contains what we expect, to tag and organize it,
to create structured folksonomies, or to improve site navigation. We can share
our maps with anyone else in the content ecosystem.
    The Calais ecosystem is exposed via Linked Data endpoints. We use the term
Linked Data to describe a method of exposing, sharing and connecting data on the
Web via dereferenceable URIs [15]. Given linked data, we can find other related
data. This is the idea of the Semantic Web: interlinking data so that a person or a
machine can explore the web of data. The main idea behind Linked Data is
that we may increase the value and the usability of data by connecting it with other
related data.
    Calais is part of the Linked Open Data (LOD) Cloud, and it links to the following
assets: DBpedia, Wikipedia, Freebase, Reuters.com, GeoNames, Shopping.com,
IMDB, LinkedMDB. To understand what Calais offers, we must first
understand the concept of Linked Data.



2        Linked Data
As we said above, Linked Data is the technique of publishing data on the Web and
interlinking data between different sources. The data is machine-readable, its meaning
is explicitly defined, it is linked to other external datasets and can in turn be linked
from external datasets. Linked Data is based on RDF (Resource Description Framework)
documents, which are used to make typed statements that link arbitrary things in the
world. To access the web of data, we use Linked Data browsers (Tabulator,
Disco, RDFViz, BrowseRDF, etc.), which enable navigation between different sources
by following RDF links. For instance, while looking at data about a product, a user may
be interested in information about the company that produces it. By following the
RDF link, the user can navigate to information about that company contained in another
dataset.
    Berners-Lee outlined a set of rules for publishing data on the Web in such a way
that all published data becomes part of a single global data space:
    1.   Name things using URIs (Uniform Resource Identifiers).
    2.   Use HTTP URIs so that people can look up those names.
    3.   When someone looks up a URI, provide useful information using the
         standards (RDF, SPARQL).
    4.   Data should be interlinked with other data.
    These principles provide a basic recipe for publishing and connecting data using
the infrastructure of the Web while adhering to its architecture and standards.
    Linked Data relies on two fundamental technologies: URIs and HTTP. URIs
provide a generic means of identifying any entity. Entities identified by URIs
that use the http:// scheme can be looked up by dereferencing the URI. Dereferencing a
URI is the act of retrieving a representation of the resource identified by that URI [16].
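    As a minimal sketch of what dereferencing looks like from a client, the C#
fragment below builds an HTTP GET request that asks for an RDF/XML representation
of a resource. The class and method names are hypothetical, and it is an assumption
that the server honours the Accept header (content negotiation); the request is
only constructed here, not sent.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

public static class DereferenceSketch
{
    // Builds a GET request for the given HTTP URI that asks the server
    // for an RDF/XML representation of the identified resource.
    public static HttpRequestMessage BuildRdfRequest(string uri)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, new Uri(uri));
        request.Headers.Accept.Add(
            new MediaTypeWithQualityHeaderValue("application/rdf+xml"));
        return request;
    }

    public static void Main()
    {
        // Example URI only; any dereferenceable HTTP URI could be used.
        HttpRequestMessage request =
            BuildRdfRequest("http://dbpedia.org/resource/Paris");

        Console.WriteLine(request.Method + " " + request.RequestUri);
        Console.WriteLine("Accept: " + request.Headers.Accept);
        // Sending the request (for example with HttpClient.SendAsync)
        // would perform the actual dereferencing step.
    }
}
```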




   To URIs and HTTP we add a third technology that is necessary for the Web of Data: RDF.
Just as HTML provides the means to structure and link documents on the
Web, RDF provides a graph-based data model to structure and link data that describes
things.
   In RDF, data takes the form of triples: subject, predicate, object. The subject is a
URI that identifies a resource; the object is either another URI or a literal value such
as a string. The predicate describes how the subject and the object are related, and is
also represented by a URI.
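    To make the triple model concrete, the following sketch represents a triple
as a small C# class and prints it in an N-Triples-like line format. The Triple
class is hypothetical, the literal escaping is deliberately simplified, and the
two statements about Paris are illustrative examples rather than output taken
from any dataset.

```csharp
using System;

// A single RDF statement: subject and predicate are URIs,
// the object is either a URI or a plain string literal.
public sealed class Triple
{
    public Uri Subject { get; }
    public Uri Predicate { get; }
    public string Object { get; }
    public bool ObjectIsLiteral { get; }

    public Triple(Uri subject, Uri predicate, string obj, bool objectIsLiteral)
    {
        Subject = subject;
        Predicate = predicate;
        Object = obj;
        ObjectIsLiteral = objectIsLiteral;
    }

    // Renders the triple as one line: <s> <p> <o> .  or  <s> <p> "literal" .
    // Only backslashes and double quotes are escaped in literals.
    public override string ToString()
    {
        string o = ObjectIsLiteral
            ? "\"" + Object.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\""
            : "<" + Object + ">";
        return "<" + Subject.AbsoluteUri + "> <" + Predicate.AbsoluteUri + "> " + o + " .";
    }
}

public static class TripleDemo
{
    public static void Main()
    {
        var paris = new Uri("http://dbpedia.org/resource/Paris");

        // Object is a literal (a string).
        var label = new Triple(paris,
            new Uri("http://www.w3.org/2000/01/rdf-schema#label"), "Paris", true);

        // Object is another URI: a typed link into a second dataset.
        var sameAs = new Triple(paris,
            new Uri("http://www.w3.org/2002/07/owl#sameAs"),
            "http://sws.geonames.org/2988507/", false);

        Console.WriteLine(label);
        Console.WriteLine(sameAs);
    }
}
```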
   A linked dataset is a collection of data, published and maintained by a single
provider, available as RDF on the Web, where at least some of the resources in the
dataset are identified by dereferenceable URIs (http://rdfs.org/ns/void/html). The
image below shows the Linked Open Data Cloud, with the available datasets and the
links between them.




   By publishing data on the Web according to the Linked Data principles, we add
our data to a global data space, which allows the data to be discovered and used by
various applications. To publish a dataset as Linked Data on the Web, we must follow
three basic steps:
    -   Assign URIs to the entities described by the dataset and provide for
        dereferencing these URIs into RDF representations.
    -   Set RDF links to other data sources on the Web.
    -   Provide metadata about the published data so that clients can evaluate
        its quality.
    Next, we discuss how to create rich semantic metadata for some
content.



3       OpenCalais Web Service

    As we already said in the Introduction, the OpenCalais Web Service automatically
generates rich semantic metadata for the submitted content. It uses natural language
processing (NLP), machine learning and other methods to analyze content and return
the entities it finds, such as cities, countries and people, with dereferenceable
Linked Data style URIs. The events, facts and entity types are defined in the
OpenCalais RDF Schemas (http://s.opencalais.com/1/pred/asf/1/pred/.html).
    To get started with OpenCalais, you first need an API key. To get
the key, you must register at http://www.opencalais.com/user/register. The Calais WS
can be called from .NET, Java, PHP, etc. using SOAP or REST. We can also use Calais
Viewer to see how it works and what the output of a Calais call looks like.
    When we make a call to the Calais API, we must provide some input
parameters, which must be HTTP encoded. The service we invoke is at
http://api.opencalais.com/enlighten/?wsdl. We explain next what we need in order to
call the service via SOAP.
    The method Enlighten, which allows us to call the OpenCalais web service via SOAP,
has three parameters:
    -   licenseId. This is your API key, which you can get from the Calais site.
    -   paramsXML. These are the input parameters of the service in XML format.
        More information about the input parameters can be found at
        http://opencalais.com/documentation/calais-web-service-api/forming-api-calls/input-parameters.
    -   content. This is the content on which the extraction will be performed.
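    The paramsXML value can be assembled in code rather than written by hand.
The sketch below does this with System.Xml.Linq. The element and attribute names
(params, processingDirectives, contentType, outputFormat), the namespace URI and
the values "text/txt" and "xml/rdf" are assumptions based on the Calais
documentation of the time and are not confirmed by this paper; the
input-parameters page cited above remains the authoritative source.

```csharp
using System;
using System.Xml.Linq;

public static class CalaisParamsSketch
{
    // ASSUMPTION: namespace used for the Calais parameter elements.
    private static readonly XNamespace C = "http://s.opencalais.com/1/pred/";

    // Builds a minimal paramsXML string with one processingDirectives element.
    // ASSUMPTION: element and attribute names follow the Calais documentation.
    public static string Build(string contentType, string outputFormat)
    {
        var root = new XElement(C + "params",
            new XAttribute(XNamespace.Xmlns + "c", C.NamespaceName),
            new XElement(C + "processingDirectives",
                new XAttribute(C + "contentType", contentType),
                new XAttribute(C + "outputFormat", outputFormat)));
        return root.ToString();
    }

    public static void Main()
    {
        // ASSUMPTION: plain-text input, RDF/XML output.
        Console.WriteLine(Build("text/txt", "xml/rdf"));
    }
}
```

    Building the XML through an API rather than by string concatenation keeps
the namespace prefix and attribute quoting consistent.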
   To start, we use a simple text as content: The Palace of Versailles, or simply
Versailles, is a royal château in Versailles, the Île-de-France region of France.
When the château was built, Versailles was a country village; today, however, it is a
suburb of Paris, some twenty kilometers southwest of the French capital. The court of
Versailles was the center of political power in France from 1682, when Louis XIV
moved from Paris, until the royal family was forced to return to the capital in October
1789 after the beginning of French Revolution. Versailles is therefore famous not only
as a building, but as a symbol of the system of absolute monarchy of the Ancien
Régime.
   We call the service from C# as follows: we add to our project a service reference to
the Calais WSDL, then invoke the service:
   // CalaisReference is the proxy namespace generated from the Calais WSDL.
   CalaisReference.calaisSoapClient client = new CalaisReference.calaisSoapClient();
   // m_Licence is the API key; m_Content and m_Params hold the text and the paramsXML.
   string response = client.Enlighten(m_Licence, m_Content, m_Params);
   It is better to read m_Content and m_Params from a file, and the response (an
RDF document) should also be saved to a file.
   The entities found are: City (Paris, France), Country (France) and Facility (Palace
of Versailles). If we look at the URI
http://d.opencalais.com/er/geo/city/ralg-geo1/797c999a-d455-520d-e5cf-04ca7fb255c1.html,
we can say that the entity (City) has been disambiguated, because the URI contains /er/.
The entities whose URIs contain /em/ are not disambiguated by OpenCalais. If we open the
link in a browser, we see that it was linked to other datasets (OpenCalais is linked to
Freebase, DBpedia, GeoNames and LinkedMDB), such as http://dbpedia.org/resource/Paris,
http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000002db30 and
http://sws.geonames.org/2988507/, and that it also has an assigned Web link:
http://en.wikipedia.org/wiki/Paris.
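    The /er/ versus /em/ rule described above can be checked mechanically. The
sketch below classifies an entity URI by the first segment of its path; the class
and method names are hypothetical, and testing only the first path segment is an
assumption about where the marker appears. The /em/ URI in the example is a
made-up placeholder, not a real Calais identifier.

```csharp
using System;

public static class CalaisUriSketch
{
    // Returns "disambiguated" for URIs whose path starts with /er/,
    // "not disambiguated" for /em/, and "unknown" otherwise.
    public static string Classify(string uri)
    {
        string path = new Uri(uri).AbsolutePath;
        if (path.StartsWith("/er/", StringComparison.Ordinal))
            return "disambiguated";
        if (path.StartsWith("/em/", StringComparison.Ordinal))
            return "not disambiguated";
        return "unknown";
    }

    public static void Main()
    {
        // URI of the City entity discussed in the text.
        Console.WriteLine(Classify(
            "http://d.opencalais.com/er/geo/city/ralg-geo1/797c999a-d455-520d-e5cf-04ca7fb255c1.html"));

        // Hypothetical placeholder URI used only to exercise the /em/ branch.
        Console.WriteLine(Classify("http://d.opencalais.com/em/placeholder-id"));
    }
}
```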
   For the detected entities OpenCalais provides an entity relevance score (shown for
each entity in the screenshots below). The relevance capability detects the
importance of each unique entity and assigns a relevance score in the range 0-1 (1
being the most relevant and important). We see that France is the most relevant
(69%).
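    Once the entities have been extracted from the response, the relevance
score can be used to rank or filter them. The sketch below assumes the entities
are already available as in-memory objects (parsing the RDF response is not
shown). Only the 0.69 score for France comes from the example above; the other
two scores and the 0.3 threshold are hypothetical values chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public sealed class ScoredEntity
{
    public string Type { get; }
    public string Name { get; }
    public double Relevance { get; }   // expected range: 0 to 1

    public ScoredEntity(string type, string name, double relevance)
    {
        Type = type;
        Name = name;
        Relevance = relevance;
    }
}

public static class RelevanceSketch
{
    // Keeps entities at or above the threshold, most relevant first.
    public static List<ScoredEntity> Rank(IEnumerable<ScoredEntity> entities, double threshold)
    {
        return entities
            .Where(e => e.Relevance >= threshold)
            .OrderByDescending(e => e.Relevance)
            .ToList();
    }

    // Formats a 0-1 score as a whole-number percentage.
    public static string AsPercent(double relevance)
    {
        return Math.Round(relevance * 100).ToString("0", CultureInfo.InvariantCulture) + "%";
    }

    public static void Main()
    {
        var entities = new List<ScoredEntity>
        {
            new ScoredEntity("City", "Paris", 0.45),                    // hypothetical score
            new ScoredEntity("Country", "France", 0.69),                // score from the text
            new ScoredEntity("Facility", "Palace of Versailles", 0.20)  // hypothetical score
        };

        foreach (ScoredEntity e in Rank(entities, 0.3))
            Console.WriteLine(e.Type + ": " + e.Name + " (" + AsPercent(e.Relevance) + ")");
    }
}
```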




    For a better understanding of how Calais can be used, we take a look at
http://gvlt.appspot.com/opencalais-geo/. In this project, the Calais API is used to
identify geographic references in a text and display them on an OpenLayers map.
Calais is used with JSON output, and all the processing is done on the client side in
the browser.
    OpenCalais can also help content managers create smart indexes. Instead
of indexing by keywords, you can index by referenced subject. If you have a
collection of unstructured documents, in a website for example, you can use
OpenCalais to help manage and cross-reference them. By using the OpenCalais
API, a website's side navigation bar can suggest other related documents based on the
conceptual subject, instead of the word matching used by most indexes. By taking
the RDF/XML document returned by the OpenCalais HTTP interface and storing it in
an RDF store, you can enable an application to find documents related to anything in
the RDF store (http://www.devx.com/semantic/Article/38517/1763/page/2).
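    The idea of indexing by subject rather than by keyword can be sketched with
a small in-memory index that maps entity URIs to the documents mentioning them.
This is only a stand-in for a real RDF store, which is an assumption of the
sketch; the SubjectIndex class and the document names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Maps entity URIs to documents and back, so that documents
// sharing at least one entity can be suggested as related.
public sealed class SubjectIndex
{
    private readonly Dictionary<string, HashSet<string>> docsByEntity =
        new Dictionary<string, HashSet<string>>();
    private readonly Dictionary<string, HashSet<string>> entitiesByDoc =
        new Dictionary<string, HashSet<string>>();

    // Registers the entity URIs found in one document.
    public void Add(string docId, IEnumerable<string> entityUris)
    {
        foreach (string uri in entityUris)
        {
            if (!docsByEntity.TryGetValue(uri, out HashSet<string> docs))
                docsByEntity[uri] = docs = new HashSet<string>();
            docs.Add(docId);

            if (!entitiesByDoc.TryGetValue(docId, out HashSet<string> entities))
                entitiesByDoc[docId] = entities = new HashSet<string>();
            entities.Add(uri);
        }
    }

    // Returns the other documents that mention at least one of the same entities.
    public List<string> Related(string docId)
    {
        if (!entitiesByDoc.TryGetValue(docId, out HashSet<string> entities))
            return new List<string>();

        return entities
            .SelectMany(uri => docsByEntity[uri])
            .Where(d => d != docId)
            .Distinct()
            .OrderBy(d => d, StringComparer.Ordinal)
            .ToList();
    }
}

public static class SubjectIndexDemo
{
    public static void Main()
    {
        var index = new SubjectIndex();
        // Hypothetical documents, tagged with entity URIs.
        index.Add("versailles.html", new[] { "http://dbpedia.org/resource/Paris" });
        index.Add("louvre.html", new[] { "http://dbpedia.org/resource/Paris" });
        index.Add("berlin.html", new[] { "http://dbpedia.org/resource/Berlin" });

        foreach (string doc in index.Related("versailles.html"))
            Console.WriteLine(doc);
    }
}
```

    Because the index is keyed on entity URIs, two documents are related when
they refer to the same disambiguated subject, even if they use different words
for it.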



4        Conclusions
    Nowadays, the Web means more than just putting data online; it means
interlinking and sharing data as we share documents. The Web is seen as a growing
global graph. It started with the assumption that the value and usefulness of data
increase when links are created between the data. This is what Linked Data means: using
the Web to create typed links between data from different sources.


      Calais is a rapidly growing toolkit of capabilities that allows you to readily
incorporate state-of-the-art semantic functionality within your blog, content
management system, website or application. We have described in this paper how the
Calais WS can be invoked and what its RDF output offers. OpenCalais
represents an important move toward the Semantic Web. With OpenCalais, computers
could do the research for you, combing through and comparing company names,
locations and rumored or real transactions in real time to give you answers in a way
that keyword search simply cannot.



5        References

[1] C. Bizer, R. Cyganiak, T. Heath, How to Publish Linked Data on the Web
[2] T. Heath, An Introduction to Linked Data, 2009
[3] C. Bizer, T. Heath, T. Berners-Lee, Linked Data - The Story So Far
[4] M. Watson, Practical Semantic Web Programming With AllegroGraph, 2009
[5] K. Alexander, R. Cyganiak, M. Hausenblas, J. Zhao, Describing Linked Datasets
[6] http://opencalais.com/
[7] http://www.w3.org/DesignIssues/LinkedData.html
[8] http://thomsonreuters.com/content/corporate/articles/398062
[9] http://philippeadjiman.com/blog/2009/09/16/open-calais-from-java-with-eclipse-extract-entities-facts-and-events-in-4-minutes/
[10] http://www.devx.com/semantic/Article/38517/1763/page/2
[11] http://blog.3kbo.com/2009/09/26/opencalais-response/
[12] http://wiki.dbpedia.org/Interlinking
[13] http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets
[14] http://gvlt.wordpress.com/2008/10/17/tutorial-text-geotagging-with-opencalais/
[15] http://en.wikipedia.org/wiki/Linked_Data
[16] http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14




                                          8

More Related Content

What's hot

Introduction to RDF & SPARQL
Introduction to RDF & SPARQLIntroduction to RDF & SPARQL
Introduction to RDF & SPARQLOpen Data Support
 
A Term Based Ranking Methodology for Resources on the Semantic Web
A Term Based Ranking Methodology for Resources on the Semantic WebA Term Based Ranking Methodology for Resources on the Semantic Web
A Term Based Ranking Methodology for Resources on the Semantic WebAaron Huang
 
Scoda openrefine-directordata
Scoda openrefine-directordataScoda openrefine-directordata
Scoda openrefine-directordataTony Hirst
 
Usage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosUsage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosEUCLID project
 
Scoda company networks2
Scoda company networks2Scoda company networks2
Scoda company networks2Tony Hirst
 
From Web 2.0 to the Semantic Web: Bridging the Gap in the Newsmedia Industry
From Web 2.0 to the Semantic Web: Bridging the Gap in the Newsmedia IndustryFrom Web 2.0 to the Semantic Web: Bridging the Gap in the Newsmedia Industry
From Web 2.0 to the Semantic Web: Bridging the Gap in the Newsmedia IndustryJoel Amoussou
 
Quick Linked Data Introduction
Quick Linked Data IntroductionQuick Linked Data Introduction
Quick Linked Data IntroductionMichael Hausenblas
 
ODI Summit 2016 - Linked Open Data at Springer Nature
ODI Summit 2016 - Linked Open Data at Springer NatureODI Summit 2016 - Linked Open Data at Springer Nature
ODI Summit 2016 - Linked Open Data at Springer NatureMichele Pasin
 
reegle - a new key portal for open energy data
reegle - a new key portal for open energy datareegle - a new key portal for open energy data
reegle - a new key portal for open energy datareeep
 
DH11: Browsing Highly Interconnected Humanities Databases Through Multi-Resul...
DH11: Browsing Highly Interconnected Humanities Databases Through Multi-Resul...DH11: Browsing Highly Interconnected Humanities Databases Through Multi-Resul...
DH11: Browsing Highly Interconnected Humanities Databases Through Multi-Resul...Michele Pasin
 
Building a semantic website
Building a semantic websiteBuilding a semantic website
Building a semantic websiteCJ Jenkins
 
An introduction to Linked (Open) Data
An introduction to Linked (Open) DataAn introduction to Linked (Open) Data
An introduction to Linked (Open) DataAli Khalili
 
School of Data - mapping company networks
School of Data - mapping company networksSchool of Data - mapping company networks
School of Data - mapping company networksTony Hirst
 
ORE and SWAP: Composition and Complexity
ORE and SWAP: Composition and ComplexityORE and SWAP: Composition and Complexity
ORE and SWAP: Composition and ComplexityEduserv Foundation
 
Data.dcs: Converting Legacy Data into Linked Data
Data.dcs: Converting Legacy Data into Linked DataData.dcs: Converting Legacy Data into Linked Data
Data.dcs: Converting Legacy Data into Linked DataMatthew Rowe
 
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An IntroductionLinking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An IntroductionRonald Ashri
 

What's hot (20)

Introduction to RDF & SPARQL
Introduction to RDF & SPARQLIntroduction to RDF & SPARQL
Introduction to RDF & SPARQL
 
Linked Data
Linked DataLinked Data
Linked Data
 
Semantic web
Semantic webSemantic web
Semantic web
 
Web of Data Usage Mining
Web of Data Usage MiningWeb of Data Usage Mining
Web of Data Usage Mining
 
A Term Based Ranking Methodology for Resources on the Semantic Web
A Term Based Ranking Methodology for Resources on the Semantic WebA Term Based Ranking Methodology for Resources on the Semantic Web
A Term Based Ranking Methodology for Resources on the Semantic Web
 
Scoda openrefine-directordata
Scoda openrefine-directordataScoda openrefine-directordata
Scoda openrefine-directordata
 
Usage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosUsage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application Scenarios
 
Scoda company networks2
Scoda company networks2Scoda company networks2
Scoda company networks2
 
From Web 2.0 to the Semantic Web: Bridging the Gap in the Newsmedia Industry
From Web 2.0 to the Semantic Web: Bridging the Gap in the Newsmedia IndustryFrom Web 2.0 to the Semantic Web: Bridging the Gap in the Newsmedia Industry
From Web 2.0 to the Semantic Web: Bridging the Gap in the Newsmedia Industry
 
Quick Linked Data Introduction
Quick Linked Data IntroductionQuick Linked Data Introduction
Quick Linked Data Introduction
 
ODI Summit 2016 - Linked Open Data at Springer Nature
ODI Summit 2016 - Linked Open Data at Springer NatureODI Summit 2016 - Linked Open Data at Springer Nature
ODI Summit 2016 - Linked Open Data at Springer Nature
 
reegle - a new key portal for open energy data
reegle - a new key portal for open energy datareegle - a new key portal for open energy data
reegle - a new key portal for open energy data
 
DH11: Browsing Highly Interconnected Humanities Databases Through Multi-Resul...
DH11: Browsing Highly Interconnected Humanities Databases Through Multi-Resul...DH11: Browsing Highly Interconnected Humanities Databases Through Multi-Resul...
DH11: Browsing Highly Interconnected Humanities Databases Through Multi-Resul...
 
Building a semantic website
Building a semantic websiteBuilding a semantic website
Building a semantic website
 
An introduction to Linked (Open) Data
An introduction to Linked (Open) DataAn introduction to Linked (Open) Data
An introduction to Linked (Open) Data
 
School of Data - mapping company networks
School of Data - mapping company networksSchool of Data - mapping company networks
School of Data - mapping company networks
 
ORE and SWAP: Composition and Complexity
ORE and SWAP: Composition and ComplexityORE and SWAP: Composition and Complexity
ORE and SWAP: Composition and Complexity
 
Data.dcs: Converting Legacy Data into Linked Data
Data.dcs: Converting Legacy Data into Linked DataData.dcs: Converting Legacy Data into Linked Data
Data.dcs: Converting Legacy Data into Linked Data
 
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An IntroductionLinking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
 
Danbri Drupalcon Export
Danbri Drupalcon ExportDanbri Drupalcon Export
Danbri Drupalcon Export
 

Similar to OpenCalais in Linked Data context

Linked dataresearch
Linked dataresearchLinked dataresearch
Linked dataresearchTope Omitola
 
Linked Data Tutorial
Linked Data TutorialLinked Data Tutorial
Linked Data TutorialSören Auer
 
Introduction to linked data
Introduction to linked dataIntroduction to linked data
Introduction to linked dataLaura Po
 
Llinked open data training for EU institutions
Llinked open data training for EU institutionsLlinked open data training for EU institutions
Llinked open data training for EU institutionsOpen Data Support
 
Linked Open Data Principles, Technologies and Examples
Linked Open Data Principles, Technologies and ExamplesLinked Open Data Principles, Technologies and Examples
Linked Open Data Principles, Technologies and ExamplesOpen Data Support
 
State of the Semantic Web
State of the Semantic WebState of the Semantic Web
State of the Semantic WebIvan Herman
 
Discovering Resume Information using linked data  
Discovering Resume Information using linked data  Discovering Resume Information using linked data  
Discovering Resume Information using linked data  dannyijwest
 
Semantically enriching content using OpenCalais
Semantically enriching content using OpenCalaisSemantically enriching content using OpenCalais
Semantically enriching content using OpenCalaisMarius Butuc
 
Open Calais
Open CalaisOpen Calais
Open Calaisymark
 
Web of Data as a Solution for Interoperability. Case Studies
Web of Data as a Solution for Interoperability. Case StudiesWeb of Data as a Solution for Interoperability. Case Studies
Web of Data as a Solution for Interoperability. Case StudiesSabin Buraga
 
RDFa Semantic Web
RDFa Semantic WebRDFa Semantic Web
RDFa Semantic WebRob Paok
 
Lodlam saa 2011_jenelfarrell_2
Lodlam saa 2011_jenelfarrell_2Lodlam saa 2011_jenelfarrell_2
Lodlam saa 2011_jenelfarrell_2Jenel Farrell
 
Anatomy of a semantic virus
Anatomy of a semantic virusAnatomy of a semantic virus
Anatomy of a semantic virusUltraUploader
 

Similar to OpenCalais in Linked Data context (20)

Linked Data to Improve the OER Experience
Linked Data to Improve the OER ExperienceLinked Data to Improve the OER Experience
Linked Data to Improve the OER Experience
 
Linked Data In Action
Linked Data In ActionLinked Data In Action
Linked Data In Action
 
Linked dataresearch
Linked dataresearchLinked dataresearch
Linked dataresearch
 
Linked Data Tutorial
Linked Data TutorialLinked Data Tutorial
Linked Data Tutorial
 
Introduction to linked data
Introduction to linked dataIntroduction to linked data
Introduction to linked data
 
Llinked open data training for EU institutions
Llinked open data training for EU institutionsLlinked open data training for EU institutions
Llinked open data training for EU institutions
 
Linked Open Data Principles, Technologies and Examples
Linked Open Data Principles, Technologies and ExamplesLinked Open Data Principles, Technologies and Examples
Linked Open Data Principles, Technologies and Examples
 
Linked Data
Linked DataLinked Data
Linked Data
 
State of the Semantic Web
State of the Semantic WebState of the Semantic Web
State of the Semantic Web
 
Discovering Resume Information using linked data  
Discovering Resume Information using linked data  Discovering Resume Information using linked data  
Discovering Resume Information using linked data  
 
Linked sensor data
Linked sensor dataLinked sensor data
Linked sensor data
 
Semantically enriching content using OpenCalais
Semantically enriching content using OpenCalaisSemantically enriching content using OpenCalais
Semantically enriching content using OpenCalais
 
Open Calais
Open CalaisOpen Calais
Open Calais
 
Linked data 20171106
Linked data 20171106Linked data 20171106
Linked data 20171106
 
Web of Data as a Solution for Interoperability. Case Studies
Web of Data as a Solution for Interoperability. Case StudiesWeb of Data as a Solution for Interoperability. Case Studies
Web of Data as a Solution for Interoperability. Case Studies
 
RDFa Semantic Web
RDFa Semantic WebRDFa Semantic Web
RDFa Semantic Web
 
When RDFa?
When RDFa?When RDFa?
When RDFa?
 
Lodlam saa 2011_jenelfarrell_2
Lodlam saa 2011_jenelfarrell_2Lodlam saa 2011_jenelfarrell_2
Lodlam saa 2011_jenelfarrell_2
 
Anatomy of a semantic virus
Anatomy of a semantic virusAnatomy of a semantic virus
Anatomy of a semantic virus
 
Semantic web browser
Semantic web browser Semantic web browser
Semantic web browser
 

Recently uploaded

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 

Recently uploaded (20)

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx

OpenCalais in Linked Data context

Calais is part of the Linked Open Data (LOD) Cloud and links to the following assets: DBpedia, Wikipedia, Freebase, Reuters.com, GeoNames, Shopping.com, IMDB and LinkedMDB.
In order to understand what Calais offers, we must first understand the concept of Linked Data.

2      Linked Data

As stated above, Linked Data is the technique of publishing data on the Web and interlinking data from different sources. Such data is machine-readable, its meaning is explicitly defined, it is linked to external data sets, and it can in turn be linked from external data sets.
Linked Data is based on RDF (Resource Description Framework), which is used to make typed statements that link arbitrary things in the world. To access the Web of Data we use Linked Data browsers (Tabulator, Disco, RDFViz, BrowseRDF, etc.), which enable navigation between different sources by following RDF links. For instance, while looking at data about a product, a user may be interested in information about the company that produces it. By following the RDF link, the user can navigate to information about that company contained in another dataset.
Berners-Lee outlined a set of rules for publishing data on the Web in such a way that all published data becomes part of a single global data space:
1. Name things using URIs (Uniform Resource Identifiers).
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information using the standards (RDF, SPARQL).
4. Data should be interlinked with other data.
These principles provide a basic recipe for publishing and connecting data using the infrastructure of the Web while adhering to its architecture and standards.
Linked Data relies on two fundamental technologies: URIs and HTTP. URIs provide a generic means of identifying any existing entity. Entities identified by URIs that use the http:// scheme can be looked up by dereferencing the URI. Dereferencing a URI is the act of retrieving a representation of the resource identified by that URI.[16]
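As an illustration of dereferencing, the following is a minimal Python sketch of how a client might prepare an HTTP lookup that asks for an RDF representation of a resource. The helper names `build_rdf_request` and `dereference` are hypothetical, the example URI is only an illustration, and whether a given server actually answers the `Accept` header with RDF is an assumption that depends on the publisher.

```python
from urllib.parse import urlparse
from urllib.request import Request, urlopen

# Media type for RDF/XML; requesting it is a common (assumed) way
# to ask a Linked Data server for data rather than an HTML page.
RDF_XML = "application/rdf+xml"


def build_rdf_request(uri: str) -> Request:
    """Prepare, but do not send, an HTTP GET asking for an RDF representation."""
    scheme = urlparse(uri).scheme
    if scheme not in ("http", "https"):
        # Rule 2 above: only HTTP URIs can be looked up this way.
        raise ValueError(f"not an HTTP URI: {uri!r}")
    return Request(uri, headers={"Accept": RDF_XML})


def dereference(uri: str, timeout: float = 10.0) -> bytes:
    """Send the request and return the raw response body (needs network access)."""
    with urlopen(build_rdf_request(uri), timeout=timeout) as response:
        return response.read()


# Building the request involves no network traffic.
request = build_rdf_request("http://dbpedia.org/resource/Paris")
print(request.get_method(), request.full_url, request.get_header("Accept"))
# prints: GET http://dbpedia.org/resource/Paris application/rdf+xml
```

Calling `dereference(...)` would perform the actual lookup; it is kept separate here so that the request can be inspected without contacting any server.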
To URIs and HTTP we add a third technology that is necessary for the Web of Data: RDF. Just as HTML provides the means to structure and link documents on the Web, RDF provides a graph-based data model to structure and link data that describes things. In RDF, data takes the form of a triple: subject, predicate, object. The subject and the object are either both URIs that identify resources, or a URI and a string, respectively. The predicate describes how the subject and the object are related, and is also represented by a URI.
A linked dataset is a collection of data, published and maintained by a single provider, available as RDF on the Web, where at least some of the resources in the dataset are identified by dereferenceable URIs (http://rdfs.org/ns/void/html).

[Figure: the Linked Open Data Cloud, showing the available datasets and the links between them.]

By publishing data on the Web according to the Linked Data principles, we add our data to a global data space, which allows the data to be discovered and used by various applications.
To publish a data set as Linked Data on the Web, we must follow three basic steps:
- Assign URIs to the entities described by the dataset and provide for dereferencing these URIs into RDF representations.
- Set RDF links to other data sources on the Web.
- Provide metadata about the published data so that clients can evaluate its quality.
Next we discuss how to create rich semantic metadata for a piece of content.

3      OpenCalais Web Service

As already stated in the Introduction, the OpenCalais Web Service automatically generates rich semantic metadata for the submitted content. It uses natural language processing (NLP), machine learning and other methods to analyze content and return the entities it finds, such as cities, countries and people, with dereferenceable Linked Data style URIs. The events, facts and entity types are defined in the OpenCalais RDF Schemas (http://s.opencalais.com/1/pred/asf/1/pred/.html).
To get started with OpenCalais, you first need an API key. To get the key, you must register at http://www.opencalais.com/user/register. The Calais web service can be called from .NET, Java, PHP, etc. using SOAP or REST. We can also use the Calais Viewer to see how it works and what the output of a Calais call looks like.
When we make a call to the Calais API, we must provide some input parameters, which must be HTTP-encoded. The service we invoke is at http://api.opencalais.com/enlighten/?wsdl. We explain below what is needed to call the service via SOAP.
The Enlighten method, which calls the OpenCalais web service via SOAP, has three parameters:
- licenseId. This is your API key, which you can get from the Calais site.
- paramsXML. These are the input parameters of the service in XML format. More information about the input parameters can be found at http://opencalais.com/documentation/calais-web-service-api/forming-api-calls/input-parameters.
- content. This is the content on which the extraction will be performed.
To start, we use a simple text as content:

    The Palace of Versailles, or simply Versailles, is a royal château in Versailles, the Île-de-France region of France. When the château was built, Versailles was a country village; today, however, it is a suburb of Paris, some twenty kilometers southwest of the French capital. The court of Versailles was the center of political power in France from 1682, when Louis XIV moved from Paris, until the royal family was forced to return to the capital in October 1789 after the beginning of French Revolution. Versailles is therefore famous not only as a building, but as a symbol of the system of absolute monarchy of the Ancien Régime.

We call the service from C# as follows: we add to our project a service reference to the Calais WSDL, then call the service:

    // Create the SOAP client generated from the Calais WSDL service reference.
    CalaisReference.calaisSoapClient client = new CalaisReference.calaisSoapClient();
    // Arguments: the API key, the content to analyze, the input parameters in XML.
    string response = client.Enlighten(m_Licence, m_Content, m_Params());

It is better to read m_Content and m_Params from a file, and the response (an RDF document) should also be saved to a file.
The entities found are: City (Paris, France), Country (France) and Facility (Palace of Versailles). If we look at the URI http://d.opencalais.com/er/geo/city/ralg-geo1/797c999a-d455-520d-e5cf-04ca7fb255c1.html, we can say that the entity (City) has been disambiguated, because the URI contains /er/. Entities whose URIs contain /em/ are not disambiguated by OpenCalais. If we open the link in a browser, we see that it was linked to other data sets (OpenCalais is linked to Freebase, DBpedia, GeoNames, Linked IMDB), namely http://dbpedia.org/resource/Paris, http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000002db30 and http://sws.geonames.org/2988507/, and that it also has an assigned Web link: http://en.wikipedia.org/wiki/Paris.
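The /er/ versus /em/ convention and the outgoing links described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names are hypothetical, the check follows the rule given in the text (the URI path contains /er/ or /em/), and the predicate rdfs:seeAlso is chosen purely for illustration and is not necessarily the predicate that OpenCalais itself emits.

```python
from urllib.parse import urlparse

# Illustrative predicate only (an assumption, not taken from the Calais output).
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def is_disambiguated(calais_uri: str) -> bool:
    """Return True for /er/ URIs and False for /em/ URIs, as described in the text."""
    path = urlparse(calais_uri).path
    if "/er/" in path:
        return True
    if "/em/" in path:
        return False
    raise ValueError(f"neither /er/ nor /em/ found in {calais_uri!r}")


def link_triples(subject, targets, predicate=RDFS_SEE_ALSO):
    """Build one (subject, predicate, object) triple per linked URI."""
    return [(subject, predicate, target) for target in targets]


def to_ntriples(triples):
    """Serialize URI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


paris = ("http://d.opencalais.com/er/geo/city/ralg-geo1/"
         "797c999a-d455-520d-e5cf-04ca7fb255c1.html")
links = [
    "http://dbpedia.org/resource/Paris",
    "http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000002db30",
    "http://sws.geonames.org/2988507/",
]

if is_disambiguated(paris):
    print(to_ntriples(link_triples(paris, links)))
```

Each printed line is a subject, predicate, object statement whose three parts are all URIs, which is the triple form described in Section 2.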
For the detected entities, OpenCalais provides an entity relevance score (shown for each entity in the screenshots below). The relevance capability detects the importance of each unique entity and assigns a relevance score in the range 0-1 (1 being the most relevant and important). We see that France is the most relevant entity (69%).

[Screenshots: relevance scores of the detected entities.]

For a better understanding of how Calais can be used, we take a look at http://gvlt.appspot.com/opencalais-geo/. In this project, the Calais API is used to identify geographic references in a text and display them on an OpenLayers map. Calais is used with JSON output, and all the processing is done on the client side, in the browser.
OpenCalais can also help content managers create smart indexes. Instead of indexing by keywords, you can index by referenced subject. If you have a collection of unstructured documents, on a website for example, you can use OpenCalais to help manage them and relate them to one another. By using the OpenCalais API, a website's side navigation bar can suggest other related documents based on the conceptual subject, instead of the word matching used by most indexes. By taking the RDF/XML document returned by the OpenCalais HTTP interface and storing it in an RDF store, you can enable an application to find documents related to anything in the RDF store (http://www.devx.com/semantic/Article/38517/1763/page/2).

4      Conclusions

Nowadays, the Web means more than just putting data online; it means interlinking and sharing data in the same way we share documents. The Web is seen as a growing global graph. It starts from the assumption that the value and usefulness of data increase when links are created between the data. This is what Linked Data means: using the Web to create typed links between data from different sources.
Calais is a rapidly growing toolkit of capabilities that allows you to readily incorporate state-of-the-art semantic functionality within your blog, content management system, website or application. We have described in this paper how the Calais web service can be invoked and what the RDF output offers us.
OpenCalais represents an important step toward the Semantic Web. With OpenCalais, computers could do the research for you, combing through and comparing company names, locations and rumored or real transactions in real time, to give you answers in a way that keyword search simply cannot.

5      References

[1] C. Bizer, R. Cyganiak, T. Heath, How to Publish Linked Data on the Web
[2] T. Heath, An Introduction to Linked Data, 2009
[3] C. Bizer, T. Heath, T. Berners-Lee, Linked Data - The Story So Far
[4] M. Watson, Practical Semantic Web Programming With AllegroGraph, 2009
[5] K. Alexander, R. Cyganiak, M. Hausenblas, J. Zhao, Describing Linked Datasets
[6] http://opencalais.com/
[7] http://www.w3.org/DesignIssues/LinkedData.html
[8] http://thomsonreuters.com/content/corporate/articles/398062
[9] http://philippeadjiman.com/blog/2009/09/16/open-calais-from-java-with-eclipse-extract-entities-facts-and-events-in-4-minutes/
[10] http://www.devx.com/semantic/Article/38517/1763/page/2
[11] http://blog.3kbo.com/2009/09/26/opencalais-response/
[12] http://wiki.dbpedia.org/Interlinking
[13] http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets
[14] http://gvlt.wordpress.com/2008/10/17/tutorial-text-geotagging-with-opencalais/
[15] http://en.wikipedia.org/wiki/Linked_Data
[16] http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14