A Framework for Dynamic Data Source Identification and Orchestration on the Web



     Alexander Berezovskiy and Dr Leslie Carr

4th International Workshop on Web APIs and Service Mashups
     European Conference on Web Services (ECOWS 2010)

           1 December 2010, Ayia Napa, Cyprus
Problem
• The modern Web can be seen as a “universal data machine”
      • Multitude of web applications, services and data providers

      • Almost every user has a vast amount of data stored online

• Data is often duplicated
      • Users need to register, provide their details, upload photos...

• Ideally, we would like a way to universally manipulate the data
      • APIs differ; functionality depends on the data source

      • We need to know which data source to use

      • We must make adjustments for every data source



Current solutions
• OpenSocial
      • Focused on social applications

      • Requires adjustments in target applications

• Integrate every possible resource
      • Very time-consuming

      • No guarantee the user will be satisfied with the choice

      • User interaction required to choose the right source




Proposed solution
• Identify the most appropriate data resource
      • Task is given, nature of required data is known

      • Attempt to identify the most suitable data resource for the given task
        without any user intervention

• Execute the task (process data request)
      • Once the data resource is identified, hand the request over to it and
        report on (or return) the results
      • Allow full CRUD (create, read, update, delete) operations on data
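The two steps above (identify a resource, then hand the request over) can be sketched as follows. This is only an illustration under stated assumptions: the `DataResource` interface, the `InMemoryResource` stand-in, the `execute` function and the scoring function are all hypothetical names, not part of the presented framework.

```python
from abc import ABC, abstractmethod

CRUD_OPERATIONS = {"create", "read", "update", "delete"}


class DataResource(ABC):
    """Hypothetical uniform CRUD interface over one data resource."""

    @abstractmethod
    def create(self, key, value): ...

    @abstractmethod
    def read(self, key): ...

    @abstractmethod
    def update(self, key, value): ...

    @abstractmethod
    def delete(self, key): ...


class InMemoryResource(DataResource):
    """Stand-in for a real web data source; keeps records in a dict."""

    def __init__(self, name):
        self.name = name
        self._records = {}

    def create(self, key, value):
        if key in self._records:
            raise KeyError(f"{key!r} already exists")
        self._records[key] = value
        return value

    def read(self, key):
        return self._records[key]

    def update(self, key, value):
        if key not in self._records:
            raise KeyError(key)
        self._records[key] = value
        return value

    def delete(self, key):
        return self._records.pop(key)


def execute(task, resources, score):
    """Pick the highest-scoring resource, then hand the request over to it."""
    if task["operation"] not in CRUD_OPERATIONS:
        raise ValueError(f"unsupported operation: {task['operation']}")
    best = max(resources, key=lambda resource: score(resource, task))
    operation = getattr(best, task["operation"])
    return best.name, operation(*task["args"])


def score(resource, task):
    # Hypothetical scorer: pretend "photos-b" suits this task better.
    return 1.0 if resource.name == "photos-b" else 0.5


resources = [InMemoryResource("photos-a"), InMemoryResource("photos-b")]
created = execute({"operation": "create", "args": ("avatar", "me.png")}, resources, score)
fetched = execute({"operation": "read", "args": ("avatar",)}, resources, score)
```

The caller never names a data source; it only describes the task, which is the point of the proposed approach.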




Proposed solution (cont'd)




Data resource identification
• One visit to a web page can tell a lot about the user
      • Country, language, browser, operating system, …

      • We assume all these parameters affect the user preferences and their choice
        of applications

• Use a two-dimensional model:
      • Information about the user
          – Country, Language, Age, Occupation, Marital Status, …

      • Usage information
          – Browser, Operating System, Web and Local Applications, …
      • Some information can be obtained from a single HTTP request with no user
        intervention required
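As a rough illustration of what a single HTTP request can reveal, the sketch below derives a few Environment fields from two standard request headers. The function name `environment_from_headers`, the field names and the substring checks are assumptions made for this example; using the region subtag of `Accept-Language` as a proxy for country is a simplification.

```python
def environment_from_headers(headers):
    """Build a small Environment dict from HTTP request headers."""
    env = {"language": None, "country": None, "browser": None, "os": None}

    # Accept-Language looks like "en-GB,en;q=0.8"; take the first listed tag,
    # which is usually the most preferred one.
    accept_language = headers.get("Accept-Language", "")
    first_tag = accept_language.split(",")[0].split(";")[0].strip()
    if first_tag:
        language, _, region = first_tag.partition("-")
        env["language"] = language.lower()
        env["country"] = region.upper() or None

    user_agent = headers.get("User-Agent", "")

    # Order matters: Chrome user agents also contain the token "Safari".
    browsers = (
        ("Firefox", "Firefox"),
        ("Chrome", "Chrome"),
        ("Safari", "Safari"),
        ("MSIE", "Internet Explorer"),
    )
    for token, name in browsers:
        if token in user_agent:
            env["browser"] = name
            break

    for token in ("Windows", "Mac OS X", "Linux"):
        if token in user_agent:
            env["os"] = token
            break

    return env


example_headers = {
    "Accept-Language": "en-GB,en;q=0.8",
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20100101 Firefox/4.0",
}
environment = environment_from_headers(example_headers)
```

Fields such as age or occupation cannot be read from headers and would have to come from elsewhere.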



Data resource identification (cont'd)
• User information is grouped in a single entity called Environment
• Data can be structured as a tree:
                                             


• Identification algorithm:
    The total score TSA(a) for data source a is defined from three component
    scores (the operators combining them did not survive extraction):

        TSA(a) = f( TRS(a), ERS(a, u), TRAA(a) )

     TRS(a)    - Total Relevance Score for a

     ERS(a, u) - Environment Relevance Score for a and user u
     TRAA(a)   - Total Relevance Application-to-Application score for a

Data operations
• Each data resource serves data differently
• Data operations are performed by “bindings”
     • These are small chunks of code executed independently
     • We need at least one for each data resource
     • They can be written by anyone
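One way to picture bindings as small, independently executed units is a registry keyed by data resource. The sketch below is an assumption about how such a registry could look; `BINDINGS`, `binding`, `run` and the resource name `example-profile-service` are hypothetical and do not come from the framework itself.

```python
BINDINGS = {}


def binding(resource_name):
    """Decorator that registers a function as a binding for one data resource."""
    def register(func):
        BINDINGS.setdefault(resource_name, []).append(func)
        return func
    return register


@binding("example-profile-service")
def read_name(args):
    # A real binding would call the resource's API here.
    return {"name": args["username"].title()}


def run(resource_name, args):
    """Run the first binding for a resource, containing any failure it raises."""
    try:
        return {"ok": True, "data": BINDINGS[resource_name][0](args)}
    except Exception as error:
        # A failing or missing binding must not take down the caller.
        return {"ok": False, "error": repr(error)}


result = run("example-profile-service", {"username": "alice"})
missing = run("unknown-service", {})
```

Because bindings can be written by anyone, wrapping each call so that errors are reported instead of propagated is a natural design choice.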




Data operations (cont'd)
• Bindings return data in “raw” format
• The data can then be converted to almost any format
      • Currently available: XML, JSON and plain text
      • Can be adjusted to serve RDF
• Example binding to return a user's name from Facebook:
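   import json
   import urllib.request

   # Declares the input fields the binding expects and the output formats it supports.
   interface = {
       'fields': {'username': {'required': 'yes', 'type': 'text'}},
       'formats': ['html', 'xml', 'txt']
   }

   def run_binding():
       # `job` is not defined in this snippet; it is presumably provided by the
       # framework when the binding is executed.
       url = 'http://graph.facebook.com/' + str(job.input_args['username'][0])
       response = urllib.request.urlopen(url)
       user = json.loads(response.read())
       # Return the "raw" value; format conversion happens outside the binding.
       return user['name']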
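The conversion of raw binding output into the formats listed on this slide could look roughly like the sketch below. The `convert` function and the `result` wrapper element are assumptions for illustration; the sketch handles flat dictionaries only and expects keys that are valid XML element names.

```python
import json
from xml.etree import ElementTree


def convert(raw, fmt):
    """Serialise a flat dict of raw binding output as 'json', 'xml' or 'txt'."""
    if fmt == "json":
        return json.dumps(raw, sort_keys=True)
    if fmt == "xml":
        root = ElementTree.Element("result")
        for key in sorted(raw):
            # Each key becomes a child element holding the value as text.
            ElementTree.SubElement(root, key).text = str(raw[key])
        return ElementTree.tostring(root, encoding="unicode")
    if fmt == "txt":
        return "\n".join(f"{key}: {raw[key]}" for key in sorted(raw))
    raise ValueError(f"unsupported format: {fmt}")


raw = {"name": "Alice"}
as_json = convert(raw, "json")
as_xml = convert(raw, "xml")
as_txt = convert(raw, "txt")
```

Keeping serialisation outside the bindings means a new output format, such as RDF, only needs one more branch here rather than a change to every binding.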


Demo




Discussion
• Dynamic discovery of data sources
     • Data mining can help us read data
     • How can we do full CRUD on the data?

• Universal data addressing
     • How can we universally address data based on its nature?

• Semantic Web application
     • Can we derive ontologies from the available data?




Thank you




