On Evaluating and Publishing Data Concerns for Data as a Service
1. On Evaluating and Publishing Data Concerns for Data as a Service
Hong-Linh Truong and Schahram Dustdar
Distributed Systems Group, Vienna University of Technology
truong@infosys.tuwien.ac.at
http://www.infosys.tuwien.ac.at/Staff/truong
APSCC 2010, Hangzhou 9 Dec 2010 1
2. Overview
Motivation and background
Data concern-aware service engineering process
A framework for evaluating and publishing QoD of DaaS
Experiments
Conclusions and future work
3. The rise of DaaS
Web services technologies and the cloud computing model foster the concept of data/information as a service (DaaS)
Providing data capabilities rather than computation or software
Providing DaaS is an increasing trend
In both business and e-science environments
Bio data, weather data, company balance sheets, etc., offered via Web services
But data is associated with many data concerns
Quality of data, privacy, licensing, etc.
5. Motivation: the role of data concerns
Should we perform data composition?
Data consumers/data integrators need “data concerns”
to use data in the right way: Is the data good? Is it free?
to filter irrelevant results and avoid information overload
to save processing time/energy and storage
Both DaaS service providers and data providers need to evaluate and provide data concerns
6. Motivation: service provider versus data provider
The DaaS service provider is separated from the data provider
[Figure: consumers, service providers (DaaS), and data providers (e.g., a sensor); the elements carry distinct concerns labeled quality1, quality2, privacy1, and privacy2]
There is a lack of techniques and tools for evaluating and publishing data concerns for DaaS
7. Example: DaaS provider != data provider
Source: http://www.infochimps.org
8. Background: data resources
Data items → data resources → DaaS APIs → consumers
DaaS and data providers have the right to publish the data
[Figure: consumers access DaaS through service APIs (SOAP/REST); each DaaS exposes data resources that contain data items]
9. Background: diverse concerns associated with service and data
Hong-Linh Truong, Schahram Dustdar, "On Analyzing and Specifying Concerns for Data as a Service", The 2009 Asia-Pacific Services Computing Conference (IEEE APSCC 2009), (c) IEEE Computer Society, December 7-11, 2009, Biopolis, Singapore.
10. Data concern-aware service engineering process
[Figure: process diagram with two groups of steps, "Typical activities for data wrapping and publishing" and "Typical activities for data updating & retrieval"]
11. Wrapping, selecting, and updating data in DaaS
Typically different strategies for structured data and unstructured data – not our main work
We reuse existing techniques and plug in our data concern evaluation and publishing techniques
12. Evaluating data concerns (1)
Based on three concepts:
evaluation scopes, evaluation modes, and integration models
Evaluation scopes – enable fine-grained evaluation
Three scopes: data resource, service operation, and service as a whole
Evaluation modes – suitable for different types of data
Off-line (before the data is accessed) and on-the-fly (when the data is requested)
Integration models – suitable for different tool integration strategies
Push and pull data concerns
Pass-by-value versus pass-by-reference to data concern evaluation tools
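The difference between the pull and push integration models can be sketched as follows. This is only an illustration under stated assumptions: `Scope`, `ConcernStore`, and `non_empty_ratio` are hypothetical names, not part of the framework presented in the slides, and the toy metric stands in for a real evaluation tool.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Scope(Enum):
    """The three evaluation scopes named on the slide."""
    DATA_RESOURCE = "data resource"
    SERVICE_OPERATION = "service operation"
    SERVICE = "service as a whole"


@dataclass
class ConcernStore:
    """Hypothetical DaaS-side store of concerns, keyed by (scope, target name)."""
    concerns: Dict[tuple, Dict[str, float]] = field(default_factory=dict)

    def pull(self, scope, target, items, tool):
        # Pull model: the DaaS invokes the evaluation tool and keeps the result.
        self.concerns[(scope, target)] = tool(items)
        return self.concerns[(scope, target)]

    def push(self, scope, target, metrics):
        # Push model: an external tool delivers metrics it computed on its own.
        self.concerns.setdefault((scope, target), {}).update(metrics)


def non_empty_ratio(items):
    """Toy evaluation tool: share of items with a non-None 'value' field."""
    if not items:
        return {"completeness": 0.0}
    filled = sum(1 for item in items if item.get("value") is not None)
    return {"completeness": filled / len(items)}


store = ConcernStore()
items = [{"value": 1}, {"value": None}, {"value": 3}, {"value": 4}]
pulled = store.pull(Scope.DATA_RESOURCE, "population", items, non_empty_ratio)
store.push(Scope.SERVICE, "daas", {"availability": 0.99})
```

In both cases the concerns end up attached to a scope, so a consumer can look them up without knowing which integration model produced them.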
13. Evaluating data concerns (2)
[Figure panels: pull, pass-by-reference; pull, pass-by-value; push, pass-by-value]
14. Publishing data concern information
Off-line publishing of data concerns
suitable for static data concerns
the publishing of data concerns of a data resource is separated from the service operation which provides the access to the data resource
On-the-fly publishing of data concerns by associating concerns with retrieved data resources
the resulting data resources (e.g., via queries) are annotated with data concerns evaluated by data concern evaluation tools
suitable for providing dynamic data concerns
On-the-fly publishing of data concerns through queries
the use of different service operation parameters to query data concerns of data resources
suitable for validating data concerns before accessing data resources
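As a rough sketch of the first two publishing styles, the snippet below contrasts an off-line concern document with an on-the-fly annotated result. The function names, the dictionary layout, and the completeness metric are assumptions made for illustration; the slides do not prescribe a representation.

```python
import json


def evaluate_concerns(items):
    """Hypothetical evaluation tool: returns metric name -> value."""
    filled = sum(1 for item in items if item.get("value") is not None)
    return {"completeness": filled / len(items) if items else 0.0}


def publish_offline(resource_name, items):
    """Off-line publishing: a concern document kept separate from the data."""
    return {"resource": resource_name, "concerns": evaluate_concerns(items)}


def retrieve_annotated(resource_name, items, predicate):
    """On-the-fly publishing: evaluate the query result and annotate it."""
    result = [item for item in items if predicate(item)]
    return {
        "resource": resource_name,
        "data": result,
        "concerns": evaluate_concerns(result),
    }


items = [
    {"year": 1990, "value": 5.1},
    {"year": 2000, "value": None},
    {"year": 2009, "value": 6.3},
]
offline_doc = publish_offline("population", items)
annotated = retrieve_annotated("population", items, lambda i: i["year"] >= 2000)
print(json.dumps(annotated, indent=2))
```

The off-line document describes the whole resource once, while the annotated result carries concerns computed for exactly the data that was returned, which is why the two can report different values.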
15. How do we utilize the data concern-aware service engineering process?
Using this model we can determine and publish several concerns
Our proof of concept
A framework for evaluating and publishing QoD of DaaS
A proof-of-concept implementation of the data concern-aware service engineering process
Another example: modeling and publishing privacy concerns for DaaS [ECOWS 2010]
Michael Mrissa, Salah-Eddine Tbahriti, Hong-Linh Truong, "Privacy model and annotation for DaaS", The 8th European Conference on Web Services (ECOWS 2010), (c) IEEE Computer Society, 1-3 December, 2010, Ayia Napa, Cyprus
16. QoD framework: pull QoD evaluation models for DaaS
Pass-by-reference and pass-by-value
References to data resources: URIs
Values: any object
Third-party data evaluation tools
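The distinction between passing a reference (a URI) and passing the values themselves to an evaluation tool might look like the sketch below. The `evaluate` function, the `missing_ratio` metric, and the example URI are hypothetical, and an in-memory dictionary stands in for dereferencing the URI over HTTP.

```python
from typing import Callable, Dict, List, Union

Items = List[dict]

# Stand-in for dereferencing a URI; a real tool would issue an HTTP GET here.
FAKE_WEB: Dict[str, Items] = {
    "http://example.org/daas/gdp": [{"value": 10.0}, {"value": None}],
}


def count_missing(items: Items) -> Dict[str, float]:
    """Toy third-party tool logic: fraction of items with a missing value."""
    if not items:
        return {"missing_ratio": 0.0}
    missing = sum(1 for item in items if item.get("value") is None)
    return {"missing_ratio": missing / len(items)}


def evaluate(payload: Union[str, Items],
             resolve: Callable[[str], Items] = FAKE_WEB.__getitem__) -> Dict[str, float]:
    """Accept either a reference (URI string) or the values themselves."""
    # Pass-by-reference: the tool fetches the data; pass-by-value: it is handed over.
    items = resolve(payload) if isinstance(payload, str) else payload
    return count_missing(items)


by_reference = evaluate("http://example.org/daas/gdp")
by_value = evaluate([{"value": 1.0}, {"value": 2.0}])
```

Passing a reference keeps large data sets out of the request, at the cost of requiring the tool to have access to the resource; passing values works with any tool but moves the data.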
17. QoD framework: publishing concerns (1)
Off-line data concern publishing
a common data concern publication specification
a tool for providing data concerns according to the specification
supported by external service information systems
18. QoD framework: publishing concerns (2)
On-the-fly querying of data concerns associated with data resources
Using our proposed REST parameter convention in [Composable Web 2010]
Based on metric names in the data concern specification
Specifying requests by using query parameters in the form of metricName=value
GET /resource?accuracy="0.5"&location="Europe"
Hong Linh Truong, Schahram Dustdar, Andrea Maurino, Marco Comerio: Context, Quality and Relevance: Dependencies and
Impacts on RESTful Web Services Design. ICWE Workshops 2010: 347-359
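A minimal sketch of how a service could interpret the metricName=value convention is shown below. The matching semantics are an assumption, since the slide does not state them: numeric values are treated as minimum thresholds and other values as exact matches. The function names and the sample resources are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def parse_concern_query(url):
    """Split a request URL into its path and a dict of metricName -> value."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    # Keep the first value per metric and strip optional surrounding quotes.
    return parts.path, {name: values[0].strip("\"'") for name, values in params.items()}


def satisfies(concerns, requested):
    """Assumed semantics: numeric values are minimum thresholds, others must match."""
    for name, wanted in requested.items():
        actual = concerns.get(name)
        if actual is None:
            return False
        try:
            if float(actual) < float(wanted):
                return False
        except (TypeError, ValueError):
            # Non-numeric metric: fall back to exact string comparison.
            if str(actual) != wanted:
                return False
    return True


path, requested = parse_concern_query("/resource?accuracy=0.5&location=Europe")
resources = {
    "r1": {"accuracy": 0.8, "location": "Europe"},
    "r2": {"accuracy": 0.3, "location": "Europe"},
    "r3": {"accuracy": 0.9, "location": "Asia"},
}
matching = [name for name, c in resources.items() if satisfies(c, requested)]
```

Because the parameter names come from the data concern specification, the same parsing code works for any metric the provider chooses to publish.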
19. QoD framework: QoD monitoring and composition
QoD concern monitoring and composition are useful for the evaluation of aggregated data resources
Our approach
Utilizing monitoring rules
QoD metrics of data resources are passed to a rule engine
Rules are user-defined for monitoring and composing QoD metrics
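The rule-based approach can be illustrated with a small stand-in for a rule engine. This sketch does not use Drools and does not reproduce the authors' rules: the `Rule` class, the `run_rules` function, the "weakest source" composition, and the 0.70 threshold are all assumptions chosen to show the idea of user-defined monitoring and composition rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metrics = Dict[str, float]


@dataclass
class Rule:
    """A user-defined rule: fire the action when the condition holds."""
    name: str
    condition: Callable[[Metrics], bool]
    action: Callable[[Metrics], str]


def run_rules(rules: List[Rule], metrics: Metrics) -> List[str]:
    """Evaluate every rule against the metrics and collect fired actions."""
    return [rule.action(metrics) for rule in rules if rule.condition(metrics)]


def compose_min(resources: List[Metrics], metric: str) -> float:
    """Assumed composition: aggregated quality is limited by the weakest source."""
    return min(r[metric] for r in resources)


sources = [{"completeness": 0.9}, {"completeness": 0.6}]
aggregated = {"completeness": compose_min(sources, "completeness")}
rules = [
    Rule("low-completeness",
         lambda m: m["completeness"] < 0.7,
         lambda m: f"warn: completeness {m['completeness']:.2f} below 0.70"),
    Rule("perfect", lambda m: m["completeness"] == 1.0, lambda m: "ok: complete"),
]
alerts = run_rules(rules, aggregated)
```

Keeping the composition function and the monitoring rules separate lets a user change how metrics are aggregated without touching the rules that react to them.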
20. Experiments
Implementation
Java, JAX-RS/Jersey
Drools
Utilizing UNDataAPI - www.undata-api.org
XML data sets without QoD
Illustrative examples: check data from 1990-2009
datasetcompleteness: the completeness of the list of countries
dataelementcompleteness: the completeness of data elements in the list metrics
RESTful services wrapping UNDataAPI
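The two metrics could be computed along the following lines. The slides name the metrics but do not give formulas, so the definitions here are assumptions: datasetcompleteness is taken as the share of expected countries that appear in the data set, and dataelementcompleteness as the share of (country, year) cells that hold a value. The record layout, the country list, and the shortened year range are illustrative only.

```python
def dataset_completeness(records, expected_countries):
    """Assumed definition: share of expected countries present in the data set."""
    present = {r["country"] for r in records}
    return len(present & set(expected_countries)) / len(expected_countries)


def data_element_completeness(records, years):
    """Assumed definition: share of (country, year) cells with a non-missing value."""
    countries = {r["country"] for r in records}
    expected = len(countries) * len(years)
    if expected == 0:
        return 0.0
    filled = {(r["country"], r["year"]) for r in records
              if r["year"] in years and r.get("value") is not None}
    return len(filled) / expected


# Illustrative records; the experiment in the slides covers 1990-2009.
years = range(1990, 1994)
expected_countries = ["Austria", "China", "Brazil", "Kenya"]
records = (
    [{"country": "Austria", "year": y, "value": 1.0} for y in years]
    + [{"country": "China", "year": 1990, "value": 2.0},
       {"country": "China", "year": 1991, "value": 2.1},
       {"country": "China", "year": 1992, "value": None}]
)
ds = dataset_completeness(records, expected_countries)
de = data_element_completeness(records, years)
```

With these inputs two of four expected countries are present and six of eight cells are filled, so the metrics distinguish missing countries from missing values within the countries that are present.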
21. Experiment: evaluating and annotating QoD metrics
http://www.infosys.tuwien.ac.at/prototyp/SOD1/dataconcerns/
24. Conclusions and future work
A novel, generic data concern-aware service engineering process for DaaS
A proof-of-concept implementation for evaluating quality of data in REST-based DaaS
but in principle other concerns can be supported
more evaluation is needed
Open research questions:
how to deal with other concerns?
what are the trade-offs between on-line and off-line evaluation?
how to utilize evaluated data concerns for optimizing data compositions?
25. Thanks for your attention!
Hong-Linh Truong
Distributed Systems Group
Vienna University of Technology
Austria
truong@infosys.tuwien.ac.at
http://www.infosys.tuwien.ac.at