The evolution of semantic
technology evaluation
in my own flesh
(The 15 tips for technology evaluation)
Raúl García-Castro
Ontology Engineering Group.
Universidad Politécnica de Madrid, Spain
rgarcia@fi.upm.es

Speaker: Raúl García-Castro

Talk at IMATI-CNR,
October 15th,
Genova, Italy
Index

•  Self-awareness
•  Crawling (Graduation Project)
•  Walking (Ph.D. Thesis)
•  Cruising (Postdoctoral Research)
•  Insight
Who am I?

•  Assistant Professor
-  Ontology Engineering Group
-  Computer Science School at Universidad Politécnica de Madrid (UPM)

•  Research lines
-  Evaluation and benchmarking of semantic technologies
•  Conformance and interoperability of ontology engineering tools
•  Evaluation infrastructures
-  Ontological engineering
•  Sensors, ALM, energy efficiency, context, software evaluation
-  Application integration
http://www.garcia-castro.com/
Semantic Web technologies
The Semantic Web is:
•  "An extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation" [Berners-Lee et al., 2001]
•  A common framework for data sharing and reusing across applications

•  Distinctive characteristics:
   -  Use of W3C standards
   -  Use ontologies as data models
   -  Inference of new information
   -  Open world assumption

•  High heterogeneity:
   -  Different functionalities (in general and in particular)
   -  Different KR formalisms (different expressivity, different reasoning capabilities)

[Figure: a component-based framework of Semantic Web technologies, grouping components such as ontology editors, browsers, matchers, reasoners, annotation tools, repositories and semantic web service components into the categories ontology development & management, ontology customization, ontology evolution, ontology alignment, ontology instance generation, querying and reasoning, semantic web services, and data management]

García-Castro, R.; Muñoz-García, O.; Gómez-Pérez, A.; Nixon L. "Towards a component-based framework for developing Semantic
Web applications". 3rd Asian Semantic Web Conference (ASWC 2008). 2-5 February, 2009. Bangkok, Thailand.

Ontology engineering tools
Allow the creation and management of ontologies:
•  Ontology editors
-  User oriented

•  Ontology language APIs
-  Programming oriented

Index

•  Self-awareness
•  Crawling (Graduation Project)
•  Walking (Ph.D. Thesis)
•  Cruising (Postdoctoral Research)
•  Insight

http://www.phdcomics.com/comics/archive.php?comicid=1012
Evaluation goal
GQM paradigm: any software measurement activity should be preceded by:
1. The identification of a software engineering goal ...
2. ... which leads to questions ...
3. ... which in turn lead to actual metrics.

Goal: to improve the performance and the scalability of the methods provided by the ontology management APIs of ontology development tools.

The questions address latency and scalability, and each leads to a metric:
•  What is the actual performance of the API methods? - Execution time of each method
•  Is the performance of the methods stable? - Variance of the execution times of each method
•  Are there any anomalies in the performance of the methods? - Percentage of execution times out of range in each method's sample
•  Do changes in a method's parameters affect its performance? - Execution time with parameter A compared with execution time with parameter B
•  Does tool load affect the performance of the methods? - Tool load versus execution time relationship

Metric: execution times of the methods of the API with different load factors
Evaluation data
•  Atomic operations of the ontology management API
•  Multiple benchmarks defined for each method according to changes in its parameters
•  Benchmarks parameterised according to the number of consecutive executions of the method

Example method signature: insertConcept(String ontology, String concept)

72 methods in total: insertConcept, insertRelation, insertClassAttribute, insertInstanceAttribute, insertConstant, insertReasoningElement, insertInstance, updateConcept, updateRelation, updateClassAttribute, updateInstanceAttribute, updateConstant, updateReasoningElement, updateInstance, ...

128 benchmarks in total, for example:
•  benchmark1_1_08(N): "Inserts N concepts in 1 ontology" (Ontology_1 receives Concept_1 ... Concept_N)
•  benchmark1_1_09(N): "Inserts 1 concept in N ontologies" (Concept_1 is inserted into Ontology_1 ... Ontology_N)
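To make the benchmark structure concrete, the following is a minimal sketch of how two of these parameterised benchmarks could be coded against a hypothetical OntologyApi interface; the interface, the timing code and the naming scheme are illustrative assumptions, not WebODE's actual ontology management API.

// Minimal sketch of two parameterised benchmarks; OntologyApi is a hypothetical
// stand-in for an ontology management API, not the real tool API.
interface OntologyApi {
    void createOntology(String ontology);
    void insertConcept(String ontology, String concept);
}

class BenchmarkSketch {

    // benchmark1_1_08(N): inserts N concepts in 1 ontology, timing each call in ms
    static long[] benchmark1_1_08(OntologyApi api, int n) {
        long[] times = new long[n];
        api.createOntology("Ontology_1");
        for (int i = 1; i <= n; i++) {
            long start = System.nanoTime();
            api.insertConcept("Ontology_1", "Concept_" + i);
            times[i - 1] = (System.nanoTime() - start) / 1_000_000;
        }
        return times;
    }

    // benchmark1_1_09(N): inserts 1 concept in each of N ontologies
    static long[] benchmark1_1_09(OntologyApi api, int n) {
        long[] times = new long[n];
        for (int i = 1; i <= n; i++) {
            api.createOntology("Ontology_" + i);
            long start = System.nanoTime();
            api.insertConcept("Ontology_" + i, "Concept_1");
            times[i - 1] = (System.nanoTime() - start) / 1_000_000;
        }
        return times;
    }
}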

Workload generator
•  Generates and inserts into the tool synthetic ontologies according to:
   -  The load factor (X), which defines the size of the ontology data
   -  An ontology structure that depends on the benchmarks

Benchmark          Operation                              Execution needs
benchmark1_1_08    Inserts N concepts in an ontology      1 ontology
benchmark1_1_09    Inserts a concept in N ontologies      N ontologies
benchmark1_3_20    Removes N concepts from an ontology    1 ontology with N concepts
benchmark1_3_21    Removes a concept from N ontologies    N ontologies with 1 concept

For executing all the benchmarks, the ontology structure includes the execution needs of all the benchmarks
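A minimal sketch of how the workload generator could prepare these execution needs, again over a hypothetical OntologyApi interface; in the real generator the amount of synthetic data is driven by the load factor X, and the naming scheme below is an assumption for illustration only.

// Sketch of workload preparation for the two removal benchmarks above.
class WorkloadSketch {

    interface OntologyApi {
        void createOntology(String ontology);
        void insertConcept(String ontology, String concept);
    }

    // benchmark1_3_20 needs 1 ontology populated with N concepts
    static void prepareSingleOntology(OntologyApi api, int n) {
        api.createOntology("Ontology_1");
        for (int i = 1; i <= n; i++) {
            api.insertConcept("Ontology_1", "Concept_" + i);
        }
    }

    // benchmark1_3_21 needs N ontologies, each with 1 concept
    static void prepareMultipleOntologies(OntologyApi api, int n) {
        for (int i = 1; i <= n; i++) {
            api.createOntology("Ontology_" + i);
            api.insertConcept("Ontology_" + i, "Concept_1");
        }
    }
}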

Evaluation infrastructure
[Figure: components of the evaluation infrastructure: Benchmark Suite Executor, Performance Benchmark Suite, Workload Generator, Ontology Development Tool (the part to be instantiated for each tool), Measurement Data Library and Statistical Analyser]

http://knowledgeweb.semanticweb.org/wpbs/
Statistical analyser
[Figure: the Statistical Analyser (BenchStats) processes the Measurement Data Library produced by the benchmark executions using statistical software. Example input: benchmark1_1_08, 400 measurements (2134 ms, 2300 ms, 2242 ms, 2809 ms, ...); benchmark1_1_09, 400 measurements (1399 ms, 2180 ms, ...); benchmark1_3_20, 400 measurements (2032 ms, 1459 ms, ...)]

Benchmark          Load   N    UQ   LQ   IQR  Median  % Outliers  Function
benchmark1_1_08    5000   400  60   60   0    60      1.25        y=62.0-0.009x
benchmark1_1_09    5000   400  912  901  11   911     1.75        y=910.25-0.003x
benchmark1_3_20    5000   400  160  150  10   150     1.25        y=155.25-0.003x
benchmark1_3_21    5000   400  160  150  10   151     0.25        y=154.96-0.001x
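A sketch of the per-benchmark statistics behind this table (median, quartiles, IQR and percentage of outliers). The quantile interpolation and the 1.5 x IQR outlier rule are assumptions about a box-plot-style analysis, not necessarily the exact procedure that was used.

import java.util.Arrays;

// Sketch: median, LQ, UQ, IQR and % outliers over a sample of execution times.
public class BenchStats {

    // Linear-interpolation quantile over a sorted sample (an assumption).
    static double quantile(double[] sorted, double q) {
        double pos = q * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = (int) Math.ceil(pos);
        return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
    }

    public static void main(String[] args) {
        // Made-up sample in ms; a real run would have 400 measurements per benchmark
        double[] times = {2134, 2300, 2242, 2809, 2190, 2258, 2220, 2301, 2215, 2240};
        Arrays.sort(times);

        double lq = quantile(times, 0.25);
        double uq = quantile(times, 0.75);
        double iqr = uq - lq;
        double median = quantile(times, 0.5);

        // Outliers: values outside [LQ - 1.5*IQR, UQ + 1.5*IQR] (box-plot rule, an assumption)
        double low = lq - 1.5 * iqr, high = uq + 1.5 * iqr;
        long outliers = Arrays.stream(times).filter(t -> t < low || t > high).count();
        double pctOutliers = 100.0 * outliers / times.length;

        System.out.printf("median=%.1f ms, LQ=%.1f, UQ=%.1f, IQR=%.1f, outliers=%.2f%%%n",
                median, lq, uq, iqr, pctOutliers);
    }
}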

Result analysis - Latency
Metric for the execution time: the median of the execution times of a method.
Finding: 8 methods with execution times > 800 ms (N=400, X=5000).

Metric for the variability of the execution time: the interquartile range of the execution times of a method.
Finding: 3 methods with IQR > 11 ms (N=400, X=5000).

Metric for anomalies in the execution times: the percentage of outliers in the execution times of a method.
Finding: 2 methods with % outliers > 5% (N=400, X=5000).

Effect of changes in method parameters: comparison of the medians of the execution times of the benchmarks that use the same method.
Finding: 5 methods with differences in execution times > 60 ms (N=400, X=5000).
Result analysis - Scalability
Effect of changes in WebODE’s load:
Slope of the function estimated by simple linear regression of the medians of the
execution times from a minimum load (X=500) to a maximum one (X=5000).

8 methods with
slope>0.1 ms.

N=400, X=500..5000
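A sketch of this scalability metric: the least-squares slope of a benchmark's median execution times against the load factor. The sample medians below are made up for illustration.

// Sketch: slope of median execution time vs. load factor via simple linear regression.
public class ScalabilitySlope {

    static double slope(double[] load, double[] median) {
        int n = load.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += load[i];
            sy += median[i];
            sxx += load[i] * load[i];
            sxy += load[i] * median[i];
        }
        return (n * sxy - sx * sy) / (n * sxx - sx * sx);
    }

    public static void main(String[] args) {
        double[] load   = {500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000};
        double[] median = {60, 60, 61, 61, 62, 62, 63, 63, 64, 65};   // made-up medians in ms
        System.out.printf("slope = %.4f ms per load unit%n", slope(load, median));
    }
}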

Limitations
•  Evaluating other tools is expensive
[Figure: the evaluation infrastructure (Benchmark Suite Executor, Workload Generator, Performance Benchmark Suite, Measurement Data Library, Statistical Analyser) has to be instantiated once per Ontology Development Tool evaluated]

•  Analysis of results was difficult
-  The evaluation was executed 10 times with different load factors:
500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, and 5000
-  128 benchmarks x 10 executions = 1280 files with results!!!!!

García-Castro R., Gómez-Pérez A "Guidelines for Benchmarking the Performance of Ontology Management APIs" 4th International
Semantic Web Conference (ISWC2005), LNCS 3729. November 2005. Galway, Ireland.

The 15 tips for technology evaluation

•  Know the technology
•  Support different types of technology
•  Automate the evaluation framework
•  Expect reproducibility
•  Beware of result analysis
•  Learn statistics
•  Plan for evaluation requirements
Index

•  Self-awareness
•  Crawling (Graduation Project)
•  Walking (Ph.D. Thesis)
•  Cruising (Postdoctoral Research)
•  Insight

KHAAAAN!

http://www.phdcomics.com/comics/archive.php?comicid=500
Interoperability in the Semantic Web
•  Interoperability is the ability of Semantic Web technologies to interchange ontologies and use them
   -  At the information level, not at the system level
   -  In terms of knowledge reuse, not information integration
•  In the real world it is not feasible to use a single system or a single formalism
•  Different behaviours in interchanges between different formalisms:

[Figure: example interchanges of an ontology in which classes B and C are subclasses of A and C is disjoint with B. Interchanges can be lossless or lossy depending on whether the same or a different formalism is used at both ends; in the lossy cases the disjointness is dropped or is represented through an ad-hoc myDisjoint property]
Evaluation goal
To evaluate and improve the interoperability of
Semantic Web technologies using RDF(S) and OWL as
interchange languages

Evaluation workflow - Manual
[Figure: the manual evaluation workflow. Import: an RDF(S)/OWL ontology Oi is imported into Tool X, producing Oi'. Export: an ontology Oi in Tool X is exported to RDF(S)/OWL, producing Oi'. In both cases the result is compared with the original, Oi = Oi' plus the information added minus the information lost (α - α' or β - β'). Interoperability: Oi is interchanged from Tool X to Tool Y, producing Oi'', with Oi = Oi'' + α - α' + β - β']
Evaluation workflow - Automatic

[Figure: the automatic evaluation workflow. Existing ontologies O1..On in the interchange language (RDF(S)/OWL) go through two steps. Step 1 (import + export by Tool X): O1 → O1' → O1'', with O1 = O1'' + α - α'. Step 2 (import + export by Tool Y): O1'' → O1''' → O1'''', with O1'' = O1'''' + β - β'. Interchange: O1 = O1'''' + α - α' + β - β']

Evaluation data - OWL Lite Import Test Suite
The test suite combines component combinations and RDF/XML syntax variants. For example, the following two serialisations describe the same class:

<rdf:Description rdf:about="#class1">
  <rdf:type rdf:resource="&rdfs;Class"/>
</rdf:Description>

<rdfs:Class rdf:about="#class1">
</rdfs:Class>
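Both serialisations above encode exactly the same triple, so a conformant importer should treat them identically. One quick way to check that kind of equivalence is to parse both variants and compare the resulting RDF graphs; the sketch below uses Apache Jena for illustration and is not part of the original test suite tooling (the base IRI is made up).

import java.io.StringReader;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

// Sketch: the two RDF/XML variants parse to isomorphic models.
public class SyntaxVariants {

    static Model parse(String body) {
        String doc = "<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#' "
                   + "xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#' "
                   + "xml:base='http://example.org/base'>" + body + "</rdf:RDF>";
        Model model = ModelFactory.createDefaultModel();
        model.read(new StringReader(doc), null);   // default serialisation language is RDF/XML
        return model;
    }

    public static void main(String[] args) {
        Model a = parse("<rdf:Description rdf:about='#class1'>"
                + "<rdf:type rdf:resource='http://www.w3.org/2000/01/rdf-schema#Class'/>"
                + "</rdf:Description>");
        Model b = parse("<rdfs:Class rdf:about='#class1'/>");
        System.out.println("Isomorphic models? " + a.isIsomorphicWith(b));   // expected: true
    }
}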

Component combinations include: subclass of class, subclass of restriction, value constraints, set operators, cardinality + object property, and cardinality + datatype property.

Group                                                                   No.
Class hierarchies                                                        17
Class equivalences                                                       12
Classes defined with set operators                                        2
Property hierarchies                                                      4
Properties with domain and range                                         10
Relations between properties                                              3
Global cardinality constraints and logical property characteristics       5
Single individuals                                                        3
Named individuals and properties                                          5
Anonymous individuals and properties                                      3
Individual identity                                                       3
Syntax and abbreviation                                                  15
TOTAL                                                                    82

David S., García-Castro, R.; Gómez-Pérez, A. "Defining a Benchmark Suite for Evaluating the Import of OWL Lite Ontologies". Second International
Workshop OWL: Experiences and Directions 2006 (OWL2006). November, 2006. Athens, Georgia, USA.

Evaluation criteria
•  Execution informs about the correct execution:
   -  OK: no execution problem
   -  FAIL: some execution problem
   -  Comparer Error (C.E.): comparer exception
   -  Not Executed (N.E.): second step not executed
•  Information added or lost in terms of triples: Oi = Oi' + α - α'
•  Interchange informs whether the ontology has been interchanged correctly (Oi = Oi'?), with no addition or loss of information:
   -  SAME if Execution is OK and Information added and Information lost are void
   -  DIFFERENT if Execution is OK but Information added or Information lost are not void
   -  NO if Execution is FAIL, N.E. or C.E.

Evaluation campaigns
RDF(S) Interoperability Benchmarking: 6 tools (3 ontology development tools and 3 ontology repositories), some of them Frame-based.

OWL Interoperability Benchmarking: 9 tools (5 ontology development tools, 3 ontology repositories and 1 ontology-based annotation tool), including SemTalk; some Frame-based, some OWL-based.
Evaluation infrastructure - IRIBA


http://knowledgeweb.semanticweb.org/iriba/

Evaluation infrastructure - IBSE
[Figure: the IBSE workflow. (1) Describe benchmarks: the OWL Lite Import Benchmark Suite is described by means of a benchmark ontology (benchmarkOntology), serialised as RDF/XML documents using the rdf, rdfs, owl and xsd namespaces. (2) Execute benchmarks over the tools, producing execution results described with a result ontology (resultOntology). (3) Generate reports (HTML, SVG)]

•  Automatically executes experiments between all the tools
•  Allows configuring different execution parameters
•  Uses ontologies to represent benchmarks and results
•  Depends on external ontology comparers (KAON2 OWL Tools and RDFutils)

http://knowledgeweb.semanticweb.org/benchmarking_interoperability/ibse/
García-Castro, R.; Gómez-Pérez, A., Prieto-González J. "IBSE: An OWL Interoperability Evaluation Infrastructure". Third International Workshop OWL:
Experiences and Directions 2007 (OWL2007). June, 2007. Innsbruck, Austria.

Evaluation results - Variability
•  High variability in evaluation results
Tool import/export outcomes: models and executes; does not model and executes; models and fails; does not model and fails; not executed.
Ontology comparison outcomes: same information; more information; less information; tool fails; comparer fails; not valid ontology.

•  Different perspectives for analysis:
   -  Results per tool / pair of tools
   -  Results per component
   -  Result evolution over time
   -  ...

[Figures: a matrix of component combinations (classes, class hierarchies, properties, instances, etc.) against tool combinations showing which interchanges work; a chart of result evolution over time (04-2005 to 01-2006); and origin/destination matrices with the percentage of successful interchanges between the tools (Jena, Protégé-OWL, SWI-Prolog, KAON2, GATE, SemTalk, WebODE and Protégé-Frames)]

Evaluation results - Interoperability
Clear picture of the interoperability between different tools
•  Low interoperability and few clusters of interoperable tools
•  Interoperability depends on:
-  Ontology translation (tool knowledge model)
-  Specification (development decisions)
-  Robustness (tool defects)
-  Tools participating in the interchange (each behaves differently)

•  Tools have improved
•  Involvement of tool developers is needed
-  Tool developers have been informed
-  Tool improvement is out of our scope

•  Results are expected to change
-  Continuous evaluation is needed
García-Castro, R.; Gómez-Pérez, A. "Interoperability results for Semantic Web technologies using OWL as the interchange language". Web
Semantics: Science, Services and Agents in the World Wide Web. ISSN: 1570-8268. Elsevier. Volume 8, number 4. pp. 278-291. November 2010.
García-Castro, R.; Gómez-Pérez, A. "RDF(S) Interoperability Results for Semantic Web Technologies". International Journal of Software Engineering
and Knowledge Engineering. ISSN: 0218-1940. Editor: Shi-Kuo Chang. Volume 19, number 8. pp. 1083-1108. December 2009.

Benchmarking interoperability
Method for benchmarking interoperability
•  Common for different Semantic Web technologies
•  Problem-focused instead of tool-focused
•  Manual vs automatic experiments:
-  It depends on the specific needs of the benchmarking
-  Automatic: cheaper, more flexible and extensible
-  Manual: higher quality of results

Resources for benchmarking interoperability
•  All the benchmark suites, software and results are publicly
available
•  Independent of:
   -  The interchange language
   -  The input ontologies

[Figure: benchmarking resources. The RDF(S) Interoperability Benchmarking relied on the RDF(S) Import, RDF(S) Export and RDF(S) Interoperability benchmark suites, executed manually or automatically (rdfsbs, IRIBA); the OWL Interoperability Benchmarking relied on the OWL Lite Import Benchmark Suite, executed automatically between pairs of tools with IBSE]

García-Castro, R. "Benchmarking Semantic Web technology". Studies on the Semantic Web vol. 3. AKA Verlag – IOS Press. ISBN:
978-3-89838-622-7. January 2010.

Limitations
•  Number of results to analyse increased exponentially
   -  2168 executions in the RDF(S) benchmarking activity and
   -  6642 executions in the OWL one
•  Hard to support and maintain different test data and tools
•  Every tool to be evaluated had to be deployed on the same computer

The 15 tips for technology evaluation

•  Know the technology
•  Support different test data
•  Support different types of technology
•  Use machine-processable descriptions of evaluation resources
•  Automate the evaluation framework
•  Expect reproducibility
•  Beware of result analysis
•  Learn statistics
•  Plan for evaluation requirements
•  Organize (or join) evaluation campaigns
Index

•  Self-awareness
•  Crawling (Graduation Project)
•  Walking (Ph.D. Thesis)
•  Cruising (Postdoctoral Research)
•  Insight

http://www.phdcomics.com/comics/archive.php?comicid=570
The SEALS Project (RI-238975)

http://www.seals-project.eu/

Project Coordinator: Asunción Gómez Pérez <asun@fi.upm.es>
EC contribution: 3.500.000 €
Duration: June 2009 - June 2012

Partners:
•  Universidad Politécnica de Madrid, Spain (Coordinator)
•  University of Sheffield, UK
•  University of Mannheim, Germany
•  Forschungszentrum Informatik, Germany
•  University of Zurich, Switzerland
•  University of Innsbruck, Austria
•  STI International, Austria
•  Institut National de Recherche en Informatique et en Automatique, France
•  Open University, UK
•  Oxford University, UK
Semantic technology evaluation @ SEALS

SEALS Platform
SEALS Evaluation Campaigns
SEALS Evaluation Services
SEALS Community

Wrigley S.; García-Castro R.; Nixon L. "Semantic Evaluation At Large Scale (SEALS)". 21st International World Wide Web Conference (WWW 2012).
European projects track. pp. 299-302. Lyon, France. 16-20 April 2012.

The SEALS entities

[Figure: the SEALS entities: Tools (in five areas: ontology engineering, storage and reasoning, ontology matching, semantic search, semantic web services), Evaluations, Test Data and Results (raw results and their interpretations)]
Structure of the SEALS entities
[Figure: structure of the SEALS entities. Each entity comprises Metadata, described with the SEALS ontologies and used for discovery and validation, and Data (Java binaries, shell scripts, bundles, BPEL workflows, ontologies), which together enable its exploitation]

http://www.seals-project.eu/ontologies/

García-Castro R.; Esteban-Gutiérrez M.; Kerrigan M.; Grimm S. "An Ontology Model to Support the Automatic Evaluation of Software". 22nd
International Conference on Software Engineering and Knowledge Engineering (SEKE 2010). pp. 129-134. Redwood City, USA. 1-3 July 2010.

SEALS logical architecture

[Figure: SEALS logical architecture. Evaluation organisers, technology providers and technology adopters interact through the SEALS Portal; software agents use the Runtime Evaluation Service and the SEALS Service Manager, which work over the SEALS Repositories (the Test Data, Tools, Results and Evaluation Descriptions repository services)]

García-Castro R.; Esteban-Gutiérrez M.; Gómez-Pérez A. "Towards an Infrastructure for the Evaluation of Semantic Technologies". eChallenges
e-2010 Conference (e-2010). pp. 1-8. Warsaw, Poland. 27-29 October 2010.

Challenges
•  Tool heterogeneity
   -  Hardware requirements
   -  Software requirements
•  Reproducibility
   -  Ensure the execution environment offers the same initial status

Virtualization as a technology enabler:
[Figure: a Processing Node manages Execution Nodes, each running a tool inside a virtual machine. Virtualization solutions: VMware Server 2.0.2, VMware vSphere 4, Amazon EC2 (in progress)]
Evaluation campaign methodology

SEALS Methodology for Evaluation Campaigns (Raúl García-Castro and Stuart N. Wrigley, September 2011)

•  SEALS-independent
•  Includes:
   -  Actors
   -  Process
   -  Recommendations
   -  Alternatives
   -  Terms of participation
   -  Use rights

Phases: INITIATION, INVOLVEMENT, DISSEMINATION, PREPARATION & EXECUTION, FINALIZATION

García Castro R.; Martin-Recuerda F.; Wrigley S. "SEALS. Deliverable 3.8 SEALS Methodology for Evaluation Campaigns v2". Technical Report.
SEALS project. July 2011.

Current SEALS evaluation services
Ontology engineering
  Evaluations: conformance, interoperability, scalability
  Test data: conformance and interoperability (RDF(S); OWL Lite, DL and Full; OWL 2 Expressive x3; OWL 2 Full); scalability (real-world, LUBM, real-world+, LUBM+)

Ontology reasoning
  Evaluations: DL reasoning (classification, class satisfiability, ontology satisfiability, entailment, non-entailment, instance retrieval); RDF reasoning (conformance)
  Test data: DL reasoning (Gardiner test suite, Wang et al. repository, versions of GALEN, ontologies from EU projects, instance retrieval test data); RDF reasoning (OWL 2 Full)

Ontology matching
  Evaluations: matching accuracy, matching accuracy multilingual, scalability (ontology size, # CPU)
  Test data: Benchmark, Anatomy, Conference, MultiFarm, Large Biomed (supported by SEALS)

Semantic search
  Evaluations: search accuracy and efficiency (automated); usability and satisfaction (user-in-the-loop)
  Test data: automated (EvoOnt, MusicBrainz from QALD-1); user-in-the-loop (Mooney, Mooney+)

Semantic web service
  Evaluations: SWS discovery
  Test data: OWLS-TC 4.0, SAWSDL-TC 3.0, WSMO-LITE-TC

New evaluation data – Conformance and interoperability
•  OWL DL test suite: keyword-driven approach
   -  Manual definition of tests in CSV/spreadsheets using a keyword library

[Figure: Test Suite Generator. A Test Suite Definition Script is expanded by a Preprocessor, using the Keyword Library, into an Expanded Test Suite Definition Script, which an Interpreter turns into the test suite (metadata plus ontology01.owl, ontology02.owl, ontology03.owl, ...)]

OWL2EG (http://knowledgeweb.semanticweb.org/benchmarking_interoperability/OWL2EG/)

•  OWL 2 test suite: automatically generate ontologies of increasing expressiveness
   -  Using ontologies in the Web
   -  Maximizing expressiveness

[Figure: ontology generation process. Online ontologies are obtained through ontology search and ontology module extraction, giving the initial ontologies; using the OWL API, the original test suite (with metadata) is extended into an expressive test suite (increase expressivity) and a full-expressive test suite (maximize expressivity)]

OWLDLGenerator (http://knowledgeweb.semanticweb.org/benchmarking_interoperability/OWLDLGenerator/)
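The second workflow above mentions the OWL API. Assuming the same library, the sketch below shows how an interpreter step could turn an expanded keyword-based definition such as Class(c1) Class(c2) SubClassOf(c1,c2) into an ontology file; the keyword names, IRIs and file name are made up for illustration and are not the actual keyword library.

import java.io.File;
import org.semanticweb.owlapi.apibinding.OWLManager;
import org.semanticweb.owlapi.model.*;

// Sketch: materialising a keyword-based test definition with the OWL API.
public class TestSuiteInterpreterSketch {

    public static void main(String[] args) throws Exception {
        OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
        OWLDataFactory factory = manager.getOWLDataFactory();

        IRI base = IRI.create("http://example.org/conformance/ontology01#");
        OWLOntology ontology =
                manager.createOntology(IRI.create("http://example.org/conformance/ontology01"));

        OWLClass c1 = factory.getOWLClass(IRI.create(base + "c1"));
        OWLClass c2 = factory.getOWLClass(IRI.create(base + "c2"));

        manager.addAxiom(ontology, factory.getOWLDeclarationAxiom(c1));    // Class(c1)
        manager.addAxiom(ontology, factory.getOWLDeclarationAxiom(c2));    // Class(c2)
        manager.addAxiom(ontology, factory.getOWLSubClassOfAxiom(c1, c2)); // SubClassOf(c1,c2)

        manager.saveOntology(ontology, IRI.create(new File("ontology01.owl").toURI()));
    }
}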

García-Castro R.; Gómez-Pérez A. "A Keyword-driven Approach for Generating Ontology Language Conformance Test Data". Engineering
Applications of Artificial Intelligence. ISSN: 0952-1976. Elsevier. Editor: B. Grabot.
Grangel-González I.; García-Castro R. "Automatic Conformance Test Data Generation Using Existing Ontologies in the Web". Second International
Workshop on Evaluation of Semantic Technologies (IWEST 2012). 28 May 2012. Heraklion, Greece.

1st Evaluation Campaign
Campaign                Tools
Ontology engineering    Jena, Sesame, Protégé 4, Protégé OWL, NEON toolkit, OWL API
Reasoning               HermiT, jcel, FaCT++
Matching                AROMA, ASMOV, Aroma, Falcon-AO, Lily, RiMOM, Mapso, CODI, AgreeMaker, Gerome*, Ef2Match
Semantic search         K-Search, Ginseng, NLP-Reduce, PowerAqua, Jena Arq
Semantic web service    4 OWLS-MX variants

Providers included HP Labs, Aduna, University of Stanford, the NEON Foundation, University of Manchester, University of Oxford, Tec. Universitat Dresden, INRIA, INFOTECH Soft, Nantes University, Southeast University, Tsinghua University, FZI, University of Mannheim, Advances in Computing Lab, RWTH Aachen, Nanyang Tec. University, K-Now Ltd, University of Zurich, KMi (Open University), Talis and DFKI.

29 tools from 8 countries

Nixon L.; García-Castro R.; Wrigley S.; Yatskevich M.; Trojahn-dos-Santos C.; Cabral L. "The state of semantic technology today – overview of the
first SEALS evaluation campaigns". 7th International Conference on Semantic Systems (I-SEMANTICS2011). Graz, Austria. 7-9 September 2011.

2nd Evaluation Campaign
[Table: tools in the 2nd evaluation campaign, grouped by work package (WP 10-14), with their providers and countries. Tools included Jena, Sesame, Protégé 4, Protégé OWL, NeOn toolkit, OWL API, HermiT, jcel, FaCT++, WSReasoner, AgrMaker, Aroma, AUTOMSv2, CIDER, CODI, CSA, GOMMA, Hertuda, LDOA beta, Lily, LogMap, LogMapLt, MaasMtch, MapEVO, MapPSO, MapSSS, Optima, WeSeEMtch, YAM++, K-Search, Ginseng, NLP-Reduce, PowerAqua, Jena Arq v2.8.2, Jena Arq v2.9.0, rdfQuery v0.5.1, Semantic Crystal, Affective Graphs, WSMO-LITE-OU, SAWSDL-OU, OWLS-URJC and OWLS-M0]

41 tools from 13 countries
Evaluation services
[Figure: using the SEALS evaluation services. Registered tools, test data and evaluations can be reused or updated, or you can define your own ("My tool", "My test data", "My evaluation"); evaluations are then executed on the platform and the resulting "My results" can be exploited]
Quality model for semantic technologies

Tool/Measures                 Raw Results   Interpretations   Quality Measures   Quality sub-characteristics
Ontology engineering tools         7              20                 8                      6
Ontology matching tools            1               4                 4                      2
Reasoning systems                 11               0                16                      5
Semantic search tools             12               8                18                      7
Semantic web service tools         5               9                10                      2
Total                             34              41                55                     17

Radulovic, F., Garcia-Castro, R., Extending Software Quality Models - A Sample In The Domain of Semantic Technologies. 23rd International
Conference on Software Engineering and Knowledge Engineering (SEKE2011). Miami, USA. July, 2011

Semantic technology recommendation
[Figure: semantic technology recommendation. A user expresses quality requirements ("I need a robust ontology engineering tool and a semantic search tool with the highest precision"); the Semantic Technology Recommendation component of the SEALS Platform, using the semantic technology quality model and the Tools and Results repository services, produces a recommendation ("You should use Sesame v2.6.5 and Arq v2.9.0. The reason for this is... Alternatively, you can use ...")]

Radulovic F.; García-Castro R. "Semantic Technology Recommendation Based on the Analytic Network Process". 24th Int. Conference on Software
Engineering and Knowledge Engineering (SEKE 2012). Redwood City, CA, USA. 1-3 July 2012. 3rd Best Paper Award!

You can use the SEALS Platform

•  The SEALS Platform facilitates:
   -  Comparing tools under common settings
   -  Reproducibility of evaluations
   -  Reusing evaluation resources, completely or partially, or defining new ones
   -  Managing evaluation resources using platform services
   -  Computational resources for demanding evaluations

•  Don’t start your evaluation from scratch!

The 15 tips for technology evaluation

•  Know the technology
•  Support different test data
•  Facilitate test data definition
•  Support different types of technology
•  Define declarative evaluation workflows
•  Use machine-processable descriptions of evaluation resources
•  Automate the evaluation framework
•  Expect reproducibility
•  Beware of result analysis
•  Learn statistics
•  Plan for evaluation requirements
•  Use a quality model
•  Organize (or join) evaluation campaigns
•  Share evaluation resources
•  Exploit evaluation results
Index

•  Self-awareness
•  Crawling (Graduation Project)
•  Walking (Ph.D. Thesis)
•  Cruising (Postdoctoral Research)
•  Insight

insight (noun), [mass noun], Psychiatry: awareness by a mentally ill person that their mental experiences are not based in external reality.
Evolution towards maturity
Software Evaluation Technology Maturity Model (SET-MM)

Levels: Initial, Repeatable, Reusable, Integrated, Optimized.
Themes: formalization of the evaluation workflow, software support to the evaluation, applicability to multiple software types, usability of test data, exploitability of results, and representativeness of participants.

[Table: levels and themes of software evaluation technology maturity. Roughly, evaluations evolve from ad-hoc, informally defined, manual efforts by one team (Initial), through defined ad-hoc workflows and ad-hoc evaluation software (Repeatable), to reusable evaluation software covering multiple software products and multiple test data of the same type (Reusable), to evaluation infrastructures covering multiple types of software products with machine-processable, reusable resources and several participating teams and stakeholders (Integrated), and finally to federations of autonomous evaluation infrastructures that interchange evaluation resources, with data access and use policies, customized, optimized and curated resources, and community participation (Optimized). The author's own work maps onto this evolution: the manual RDF(S) benchmark suites, then rdfsbs/IRIBA and IBSE for automatic RDF(S) and OWL interoperability benchmarking]

García-Castro R. "SET-MM - A Software Evaluation Technology Maturity Model". 23rd International Conference on Software Engineering and Knowledge Engineering (SEKE2011). pp. 660-665. Miami Beach, USA. 7-9 July 2011.
The 15 tips for technology evaluation

•  Know the technology
•  Support different test data
•  Facilitate test data definition
•  Support different types of technology
•  Define declarative evaluation workflows
•  Use machine-processable descriptions of evaluation resources
•  Automate the evaluation framework
•  Expect reproducibility
•  Beware of result analysis
•  Learn statistics
•  Plan for evaluation requirements
•  Use a quality model
•  Organize (or join) evaluation campaigns
•  Share evaluation resources
•  Exploit evaluation results
Thank you for your
attention!

Speaker: Raúl García-Castro

Talk at IMATI-CNR,
October 15th,
Genova, Italy

Contenu connexe

Similaire à The evolution of semantic technology evaluation in my own flesh (The 15 tips for technology evaluation)

Mohan C R CV
Mohan C R CVMohan C R CV
Mohan C R CVMOHAN C R
 
Algorithm Visualizer
Algorithm VisualizerAlgorithm Visualizer
Algorithm VisualizerAnwar Jameel
 
Runtime Behavior of JavaScript Programs
Runtime Behavior of JavaScript ProgramsRuntime Behavior of JavaScript Programs
Runtime Behavior of JavaScript ProgramsIRJET Journal
 
OpenACC and Open Hackathons Monthly Highlights May 2023.pdf
OpenACC and Open Hackathons Monthly Highlights May  2023.pdfOpenACC and Open Hackathons Monthly Highlights May  2023.pdf
OpenACC and Open Hackathons Monthly Highlights May 2023.pdfOpenACC
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTrivadis
 
Bug Triage: An Automated Process
Bug Triage: An Automated ProcessBug Triage: An Automated Process
Bug Triage: An Automated ProcessIRJET Journal
 
Comprehending Ajax Web Applications by the DynaRIA Tool
Comprehending Ajax Web Applications by the DynaRIA ToolComprehending Ajax Web Applications by the DynaRIA Tool
Comprehending Ajax Web Applications by the DynaRIA ToolPorfirio Tramontana
 
Federico Toledo - Extra-functional testing.pdf
Federico Toledo - Extra-functional testing.pdfFederico Toledo - Extra-functional testing.pdf
Federico Toledo - Extra-functional testing.pdfQA or the Highway
 
Towards Configuration Technologies for IoT Gateways
Towards Configuration Technologies  for IoT GatewaysTowards Configuration Technologies  for IoT Gateways
Towards Configuration Technologies for IoT GatewaysAGILE IoT
 
Simulagora (Euroscipy2014 - Logilab)
Simulagora (Euroscipy2014 - Logilab)Simulagora (Euroscipy2014 - Logilab)
Simulagora (Euroscipy2014 - Logilab)Logilab
 
GEETHAhshansbbsbsbhshnsnsn_INTERNSHIP.pptx
GEETHAhshansbbsbsbhshnsnsn_INTERNSHIP.pptxGEETHAhshansbbsbsbhshnsnsn_INTERNSHIP.pptx
GEETHAhshansbbsbsbhshnsnsn_INTERNSHIP.pptxGeetha982072
 
Software Engineering with Objects (M363) Final Revision By Kuwait10
Software Engineering with Objects (M363) Final Revision By Kuwait10Software Engineering with Objects (M363) Final Revision By Kuwait10
Software Engineering with Objects (M363) Final Revision By Kuwait10Kuwait10
 
online movie ticket booking system
online movie ticket booking systemonline movie ticket booking system
online movie ticket booking systemSikandar Pandit
 
IT Confidence 2013 - Spago4Q presents a 3D model for Productivity Intelligence
IT Confidence 2013 - Spago4Q presents a 3D model for Productivity IntelligenceIT Confidence 2013 - Spago4Q presents a 3D model for Productivity Intelligence
IT Confidence 2013 - Spago4Q presents a 3D model for Productivity IntelligenceSpagoWorld
 
Using Neural Net Algorithms to Classify Human Activity, with Applications in ...
Using Neural Net Algorithms to Classify Human Activity, with Applications in ...Using Neural Net Algorithms to Classify Human Activity, with Applications in ...
Using Neural Net Algorithms to Classify Human Activity, with Applications in ...Rohan Karunaratne
 
Performance Evaluation of Open Source Data Mining Tools
Performance Evaluation of Open Source Data Mining ToolsPerformance Evaluation of Open Source Data Mining Tools
Performance Evaluation of Open Source Data Mining Toolsijsrd.com
 
Go Observability (in practice)
Go Observability (in practice)Go Observability (in practice)
Go Observability (in practice)Eran Levy
 

Similaire à The evolution of semantic technology evaluation in my own flesh (The 15 tips for technology evaluation) (20)

Mohan C R CV
Mohan C R CVMohan C R CV
Mohan C R CV
 
Algorithm Visualizer
Algorithm VisualizerAlgorithm Visualizer
Algorithm Visualizer
 
Runtime Behavior of JavaScript Programs
Runtime Behavior of JavaScript ProgramsRuntime Behavior of JavaScript Programs
Runtime Behavior of JavaScript Programs
 
OpenACC and Open Hackathons Monthly Highlights May 2023.pdf
OpenACC and Open Hackathons Monthly Highlights May  2023.pdfOpenACC and Open Hackathons Monthly Highlights May  2023.pdf
OpenACC and Open Hackathons Monthly Highlights May 2023.pdf
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
 
Bug Triage: An Automated Process
Bug Triage: An Automated ProcessBug Triage: An Automated Process
Bug Triage: An Automated Process
 
Comprehending Ajax Web Applications by the DynaRIA Tool
Comprehending Ajax Web Applications by the DynaRIA ToolComprehending Ajax Web Applications by the DynaRIA Tool
Comprehending Ajax Web Applications by the DynaRIA Tool
 
software engineering
software engineering software engineering
software engineering
 
Federico Toledo - Extra-functional testing.pdf
Federico Toledo - Extra-functional testing.pdfFederico Toledo - Extra-functional testing.pdf
Federico Toledo - Extra-functional testing.pdf
 
Towards Configuration Technologies for IoT Gateways
Towards Configuration Technologies  for IoT GatewaysTowards Configuration Technologies  for IoT Gateways
Towards Configuration Technologies for IoT Gateways
 
Simulagora (Euroscipy2014 - Logilab)
Simulagora (Euroscipy2014 - Logilab)Simulagora (Euroscipy2014 - Logilab)
Simulagora (Euroscipy2014 - Logilab)
 
GEETHAhshansbbsbsbhshnsnsn_INTERNSHIP.pptx
GEETHAhshansbbsbsbhshnsnsn_INTERNSHIP.pptxGEETHAhshansbbsbsbhshnsnsn_INTERNSHIP.pptx
GEETHAhshansbbsbsbhshnsnsn_INTERNSHIP.pptx
 
Software Engineering with Objects (M363) Final Revision By Kuwait10
Software Engineering with Objects (M363) Final Revision By Kuwait10Software Engineering with Objects (M363) Final Revision By Kuwait10
Software Engineering with Objects (M363) Final Revision By Kuwait10
 
online movie ticket booking system
online movie ticket booking systemonline movie ticket booking system
online movie ticket booking system
 
CV_CEB_2017_en
CV_CEB_2017_enCV_CEB_2017_en
CV_CEB_2017_en
 
Bertazo et al - Application Lifecycle Management and process monitoring throu...
Bertazo et al - Application Lifecycle Management and process monitoring throu...Bertazo et al - Application Lifecycle Management and process monitoring throu...
Bertazo et al - Application Lifecycle Management and process monitoring throu...
 
IT Confidence 2013 - Spago4Q presents a 3D model for Productivity Intelligence
IT Confidence 2013 - Spago4Q presents a 3D model for Productivity IntelligenceIT Confidence 2013 - Spago4Q presents a 3D model for Productivity Intelligence
IT Confidence 2013 - Spago4Q presents a 3D model for Productivity Intelligence
 
Using Neural Net Algorithms to Classify Human Activity, with Applications in ...
Using Neural Net Algorithms to Classify Human Activity, with Applications in ...Using Neural Net Algorithms to Classify Human Activity, with Applications in ...
Using Neural Net Algorithms to Classify Human Activity, with Applications in ...
 
Performance Evaluation of Open Source Data Mining Tools
Performance Evaluation of Open Source Data Mining ToolsPerformance Evaluation of Open Source Data Mining Tools
Performance Evaluation of Open Source Data Mining Tools
 
Go Observability (in practice)
Go Observability (in practice)Go Observability (in practice)
Go Observability (in practice)
 

Dernier

Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 

Dernier (20)

Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 

The evolution of semantic technology evaluation in my own flesh (The 15 tips for technology evaluation)

  • 7. Evaluation goal
GQM paradigm: any software measurement activity should be preceded by:
1. the identification of a software engineering goal...
2. ... which leads to questions...
3. ... which in turn lead to actual metrics.
Goal: to improve the performance (latency) and the scalability of the methods provided by the ontology management APIs of ontology development tools.
Questions and their metrics:
•  What is the actual performance of the API methods? → Execution time of each method
•  Is the performance of the methods stable? → Variance of the execution times of each method
•  Are there any anomalies in the performance of the methods? → Percentage of execution times out of range in each method's sample
•  Do changes in a method's parameters affect its performance? → Execution time with parameter A vs. execution time with parameter B
•  Does tool load affect the performance of the methods? → Tool load versus execution time relationship
Metric: execution times of the methods of the API with different load factors.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 7
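A minimal sketch (illustrative only, not part of the original framework) of how such a goal/question/metric decomposition can be kept as plain data, so that every metric stays traceable to the question and goal it answers:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Question:
        text: str
        metrics: List[str] = field(default_factory=list)

    @dataclass
    class Goal:
        text: str
        questions: List[Question] = field(default_factory=list)

    goal = Goal(
        "Improve the performance and scalability of the ontology management API methods",
        [
            Question("What is the actual performance of the API methods?",
                     ["Median execution time of each method"]),
            Question("Does tool load affect the performance of the methods?",
                     ["Execution times of the methods with different load factors"]),
        ],
    )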
  • 8. Evaluation data
•  Atomic operations of the ontology management API, e.g., insertConcept(String ontology, String concept): insertConcept, insertRelation, insertClassAttribute, insertInstanceAttribute, insertConstant, insertReasoningElement, insertInstance, updateConcept, updateRelation, updateClassAttribute, updateInstanceAttribute, updateConstant, updateReasoningElement, updateInstance, ... (72 methods)
•  Multiple benchmarks defined for each method according to changes in its parameters (128 benchmarks), e.g.:
-  benchmark1_1_08(N): "Inserts N concepts in 1 ontology"
-  benchmark1_1_09(N): "Inserts 1 concept in N ontologies"
•  Benchmarks parameterised according to the number of consecutive executions of the method
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 8
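A minimal sketch of what one of these parameterised benchmarks could look like; the `tool` object and its `create_ontology`/`insert_concept` methods are hypothetical stand-ins, since the actual WebODE API is not reproduced here:

    import time

    def benchmark1_1_08(tool, n):
        """Inserts N concepts in 1 ontology, recording the execution time of each call (in ms)."""
        times_ms = []
        tool.create_ontology("Ontology_1")                        # hypothetical call
        for i in range(1, n + 1):
            start = time.perf_counter()
            tool.insert_concept("Ontology_1", f"Concept_{i}")     # hypothetical call
            times_ms.append((time.perf_counter() - start) * 1000)
        return times_ms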
  • 9. Workload generator
•  Generates and inserts into the tool synthetic ontologies according to:
-  Load factor (X): defines the size of the ontology data
-  Ontology structure: depends on the benchmarks
Benchmark | Operation | Execution needs
benchmark1_1_08 | Inserts N concepts in an ontology | 1 ontology
benchmark1_1_09 | Inserts a concept in N ontologies | N ontologies
benchmark1_3_20 | Removes N concepts from an ontology | 1 ontology with N concepts
benchmark1_3_21 | Removes a concept from N ontologies | N ontologies with 1 concept
For executing all the benchmarks, the ontology structure includes the execution needs of all of them.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 9
  • 10. Evaluation infrastructure
Components: Benchmark Suite Executor, Performance Benchmark Suite, Workload Generator, Measurement Data Library and Statistical Analyser, to be instantiated for each ontology development tool.
http://knowledgeweb.semanticweb.org/wpbs/
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 10
  • 11. Statistical analyser
The statistical analyser (BenchStats) processes with statistical software the raw measurements stored in the Measurement Data Library, e.g.:
-  benchmark1_1_08: 400 measurements (2134 ms, 2300 ms, 2242 ms, 2809 ms, ...)
-  benchmark1_1_09: 400 measurements (1399 ms, 2180 ms, ...)
-  benchmark1_3_20: 400 measurements (2032 ms, 1459 ms, ...)
and summarises them per benchmark:
Benchmark | Load | N | UQ | LQ | IQR | Median | % Outliers | Function
benchmark1_1_08 | 5000 | 400 | 60 | 60 | 0 | 60 | 1.25 | y=62.0-0.009x
benchmark1_1_09 | 5000 | 400 | 912 | 901 | 11 | 911 | 1.75 | y=910.25-0.003x
benchmark1_3_20 | 5000 | 400 | 160 | 150 | 10 | 150 | 1.25 | y=155.25-0.003x
benchmark1_3_21 | 5000 | 400 | 160 | 150 | 10 | 151 | 0.25 | y=154.96-0.001x
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 11
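A sketch of the per-benchmark summary computed by the statistical analyser (median, quartiles, IQR, percentage of outliers). The exact outlier rule used in BenchStats is not stated in the slides, so the usual 1.5 x IQR fences are assumed here:

    import statistics

    def summarise(times_ms):
        lq, median, uq = statistics.quantiles(times_ms, n=4)  # Q1, Q2 (median), Q3
        iqr = uq - lq
        low, high = lq - 1.5 * iqr, uq + 1.5 * iqr            # assumed outlier fences
        outliers = sum(1 for t in times_ms if t < low or t > high)
        return {"N": len(times_ms), "LQ": lq, "Median": median, "UQ": uq,
                "IQR": iqr, "% Outliers": 100.0 * outliers / len(times_ms)}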
  • 12. Result analysis - Latency
•  Metric for the execution time: the median of the execution times of a method → 8 methods with execution times > 800 ms (N=400, X=5000)
•  Metric for anomalies in the execution times: percentage of outliers in the execution times of a method → 2 methods with % outliers > 5% (N=400, X=5000)
•  Metric for the variability of the execution time: the interquartile range of the execution times of a method → 3 methods with IQR > 11 ms (N=400, X=5000)
•  Effect of changes in method parameters: comparison of the medians of the execution times of the benchmarks that use the same method → 5 methods with differences in execution times > 60 ms (N=400, X=5000)
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 12
  • 13. Result analysis - Scalability Effect of changes in WebODE’s load: Slope of the function estimated by simple linear regression of the medians of the execution times from a minimum load (X=500) to a maximum one (X=5000). 8 methods with slope>0.1 ms. N=400, X=500..5000 © Raúl García Castro Talk at IMATI-CNR. 15th October 2013 13
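A sketch of the scalability measure described above: the least-squares slope of the median execution time of a method against the load factor, over the loads X = 500..5000 used in the evaluation:

    def scalability_slope(loads, medians_ms):
        """Least-squares slope (ms per load unit) of median execution time vs. load factor."""
        n = len(loads)
        mean_x = sum(loads) / n
        mean_y = sum(medians_ms) / n
        num = sum((x - mean_x) * (y - mean_y) for x, y in zip(loads, medians_ms))
        den = sum((x - mean_x) ** 2 for x in loads)
        return num / den

    loads = [500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000]
    # medians_ms would hold the measured median execution time of one API method at each load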
  • 14. Limitations
•  Evaluating other tools is expensive: the whole infrastructure (Benchmark Suite Executor, Workload Generator, Performance Benchmark Suite, Measurement Data Library, Statistical Analyser) has to be instantiated for every ontology development tool
•  Analysis of results was difficult
-  The evaluation was executed 10 times with different load factors: 500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, and 5000
-  128 benchmarks x 10 executions = 1280 files with results!
García-Castro R.; Gómez-Pérez A. "Guidelines for Benchmarking the Performance of Ontology Management APIs". 4th International Semantic Web Conference (ISWC2005), LNCS 3729. November 2005. Galway, Ireland.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 14
  • 15. The 15 tips for technology evaluation
•  Know the technology
•  Support different types of technology
•  Automate the evaluation framework
•  Expect reproducibility
•  Beware of result analysis
•  Learn statistics
•  Plan for evaluation requirements
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 15
  • 16. Index •  •  •  •  •  Self-awareness Crawling (Graduation Project) Walking (Ph.D. Thesis) Cruising (Postdoctoral Research) Insight KHAAAAN! http://www.phdcomics.com/comics/archive.php?comicid=500 © Raúl García Castro Talk at IMATI-CNR. 15th October 2013 16
  • 17. Interoperability in the Semantic Web
•  Interoperability is the ability of Semantic Web technologies to interchange ontologies and use them
-  At the information level, not at the system level
-  In terms of knowledge reuse, not information integration
•  In the real world it is not feasible to use a single system or a single formalism
•  Different behaviours in interchanges between different formalisms: depending on the formalisms involved, an interchange may be lossless or lossy (e.g., a disjointness axiom between classes may only survive as an ad-hoc myDisjoint property)
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 17
  • 18. Evaluation goal To evaluate and improve the interoperability of Semantic Web technologies using RDF(S) and OWL as interchange languages © Raúl García Castro Talk at IMATI-CNR. 15th October 2013 18
  • 19. Evaluation workflow - Manual
•  Import: an ontology Oi in RDF(S)/OWL is imported into Tool X, producing Oi'
•  Export: an ontology Oi in Tool X is exported to RDF(S)/OWL, producing Oi'
•  In both cases the original and resulting ontologies are compared: Oi = Oi' + α - α' and Oi = Oi' + β - β'
•  Interoperability: Oi is interchanged from Tool X to Tool Y through RDF(S)/OWL, producing Oi'', with Oi = Oi'' + α - α' + β - β'
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 19
  • 20. Evaluation workflow - Automatic
Existing ontologies O1..On in the interchange language (RDF(S)/OWL) are used.
•  Step 1: Import + Export in Tool X: O1 → O1' → O1'', with O1 = O1'' + α - α'
•  Step 2: Import + Export in Tool Y: O1'' → O1''' → O1'''', with O1'' = O1'''' + β - β'
•  Interchange: O1 = O1'''' + α - α' + β - β'
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 20
  • 21. Evaluation data - OWL Lite Import Test Suite
•  Tests built from component combinations (e.g., subclass of class, subclass of restriction, value constraints, set operators, cardinality + object property, cardinality + datatype property) and RDF/XML syntax variants, e.g.:
<rdf:Description rdf:about="#class1"> <rdf:type rdf:resource="&rdfs;Class"/> </rdf:Description>  =  <rdfs:Class rdf:about="#class1"> </rdfs:Class>
Group | No.
Class hierarchies | 17
Class equivalences | 12
Classes defined with set operators | 2
Property hierarchies | 4
Properties with domain and range | 10
Relations between properties | 3
Global cardinality constraints and logical property characteristics | 5
Single individuals | 3
Named individuals and properties | 5
Anonymous individuals and properties | 3
Individual identity | 3
Syntax and abbreviation | 15
TOTAL | 82
David S.; García-Castro, R.; Gómez-Pérez, A. "Defining a Benchmark Suite for Evaluating the Import of OWL Lite Ontologies". Second International Workshop OWL: Experiences and Directions 2006 (OWL2006). November, 2006. Athens, Georgia, USA.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 21
  • 22. Evaluation criteria
•  Execution informs about the correct execution:
-  OK: no execution problem
-  FAIL: some execution problem
-  Comparer Error (C.E.): comparer exception
-  Not Executed (N.E.): second step not executed
•  Information added or lost in terms of triples: Oi = Oi' + α - α'
•  Interchange informs whether the ontology has been interchanged correctly, with no addition or loss of information (Oi = Oi'?):
-  SAME if Execution is OK and Information added and Information lost are void
-  DIFFERENT if Execution is OK but Information added or Information lost are not void
-  NO if Execution is FAIL, N.E. or C.E.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 22
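A simplified sketch of the triple-level comparison behind these criteria. The benchmarking infrastructure relied on external ontology comparers (KAON2 OWL Tools, RDFutils); here plain rdflib triple sets are used instead, which ignores subtleties such as blank-node renaming:

    from rdflib import Graph

    def compare(original_path, resulting_path):
        """Returns the Interchange verdict plus the information added/lost as sets of triples."""
        oi = set(Graph().parse(original_path, format="xml"))            # Oi, in RDF/XML
        oi_prime = set(Graph().parse(resulting_path, format="xml"))     # Oi'
        added = oi_prime - oi    # alpha: information added
        lost = oi - oi_prime     # alpha': information lost
        verdict = "SAME" if not added and not lost else "DIFFERENT"
        return verdict, added, lost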
  • 23. Evaluation campaigns
•  RDF(S) Interoperability Benchmarking: 6 tools (3 ontology development tools, 3 ontology repositories)
•  OWL Interoperability Benchmarking: 9 tools (5 ontology development tools, 3 ontology repositories, 1 ontology-based annotation tool); both frame-based and OWL-based tools (e.g., SemTalk) participated
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 23
  • 24. Evaluation infrastructure - IRIBA
http://knowledgeweb.semanticweb.org/iriba/
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 24
  • 25. Evaluation infrastructure - IBSE
Workflow: (1) describe benchmarks (the OWL Lite Import Benchmark Suite is described using a benchmark ontology), (2) execute benchmarks over the tools, storing execution results as instances of a result ontology, (3) generate reports (HTML, SVG).
•  Automatically executes experiments between all the tools
•  Allows configuring different execution parameters
•  Uses ontologies to represent benchmarks and results
•  Depends on external ontology comparers (KAON2 OWL Tools and RDFutils)
http://knowledgeweb.semanticweb.org/benchmarking_interoperability/ibse/
García-Castro, R.; Gómez-Pérez, A.; Prieto-González J. "IBSE: An OWL Interoperability Evaluation Infrastructure". Third International Workshop OWL: Experiences and Directions 2007 (OWL2007). June, 2007. Innsbruck, Austria.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 25
  • 26. Evaluation results - Variability
•  High variability in evaluation results
-  Tool import/export: models and executes / does not model and executes / models and fails / does not model and fails / not executed
-  Ontology comparison: same information / more information / less information / tool fails / comparer fails / not valid ontology
•  Different perspectives for analysis:
-  Results per tool / pair of tools (e.g., matrices of interchange results between Jena, Protégé-OWL, SWI-Prolog, KAON2, GATE, SemTalk, WebODE and Protégé-Frames)
-  Results per component (e.g., classes and class hierarchies, datatype and object properties, instances)
-  Result evolution over time
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 26
  • 27. Evaluation results - Interoperability
Clear picture of the interoperability between different tools:
•  Low interoperability and few clusters of interoperable tools
•  Interoperability depends on:
-  Ontology translation (tool knowledge model)
-  Specification (development decisions)
-  Robustness (tool defects)
-  Tools participating in the interchange (each behaves differently)
•  Tools have improved
•  Involvement of tool developers is needed
-  Tool developers have been informed
-  Tool improvement is out of our scope
•  Results are expected to change
-  Continuous evaluation is needed
García-Castro, R.; Gómez-Pérez, A. "Interoperability results for Semantic Web technologies using OWL as the interchange language". Web Semantics: Science, Services and Agents in the World Wide Web. ISSN: 1570-8268. Elsevier. Volume 8, number 4. pp. 278-291. November 2010.
García-Castro, R.; Gómez-Pérez, A. "RDF(S) Interoperability Results for Semantic Web Technologies". International Journal of Software Engineering and Knowledge Engineering. ISSN: 0218-1940. Editor: Shi-Kuo Chang. Volume 19, number 8. pp. 1083-1108. December 2009.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 27
  • 28. Benchmarking interoperability
Method for benchmarking interoperability:
•  Common for different Semantic Web technologies
•  Problem-focused instead of tool-focused
•  Manual vs automatic experiments:
-  It depends on the specific needs of the benchmarking
-  Automatic: cheaper, more flexible and extensible
-  Manual: higher quality of results
Resources for benchmarking interoperability:
•  All the benchmark suites, software and results are publicly available (RDF(S) Import, RDF(S) Export, RDF(S) Interoperability and OWL Lite Import benchmark suites; rdfsbs, IRIBA and IBSE software)
•  Independent of:
-  The interchange language
-  The input ontologies
García-Castro, R. "Benchmarking Semantic Web technology". Studies on the Semantic Web vol. 3. AKA Verlag – IOS Press. ISBN: 978-3-89838-622-7. January 2010.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 28
  • 29. Limitations •  Number of results to analyse increased exponentially -  2168 executions in the RDF(S) benchmarking activity and -  6642 executions in the OWL one •  Hard to support and maintain different test data and tools •  Every tool to be evaluated had to be deployed in the same computer © Raúl García Castro Talk at IMATI-CNR. 15th October 2013 29
  • 30. The 15 tips for technology evaluation
•  Know the technology
•  Support different test data
•  Support different types of technology
•  Use machine-processable descriptions of evaluation resources
•  Automate the evaluation framework
•  Expect reproducibility
•  Beware of result analysis
•  Learn statistics
•  Plan for evaluation requirements
•  Organize (or join) evaluation campaigns
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 30
  • 31. Index •  •  •  •  •  Self-awareness Crawling (Graduation Project) Walking (Ph.D. Thesis) Cruising (Postdoctoral Research) Insight http://www.phdcomics.com/comics/archive.php?comicid=570 © Raúl García Castro Talk at IMATI-CNR. 15th October 2013 31
  • 32. The SEALS Project (RI-238975)
http://www.seals-project.eu/
Project Coordinator: Asunción Gómez Pérez <asun@fi.upm.es>
EC contribution: 3.500.000 €
Duration: June 2009 - June 2012
Partners: Universidad Politécnica de Madrid, Spain (Coordinator); University of Sheffield, UK; University of Mannheim, Germany; Forschungszentrum Informatik, Germany; University of Zurich, Switzerland; University of Innsbruck, Austria; STI International, Austria; Institut National de Recherche en Informatique et en Automatique, France; Open University, UK; Oxford University, UK
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 32
  • 33. Semantic technology evaluation @ SEALS
SEALS Platform, SEALS Evaluation Campaigns, SEALS Evaluation Services, SEALS Community
Wrigley S.; García-Castro R.; Nixon L. "Semantic Evaluation At Large Scale (SEALS)". 21st International World Wide Web Conference (WWW 2012). European projects track. pp. 299-302. Lyon, France. 16-20 April 2012.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 33
  • 34. The SEALS entities
•  Tools: ontology engineering, storage and reasoning, ontology matching, semantic search, semantic web service
•  Evaluations
•  Test Data
•  Results: raw results and interpretations
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 34
  • 35. Structure of the SEALS entities
Each entity combines data (Java binaries, shell scripts, bundles, BPEL, ontologies) with metadata described using the SEALS ontologies, which supports entity discovery, validation and exploitation.
http://www.seals-project.eu/ontologies/
García-Castro R.; Esteban-Gutiérrez M.; Kerrigan M.; Grimm S. "An Ontology Model to Support the Automatic Evaluation of Software". 22nd International Conference on Software Engineering and Knowledge Engineering (SEKE 2010). pp. 129-134. Redwood City, USA. 1-3 July 2010.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 35
  • 36. SEALS logical architecture
•  Users: evaluation organisers, technology providers and technology adopters (as well as software agents)
•  SEALS Portal and Runtime Evaluation Service, coordinated by the SEALS Service Manager
•  SEALS Repositories: Test Data Repository Service, Tools Repository Service, Results Repository Service, Evaluation Descriptions Repository Service
García-Castro R.; Esteban-Gutiérrez M.; Gómez-Pérez A. "Towards an Infrastructure for the Evaluation of Semantic Technologies". eChallenges e-2010 Conference (e-2010). pp. 1-8. Warsaw, Poland. 27-29 October 2010.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 36
  • 37. Challenges
•  Tool heterogeneity: hardware requirements, software requirements
•  Reproducibility: ensure the execution environment offers the same initial status
Virtualization as a technology enabler: each processing/execution node runs the tool inside a virtual machine.
Virtualization solutions: VMWare Server 2.0.2, VMWare vSphere 4, Amazon EC2 (in progress)
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 37
  • 38. Evaluation campaign methodology
SEALS Methodology for Evaluation Campaigns:
•  SEALS-independent
•  Includes: actors, process, recommendations, alternatives, terms of participation, use rights
•  Phases: Initiation, Involvement, Preparation & Execution, Dissemination, Finalization
García Castro R.; Martin-Recuerda F.; Wrigley S. "SEALS. Deliverable 3.8 SEALS Methodology for Evaluation Campaigns v2". Technical Report. SEALS project. July 2011.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 38
  • 39. Current SEALS evaluation services
•  Ontology engineering
-  Evaluations: conformance, interoperability, scalability
-  Test data: conformance & interoperability (RDF(S); OWL Lite, DL and Full; OWL 2 Expressive x3; OWL 2 Full); scalability (real-world, LUBM, real-world +, LUBM +)
•  Ontology reasoning
-  Evaluations: DL reasoning (classification, class satisfiability, ontology satisfiability, entailment, non-entailment, instance retrieval); RDF reasoning (conformance)
-  Test data: DL reasoning (Gardiner test suite, Wang et al. repository, versions of GALEN, ontologies from EU projects, instance retrieval test data); RDF reasoning (OWL 2 Full)
•  Ontology matching
-  Evaluations: matching accuracy, matching accuracy multilingual, scalability (ontology size, # CPU)
-  Test data: Benchmark, Anatomy, Conference, MultiFarm, Large Biomed (supported by SEALS)
•  Semantic search
-  Evaluations: search accuracy and efficiency (automated); usability and satisfaction (user-in-the-loop)
-  Test data: automated (EvoOnt, MusicBrainz from QALD-1); user-in-the-loop (Mooney, Mooney +)
•  Semantic web service
-  Evaluations: SWS discovery
-  Test data: OWLS-TC 4.0, SAWSDL-TC 3.0, WSMO-LITE-TC
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 39
  • 40. New evaluation data - Conformance and interoperability
•  OWL DL test suite: keyword-driven approach
-  Manual definition of tests in CSV/spreadsheet using a keyword library
-  A test suite generator (preprocessor + interpreter) expands the test suite definition script into the test suite ontologies and metadata
-  OWL2EG (http://knowledgeweb.semanticweb.org/benchmarking_interoperability/OWL2EG/)
•  OWL 2 test suite: automatically generate ontologies of increasing expressiveness
-  Using ontologies in the Web (ontology search + ontology module extraction)
-  Maximizing expressiveness (original, expressive and full-expressive test suites)
-  OWLDLGenerator (http://knowledgeweb.semanticweb.org/benchmarking_interoperability/OWLDLGenerator/)
García-Castro R.; Gómez-Pérez A. "A Keyword-driven Approach for Generating Ontology Language Conformance Test Data". Engineering Applications of Artificial Intelligence. ISSN: 0952-1976. Elsevier. Editor: B. Grabot.
Grangel-González I.; García-Castro R. "Automatic Conformance Test Data Generation Using Existing Ontologies in the Web". Second International Workshop on Evaluation of Semantic Technologies (IWEST 2012). 28 May 2012. Heraklion, Greece.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 40
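A toy illustration of the keyword-driven idea: each row of a test definition names a keyword that expands into an ontology fragment. The actual OWL2EG keyword library and CSV format are not reproduced here; the keywords and Turtle templates below are invented for the example:

    # Invented keyword library: keyword -> Turtle template
    KEYWORDS = {
        "ClassSubsumption": "<#{sub}> rdfs:subClassOf <#{sup}> .",
        "ObjectProperty":   "<#{prop}> a owl:ObjectProperty .",
    }

    def expand(test_definition):
        """test_definition: list of (keyword, parameters) rows, e.g. read from a CSV spreadsheet."""
        return "\n".join(KEYWORDS[keyword].format(**params) for keyword, params in test_definition)

    print(expand([("ClassSubsumption", {"sub": "Class1", "sup": "Class2"}),
                  ("ObjectProperty", {"prop": "prop1"})]))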
  • 41. 1st Evaluation Campaign
•  Ontology engineering: Jena, Sesame, Protégé 4, Protégé OWL, NEON toolkit, OWL API
•  Reasoning: HermiT, jcel, FaCT++
•  Matching: AROMA, ASMOV, Falcon-AO, Lily, RiMOM, Mapso, CODI, AgreeMaker, Gerome*, Ef2Match
•  Semantic search: K-Search, Ginseng, NLP-Reduce, PowerAqua, Jena Arq
•  Semantic web service: 4 OWLS-MX variants
29 tools from 8 countries.
Nixon L.; García-Castro R.; Wrigley S.; Yatskevich M.; Trojahn-dos-Santos C.; Cabral L. "The state of semantic technology today – overview of the first SEALS evaluation campaigns". 7th International Conference on Semantic Systems (I-SEMANTICS2011). Graz, Austria. 7-9 September 2011.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 41
  • 42. 2nd Evaluation Campaign
•  Ontology engineering: Jena, Sesame, Protégé 4, Protégé OWL, NeOn toolkit, OWL API
•  Reasoning: HermiT, jcel, FaCT++, WSReasoner
•  Matching: AgrMaker, Aroma, AUTOMSv2, CIDER, CODI, CSA, GOMMA, Hertuda, LDOA, Lily, LogMap, LogMapLt, MaasMtch, MapEVO, MapPSO, MapSSS, Optima, WeSeEMtch, YAM++
•  Semantic search: K-Search, Ginseng, NLP-Reduce, PowerAqua, Jena Arq v2.8.2, Jena Arq v2.9.0, rdfQuery v0.5.1, Semantic Crystal, Affective Graphs
•  Semantic web service: WSMO-LITE-OU, SAWSDL-OU, OWLS-URJC, OWLS-M0
41 tools from 13 countries.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 42
  • 43. Evaluation services
•  Technology providers can register their own tools and test data and keep them updated
•  Existing evaluations and test data can be reused, or new ones can be defined
•  Evaluations are executed on the platform, and the resulting evaluation results can be exploited afterwards
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 43
  • 44. Quality model for semantic technologies
Tool / Measures | Raw Results | Interpretations | Quality Measures | Quality subcharacteristics
Ontology engineering tools | 7 | 20 | 8 | 6
Ontology matching tools | 1 | 4 | 4 | 2
Reasoning systems | 11 | 0 | 16 | 5
Semantic search tools | 12 | 8 | 18 | 7
Semantic web service tools | 5 | 9 | 10 | 2
Total | 34 | 41 | 55 | 17
Radulovic, F.; García-Castro, R. "Extending Software Quality Models - A Sample In The Domain of Semantic Technologies". 23rd International Conference on Software Engineering and Knowledge Engineering (SEKE2011). Miami, USA. July 2011.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 44
  • 45. Semantic technology recommendation
Example request: "I need a robust ontology engineering tool and a semantic search tool with the highest precision."
User quality requirements are sent to the SEALS Platform, which uses the semantic technology quality model together with the Tools Repository Service and the Results Repository Service to produce a recommendation such as: "You should use Sesame v2.6.5 and Arq v2.9.0. The reason for this is... Alternatively, you can use ..."
Radulovic F.; García-Castro R. "Semantic Technology Recommendation Based on the Analytic Network Process". 24th Int. Conference on Software Engineering and Knowledge Engineering (SEKE 2012). Redwood City, CA, USA. 1-3 July 2012. 3rd Best Paper Award!
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 45
  • 46. You can use the SEALS Platform
•  The SEALS Platform facilitates:
-  Comparing tools under common settings
-  Reproducibility of evaluations
-  Reusing evaluation resources, completely or partially, or defining new ones
-  Managing evaluation resources using platform services
-  Computational resources for demanding evaluations
•  Don't start your evaluation from scratch!
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 46
  • 47. The 15 tips for technology evaluation
•  Know the technology
•  Support different test data
•  Facilitate test data definition
•  Support different types of technology
•  Define declarative evaluation workflows
•  Use machine-processable descriptions of evaluation resources
•  Automate the evaluation framework
•  Expect reproducibility
•  Beware of result analysis
•  Learn statistics
•  Plan for evaluation requirements
•  Use a quality model
•  Organize (or join) evaluation campaigns
•  Share evaluation resources
•  Exploit evaluation results
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 47
  • 48. Index •  •  •  •  •  Self-awareness Crawling (Graduation Project) Walking (Ph.D. Thesis) Cruising (Postdoctoral Research) Insight insight noun […] [mass noun] Psychiatry awareness by a mentally ill person that their mental experiences are not based in external reality. © Raúl García Castro Talk at IMATI-CNR. 15th October 2013 48
  • 49. Evolution towards maturity
Software Evaluation Technology Maturity Model (SET-MM): five maturity levels (Initial, Repeatable, Reusable, Integrated, Optimized) characterised along six themes: formalization of the evaluation workflow, software support to the evaluation, applicability to multiple software types, usability of test data, exploitability of results, and representativeness of participants. For example, at the Initial level the workflow is ad hoc and informally defined, evaluation is manual with no software support and involves a single team, whereas at the highest levels machine-processable evaluation resources are shared across federated evaluation infrastructures, results are combined for many software products of different types, and a whole community participates.
The evaluations presented in this talk (UPM-FBI; the RDF(S) and OWL interoperability benchmarking with their benchmark suites and the rdfsbs, IRIBA and IBSE software; the SEALS Platform) are positioned along these maturity levels.
García-Castro R. "SET-MM – A Software Evaluation Technology Maturity Model". 23rd International Conference on Software Engineering and Knowledge Engineering (SEKE2011). pp. 660-665. Miami Beach, USA. 7-9 July 2011.
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 49
  • 50. The 15 tips for technology evaluation
•  Know the technology
•  Support different test data
•  Facilitate test data definition
•  Support different types of technology
•  Define declarative evaluation workflows
•  Use machine-processable descriptions of evaluation resources
•  Automate the evaluation framework
•  Expect reproducibility
•  Beware of result analysis
•  Learn statistics
•  Plan for evaluation requirements
•  Use a quality model
•  Organize (or join) evaluation campaigns
•  Share evaluation resources
•  Exploit evaluation results
© Raúl García Castro Talk at IMATI-CNR. 15th October 2013 50
  • 51. Thank you for your attention! Speaker: Raúl García-Castro Talk at IMATI-CNR, October 15th, Genova, Italy