Nature Inspired Models And The Semantic Web
Stefan Ceriu, Stefan Prutianu,
Faculty of Computer Science, „Al. I. Cuza“ University, Iasi, Romania
{ stefan.ceriu, stefan.prutianu }@info.uaic.ro
Abstract. In this paper we present a series of nature inspired models used as
alternative solutions for Semantic Web concerns. Some of the methods
presented in this article perform better than classic algorithms by enhancing
response time and computational costs. Others are just proof of concept, first
steps towards new techniques that will improve their respective field. The
intricate nature of the Semantic Web urges the need for faster, more intelligent
algorithms and nature inspired models have been proven to be more than
suitable for such complex tasks.
Keywords: nature inspired models, semantic web, ontology, alignment, rdf,
query, soft computing, genetic algorithms, artificial neural networks, kohonen,
swarm intelligence
1 Introduction
The Semantic Web is a new paradigm for the Web in which semantic
information is associated with existing data in order to make it accessible to
machines. The goal is to allow autonomous agents to rapidly access content through
searches based on meaning instead of classic syntactic methods.
Nature inspired methods and natural computing are software models that
follow the steps of natural phenomena and are often used in solving complex
problems. Models like Artificial Neural Networks and Genetic Algorithms are fast,
reliable and scalable and have been successfully adopted in dealing with a vast
number of software problems.
The large quantity of data available and discrepancies between
interpretations make the Semantic Web a very complex domain for which new and
more ingenious methods have to be devised in order to have it progress and improve.
We will further investigate how Nature Inspired Models are being used in the
current context of the Semantic Web and what advantages and changes they bring to
this domain.
This paper is organized in two main chapters based on the classic
classification of nature inspired models: Evolutionary Computing and Artificial
Neural Networks. In each chapter we will present characteristic algorithms and their
application in different aspects of the Semantic Web.
2 Evolutionary Computing
Evolutionary computing represents a collection of methods inspired by the
Darwinian evolutionary system and natural models. Their main characteristic is that
they auto-adapt to different problem constraints, thus being able to discover and take
advantage of instance specific properties. By directly working with binary
representations of the solutions and not requiring a mathematical model, evolutionary
algorithms require fewer approximations and can return better results.
Because they have been successfully used in optimization, machine learning
and complex system design, evolutionary algorithms are also being applied to
other domains, with varying degrees of success. We will further look at how
Evolutionary Computing can aid or solve issues in the context of the Semantic Web.
2.1 Optimizing Ontology Alignments by Using Genetic Algorithms [1]
An ontology is a systematic representation and specification of some
domain or part of it. It provides a common vocabulary on top of which we can define
a world by the objects that it contains and the relations between them.
One of the key features of ontologies is that anyone can model their own
knowledge without being forced to respect any pre-established standards. The issue at
hand is that it is very costly for organizations to reach a common denominator and,
even if they do, the result will not be customized to the needs of every party involved.
People will try to bring extensions and additions to the ontology, and errors and
incompatibilities will arise.
Ontology alignment is a way by which we can find correspondences between
the various modeled concepts, fix heterogeneity issues and use the ontologies as a whole.
Although there are many techniques that specifically deal with ontology matching
through data analysis, machine learning, language engineering etc. [2], the problem is
complex and many of these algorithms cannot cope with the sheer amount of data
available in some cases.
In their paper [1], Martinez-Gil et al. propose a new way to deal with this
issue by implementing an ontology alignment solution based on genetic algorithms, a
subclass of evolutionary computing. Their solution is thus able to search a high
dimensional space and provide an efficient mechanism for matching different sets of
ontologies.
As in any other genetic algorithm, this approach uses an encoding of the
solution candidates and a fitness function which returns the quality of an individual.
In this case several parameters are encoded into a single chromosome by using a
function which converts bit representations into floating-point numbers smaller than 1.
The fitness function uses one of the parameters returned by an alignment
evaluation method (precision, recall, f-measure) and is capable of producing better
end results by focusing on a single characteristic.
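Since the original paper gives few implementation details, the following is only a
minimal sketch of the two ingredients described above, under our own assumptions: the
chromosome is a bit string split into equally sized genes, each gene is mapped to a
number in [0, 1), and the fitness is the f-measure of a produced alignment against a
reference alignment. The names decode_chromosome and f_measure are hypothetical and
do not come from [1].

```python
from typing import List, Set, Tuple


def decode_chromosome(bits: str, n_params: int) -> List[float]:
    """Split a bit string into n_params equal genes and map each gene to [0, 1)."""
    gene_len = len(bits) // n_params
    params = []
    for i in range(n_params):
        gene = bits[i * gene_len:(i + 1) * gene_len]
        # An integer in [0, 2^gene_len) divided by 2^gene_len is always below 1.
        params.append(int(gene, 2) / (2 ** gene_len))
    return params


def f_measure(found: Set[Tuple[str, str]], reference: Set[Tuple[str, str]]) -> float:
    """Harmonic mean of precision and recall of an alignment (assumed fitness)."""
    correct = len(found & reference)
    if correct == 0:
        return 0.0
    precision = correct / len(found)
    recall = correct / len(reference)
    return 2 * precision * recall / (precision + recall)
```

Here decode_chromosome("10000100", 2) yields the two parameters 0.5 and 0.25, which
could then weight the matchers that produce the alignment scored by f_measure.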
Unfortunately we have no more information on exactly how this method is
implemented, but judging from the results that the authors present in their paper it
performs as well as the other existing algorithms and has an
advantage when working with large data sets, a strength that comes directly from
this type of nature inspired model. It manages to reach convergence in only five
consecutive generations and finds the optimal alignment in most cases.
2.2 Genetic Algorithms for RDF Query Path Optimization [3]
Another issue present in the current context of the Semantic Web is that
information is scattered and there is not yet an algorithm capable of efficiently querying
multiple heterogeneous sources and returning more relevant results. The execution
time of this type of algorithm is mainly given by the order in which the various parts
of the query are evaluated.
Research in this field has resulted in the two-phase optimization algorithm,
in which iterative improvement is followed by simulated annealing. We know that
in some cases, like the circuit partitioning problem and the
traffic routing problem, genetic algorithms perform better than simulated annealing [4].
This is the main reason why Alexander Hogenboom et al. propose a variant of the
two-phase optimization algorithm in which the simulated annealing part is replaced
by a genetic algorithm, the main goal being a faster response time: "Entirely new
queries should be optimized and resolved real-time." [3]
Large queries can be seen as a series of smaller queries combined through join
operations. Optimizing the order in which these joins are performed directly improves the
overall response time. The method presented in [3] associates a cost with each
join based on the cardinality of each operand. These costs directly influence the
fitness function, as the solution with the lowest cost has the highest ranking.
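To make the relation between join order, cost and fitness concrete, we give a small
sketch. It is not the cost model of [3]: we assume a left-deep join order, take the
cost of one join to be the product of the cardinalities of its operands, and estimate
the size of each intermediate result with a fixed selectivity. The names plan_cost,
fitness and selectivity are hypothetical.

```python
from typing import Dict, List


def plan_cost(order: List[str], cardinality: Dict[str, int],
              selectivity: float = 0.5) -> float:
    """Total estimated cost of joining the operands in the given left-deep order."""
    current = float(cardinality[order[0]])
    total = 0.0
    for name in order[1:]:
        # Assumed cost of one join: product of the operand cardinalities.
        cost = current * cardinality[name]
        total += cost
        # Assumed cardinality of the intermediate result.
        current = cost * selectivity
    return total


def fitness(order: List[str], cardinality: Dict[str, int]) -> float:
    """The lower the cost of a join order, the higher its fitness."""
    return 1.0 / (1.0 + plan_cost(order, cardinality))
```

With cardinalities 10, 100 and 1000 for operands a, b and c, the order a, b, c is
cheaper than c, b, a under these assumptions, so it receives the higher fitness.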
The chromosomes are encoded using a number encoding scheme for bushy
trees [5], which is efficient and permits fast crossover operations. The algorithm joins
two concepts from an ordered list and saves the result at the position of the first
concept. After each iteration the positions of the joined concepts are added to the encoding
of the current chromosome.
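The decoding step described above can be sketched as follows. This is our reading of
the description, not the exact scheme of [5]: we assume a chromosome is a list of
pairs of 1-based positions in the working list, that the joined result replaces the
first operand and that the second operand is removed. The name decode_bushy is
hypothetical.

```python
from typing import List, Tuple, Union

JoinTree = Union[str, Tuple["JoinTree", "JoinTree"]]


def decode_bushy(concepts: List[str], chromosome: List[Tuple[int, int]]) -> JoinTree:
    """Turn a list of position pairs into a nested (possibly bushy) join tree."""
    work: List[JoinTree] = list(concepts)
    for first, second in chromosome:
        if first == second:
            raise ValueError("a concept cannot be joined with itself")
        joined = (work[first - 1], work[second - 1])
        work[first - 1] = joined  # result is stored at the first operand's position
        del work[second - 1]      # the second operand leaves the list
    if len(work) != 1:
        raise ValueError("chromosome does not join all concepts")
    return work[0]
```

For the concepts a, b, c, d the chromosome [(1, 2), (2, 3), (1, 2)] first joins a
with b, then c with d, and finally the two intermediate results, which gives a bushy
tree rather than a left-deep one.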
The results presented indicate that the genetic algorithm can perform better
than the two-phase algorithm when it comes to solution quality, consistency and
execution time. Its advantage grows as the complexity of the
solution space increases, but if the solution space is simple then executing the query
directly might be faster. This is yet another example where nature inspired models outperform other
algorithms and aid the development of the Semantic Web.
2.3 Anytime Query Answering in RDF through Evolutionary Algorithms [6]
As the Semantic Web is sometimes imperfect or too large to work with as a
whole, answering queries through SPARQL might not produce the best results. An
approximation based method could prove useful when dealing with this kind of
problem and provide better and faster results.
Oren et al. [6] propose a technique of this type that they hope “will be useful
in many applications and even essential for others”. Their evolutionary computing
based solution encodes queries as sets of constraints and finds a solution by
searching for the assignment which validates the implication between the query graph
and the data graph. Although an exhaustive search for the assignment could be used,
they resort to nature inspired methods like mutation and crossover and use the number
of satisfied constraints, checked against Bloom filters, as the fitness function. The Bloom filters
contain a compressed version of the data graph and are used because they provide fast
approximate access to the data and work well with evolutionary techniques.
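As an illustration of this fitness function, the sketch below stores the triples of
a data graph in a small Bloom filter and counts how many query constraints a
candidate assignment satisfies. It is a simplified sketch under our own assumptions
and not the implementation of [6]: the filter size, the number of hash functions and
the names BloomFilter and fitness are hypothetical, and variables are simply strings
that appear as keys of the assignment.

```python
import hashlib
from typing import Dict, Iterator, List, Tuple

Triple = Tuple[str, str, str]


class BloomFilter:
    """Compressed set of triples: no false negatives, occasional false positives."""

    def __init__(self, size: int = 1024, hashes: int = 3) -> None:
        self.size = size
        self.hashes = hashes
        self.bits = [False] * size

    def _positions(self, item: Triple) -> Iterator[int]:
        data = "\x00".join(item).encode("utf-8")
        for seed in range(self.hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: Triple) -> None:
        for pos in self._positions(item):
            self.bits[pos] = True

    def __contains__(self, item: Triple) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))


def fitness(assignment: Dict[str, str], query: List[Triple], data: BloomFilter) -> int:
    """Number of query constraints that are (probably) present in the data graph."""
    def bind(term: str) -> str:
        # Variables are replaced by their assigned value, constants stay unchanged.
        return assignment.get(term, term)

    return sum(1 for s, p, o in query if (bind(s), bind(p), bind(o)) in data)
```

An assignment that satisfies every constraint reaches the maximum fitness, equal to
the number of triple patterns in the query. Because a Bloom filter can report false
positives, the fitness of a wrong assignment may occasionally be overestimated, which
is the price paid for the fast approximate lookup.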
Four genetic operators are used: parent selection, recombination, mutation
and survivor selection. Each one of them was chosen after a series of experiments
that compared which variant performs best on this problem.
Overall, this approach is faster in finding approximate answers to the queries and,
given that the Semantic Web still has its problems, this type of approximate
calculation could prove useful and, in some cases, even better than traditional methods.
2.4 Semantic Web Reasoning by Swarm Intelligence [7]
Swarm intelligence is a technique in which groups of agents work together
towards reaching a common goal. Each of them respects simple rules and although
there is no centralized control they manage to self-organize and interact with each
other and with the environment to reach some degree of global artificial intelligence.
This technique is inspired by models found in nature like ant and termite colonies,
bird flocks, fish schools etc.
Semantic Web reasoners should be able to access all the data available on the
web, in any format (RDF, XML, OWL etc.), but as the
data is constantly changing this goal is hard to reach. Swarm intelligence
is decentralized and self-organizing, so it can easily make use
of new or modified data. It is also robust and scalable, which makes
it a candidate technique for obtaining optimized reasoning performance for the
Semantic Web.
In their paper [7], K. Dentler et al. propose a new method of reasoning over the web in
which a swarm of agents traverses an RDF graph and each one of the agents represents
a reasoning rule. The RDF graph is viewed as an interconnected network where
each node is an object and each edge is a property. By walking the edges of the graph
the reasoning rules are distributed among the individuals, each one of them
applying a single rule. When an agent finds a path that satisfies the condition of its rule
it locally adds a new derived triple to the graph. Traditionally, reasoning is
implemented by indexing all the resulting triples and merging the results of multiple
queries. The method presented uses no indexing, which leads to
efficiency improvements but also to redundancy, as the agents sometimes have to
follow unnecessary paths.
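The behaviour of a single agent can be sketched as follows. This is an illustrative
sketch and not the system of [7]: we assume the graph is a set of string triples, we
pick one example rule (an instance of a class is also an instance of its
superclasses), and the agent moves by a plain random walk. The names
apply_subclass_rule and walk, and the property names type and subClassOf, are
hypothetical simplifications.

```python
import random
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def apply_subclass_rule(graph: Set[Triple], node: str) -> Set[Triple]:
    """Rule carried by the agent: (x type C), (C subClassOf D) => (x type D)."""
    instances = [s for s, p, o in graph if p == "type" and o == node]
    supers = [o for s, p, o in graph if p == "subClassOf" and s == node]
    return {(x, "type", d) for x in instances for d in supers} - graph


def walk(graph: Set[Triple], start: str, steps: int, rng: random.Random) -> None:
    """One agent: apply its rule at the current node, then follow a random edge."""
    node = start
    for _ in range(steps):
        graph |= apply_subclass_rule(graph, node)  # derived triples are added locally
        neighbours = sorted(o for s, p, o in graph if s == node)
        # With no outgoing edge the agent restarts from its start node.
        node = rng.choice(neighbours) if neighbours else start
```

No index is consulted at any point: the agent only looks at the triples around its
current node. Once a node has several outgoing edges the walk may revisit nodes
where nothing new can be derived, which mirrors the redundancy mentioned above.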
Even though the experiments in [7] are just a proof of concept, the basic idea that
swarm intelligence can significantly improve the way Semantic Web reasoners work
remains. The distributed nature, robustness and scalability of this nature inspired model
and the results of these experiments are reason enough to continue research in this
direction.
3 Artificial Neural Networks
Neural networks are inspired by the configuration of biological neural
networks and are essentially a multi-layered hierarchic structure where any two
processing units can communicate. Each neuron is represented by a node in the
network and each node holds a primitive function. The way in which these functions
are composed is strictly given by the topology of the network. If weights are associated with
each connection in the network then different weight values produce different results.
Artificial neural networks have been successfully used in function
approximation, pattern and sequence recognition, economic applications such as
market trading and bankruptcy predictions, data processing, medical diagnosis and
others. They are capable of adaptive learning, self-organizing and real time operations
and represent a reliable and robust system used in a vast number of disciplines.
We will continue by studying applications of artificial neural networks for the
Semantic Web and the way in which they improved or changed this domain.
3.1 Ontology Matching Using an Artificial Neural Network to Learn Weights [10]
Similar to genetic algorithms, artificial neural networks are another nature
inspired model which can be used to match ontologies. The majority of the ontology
alignment techniques are based on either rule-based or learning-based models, both of
which have shortcomings. In the rule-based approach the schemas to be aligned are
represented as graphs or trees which require a significant number of traversals. The
learning-based model requires much computational effort in order to train its learners.
Both of these issues become more important when the system has to work with
large schemas and dynamic environments. In addition to these efficiency
disadvantages the second model also requires human intervention for setting the
weights of different aspects within the ontology. This is an error-prone practice and
requires effort in generating good, relevant data.
The authors of [10] propose a new artificial neural network based technique
in which the weights stated above are learned from the ontology schema instead of
being given a priori.
The algorithm creates a three-dimensional vector for each concept, one dimension for each
aspect taken into consideration (name, properties and relationships), and these vectors are
afterwards compared using different similarity methods.
For the name aspect the string values are compared directly and a value
between 0 and 1 is returned as the degree of similarity. If the two strings are equal or
synonyms (WordNet lookup) then the function returns 1. Otherwise the similarity
is calculated using a simple equation involving the Levenshtein distance and the string
length.
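The name similarity can be sketched as below. The exact equation of [10] is not
reproduced in this survey, so we assume a common normalization: one minus the
Levenshtein distance divided by the length of the longer string. The WordNet lookup
is replaced by a hypothetical synonyms parameter, a set of unordered name pairs, so
that the example stays self-contained.

```python
from typing import FrozenSet


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def name_similarity(a: str, b: str,
                    synonyms: FrozenSet[FrozenSet[str]] = frozenset()) -> float:
    """Similarity in [0, 1]; equal names and known synonyms score 1."""
    a, b = a.lower(), b.lower()
    if a == b or frozenset((a, b)) in synonyms:
        return 1.0
    # Assumed normalization, not necessarily the equation used in [10].
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))
```

Under this assumption the names book and back, which differ in two of four
characters, receive a similarity of 0.5.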
The concept properties similarity is given by the number of properties
matched between the two ontologies. Two properties are aligned when their data type
is the same or their name similarity is above a preset value.
Similarity in relationships takes into consideration all the ancestors a concept
can have, up to the root. It is assumed that the concepts of both ontologies descend from a common root, “thing”.
The overall relationship similarity is given by the maximal value found by pairwise
comparing all the ancestors.
These similarity functions provide the input to a 3 by 1 neural network
which calculates the overall equivalence of two concepts. The weights of the
edges of this network are, at first, randomly generated, and then some concepts in
the first ontology are manually matched to aid learning.
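A 3 by 1 network of this kind can be sketched as a single linear unit with three
inputs. The learning rule below (a plain delta rule on manually labelled pairs) is
our assumption and is not taken from [10]; the names train_weights and equivalence,
the learning rate and the number of epochs are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], float]  # (name, property, relationship sims), target


def equivalence(weights: Sequence[float], sims: Sequence[float]) -> float:
    """Overall equivalence of two concepts: weighted sum of the three similarities."""
    return sum(w * s for w, s in zip(weights, sims))


def train_weights(examples: List[Example], epochs: int = 200,
                  rate: float = 0.1, seed: int = 0) -> List[float]:
    """Start from random weights and adjust them on manually matched pairs."""
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(3)]
    for _ in range(epochs):
        for sims, target in examples:
            error = target - equivalence(weights, sims)
            # Delta rule: move each weight in proportion to its input and the error.
            weights = [w + rate * error * s for w, s in zip(weights, sims)]
    return weights
```

The examples passed to train_weights play the role of the manually matched concepts:
pairs known to be equivalent get target 1, pairs known to differ get target 0, and
the learned weights then score the remaining concept pairs.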
The experiments made by the authors reveal that this approach has an 85%
precision rate with minimum human intervention. The results are encouraging and
provide proof that artificial neural networks can be successfully used for aligning
ontologies.
3.2 Text-Based Ontology Enrichment Using Hierarchical Self-organizing Maps [13]
Self-organizing maps are a type of artificial neural network trained using
unsupervised learning. They were invented by Professor Teuvo Kohonen and are
capable of transforming a multi-dimensional dataset into a lower-dimensional one (usually
two-dimensional). They are also called Kohonen Networks or Kohonen Maps.
Input data for a self-organizing map is not labeled in any way but is instead
clustered based on properties identified during the training process. Each neuron
contains an item from the data set and has an associated weight vector which is
adjusted accordingly during training. The advantage of Kohonen Networks is that
they can accurately map the whole input space based on these weight vectors.
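The training process can be illustrated with a very small self-organizing map. For
brevity the sketch uses a one-dimensional row of units instead of the usual
two-dimensional grid, a Gaussian neighbourhood and a linearly decaying learning rate;
these choices, the default parameter values and the names best_matching_unit and
train_som are our assumptions, not details taken from [13].

```python
import math
import random
from typing import List, Sequence


def best_matching_unit(weights: List[List[float]], x: Sequence[float]) -> int:
    """Index of the unit whose weight vector is closest to the input x."""
    return min(range(len(weights)),
               key=lambda u: sum((w - v) ** 2 for w, v in zip(weights[u], x)))


def train_som(data: List[Sequence[float]], n_units: int, epochs: int = 50,
              rate: float = 0.5, radius: float = 1.0, seed: int = 0) -> List[List[float]]:
    """Unsupervised training: pull the winner and its neighbours towards each input."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs  # learning rate shrinks over time
        for x in data:
            bmu = best_matching_unit(weights, x)
            for u, w in enumerate(weights):
                # Units close to the winner on the map are updated more strongly.
                influence = math.exp(-((u - bmu) ** 2) / (2 * radius ** 2))
                for k in range(dim):
                    w[k] += rate * decay * influence * (x[k] - w[k])
    return weights
```

After training, best_matching_unit assigns every input, including unseen ones, to
a unit of the map, which is how unlabeled data ends up clustered.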
E. Chifu and A. Letia [13] present a way in which Kohonen Maps can be used as a
means of enriching ontologies by extracting new concepts from domain related
documents. Data mined from these documents passes through a “symbolic-neural
translation” phase which yields the initial state of the neural network. In order for
the network to correctly classify concepts, a weight vector is required for each neuron.
These vectors are calculated based on how many occurrences a term has had during
the parsing of the documents or based on a document category histogram.
Even though ontology enrichment systems are hard to compare because
most of them use different domains and ontologies, the results published for the
method presented suggest that it is suitable for this kind of operation.
4 Conclusions
We have presented a series of nature inspired models, each with its
advantages and innovations, that changed the way Semantic Web problems are
solved. Some of the methods presented in this article performed better than classic
algorithms by enhancing response time and computational costs. Others were just
proof of concept, first steps towards new techniques that will improve their respective
field. Nature inspired models have proven to be useful in a relatively new domain,
strengthening yet again the idea that they are strong, efficient
models.
References
1. Martinez-Gil, J., Alba, E., Aldana-Montes, J.F.: Optimizing Ontology
Alignments by Using Genetic Algorithms
2. Euzenat, J. et al.: D2.2.3: State of the art on ontology alignment
3. Hogenboom, A., Milea, V., Frasincar, F., Kaymak, U.: Genetic Algorithms
for RDF Query Path Optimization
4. Kohonen, J.: A brief comparison of simulated annealing and genetic
algorithm approaches, http://www.cs.helsinki.fi/u/kohonen/papers/gasa.html
5. Steinbrunn, M., Moerkotte, G., Kemper, A.: Heuristic and Randomized
Optimization for the Join Ordering Problem
6. Oren, E., Gueret, C., Schlobach, S.: Anytime Query Answering in RDF
through Evolutionary Algorithms
7. Dentler, K., Schlobach, S., Guéret, C.: Semantic Web Reasoning by Swarm
Intelligence
8. Bry, F., Marchiori, M.: Reasoning on the Semantic Web: Beyond Ontology
Languages and Reasoners
9. Yang Liu, Passino, K.M.: Swarm Intelligence: Literature Overview
10. Huang, J., Dang, J., Vidal, J.M., Huhns, M.N.: Ontology Matching Using an
Artificial Neural Network to Learn Weights
11. Bagheri Hariri, B., Abolhassani, H., Sayyadi, H.: A Neural Networks Based
Approach for Ontology Alignment
12. Algergawy, A., Schallehn, E., Saake, G.: A Sequence-based Ontology
Matching Approach
13. Chifu, E.S., Letia, I.A.: Text-Based Ontology Enrichment Using Hierarchical
Self Organizing Maps