Dynamic Network Reconfiguration in Presence of Multiple Node and Link
Failures Using Autonomous Agents
Juan Ramón Acosta and Dimiter R. Avresky
Network Computing Lab, Northeastern University, Boston, MA
{jracosta,avresky}@ece.neu.edu
Abstract
Currently, high-speed networks are indispensable commodities for
all users and they have become an integral part of their lifestyles.
For this reason, it is necessary for the network to be available most
of the time and to achieve transparent network failure recovery. In
this paper, we propose Agent NetReconf¹, an agent-based
dynamic network reconfiguration algorithm that is capable of tol-
erating multiple router and link failures in high-speed networks
with arbitrary topology. Agent NetReconf updates the routing ta-
bles asynchronously and does not require any global knowledge
of the network topology. Agent NetReconf uses mobile and au-
tonomous agents to detect and recover the network from failures.
Agent NetReconf highlights the benefits of using smart networking
devices as a means of building an active network. The complexity
of Agent NetReconf is analyzed and the termination, liveliness and
safety are proved.
Keywords: high-speed networks, autonomous mobile agents, dy-
namic reconfiguration, fault tolerance, adaptive routing, arbitrary
topologies
Introduction
The increasing number of users of the Internet has trig-
gered a significant growth in the number of networked de-
vices and the traffic they generate. Computer networks are
now being pushed to their limit. In this context, computing
capacity is available but it can be severely affected by fail-
ures. The major challenge faced by service providers today
is to keep their ability to give customers the level of ser-
vice they require, regardless of system conditions and the
number of faults on the network.
The need to provide increased availability has led researchers
such as Hood and Ji [8] to develop a sophisticated intelligent
software agent that performs fault detection accurately and in
certain cases predicts the fault before it appears. Others such as
Whit et al. [15] have implemented communities of mobile agents
that roam the network collecting and exchanging network
information based on the “social insects” paradigm (ant behavior)
described by Schoonderwoerd et al. [11].
¹ This work was supported by the U.S. National Science Foundation
under grant CCR-0004515.
In this paper, an algorithm is proposed for achieving dy-
namic network fault detection and avoidance in arbitrary
topologies using autonomous agents running at each router.
The reconfiguration algorithm is distributed and embedded
in the agents’ behavior. The paper is organized in six sections
as follows: Section 1 presents an overview of agents and how they
are used in adaptive routing. Section 2 describes a new router
architecture that uses autonomous agents for its routing services.
Section 3 describes Agent NetReconf and how it reconfigures the
routing tables to restore routing capabilities in the network
segment affected by the failure. Section 4 presents the complexity,
termination, safety and cognitive properties of Agent NetReconf.
Section 5 presents a fault recovery example showcasing the
algorithm execution. The last section contains the conclusions.
1. Autonomous Agents
This section presents an overview of previous work that
has been published on how agents are used to achieve effi-
cient network routing and fault tolerance.
The term agent has been used to refer to a software
and/or hardware component which is capable of acting ex-
actingly in order to accomplish tasks on behalf of its user
[10]. An agent is able to cooperate with other agents, learn
from its environment [17], and sometimes migrate under its own
control from one machine to another, provided both machines are
part of a network.
Agents communicate with other agents to successfully achieve all
the tasks given to them [16]. Communication between agents is
modeled as a point-to-point exchange of messages whose content is
expressed in a well-defined language, for example the Knowledge
Query and Manipulation Language (KQML) [4], the Knowledge
Interchange Format (KIF) [14] or, most recently, the OWL Web
Ontology Language [2].
1.1. Applications on Network Fault Tolerance
Minar [9] describes an algorithm to discover the network topology
using mobile agents. The agents travel the network and learn the
current connectivity of each node they visit. In addition, the
agents complement the acquired knowledge by cooperating with other
agents they meet at the same node. When the agents finish exploring
the network, the topology is fully discovered, and this information
is then used to define the routing tables at each node.
Agents have also been used in adaptive routing. For example,
Gianni [3] introduced a distributed adaptive routing algorithm
based on mobile agents that is capable of learning the routing
tables of a computer network using the ant colony metaphor.
Garijo, Cancer and Sánches [6] describe a centralized Multi-agent
Cooperative Network-Fault Management system (CNFM) that uses ISO
standard interfaces at each router to detect and avoid faults on
the network. In CNFM the agents work as watchdogs of the network,
monitoring each element and generating events into the CNFM engine
when faults are recognized.
Cynthia Hood and Chuanyi Ji [8] took advantage of
the increasingly available computation power in networking
devices and the benefits of artificial intelligence to design
an intelligent agent that processes information collected by
the Simple Network Management Protocol agents (SNMP-
agents) at each node and uses this information to detect net-
work anomalies that typically precede a fault. “The intel-
ligent agent learns the normal behavior from each reading
made by the SNMP-agent and combines the information us-
ing a Bayesian network that could trigger a local corrective
action or a message to a centralized network manager.” In
a similar approach presented by Phuan and Yufang in [19],
an intelligent mobile agent has the capability to extract data
from a network element using a local high-bandwidth com-
munication session without consuming network resources
and reducing the overall communication traffic. The intel-
ligent mobile agent has the ability to integrate knowledge
from a network manager and any network element to per-
form inferences on which type of fault recovery it will be
necessary to perform.
The algorithm proposed in this paper is different from
the solutions described earlier in that Agent NetReconf ex-
ecutes network failure recovery using only the local knowl-
edge at each router without having to know the network
topology or the type of faulty element (router or link), and
it is platform independent.
2. Agent Based Router
In order for network failure recovery to happen at the exact
location where an element failed, it is necessary that the routing
elements in the vicinity take an active role in the detection and
containment of the fault. As mentioned earlier, network fault
detection and recovery is commonly implemented such that a central
network monitoring station launches all the corrective actions
from a remote site, as seen in [8, 6, 19]. Only a few
implementations, such as those described in [1, 5], make the
routers adjacent to the failure participate in the restoration of
connectivity.
The authors, in this section, propose an agent based
router in which the detection and reconfiguration tasks are
performed by a group of intelligent agents. The agents are
goal oriented and capable of incorporating new knowledge
learned during the router operation and network reconfigu-
ration.
In essence, the new router is an active intelligent network
device capable of reacting and adjusting its operation based
on the events that occur in its internal and external environ-
ment.
2.1. Architecture
The architecture of the new intelligent router, shown in Figure 1,
is based on a high-speed crossbar switch with an enhanced embedded
software module that contains an agent subsystem. For simplicity,
the agent platform is not specified.
The router hosts a community of agents that are responsible for
controlling the router’s activities and coordinating all the tasks
involved in the dynamic reconfiguration of routing tables when the
router participates in the recovery from a failure. The knowledge
used by the agents to represent the router, links, neighbors and
the execution parameters of the fault-tolerant reconfiguration
algorithm is saved in the agent’s main memory. The structural
representation of the knowledge is defined using ontology classes
written in the OWL Web Ontology Language [2].
Figure 1. Agent based router architecture (input ports 0 to N−1,
N×N crossbar, output ports 0 to N−1, routing tables, arbitration
and routing decision logic, Node Manager Agent, Router Agent,
Link Manager Agent)
The definition of the agents operating the router is as follows:
1. Node Manager Agent. This agent oversees the operation of the
router and the other agents. The node manager is the router’s
public interface that can be used by network administration tools,
visiting explorer agents, neighbor routers and other external
network elements to communicate with the router. The manager agent
is also responsible for the security and integrity of the router;
it supervises all access to the routing tables and memory, and
makes sure that all the requests made to it are safe. The node
manager agent is the only component in the router that can
initiate a reconfiguration task. The node manager agent uses a
reinforcement learning method to acquire new knowledge to make
better decisions during node management and fault recovery.
2. Router Agent. It is the only agent in the new architec-
ture that can manipulate the routing tables and has the
capability of accepting or declining updates. The agent
behavior is determined by the inherent routing algo-
rithm and the dynamic reconfiguration policies. As
seen in Figure 1, the router’s arbitration and routing
decision logic are controlled by this agent. The router
agent reacts only to requests from the node manager
agent.
3. Link Manager Agent. Responsible for managing the router’s
connected links, ports and queues. The agent is in charge of
detecting and reporting failures and congestion to the node
manager. The agent uses a reinforcement learning model to learn
the characteristic symptoms that precede a failure or congestion;
this allows the agent to choose the appropriate corrective actions
and promptly trigger a restoration task. The agent uses the “I’m
alive” message model to determine failures and the flow-unaware
statistical delay method described in [13] to accurately determine
packet delays without depending on the dynamic information of the
packet flow.
4. Explorer Agent. These agents are dynamically cre-
ated in each router when Agent NetReconf is executed.
When an explorer agent is working in search mode
it cooperates with other agents to build a restoration
spanning tree that will re-connect the nodes discon-
nected by the failure. When an explorer agent is work-
ing in restoration mode, it collaborates with the node
manager agents at each router on the restoration tree
to update the local router tables. An explorer agent is
a delegate of the router that created it, such that any
interaction between two different agents is equivalent
to the two routers interacting directly point-to-point.
3. Network Failure Recovery
3.1. Agent NetReconf
This section describes a new dynamic network reconfiguration
algorithm, Agent NetReconf. The algorithm uses a
set of collaborative agents to restore network connectivity
after a failure is detected. Agent NetReconf is a distributed
intelligent algorithm that operates at the network level with-
out any global information of the network topology.
The strategy used by Agent NetReconf consists in iden-
tifying the set of nodes adjacent to a failure and from them
selecting a leader to coordinate the construction of a restora-
tion spanning tree and synchronize the updates to the rout-
ing tables at each node on the restoration tree.
The complete reconfiguration process consists of four
phases: Leader Selection, Restoration Tree Construction,
Reconfiguration Synchronization and Tables Update. The
correct execution of these phases is subject to the validity
of the following assumptions:
Assumption 3.1 After a failure F is detected, no additional
failures will occur on any link or node that belongs to the
restoration tree, until Agent NetReconf finishes the recon-
figuration process for F.
Assumption 3.2 The network is not partitioned as a result of
the failures.
Before describing in detail each phase, for clarity, con-
sider R to be the set of all routers in the network and that
each router Ri is connected to N other routers, its imme-
diate neighbors. Also let Sij be the collection of IDs of all
routers that are two hops away from Ri via link Lj. Addi-
tionally, assume that each Lk is monitored and managed by
one of the link manager agents (LMk). At each router Ri,
the link manager LMk that detects missing “I’m alive” mes-
sages from link Lk, immediately notifies the Node Manager
Agent (NMi) by raising the asynchronous NetworkFailure-
Detected event.
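The detection step described above can be sketched in Python. This is only an illustration under stated assumptions, not the authors' implementation: the class name `LinkManager`, the `timeout` parameter, the explicit `now` timestamps and the callback signature are all hypothetical. Only the “I’m alive” model and the NetworkFailureDetected event come from the text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class LinkManager:
    """Hypothetical link manager watching "I'm alive" messages on its links."""
    router_id: int
    timeout: float                           # assumed: silence tolerated per link
    on_failure: Callable[[int, int], None]   # stands in for NetworkFailureDetected
    last_seen: Dict[int, float] = field(default_factory=dict)
    failed: Set[int] = field(default_factory=set)

    def heartbeat(self, link: int, now: float) -> None:
        """Record an "I'm alive" message received on `link` at time `now`."""
        self.last_seen[link] = now

    def check(self, now: float) -> None:
        """Notify the node manager once for every link that has gone silent."""
        for link, seen in self.last_seen.items():
            if link not in self.failed and now - seen > self.timeout:
                self.failed.add(link)
                self.on_failure(self.router_id, link)
```

Passing the clock value explicitly keeps the sketch deterministic; a real agent would presumably read a timer instead.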
Leader Selection After the failure is detected by router Ri,
the node manager NMi suspends the traffic targeting Lk,
the link leading to the presumed faulty node. From Sik,
NMi selects the ID with the highest value and records it in
memory as the ID corresponding to the Restoration Leader
(RLF ). If the selected ID equals Ri’s ID then Ri becomes
the leader and immediately starts Phase 1. Otherwise, when
the selected ID does not match Ri’s, the router starts timer
Tstart and waits for a control signal from RLF that indi-
cates that the node can join Phase 1. If Tstart times out and
no signal from RLF was received, Ri marks RLF faulty
and starts the leader selection again.
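A minimal Python sketch of this selection loop follows. It assumes that the candidate set contains the router's own ID together with the two-hop IDs in Sik (the text implies this, since the selected ID may equal Ri's), and it replaces the Tstart timer with a hypothetical `responds` predicate that says whether a candidate leader signals in time. The function names are illustrative only.

```python
from typing import Callable, Iterable, Set


def select_leader(own_id: int, two_hop_ids: Iterable[int], suspected: Set[int]) -> int:
    """Pick the highest ID among this router and its non-suspected two-hop routers."""
    candidates = {own_id, *two_hop_ids} - suspected
    return max(candidates)


def elect(own_id: int, two_hop_ids: Iterable[int], responds: Callable[[int], bool]) -> int:
    """Repeat the selection until the leader is this router or a router that answers."""
    two_hop = set(two_hop_ids)
    suspected: Set[int] = set()
    while True:
        leader = select_leader(own_id, two_hop, suspected)
        if leader == own_id or responds(leader):
            return leader
        # Models a T_start timeout: mark the presumed leader faulty and retry.
        suspected.add(leader)
```

The loop always terminates because the router itself is never marked as suspected, so it eventually becomes the highest remaining candidate.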
Definition 3.1 Node Adjacent to Failure (NAF) It is a node
that was not selected “Restoration Leader” and was di-
rectly connected to a node or link that failed.
Phase 1. Restoration Tree Construction The first step in
Agent NetReconf is to build a restoration tree to establish a
communication path between the leader and the NAFs.
Step 1a. Begin Phase
Phase 1 starts with the Restoration Leader RLF (RLF = Ri)
creating one explorer agent Eij per active
link Lj. Eij is initialized in search mode and is provided
with the list of disconnected NAFs. Eij makes Ri its home
and starts the search for NAFs by migrating to the neighbor
connected to Lj.
After all Eij have migrated out of the leader node, RLF starts
timer Tack and waits for the arrival of control signals con-
firming that a restoration path was found between RLF and
each NAF.
Step 1b. Searching for NAFs
As the explorer agent Eij arrives at a node Rx, it adds the
ID of the visited node to the restoration path it is building.
Eij exchanges information with the current node and uses
this information to define an itinerary for its next migration.
If the explorer agent did not arrive at a NAF, then it
uses the information to create clones of itself to help it con-
tinue searching. The itinerary and the number of clones are
based on the number of active links and the available feasi-
ble routes to the NAFs. For example, in Figure 2, explorer
EH3 learns from RE that there are two active links L0 and
L3, and one feasible route via L3. NAFs {C,D} are pre-
sumed to be reachable through L3 and {A,B} will need to
be searched via L0. This implies that at least two clones are
required. However, since RE is not a NAF then EH3 can
continue searching. Therefore, only one clone is required
for the next migration.
When Eij arrives at a NAF Rx, the explorer agent removes Rx from
the list of NAFs and tells NMx to save the restoration path Eij
traveled. Then, NMx stops Tstart and creates an explorer agent for
restoration, ERxi, which it sends back to the restoration leader
Ri to confirm that the restoration path was found. Although Eij
has reached a NAF, the search needs to continue for the remaining
NAFs in the list. Eij therefore creates clones and their
itineraries following the same criteria mentioned before, and each
clone continues the search. Meanwhile, Eij stays at Rx and starts
timer Tphase3 to wait for a signal from RLF to start Phase 3. A
Tphase3 timeout would indicate that a failure occurred during
reconfiguration; by Assumption 3.1, this will not occur.
Cycles are prevented in the restoration paths by deactivating an
explorer agent when it arrives at a node that has already been
visited by either itself, one of its clones or one of its
siblings.
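The search with cycle prevention can be approximated by the following Python sketch. It makes several simplifying assumptions that are not in the paper: the asynchronous migration of explorers is modeled as a breadth-first expansion over a queue, a clone is sent on every active link (the feasible-route heuristic for limiting clones is not modeled), and the topology in the usage note is invented. The function name `find_restoration_paths` is hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, List, Set


def find_restoration_paths(
    links: Dict[Hashable, List[Hashable]],
    leader: Hashable,
    nafs: Set[Hashable],
) -> Dict[Hashable, List[Hashable]]:
    """Return one loop-free path from the leader to each reachable NAF.

    `links` maps every surviving node to its neighbors over active links.
    """
    visited = {leader}
    paths: Dict[Hashable, List[Hashable]] = {}
    explorers = deque([[leader]])            # each entry is the path an explorer carries
    while explorers and len(paths) < len(nafs):
        path = explorers.popleft()
        for neighbor in links[path[-1]]:
            if neighbor in visited:
                continue                     # a clone sent here would be deactivated
            visited.add(neighbor)
            new_path = path + [neighbor]
            if neighbor in nafs:
                paths[neighbor] = new_path   # the NAF saves the restoration path
            explorers.append(new_path)       # the search continues past this node
    return paths
```

For a made-up topology with leader H and NAFs A, C and D, `find_restoration_paths({"H": ["E", "G"], "E": ["H", "A", "C"], "G": ["H", "D"], "A": ["E"], "C": ["E", "D"], "D": ["C", "G"]}, "H", {"A", "C", "D"})` yields the paths H-E-A, H-E-C and H-G-D.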
In order to distinguish between node and link failures, Agent
NetReconf uses explorer agents as follows. If a NAF receives an
Eij from a node that is assumed to be faulty, then a link failure
is identified; the NAF must therefore update its reconfiguration
information for the node and mark it safe. When the two nodes at
the ends of the faulty link have both determined that they are
restoration leaders for the link failure, they must be
synchronized so that only one leader remains. The synchronization
occurs when both nodes receive an explorer agent from each other,
Eij and Eyj. The restoration leader for the faulty link will be
the parent of the explorer agent that has the highest ID value.
For example, if Eyj has the highest ID, then Ry, parent of Eyj,
becomes the restoration leader for the failed link and node Ri
becomes a NAF. After the leader synchronization has occurred,
Agent NetReconf continues with the reconfiguration.
Step 1c. Establishing Tree
At each node Rj that is on the path followed by ERxi, NMj marks
the links on which ERxi arrives and departs as members of the
restoration tree. Furthermore, if NMj detects that a different
ERxy, from leader Ry, has already visited the node, then to avoid
any conflicts with the reconfiguration, it gives ERxi the
information about Ry, so that when ERxi gets to Ri, Ri can
synchronize with Ry before it proceeds with Phase 3. ERxi
continues migrating until it reaches the restoration leader.
When Tack times out at the restoration leader, RLF determines
which NAFs did not reply with an ERxi in order to mark them faulty
and exclude them from the reconfiguration. The restoration leader
continues and builds the restoration tree by merging the confirmed
restoration paths at their roots. After the restoration tree is
completed, each ERxi sends a point-to-point Restoration Tree Built
(RTB) message signal to its parent.
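As an illustration of the merge, the Python sketch below builds parent pointers from the confirmed restoration paths and reports the NAFs that did not confirm before Tack expired. The data layout (a dictionary from NAF to its path, starting at the leader) and the function name are assumptions made for this example; the paper does not prescribe a representation.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def build_restoration_tree(
    confirmed_paths: Dict[Hashable, List[Hashable]],
    expected_nafs: Iterable[Hashable],
) -> Tuple[Dict[Hashable, Hashable], Set[Hashable]]:
    """Merge leader-to-NAF paths into a tree and list the NAFs that never replied."""
    # NAFs without a confirmed path are treated as faulty and left out of the tree.
    faulty = set(expected_nafs) - set(confirmed_paths)
    parent: Dict[Hashable, Hashable] = {}
    for path in confirmed_paths.values():
        for upstream, downstream in zip(path, path[1:]):
            # Keep the first parent seen so shared path prefixes merge into one branch.
            parent.setdefault(downstream, upstream)
    return parent, faulty
```

Every link recorded in `parent` corresponds to a link that a node manager would mark as a member of the restoration tree.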
Definition 3.2 Node On Restoration Tree (NORT)
It is a node that has at least one link belonging to the
restoration tree.
Phase 2. Multiple Failure Synchronization
When multiple failures appear, Agent NetReconf establishes an
ordered sequence of priorities between the restoration leaders
detected by the visited NORTs, such that the reconfigurations
occur in a “safe” sequence in which the restoration leader with
the highest ID always executes Phase 3 first, while the others
await their turn. For example, if we assume that Ry’s ID is higher
than Ri’s, then Ry will proceed to Phase 3 before Ri.
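The priority rule can be illustrated with a short Python sketch. It assumes that each restoration tree is known as a set of node IDs and abstracts time into numbered rounds; both are simplifications introduced here, since in the algorithm the leaders learn about intersections from messages carried by explorer agents. The function name `phase3_rounds` is hypothetical.

```python
from typing import Dict, Set


def phase3_rounds(trees: Dict[int, Set[int]]) -> Dict[int, int]:
    """Map each leader ID to the earliest round in which it may run Phase 3.

    A leader waits for every higher-ID leader whose restoration tree shares
    at least one node with its own tree; disjoint trees may run concurrently.
    """
    rounds: Dict[int, int] = {}
    for leader in sorted(trees, reverse=True):      # highest ID first
        # Every leader already in `rounds` has a higher ID than `leader`.
        blocking = [rounds[h] for h in rounds if trees[h] & trees[leader]]
        rounds[leader] = 1 + max(blocking, default=-1)
    return rounds
```

For example, with trees `{9: {1, 2, 3}, 7: {3, 4}, 5: {4, 6}, 2: {10, 11}}`, leaders 9 and 2 start in round 0, leader 7 waits for 9, and leader 5 waits for 7.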
Phase 3. Routing Information Update
This phase starts with a NAF processing an incoming RTB message
and providing new routing information to the awaiting Eij. After
the information exchange finishes, the explorer agent starts
migrating back to RLF using the acknowledged restoration path. As
Eij travels back to RLF, the node manager of each visited node
exchanges routing information with Eij and, if necessary, updates
its routing tables. Eij continues migrating until it reaches RLF.
The information given to Eij by the NAF, and by each visited node,
includes the IDs of all destinations that are reachable through
each of these nodes using links that do not belong to the
restoration tree.
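The return trip of Eij can be sketched in Python as follows. The sketch rests on assumptions that go beyond the text: a routing table is modeled as a dictionary from destination to next hop, "if necessary" is interpreted as "the destination is missing or its next hop is the failed node", and each node also advertises itself as a destination. The names `return_trip`, `reachable` and `tables` are hypothetical.

```python
from typing import Dict, Hashable, List, Set


def return_trip(
    path: List[Hashable],
    reachable: Dict[Hashable, Set[Hashable]],
    tables: Dict[Hashable, Dict[Hashable, Hashable]],
    failed: Hashable,
) -> Set[Hashable]:
    """Carry routing information from a NAF (path[0]) back to the leader (path[-1]).

    `reachable[node]` holds destinations reachable through `node` over links
    that are not on the restoration tree. Tables are updated in place.
    """
    carried: Set[Hashable] = set()
    for previous, node in zip([None] + path, path):
        if previous is not None:
            for dest in carried - {node}:
                # Repair only routes that are missing or that point at the failure.
                if tables[node].get(dest, failed) == failed:
                    tables[node][dest] = previous
        # The visited node adds what it can reach, plus itself, to the explorer.
        carried |= reachable[node] | {node}
    return carried
```

With path `["A", "E", "H"]` and failed node `"F"`, an entry such as `tables["H"]["B"] == "F"` is redirected to next hop `"E"`, while entries that already avoid the failure are left alone.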
Upon arrival at the restoration leader, Eij delivers to RLF the
routing information it collected. RLF processes the data to adjust
its routing tables and deactivates Eij. When RLF completes the
update, it provides each ERxi with the IDs of all the destinations
reachable through its active links, excluding the link on which
the ERxi arrived, and then ERxi migrates to its parent NAF. After
all restoration explorers have migrated, RLF starts a timer
Tcomplete to wait for a confirmation signal from each NAF
indicating that the updates were completed and that they are ready
to resume operations. A Tcomplete timeout is similar to the
timeout case described earlier and will be dealt with in a future
publication.
As ERxi travels back, the node manager of a visited
node exchanges routing information with ERxi and if nec-
essary it updates its routing tables. ERxi continues travel-
ing until it reaches its parent NAF. The routing information
provided by the visited node includes the IDs of all the des-
tinations reachable through the visited node using the links
that belong to the restoration tree with the exception of the
IDs of nodes accessible via the links on which ERxi arrives
and leaves the visited node.
Upon arrival of ERxi, Rx updates its routing tables with the
information contained in the restoration explorer and ERxi is
deactivated. The NAF then sends RLF a point-to-point Update
Complete Response (UCR) message signal. When RLF receives the UCR
signal, it stops Tcomplete and resumes normal operations.
The reconfiguration algorithm, as described, makes maximum use of
the ability of the agents to interact with each other.
Communication between the explorer and node manager agents is
performed mostly within the router’s agent module; only very few
messages leave the router, and those are exchanged point-to-point.
This is an important contribution of Agent NetReconf because it
keeps the algorithm execution distributed at each router and keeps
to a minimum the bandwidth overhead and the number of links
preempted for the reconfiguration to work.
Agent NetReconf bases its execution on the natural ability of
autonomous agents to acquire and share knowledge. For instance,
when the explorer agents are searching for NAFs, they learn
information at each node that helps them design an optimal
migration pattern that significantly reduces network flooding.
4. Properties of Agent NetReconf
4.1. Complexity
The complexity of Agent NetReconf is analyzed in terms
of the number of explorer agents created during restora-
tion tree construction and routing table reconfiguration. Let
LActive be the number of active links on each router, nfin
the number of NAFs for failure F, and P a path between
RLF and a NAF.
Theorem 4.1 The complexity for Agent NetReconf for mul-
tiple failures is given by
O(LActive ∗ ((nmax ∗ Pmax) + 1))
where Pmax is the longest path connecting RLF and any
NAF and nmax is the maximum number of NAFs.
Proof: Agent NetReconf determines RLF without cre-
ating explorer agents such that leader selection is achieved
with O(0) complexity.
In Phase 1, when the recovery step initiates, RLF creates
LActive exploration agents Eij, one per active link. The
corresponding complexity for this operation is O(LActive).
As an explorer agent migrates searching for target NAFs, the
maximum number of explorer agents created at the visited node Rx,
as described in Phase 1 (Step 1b), is LActive − 1. In cases where
Rx is a NAF, Rx creates one exploration agent for recovery, ERxi,
such that the maximum number of explorer agents created at an
intermediate router is LActive.
Now, considering that the longest restoration path be-
tween RLF and a NAF is Pmax, the total number of ex-
plorer agents needed to continue searching for a NAF is
Pmax ∗ LActive. Considering the worst case in which each
NAF is reached via a disjoint restoration path, the total num-
ber of explorers created is given by nmax ∗ Pmax ∗ LActive.
Assuming that all restoration trees intersect, Phase 2 is
executed independently for each RT without creating any
agents, which results in O(0) complexity.
Then, by adding the number of agents created by
the restoration leader, the complexity of Agent NetRe-
conf becomes O((nmax ∗ Pmax ∗ LActive) + LActive), or
O(LActive ∗ ((nmax ∗ Pmax) + 1)). Q.E.D.
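The counting argument of the proof can be checked numerically with a small Python sketch. The function name and the sample values are illustrative assumptions; the two terms follow the proof directly.

```python
def explorer_agent_bound(l_active: int, n_max: int, p_max: int) -> int:
    """Worst-case number of explorer agents, following the two terms of the proof."""
    created_by_leader = l_active                    # one E_ij per active link at RL_F
    created_during_search = n_max * p_max * l_active  # disjoint paths to every NAF
    return created_by_leader + created_during_search


# Hypothetical values: 4 active links per router, 3 NAFs, longest path of 5 hops.
bound = explorer_agent_bound(l_active=4, n_max=3, p_max=5)
```

For these sample values the sum is 4 + 60 = 64, which equals the factored form 4 ∗ ((3 ∗ 5) + 1) used in Theorem 4.1.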
Now, by comparing O(LActive ∗ ((nmax ∗ Pmax) + 1)) with the
complexity of NetRec in [1], which is
O(N ∗ (L + nmax ∗ Pmax + N ∗ Pmax)), it is clear that Agent
NetReconf reduces the complexity of NetRec by more than one order
of magnitude. This is explained by the fact that LActive is the
number of active links on a router rather than the total number of
links in the network. The improvement is possible because in Agent
NetReconf the agents use their knowledge to make inferences and
execute actions that otherwise, in standard NetRec, would require
several point-to-point message exchanges. This is a powerful
feature of agent based systems, as mentioned in [14].
4.2. Termination
The following agent migration patterns and message de-
livery properties are used for proving Agent NetReconf’s
Termination.
Definition 4.1 If a point-to-point message is sent from a
source agent S to a destination agent D, then it will be re-
ceived once and only once by D.
Definition 4.2 Every point-to-point message sent between
an exploration agent Eij or ERxi and a node manager
agent NMx will be routed following a path on the restora-
tion tree and will be reliably delivered to its destination.
Definition 4.3 The restoration leader RLF considers an
arriving ERxi to be the acknowledgment sent from a NAF
to confirm that a restoration path has been created.
Definition 4.4 The restoration leader RLF considers a returning
Eij to be the acknowledgment sent by a NAF to confirm that a
restoration tree was established and the request to update its
routing tables with the information carried by Eij.
Definition 4.5 A NAF considers a returning ERxi to be the
acknowledgment sent by the RLF that it updated its routing
information and that the NAF must update its table with the new
information carried by ERxi.
Lemma 4.1 For a given faulty node F, all NAFs will elect
the same RL.
Proof: We prove by contradiction. Suppose that two NAFs elect
different RLs. Since the router with the highest ID among the NAFs
is elected as RL, these two NAFs must have used different NAF
sets. However, all NAFs are two hops from each other through F and
by definition each NAF knows its own ID and the IDs of all routers
that are two hops away from it. Thus, the NAF sets determined by
the NAFs cannot be different, which contradicts the supposition.
Q.E.D.
Lemma 4.2 For a given fault F, the RLF and all the NAFs
will successfully establish a restoration tree rooted at RLF
such that Agent NetReconf can start the reconfiguration
step.
Proof: According to Lemma 4.1, all non-faulty NAFs will
elect the same RLF . Phase 3 and Def. 4.3 assure that a NAF
is reached by RLF and that the restoration path is estab-
lished. By sending a Restoration Tree Built (RTB) message,
as described in Phase 3, it is guaranteed that a NAF is no-
tified that the restoration tree was established. Def. 4.1 and
4.2 assure that this point-to-point message is delivered to
its destination reliably. Finally, Def. 4.4 assures that both
RLF and NAFs receive the routing information describing
the restoration tree. Therefore the restoration tree is reliably
established. Q.E.D.
Lemma 4.3 For a given failure all NAFs, NORTs and RLF
successfully update their routing tables and Agent NetRe-
conf execution terminates.
Proof: Since Lemma 4.2 assures that the restoration tree
is reliably established, then from Phase 3, it is assured that
new routing information is collected by the explorer agents.
Def. 4.4 assures that RLF receives the new information and
updates its table before any NAF. Def. 4.5 guarantees that
the NAFs receive new information after RLF completes its
updates. Phase 3 makes sure that RLF knows that a NAF
finished updating and that it is ready to resume operations.
Q.E.D.
Lemma 4.4 All the explorer agents Eij and ERxi deacti-
vate.
Proof: By Def. 4.4, an Eij explorer returns home after
the restoration tree RTF has been established. Phase 3 as-
sures that Eij deactivates after the RLF updates its routing
information. Similarly, Def. 4.5 assures that ERxi returns
home and deactivates after the NAF updates its table. In ad-
dition, Phase 3 assures that the Eij that were created and
never reach a NAF will deactivate. Q.E.D.
Lemma 4.5 In the presence of multiple intersecting
restoration trees, none of the intersecting RLs will remain
forever in Phase 2.
Proof: The goal of Phase 2 is to ensure that at any given time
only RLs with non-intersecting restoration trees will be executing
Phase 3, in which the routing information is updated. In the cases
of consecutive failures and simultaneous disjoint failures, this
is always true, so Phase 2 is skipped and the RLs proceed to Phase
3 independently of each other. If there are simultaneous failures
with intersecting restoration trees, then their RLs must establish
an order, which results in a sequence of temporally disjoint
reconfigurations around single failures or simultaneous disjoint
failures.
For each two intersecting restoration trees there is at
least one joint node, which detects the intersection. This
guarantees that at least one of the RLs in each intersection
will be notified about it. The temporal order is established
by the intersecting RLs based on their node IDs - nodes
with higher IDs have higher priority. All lower priority RLs
will wait in Phase 2 until all higher priority RLs have com-
pleted Phase 3. Following the algorithm, after completing
Phase 3, each RL notifies all lower-priority RLs, which al-
lows the next leader in the temporal order to execute Phase
3. Thus, all leaders that were waiting in Phase 2 will even-
tually receive the required synchronization messages that
allow them to proceed to Phase 3. Q.E.D.
Theorem 4.2 On all nodes Agent NetReconf will success-
fully complete in the presence of multiple failures, i.e. Agent
NetReconf will terminate and the nodes adjacent to the fail-
ures will be reachable.
Proof: Based on Lemmas 4.1 - 4.5, it can be concluded
that the RLF and the NAFs will proceed with all phases
of Agent NetReconf and will generate the required explorer
agents to carry out the establishment of the restoration tree
and the reconfiguration of each node (RLF , NAFs and
NORTs) on the tree. Q.E.D.
4.3. Liveliness
This section proves that on completion of Agent NetReconf the
network will be reconfigured appropriately.
Theorem 4.3 On completion of Agent NetReconf, all con-
nected nodes in the network are reachable.
Proof: The appearance of a failure causes all the paths that go
through the faulty link or node to be bisected. The results are
segments of unreachable nodes where each segment begins with a
NAF. By Assumption 3.2, the network is not partitioned, such that
all connected nodes are reachable through non-faulty physical
paths. Lemma 4.2 assures that all the NAFs and NORTs are reachable
through a spanning tree rooted at the NAF acting as restoration
leader. During the recovery phase, Lemma 4.3 guarantees that all
the nodes on the restoration tree have their routing tables
updated in a way such that all the faulty segments are replaced
with restoration paths. Theorem 4.2 demonstrates that Agent
NetReconf will terminate for any single failure by executing a
“safe” sequence of reconfigurations that are performed
synchronously and coordinated by the restoration leader. Q.E.D.
4.4. Safety
The goal of this section is to define and prove the safety
property of Agent NetReconf, namely, avoidance of infinite
loops and cyclic dependencies.
Theorem 4.4 Agent NetReconf does not create infinite
loops or cyclic dependencies.
Proof: Cyclic dependencies among the nodes on the
restoration tree will not be created, because Step 3.1 prevents
any explorer agent Eij in search mode from returning
to the RLF or continuing to explore if the node it is
visiting was already visited by another Eij from RLF.
Lemma 4.5 proves that no restoration leader will be blocked
forever in Phase 2. As well, cyclic dependencies between
the RLs cannot arise, because they are resolved by always
giving priority to the nodes with higher ID or nodes that are
already in Phase 3.
In the presence of multiple failures, the RLs will enter
Phase 3 in the priority order, which was established in Phase
2, i.e., at any time only RLs with disjoint restoration trees
are permitted to concurrently execute Phase 3. Therefore,
cyclic dependencies cannot be formed between the RLs. The
RL-NAF relations are based on a strict request-response
model, so there are no cyclic dependencies between them.
Since all possible faulty NAFs have been isolated from the
restoration tree in Phase 1 and all reconfiguration messages
are reliably delivered, all loops in Phase 3 will terminate
after the corresponding messages are received. Q.E.D.
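The ordering rule used in the proof can be illustrated with a short sketch. It is only an approximation under stated assumptions: the function name `phase3_schedule`, the integer leader IDs and the example trees are hypothetical, and the notification messages between RLs are simplified into discrete rounds.

```python
def phase3_schedule(trees):
    """Group restoration leaders into rounds of concurrent Phase 3 execution.

    `trees` maps a leader ID to the set of nodes on its restoration tree.
    Leaders with higher IDs have priority; leaders in the same round have
    pairwise disjoint restoration trees.
    """
    pending = sorted(trees, reverse=True)  # highest priority first
    rounds = []
    while pending:
        claimed = set()  # nodes held by higher-priority leaders in this round
        admitted, waiting = [], []
        for leader in pending:
            if trees[leader].isdisjoint(claimed):
                admitted.append(leader)
            else:
                waiting.append(leader)  # must wait for a higher-priority leader
            claimed |= trees[leader]
        rounds.append(admitted)
        pending = waiting
    return rounds


# Hypothetical restoration trees for four leaders.
TREES = {
    9: {"a", "b"},
    7: {"b", "c"},
    5: {"c", "d"},
    3: {"x", "y"},
}

rounds = phase3_schedule(TREES)
```

The highest-priority pending leader is always admitted, so every round makes progress and the loop terminates, which mirrors the argument that no leader waits forever.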
4.5. Cognitive Properties
Having autonomous mobile agents execute the algorithm
in parallel at each router reduces the required point-to-point
interactions between the restoration leader and the NAFs.
For instance, two agents would only exchange point-to-
point messages when necessary, otherwise they will work
with the knowledge that exists at each node, and the knowl-
edge they acquire from other agents during the construction
of the restoration tree or the reconfiguration phase.
Having agents execute the recovery algorithm keeps the
knowledge of a failure close to where it happened, instead
of spreading the information widely to other elements that
are oblivious of such a fault. Also, with agents, more
intelligent interactions occur between routers.
For example, the manager NMi at RLF knows that the ar-
rival of an ERxi is the confirmation that the NAF is alive
and the path followed by an Eij is the desired restoration
path. Similarly, if an ERxi returns home it is known to the
NAF that the restoration leader has completed updating its
routing information and that it is its turn to do the same.
The lower complexity of Agent NetReconf allows the algorithm
to scale, because it involves only a small number of
links, as was proved in Section 4.1.
In Agent NetReconf, an explorer agent represents more
than one message type of those used in message based al-
gorithms such as [1, 5], and without oversimplifying, an
agent is considered a smart message that has cognitive and
evolutive capabilities.
These cognitive properties allow the reconfiguration al-
gorithm to execute faster, because the agents are retrieving
the information from the data knowledge base at the router
and do not have to wait for synchronous acknowledgment
from any router. The use of agents in the reconfiguration
algorithm helps reduce the number of message exchanges
and the number of links used in the reconfiguration, and
allows an agent to make an optimal selection of the link
that leads to the next node.
5. Examples of Failure Recovery
5.1. Node Failure Recovery
To illustrate the behavior of Agent NetReconf for recov-
ering a node failure, consider that router R fails on the net-
work shown in Figure 2. After a TIamAlive timeout expires,
routers {A, B, C, D, E, H} detect the failure F. Each router
then becomes a Node Adjacent to Failure (NAF) and in par-
allel they start selecting a restoration leader RLF .
Figure 2. Node failure recovery
Phase 0. In D, NMD queries SD1, its knowledge base,
and determines that router H has the highest ID among
the others that are two hops away via link L1. Similarly,
{A, B, C, D} select H as RLF and then become NAFs.
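The leader selection in Phase 0 can be sketched as a purely local query. This is a minimal illustration under assumptions: the knowledge base is modeled as a dictionary from a local link number to the routers that are two hops away through that link, router IDs are compared as strings, a router counts itself as a candidate, and the names `select_restoration_leader` and `KB_D` are hypothetical.

```python
def select_restoration_leader(own_id, two_hop_via_link, failed_link):
    """Return the highest ID among this router and its two-hop neighbors
    that were reachable through the link leading to the failure."""
    candidates = set(two_hop_via_link.get(failed_link, ())) | {own_id}
    return max(candidates)


# Assumed knowledge base of router D: link 1 leads to the failed router,
# and these routers are two hops away through it.
KB_D = {1: {"A", "B", "C", "E", "H"}}

leader_seen_by_d = select_restoration_leader("D", KB_D, failed_link=1)
```

Because every router adjacent to the failure evaluates the same candidate set, they all arrive at the same leader without exchanging messages.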
Phase 1. At H, NMH creates three explorer agents
EH0, EH1 and EH3, one per active neighbor. Each agent
learns the list of NAFs and starts migrating, searching for
NAFs. Consider EH3: when this explorer arrives at RE, it
learns that there are two active links, L0 and L3, and one
feasible route via L3. NAFs {C, D} are presumed to be
reachable through L3, and {A, B} will need to be searched via L0.
This implies that at least two clones are required. However,
since RE is not a NAF, EH3 can continue searching.
As each explorer reaches a NAF, a restoration explorer is
sent to RLF. At RLF, when ERAH, ERBH, ERCH
and ERDH arrive, the restoration tree is considered built,
shown with black lines in Figure 2.
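The construction of the restoration tree in Phase 1 can be approximated as follows. This is a sketch under assumptions, not the algorithm's actual agent code: a breadth-first traversal stands in for the parallel migration of explorers, the "already visited" mark is modeled with a parent map, and the adjacency list is a hypothetical topology (it does not reproduce Figure 2) with the failed router already removed.

```python
from collections import deque


def build_restoration_tree(leader, nafs, adjacency):
    """Return {naf: path from leader to naf} found by flooding explorers."""
    parent = {leader: None}  # also serves as the "already visited" mark
    queue = deque([leader])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor in parent:
                continue  # visited by another explorer: stop exploring here
            parent[neighbor] = node
            queue.append(neighbor)

    paths = {}
    for naf in nafs:
        if naf not in parent:
            continue  # this NAF could not be reached
        path, node = [], naf
        while node is not None:
            path.append(node)
            node = parent[node]
        paths[naf] = path[::-1]
    return paths


# Hypothetical surviving topology after the failure of R.
ADJACENCY = {
    "H": ["E", "G"],
    "E": ["H", "D", "A"],
    "G": ["H", "B"],
    "D": ["E", "C"],
    "A": ["E", "B"],
    "B": ["G", "A"],
    "C": ["D"],
}

paths = build_restoration_tree("H", ["A", "B", "C", "D"], ADJACENCY)
```

Since every node records a single parent, the union of the returned paths is a tree rooted at the leader and cannot contain a cycle.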
Phase 2. Since there are no overlapping restoration trees,
the agents move to the next phase.
Phase 3. Each ERxi sends a point-to-point RTB message
back home to make each Eij return back to RLF . Each
Eij on its way back learns routing information that it later
shares with RLF .
Table 1. Router D, original table
Dest  Port    Dest  Port
A     1       F     0
B     1       G     1
C     2       H     1
D     -       R     1
E     0
Table 2. Router D, updated table
Dest  Port    Dest  Port
A     0       F     0
B     0       G     0
C     2       H     1
D     -       R     1
E     0
When all Eij have arrived, RLF determines the destina-
tions that can be reached through its active links and gives
to each ERxi a list from which it excludes the destina-
tions reachable through the port on which ERxi came in.
ERDH, for example, will be provided with {A, B, F, G}.
On its way home, each node visited by ERDH provides the
destinations reachable through links belonging to RTF ex-
cluding those reachable through the links on which ERDH
arrived at and departed from the node. When ERDH gets
home, it asks NMD to update its routing tables with the
information that it is carrying. After NMD finishes updat-
ing its table, it sends a point-to-point UCR confirmation to
RLF. Table 2 shows the routing table of router D after the
reconfiguration is complete.
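The table update at router D can be reproduced with a few lines of code. This is a minimal sketch: the function name `apply_restoration` is hypothetical, and the assumption that ERDH arrives at D on port 0 is inferred from the difference between Table 1 and Table 2 rather than stated in the text.

```python
def apply_restoration(table, destinations, port):
    """Return a copy of `table` with `destinations` rerouted through `port`."""
    updated = dict(table)
    for dest in destinations:
        if dest in updated:
            updated[dest] = port
    return updated


# Table 1: original routing table of router D (None marks D itself).
TABLE_1 = {"A": 1, "B": 1, "C": 2, "D": None, "E": 0,
           "F": 0, "G": 1, "H": 1, "R": 1}

# ERDH carries {A, B, F, G}; port 0 is the assumed arrival port at D.
table_2 = apply_restoration(TABLE_1, {"A", "B", "F", "G"}, port=0)
```

Under these assumptions the result matches Table 2: the entries for A, B and G move from port 1 to port 0, and all other entries are unchanged.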
5.2. Link Failure Recovery
The following example illustrates the behavior of Agent
NetReconf recovering a link failure. Assume that the link
connecting routers J and K fails in Figure 3. After the
TIamAlive timeout expires, routers J and K start the leader
selection phase, and each router assumes that its neighbor
at the other end of the link has failed.
Phase 0. During leader selection, router J is selected
restoration leader RLJ by routers {A, C, D}. Likewise,
router K is selected restoration leader RLK by routers
{E, G, H, I}.
Figure 3. Link failure recovery
Phase 1. At J, four explorer agents are created:
EJ1, EJ2, EJ3 and EJ4. At K, three explorer agents
are created: EK0, EK1 and EK3. To start building the
restoration paths, the explorers from each leader start
migrating to search for the NAFs known to that leader. In
the search process, explorer agents EK3 and EJ4 arrive
at restoration leaders RLJ and RLK respectively. With the
arrival of the explorers, both leaders realize that the router
they presumed failed is in fact alive. Both leaders mark
the link that connected them as faulty and then determine
the new role of the supposedly faulty node in this
phase. Router J determines that router K’s ID is higher and
becomes a NAF belonging to RLK. Router J then issues a
point-to-point deactivate message to all its explorers to
indicate that it is no longer the leader (see the pseudo-code
in Appendix A). After J assumes its new role, Phase 1
continues as described in Section 3.1. Note that EK3 stays
at J, since J has become a NAF.
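The role change described above can be sketched as a small decision function. It is an illustration only and is not the pseudo-code of Appendix A: the function name `on_foreign_explorer`, the string comparison of router IDs and the returned tuple are assumptions made for this example.

```python
def on_foreign_explorer(local_id, local_explorers, origin_leader_id):
    """Decide the role of a restoration leader when an explorer from another
    leader arrives, which shows that the presumed-failed router is alive.

    Returns (role, explorers_to_deactivate).
    """
    if origin_leader_id > local_id:
        # The other leader has the higher ID: step down, become one of its
        # NAFs and deactivate every explorer created locally.
        return "NAF", list(local_explorers)
    # Otherwise this router keeps the leader role and its explorers.
    return "RL", []


role_j, deactivated_j = on_foreign_explorer("J", ["EJ1", "EJ2", "EJ3", "EJ4"], "K")
role_k, deactivated_k = on_foreign_explorer("K", ["EK0", "EK1", "EK3"], "J")
```

In this example J steps down and deactivates its four explorers, while K remains the only restoration leader.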
Phase 2. Since there are no overlapping restoration trees,
the agents move to the next phase.
Phase 3. Each ERxi sends a point-to-point RTB message
back home to make each EKj return back to RLK. Each
EKj, on its way back learns routing information that it later
shares with RLK. Phase 3 then continues to the end as
described in Section 3.1. Table 4 shows the routing table of
router J after the reconfiguration is complete.
Table 3. Router J, original table
Dest  Port    Dest  Port    Dest  Port
A     0       F     3       SB    2
B     2       G     4       SD    0
C     0       H     2       SE    1
D     0       I     3       SF    3
E     1       SA    0
Table 4. Router J, updated table
Dest  Port    Dest  Port    Dest  Port
A     4       F     3       SB    2
B     2       G     4       SD    4
C     4       H     2       SE    1
D     4       I     3       SF    3
E     1       SA    4
6. Conclusions
This paper has presented Agent NetReconf, a dynamic
network reconfiguration algorithm that uses collaborative
agents. The complexity analysis showed that Agent NetReconf
is significantly more efficient than message based
algorithms [1, 5], and reduces by more than one order
of magnitude the number of interactions and message
exchanges required to perform the network reconfiguration, as
was explained in Section 4.1.
The improvement in complexity achieved in Agent NetReconf
is based on the fact that all the agent interactions
occur at each router and the number of point-to-point
non-in-router communications is minimal.
Another important, but not obvious, contributor to Agent
NetReconf’s reduction in complexity is the representation
of agent knowledge as an OWL ontology. Using OWL
dramatically simplifies the way in which agents exchange
information. For example, during the Leader Selection an
agent only has to query the router’s knowledge base,
specifying that it needs to know the neighbor with
the highest ID that is two hops away. Querying the OWL
knowledge base is executed in constant time and does not
require any agents to be created, so its contribution to
the communication complexity is zero. This is mainly because
the queries are executed locally and never leave the
current router. This last property assures that there is no
need for the agents, nor Agent NetReconf, to use any global
network information.
The combination of the agent based architecture and
Agent NetReconf represents an important contribution to
active networking, because the network takes control of all its
tasks and uses intelligence as a way to provide improved
reliability and quality routing.
The cognitive properties of the agents allow the reconfig-
uration algorithm to execute faster, because the agents are
retrieving the information from the data knowledge base at
the router and do not have to wait for synchronous acknowl-
edgment from any other router. This facilitates the optimal
selection of the link that leads to the next node during the
reconfiguration.
To conclude, Agent NetReconf is a low complexity, in-
telligent distributed dynamic network reconfiguration algo-
rithm that is applicable to network computers with arbitrary
topologies, is application-transparent and is capable of iso-
lating and tolerating multiple faulty links or nodes.
References
[1] D. Avresky and N. Natchev. Dynamic Reconfiguration in
Computer Clusters with Irregular Topologies in the Presence
of Multiple Node and Link Failures. IEEE Transactions on
Computers, 55(2), May 2005.
[2] N. Bennacer, Y. Bourda, and B. Doan. Formalizing for
Querying Learning Objects Using OWL. In Proceedings of
IEEE International Conference on Advanced Learning Tech-
nologies, pages 321–325, 2004.
[3] G. D. Caro and M. Dorigo. Mobile Agents for Adaptive
Routing. In Proceedings of 31st International Conference
on System Sciences (HICSS-31), 1998.
[4] H. Chalupsky, T. Finin, R. Fritzson, D. McKay, S. Shapiro,
and G. Wiederhold. An Overview of KQML: A Knowledge
Query and Manipulation Language. Technical report,
KQML Advisory Group, Apr. 1992.
[5] J. Duato, R. Casado, A. Bermúdez, and F. J. Quiles. A
Protocol for Deadlock-Free Dynamic Reconfiguration in
High-Speed Local Area Networks. IEEE Transactions on Parallel
and Distributed Systems, 12(2):115–132, February 2001.
[6] M. Garijo, A. Cancer, and J. Sanchez. A Multi-Agent Sys-
tem for Cooperative Network-Fault Management. In Pro-
ceedings of the First International Conference and Exhibi-
tion on the Practical Applications of Intelligent Agents and
Multi-agent Technology, pages 279 – 294, 1996.
[7] M. Heusse, S. Guérin, D. Snyers, and P. Kuntz. Adaptive
Agent-Driven Routing and Load Balancing in Communication
Networks. Complex Systems, 1998.
[8] C. S. Hood and C. Ji. Intelligent Agents for Proactive
Fault Detection. IEEE Internet Computing, 2(2):65–72,
March–April 1998.
[9] N. Minar, K. H. Kramer, and P. Maes. Cooperating Mobile
Agents for Mapping Networks. In Proceedings of the First
Hungarian National Conference on Agent Based Computa-
tion, 1999.
[10] H. S. Nwana. Software Agents: An Overview. Knowledge
Engineering Review, 11(3):205–244, Oct./Nov. 1995.
[11] R. Schoonderwoerd, O. E. Holland, J. L. Bruten, and L. J. M.
Rothkrantz. Ant-Based Load Balancing in Telecommunica-
tions Networks. Adaptive Behavior, 5(2):169–207, 1996.
[12] D. L. Tennenhouse, J. M. Smith, W. D. Sincoskie, D. J.
Wetherall, and G. J. Minden. A Survey of Active Network
Research. IEEE Communications Magazine, 35(1):80–86,
1997.
[13] S. Wang, D. Xuan, R. Bettati, and W. Zhao. A Study of
Providing Statistical QoS in a Differentiated Services Network.
In NCA’03, Proceedings of IEEE International Symposium
on Network Computing and Applications, pages 297–304,
2003.
[14] G. Weiss. Multi Agent Systems, A Modern Approach to Dis-
tributed Artificial Intelligence. MIT Press, 2001. ISBN:
0-262-23203-0.
[15] T. White, A. Bieszczad, and B. Pagurek. Distributed Fault
Location in Networks Using Mobile Agents. In IATA
1998, Proceedings of the Second International Workshop
on Intelligent Agents for Telecommunication, volume 1437,
1998.
[16] M. J. Wooldridge. The Logical Modeling of Computational
Multi-Agent Systems. PhD thesis, University of Manchester,
1992.
[17] M. J. Wooldridge and N. R. Jennings. Intelligent Agents:
Theory and Practice. Knowledge Engineering Review,
10(2):115–152, June 1995.
[18] Y. Yemini and S. daSilva. Towards programmable networks.
In Proceedings of IFIP/IEEE International Workshop on
Distributed Systems: Operations and Management, 1996.
[19] P. Zhang and Y. Sun. A New Approach Based on Mobile
Agents to Network Fault Detection. In ICCNMC’01, Pro-
ceedings of the International Conference on Computer Net-
works and Mobile Computing, 2001.

Contenu connexe

Tendances

fault localization in computer network..
fault localization in computer network..fault localization in computer network..
fault localization in computer network..CDAC PUNE
 
Sensor Adhoc Networks SECOM paper-Final - format
Sensor Adhoc Networks SECOM paper-Final - formatSensor Adhoc Networks SECOM paper-Final - format
Sensor Adhoc Networks SECOM paper-Final - formatJohn A. Serri
 
RTOS BASED SECURE SHORTEST PATH ROUTING ALGORITHM IN MOBILE AD- HOC NETWORKS
RTOS BASED SECURE SHORTEST PATH ROUTING ALGORITHM IN MOBILE AD- HOC NETWORKSRTOS BASED SECURE SHORTEST PATH ROUTING ALGORITHM IN MOBILE AD- HOC NETWORKS
RTOS BASED SECURE SHORTEST PATH ROUTING ALGORITHM IN MOBILE AD- HOC NETWORKSIJNSA Journal
 
Paper id 2520141231
Paper id 2520141231Paper id 2520141231
Paper id 2520141231IJRAT
 
Energy efficient ccrvc scheme for secure communications in mobile ad hoc netw...
Energy efficient ccrvc scheme for secure communications in mobile ad hoc netw...Energy efficient ccrvc scheme for secure communications in mobile ad hoc netw...
Energy efficient ccrvc scheme for secure communications in mobile ad hoc netw...eSAT Publishing House
 
A SURVEY OF ENERGY-EFFICIENT COMMUNICATION PROTOCOLS IN WSN
A SURVEY OF ENERGY-EFFICIENT COMMUNICATION PROTOCOLS IN WSNA SURVEY OF ENERGY-EFFICIENT COMMUNICATION PROTOCOLS IN WSN
A SURVEY OF ENERGY-EFFICIENT COMMUNICATION PROTOCOLS IN WSNIAEME Publication
 
Some aspects of wireless sensor networks
Some aspects of wireless sensor networksSome aspects of wireless sensor networks
Some aspects of wireless sensor networkspijans
 
Wireless Micro-Sensor Network Models
Wireless Micro-Sensor Network ModelsWireless Micro-Sensor Network Models
Wireless Micro-Sensor Network ModelsIOSR Journals
 
SURVEY ON MOBILE AD HOC NETWORK
SURVEY ON MOBILE AD HOC NETWORKSURVEY ON MOBILE AD HOC NETWORK
SURVEY ON MOBILE AD HOC NETWORKIAEME Publication
 
IRJET - Securing Computers from Remote Access Trojans using Deep Learning...
IRJET -  	  Securing Computers from Remote Access Trojans using Deep Learning...IRJET -  	  Securing Computers from Remote Access Trojans using Deep Learning...
IRJET - Securing Computers from Remote Access Trojans using Deep Learning...IRJET Journal
 
IRJET- Securing on Demand Source Routing Protocol in Mobile Ad-Hoc Networks b...
IRJET- Securing on Demand Source Routing Protocol in Mobile Ad-Hoc Networks b...IRJET- Securing on Demand Source Routing Protocol in Mobile Ad-Hoc Networks b...
IRJET- Securing on Demand Source Routing Protocol in Mobile Ad-Hoc Networks b...IRJET Journal
 
Enhanced security in spontaneous wireless ad hoc
Enhanced security in spontaneous wireless ad hocEnhanced security in spontaneous wireless ad hoc
Enhanced security in spontaneous wireless ad hoceSAT Publishing House
 
Crypto Mark Scheme for Fast Pollution Detection and Resistance over Networking
Crypto Mark Scheme for Fast Pollution Detection and Resistance over NetworkingCrypto Mark Scheme for Fast Pollution Detection and Resistance over Networking
Crypto Mark Scheme for Fast Pollution Detection and Resistance over NetworkingIRJET Journal
 
11.soft handover scheme for wsn nodes using media independent handover functions
11.soft handover scheme for wsn nodes using media independent handover functions11.soft handover scheme for wsn nodes using media independent handover functions
11.soft handover scheme for wsn nodes using media independent handover functionsAlexander Decker
 

Tendances (18)

fault localization in computer network..
fault localization in computer network..fault localization in computer network..
fault localization in computer network..
 
M026075079
M026075079M026075079
M026075079
 
Sensor Adhoc Networks SECOM paper-Final - format
Sensor Adhoc Networks SECOM paper-Final - formatSensor Adhoc Networks SECOM paper-Final - format
Sensor Adhoc Networks SECOM paper-Final - format
 
RTOS BASED SECURE SHORTEST PATH ROUTING ALGORITHM IN MOBILE AD- HOC NETWORKS
RTOS BASED SECURE SHORTEST PATH ROUTING ALGORITHM IN MOBILE AD- HOC NETWORKSRTOS BASED SECURE SHORTEST PATH ROUTING ALGORITHM IN MOBILE AD- HOC NETWORKS
RTOS BASED SECURE SHORTEST PATH ROUTING ALGORITHM IN MOBILE AD- HOC NETWORKS
 
Paper id 2520141231
Paper id 2520141231Paper id 2520141231
Paper id 2520141231
 
Energy efficient ccrvc scheme for secure communications in mobile ad hoc netw...
Energy efficient ccrvc scheme for secure communications in mobile ad hoc netw...Energy efficient ccrvc scheme for secure communications in mobile ad hoc netw...
Energy efficient ccrvc scheme for secure communications in mobile ad hoc netw...
 
50120140506001
5012014050600150120140506001
50120140506001
 
H1803055461
H1803055461H1803055461
H1803055461
 
Bi33349355
Bi33349355Bi33349355
Bi33349355
 
A SURVEY OF ENERGY-EFFICIENT COMMUNICATION PROTOCOLS IN WSN
A SURVEY OF ENERGY-EFFICIENT COMMUNICATION PROTOCOLS IN WSNA SURVEY OF ENERGY-EFFICIENT COMMUNICATION PROTOCOLS IN WSN
A SURVEY OF ENERGY-EFFICIENT COMMUNICATION PROTOCOLS IN WSN
 
Some aspects of wireless sensor networks
Some aspects of wireless sensor networksSome aspects of wireless sensor networks
Some aspects of wireless sensor networks
 
Wireless Micro-Sensor Network Models
Wireless Micro-Sensor Network ModelsWireless Micro-Sensor Network Models
Wireless Micro-Sensor Network Models
 
SURVEY ON MOBILE AD HOC NETWORK
SURVEY ON MOBILE AD HOC NETWORKSURVEY ON MOBILE AD HOC NETWORK
SURVEY ON MOBILE AD HOC NETWORK
 
IRJET - Securing Computers from Remote Access Trojans using Deep Learning...
IRJET -  	  Securing Computers from Remote Access Trojans using Deep Learning...IRJET -  	  Securing Computers from Remote Access Trojans using Deep Learning...
IRJET - Securing Computers from Remote Access Trojans using Deep Learning...
 
IRJET- Securing on Demand Source Routing Protocol in Mobile Ad-Hoc Networks b...
IRJET- Securing on Demand Source Routing Protocol in Mobile Ad-Hoc Networks b...IRJET- Securing on Demand Source Routing Protocol in Mobile Ad-Hoc Networks b...
IRJET- Securing on Demand Source Routing Protocol in Mobile Ad-Hoc Networks b...
 
Enhanced security in spontaneous wireless ad hoc
Enhanced security in spontaneous wireless ad hocEnhanced security in spontaneous wireless ad hoc
Enhanced security in spontaneous wireless ad hoc
 
Crypto Mark Scheme for Fast Pollution Detection and Resistance over Networking
Crypto Mark Scheme for Fast Pollution Detection and Resistance over NetworkingCrypto Mark Scheme for Fast Pollution Detection and Resistance over Networking
Crypto Mark Scheme for Fast Pollution Detection and Resistance over Networking
 
11.soft handover scheme for wsn nodes using media independent handover functions
11.soft handover scheme for wsn nodes using media independent handover functions11.soft handover scheme for wsn nodes using media independent handover functions
11.soft handover scheme for wsn nodes using media independent handover functions
 

En vedette

The Future Of Robots
The Future Of RobotsThe Future Of Robots
The Future Of Robotsliz00
 
Getting started as an android developer
Getting started as an  android developerGetting started as an  android developer
Getting started as an android developerAva Meredith
 
An introduction to Autonomous mobile robots
An introduction to Autonomous mobile robotsAn introduction to Autonomous mobile robots
An introduction to Autonomous mobile robotsZahra Sadeghi
 
MS EXCEL PPT PRESENTATION
MS EXCEL PPT PRESENTATIONMS EXCEL PPT PRESENTATION
MS EXCEL PPT PRESENTATIONMridul Bansal
 

En vedette (8)

Agent basedqos
Agent basedqosAgent basedqos
Agent basedqos
 
The Future Of Robots
The Future Of RobotsThe Future Of Robots
The Future Of Robots
 
Ipdps intel dynreconf
Ipdps intel dynreconfIpdps intel dynreconf
Ipdps intel dynreconf
 
Introductionto agents
Introductionto agentsIntroductionto agents
Introductionto agents
 
Introduction to locomotor
Introduction to locomotorIntroduction to locomotor
Introduction to locomotor
 
Getting started as an android developer
Getting started as an  android developerGetting started as an  android developer
Getting started as an android developer
 
An introduction to Autonomous mobile robots
An introduction to Autonomous mobile robotsAn introduction to Autonomous mobile robots
An introduction to Autonomous mobile robots
 
MS EXCEL PPT PRESENTATION
MS EXCEL PPT PRESENTATIONMS EXCEL PPT PRESENTATION
MS EXCEL PPT PRESENTATION
 

Similaire à Collcom2005 agent basedft

Multi agent based network monitoring and management using jade
Multi agent based network monitoring and management using jadeMulti agent based network monitoring and management using jade
Multi agent based network monitoring and management using jadeAlexander Decker
 
OPTIMIZING CONGESTION CONTROL BY USING DEVICES AUTHENTICATION IN SOFTWARE-DEF...
OPTIMIZING CONGESTION CONTROL BY USING DEVICES AUTHENTICATION IN SOFTWARE-DEF...OPTIMIZING CONGESTION CONTROL BY USING DEVICES AUTHENTICATION IN SOFTWARE-DEF...
OPTIMIZING CONGESTION CONTROL BY USING DEVICES AUTHENTICATION IN SOFTWARE-DEF...IJNSA Journal
 
Detecting Various Black Hole Attacks by Using Preventor Node in Wireless Sens...
Detecting Various Black Hole Attacks by Using Preventor Node in Wireless Sens...Detecting Various Black Hole Attacks by Using Preventor Node in Wireless Sens...
Detecting Various Black Hole Attacks by Using Preventor Node in Wireless Sens...IRJET Journal
 
A SCALABLE MONITORING SYSTEM FOR SOFTWARE DEFINED NETWORKS
A SCALABLE MONITORING SYSTEM FOR SOFTWARE DEFINED NETWORKSA SCALABLE MONITORING SYSTEM FOR SOFTWARE DEFINED NETWORKS
A SCALABLE MONITORING SYSTEM FOR SOFTWARE DEFINED NETWORKSijdpsjournal
 
A middleware approach for high level overlay network
A middleware approach for high level overlay networkA middleware approach for high level overlay network
A middleware approach for high level overlay networkIOSR Journals
 
Current issues - International Journal of Network Security & Its Applications...
Current issues - International Journal of Network Security & Its Applications...Current issues - International Journal of Network Security & Its Applications...
Current issues - International Journal of Network Security & Its Applications...IJNSA Journal
 
Present and desired network management to cope with the expected expansion, n...
Present and desired network management to cope with the expected expansion, n...Present and desired network management to cope with the expected expansion, n...
Present and desired network management to cope with the expected expansion, n...Alexander Decker
 
A New Approach for Improving Performance of Intrusion Detection System over M...
A New Approach for Improving Performance of Intrusion Detection System over M...A New Approach for Improving Performance of Intrusion Detection System over M...
A New Approach for Improving Performance of Intrusion Detection System over M...IOSR Journals
 
Minimum Process Coordinated Checkpointing Scheme For Ad Hoc Networks
Minimum Process Coordinated Checkpointing Scheme For Ad Hoc Networks   Minimum Process Coordinated Checkpointing Scheme For Ad Hoc Networks
Minimum Process Coordinated Checkpointing Scheme For Ad Hoc Networks pijans
 
AN EFFICIENT ROUTING PROTOCOL FOR MOBILE AD HOC NETWORK FOR SECURED COMMUNICA...
AN EFFICIENT ROUTING PROTOCOL FOR MOBILE AD HOC NETWORK FOR SECURED COMMUNICA...AN EFFICIENT ROUTING PROTOCOL FOR MOBILE AD HOC NETWORK FOR SECURED COMMUNICA...
AN EFFICIENT ROUTING PROTOCOL FOR MOBILE AD HOC NETWORK FOR SECURED COMMUNICA...pijans
 
Cooperative Black Hole Attack Prevention by Particle Swarm Optimization with ...
Cooperative Black Hole Attack Prevention by Particle Swarm Optimization with ...Cooperative Black Hole Attack Prevention by Particle Swarm Optimization with ...
Cooperative Black Hole Attack Prevention by Particle Swarm Optimization with ...IJARIIT
 
Network Simulation.pptx
Network Simulation.pptxNetwork Simulation.pptx
Network Simulation.pptxSmashSmash5
 
MACHINE LEARNING FOR QOE PREDICTION AND ANOMALY DETECTION IN SELF-ORGANIZING ...
MACHINE LEARNING FOR QOE PREDICTION AND ANOMALY DETECTION IN SELF-ORGANIZING ...MACHINE LEARNING FOR QOE PREDICTION AND ANOMALY DETECTION IN SELF-ORGANIZING ...
MACHINE LEARNING FOR QOE PREDICTION AND ANOMALY DETECTION IN SELF-ORGANIZING ...ijwmn
 
Performance measurement of MANET routing protocols under Blackhole security a...
Performance measurement of MANET routing protocols under Blackhole security a...Performance measurement of MANET routing protocols under Blackhole security a...
Performance measurement of MANET routing protocols under Blackhole security a...iosrjce
 
Interference Revelation in Mobile Ad-hoc Networks and Confrontation
Interference Revelation in Mobile Ad-hoc Networks and ConfrontationInterference Revelation in Mobile Ad-hoc Networks and Confrontation
Interference Revelation in Mobile Ad-hoc Networks and Confrontationirjes
 
Wireless sensor networks software architecture
Wireless sensor networks software architectureWireless sensor networks software architecture
Wireless sensor networks software architectureAdeel Javaid
 
SEARCHING DISTRIBUTED DATA WITH MULTI AGENT SYSTEM
SEARCHING DISTRIBUTED DATA WITH MULTI AGENT SYSTEMSEARCHING DISTRIBUTED DATA WITH MULTI AGENT SYSTEM
SEARCHING DISTRIBUTED DATA WITH MULTI AGENT SYSTEMijiert bestjournal
 

Similaire à Collcom2005 agent basedft (20)

Final_Report
Final_ReportFinal_Report
Final_Report
 
Multi agent based network monitoring and management using jade
Multi agent based network monitoring and management using jadeMulti agent based network monitoring and management using jade
Multi agent based network monitoring and management using jade
 
OPTIMIZING CONGESTION CONTROL BY USING DEVICES AUTHENTICATION IN SOFTWARE-DEF...
OPTIMIZING CONGESTION CONTROL BY USING DEVICES AUTHENTICATION IN SOFTWARE-DEF...OPTIMIZING CONGESTION CONTROL BY USING DEVICES AUTHENTICATION IN SOFTWARE-DEF...
OPTIMIZING CONGESTION CONTROL BY USING DEVICES AUTHENTICATION IN SOFTWARE-DEF...
 
Detecting Various Black Hole Attacks by Using Preventor Node in Wireless Sens...
Detecting Various Black Hole Attacks by Using Preventor Node in Wireless Sens...Detecting Various Black Hole Attacks by Using Preventor Node in Wireless Sens...
Detecting Various Black Hole Attacks by Using Preventor Node in Wireless Sens...
 
A SCALABLE MONITORING SYSTEM FOR SOFTWARE DEFINED NETWORKS
A SCALABLE MONITORING SYSTEM FOR SOFTWARE DEFINED NETWORKSA SCALABLE MONITORING SYSTEM FOR SOFTWARE DEFINED NETWORKS
A SCALABLE MONITORING SYSTEM FOR SOFTWARE DEFINED NETWORKS
 
A middleware approach for high level overlay network
A middleware approach for high level overlay networkA middleware approach for high level overlay network
A middleware approach for high level overlay network
 
Current issues - International Journal of Network Security & Its Applications...
Current issues - International Journal of Network Security & Its Applications...Current issues - International Journal of Network Security & Its Applications...
Current issues - International Journal of Network Security & Its Applications...
 
Present and desired network management to cope with the expected expansion, n...
Present and desired network management to cope with the expected expansion, n...Present and desired network management to cope with the expected expansion, n...
Present and desired network management to cope with the expected expansion, n...
 
Ds35676681
Ds35676681Ds35676681
Ds35676681
 
A New Approach for Improving Performance of Intrusion Detection System over M...
A New Approach for Improving Performance of Intrusion Detection System over M...A New Approach for Improving Performance of Intrusion Detection System over M...
Collcom2005 agent basedft

Dynamic Network Reconfiguration in Presence of Multiple Node and Link Failures Using Autonomous Agents

Juan Ramón Acosta and Dimiter R. Avresky
Network Computing Lab, Northeastern University, Boston, MA
{jracosta,avresky}@ece.neu.edu

Abstract

Currently, high-speed networks are indispensable commodities for all users and have become an integral part of their lifestyles. For this reason, the network must be available most of the time and must recover from failures transparently. This paper proposes Agent NetReconf¹, an agent based dynamic network reconfiguration algorithm that is capable of tolerating multiple router and link failures in high-speed networks with arbitrary topology. Agent NetReconf updates the routing tables asynchronously and does not require any global knowledge of the network topology. Agent NetReconf uses mobile and autonomous agents to detect failures and to recover the network from them. Agent NetReconf highlights the benefits of using smart networking devices as a means of building an active network. The complexity of Agent NetReconf is analyzed and its termination, liveliness and safety are proved.

Keywords: high-speed networks, autonomous mobile agents, dynamic reconfiguration, fault tolerance, adaptive routing, arbitrary topologies

Introduction

The increasing number of Internet users has triggered a significant growth in the number of networked devices and the traffic they generate. Computer networks are now being pushed to their limit. In this context, computing capacity is available but it can be severely affected by failures. The major challenge faced by service providers today is to keep their ability to give customers the level of service they require, regardless of system conditions and the number of faults on the network.

The need to provide increased availability has led researchers such as Hood and Ji [8] to develop a sophisticated intelligent software agent that performs fault detection accurately and in certain cases predicts the fault before it appears. Others such as Whit et al. [15] have implemented communities of mobile agents that roam the network collecting and exchanging network information based on the "social insects" paradigm (ant behavior) described by Schoonderwoerd et al. [11].

¹ This work was supported by the U.S. National Science Foundation under grant CCR-0004515.

In this paper, an algorithm is proposed for achieving dynamic network fault detection and avoidance in arbitrary topologies using autonomous agents running at each router. The reconfiguration algorithm is distributed and embedded in the agents' behavior. The paper is organized in six sections as follows. Section 1 presents an overview of agents and how they are used in adaptive routing. Section 2 describes a new router architecture that uses autonomous agents for its routing services. Section 3 describes Agent NetReconf and how it reconfigures the routing tables to restore routing capabilities at the network segment affected by the failure. Section 4 presents the complexity, termination, safety and cognitive properties of Agent NetReconf. Section 5 presents a fault recovery example showcasing the algorithm execution. The last section contains the conclusions.

1. Autonomous Agents

This section presents an overview of previously published work on how agents are used to achieve efficient network routing and fault tolerance.

The term agent has been used to refer to a software and/or hardware component which is capable of acting exactingly in order to accomplish tasks on behalf of its user [10].
An agent is able to cooperate with other agents, learn from its environment [17], and sometimes migrate under its own control from one machine to another, provided both computers are part of a network.

Agents communicate with other agents to successfully complete all the tasks given to them [16]. Communication between agents is modeled as a point-to-point exchange of messages whose content is expressed in a well defined language, for example the Knowledge Query and Manipulation Language (KQML) [4], the Knowledge Interchange Format (KIF) [14] or, most recently, the OWL Web Ontology Language [2].

1.1. Applications on Network Fault Tolerance

Minar [9] describes an algorithm to discover the network topology using mobile agents. The agents travel the network and, from each node they visit, they learn its current connectivity. In addition, the agents complement the acquired knowledge by cooperating with other agents they meet at the same node. Finally, when the agents finish exploring the network, the topology is fully discovered, and this information is then used to define the routing tables at each node.

Agents have also been used in adaptive routing. For example, Gianni [3] introduced a distributed adaptive routing algorithm based on mobile agents that is capable of learning the routing tables of a computer network using the ant colony metaphor. Garijo, Cancer and Sánches [6] describe a centralized Multi-agent Cooperative Network-Fault Management system (CNFM) that uses ISO standard interfaces at each router to detect and avoid faults on the network. In CNFM the agents work as watchdogs of the network, monitoring each element and generating events into the CNFM engine when faults are recognized.

Cynthia Hood and Chuanyi Ji [8] took advantage of the increasingly available computation power in networking devices and the benefits of artificial intelligence to design an intelligent agent that processes information collected by the Simple Network Management Protocol agents (SNMP-agents) at each node and uses this information to detect network anomalies that typically precede a fault.
“The intelligent agent learns the normal behavior from each reading made by the SNMP-agent and combines the information using a Bayesian network that could trigger a local corrective action or a message to a centralized network manager.” In a similar approach presented by Phuan and Yufang [19], an intelligent mobile agent extracts data from a network element using a local high-bandwidth communication session, which avoids consuming network resources and reduces the overall communication traffic. The intelligent mobile agent is able to integrate knowledge from a network manager and any network element to infer which type of fault recovery is necessary.

The algorithm proposed in this paper differs from the solutions described earlier in that Agent NetReconf executes network failure recovery using only the local knowledge at each router, without having to know the network topology or the type of faulty element (router or link), and it is platform independent.

2. Agent Based Router

In order for network failure recovery to happen at the exact location where an element failed, the routing elements in the vicinity must take an active role in the detection and containment of the fault. As mentioned earlier, network fault detection and recovery is commonly implemented such that a central network monitoring station launches all the corrective actions from a remote site, as seen in [8, 6, 19], and only a few implementations, such as those described in [1, 5], make the routers adjacent to the failure participate in the restoration of connectivity.

In this section, the authors propose an agent based router in which the detection and reconfiguration tasks are performed by a group of intelligent agents. The agents are goal oriented and capable of incorporating new knowledge learned during the router operation and network reconfiguration.
In essence, the new router is an active intelligent network device capable of reacting and adjusting its operation based on the events that occur in its internal and external environment.

2.1. Architecture

The architecture of the new intelligent router, shown in Figure 1, is based on a high-speed crossbar switch with an enhanced embedded software module that contains an agent subsystem. For simplicity, the agent platform will not be specified.

The router hosts a community of agents that are responsible for controlling the router's activities and coordinating all the tasks involved in the dynamic reconfiguration of routing tables when the router participates in the recovery of a failure. The knowledge used by the agents to represent the router, links, neighbors and the execution parameters of the fault-tolerant reconfiguration algorithm is saved in the agent's main memory. The structural representation of the knowledge is defined using ontology classes written in the OWL Web Ontology Language [2].

[Figure 1. Agent based router architecture. Labeled elements: input ports 0 to N−1, N×N crossbar, output ports 0 to N−1, arbitration and routing decision logic, routing tables, Node Manager Agent, Router Agent, Link Manager.]

The agents operating the router are defined as follows:

1. Node Manager Agent. This agent oversees the operation of the router and the other agents. The node manager is the router's public interface, which can be used by network administration tools, visiting explorer agents, neighbor routers and other external network elements to communicate with the router. The manager agent is also responsible for the security and integrity of the router; it supervises all accesses to the routing tables and memory, and makes sure that all the requests made to it are safe. The node manager agent is the only component in the router that can initiate a reconfiguration task. The node manager agent uses a reinforcement learning method to acquire new knowledge to make better decisions during node management and fault recovery.

2. Router Agent. It is the only agent in the new architecture that can manipulate the routing tables and has the capability of accepting or declining updates. The agent behavior is determined by the inherent routing algorithm and the dynamic reconfiguration policies. As seen in Figure 1, the router's arbitration and routing decision logic are controlled by this agent. The router agent reacts only to requests from the node manager agent.

3. Link Manager Agent. Responsible for managing the router's connected links, ports and queues. The agent is in charge of detecting and reporting failures and congestion to the node manager. The agent uses a reinforcement learning model to learn the characteristic symptoms that precede a failure or congestion; this allows the agent to choose the appropriate corrective actions and promptly trigger a restoration task. The agent uses the “I'm alive” message model to determine failures and the flow-unaware statistical delay method described in [13] to accurately determine packet delays without depending on the dynamic information of the packet flow.

4. Explorer Agent. These agents are dynamically created in each router when Agent NetReconf is executed. When an explorer agent is working in search mode it cooperates with other agents to build a restoration spanning tree that will reconnect the nodes disconnected by the failure.
When an explorer agent is working in restoration mode, it collaborates with the node manager agents at each router on the restoration tree to update the local router tables. An explorer agent is a delegate of the router that created it, such that any interaction between two different agents is equivalent to the two routers interacting directly point-to-point.

3. Network Failure Recovery

3.1. Agent NetReconf

This section describes a new dynamic network reconfiguration algorithm, Agent NetReconf. The algorithm uses a set of collaborative agents to restore network connectivity after a failure is detected. Agent NetReconf is a distributed intelligent algorithm that operates at the network level without any global information about the network topology.

The strategy used by Agent NetReconf consists of identifying the set of nodes adjacent to a failure and selecting from them a leader to coordinate the construction of a restoration spanning tree and synchronize the updates to the routing tables at each node on the restoration tree. The complete reconfiguration process consists of four phases: Leader Selection, Restoration Tree Construction, Reconfiguration Synchronization and Tables Update. The correct execution of these phases is subject to the validity of the following assumptions:

Assumption 3.1 After a failure F is detected, no additional failures will occur on any link or node that belongs to the restoration tree until Agent NetReconf finishes the reconfiguration process for F.

Assumption 3.2 The network is not partitioned as a result of the failures.

Before describing each phase in detail, for clarity, consider R to be the set of all routers in the network and assume that each router Ri is connected to N other routers, its immediate neighbors. Also let Sij be the collection of IDs of all routers that are two hops away from Ri via link Lj. Additionally, assume that each Lk is monitored and managed by one of the link manager agents (LMk).
At each router Ri, the link manager LMk that detects missing “I'm alive” messages from link Lk immediately notifies the Node Manager Agent (NMi) by raising the asynchronous NetworkFailureDetected event.

Leader Selection

After the failure is detected by router Ri, the node manager NMi suspends the traffic targeting Lk, the link leading to the presumed faulty node. From Sik, NMi selects the ID with the highest value and records it in memory as the ID of the Restoration Leader (RLF). If the selected ID equals Ri's ID then Ri becomes the leader and immediately starts Phase 1. Otherwise, the router starts timer Tstart and waits for a control signal from RLF indicating that the node can join Phase 1. If Tstart times out and no signal from RLF has been received, Ri marks RLF faulty and starts the leader selection again.

Definition 3.1 Node Adjacent to Failure (NAF) A node that was not selected “Restoration Leader” and was directly connected to a node or link that failed.

Phase 1. Restoration Tree Construction

The first step in Agent NetReconf is to build a restoration tree to establish a communication path between the leader and the NAFs.

Step 1a. Begin Phase

Phase 1 starts with the Restoration Leader RLF (RLF = Ri) creating one explorer agent Eij per active link Lj. Eij is initialized in search mode and is provided with the list of disconnected NAFs. Eij makes Ri its home and starts the search for NAFs by migrating to the neighbor connected to Lj. After all Eij have migrated out of the leader node, RLF starts timer Tack and waits for the arrival of control signals confirming that a restoration path was found between RLF and each NAF.

Step 1b. Searching for NAFs

As the explorer agent Eij arrives at a node Rx, it adds the ID of the visited node to the restoration path it is building. Eij exchanges information with the current node and uses this information to define an itinerary for its next migration.

If the explorer agent did not arrive at a NAF, then it uses the information to create clones of itself to help it continue searching. The itinerary and the number of clones are based on the number of active links and the available feasible routes to the NAFs. For example, in Figure 2, explorer EH3 learns from RE that there are two active links, L0 and L3, and one feasible route via L3. NAFs {C,D} are presumed to be reachable through L3 and {A,B} will need to be searched via L0. This implies that at least two explorers are required. However, since RE is not a NAF, EH3 itself can continue searching, so only one clone is required for the next migration.
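
The leader selection rule described earlier in this section (pick the highest ID, and retry if Tstart expires) can be illustrated with a minimal sketch. The function name, the `faulty` parameter and the decision to include the router's own ID among the candidates are assumptions made for illustration; the paper only states that NMi selects the highest ID from Sik.

```python
def select_restoration_leader(own_id, two_hop_ids, faulty=frozenset()):
    """Return (leader_id, is_leader) for the router with ID `own_id`.

    two_hop_ids models S_ik, the IDs of routers two hops away through the
    link that leads to the presumed faulty node (hypothetical input format).
    faulty holds leader candidates already marked faulty after a T_start
    timeout, so that selection can be repeated without them.
    """
    # Assumption: the router itself is a candidate, so it can recognize
    # that the selected ID equals its own ID.
    candidates = (set(two_hop_ids) | {own_id}) - set(faulty)
    leader = max(candidates)
    return leader, leader == own_id


# Router 5 first defers to router 9; after T_start expires it marks 9
# faulty and repeats the selection, this time electing itself.
first = select_restoration_leader(5, [3, 9, 5])
retry = select_restoration_leader(5, [3, 9, 5], faulty={9})
```

Because every router adjacent to the failure applies the same deterministic rule to the same candidate set, no messages need to be exchanged to agree on the leader.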

When Eij arrives at a NAF, the explorer agent removes Rx from the list of NAFs and tells NMx to save the restoration path Eij traveled. Then, NMx stops Tstart and creates a restoration explorer agent ERxi, which it sends back to the restoration leader Ri to confirm that the restoration path was found. Although Eij has reached a NAF, the search needs to continue for the remaining NAFs in the list. Eij therefore creates clones and their itineraries following the same criteria mentioned before. Each clone then continues the search. Meanwhile Eij stays at Rx and starts timer Tphase3 to wait for a signal from RLF to start Phase 3. The case in which Tphase3 times out represents a situation in which a failure occurs during reconfiguration. However, based on Assumption 3.1, this will not occur.

Cycles are prevented in the restoration paths by deactivating an explorer agent when it arrives at a node that has already been visited by itself, one of its clones or one of its siblings.

In order to distinguish between node and link failures, Agent NetReconf uses explorer agents as follows: if a NAF receives an Eij from a node which is assumed to be faulty, then a link failure is identified; therefore the NAF must update its reconfiguration information for the node and mark it safe.

When the two nodes at the ends of a faulty link have each determined that they are the restoration leader for the link failure, the nodes must be synchronized so that only one leader remains. The synchronization occurs when both nodes receive an explorer agent from each other, Eij and Eyj. The restoration leader for the faulty link will be the parent of the explorer agent that has the highest ID value. For example, Ry, parent of Eyj, becomes the restoration leader for the failed link and node Ri becomes a NAF. After the leader synchronization has occurred, Agent NetReconf continues with the reconfiguration.
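
The search in Step 1b, including the rule that an explorer is deactivated when it reaches an already visited node, can be sketched as a centralized simulation. This is only an illustration under stated assumptions: in Agent NetReconf the explorers are distributed mobile agents, whereas here a queue of paths stands in for them, and the names `find_restoration_paths`, `adjacency` and `failed` are hypothetical.

```python
from collections import deque


def find_restoration_paths(adjacency, leader, nafs, failed=frozenset()):
    """Simulate explorer agents searching for NAFs from the leader.

    adjacency maps a router ID to the set of its neighbor IDs.
    Returns {naf_id: path}, where path is the list of router IDs from the
    leader to that NAF (the restoration path the explorer recorded).
    """
    remaining = set(nafs)          # NAFs not yet reached
    visited = set()                # nodes already visited by some explorer
    paths = {}
    explorers = deque([[leader]])  # each entry is the path an explorer carries
    while explorers and remaining:
        path = explorers.popleft()
        here = path[-1]
        if here in visited:
            continue               # cycle prevention: deactivate this explorer
        visited.add(here)
        if here in remaining:
            remaining.discard(here)
            paths[here] = path     # the NAF's node manager saves the path
        # One clone per usable link (simplification: the paper prunes clones
        # using feasible-route knowledge learned at the node).
        for neighbor in sorted(adjacency[here]):
            if neighbor not in failed and neighbor not in visited:
                explorers.append(path + [neighbor])
    return paths


# Router F has failed; leader C must reach NAF A around the failure.
topology = {"A": {"B", "F"}, "B": {"A", "C"}, "C": {"B", "F"}, "F": {"A", "C"}}
result = find_restoration_paths(topology, "C", {"A"}, failed={"F"})
```

The sketch clones along every usable link, so it explores more than the algorithm described above would; the itinerary planning based on feasible routes is what lets the real explorers reduce flooding.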

Step 1c. Establishing Tree

At each node Rj that is on the path followed by ERxi, NMj marks the links on which ERxi arrives and departs as members of the restoration tree. Furthermore, if NMj detects that a different ERxy, from leader Ry, has already visited the node, then, to avoid any conflicts with the reconfiguration, it gives ERxi the information about Ry, so that when ERxi reaches Ri, Ri can synchronize with Ry before it proceeds with Phase 3. ERxi continues migrating until it reaches the restoration leader.

When Tack times out at the restoration leader, RLF determines which NAFs did not reply with an ERxi in order to mark them faulty and exclude them from the reconfiguration. The restoration leader continues and builds the restoration tree by merging each root of the confirmed restoration paths. After the restoration tree is completed, each ERxi sends a point-to-point Restoration Tree Built (RTB) message signal to its parent.

Definition 3.2 Node On Restoration Tree (NORT) A node that has at least one link belonging to the restoration tree.

Phase 2. Multiple Failure Synchronization

When multiple failures appear, Agent NetReconf establishes an ordered sequence of priorities between the restoration leaders detected by the visited NORTs, such that the reconfigurations occur in a “safe” sequence in which the restoration leader with the highest ID always executes Phase 3 first, while the others await their turn. For example, if we assume that Ry's ID is higher than Ri's, then Ry will proceed to Phase 3 before Ri.
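
The priority rule of Phase 2 amounts to ordering the known restoration leaders by descending ID. A minimal sketch, assuming leader IDs are comparable integers and using the hypothetical name `phase3_order`:

```python
def phase3_order(leader_ids):
    """Return the order in which restoration leaders execute Phase 3.

    leader_ids is the collection of restoration leader IDs reported by the
    visited NORTs; duplicates are possible because several NORTs may report
    the same leader. The leader with the highest ID goes first.
    """
    return sorted(set(leader_ids), reverse=True)


# Leaders 4 and 9 were both detected (4 was reported twice).
order = phase3_order([4, 9, 4])
```

Here leader 9 runs Phase 3 first and leader 4 waits, which matches the example in which Ry, having the higher ID, proceeds before Ri.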

Phase 3. Routing Information Update

This phase starts with a NAF processing an incoming RTB message and providing new routing information to the awaiting Eij. After the information exchange finishes, the explorer agent starts migrating back to RLF using the acknowledged restoration path. As Eij travels back to RLF, the node manager of each visited node exchanges routing information with Eij and, if necessary, updates its routing tables. Eij continues migrating until it reaches RLF. The information given to Eij by the NAF, and by each visited node, includes the IDs of all destinations that are reachable through each of these nodes using links that do not belong to the restoration tree.

Upon arrival at the restoration leader, Eij delivers to RLF the routing information it collected. RLF processes the data to adjust its routing tables and deactivates Eij. When RLF completes the update, it provides each ERxi with the IDs of all the destinations reachable through its active links, excluding the link on which the ERxi arrived, and then ERxi migrates to its parent NAF. After all restoration explorers have migrated, RLF starts a timer Tcomplete to wait for a confirmation signal from each NAF indicating that the updates were completed and that they are ready to resume operations. The case in which Tcomplete times out is similar to that described earlier and will be dealt with in a future publication.

As ERxi travels back, the node manager of each visited node exchanges routing information with ERxi and, if necessary, updates its routing tables. ERxi continues traveling until it reaches its parent NAF. The routing information provided by the visited node includes the IDs of all the destinations reachable through the visited node using the links that belong to the restoration tree, with the exception of the IDs of nodes accessible via the links on which ERxi arrives at and leaves the visited node.
Upon arrival of ERxi, Rx updates its routing tables with the information contained in the restoration explorer and ERxi is deactivated. The NAF then sends RLF a point-to-point Update Complete Response (UCR) message signal. When RLF receives the UCR signal, it stops Tcomplete and resumes normal operations.

The reconfiguration algorithm, as described, makes maximal use of the ability of the agents to interact with each other. Communication between the explorer and node manager agents is performed mostly within the router's agent module; only a very few messages leave the router, and those are exchanged point-to-point. This is an important contribution of Agent NetReconf because it keeps the algorithm execution distributed at each router and keeps to a minimum the overhead on the bandwidth usage and the number of links preempted for the reconfiguration to work.

Agent NetReconf bases its execution on the natural ability of autonomous agents to acquire and share knowledge. For instance, when the explorer agents are searching for NAFs they learn information at each node that helps them design an optimal migration pattern that reduces network flooding significantly.

4. Properties of Agent NetReconf

4.1. Complexity

The complexity of Agent NetReconf is analyzed in terms of the number of explorer agents created during restoration tree construction and routing table reconfiguration. Let LActive be the number of active links on each router, nfin the number of NAFs for failure F, and P a path between RLF and a NAF.

Theorem 4.1 The complexity of Agent NetReconf for multiple failures is given by O(LActive * ((nmax * Pmax) + 1)), where Pmax is the longest path connecting RLF and any NAF and nmax is the maximum number of NAFs.

Proof: Agent NetReconf determines RLF without creating explorer agents, such that leader selection is achieved with O(0) complexity. In Phase 1, when the recovery step initiates, RLF creates LActive exploration agents Eij, one per active link.
The corresponding complexity for this operation is O(LActive). As an explorer agent migrates searching for target NAFs, the maximum number of explorer agents created at the visited node Rx, as described in Phase 3, is LActive − 1. In cases where Rx is a NAF, Rx creates one exploration agent for recovery, ERxi, such that the maximum number of explorer agents created at an intermediate router is LActive.

Now, considering that the longest restoration path between RLF and a NAF is Pmax, the total number of explorer agents needed to continue searching for a NAF is Pmax ∗ LActive. Considering the worst case, in which each NAF is reached via a disjoint restoration path, the total number of explorers created is given by nmax ∗ Pmax ∗ LActive.

Assuming that all restoration trees intersect, Phase 2 is executed independently for each RT without creating any agents, which results in O(0) complexity.

Then, by adding the number of agents created by the restoration leader, the complexity of Agent NetReconf becomes O((nmax ∗ Pmax ∗ LActive) + LActive), or O(LActive ∗ ((nmax ∗ Pmax) + 1)). Q.E.D.

Now, by comparing O(LActive ∗ ((nmax ∗ Pmax) + 1)) with the complexity of NetRec in [1], which is O(N ∗ (L + nmax ∗ Pmax + N ∗ Pmax)), it is clear that Agent NetReconf reduces the complexity of NetRec by more than one order
of magnitude. This is explained by the fact that LActive counts only the active links on a router rather than the total number of links in the network. The improvement presented here is possible because in Agent NetReconf the agents use their knowledge to make inferences and execute actions that otherwise, in standard NetRec, would require several point-to-point message exchanges. This, in fact, is a powerful feature of agent-based systems, as mentioned in [14].

4.2. Termination

The following agent migration patterns and message delivery properties are used for proving Agent NetReconf's termination.

Definition 4.1 If a point-to-point message is sent from a source agent S to a destination agent D, then it will be received once and only once by D.

Definition 4.2 Every point-to-point message sent between an exploration agent Eij or ERxi and a node manager agent NMx will be routed following a path on the restoration tree and will be reliably delivered to its destination.

Definition 4.3 The restoration leader RLF considers an arriving ERxi to be the acknowledgment sent from a NAF to confirm that a restoration path has been created.

Definition 4.4 The restoration leader RLF considers a returning Eij to be the acknowledgment sent by a NAF to confirm that a restoration tree was established, together with the request to update its routing tables with the information carried by Eij.

Definition 4.5 A NAF considers a returning ERxi to be the acknowledgment sent by the RLF that it updated its routing information and that the NAF must update its table with the new information carried by ERxi.

Lemma 4.1 For a given faulty node F, all NAFs will elect the same RL.

Proof: We prove by contradiction. Suppose that two NAFs elect different RLs. Since the router with the highest ID among the NAFs is elected as RL, these two NAFs must have used different NAF sets.
However, all NAFs are two hops from each other through F and, by definition, each NAF knows its own ID and the IDs of all routers that are two hops away from it. Thus, the NAF sets determined by the NAFs cannot be different, which contradicts the supposition. Q.E.D.

Lemma 4.2 For a given fault F, the RLF and all the NAFs will successfully establish a restoration tree rooted at RLF such that Agent NetReconf can start the reconfiguration step.

Proof: According to Lemma 4.1, all non-faulty NAFs will elect the same RLF. Phase 3 and Def. 4.3 assure that a NAF is reached by RLF and that the restoration path is established. By sending a Restoration Tree Built (RTB) message, as described in Phase 3, it is guaranteed that a NAF is notified that the restoration tree was established. Defs. 4.1 and 4.2 assure that this point-to-point message is delivered to its destination reliably. Finally, Def. 4.4 assures that both RLF and the NAFs receive the routing information describing the restoration tree. Therefore the restoration tree is reliably established. Q.E.D.

Lemma 4.3 For a given failure, all NAFs, NORTs and RLF successfully update their routing tables and Agent NetReconf execution terminates.

Proof: Since Lemma 4.2 assures that the restoration tree is reliably established, Phase 3 assures that new routing information is collected by the explorer agents. Def. 4.4 assures that RLF receives the new information and updates its table before any NAF. Def. 4.5 guarantees that the NAFs receive new information after RLF completes its updates. Phase 3 makes sure that RLF knows that a NAF finished updating and that it is ready to resume operations. Q.E.D.

Lemma 4.4 All the explorer agents Eij and ERxi deactivate.

Proof: By Def. 4.4, an Eij explorer returns home after the restoration tree RTF has been established. Phase 3 assures that Eij deactivates after the RLF updates its routing information. Similarly, Def.
4.5 assures that ERxi returns home and deactivates after the NAF updates its table. In addition, Phase 3 assures that the Eij that were created and never reach a NAF will deactivate. Q.E.D.

Lemma 4.5 In the presence of multiple intersecting restoration trees, none of the intersecting RLs will remain forever in Phase 2.

Proof: The goal of Phase 2 is to ensure that at any given time only RLs with non-intersecting restoration trees will be executing Phase 3, in which the routing information is updated. In the cases of consecutive failures and simultaneous disjoint failures, this is always true, so Phase 2 is skipped and the RLs proceed to Phase 3 independently of each other. If there are simultaneous failures with intersecting restoration trees, then their RLs must establish a temporal order, which results in a sequence of temporally disjoint reconfigurations around single failures or simultaneous disjoint failures.

For each two intersecting restoration trees there is at least one joint node, which detects the intersection. This
guarantees that at least one of the RLs in each intersection will be notified about it. The temporal order is established by the intersecting RLs based on their node IDs: nodes with higher IDs have higher priority. All lower-priority RLs will wait in Phase 2 until all higher-priority RLs have completed Phase 3. Following the algorithm, after completing Phase 3, each RL notifies all lower-priority RLs, which allows the next leader in the temporal order to execute Phase 3. Thus, all leaders that were waiting in Phase 2 will eventually receive the required synchronization messages that allow them to proceed to Phase 3. Q.E.D.

Theorem 4.2 On all nodes, Agent NetReconf will successfully complete in the presence of multiple failures, i.e., Agent NetReconf will terminate and the nodes adjacent to the failures will be reachable.

Proof: Based on Lemmas 4.1 to 4.5, it can be concluded that the RLF and the NAFs will proceed with all phases of Agent NetReconf and will generate the required explorer agents to carry out the establishment of the restoration tree and the reconfiguration of each node (RLF, NAFs and NORTs) on the tree. Q.E.D.

4.3. Liveliness

This section proves that on completion of Agent NetReconf the network will be reconfigured appropriately.

Theorem 4.3 On completion of Agent NetReconf, all connected nodes in the network are reachable.

Proof: The appearance of a failure causes all the paths that go through the faulty link or node to be bisected. The results are segments of unreachable nodes, where each segment begins with a NAF. By Assumption 3.2, the network is not partitioned, so all connected nodes are reachable through non-faulty physical paths. Lemma 4.2 assures that all the NAFs and NORTs are reachable through a spanning tree rooted at the NAF acting as restoration leader.
During the recovery phase, Lemma 4.3 guarantees that all the nodes on the restoration tree have their routing tables updated in such a way that all the faulty segments are replaced with restoration paths. Theorem 4.2 demonstrates that Agent NetReconf will terminate for any single failure by executing a “safe” sequence of reconfigurations that are performed synchronously and coordinated by the restoration leader. Q.E.D.

4.4. Safety

The goal of this section is to define and prove the safety property of Agent NetReconf, namely, avoidance of infinite loops and cyclic dependencies.

Theorem 4.4 Agent NetReconf does not create infinite loops or cyclic dependencies.

Proof: Cyclic dependencies among the nodes on the restoration tree will not be created, because Step 3.1 prevents any explorer agent Eij in search mode from either returning to the RLF or continuing to explore if the currently visited node was already visited by another Eij from RLF. Lemma 4.5 proves that no restoration leader will be blocked forever in Phase 2. Likewise, cyclic dependencies between the RLs cannot arise, because they are resolved by always giving priority to the nodes with higher ID or to nodes that are already in Phase 3. In the presence of multiple failures, the RLs will enter Phase 3 in the priority order established in Phase 2, i.e., at any time only RLs with disjoint restoration trees are permitted to execute Phase 3 concurrently. Therefore, cyclic dependencies cannot be formed between the RLs. The RL-NAF relations are based on a strict request-response model, so there are no cyclic dependencies between them. Since all possible faulty NAFs have been isolated from the restoration tree in Phase 1 and all reconfiguration messages are reliably delivered, all loops in Phase 3 will terminate after the corresponding messages are received. Q.E.D.

4.5.
Cognitive Properties

Having autonomous mobile agents execute the algorithm in parallel at each router reduces the required point-to-point interactions between the restoration leader and the NAFs. For instance, two agents exchange point-to-point messages only when necessary; otherwise they work with the knowledge that exists at each node and the knowledge they acquire from other agents during the construction of the restoration tree or the reconfiguration phase.

Having agents execute the recovery algorithm keeps the knowledge of a failure close to where it happened, instead of spreading the information widely to other elements that are oblivious of the fault. Also, with agents, more intelligent interactions occur between routers. For example, the manager NMi at RLF knows that the arrival of an ERxi is the confirmation that the NAF is alive and that the path followed by an Eij is the desired restoration path. Similarly, if an ERxi returns home, the NAF knows that the restoration leader has completed updating its routing information and that it is its turn to do the same.

The lower complexity of Agent NetReconf allows the algorithm to scale because it only involves a small number of links, as was proved in Section 4.1.

In Agent NetReconf, an explorer agent represents more than one message type of those used in message-based algorithms such as [1, 5], and without oversimplifying, an
agent is considered a smart message that has cognitive and evolutive capabilities.

These cognitive properties allow the reconfiguration algorithm to execute faster, because the agents retrieve the information from the knowledge base at the router and do not have to wait for a synchronous acknowledgment from any router.

The use of agents in the reconfiguration algorithm helps reduce the number of message exchanges and the number of links used in the reconfiguration, and allows an agent to make an optimal selection of the link that leads to the next node.

5. Examples of Failure Recovery

5.1. Node Failure Recovery

To illustrate the behavior of Agent NetReconf when recovering from a node failure, consider that router R fails on the network shown in Figure 2. After a TIamAlive timeout expires, routers {A, B, C, D, E, H} detect the failure F. Each router then becomes a Node Adjacent to Failure (NAF) and, in parallel, they start selecting a restoration leader RLF.

Figure 2. Node failure recovery

Phase 0. In D, NMD queries SD1, its knowledge base, and determines that router H has the highest ID among the others that are two hops away via link L1. Similarly, {A, B, C, D} select H as RLF and then become NAFs.

Phase 1. At H, NMH creates three explorer agents EH0, EH1 and EH3, one per active neighbor. Each agent learns the list of NAFs and starts migrating, searching for NAFs. Consider EH3. When the explorer arrives at RE, it learns that there are two active links, L0 and L3, and one feasible route via L3. NAFs {C, D} are presumed to be reachable through L3 and {A, B} will need to be searched via L0. This implies that at least two clones are required. However, since RE is not a NAF, EH3 can continue searching. As each explorer reaches a NAF, a restoration explorer is sent to RLF.
At RLF, when ERAH, ERBH, ERCH and ERDH arrive, the restoration tree is considered built, shown with black lines in Figure 2.

Phase 2. Since there are no overlapping restoration trees, the agents move to the next phase.

Phase 3. Each ERxi sends a point-to-point RTB message back home to make each Eij return to RLF. Each Eij, on its way back, learns routing information that it later shares with RLF.

Table 1. Router D, original table

Dest  Port    Dest  Port
A     1       F     0
B     1       G     1
C     2       H     1
D     -       R     1
E     0

Table 2. Router D, updated table

Dest  Port    Dest  Port
A     0       F     0
B     0       G     0
C     2       H     1
D     -       R     1
E     0

When all Eij have arrived, RLF determines the destinations that can be reached through its active links and gives each ERxi a list from which it excludes the destinations reachable through the port on which ERxi came in. ERDH, for example, will be provided with {A, B, F, G}. On its way home, each node visited by ERDH provides the destinations reachable through links belonging to RTF, excluding those reachable through the links on which ERDH arrived at and departed from the node. When ERDH gets home, it asks NMD to update its routing tables with the information that it is carrying. After NMD finishes updating its table, it sends a point-to-point UCR confirmation to RLF. The table for router D after the reconfiguration is complete is shown in Table 2.

5.2. Link Failure Recovery

The following example illustrates the behavior of Agent NetReconf when recovering from a link failure. Assume that the link connecting routers J and K fails in Figure 3. After the TIamAlive timeout expires, routers J and K start the leader selection phase, and each router assumes that its neighbor at the other end of the link has failed.

Phase 0. During leader selection, router J is selected restoration leader RLJ by routers {A, C, D}. Likewise, router K is selected restoration leader RLK by routers {E, G, H, I}.
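Leader selection in both examples reduces to picking the highest ID in the set of routers a NAF knows about, and, in the link-failure case, to the two provisional leaders comparing their IDs once they find each other alive. A minimal sketch of these two comparisons follows; the function names are hypothetical, and it assumes that router IDs order the same way as the letter labels used in the examples.

```python
from typing import Iterable, Tuple


def elect_restoration_leader(naf_ids: Iterable[str]) -> str:
    """Return the restoration leader: the highest ID in the NAF set.

    Every NAF evaluates this locally on the same set, so all NAFs
    arrive at the same leader (the argument of Lemma 4.1).
    """
    return max(naf_ids)


def resolve_provisional_leaders(leader_a: str, leader_b: str) -> Tuple[str, str]:
    """Given two provisional leaders that discover each other alive,
    return (leader, demoted): the higher ID stays leader and the
    other becomes a NAF of that leader."""
    if leader_a >= leader_b:
        return leader_a, leader_b
    return leader_b, leader_a


# Node failure example: routers adjacent to the failed router R.
print(elect_restoration_leader({"A", "B", "C", "D", "E", "H"}))  # H

# Link failure example: each side first elects its own leader ...
rl_j = elect_restoration_leader({"A", "C", "D", "J"})
rl_k = elect_restoration_leader({"E", "G", "H", "I", "K"})
# ... and the two leaders later settle on the higher ID.
print(resolve_provisional_leaders(rl_j, rl_k))  # ('K', 'J')
```

Because the election is a pure function of locally known IDs, it needs no message exchange, which is consistent with the zero contribution of leader selection to the agent count in Section 4.1.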
Figure 3. Link failure recovery

Phase 1. At J, four explorer agents are created: EJ1, EJ2, EJ3 and EJ4. At K, three explorer agents are created: EK0, EK1 and EK3. To start building the restoration paths, the explorers from each leader start migrating to search for the NAFs known to that leader. In the search process, explorer agents EK3 and EJ4 arrive at restoration leaders RLJ and RLK respectively. With the arrival of the explorers, both leaders realize that the router they presumed failed is in fact alive. Both leaders mark the link that connected them as faulty and move on to determine the new role of the supposedly faulty node in this phase. Router J determines that router K's ID is higher and becomes a NAF belonging to RLK. Router J then issues a point-to-point deactivate message to all its explorers to indicate that it is no longer the leader (see the pseudo-code in Appendix A). After J assumes its new role, Phase 1 continues as described in Section 3.1. Note that EK3 stays at J, since J became a NAF.

Phase 2. Since there are no overlapping restoration trees, the agents move to the next phase.

Phase 3. Each ERxi sends a point-to-point RTB message back home to make each EKj return to RLK. Each EKj, on its way back, learns routing information that it later shares with RLK. Phase 3 continues as described in Section 3.1 to the end. The table for router J after the reconfiguration is complete is shown in Table 4.

6. Conclusions

This paper has presented Agent NetReconf, a dynamic network reconfiguration algorithm that uses collaborative agents.
It was shown by complexity analysis that Agent NetReconf is significantly more efficient than message-based algorithms [1, 5], and that it reduces by more than one order of magnitude the number of interactions and message exchanges required to perform the network reconfiguration, as explained in Section 4.1.

The improvement in complexity achieved by Agent NetReconf is based on the fact that all the agent interactions occur at each router and the number of point-to-point communications that leave the router is minimal.

Table 3. Router J, original table

Dest  Port    Dest  Port    Dest  Port
A     0       F     3       SB    2
B     2       G     4       SD    0
C     0       H     2       SE    1
D     0       I     3       SF    3
E     1       SA    0

Table 4. Router J, updated table

Dest  Port    Dest  Port    Dest  Port
A     4       F     3       SB    2
B     2       G     4       SD    4
C     4       H     2       SE    1
D     4       I     3       SF    3
E     1       SA    4

Another important, but not obvious, contributor to Agent NetReconf's reduction in complexity is the representation of agent knowledge as an OWL ontology. Using OWL dramatically simplifies the way in which agents exchange information. For example, during leader selection an agent only has to query the router's knowledge base for the neighbor with the highest ID that is two hops away. Querying the OWL knowledge base is executed in constant time and does not require any agents to be created, so its contribution to the communication complexity is zero. This is mainly because the queries are executed locally and never leave the current router. This last property assures that neither the agents nor Agent NetReconf need to use any global network information.

The combination of the agent-based architecture and Agent NetReconf represents an important contribution to active networking, because the network takes control of all its tasks and uses intelligence as a way to provide improved reliability and quality routing.
The cognitive properties of the agents allow the reconfiguration algorithm to execute faster, because the agents retrieve the information from the knowledge base at the router and do not have to wait for a synchronous acknowledgment from any other router. This facilitates the optimal selection of the link that leads to the next node during the reconfiguration.

To conclude, Agent NetReconf is a low-complexity, intelligent, distributed dynamic network reconfiguration algorithm that is applicable to network computers with arbitrary topologies, is application-transparent, and is capable of isolating and tolerating multiple faulty links or nodes.
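To give a feel for the low-complexity claim, the two bounds compared in Section 4.1 can be evaluated side by side. The sketch below only plugs numbers into the expressions inside the O(...) terms; the parameter values are hypothetical and chosen purely for illustration, and asymptotic bounds ignore constant factors, so the ratio is indicative rather than a measured result.

```python
def agent_netreconf_bound(l_active: int, n_max: int, p_max: int) -> int:
    """Expression inside O(LActive * ((nmax * Pmax) + 1))."""
    return l_active * (n_max * p_max + 1)


def netrec_bound(n: int, l: int, n_max: int, p_max: int) -> int:
    """Expression inside O(N * (L + nmax * Pmax + N * Pmax))."""
    return n * (l + n_max * p_max + n * p_max)


# Hypothetical network: 20 routers, 30 links, 4 active links per router,
# at most 6 NAFs per failure, longest restoration path of 5 hops.
agents = agent_netreconf_bound(l_active=4, n_max=6, p_max=5)
messages = netrec_bound(n=20, l=30, n_max=6, p_max=5)
print(agents, messages)  # 124 3200
```

For these assumed values the NetRec expression is roughly 25 times larger, mainly because of the N ∗ N ∗ Pmax term, which the agent-based bound does not contain.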
References

[1] D. Avresky and N. Natchev. Dynamic Reconfiguration in Computer Clusters with Irregular Topologies in the Presence of Multiple Node and Link Failures. IEEE Transactions on Computers, 55(2), May 2005.
[2] N. Bennacer, Y. Bourda, and B. Doan. Formalizing for Querying Learning Objects Using OWL. In Proceedings of IEEE International Conference on Advanced Learning Technologies, pages 321–325, 2004.
[3] G. D. Caro and M. Dorigo. Mobile Agents for Adaptive Routing. In Proceedings of 31st International Conference on System Sciences (HICSS-31), 1998.
[4] H. Chalupsky, T. Finin, R. Fritzson, D. McKay, S. Shapiro, and G. Weiderhold. An Overview of KQML: A Knowledge Query and Manipulation Language. Technical report, KQML Advisory Group, Apr. 1992.
[5] J. Duato, R. Casado, A. Bermúdez, and F. J. Quiles. A Protocol for Deadlock-Free Dynamic Reconfiguration in High-Speed Local Area Networks. IEEE Transactions on Parallel and Distributed Systems, 12(2):115–132, February 2001.
[6] M. Garijo, A. Cancer, and J. Sanchez. A Multi-Agent System for Cooperative Network-Fault Management. In Proceedings of the First International Conference and Exhibition on the Practical Applications of Intelligent Agents and Multi-agent Technology, pages 279–294, 1996.
[7] M. Heusse, S. Guérin, D. Snyers, and P. Kuntz. Adaptive Agent-Driven Routing and Load Balancing in Communication Networks. Complex Systems, 1998.
[8] C. S. Hood and C. Ji. Intelligent Agents for Proactive Fault Detection. IEEE Internet Computing, 2(2):65–72, March–April 1998.
[9] N. Minar, K. H. Kramer, and P. Maes. Cooperating Mobile Agents for Mapping Networks. In Proceedings of the First Hungarian National Conference on Agent Based Computation, 1999.
[10] H. S. Nwana. Software Agents: An Overview. Knowledge Engineering Review, 11(3):205–244, Oct./Nov. 1995.
[11] R. Schoonderwoerd, O. E. Holland, J. L. Bruten, and L. J. M. Rothkrantz. Ant-Based Load Balancing in Telecommunications Networks. Adaptive Behavior, 5(2):169–207, 1996.
[12] D. L. Tennenhouse, J. M. Smith, W. D. Sincoskie, D. J. Wetherall, and G. J. Minden. A Survey of Active Network Research. IEEE Communications Magazine, 35(1):80–86, 1997.
[13] S. Wang, D. Xuan, R. Bettati, and W. Zhao. A Study of Providing Statistical QoS in a Differentiated Services Network. In NCA'03, Proceedings of IEEE International Symposium on Network Computing and Applications, pages 297–304, 2003.
[14] G. Weiss. Multi Agent Systems, A Modern Approach to Distributed Artificial Intelligence. MIT Press, 2001. ISBN: 0-262-23203-0.
[15] T. White, A. Bieszczad, and B. Pagurek. Distributed Fault Location in Networks Using Mobile Agents. In IATA 1998, Proceedings of the Second International Workshop on Intelligent Agents for Telecommunication, volume 1437, 1998.
[16] M. J. Wooldridge. The Logical Modeling of Computational Multi-Agent Systems. PhD thesis, University of Manchester, 1992.
[17] M. J. Wooldridge and N. R. Jennings. Intelligent Agents: Theory and Practice. Knowledge Engineering Review, 10(2):115–152, June 1995.
[18] Y. Yemini and S. daSilva. Towards programmable networks. In Proceedings of IFIP/IEEE International Workshop on Distributed Systems: Operations and Management, 1996.
[19] P. Zhang and Y. Sun. A New Approach Based on Mobile Agents to Network Fault Detection. In ICCNMC'01, Proceedings of the International Conference on Computer Networks and Mobile Computing, 2001.