Master Thesis - Simonis 3206963

Creating Realistic Agent-User Dialogs
P.P.G. Simonis
July, 2009
Utrecht University
Supervised by:
Drs. N.L. Vergunst
Dr. F.P.M. Dignum
Inf/Scr-08-66

Creating Realistic Agent-User Dialogs
Thesis number Inf/Scr-08-66
Student information
Student P.P.G. Simonis
Student card number 3206963
Master Program Agent Technology
Starting year 2007
Master’s Project information
Subject Social Interaction with Agents
Title Creating Realistic Agent-User Dialogs
Supervising institute
or organization
Utrecht University,
Institute of Information &
Computing Sciences,
Faculty of Science
Daily supervisor Drs. N.L. Vergunst
Second supervisor Dr. F.P.M. Dignum

Master Thesis – Creating Realistic Agent-User Dialogs – PPG Simonis 3206963
page 1 of 68
Acknowledgments
I would like to take this opportunity to thank everyone that helped or supported me during the
creation of this thesis. First of all, I would like to thank Frank Dignum, one of my supervisors, who
introduced me to available research at Utrecht University which lead me to chose this research
subject. I would also like to thank Nieske Vergunst, my other supervisor, who provided me with
related work in the form of interesting articles. Her own work also proved to be very helpful along
setting up my own research. I would like to thank both for their help and support throughout my
research. Their guide and availability, in spite of busy schedules, proved very useful whenever I had
questions. In restructuring my thesis, they both also provided great help.
Then of course my family and friends who helped me think outside the box. Explaining what I was
doing helped me look at it from an outsider’s perspective. This can be very useful to regain overview.
Of course all their proof-readings of my thesis have provided me with helpful remarks. Last but not
least, I want to thank them for their patience since my graduation has been my main focus and
priority these last weeks.

page 2 of 68
Abstract
When an agent and a user collaborate in performing tasks, communication comes into play. When
communicating with an agent, a user ideally wants it to be as communicatively skilled as a normal
person. When communicating with artificial intelligence like agents, a lot of aspects come into play.
Besides having to take care of any procedural steps, the agent will also have to be aware of the
domain and context its currently working in and use this in its reasoning and dialog.
To keep the research manageable, an example scenario on agent-user communication is introduced.
In this scenario, the agent assists the user in cooking. Not only will the agent guide the user through
the steps of a recipe, but it will also try to solve occurring problems. Both support and guidance
require communication.
Ideally the agent will be running on a companion robot. The agent will then represent the brain of
this robot. The recipe scenario introduced reduces a very large domain of companion robot to the
more simple task of supporting the user in preparing a recipe. Extending the scenario will ultimately
create a companion-agent to be used together with the a robot to create a companionship robot.
The object of this research is how to put the BDI concepts of agents to use in deliberating on dialog.
With this deliberation, the ultimate goal is to create realistic agent-user dialog. Goals and beliefs can
help structure not only the recipe procedure but also the flow of the dialog. Status information can
for example help guide the deliberation to a suitable answer to the user’s questions.

page 3 of 68
Table of contents
1 Introduction................................................................................................................................5
1.1 The iCat...............................................................................................................................5
1.2 Agents.................................................................................................................................6
1.3 Recipe scenario ...................................................................................................................7
2 Problem definition ......................................................................................................................9
2.1 Research objective ..............................................................................................................9
2.2 Associated questions.........................................................................................................10
3 Related work.............................................................................................................................11
3.1 Agents...............................................................................................................................11
3.2 Human-robot interaction...................................................................................................14
3.3 Research............................................................................................................................15
4 Approach to the scenario..........................................................................................................17
4.1 Elements to support the dialog..........................................................................................17
4.2 Use of BDI features............................................................................................................20
4.3 Recipe phases....................................................................................................................22
4.4 Recipe statuses..................................................................................................................24
5 Approach to the dialog..............................................................................................................26
5.1 Recipe selection phase ......................................................................................................26
5.2 Recipe planning phase.......................................................................................................27
5.3 Instructing phase...............................................................................................................28
5.4 Recipe finalization phase ...................................................................................................31
5.5 Overall communication......................................................................................................31
6 Agent architecture ....................................................................................................................33
7 Agent implementation ..............................................................................................................35
7.1 Recipe selection phase ......................................................................................................35
7.2 Recipe planning phase.......................................................................................................38
7.3 Instructing phase...............................................................................................................41
7.4 Recipe finalization phase ...................................................................................................43
8 Results ......................................................................................................................................44
8.1 Implementation result.......................................................................................................44
8.2 Resulting message based conversation..............................................................................45
8.3 Resulting ‘dialog’...............................................................................................................46

page 4 of 68
9 Conclusion ................................................................................................................................48
10 Future work ..........................................................................................................................49
10.1 Extended use of BDI elements ...........................................................................................49
10.2 Extended dialogs ...............................................................................................................50
10.3 User control.......................................................................................................................51
10.4 Ontology for ingredients and utensils ................................................................................52
10.5 Timing and planning..........................................................................................................52
10.6 Integration with the iCat....................................................................................................53
Bibliography......................................................................................................................................54
Appendix 1: Diagrams and flowcharts...............................................................................................55
Recipe phases and statuses...........................................................................................................55
Dialog flowcharts..........................................................................................................................56
Appendix 2: 2APL illustrations...........................................................................................................59
Recipe selection phase..................................................................................................................59
Recipe planning phase ..................................................................................................................60
Instructing phase ..........................................................................................................................61
Recipe finalization phase...............................................................................................................62
Implementation fragments ...........................................................................................................63
Appendix 3: Example dialog ..............................................................................................................65
The recipe.....................................................................................................................................65
The dialog.....................................................................................................................................66

page 5 of 68
1 Introduction
Communication nowadays has many forms. Conversation alone can be performed in various
different ways. Mobile phones for example allow both use of text messages and phone conversations
whereas the internet allows chat conversations which can be extended with video and voice when
using a webcam and microphone. The conversational form dialog is the focus of my research. This, in
definition, refers to a conversation between two or more people. In my research however, we
assume a one-on-one dialog.
In computer science, software agents are introduced as pieces of software that act for a user or other
program. The idea is that agents are not strictly invoked for a task, but activate themselves. As with
humans, agents also communicate amongst each other. Agents however often have a good reason
for communicating whereas humans sometimes just “chat meaningless”. The agent’s reasons for
communicating could be retrieving information on the current world situation or trying to find out
what goals their partners have. The information they retrieve will then be put to use in their own
deliberation. Communication between agents is generally text-based because text messages are
relatively easy to generate and recognize when compared to speech for example.
The research I have done will form a contribution to Social Interaction with Agents’ (SIA) research, an
existing internal research program at Utrecht University. The goal of the SIA research is to realize an
agent with which a user can have a social interaction. Besides user-to-user or agent-to-agent
communication, users can also interact with agents and vice versa. This research is on exactly this
scenario; a dialog between an artificial agent and a user (human). An important keyword of my
research however is still missing in this description: realistic. My research will investigate how agent-
user dialogs can be made realistic. What exactly makes a dialog realistic will be explained later on.
1.1 The iCat
Not only linguistic skills but also facial expressions have to be
taken into account. In order to give the artificial agent used for this
research a face, an iCat will be used. The pilot for the iCat was
developed and released by Philips in August 2005 (1). In the years
up to now, the iCat has been developed into a robotic agent
suitable for user interaction. The iCat is modeled off humans and
made for studying.
Philips describes the iCat as “a user interaction robot for studying
intelligent behavior and social interaction styles in the home. iCat
is aimed at developing software technologies for decision making
and reasoning. iCat 's interaction style is modeled of the
interaction between humans using speech, facial expressions and
intelligent behavior” (2).
Besides being a friendly companion as a game buddy, iCat can help to control the ambient intelligent
home switching on lights, heaters and other appliances. iCat can also help out in other duties such as

page 6 of 68
keeping track of everyone’s calendars and messages. iCat can even give advice when selecting music
and movies. According to Philips, “iCat has already proven to be a valuable research tool for studying
intelligent interaction styles”.
The iCat can be programmed by using Philips’ Open Platform for Personal Robotics (OPPR) software,
which is another advantage. The incorporated Philips’ Dynamic Module Library (DML) can be used to
make your own components, such as vision and speech recognition components (2). This technology
can be used to allow incorporating agents onto the iCat. This will allow the agents to control the iCat
by using them as ‘brain’ of this robot.
As illustrated on the left, OPPR is represented in 4 parts;
architecture, animation, connectivity and intelligence. To skip a
technical description of all inner workings of the iCat, I will only
briefly describe the OPPR system. For a more detailed
description of all the components, the iCat research community
(2) is a good source of information.
Inside the Architecture part of the OPPR system is the DML
Middleware solution. The DML Middleware defines a software
component model for developing modular applications. Every
module is an executable software component, which is
designed and implemented based on the DML component
model. This would in theory allow agents to run on the iCat. However, research on whether this
actually can be done and if so how, still needs to be done at this time.
Next to this opportunity to develop modular applications, the OPPR system also comes with an
Animation Editor and Animation Module. These can be used for creating animated motions for the
iCat robot by providing precise control over every actuator of the iCat. These can be used to generate
facial expressions for example, giving the agent an even more realistic face. Inside the Intelligence
part of the OPPR system is room for scripting to develop dialogues. This will however be taken care
of by the agent.
1.2 Agents
As already mentioned above, an agent will be used to function as a brain for the iCat robot. The
realistic requirement is actually a reason why agents have been chosen to represent the artificial side
of the dialog. Agents are used to model human behavior which adds to the realism of the dialog.
Modeling a human onto an agent should be more easy than programming it in a ‘normal’
programming language. The agent will eventually use the available sensors and actuators of the iCat
in order to be able to communicate with the user.
For programming the agent, the choice has been made for “A Practical Agent Programming
Language” or in short, 2APL (pronounced: double-a-p-l). This language has been created at Utrecht
University. 2APL is a BDI-based agent-oriented programming language with a formal syntax that
supports an effective integration of declarative programming constructs such as belief and goals and
imperative style programming constructs such as events and plans. On top of that, 2APL allows an

page 7 of 68
effective and efficient implementation of an Multi Agent System (MAS) architecture through
combination of individual cognitive agents, a shared environment and multi-agent organization.
Primary reason to use 2APL is the ability to create cognitive agents based on the Beliefs, Desires and
Intentions (BDI) model. The use of this model allows using constructs as goals and beliefs to support
structuring both the procedure as the flow of the dialog. Another advantage of using 2APL over
another agent programming language, is the flexibility of plans. First of all, multiple plans concerning
the same goal can be separated by using a belief query (to check what the agent currently beliefs to
be true) to allow executing them in the correct order or at the correct time.
If the execution of an agent’s action fails, this failure can be repaired in 2APL with so-called plan
repair rules (PR-rules). A PR-rule indicates that if the execution of an agent’s plan (i.e., any plan that
can be instantiated with the abstract plan) fails and the agent has a certain belief, then the failed
plan should be replaced by another plan. When a matching PR-rule is found, this replacement can be
done during runtime.
Last but not least, 2APL has “programming constructs to improve the practical application of existing
BDI-based agent-oriented programming languages that have formal semantics” (3). These
programming constructs include operations such as testing, adopting and dropping declarative goals,
different execution modes for plans as well as event and exception handling mechanisms.
1.3 Recipe scenario
Having a conversation with a artificial agent can take lots of different forms and can be on a limitless
number of subjects. To keep such a large area manageable, a recipe scenario is introduced. In this
scenario, the user is to prepare a recipe together with an iCat. The iCat will provide help to the user.
Helping in this case goes beyond just providing cooking information in the form of reading
instructions, keeping a timer or sounding an alarm. The iCat should also be able to offer performing a
task for the user for example. Other help could be explaining an action in the recipe in a more
detailed way. Providing support and solutions when problems occur of course is another vailuble way
to help the user.
In this scenario the iCat agent can be seen as a kitchen help with whom the user can have a dialog.
This dialog of course is set in the cooking context. Because of the kitchen setting and the cooking
context, the conversational boundaries are limited. This will make the research, and projects
belonging to it, more manageable. Other reasons for using a cooking scenario as Grosz and Kraus (4)
have recognized are:
(1) unlike tasks such as constructing space stations, making a meal is an undertaking for which
almost all readers have first-hand knowledge and good intuitions.
(2) This task contains the essential elements of the typical collaborative task in which computer
systems and robots in particular may be expected to participate:
- limited resources (including time constraints),
- multiple levels of action decomposition,
- a combination of group and individual activities,

page 8 of 68
- partial knowledge on the part of each agent, and
- the need for coordination and negotiation.
As Grosz and Kraus also mention, the previous work using this domain (such as the work on plan
recognition) provides a baseline for comparison. So using this scenario not only makes it more
manageable but also allows using ideas of other research. This has also been a reason for the
researchers at Utrecht University to also use this recipe scenario.

page 9 of 68
2 Problem definition
This chapter will define the primary object of my research. There is a main question behind my
research for which, in doing this research, I hope to find an answer. Alongside this main question,
sub-questions form. These might form along the line of the main question or submerge as several
aspects of the research are looked into.
2.1 Research objective
To start this chapter, I will introduce the main object of my research and the related question. The
title of this document already gives away the research objective: creating realistic agent-user dialogs.
Of course these agent-user dialogs will ideally be just as realistic as when the user is having a dialog
with any other person. The question related to this can then only be: “How do you create realistic
agent-user dialogs?”. Of course performing the research in the domain as mentioned earlier, this will
also reduce the size of the question. The pursuit of creating a realistic agent-user dialog will however
not change.
This main question brings along several different other questions, the first and most intriguing
stating: “When can we call an agent-user dialog realistic?” This for me is mostly determined by what
the agent asks the user and how the agent reacts to the user. Of course during the recipe procedure,
the agent will expect input from the user. The agent will communicate with the user in order to get
the information it needs. Besides the expected informational needs of the agent, the user might also
require information from the agent. These questions, for example a question on what to do next,
might be asked at unexpected timing. Even problems with kitchen utensils might occur during
execution of actions. A water boiler malfunctioning or the water supply temporarily being cut off
could have drastic consequences for a recipe. When these problems occur, the user might consult
the agent for a suitable solution.
The agent should be able to react to remarks and questions of the user. This requires the agent to
relate the question of the user to the current situation. Understanding and answering the user is
needed in order to reach their goal. Before the agent can react to the user, it needs to process the
provided information and deliberate on its answer.
Besides finding out what the user is asking, the agent also has to ask itself why the user asked this
question. To be able to do this deliberation, the agent should be aware of the conversational context
as well as the domain context. An interesting thing to look at along these lines is; What
conversational information provided by the user (or retrieved from the world) the agent can use in
its deliberation. Of course even as interesting is the accompanying question on; In what way
conversational and environmental information should be stored inside the agent to allow him to
work with it efficiently when communicating with a user.
Another feature which adds to making dialogs more realistic is the fact that the dialog is goal-
directed. This ensures utterances continue on previous remarks, adding structure to the dialog.
Having a goal also allows for questions in lines of achieving it. The agent could for example request

page 10 of 68
the user to do something in line of the goal they are trying to reach. Of course this goal can also be
used by the agent to deliberate on the question ‘why did the user ask or tell me this?’.
As an illustration of the sort of dialog we want to achieve, an example dialog is added in appendix 3.
This dialog feels realistic, to us at least, in the questions which are asked and remarks that are made.
Cohen and Levesque (5) used an dialog which did not feel realistic. Their example of a phone
conversation between an expert and an apprentice, to my opinion, has too many confirmation
points. It seems each instruction followed by a request for confirmation. The conversation itself
however is over a different medium (by phone). The lack of eye-contact might cause this need for
more confirmation checks. Being able to see each other is an advantage in the kitchen scenario when
needing confirmation. The iCat could ideally only use its camera to confirm the user has started
working on executing an action.
2.2 Associated questions
A considerable amount of the research will be on examining how to implement the kitchen and
cooking domain into an agent. For example, how will recipes be represented? Of course there’s a lot
more to this then just copying the text from a cookbook. The agent needs to be able to work with it.
A second interesting section of this research is the information the agent keeps on the user. When
for example the agent believes the user can perform a certain action, it doesn’t need to explain to
the user how it is performed.
Also interesting in this research are the different phases of cooking a dish combined with
communication. The planning of a recipe requires communication just as the execution of the recipe
actions themselves. Communication in different phases might have totally different meaning. This
means the agent needs knowledge of the current situation in order to be able to communicate
correctly but also realistically.
My research will therefore examine how beliefs (on the current situation), goals and communication
can be combined to create more realistic dialogs. To illustrate the ideas produced alongside this
research, a 2APL agent will be created in which these techniques will be implemented. It is perhaps
the most interesting to see how the agent combines status information and received messages in its
deliberation in order to come to a suitable reply or course of action.

page 11 of 68
3 Related work
As the title reveals, this chapter will focus on work related to my research on creating realistic agent-
user dialogs. Related research is discussed and examined on whether it is useful to use along my
research or not. Some research might link closely to my research objective where other concepts
form a general idea which might help implementation alone. Several general concepts like the
definition of an agent as well as an introduction to the BDI model and 2APL will pass the review first.
3.1 Agents
Because the notion of ‘an agent’ differs across literature, I have chosen a definition of an agent.
When I started my Agent Technology master at Utrecht University, this definition was given to
illustrate the concept ‘agents’. This definition originates from M. Wooldridge’s book “Reasoning
about rational agents” (5).
Agent definition
Agents are software / hardware entities that display a certain degree of autonomy / initiative and are
proactive / goal-directed. Agents are mostly described in terms of having ‘mental states’ (‘strong’
notion of agency). They show informational and motivational attitudes.
Furthermore, agents have the following properties:
1. Situated: acting in an environment.
2. Reactive: able to react adequately to (unexpected) situations arising in the environment.
3. Pro-active: setting / pursuing its own goals.
4. Social: able to co-operate with other agents
This definition may look extensive for such an abstract notion. If you read carefully however, you can
see this definition only hints on some agent features and displayed behavior. It leaves out how these
should be implemented. This is left for the creator of the agent to decide. I have used this definition
throughout several different courses at Utrecht University and found it suiting in every situation.
The BDI-model
In the above given definition, terms like ‘mental states’ and ‘goals’ already provide hints towards the
BDI-model, a software model developed for programming intelligent agents. The letters BDI stand for
Beliefs, Desires and Intentions. Besides these three concepts, a BDI-agent also has Plans at its
disposal. These four concepts will be briefly discussed below.
Beliefs represent the informational state of the agent. In other words, the agent’s beliefs about the
world including itself and other agents. Beliefs can also include inference rules, allowing forward
chaining to lead to new beliefs. Often, this information is stored in a database (a belief base). By
using the term belief instead of knowledge, the fact that the agent’s beliefs might be false is
incorporated.
Desires (or goals) represent the motivational state of the agent. They represent objectives or
situations that the agent would like to accomplish or bring about. Examples of desires might be: ‘go
to the party’ or ‘become rich and famous’. Because the above given agent definition uses the term
goals, I will also use this term rather than using desires.

page 12 of 68
Intentions represent the deliberative state of the agent: that what the agent has chosen to do.
Intentions are goals to which the agent has to some extent committed. In implemented systems, this
means the agent has begun executing a plan.
Plans are sequences of actions that an agent can perform to achieve one or more of its goals. Plans
may include other plans: a plan to make spaghetti and meatballs for example includes a plan to boil
water to prepare the spaghetti in.
JADE and 2APL
JADE is an abbreviation for Java Agent DEvelopment Framework (6). JADE is a software framework
for multi-agent systems which is created in Java. The JADE platform allows coordination of multiple
agents. Besides coordination, JADE also offers the ability to use an agent communication language
for communication between agents. Every JADE platform contains a main container containing an
agent management system (AMS) and a Directory Facilitator (DF) agent. The AMS is the only agent
that can create and kill other agents, kill additional containers, and shut down the platform. The DF
agent implements a “yellow pages”-service which advertises the services of agents in the platform so
other agents requiring those services can find them.
As mentioned earlier 2APL is an agent programming language created at Utrecht University (7). The
link between 2APL and JADE is that the 2APL platform is built on JADE (6) and uses related tools.
These tools for example allow monitoring mental attitudes of individual agents, monitoring their
reasoning and monitoring communications. 2APL is provided with its own Integrated Development
Environment (IDE).
All 2APL agents consist of the following parts: BeliefUpdates, Beliefs, Goals, Plans, PG-rules, PC-rules
and PR-rules. Next to the beliefs and goals already introduced with the BDI model, belief updates and
rules are introduced. The rules are used for different purposes. PG-rules specify the plans an agent
can generate to complete a certain goal. The PC-rules are procedural rules. These generate plans as a
reaction to either incoming messages, external events or the execution of abstract actions. The PR-
rules are rules for plan repair. These will get triggered when a plan failed and needs to be replaced
(repaired) by another plan. The belief updates are rules with pre- and post-conditions for updating
the agent’s beliefs.
The basic BDI-elements thus return in 2APL in the form of Beliefs, Goals and Plans. The Beliefs are
represented in logical form in the belief base. This belief base consists of facts (about the world or
the agent itself) which the agent holds to be true. These facts can also be formed as formulas
containing variables which the agent can then substitute to reason with its beliefs and deduce new
information. The goal base represents the goals the agent currently has. The goals are in formula
form as well. Last but not least, there is a plan base. These plans are made up out of actions the
agent can perform. The plans can be accompanied by a head. This head then forms a condition
needed to hold before the plan will be executed.
The last element of 2APL I will discuss here is the deliberation cycle. This cycle contains steps which
will cause the agent to adopt plans and execute the corresponding actions in order to reach the goals
the agent has. The discussion of this cycle will illustrate the deliberation process within a 2APL agent.

page 13 of 68
start Apply all PG-rules
Execute first action
of all plans
Process external events
Process internal events
Process messages Remove reached goals
Sleep until external
events or messages
arive
Rules applied,
Plans executed,
Events or Messages
Processed?
no
yes
The 2APL deliberation cycle illustrated above consists of 5 steps. The first is to apply all PG-rules. This
will generate plans for the agent according to its goals. Executing these plans will enable the agent to
(come closer towards) completing the goal. Second is to execute the first actions of the plans the
agent has generated. Next is to process external events such as received messages. These can also
lead to adoption of a plan through PC-rules. Step 5 is to process internal events. These contain for
example errors occurring on execution of plans.
After these 5 steps, the agent will check if it has done something in the last cycle. If so, it will perform
the cycle again. If not, the agent will enter sleep mode until external events or messages arrive.
When they do, the agent enters the cycle again. This way, the agent will generate plans (intentions)
to achieve active goals. This deliberation cycle is also adaptable so the programmer can actually alter
the order in which the agent handles events and plan generation.
More on 2APL such as the latest news can be read on the official website with Utrecht University (8).
How to program agents in 2APL is described in the 2APL user guide available through that same
website.

page 14 of 68
3.2 Human-robot interaction
Communication nowadays has various forms that all are very different from each other. Just like
normal communication, communicating with artificial intelligence (AI) has gone through several
phases. At first the communication with AI was limited to text input. Nowadays, communication with
AI is growing towards Natural Language Processing (NLP) and Speech Processing. Although NLP is also
needed with text-based input, speech processing clearly refers to verbal communication with AI.
Human-robot interaction (HRI) studies interaction between users (people) and robots. The basic goal
of HRI is to develop principles and algorithms to allow more natural and effective communication
and interaction between humans and robots. This requires knowledge on multiple disciplines like
natural language understanding and robotics but also artificial intelligence and social science. Having
these principles and algorithms, the end-goal is creation of robots which can coordinate their
behaviors with the requirements and expectations of humans.
Although the HRI research has a very broad range, several aspects belonging to HRI are interesting
along the lines of my research. “Human-robot collaboration” for example is a field to which my
research could be accounted a member of. Although the iCat is not as skilled in performing actions as
the user (caused particularly due to a lack of limbs), the user and iCat are collaborating in performing
a recipe. Of course the iCat compensates its lack of actuators by supporting the user on other fields
such as keeping information on how to perform a recipe.
Second, the field “Social robot” also belongs to HRI research. This is of course the ‘ultimate’ goal of
the Social Interaction with Agents research at Utrecht University; to create an agent which can be
integrated with a robot to create a social robot. However, as the term ‘social robot’ describes, we
need a robot as physical representation of the agent. This could be taken care of by enabling the
agent to run on the iCat.
When looking at talking to artificial intelligence, we can also discover robots described as
“chatterbots”. A chatterbot is a type of conversational agent, a computer program designed to
simulate an intelligent conversation with one or more human users via auditory or textual methods.
Most chatterbots “converse" by recognizing cue words or phrases from the human user. This allows
them to use pre-prepared or pre-calculated responses which can move the conversation on in an
apparently meaningful way without requiring them to know what they are talking about.
Chatterbots however usually do not take part in goal-directed conversations. They “just talk”. Most
chatterbots do not attempt to gain a good understanding of a conversation to allow them to carry on
a meaningful dialog. This is a large difference considering our target agent, which is clearly directing
its conversation based on preparing a recipe and needs understanding of the conversation at hand.

page 15 of 68
3.3 Research
Several fields of research provide a background to my research. Some work will relate closely
whereas other work can be used in specific areas. Cohen and Levesque in their paper “Confirmations
and Joint Action” analyzed features of communication that arise during joint or team activities (9).
When communicating on joint action, a lot of elements come into play. For example, are both parties
required to confirm everything that’s said? And how will things be confirmed? Can one assume
messages will reach the receiver? Cohen en Levesque specify joint intention to bind teams together.
Furthermore, they claim an account of joint intention is needed to characterize the nature of many
situation-specific expectations properly.
Cohen and Levesque assume “that the agents jointly intend to perform together some partially
specified sequence of actions.” In the recipe scenario, both user and agent intend to prepare a
recipe. What actions are used to reach the final result is not specified, just the result they would like
to reach. They make an additional statement that “commitment to an action sequence gives rise to a
commitment to elements of that sequence”. This can be illustrated as follows in the recipe scenario;
if you want to make spaghetti Bolognese, you have to make the required pasta sauce.
On confirmation in conversation, Cohen and Levesque write: “both parties should be committed to
have the goal to attain mutual belief of successful action”. They argue because both parties jointly
intend to engage in the task, the requirement arises for confirmation of actions. One party is
supposed to confirm whereas the other party may hold him to it. This can be used in the recipe
scenario when the agent has instructed the user. The user is suppose to react to this instruction
whereas the agent may request confirmation from the user.
Other closely related research is the work of N. Vergunst et al. on “Agent-Based Speech Act
Generation in a Mixed-Initiative Dialogue Setting” (10). Their paper presents a system that acts both
proactively and reactively, accounting for a mixed-initiative interaction. Furthermore, they use the
same scenario of a cooking assistant alongside treating some general principles for mixed-initiative
systems.
Their system is proactive in that it is goal-directed and that it takes initiative to check whether a
certain recipe is possible given the circumstances, instead of blindly following the user's commands.
It also keeps track of the user's capabilities and tailors its instructions to them. The system is also
reactive, waiting with its next instruction until it believes that the current task is done. The system
uses the concept of joint goals, which translate into joint plans that consist of tasks that will be
performed by one of the participants. It also uses communication actions to make sure all
participants are aware of the current status of the joint plans and goals. As will follow in the
illustration of my own research, I’ve used a large amount of the presented concepts in my own
research and agent implementation.
An interesting field belonging to Human-Robot Interaction research is “Dialog management”.
Research done in this field concerns use of a dialog system in robots (or agents) to converse with a
human with a coherent structure. Dialog systems have employed text, speech, graphics, gestures and
other modes for communication on both the input and output channel. Some systems include all
forms of input and output channels where others only incorporate a subset.

page 16 of 68
An example of a dialog system is OVIS (11). OVIS stands for: Openbaar Vervoer Informatie Systeem,
which is Dutch for ‘Public Transport Information System’. OVIS is developed in order to make it
possible to answer more calls on information about various forms of public transport. They decided
to use a spoken dialog system to try and automate part of this service. This dialog system is
specialized on dialog input and output. Since it concerns communication through phone, gestures,
text and graphics are not needed.
There are lots of dialog systems already out there. These can be categorized alongside different
dimensions such as modality which concerns text-based systems, spoken dialog systems and multi-
modal systems. Another way to categorize dialog systems is by separating them according to
initiative. The user might take initiative in the system or the system might take the initiative. Of
course a mixture is also possible. A last possible way of categorization I will introduce is on
application. Dialogue systems can of course be categorized based on what they were designed to be
used for. This could be ‘information service’ (such as OVIS), entertainment, companion systems or
even healthcare.
What is principal to any dialog system, no matter what it is designed for, is the dialog manager. The
dialog manager is a component that manages the state of the dialog and the dialog strategy. In the
created agent, we will see this is taken care of by the dialog module. The dialog strategy can be any
of the following three strategies; “System-initiative”, “User-initiative” or “Mixed-initiative”. In
“System-initiative” strategy, the system is in control to guide the dialog at each step while with the
“User-initiative” strategy, the user takes the lead and the system responds to whatever the user
directs. Of course in the “Mixed-initiative” strategy users can change the dialog direction. The system
follows the user request, but tries to direct the user back the original course. This is the most
commonly used dialog strategy in today's dialog systems. As will be illustrated further on also, the
current implemented agent applies the “system-initiative” strategy.

page 17 of 68
4 Approach to the scenario
This chapter will elaborate on some principles that are of interest for this Master’s Project. Based on
the principles described in this chapter, I’ve taken on this Master’s Project. Several aspects which the
agent can use in order to come to a realistic answer to the question or information provided by the
user will be discussed. Furthermore, the way BDI features are used throughout the recipe scenario
will be presented, as will be the approach I took towards splitting the recipe scenario into
manageable phases. Last but not least, the expected dialog during these phases will be presented
alongside flowcharts.
4.1 Elements to support the dialog
As mentioned in the problem definition, the agent needs to keep the dialog realistic along several
lines like being able to answer to unexpected remarks of the user. To be able to do this, the agent
can use conversational features, shared concepts between user and agent and also internal data like
beliefs and or goals. This section will elaborate on the elements the agent can use to support keeping
the dialog realistic.
4.1.1 Context
In order to come to an realistic answer to the user’s questions or remarks, the agent can use two
contexts available. The agent can make use of the conversational context as well as the domain
context. It can use this information to guide its deliberation on the provided information. This
deliberation should then lead him to take action like answering the user’s questions, performing an
action or adopting a goal. The following two examples illustrate this through a question the user asks
and a remark the user makes.
Imagine the user asking the question: “Can you turn on the stove?”. The agent should realize using
the kitchen setting in context of cooking a dish, the user wants to use the stove in the cooking
process. Using the recipe at hand, the agent can determine the stove is needed to for example boil
water in a pan to cook the pasta in. The agent can of course also first ask: “Why do you want me to
turn the stove on?” When the user replies: “To boil water on”, the agent could react with a simple
conformation and turn the oven on. The agent could also provide the user with an alternative such
as: “You could also use a water boiler to do this. Do you want to do this instead?”.
Another situation is where the user informs the agent: “The water boiler can’t be turned on”. Using
its internal recipe, the agent can deduce the user needs the water boiler to boil water to put the
pasta in. Of course it could also ask “Why do you need the water boiler?”. Now knowing water needs
to be boiled and the water boiler is broken, the deliberation could lead the agent to an alternative
approach. The agent can reply: “I suggest to use a pan to boil water”.
In the above situation, the user took initiative and asked or informed the agent. The agent will act
differently with other levels of user control. When the user has a high level of control, the agent can
wait until the user asks questions on steps to take (as shown above). When there’s a low level of user
control, the agent should ideally take initiative and walk the user through the process. The agent will
help the user performing the recipe by asking the user to perform the actions the recipe consist of.

page 18 of 68
This way of instructing the user indicates low level user control. Since the user can always ask
questions or make remarks, the level of user control can also switch when the user takes initiative.
4.1.2 Shared concepts
Another aspect to help structure the conversation is the fact that the agent and the user share
certain things. First of all, they’re joined by the task of preparing a recipe. They both have the same
goal, a joint goal (4). Because the goal is shared, the agent knows the goal of the user. The agent
therefore knows that anything the user attempts will most likely be to achieve its goal of cooking the
selected dish. This goal can be reached by executing a plan. Since both participants (user and agent)
are collaborating to reach the goal, this plan can be described as a collaborative plan. In the sections
below, both the joint goal and the collaborative plan will be discussed as well as their use in the
scenario.
Joint goal
The joint goal will trigger plans inside the agent which leads the agent to handle the needed steps in
creating a dish together with the user. Until the recipe has been prepared, the goal will keep
triggering plans, it will keep the agent committed to executing plans.
Throughout the cooking process, the agent will have different sub-goals which bring it closer to
reaching its final goal of completing the recipe. Whenever the agent is interacting with the user,
these sub-goals will provide information of the current situation. This can be used when deliberating
on an answer to a question of the user. However, knowing the user wants to cook a dish is not very
detailed information.
To allow for more detailed status information, a recipe status can be created. This status illustrates
the agent’s location in the recipe procedure like “searching for a recipe” or “instructing”. The recipe
status will change along the progress the agent and user make in executing the recipe. The status can
be used to put questions of the user into context and of course to keep the agent updated on what
steps to take next.
Collaborative plan
Of course because of the cooperation between the user and agent, a collaborative plan (4) should
form. The most important feature of the collaborative plan in this scenario is the question of who
(agent or user) will perform which action. To once again keep it simple and manageable, the choice
has been made to use a single person (user or agent) per action. The agent will either tell the user it
will perform the action or ask the user to perform it.
Grosz and Kraus in their work on collaborative plans also talk about partial and complete plans. An
agent has a complete plan when it has completely defined the way in which it will perform an action.
Furthermore, they make a distinction between individual (formed by a single agent) and shared plans
(formed by a group of agents). These two separations lead to a number of possible plan-type
combinations. As will be illustrated later on, the agent will create the plan for executing the recipe.
The user is introduced to this plan because he1
initialized the procedure (he decided to prepare a
recipe). Since the iCat is a simple robot without a lot of practical cooking skills due to a lack of limbs,
1
Of course the user could also be female. In the continuing of this document however, we will use ‘he’ to refer
to the user. To avoid confusing the user with the agent, the agent will continue to be referred to as ‘it’.

page 19 of 68
the user will have to perform most actions himself. The agent will thus come up with a plan to
prepare the recipe and request the user to perform most actions.
4.1.3 Internal recipe
Another guide in structuring the conversation is the internal recipe the agent has to fall back on. We
assume the agent has a database of recipes available to work with. Each recipe consisting of sub-
steps following each other to prepare the recipe. Responsibility for creating and managing this
database lies with the programmer of the agent. Alongside implementing the agent, an example
scenario is implemented according to the dialog illustrated in appendix 3. The recipe database used
in this example implementation only contains 2 recipes, enough for a simple example. When the
agent will actually be used in the kitchen, this database of course needs much more recipes. The
agent will work with this recipe database when searching a recipe and creating an instruction list. The
programmer will be responsible for adding, removing or altering any recipes in this database.
The internal recipe is especially useful in the recipe scenario when the agent is instructing the user.
The agent can divide the tasks to be done between itself and the user. Assigning a task to the user of
course means the agent will need feedback. First of all on whether the user accepts, second on if the
task has been done.
Another interesting thing to look at here is when the user encounters a problem. Of course people
sometime ask a question when they’re stuck. This can also be seen as initiating a sub-dialog.
However, if the user fixes the problem without help of the agent, the agent probably will not have
noticed this problem ever occurred. When the user is less experienced, he could of course ask the
agent what to do in the current (problematic) situation. When the agent has just instructed the user,
it can use this information in trying to find what went wrong and possibly fix the problem. Another
approach the agent might take is analyzing the action and finding an alternative which also fits within
the recipe.
Take a pan Boil water
Boil water in
water boiler
Fill pan
with water
Turn on stove
Put pan
on stove
Wait for water
to boil in pan
Put boiled
water in pan
Put pasta
in pan
Wait for pasta
to be ready
Drain water
Optional:
Put a liitle olive
oil in pan
Use lid of pan
to drain water
Put pasta in
colander
Put the pasta
back in to the pan
Cook pasta
Put pan
on stove
The agent’s internal recipe can be represented in tree form as illustrated above. Nodes represent
actions and children of a node represent their sub-actions. As you move down one level in the tree,

page 20 of 68
the actions get more and more atomic. We assume all composite actions (including the recipes
themselves) can be decomposed into these atomic actions.
Some actions might have an overlap in sub-actions (such as boil water as illustrated in the recipe
tree), others might have two completely different approaches (like draining the water). Because
some actions can be done in different ways, this means the agent can make a choice. These
alternative approaches to a single action can also be used when for example one of them failed.
This internal representation will be a guide to what action the agent should ask the user to do next.
The internal recipe will help structure the dialog since the agent can use it to keep the instructions in
the correct order. As already mentioned above, it also helps the agent to determine what was the
last instruction given to the user when a problem occurs.
4.2 Use of BDI features
This section will illustrate how the BDI features are used inside the agent to manage for example
storage of recipes, creation of the instruction list and generating intentions. The BDI features will, as
illustrated below, not only be used for procedural steps but they will also be used in conversational
steps and efficient storage of facts.
4.2.1 Plans
The plans the agent has available are used to help structure the scenario in defining the actions to be
performed after each other. This sequence of actions can of course also contain nested plans. This
allows for a layered approach where high level plans can be worked out into sub-plans. This is exactly
how I started working out the scenario: first on a high level and then down to lower and lower levels.
The highest level of steps I created into recipe phases, which I will describe after this section on the
use of BDI features. The level below define the agent’s responsibilities per phase. When for example
selecting a recipe, the agent is responsible for searching its database for a recipe, checking the recipe
and confirming the chosen recipe with the user. On an even lower level, we arrive at the plans. These
implement the agent’s actions in sub-steps. Having this sequence of steps available helps structure
the scenario for the agent.
4.2.2 Goals
Goals in 2APL can be used to trigger PG-rules. These rules will create plans to reach these goals. As is
defined by the BDI-model, goals of the agent are used to represent objectives. Goals can be used to
keep the agent focused on finishing the recipe procedure. During the execution of the recipe
procedure, other goals come into play to keep the agent committed to achieving a certain sub-goal
such as creating the instruction list.
4.2.3 Beliefs
When going through a recipe scenario, the agent needs to keep track of several kinds of beliefs. Not
only does the agent need to keep track of the current state of the kitchen containing available
ingredients and utensils. Its beliefs should also incorporate recipes and their corresponding
(sub)actions as well as skills of both the user and the agent itself. To allow working with these beliefs,
they need to be stored efficiently in the agent’s belief base. Updating these beliefs is another
concern which will also be discussed in this section.

page 21 of 68
Storage of facts
For storing several aspects of the recipe procedure such as skills or the instruction list, choices have
been made. Storing beliefs in a way the agent can easily work with them is something not to be taken
lightly. Deliberation needs easy access of required facts.
For storing recipes in the agent’s belief base, lists are used. A recipe is stored alongside a code for
identification, a description for searching purposes and the first level of sub-actions inside a list. This
list can consist of atomic actions, composite actions or a combination of these two. The sub-actions
of a composite action are also stored in a list-form. Composite actions can therefore themselves be
seen as a sub-recipe.
Storing decompositions of composite actions alongside the recipes allows for reuse of these
composite actions in other recipes. It also allows for different decompositions of one action. Boiling
water can for example be done in a water boiler but also in a pan. Atomic actions are also stored
alongside a list but this list is empty since they have no sub-actions. As already mentioned in the
previous section when discussing the internal recipe, any defined composite action (a recipe can also
be seen as a composite action) should be able to be decomposed down to the level of atomic
actions.
The atomic action level has been introduced to prevent having to decompose an action like ‘take a
pan’ any further. This level is the lowest in action decomposition we will use in recipes. When
creating test recipes, this level of actions was determined by taking into account ‘basic kitchen skills’.
If the iCat agent is to help someone in preparing a recipe, we assume for example this user knows for
example what a pan is.
Ingredients are stored alongside their available amount (stock(eggs,12) for example) whereas
utensils are stored alongside their current status (like ‘available’, ‘filled with water’ or ‘dirty’). For
storing skills, a very basic approach has been chosen. The skills are saved as a belief statement
coupling the agent or a user to an action when the required skills are available. The actions
referenced in these statements could even refer to a (sub)recipe. The example below illustrates user
‘peter’ having the required skills for the action ‘boil egg’.
Beliefs:
skill( peter, boil_egg ).
As is done with recipe actions, a nested approach has also been used for the instruction list. The sub-
lists in the instruction list indicate sub-recipes. Sub-recipes could be separate parts of a recipe, like
making the Bolognese sauce to go with the pasta. They could also be actions with a lot of sub-actions
grouped together in a sub-list.
Updating beliefs
2APL incorporates BeliefUpdates in an agent. These actions update the belief base of an agent when
executed. A belief update can be used to store information received from the user (through
messages) or the environment (when the agent gains access to the iCat’s sensors). They can also be
used to store temporarily data or results of computations. Besides storing new beliefs, the belief
updates can also be used to alter (update) existing beliefs.

page 22 of 68
Each belief update is accompanied by a pre- and post-condition. The pre-condition indicates the
beliefs required for the belief update to be able to execute. The post-condition illustrates the agent’s
beliefs after the belief update has been executed. Below is a simple example illustrating slicing a
tomato. Pre-condition is that the agent beliefs it has a tomato, post-conditions are that the agent no
longer beliefs it has a tomato but instead now beliefs it has sliced tomato.
BeliefUpdates:
{ tomato } Slice(tomato) { not tomato, sliced_tomato }
4.3 Recipe phases
To best handle the different steps of an agent performing a recipe together with a user, I have split
the recipe scenario into a number of phases. This section will discuss the contents of these recipe
phases. Figure 1 in appendix 1 displays the recipe phases following each other.
To illustrate the different phases, one recipe scenario will be traversed to illustrate start and end of
the phases. As the user enters the kitchen, he will start talking to the iCat. At one point, cooking
comes into play. Since having small talk with the user is not part of the recipe scenario created, the
recipe scenario starts with the user requesting a recipe. This triggers the agent to enter the first
phase: the recipe selection phase.
4.3.1 Recipe selection phase
The user describes what kind of recipe he would like to perform. The first thing the agent does in this
phase is saving the request. It will use this request alongside its goal ‘find recipe’ to search for a
recipe inside its recipe belief base. When a match is found, the agent continues to check the recipe.
The first check is on whether or not all recipe actions can be broken down into the lowest level of
instructions: atomic actions. This check ensures that the agent will always have a way to explain an
action (in the form of listing the sub-actions) to the user. The next check is on opportunities. This
check is done to ensure the user will have the opportunity to perform the recipe steps one after
another. This ensures the recipe steps are complete to be able to get to the result. The last check
concerns checking the kitchen on whether all the needed utensils and ingredients are present. This
check searches through the recipe for the required ingredients and utensils. These found items will
then be checked against the kitchen content. Any missing items will be listed in the agents belief
base.
When the first checks succeed and the kitchen is checked, the agent will ask the user for
confirmation on the choice of recipe. If there were any missing ingredients or utensils, the agent will
include this information. When the user does not confirm, the agent will search and check another
recipe. When the user confirms the choice made by the agent, the agent will adopt the joint goal to
perform the chosen recipe with the user. It will also store a status for the recipe selected. This status
will be set to ‘confirmed’ which triggers the agent to enter the recipe planning phase.

page 23 of 68
4.3.2 Recipe planning phase
As the name reveals, the execution of the recipe will be planned in this phase. The biggest step the
agent will take here is creating an instruction list. This list will contain actions the agent will ask the
user to perform. This list is created based on the top level actions of a recipe, the sub-actions for the
composite actions and the skills of the user.
Before this list will be been created, the agent will ask the user to gather the required utensils and
ingredients. Of course if any items were found missing during the recipe selection phase whilst
checking the recipe, the agent will ask the user to go shopping first. These steps can be seen as the
user placing the needed ingredients on the worktop, ready for use in the recipe. This will limit the
user’s movement away from the work area when cooking. When the assembly is done, the agent sets
the recipe status to ‘active’ which triggers the agent to enter the instructing phase.
4.3.3 Instructing phase
With the recipe set to ‘active’ and the instruction list created in the planning phase, the agent will
now continue to instruct the user. It will go through the contents of the instruction list one by one
and either 1) offer the user to perform it for him, 2) tell the user that it will perform it since only the
agent has the required skills or 3) requests the user to perform the action. After a message is sent,
the agent will pause the recipe to await the user’s reaction. After an action has been performed, the
agents beliefs will be updated according to the executed action. Afterwards, the action will be
removed from the instruction list.
When the instruction list is empty and the recipe status is active, this means the user and the agent
have worked through all actions to perform the recipe. This is a signal for the agent to inform the
user the recipe is completed. The agent will then remove the empty instruction list and set the recipe
status to ‘finished’. This triggers the agent to enter the final phase: the recipe finalization phase.
4.3.4 Recipe finalization phase
The recipe finalization phase takes care of the last steps in the recipe procedure. This phase can be
triggered by either a completed recipe or a cancelled recipe. When the recipe is ‘finished’, the agent
will of course inform the user of this fact. Afterwards, it will drop the joint goal and clean the recipe
status.
As was noted above, another way to reach the recipe finalization phase is through cancellation of the
recipe by the user. The user can at any moment choose to abort the recipe procedure. After
confirmation of this request, the agent will change the current recipe status to ‘cancelled’, triggering
the recipe finalization phase. Because this cancellation can be at any time, the ‘cleaning up’ has to be
more extensive. The agent will check for any remaining goals and or beliefs which will be dropped or
removed.

page 24 of 68
4.4 Recipe statuses
In order to guide the agent through all of the different phases, recipe statuses are used. The main
trigger will be taken care of by goals whereas these statuses will guide on a more specific level. As
the term status reveals, the recipe statuses are used to save the progress the agent has made in
executing the recipe. These are belief the agent has and are therefore stored in its belief base.
Because the status changes multiple times during the execution of a recipe procedure, the agent’s
beliefs on the recipe status will alter during execution of the recipe scenario.
Because the beliefs concern procedural information the agent holds to be currently true, and not
something he strives to achieve, the choice has been made to save these statuses as beliefs. These
beliefs in combination with the agent’s goal trigger plans. When a recipe status is achieved, the
corresponding follow-up plan will be triggered.
An alternate approach could be creating sub-goals which describe the next recipe status to achieve.
This would however produce a large number of goals. Using beliefs for status information also allows
to better see where the agent currently is in the scenario. Having a goal only shows what you strive
to achieve, not what you have done so far. An additional advantage the use beliefs has over the use
of goals concerns implementation. Through the use of belief updates, the belief status information is
easily updated. Furthermore, these beliefs can be ideally combined with the agent’s goal in the 2APL
reasoning rules. When using the sub-goals, the agent would have to check for these goals inside the
plans triggered by the main goal.
As described above, these statuses all will be stored in the agent’s belief base. Appendix 1 figure 2 is
a state diagram illustrating the recipe statuses discussed above and the possible transactions
between different statuses. Below, I will describe the transitions between the different statuses
based on a recipe scenario.
Recipe selection phase
At first, the agent will receive a message from the user concerning a recipe he would like to perform.
When this message is received, the agent adopts a goal to find a recipe for the user. The search
request is stored alongside the initial recipe status:
currentRecipe(User,searching,SearchRequest).
Combining the goal findRecipe(User) with this status, the agent searches a recipe. When a recipe
is found, it will create a status for it in order for it to be checked:
currentRecipe(Code,User,checking). This second status will ensure the search request is
saved. Combined with the findRecipe(User) goal, the checking process will start. When the
checking somehow fails (action decompositions might be missing) the recipe status will be removed
and the recipe will be rejected by creating a belief: recipeRejected(Code, Reason). Because the
second status is removed, the searching will continue. Because of the created belief, the agent will
not check the previously checked recipes again. When the checking succeeds, the second status will
be altered from checking to checked, a signal for the dialog agent to ask the user confirmation on
the recipe.

page 25 of 68
When the user denies the agent’s choice, it will again remove the second status and adopt a
rejectedRecipe(Code, Reason) belief before returning to its search for another recipe. When
however the user confirms the agent’s choice, the search request and checked recipe are merged
into one new recipe status: ‘confirmed’. This triggers the recipe planning phase.
Recipe planning phase
The recipe planning phase starts with changing the recipe status to
’planning(gather_things)’. Since this phase consists of several steps, the recipe status will
also resemble different statuses. First off is gathering the required items. When this is done, the
status will be set to ‘planning(instruction_list)’. For the creation of the instruction list, a
goal will also be adopted which will be dropped again after successful creation of the instruction list.
After this list has been created, the recipe status is set to ‘planned’ indicating the planning has
finished. This will allow a final belief update to execute, changing the recipe status to ‘active’ and
with that starting the instructing phase.
Instructing phase
The agent uses the instruction list to decide which action has to be done first. It will then send a
signal to the dialog agent to communicate this with the user. The dialog agent informs the user of the
next action and moves the recipe state to: ‘awaiting_reply(instruction)’. When the dialog
agent then receives a message in return, it decides upon which action to take. When the user
answers the agent the action has been performed, the agent will return the recipe status to
‘active’. This will trigger the main agent to continue with the next action.
When a sub-list is encountered, the agent will handle this as a separate recipe. It will first pause the
main recipe by changing the status to ‘subRecipe’. Afterwards, it will create an instruction list
containing the sub-list. It will also create a recipe status set to ‘active’ and adopt a joint goal to
execute the sub-recipe. Because of the instruction list, the recipe status and a joint goal, this sub-
recipe will be handled by the agent as any other recipe. After finishing the sub-recipe, the main
recipe status is restored to status ‘active’ continuing with its own instruction list.
Recipe finalization phase
When an instruction list is empty, the agent will finish the recipe by moving the status to ‘finished’.
This triggers a plan to finalize the recipe. First of all, this plan will remove the empty instruction list.
This plan will also remove the corresponding recipe status and joint goal. As mentioned above, if the
recipe concerns a sub-recipe, this plan will resume the main recipe.
Another way to end the recipe is when the user aborts the execution. Whenever the user sends an
abort message to the agent, the agent will first confirm the user’s choice and afterwards, move the
recipe status to ‘cancelled’. This will trigger the same finalization plan but with a slightly different
plan. Since we do not know exactly at what point the user will send a abort message, a clean-up plan
is needed to ensure no data of the current recipe is left behind. Of course here also the recipe status
and joint goal are removed.

page 26 of 68
5 Approach to the dialog
This chapter on the approach used to communication is split into sections according to the different
recipe phases. In each phase, the different kinds of messages expected to go back and forth will be
shortly discussed. We assume all message sent will reach their target. This removes the need to
check with the receiver if the message is received when an answer is overdue.
The way I approached the communication is by looking at both sides (user and agent) and analyzing
which messages the agent and user could send. From those initial messages, the possible reactions
could be analyzed. I found it useful to make the separation between “Agent -> User” and “User ->
Agent” communication to be able to see who initiates contact.
The basic approach to handle messages is to pause the activity and handle the message. When the
dialog is handled, the procedure can be resumed. Whenever the agent needs more information, it
can pause the procedure, ask the user for the needed information and await his reply. However, the
user can at any time interrupt the agent by asking a question. These messages might not always
come at a convenient time for the agent. It might for example be busy executing an update on its
beliefs.
When the user initiates a (section of a) dialog by, for example asking the agent “why?”, it depends on
the current state of the ‘cooking procedure’ as to what the agent will answer. During the several
recipe phases, the agent will also require different kinds of information from the user. Since the
agent will know what it needs, it will know what it can expect in return from the user.
The conversational context helps the agent with finding what information should be retrieved from
the users questions and or remarks. It also helps in finding why the user wants something. Using the
parsed conversation input from the user and the beliefs the agent currently has, the agent should be
able to derive a suitable reply. This can be in the form of adopting goals, answering questions by
providing information or executing actions.
In the ‘Dialog flowcharts’ section of appendix 1 are several flowcharts illustrating the dialogs
discussed below. These will be referenced within the different sections. This chapter will also make
use of the example dialog in appendix 3. This dialog is an illustration of what kind of dialog ideally
should be possible in the finished system. In the upcoming sections, the example dialog will be
referenced by adding a reference to line numbers as follows: [line 12-15].
5.1 Recipe selection phase
The dialog flow related to the recipe selection phase is illustrated in figure 3 in appendix 1. As the
user is eventually going to eat the recipe prepared, we assume he is the one initiating creating it in
the first place. The initial message of the recipe selection phase (and thus the recipe scenario) is the
user requesting the agent to find a recipe. After the agent has received the message, it will start
searching for recipes matching the description [lines 4, 5]. Depending on what the agent finds, there
are different replies possible.

page 27 of 68
When no recipe is found, the agent will first inform the user of this fact. The fact that there are no
results can be for two reasons. The first is that the search term did not match any recipe. If this is the
case, the will request the user to provide a new search and wait for another description. This will end
the ‘current’ dialog. Receiving the new description will start a ‘new’ dialog.
The second case is when all found recipes were rejected (either by the user or the agent). These
recipes can be rejected by the agent itself since they did not pass the checks described below.
Rechecking agent rejected recipes will just result in another rejection since both the recipes and the
checks will not change. What are interesting rejected recipes to look at are those the user rejected.
These have passed the checks but the user did not confirm the agent’s choice. The agent could
gather these recipes and ask the user if he wants to reconsider previous made choices. If so, the
agent will offer the user the choice between earlier suggestions which got rejected by the user. If
not, the agent will request a new search.
Before asking the user to confirm a found recipe, the agent will check it. Several checks will be
performed. A failure in a test is a reason for the agent to reject the recipe and continue its search.
Whenever the checking of a recipe is complete, the agent will let the user know it found a recipe. It
will ask the user for confirmation on execution of the recipe [line 5]. When there are kitchen items
missing (ingredients or utensils), the agent will include this information when informing the user.
When the user decides the missing items to be crucial for successful execution of the recipe, the user
will reject the choice. Of course the user could also reject a recipe because he just doesn’t feel like
making a certain dish [line 6]. This will lead the agent to continue its search after saving the rejection
of the suggested recipe to the belief base [line 7].
When for example a pan is missing, the agent will request confirmation of the user in the following
way: “I found the following recipe: ___. For this, we need a pan which is not available. Do you want
to perform this recipe or should I search for another recipe?” Otherwise, the agent will just ask the
user confirmation by first telling the user what recipe it found: “I found the following recipe ___.”
Followed by a question for confirmation like: “Do you want to perform this recipe?”.
When the user rejects the recipe suggested by the agent, the agent will take note of this in its belief
base and continue searching through the recipes. When the user confirms the checked recipe, the
agent makes stores the recipe code and can start the next phase [line 8]. In the example dialog, the
recipe selection phase takes up line 4 up to and including line 8.
5.2 Recipe planning phase
The dialog flow related to the recipe planning phase is illustrated in figure 4 in appendix 1. During the
recipe planning phase, there is some communication with the user. In the example dialog, the recipe
planning phase starts from line 9. In the example, the agent requests the user information on for
how many people the user would like to prepare the recipe. This information can then be used for
calculating the needed ingredients. The option of creating recipes for various amounts of people is
not incorporated into the created agent. This is however mentioned in more detail in future work.

page 28 of 68
Next, if there were any items missing from the kitchen though the user confirmed the recipe, the
agent will have to send the user shopping. After the user informs this has been done, the agent will
request him to gather the remaining needed utensils and afterwards the remaining needed
ingredients [lines 11 – 25]. The agent will wait for the user to notify he is done with gathering the
remaining things.
After all needed items are gathered, the agent will construct the instruction list. During the creation
of the instruction list, the agent will check the skill information. If the user has the required skill(s) for
the action but it is recently learned, the agent will ask the user if he remembers how to perform the
action. If so, the agent removes the ‘recently learned’ parameter and can add the action to the
instruction list knowing this action will not cause any problems. If the user does not remember, the
agent can also discard the recently learned parameter as well as the skill.
Afterwards, as is the same when the skill was not recently learned or not present, the agent checks
the action on availability for teaching the user. If an action is available for teaching, the agent will ask
the user if he wants to learn how to perform that action. This will expand the user’s skills. Depending
on the answer from the user, the agent will either store a learning parameter or decompose the
action into sub-actions to be put into the instruction list.
The creation of the instruction list does not require any dialog other than requesting the user to
teach him an action or if he remembers how an action is performed. If these requests are not
needed, the instruction list creation will not show in the dialog, as is the case in the example dialog in
appendix 3. After the instruction list has been created, the planning phase is completed. In the
example dialog, the planning phase takes up lines 9 up to and including 25.
5.3 Instructing phase
The dialog flow related to the recipe planning phase is illustrated in figure 5 in appendix 1. During
instructing the user, there are lots of different messages the agent can send to the user. The different
messages ask different information from the user in return. The messages can be a request or simply
an informational message. Of course this phase is also the most sensitive for incoming user messages
such as occurring problems or questions. This phase will therefore form the largest part of the dialog.
As the agent starts going through the instruction list, it will send a message to the user depending on
the skill information. If for example both the agent and the user have the required skill(s), the agent
will offer the user to perform the action for him. If the user agrees, the agent will perform the action.
Afterwards the agent updates its beliefs and continues to the next action in the instruction list.
If the user rejects the agent’s offer, the agent will send another message now requesting the user to
perform the action. The agent will also send the user this request of execution of the current
instruction when the agent does not have the required skill(s). When the user confirms, the agent
notes the user will perform the action. The agent will then wait until the user informs the agent of
the fact that the action is done. Again, afterwards it will update its beliefs on the executed action
before continuing to the next action.

page 29 of 68
If the user however denies the request to execute the action, the agent checks (again) whether it
itself is skilled enough to perform the action. If so, it can perform the action itself (of course letting
the user know first). If not, the action needs to be done by the user which just refused to perform it.
The agent will now stress this fact by informing (instead of requesting) the user that he needs to
perform the action.
If the user again denies the task assigned to him, the agent will let him know this will abort the recipe
and ask him if he’s sure. If so, the agent will abort the recipe. If the user does not want to abort the
procedure, the agent will resume the dialog where they left off. An optional implementation instead
of aborting the recipe is trying to find an alternative action which results in the same outcome. This
however requires the ability to reason with action pre- and post-conditions which is not currently
available in 2APL. Since replacing an action with an alternative could be quite tricky, this is left open
for future work since there was not enough time to incorporate it into my research.
Another possible addition to the implementation is allowing the user to give a reason for his
rejection of the instruction. When for example the user rejects because the skill information is
incorrect (he does not have the required skills), the agent can fix this rejection by decomposing the
action into sub-actions. The user can currently not give a reason as to why he rejected the agent’s
request. This is also discussed in the chapter on future work.
When only the agent has the required skill(s) for the current action, it will inform the user of the fact
that it will perform the action. After the action has been performed, the agent will update its beliefs
on the performed action. Afterwards, the agent will continue to handle the next instruction.
A slightly different process occurs when the agent is teaching the user how to perform an action. Of
course the ideal way to do this is let the user perform all the actions himself. The agent will therefore
when teaching only request the user to perform the actions and not offer or perform them himself.
As mentioned above, when the user rejects a request, the agent will then inform him once again he
needs to perform the action. In this way mentioning this is the only way to complete the recipe. An
illustration of the teaching process can be found in the following sub-section.
Because of the limited abilities of the iCat, the agent will not be able to perform a lot of actions other
than monitoring (by using the camera) or keeping time. As shows in the example dialog, the agent
will request the user to perform all physical actions. The instruction phase takes up lines 26 up to and
including 73 in the example dialog.
Lines 26, 27, 28 and 29 illustrate the agent instructing the user. The user confirms the request. The
instruction can be seen as two steps; washing the parsley and chopping the parsley. As extra
information, the user also informs the agent of when he starts executing the ‘second’ step. The agent
will just answer with ‘okay’ leaving the user to finish the instruction. In line 29, the user informs the
agent he is done.

page 30 of 68
5.3.1 Teaching the user
To shorten future instructions, the agent is equipped with a teaching ability. Upon creating the
instruction list, actions will be checked against the user’s skills. Consider the user does not have the
skill for a certain action A, but does have the required skills for all its sub-actions. If the agent would
let the user know what order of sub-actions action A is built up of, the user will be able to perform
action A by himself in the future.
Starting the teaching process
The first step of course is asking the user whether or not he would like to learn it. This step takes
place when the agent is constructing the instruction list as the agent recognizes a possible action for
teaching. The agent offers the user to teach him this action. Since it might be confusing to the user
when the agent starts teaching an action while teaching another action, the agent will only offer to
teach the user an action which consists of sub-actions for which the user has the needed skills to
perform them.
If the user answers yes, the agent will add a belief to its belief base so it remembers it has to teach
the action to the user and to be able to update the user’s skills in the end. When the user answers
no, the agent will continue constructing the instruction list as normal. The agent will decompose the
action into sub-actions (the user has skills for these sub-actions because of the requirement set for
actions available for teaching) and finish constructing the rest of the instruction list.
The teaching process
When the agent encounters an action marked for teaching in its belief base, it will inform the user it
will teach him a new action. The agent then takes the user through the sub-actions of the composite
action. When all sub-actions are done, the agent will finish the teaching process by letting the user
know he has performed all necessary sub-actions. The agent can now discard the belief of the user
wanting to learn this action and update the user’s skills. Since this skill is freshly acquired, the agent
will set a parameter in its belief base indicating this composite action is recently learned.
Finishing the teaching process
Next time the agent encounters the recently learned action whilst creating the instruction list and
notices the ‘recently learned’-belief, it will ask the user if he remembers how it is done. If the user
answers yes, the agent can remove the ‘recently learned’-belief indicating the user has successfully
learned how to perform the action. This ensures the system will no longer ask the user on whether or
not he knows how to perform the action. The action can be added to the instruction list and the
agent can continue creating this list.
If the user answers he does not remember, the agent will remove both the skill information and the
‘recently learned’-belief. If available for teaching, the agent will then of course offer to teach the user
the action.

page 31 of 68
5.4 Recipe finalization phase
This last phase does not contain a lot of communication. This phase actually only contains the
confirmation by the agent on whether the user is sure of cancelling the recipe. This of course is only
done when the user sent a cancellation message. Otherwise, the agent will reach this phase after the
recipe has been completed. This will trigger the agent to send an informational message to the user
about this fact. When the correct message has been sent, the agent will clean up any remaining goals
and beliefs which are no longer necessary to keep. This cleaning up of the agent’s beliefs of course
will not show up in the example dialog. Lines 74 up to and including 77 in the example dialog
illustrate the recipe finalization phase.
5.5 Overall communication
This section will elaborate on messages which might be sent throughout the recipe procedure so in
any of the above described phases. Since their timing is unknown, this separate section is created.
These will be messages the user sends the agent since any agent initiated dialog can be related to
any of the recipe phases.
As we might have a curious user, willing to learn new things, he might ask the agents questions like
“how much salt should I add?” [line 43]. Replying to these kinds of questions requires the agent to
deliberate on the action at hand. For example, adding salt to the water in which pasta will be
prepared is done to improve the flavor. Another example question is illustrated in line 33 where the
user asks the agent “how small” he should make the meatballs.
This knowledge of the action will not be represented in the post-conditions. It will for example not
say: ‘water with salt to boost the flavor’ as a result of the action ‘add salt to water’. To be able to
answer these kinds of questions, the agent needs more information on the recipe actions besides
their direct results. These informative questions therefore open up a whole domain of additional
deliberation needed to keep the dialog ongoing and realistic.
Other messages the user might send the agent is informing of occurred problems. Of course the
instructing phase is where the most user messages concerning problems arise since this is where the
actions will be executed. Actions might for example fail leading the user to question which further
steps to take. Having the status information and the failed action, the agent could search for an
alternative action (or plan) which brings the user to the desired result. If for example there is a
problem with the oven, the user will inform the agent of this problem [line 51]. The agent derives the
oven is used for baking the meatballs. It will then check its recipe base in order to see whether or not
there’s an alternative recipe for ‘bake meatballs’. The alternative approach (using a pan instead of
the oven) is found and selected for instruction.
A suitable alternative might not always be at hand. In the example dialog, the agent is lucky an
alternative approach is available in its recipe base. Of course for a lot of actions per recipe, searching
a suitable alternative needs to be a flexible method. This needs to take into account the needed
utensils for the rest of the recipe, so the rest of the recipe can remain unchanged. Some actions

page 32 of 68
might partially replace another action, but the post-conditions have to match closely in order to
meet the pre-conditions of the following action.
As a final type of message the user might send the agent, there is the cancellation message. The user
could send this at any time during the recipe procedure. The agent will not ask the user for its
reasons but will request him if he is sure on cancelling the procedure. If not, the agent will disregard
the message and continue where they left off. Otherwise, the agent will resolve to the recipe
finalization phase.
Not all of these kinds of messages are as easy to implement as others. For handling some of these,
like questions on why recipe actions are done, the agent needs additional information. During the
discussion of the agent implementation which will follow, the messages I was able to implement will
be discussed. As is to be expected, the other kinds of messages will return in the chapter future work
alongside more detailed information.

page 33 of 68
6 Agent architecture
Alongside my research I created a 2APL-agent. This chapter will present the created agent and it’s
architecture. To handle different features of the recipe scenario, several files have been created.
Splitting the agent over multiple files results in a modular design which improves ‘readability’ of the
code. It also allows for quick expansion or adaption of the code. In this section, all modules will be
introduced and their relation will be illustrated in a diagram. The resulting agent will ideally run on
the 2APL module which on its part runs on the internal structure of the iCat.
Below, I will list the different modules of the agent and their purpose. Some modules will only be
used for separating belief base data, others may also have plans or algorithms designed for managing
specific aspects of the recipe scenario. The names of the modules are defined to illustrate the focus
of the module in the recipe scenario.
Mainagent.2apl This file concerns the agent itself. All other modules have been
separated from this file to create the modular design. The main control
cycle the agent follows during execution of a recipe scenario is defined
in here. All other modules are called along this main cycle.
Dialog.2apl This module will take care of all communication towards and from the
user. Depending on the user’s messages, it will take appropriate action.
Recipes.2apl The ‘recipe module’ stores the recipes and the accompanying needed
ingredients and utensils. It also contains the methods for searching and
checking a recipe.
Actiondecompositions.2apl This module shows decomposition of actions into sub-actions. This will
only concern belief base statements.
Skills.2apl In the ‘skills module’, all the user’s and agent’s action related skills are
stored. This module will also only contain belief base statements.
Instructionlist.2apl The ‘instruction list module’ uses the ‘action decompositions module’
together with the ‘skills module’ in order to create a user specific
instruction list for the selected recipe.
Kitchen.2apl The ‘kitchen module’ represents the kitchen including all ingredients
and utensils. It stores the stock and available utensils. It also houses
methods for checking the kitchen on available ingredients and utensils
as well as a method for gathering these before the user and agent start
preparing a recipe.

page 34 of 68
Atomicactions.2apl This module contains the belief updates to be executed when an action
has been performed in the kitchen. When for example the user has
turned on the gas, the agent needs to update its beliefs to resemble
this fact.
Listoperations.2apl This last module stores belief updates on working with lists such as the
instruction list. This module separates this functionality from the rest
so adaption of these belief updates can be made when for example the
Prolog version incorporating 2APL changes.
As the module for running this 2APL agent on the iCat was not available at the time of my research,
these modules were loaded into the 2APL user interface and run on a computer. Below is a diagram
of which module includes what other module. The instruction list module for example needs to know
what skills the user has in order to create the right instructions. Therefore, the instruction list module
includes the skills module. Here one can also clearly see the ‘main agent module’ is on top of it all.
Main agent
(mainagent.2apl)
Recipe module
(recipes.2apl)
Kitchen module
(kitchen.2apl)
Instruction module
(instructions.2apl)
Action decompositions
(actiondecompositions.2apl)
Atomic action implementations
(atomicactions.2apl)
Skills module
(skills.2apl)
Dialog module
(dialog.2apl)
List operations
(listoperations.2apl)
To allow testing the created agent, an user agent has been introduced. This agent houses procedural
rules containing plans to respond to the agent’s messages. The ‘user agent’ can be instructed to start
a recipe scenario by sending it a specific message. The user implemented along the agent however
has an initial plan to request a recipe. When the created agent runs on the iCat, a real user will talk to
the iCat. This speech will then have to be parsed to messages with which the agent can work.

Master Thesis - Simonis 3206963

Master Thesis - Simonis 3206963

Recommandé

Recommandé

Contenu connexe

En vedette

En vedette (7)

Similaire à Master Thesis - Simonis 3206963

Similaire à Master Thesis - Simonis 3206963 (20)

Master Thesis - Simonis 3206963