This document discusses a study that analyzed how helpers' eye gaze patterns related to task properties, partners' actions, and message content during remote collaborative puzzle assembly tasks. The researchers found that helpers looked at alternative puzzle pieces more often when the pieces were harder to differentiate, and less often over repeated trials. They then used this data to build a classifier that predicted helpers' focus of attention with 65-74% accuracy from task properties, partner actions, and message content. The results suggest intelligent camera systems could automatically show helpers desired views by exploiting these predictive factors.
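As a concrete illustration of the modelling step, a classifier of this kind can be prototyped from tabular features. The sketch below is a minimal, hypothetical version: the feature names, the synthetic data, and the choice of a random forest are all assumptions for illustration, not the study's actual pipeline.

# Illustrative sketch only: predicting a helper's focus of attention from
# task, partner-action, and message features. Feature names and the model
# choice (random forest) are assumptions, not the study's actual pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.random(n),            # piece_similarity: how hard pieces are to differentiate
    rng.integers(0, 2, n),    # partner_moving_piece: is the worker manipulating a piece?
    rng.integers(0, 2, n),    # message_references_piece: does the utterance mention a piece?
    rng.integers(1, 9, n),    # trial_number: repeated trials reduce looks at alternatives
])
y = rng.integers(0, 2, n)     # 1 = looking at alternative pieces, 0 = workspace

scores = cross_val_score(RandomForestClassifier(n_estimators=200, random_state=0), X, y, cv=5)
print(f"cross-validated accuracy: {scores.mean():.2f}")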
This document summarizes two experiments that compare an eye gaze interaction technique for object selection to selection using a mouse. In the first experiment, participants selected highlighted circles from a grid using either eye gaze or a mouse, and in the second they selected letters from a grid. The results showed that the eye gaze technique was faster than using a mouse in both experiments. The eye gaze technique was effective because it leveraged an understanding of eye movements and preserved the natural quickness of the eye, despite latencies from the eye tracking hardware.
Personality Traits and Visualization Survey by Christy Case (Christy C Langdon)
This document summarizes key findings from 5 research articles on personality traits and visualization. The articles found:
1) Machine learning can classify users as fast or slow based on eye gaze and interaction data, and predict personality traits. This could enable real-time adaptive visualization systems.
2) Studies found locus of control, a personality measure, was important - those with internal locus performed better on procedural tasks in different interfaces.
3) Gaze-cursor alignment varies by time, task, and cursor behavior. Models using multiple cursor features can predict gaze better than cursor position alone.
4) Manipulating locus of control influenced visualization task performance, showing personality affects problem-solving.
5) Intro
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews across the whole field of engineering, science and technology, including new teaching methods, assessment, validation and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected through double peer review to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
To Ask or To Sense? Planning to Integrate Speech and Sensorimotor Acts (toukaigi)
This document describes research into developing a conversational robot that can integrate speech acts and sensorimotor acts when resolving ambiguities. The robot needs to decide whether to ask a clarifying question or perform a sensory action like moving its head to see from a different perspective. The researchers present a planning algorithm that treats speech acts and sensory actions in a common framework by calculating the expected costs and information rewards of different actions. They evaluate the algorithm's performance under various settings and discuss possible extensions.
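The ask-vs-sense decision can be framed as picking the action with the best expected information reward net of cost. The toy sketch below illustrates that framing; the costs, probabilities, and reward value are invented for illustration and are not the paper's actual planner.

# Toy illustration of the ask-vs-sense trade-off: choose the action whose
# expected information reward, net of cost, is highest. All numbers are
# invented placeholders.
ACTIONS = {
    # name: (cost, probability the action resolves the ambiguity)
    "ask_clarifying_question": (3.0, 0.95),   # interrupts the human, but reliable
    "turn_head_and_look":      (1.0, 0.60),   # cheap, but may not disambiguate
    "move_closer_and_look":    (2.0, 0.85),
}

INFO_REWARD = 5.0  # assumed value of fully resolving the ambiguity

def expected_utility(cost, p_resolve, reward=INFO_REWARD):
    """Expected reward of resolving the ambiguity, minus the action's cost."""
    return p_resolve * reward - cost

best = max(ACTIONS, key=lambda a: expected_utility(*ACTIONS[a]))
for name, (cost, p) in ACTIONS.items():
    print(f"{name}: EU = {expected_utility(cost, p):+.2f}")
print("chosen action:", best)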
CHAINED DISPLAYS: CONFIGURATION OF MULTIPLE CO-LOCATED PUBLIC DISPLAYS (IJCNCJournal)
Networks of pervasive display systems involving public and semi-public displays have allowed experiences to be created that span across multiple displays to achieve a stronger effect on the viewers. However, little research has been done so far on the configuration of content for multiple displays, especially when encountered in sequence in what is commonly referred to as chained displays. As a first step towards determining appropriate configuration strategies for chained displays, we have identified and investigated different approaches for configuring content. We report on a user study on the effect of the different configuration models in terms of usability and user engagement.
This document proposes a Curiosity-Based Learning Algorithm (CBLA) to allow distributed interactive sculptural systems to learn about their own mechanisms and environment through self-experimentation and interaction with humans. The CBLA uses multiple learning agents that each focus on a subset of the system's sensors and actuators. This allows the learning to scale to larger systems. Experiments on a prototype sculpture show it exploring patterns and collective learning behaviors as the agents integrate through shared inputs. The CBLA is meant to reduce reliance on pre-programmed behaviors and allow the sculpture to adapt over time to keep interactions engaging for humans.
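To give a flavour of the curiosity-driven idea, the toy loop below rewards an agent according to its own prediction error, one common curiosity signal; the CBLA's actual reward formulation and its multi-agent structure over sensor/actuator subsets are richer than this sketch, and everything here is an illustrative assumption.

# Minimal curiosity-driven loop: the agent favours the action whose outcome
# it currently predicts worst, so it keeps experimenting where there is
# still something to learn. Not the real CBLA, just the core idea.
import random

class CuriousAgent:
    def __init__(self, actions):
        self.actions = actions
        self.model = {a: 0.0 for a in actions}       # predicted sensor reading per action
        self.last_error = {a: 1.0 for a in actions}  # remembered prediction error per action

    def choose(self):
        # Prefer the action whose outcome is currently hardest to predict.
        return max(self.actions, key=lambda a: self.last_error[a])

    def update(self, action, observed):
        error = abs(observed - self.model[action])
        self.model[action] += 0.3 * (observed - self.model[action])  # simple learning rule
        self.last_error[action] = error

def environment(action):
    # Stand-in for the sculpture's sensors: each actuator has a noisy effect.
    return {"led": 0.8, "motor": 0.3, "vibrate": 0.5}[action] + random.gauss(0, 0.05)

agent = CuriousAgent(["led", "motor", "vibrate"])
for step in range(20):
    a = agent.choose()
    agent.update(a, environment(a))
print(agent.model)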
Advances in Mixed Reality (MR) technologies are reshaping collaborative practices. The seamless integration of physical and virtual elements enhances the perception of the working environment, providing a more enriched collaborative task experience. While revealing intriguing potential across various sectors, wearing head-mounted displays (HMDs) can pose challenges in communication and in understanding others' behaviours. This paper analyses the main elements of collaborative augmented practices through the case study of Hololiver, an MR system developed to assist surgeons in planning laparoscopic liver surgeries. The work discusses guidelines for designing interfaces to preserve awareness in MR interactions.
Intention recognition for dynamic role exchange in haptic (أحلام انصارى)
1) The paper summarizes an experimental study that investigated the utility of a role exchange mechanism in human-computer collaboration through haptic interaction.
2) In this framework, a human dynamically interacts with a computer partner by communicating through the haptic channel to trade control levels on a shared task.
3) The study examined conditions of equal control, role exchange, and visuo-haptic cues, analyzing effects on task performance, efficiency, and subjective measures.
An investigation into the physical build and psychological aspects of an inte... (Jessica Navarro)
This dissertation investigates creating an interactive information point and examines the psychological effects on users. The student aims to build an animatronic information point that tracks objects and interacts with users. Research covers object tracking hardware/software, human-computer interaction, and effects of anthropomorphism. The student will create a physical animatronic head, programming in LabVIEW and Roborealm, conduct user testing via questionnaire, and analyze the results. The dissertation aims to determine if a more lifelike interactive information point improves the user experience of conveying information.
CONSIDERATION OF HUMAN COMPUTER INTERACTION IN ROBOTIC FIELD (ijcsit)
This document discusses considerations for improving human-robot interaction through applying principles of human-computer interaction. It summarizes several existing models of human-computer interaction and adapts the action theory model to the context of human-robot interaction. The adapted model incorporates the simulation of robot emotions and awareness of human emotions into the interaction process. The summary then provides examples of how this adapted model could be applied to scenarios of social interaction between humans and robots.
Lukasz Jedrzejczyk developed a haptic interface called "Privacy-Shake" that allows users to manage location privacy settings on their mobile phones through shaking and sweeping motions. He conducted a study with 16 participants to evaluate the usability of Privacy-Shake. The study found that while Privacy-Shake allowed some tasks to be completed faster than a graphical user interface, many participants struggled with some gestures and felt the interface was awkward or annoying. Jedrzejczyk plans to continue improving Privacy-Shake by adding calibration, more discreet gestures, and expanded functionality.
DYNAMIC AND REALTIME MODELLING OF UBIQUITOUS INTERACTION (cscpconf)
This document discusses modeling real-time interaction between a user and a ubiquitous system using dynamic Petri net models. It proposes using Petri nets to model a user's activity as a set of elementary actions. Elementary actions are modeled as Petri net structures that are then composed together through techniques like sequence, parallelism, etc. to form an overall model of user-system interaction. The models can be dynamically adapted based on changes to the user's context. OWL-S ontology is used to describe the dynamic aspects of the Petri net models, especially real-time composition of models. Simulation results validate the approach of dynamically modeling user-system interaction through mutation of Petri net models.
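To make the Petri-net idea concrete, here is a minimal sketch in which an elementary action is a transition between places, and two actions are composed in sequence by sharing a place. This is an illustrative toy, not the paper's OWL-S-driven, dynamically mutating models.

# Minimal Petri net: places hold tokens, a transition fires when all its
# input places are marked, moving tokens to its output places. Two
# elementary actions are composed in sequence by sharing a place.
class PetriNet:
    def __init__(self):
        self.marking = {}       # place -> token count
        self.transitions = {}   # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (inputs, outputs)
        for p in inputs + outputs:
            self.marking.setdefault(p, 0)

    def enabled(self, name):
        inputs, _ = self.transitions[name]
        return all(self.marking[p] > 0 for p in inputs)

    def fire(self, name):
        inputs, outputs = self.transitions[name]
        assert self.enabled(name), f"{name} is not enabled"
        for p in inputs:
            self.marking[p] -= 1
        for p in outputs:
            self.marking[p] += 1

net = PetriNet()
# Sequence composition: 'pick_up_phone' then 'dial' share the place 'holding'.
net.add_transition("pick_up_phone", ["idle"], ["holding"])
net.add_transition("dial", ["holding"], ["in_call"])
net.marking["idle"] = 1

net.fire("pick_up_phone")
net.fire("dial")
print(net.marking)   # {'idle': 0, 'holding': 0, 'in_call': 1}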
The document describes a study that investigates using gestures as a form of authentication on smartwatches. The researchers collected accelerometer data from smartwatches as users performed different gestures. They extracted time and frequency domain features from the data and used k-nearest neighbors and random forest classifiers to distinguish between gestures and identify individual users performing the same gesture. Through 5-fold cross validation experiments, they found it was possible to accurately classify gestures and identify users with error rates comparable or better than previous gait-based authentication studies. This suggests gesture-based authentication on smartwatches is a viable solution.
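A bare-bones version of that pipeline, with windowed accelerometer features fed to k-NN and random-forest classifiers under 5-fold cross-validation, might look as follows; the feature set and the synthetic data are stand-ins for the study's real recordings.

# Sketch of the gesture-classification pipeline described above: simple
# time-domain features from accelerometer windows, then k-NN and random
# forest under 5-fold cross-validation. Data are synthetic stand-ins.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)

def features(window):
    # Per-axis mean, std, and peak magnitude over a 3-axis window.
    return np.hstack([window.mean(0), window.std(0), np.abs(window).max(0)])

# 200 synthetic gesture recordings, 4 gesture classes, 50 samples x 3 axes each.
labels = rng.integers(0, 4, 200)
X = np.array([features(rng.normal(loc=g, scale=1.0, size=(50, 3))) for g in labels])

for clf in (KNeighborsClassifier(n_neighbors=5),
            RandomForestClassifier(n_estimators=100, random_state=0)):
    acc = cross_val_score(clf, X, labels, cv=5).mean()
    print(type(clf).__name__, f"accuracy: {acc:.2f}")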
User Interface Development
Shashank Pitla
Wilmington University
Iteration 1 – Develop a storyboard
Plan
Nowadays, as the technology and the Web are continually being used to perform various operations, it becomes paramount to have an interactive and attractive user interface (Molina, Redondo, & Ortega, 2009). That is because humans interact with these systems through an interface. This iteration entails storyboarding for the user interface. A storyboard is a technique used for illustrating the interaction between humans and products in a narrative format that incorporates a series of sketches, drawings, pictures, and words to tell a story (Gruen, 2000). In this iteration, I plan to create storyboards that specify how the user interface will change in reaction to the user's actions, as well as to show the elements external to the system. I plan to use as few details as possible to get the key points across regarding the big picture, because the storyboard is supposed to present clear and precise information about the user interface.
In the procedure for storyboard design, there are three major activities that I plan to carry out: deciding what to incorporate, building the storyboard, and lastly feedback and iteration. In deciding what to incorporate, I plan to interact with some users in the company to understand their needs, goals, and backgrounds. This analysis will also aid in understanding the system and its features. I will also brainstorm with the design team, identify the people and artifacts in this storyboard, and then develop the storyboard scenarios. While building the storyboard, I will put the gathered information concerning the storyboard features into practice and illustrate the user actions on the storyboard. During the last step in my procedure, feedback and iteration, I plan to gather feedback from the internal and external stakeholders and then iterate on the storyboard design.
Action
The documenting of the iteration's objectives was the first activity carried out before commencing the main activities of the session. I then began with the first step of my procedure, that is, deciding what to include. To accomplish that, I had to interact with the users with the aim of understanding their backgrounds and goals, and to understand the system better in terms of the desired features. I also brainstormed with the design team about the storyboard before developing the storyboard scenarios.
In the second step, I broke the story into smaller sections known as frames; I identified the key frames from the scenarios while focusing on each frame's individual features. In each frame, I had to draw the user, the product, as well as other fundamental objects. I used text for the users' thoughts or reactions and made sure to use as little detail as possible in communicating the user interface features. I then wrote short descriptions for each frame to ...
Nakayama Estimation Of Viewers Response For Contextual Understanding Of Tasks... (Kalle)
To estimate viewer's contextual understanding, features of their eye-movements while viewing question statements in response to definition statements, and features of correct and incorrect responses, were extracted and compared. Twelve directional features of eye-movements across a two-dimensional space were created, and these features were compared between correct and incorrect responses. The procedure for estimating the response was developed with Support Vector Machines, using these features. The estimation performance and accuracy were assessed across combinations of features. The number of definition statements, which needed to be memorized to answer the question statements during the experiment, affected the estimation accuracy. These results provide evidence that features of eye-movements during reading statements can be used as an index of contextual understanding.
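A skeletal version of that estimation procedure, feeding directional eye-movement features to a Support Vector Machine, could look like this; the twelve feature values here are simulated placeholders, not the study's data.

# Skeleton of the response-estimation procedure described above: twelve
# directional eye-movement features fed to a Support Vector Machine.
# Feature values are simulated placeholders, not the study's data.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(2)
n = 300
X = rng.normal(size=(n, 12))          # 12 directional eye-movement features
y = rng.integers(0, 2, n)             # 1 = correct response, 0 = incorrect

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
print("accuracy:", cross_val_score(clf, X, y, cv=5).mean())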
This document presents a framework for reusing existing software agents through ontological engineering. The framework includes components like a user interface agent, query processor, mapping agent, transfer agent, wrapper agent, and remote agents containing ontologies. The query processor reformulates the user's query, the mapping agent identifies relevant ontologies, and the transfer agent sends the query to remote agents. The remote agents provide ontologies as output, which are then integrated/merged and presented back to the user interface agent. The goal is to enable reuse of heterogeneous agents across different development environments through a standardized ontology representation.
This document summarizes a study that used the Think-Aloud Protocol technique to observe users' mental models and gestural interactions when using a touchscreen tablet for the first time. Five participants completed tasks in a calendar app and space game app on an iPad2. Observations found that users struggled with gestures requiring indirect touch like long presses and spin gestures. They tended to use direct touch gestures like taps and drag-and-drops based on prior mouse experience. The study provides insight into how users' existing mental models can influence their understanding of new touchscreen interactions.
The document discusses using Google Glass in a classroom setting to provide teachers feedback from students. It proposes an Augmented Lecture Feedback System (ALF System) that uses Google Glass to display symbols over students' heads indicating their understanding of the lesson or answers to questions. A previous prototype showed promise but used bulky AR glasses. The new system uses Google Glass, which is lighter and allows for evaluating the system's effectiveness during daily classroom use. Experiments were conducted to understand teachers' and students' views, and results suggest the system could improve communication and help teachers adapt lesson pacing based on student comprehension.
Collaborative Learning of Organisational Knowledge (Waqas Tariq)
This paper presents recent research into methods used in Australian Indigenous Knowledge sharing and looks at how these can support the creation of suitable collaborative environments for timely organisational learning. The protocols and practices as used today and in the past by Indigenous communities are presented and discussed in relation to their relevance to a personalised system of knowledge sharing in modern organisational cultures. This research focuses on user models, knowledge acquisition and integration of data for constructivist learning in a networked repository of organisational knowledge. The data collected in the repository is searched to provide collections of up-to-date and relevant material for training in a work environment. The aim is to improve knowledge collection and sharing in a team environment. This knowledge can then be collated into a story or workflow that represents the present knowledge in the organisation.
Separation of Organic User Interfaces: Envisioning the Diversity of Programma... (Felix Epp)
Emerging technologies in nanotechnology, material science and micro-robotics will make Programmable Matter possible. This creates the vision of transformable user interfaces as the successor of today's user interfaces. This theoretical work discusses the new concepts in creating these interfaces. Instead of creating a singular device for the various use-cases and contents, future interfaces will focus more on representing the underlying content and mental models than on a certain technique. This will lead to a vast variety of physical devices, each representing one virtual entity and forming the overall user interaction in combination with each other. The concept of TiID – Time Interface Device – exemplifies such a device by including all time-dependent actions and attributes in one device.
This document is a project report submitted by D.Surya Teja to fulfill requirements for the CS 361 Mini Project Lab at Acharya Nagarjuna University. The report describes the development of a Placement Management System to manage student and company information for university career services. It identifies key actors like students, recruiters, and administrators. Several use cases are defined including registration, validation, and other interactions between actors and the system. The document also covers analysis diagrams, class diagrams, relationships between classes, and system deployment.
Engelman.2011.exploring interaction modes for image retrieval (mrgazer)
This document discusses exploring different interaction modes for image retrieval. It describes developing a framework that allows multimodal interaction using techniques like eye tracking, voice recognition, and multi-touch. An experiment was conducted to compare the usability of different interaction methods for query by example image retrieval. Nine participants used four methods - anchor, gaze, mouse, and touch - to select regions in images. Metrics like accuracy, precision and time were measured. Preliminary results showed touch interaction had the most consistent performance and shortest completion times.
The document summarizes the staff, doctoral students, resources, and laboratories of the HCI Group at Tallinn University. It lists the researchers, professors, and analysts that make up the staff. It also lists the doctoral students that have been or are currently affiliated with the group. Finally, it describes two laboratories managed by the group - the Interaction Design Laboratory and the User Experience Laboratory, including their purposes and example projects.
IRJET - Deep Collaborative Filtering with Aspect Information (IRJET Journal)
This document discusses a proposed system for deep collaborative filtering with aspect information. The system aims to help web users efficiently locate relevant information on unfamiliar topics to increase their knowledge. It utilizes techniques like multi-keyword search, synonym matching, and ontology mapping to return relevant web links, images, and news articles to the user based on their search terms. The proposed system architecture includes an index structure to efficiently search and rank results based on similarity to the search query terms. The implementation and evaluation of the proposed system are also discussed.
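The index-and-rank step it describes amounts to similarity search over documents. One common way to realize it, not necessarily the paper's exact method, is cosine similarity over TF-IDF vectors, as sketched below.

# Compact illustration of ranking documents against a multi-keyword query
# by cosine similarity over TF-IDF vectors. One common approach; the
# proposed system's exact index structure is not reproduced here.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "ontology mapping for web search and knowledge discovery",
    "deep collaborative filtering with aspect information",
    "synonym matching improves multi-keyword web retrieval",
]
query = ["multi-keyword search with synonym matching"]

vec = TfidfVectorizer()
doc_matrix = vec.fit_transform(docs)
scores = cosine_similarity(vec.transform(query), doc_matrix)[0]

for score, doc in sorted(zip(scores, docs), reverse=True):
    print(f"{score:.2f}  {doc}")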
VIDEO OBJECTS DESCRIPTION IN HINDI TEXT LANGUAGE (ijmpict)
Video activity recognition has become an active area of research in recent years. This paper presents a broadly data-driven approach that produces textual descriptions of video content in the Hindi language. The method combines the output of modern object detectors with "real-world" knowledge to pick the most suitable subject-verb-object triplet for depicting a video. Using this triplet-selection technique, a video is tagged with a subject, verb, and object (SVO), and this data is then mined to improve the results of video description through activity as well as object identification. In contrast to previous approaches, this method can annotate arbitrary videos without requiring the large-scale collection and annotation of a similar training video corpus. The proposed work provides an initial, basic text description in the Hindi language, produced with simple words and sentence formation. The fundamental challenge in this work is to extract grammatically accurate and expressive text in Hindi describing the video content.
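A toy version of the SVO-selection idea, combining a detector's visual confidence with text-mined real-world plausibility to score candidate triplets, might look like this; all scores are invented placeholders.

# Toy SVO triplet selection: combine visual detector confidence with a
# text-mined plausibility score for each candidate (subject, verb, object)
# and keep the best triplet. All scores here are invented placeholders.
from itertools import product

vision_conf = {                      # detector confidence per entity/action
    "person": 0.9, "dog": 0.4,
    "ride": 0.7, "walk": 0.5,
    "bicycle": 0.8, "car": 0.3,
}
# Plausibility of the triplet from text mining (e.g., corpus co-occurrence).
language_plaus = {
    ("person", "ride", "bicycle"): 0.9,
    ("person", "walk", "dog"): 0.8,
    ("dog", "ride", "car"): 0.01,
}

def score(s, v, o):
    visual = vision_conf[s] * vision_conf[v] * vision_conf[o]
    return visual * language_plaus.get((s, v, o), 0.05)

candidates = product(["person", "dog"], ["ride", "walk"], ["bicycle", "car", "dog"])
best = max(candidates, key=lambda t: score(*t))
print("chosen triplet:", best)       # -> ('person', 'ride', 'bicycle')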
This document proposes evaluating the usability of VALET, a visual analytics software developed by Purdue University's VACCINE lab for law enforcement officials. The project aims to collect user data through surveys, interviews, and recording user actions during goal-directed tasks. This will provide insights into difficulties with the current interface to propose design improvements around information scent, cognitive task analysis, and GOMS modeling. Future work would compare a new interface design to the current one through additional user studies and expert reviews.
Monitoring and Visualisation Approach for Collaboration Production Line Envir... (Waqas Tariq)
In this paper, a tool called SPMonitor, for monitoring and visualizing the run-time execution of production processes, is proposed. SPMonitor enables dynamic visualization and monitoring of workflows running in a system. It displays versatile information about currently executed workflows, providing a better understanding of processes and the general functionality of the domain. Moreover, SPMonitor enhances cooperation between different stakeholders by offering extensive communication and problem-solving features that allow the actors concerned to react more efficiently to anomalies that may occur during workflow execution. The ideas discussed are validated through the study of a real case related to Airbus assembly lines.
How To Write A Good Hook For An English Essay - How To (Kayla Smith)
The document provides instructions for creating an account and submitting assignment requests to the writing service HelpWriting.net. It describes a 5-step process: 1) Create an account with an email and password. 2) Complete an order form with instructions, sources, and deadline. 3) Review bids from writers and choose one. 4) Review the completed paper and authorize payment. 5) Request revisions until satisfied with the work. The service promises original, high-quality content and refunds for plagiarized work.
The document provides instructions for using an essay writing service. It outlines a 5-step process: 1) Create an account, 2) Complete an order form providing instructions and deadline, 3) Review bids from writers and select one, 4) Review the paper and authorize payment, 5) Request revisions to ensure satisfaction. It emphasizes the service's commitment to original, high-quality work and full refunds for plagiarized content.
Best Tips For Writing A Good Research Paper (Kayla Smith)
The document provides instructions for writing a research paper using the website HelpWriting.net. It outlines a 5-step process: 1) Create an account with a password and email; 2) Complete a 10-minute order form providing instructions, sources, and deadline; 3) Review bids from writers and select one to complete the assignment; 4) Review the completed paper and authorize payment if satisfied; 5) Request revisions until fully satisfied, with the option of a full refund for plagiarized work. The process aims to match students with qualified writers to help complete research papers.
Scholarship Essay Compare And Contrast Essay Outline (Kayla Smith)
The document outlines the steps to request an assignment writing service from HelpWriting.net:
1. Create an account with a password and email.
2. Complete a 10-minute order form providing instructions, sources, and deadline. Attach sample work to imitate writing style.
3. Review bids from writers for the request, choose one based on qualifications and feedback, then pay a deposit to start.
4. Review the completed paper and authorize full payment if pleased, or request revisions using the free revision policy.
MBA Essay Writing Service - Get The Best Help (Kayla Smith)
The document provides information about MBA essay writing services from HelpWriting.net. It outlines a 5-step process for students to get help on their MBA essays: 1) Create an account, 2) Complete an order form with instructions and deadline, 3) Review bids from writers and choose one, 4) Receive the paper and authorize payment, 5) Request revisions if needed. The service aims to provide original, high-quality content and offers refunds for plagiarized work.
The document provides instructions for requesting writing assistance from HelpWriting.net in 5 steps: 1) Create an account with a password and email. 2) Complete a 10-minute order form providing instructions, sources, and deadline. 3) Review bids from writers and choose one based on qualifications. 4) Review the completed paper and authorize payment if satisfied. 5) Request revisions to ensure satisfaction. HelpWriting.net guarantees original, high-quality work and refunds for plagiarized content.
27 Outstanding College Essay Examples College (Kayla Smith)
The Elaboration Likelihood Model proposes that persuasion can occur via central or peripheral routes: the central route involves careful consideration of arguments, while the peripheral route relies on simple cues. Cacioppo's theory further specifies that the central route is used when people are motivated and able to process arguments logically, while the peripheral route is used when they lack motivation or ability.
How To Start An Essay With A Quote Basic Tips Sample (Kayla Smith)
This document provides instructions for how to request an assignment be written by writers on the HelpWriting.net website. It outlines a 5-step process: 1) Create an account with a password and email. 2) Complete an order form with instructions, sources, and deadline. 3) Review bids from writers and choose one. 4) Review the completed paper and authorize payment. 5) Request revisions until satisfied with the work. It promises original, high-quality content or a full refund.
How To Format Essays Ocean County College NJ (Kayla Smith)
The personal narrative is from the perspective of Princess Lucy Willows who dreams of exploring nature despite her mother's objections. She devises a plan to sneak out of the castle at night with supplies in her backpack. After climbing out of her bedroom window, she runs into the forest and sets up her tent. However, while eating soup by the campfire, she hears a strange growling sound that causes her to worry that her mother was right to be concerned about the dangers that could be found in nature.
Essay Writing - A Student's Guide (Ideal For Yr 12 And (Kayla Smith)
This document provides guidance on securing a network server that is used for data storage, application sharing, and connecting desktop computers. It recommends implementing access controls for different user groups, encrypting data for security, and using virus checks, firewalls, and encryption protocols. The document also generates an encrypted message using a Vigenere cipher and lists goals to reduce security problems and deficiencies.
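For reference, a Vigenère cipher shifts each letter of the plaintext by the alphabet position of the corresponding letter of a repeating key. A minimal illustration, with a made-up key rather than the document's, follows.

# Minimal Vigenère cipher: shift each letter of the plaintext by the
# alphabet position of the corresponding (repeating) key letter.
# The key here is a made-up example, not the one in the document.
def vigenere(text, key, decrypt=False):
    out, k = [], 0
    for ch in text:
        if ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            shift = ord(key[k % len(key)].lower()) - ord("a")
            if decrypt:
                shift = -shift
            out.append(chr((ord(ch) - base + shift) % 26 + base))
            k += 1                      # advance the key only on letters
        else:
            out.append(ch)              # leave spaces/punctuation unchanged
    return "".join(out)

cipher = vigenere("secure the server", "LEMON")
print(cipher)                                   # encrypted message
print(vigenere(cipher, "LEMON", decrypt=True))  # round-trips to plaintext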
This document discusses the experience of being a sophomore in high school. It notes that sophomores are considered the lowest class and get treated as such, with worse seating and parking. However, the document also states that high school is meant to be a fun time. It provides an example of the positive school spirit at football games in the author's small town, where the community comes together at games to support the team.
Winter Snowflake Writing Paper By Coffee For The Kid (Kayla Smith)
The document discusses several key similarities and differences between the Inca and Mayan civilizations. Both empires had control over large territories at their height but collapsed, with the Mayans existing earlier than the Incas. The Mayans had several spoken languages and developed writing and hieroglyphics, while the Incas only had one spoken language with no written form. The Mayans were also more intellectually advanced and engaged in more brutal practices than the relatively peaceful Incas. While the civilizations declined, they both made important contributions to fields like mathematics and architecture.
Example Of Case Study Research Paper - 12+ Cas (Kayla Smith)
This document provides instructions for requesting and completing an assignment writing request through the HelpWriting.net platform. It outlines a 5-step process: 1) Create an account with a password and email. 2) Complete a 10-minute order form providing instructions, sources, and deadline. 3) Review bids from writers and choose one based on qualifications. 4) Review the completed paper and authorize payment if pleased. 5) Request revisions to ensure satisfaction, with a full refund option for plagiarized work. The process aims to fully meet customer needs through high-quality, original content.
This document provides instructions for writing a term paper through an online service. It outlines a 5-step process: 1) Create an account; 2) Submit a request with instructions and sources; 3) Review writer bids and select one; 4) Review the paper and authorize payment; 5) Request revisions to ensure satisfaction and receive a refund if plagiarized.
Essay Computers For And Against Telegraph (Kayla Smith)
This summary discusses the unnecessary practice of infant circumcision in the United States. The passage describes a scenario where a newborn baby is restrained and prepared for circumcision, highlighting the inhumane nature of the procedure. It notes that circumcision is rarely medically necessary and violates medical ethics principles. While common in the US, most of the world's men are left intact. The author argues circumcision should not be performed on infants who cannot consent.
A conceptual framework for international human resource management research i... (Kayla Smith)
This paper proposes a conceptual framework for analyzing the transfer of human resource management (HRM) practices from advanced economies to less developed economies. The framework is based on institutional theory and identifies three key dimensions to consider:
1) Regulatory/coercive factors related to differences in rules and regulations between home and host countries.
2) Cognitive/mimetic factors regarding differences in social norms and values between economies.
3) Normative factors stemming from differences in professionalization and education systems.
The framework aims to help multinational enterprises evaluate how institutional differences between advanced and less developed economies may create opportunities or constraints when transferring HRM practices internationally.
Associating to Create Unique Tourist Experiences of Small Wineries in Contine... (Kayla Smith)
This document discusses opportunities for small wineries in Croatia to create unique tourist experiences through association. It conducted interviews and a survey of winery owners in Virovitica-Podravina County. The findings show that while owners see potential, they are unfamiliar with concepts like scattered hotels and experience economies. They also face obstacles to association that limit their tourism offerings. The document argues that small wineries should form micro-clusters through activities like themed routes and accommodation to better compete in tourism and make use of their cultural and agricultural resources.
This document provides an overview of various academic reference management software options. It describes the two main categories of reference managers as desktop-based tools and cloud-based tools. Some key desktop-based options mentioned include Bookends, Sente, and Papers, which are proprietary Mac applications. Mendeley and Zotero are described as major cloud-based options that began as browser-only software but now have desktop and mobile apps. ReadCube, Citavi, and Docear are also briefly outlined. The document provides details on how these different options approach collecting references, organizing and annotating them, and integrating with word processing. It concludes by advising readers to consider their needs and preferences before choosing a reference manager to try.
Main Java[All of the Base Concepts}.docx (adhitya5119)
This is part 1 of my Java Learning Journey. It contains custom methods, classes, constructors, packages, multithreading, try-catch blocks, finally blocks, and more.
Leveraging Generative AI to Drive Nonprofit Innovation (TechSoup)
In this webinar, participants learned how to utilize Generative AI to streamline operations and elevate member engagement. Amazon Web Services experts presented customer-specific use cases and dived into low/no-code tools that are quick and easy to deploy through Amazon Web Services (AWS).
How to Manage Your Lost Opportunities in Odoo 17 CRM (Celine George)
Odoo 17 CRM allows us to track why we lose sales opportunities with "Lost Reasons." This helps analyze our sales process and identify areas for improvement. Here's how to configure lost reasons in Odoo 17 CRM.
How to Setup Warehouse & Location in Odoo 17 InventoryCeline George
In this slide, we'll explore how to set up warehouses and locations in Odoo 17 Inventory. This will help us manage our stock effectively, track inventory levels, and streamline warehouse operations.
How to Fix the Import Error in the Odoo 17Celine George
An import error occurs when a program fails to import a module or library, disrupting its execution. In languages like Python, this issue arises when the specified module cannot be found or accessed, hindering the program's functionality. Resolving import errors is crucial for maintaining smooth software operation and uninterrupted development processes.
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPRAHUL
This Dissertation explores the particular circumstances of Mirzapur, a region located in the
core of India. Mirzapur, with its varied terrains and abundant biodiversity, offers an optimal
environment for investigating the changes in vegetation cover dynamics. Our study utilizes
advanced technologies such as GIS (Geographic Information Systems) and Remote sensing to
analyze the transformations that have taken place over the course of a decade.
The complex relationship between human activities and the environment has been the focus
of extensive research and worry. As the global community grapples with swift urbanization,
population expansion, and economic progress, the effects on natural ecosystems are becoming
more evident. A crucial element of this impact is the alteration of vegetation cover, which plays a
significant role in maintaining the ecological equilibrium of our planet.Land serves as the foundation for all human activities and provides the necessary materials for
these activities. As the most crucial natural resource, its utilization by humans results in different
'Land uses,' which are determined by both human activities and the physical characteristics of the
land.
The utilization of land is impacted by human needs and environmental factors. In countries
like India, rapid population growth and the emphasis on extensive resource exploitation can lead
to significant land degradation, adversely affecting the region's land cover.
Therefore, human intervention has significantly influenced land use patterns over many
centuries, evolving its structure over time and space. In the present era, these changes have
accelerated due to factors such as agriculture and urbanization. Information regarding land use and
cover is essential for various planning and management tasks related to the Earth's surface,
providing crucial environmental data for scientific, resource management, policy purposes, and
diverse human activities.
Accurate understanding of land use and cover is imperative for the development planning
of any area. Consequently, a wide range of professionals, including earth system scientists, land
and water managers, and urban planners, are interested in obtaining data on land use and cover
changes, conversion trends, and other related patterns. The spatial dimensions of land use and
cover support policymakers and scientists in making well-informed decisions, as alterations in
these patterns indicate shifts in economic and social conditions. Monitoring such changes with the
help of Advanced technologies like Remote Sensing and Geographic Information Systems is
crucial for coordinated efforts across different administrative levels. Advanced technologies like
Remote Sensing and Geographic Information Systems
9
Changes in vegetation cover refer to variations in the distribution, composition, and overall
structure of plant communities across different temporal and spatial scales. These changes can
occur natural.
This presentation was provided by Steph Pollock of The American Psychological Association’s Journals Program, and Damita Snow, of The American Society of Civil Engineers (ASCE), for the initial session of NISO's 2024 Training Series "DEIA in the Scholarly Landscape." Session One: 'Setting Expectations: a DEIA Primer,' was held June 6, 2024.
Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...Diana Rendina
Librarians are leading the way in creating future-ready citizens – now we need to update our spaces to match. In this session, attendees will get inspiration for transforming their library spaces. You’ll learn how to survey students and patrons, create a focus group, and use design thinking to brainstorm ideas for your space. We’ll discuss budget friendly ways to change your space as well as how to find funding. No matter where you’re at, you’ll find ideas for reimagining your space in this session.
A workshop hosted by the South African Journal of Science aimed at postgraduate students and early career researchers with little or no experience in writing and publishing journal articles.
How to Make a Field Mandatory in Odoo 17Celine George
In Odoo, making a field required can be done through both Python code and XML views. When you set the required attribute to True in Python code, it makes the field required across all views where it's used. Conversely, when you set the required attribute in XML views, it makes the field required only in the context of that particular view.
This document provides an overview of wound healing, its functions, stages, mechanisms, factors affecting it, and complications.
A wound is a break in the integrity of the skin or tissues, which may be associated with disruption of the structure and function.
Healing is the body’s response to injury in an attempt to restore normal structure and functions.
Healing can occur in two ways: Regeneration and Repair
There are 4 phases of wound healing: hemostasis, inflammation, proliferation, and remodeling. This document also describes the mechanism of wound healing. Factors that affect healing include infection, uncontrolled diabetes, poor nutrition, age, anemia, the presence of foreign bodies, etc.
Complications of wound healing like infection, hyperpigmentation of scar, contractures, and keloid formation.
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Analyzing And Predicting Focus Of Attention In Remote Collaborative Tasks
Analyzing and Predicting Focus of Attention
In Remote Collaborative Tasks
Jiazhi Ou, Lui Min Oh, Susan R. Fussell, Tal Blum, Jie Yang
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213, USA
{jiazhiou, keithoh, sfussell, tblum, jie.yang}@cmu.edu
ABSTRACT
To overcome the limitations of current technologies for remote
collaboration, we propose a system that changes a video feed
based on task properties, people’s actions, and message
properties. First, we examined how participants manage different
visual resources in a laboratory experiment using a collaborative
task in which one partner (the helper) instructs another (the
worker) how to assemble online puzzles. We analyzed helpers’
eye gaze as a function of the aforementioned parameters. Helpers
gazed at the set of alternative pieces more frequently when it was
harder for workers to differentiate these pieces, and less
frequently over repeated trials. The results further suggest that a
helper’s desired focus of attention can be predicted based on task
properties, his/her partner’s actions, and message properties. We
propose a conditional Markov model classifier to explore the
feasibility of predicting gaze based on these properties. The
accuracy of the model ranged from 65.40% for puzzles with easy-to-name pieces to 74.25% for puzzles with more difficult-to-name pieces. The results suggest that we can use our model to
automatically manipulate video feeds to show what helpers want
to see when they want to see it.
Categories and Subject Descriptors
H5.3. Information interfaces and presentation (e.g., HCI): Group
and organizational interfaces – collaborative computing,
computer-supported collaborative work
General Terms
Algorithms, Experimentation, Human Factors
Keywords
Remote Collaborative Tasks, Eye Tracking, Focus of Attention,
Keyword Spotting, Computer-Supported Cooperative Work
1. INTRODUCTION
Imagine that an engineer in the United States needs to instruct a
technician in India on how to service a faulty machine. They are
forced by circumstances to collaborate over geographical distance.
The participants in this collaborative task must work on physical
objects. Their participation can be differentiated into a helper role
(the person offering the knowledge to guide the operations) and a
worker role (the person who actually performs the physical
actions). Given the dynamic nature of collaborative tasks, helpers
and workers must coordinate their interaction, so that assistance is
provided in a timely manner.
Video systems have emerged to help bridge the distance between
remote collaborators on physical tasks by providing them a shared
visual space [15] for conversational grounding ([4], [6]). We are
interested in how a better understanding of task dynamics can
improve existing video systems.
Video systems that provide the helper with some view of the
worker’s environment have been shown to improve task performance compared to audio-only systems ([8], [9], [11],
[15]). However, these systems are handicapped by the reality of
expensive bandwidth consumption and by the limited view angle
and mobility of the camera. These systems, at best, can provide
only a subset of the visual information available in side-by-side
conditions. Some systems have attempted to address these issues
by having a pan/tilt/zoom camera remotely controlled by the
helper ([17], [18]); however, the task of manipulating the camera
interferes with smooth interpersonal communication [21]. Other
systems offering multiple views simultaneously are bandwidth
intensive, and have not been proven to be beneficial [9]. Systems
that allow switching between multiple views (e.g., [10])
circumvent bandwidth limitations, but incur high equipment costs
and hinder common understandings of what view of the
environment is being shared.
We hypothesize that if we can show the remote helper the desired
view of the worker’s environment in any specific instance of the
task automatically, it will free the helper from having to control
the camera. At the same time, the helper will have the necessary
visual information to communicate effectively and assist the
worker in the collaborative task. Before we can design a system
that shows the right view at the right time, however, we need a
better understanding of how properties such as the nature of the
physical task, partners’ actions, and speech characteristics affect
helpers’ visual attention.
To examine this issue, we created a real-time collaborative online
task in which a remote helper instructs a worker on how to build a
puzzle. We first investigated whether the helper’s focus of
attention towards the different visual resources showed
regularities by analyzing the effects of task properties, partner
actions, and message content on the helper’s gaze. The results,
obtained from previous work [23], suggest that the most beneficial
view of the workspace can be predicted based on these
parameters. In the current study, we build on previous findings by
creating and training a classifier that attempts to predict a helper’s
desired focus of attention. Following the development phase, we
evaluated the classifier’s performance. The results demonstrate
the feasibility of creating intelligent camera systems based on task
properties, partner actions, and message content.
In the remainder of the paper, we first discuss relevant previous
work. Then, we describe the experiment in detail. Next, we
present the experimental results, our proposed gaze prediction
algorithm, and its subsequent evaluation. We conclude with a
discussion of some remaining issues and our plans for future work.
2. RELATED WORK
The successful performance of collaborative physical tasks
requires tight coordination of conversation and action. People plan
their utterances by monitoring others' activities and changes in
task status to determine what steps should be taken next. Video
conferencing tools that provide remote helpers with views of a
workspace lead to more efficient task performance than audio-
only communications tools (e.g., [9][22]). However, performance
with video systems rarely equals that of face-to-face interaction.
This may be because video cameras do not always show the right part of a workspace at the right resolution at the right time.
Clark’s theory of conversational grounding (e.g., [4] [5]) suggests
that helpers will look at targets that help them determine whether
or not their messages have been understood as intended. Other
research has indicated that gaze patterns of speakers and listeners
are closely linked to the words spoken, and help in the timing and
synchronization of utterances (e.g., [1]). Vertegaal et al. found
that in multi-party conversations, speakers looked at the person
they were talking to 77% of the time and listeners looked at the
speaker 88% of the time. They also built a multi-agent
conversational system that uses eye gaze input to determine to
which agent the user is listening or speaking [28]. Stiefelhagen et
al. developed a system to estimate participants’ focus of attention
from gaze directions and sound sources. They demonstrated that
acoustic information provides 8% relative error reduction
compared to only using one modality [27].
Recent studies demonstrate that people naturally look at objects or
devices with which they are interacting. Campana et al. describe a
system that uses eye movements to determine what a speaker is
referring to [3]. Maglio et al. investigated how people use speech
and gaze when interacting with an “office of the future,” in which
they could interact with office applications (e.g. Calendar, Map
and Address Book) via speech recognition, and found that people
nearly always looked at a device before making a request to it [19].
Similarly, Brumitt et al. [2] investigated speech, gesture, touch,
and other nontraditional interfaces to control lights in their “Easy
Living Lab”, a mock up of a small living room. They reported that
people typically looked at the lights they wanted to control.
Eye gaze has been used as an important modality for building new
human computer interfaces. Earlier work includes eye-controlled
interfaces for the disabled [12], and eye gaze word processors [7].
In those interfaces, users can either make use of intentional or
natural eye movements. Sibert and Jacob compared eye gaze with
mouse input [25]. They found that eye gaze selection was faster than selecting with a mouse. Stiefelhagen and Yang
illustrated a multimodal interface using eye gaze and speech to
drive a panoramic image viewer [26].
A major problem of gaze-based interfaces is the difficulty in
interpreting eye movement patterns due to unconscious eye
movements such as saccades and to gaze tracking failure. Jacob
[13] approached the problem by predicting a series of fixations
separated by saccades and fitting the raw data to this model.
Salvucci used hidden Markov models to interpret gaze data and
reported good interpretation results in an eye typing study [24].
Oh et al. built a gaze-aware interface, “Look-To-Talk” (LTT), that
could direct the speech to a software agent in a multi-user
collaborative environment [20]. They compared LTT to “Talk-To-
Talk” (TTT), a spoken keyword-driven paradigm, and “Push-To-
Talk” (PTT), a keyboard-driven paradigm. They concluded that
LTT is a promising interface. In this research, we are interested in
the relationship between spoken utterances and gaze. Our goal is
to predict focus of attention from keywords extracted from the
dialogue during a collaborative physical task, i.e., Talk-to-Look.
3. METHOD
3.1 Design
Our experiment used an online jigsaw puzzle task adapted from
Kraut and colleagues [16], in which a helper and worker
collaborated to construct a series of puzzles. The helper could
gaze freely among three areas to obtain visual information as he
or she provided instructions:
• The pieces bay, in which the puzzle pieces were stored. By
monitoring the pieces bay, the helper could assess whether the
worker had selected the correct piece from among the
alternatives.
• The workspace, in which the worker was constructing the
puzzle. By monitoring the workspace, the helper could assess
whether the worker had positioned a piece correctly.
• The target solution, which showed how the puzzle should be
constructed. This appeared only on the helper’s screen.
We manipulated the differentiability of the puzzle pieces (solid
colors vs. shaded) and the complexity of the puzzle (5, 10 or 15
pieces). Each participant completed three puzzles for each
condition (piece differentiability x puzzle complexity), randomly
presented in a single block. The design formed a 2 (piece
differentiability) x 3 (puzzle complexity) x 9 (trial) factorial
within-subjects study. The order of the puzzle blocks was
counterbalanced across participants.
3.2 Materials
We created 18 target puzzles by randomly selecting color pieces
and forming configurations of 5, 10 or 15 pieces (see Figure 1).
There were 6 different puzzles for each level of complexity, three
formed from a pieces pool with solid colors (easier to describe),
and three formed from a pieces pool with shaded colors (harder to
describe). In the former case, there were at most two shades of the
same color in the pieces pool (e.g., two distinctly different greens,
such as bright green and dark green); in the latter case, there were
five shades of the same color.
Figure 1. Examples of puzzle configurations with 5, 10, and 15
pieces, respectively.
The worker’s screen was laid out so that the workspace and pieces
bay were adjacent to each other. The helper’s display was
designed such that the 3 areas (workspace, pieces bay, and target)
were in a triangular shape. The helper could shift his/her eye gaze
from any area directly to any of the other two areas.
3.3 Equipment and Software
LCD monitors were used for displays and adjusted for color
consistency. Sony wireless microphones were used to record the
conversation between the subjects on separate channels.
An eye tracking system, consisting of an ISCAN RK-426PC
pupil/corneal reflection tracker, an ISCAN HMEIS head mounted
eye-imaging system with head tracking sensor, a Polhemus
InsideTRAK magnetic position sensor, and a stand-alone scene
camera, was calibrated to each helper and recorded the
intersection of the helper’s line of sight with the screen plane at 60
Hz. The video feed of the scene, showing the coordinates of the
helper’s eye gaze and the worker’s actions, was then recorded
using a Panasonic DV-VCR.
The helper’s focus of attention on any one of the 3 areas
(workspace, pieces bay, and target) over time was derived from
eye gaze coordinates. To compensate for the zero error of the magnetic sensor and the pupil/corneal reflection tracker, which made the raw coordinates unreliable, we clustered all gaze coordinates in each
session using K-Means vector quantization (VQ). We first chose 3
initial centers in the same triangular fashion as the 3 areas on the
helper’s display. Within 10 iterations, the algorithm converged
and the outputs were 3 new centers. Subsequently, the helper’s
gaze coordinates were indexed based on their proximity to these 3
new centers. An example of the clustering result of the eye gaze
coordinates for one of the sessions is shown in Figure 2.
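To make the clustering step concrete, here is a minimal NumPy sketch of this K-Means VQ procedure; the initial centers and the gaze_xy array are hypothetical stand-ins for the actual display geometry and tracker output:

```python
import numpy as np

def cluster_gaze(points, init_centers, n_iters=10):
    """Cluster gaze samples around 3 centers with K-Means and
    index each sample by its nearest center (vector quantization)."""
    centers = np.asarray(init_centers, dtype=float)
    for _ in range(n_iters):
        # Distance from every gaze point to every center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned points.
        for k in range(len(centers)):
            if np.any(labels == k):
                centers[k] = points[labels == k].mean(axis=0)
    return labels, centers

# Hypothetical initial centers arranged like the helper's display:
# workspace (lower left), pieces bay (lower right), target (top).
init = [(300.0, 600.0), (900.0, 600.0), (600.0, 150.0)]
gaze_xy = np.random.default_rng(0).uniform(0, 1024, size=(1000, 2))
labels, centers = cluster_gaze(gaze_xy, init)
```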
Figure 2. An example of the eye gaze distribution from one session of the tasks. After running the K-Means VQ algorithm, we obtained 3 clusters and classified each point’s focus of attention.
3.4 Participants and Procedure
Twenty-four college undergraduate and graduate students, all with
normal color vision, participated in this study. Participants were
randomly assigned to the helper and worker roles. They were
seated in the same room at their respective computer terminals
with a barrier between them so that they could hear but not see
one another, simulating a remote collaboration environment.
The experimenter first calibrated the eye-tracker on the helper.
After calibration, the helper gave verbal instructions to the worker
on how to select puzzle pieces from the pieces bay and assemble
them properly in the workspace to complete the target puzzle. The
worker was allowed to converse freely with the helper and ask
questions whenever necessary. The helper was able to see the
worker’s actions in the pieces bay and workspace. In order to
prevent eye fatigue, participants were given a 5-minute break after
half of the puzzles were completed. After the break, the
experimenter recalibrated the eye-tracker. Sessions lasted 60 to 90
minutes.
4. STATISTICAL ANALYSIS
We employed statistical analysis to look at the relationships
between the helper’s gaze pattern and the following three factors:
task properties, worker’s actions and message content. Details of
the analysis were presented in [23].
4.1 Gaze and Task Properties
To look at how task properties affect gaze, we used a mixed-model design in which subject was a random factor and shading, puzzle complexity, trial, and block were fixed factors. This model takes individual differences in gaze into account while
computing the fixed effects. For this analysis, we focus on
percentage gaze directed at the pieces bay. However, because
gaze toward the target (puzzle solution) remained relatively
constant, gaze toward the pieces bay and gaze toward the
workspace are inversely related (r = -.76). Consequently, the
results for gaze toward the workspace show essentially the same
pattern of significance but in the opposite direction. Overall, the
fit of this model to the data was excellent (R² = .69). A total
of 18% of the variance was accounted for by the subject variable.
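For readers who want to reproduce this kind of analysis, a rough sketch using statsmodels’ MixedLM is shown below; the data frame, column names, and synthetic values are hypothetical stand-ins, not the study’s data:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per puzzle trial.
rng = np.random.default_rng(0)
n = 216
df = pd.DataFrame({
    "shading": rng.choice(["solid", "shaded"], n),
    "complexity": rng.choice([5, 10, 15], n),
    "trial": rng.integers(1, 10, n),
    "subject": rng.choice([f"s{i}" for i in range(12)], n),
})
df["pct_gaze_pieces"] = (20 + 15 * (df["shading"] == "shaded")
                         - 0.3 * df["complexity"] - 0.8 * df["trial"]
                         + rng.normal(0, 5, n))

# Subject enters as the random factor; the task properties are fixed effects.
model = smf.mixedlm("pct_gaze_pieces ~ shading * complexity + trial",
                    data=df, groups=df["subject"])
print(model.fit().summary())
```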
As shown in Figure 3, gaze toward the pieces bay was
significantly higher for shaded than for solid pieces (F [1, 182] =
255.98, p < .0001), and significantly lower for puzzles with more
pieces (F [2, 182] = 11.28, p < .0001). There was no interaction
between shading and puzzle complexity (F < 1, ns).
We also found a significant effect of trial (F [1, 182] = 37.68, p <
.0001), indicating that helpers spent less time monitoring the
pieces bay over trials. However, as can be seen in Figure 4, the
trial effect only held for the easy-to-describe (solid) pieces; for
shaded pieces, gaze toward the pieces bay remained high across
all trials (for the interaction, F [1, 182] = 27.49, p < .0001).
4.2 Gaze and Worker’s Actions
Worker actions in the workspace and pieces bay were
automatically detected. As anticipated, gaze toward the workspace
was higher when the worker was acting in that area, and vice
versa when the worker was acting in the pieces bay (see Figure 5).
However, the effect of worker actions on gaze was not significant
for solid-color puzzles. We assume this is because the solid colors
are easy to describe and distinguish so that the helper can be
confident that the worker is grabbing the right piece without
monitoring the pieces bay.
4.3 Gaze and Instructional Content
In order to analyze the relationships between the helper’s eye gaze
and the content of their instructions, two coders separated
transcribed utterances into clauses and coded each of them as one
of the categories shown in Table 1. In a subset of the data, the two
coders agreed with each other 95% of the time.
Table 1. Coding of clauses

Code  Instructional Content
0     Description of color piece, e.g., “Take the green block”
1     Description of location, e.g., “And then put that to the right of the dark gray”
2     Correcting color piece, e.g., “A little lighter than that”
3     Correcting location, e.g., “It’s on the very right”
4     Others
Figure 3. Percentage of gaze directed toward the pieces bay as
a function of piece discriminability (shading) and puzzle size.
Figure 4. Percentage gaze directed at the pieces bay as a
function of piece discriminability and trial.
We then computed eye gaze distributions for all clause segments
as a function of clause coding. Distributions were computed to
1/60 second (Figure 6). The results illustrate that gaze pattern
varies as a function of clause coding (description/correction of the
next piece vs. description/correction of its location within the
puzzle). When describing a piece, helpers overwhelmingly look at
the pieces bay, whereas when they are describing a location, they
are much more likely to look at the workspace.
Figure 5. Helper gaze as a function of worker actions in the workspace and pieces bay (shown separately for solid-color and shaded-color tasks).
Figure 6. Relationship between helper message content and gaze toward the workspace and pieces bay.
5. PREDICTING FOCUS OF ATTENTION
The statistical analysis shows that a helper’s gaze is highly related
to the dialogue content and worker’s actions. We implemented a
conditional Markov model to predict the helper’s gaze and
explore the possibility of building an automated camera system to
support remote physical collaboration.
5.1 The Problem
We define the problem as a classification problem. Given a section of the puzzle task with the sequence of transcribed words as {w1, w2, …, wN} and their corresponding start and end times, derived from the wave signals by the speech recognizer, as {s1, e1, s2, e2, …, sN, eN}, the worker’s action $m_t$ at time t is obtained by the analysis of the mouse click/move events, with $m_t \in \{-1, 0, 1\}$, where -1, 0, 1 are the codes for Not-Moving, Moving-in-Workspace, and Moving-in-Pieces-Bay respectively. $g_t$ is defined as the helper’s gaze coded at time t, with $g_t \in \{0, 1, 2\}$, where 0, 1, 2 are the codes for workspace, pieces bay, and target respectively. Both $m_t$ and $g_t$ were processed at a sampling rate of 60 Hz. An example of how a helper’s dialogue and gaze and the worker’s action are synchronized is shown in Figure 7.
At each time t we predict the helper’s gaze as $\hat{g}_t$. In our evaluation, we only consider gaze to the workspace and pieces bay, ignoring gazes towards the target. Let $g_t$, t = 1, …, T, be the actual gaze codes collected from the experiment and processed by the VQ algorithm (Section 3.3). The classification error at time t is defined as:

$$\mathrm{err}(\hat{g}_t, g_t) = \begin{cases} 1, & \text{if } (g_t = 0 \text{ or } 1) \text{ and } \hat{g}_t \neq g_t \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$

and the performance of the classifier in one puzzle task, Acc, is defined as:

$$\mathrm{Acc} = 1 - \frac{\sum_{t=1}^{T} \mathrm{err}(\hat{g}_t, g_t)}{\left|\{g_i \mid g_i = 0 \text{ or } 1\}\right|}, \qquad (2)$$

where $|\{g_i \mid g_i = 0 \text{ or } 1\}|$ is the number of gaze samples excluding those towards the target.
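A minimal sketch of this metric, assuming the predicted and actual gaze codes are available as integer sequences sampled at the same rate:

```python
import numpy as np

def accuracy(pred, actual):
    """Eq. (2): fraction of correctly predicted gaze codes, counted
    only where the actual gaze is 0 (workspace) or 1 (pieces bay);
    samples gazing at the target (code 2) are ignored."""
    pred, actual = np.asarray(pred), np.asarray(actual)
    mask = (actual == 0) | (actual == 1)         # exclude target gazes
    errors = (pred[mask] != actual[mask]).sum()  # Eq. (1), summed over t
    return 1 - errors / mask.sum()

# 2 errors among 5 non-target samples -> Acc = 0.6
print(accuracy([0, 1, 1, 0, 0, 1], [0, 0, 1, 2, 1, 1]))
```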
5.2 Offline Prediction
For comparison purposes, we first used human dialogue coding to
perform the classification offline. The helper’s gaze at time t is
predicted using maximum likelihood estimation:
$$\hat{g}_t = \arg\max_{j=0 \text{ or } 1} \Pr\{g_t = j \mid \mathrm{clause}_t, m_t\}, \qquad (3)$$

where $\mathrm{clause}_t \in \{0, 1, 2, 3\}$ is the clause coding in Table 1 (Section 4.3) at time t. In both training and testing phases the clause coding
and the worker’s actions are known. We estimated the conditional
probabilities of each gaze target from training sample frequencies.
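A minimal sketch of this offline classifier, assuming the training data arrive as one (clause, action, gaze) triple per time step; the frequency-based estimate mirrors Eq. (3):

```python
from collections import Counter

def train_offline(samples):
    """samples: iterable of (clause_code, action_code, gaze_code) triples.
    Returns frequency counts of each gaze code per (clause, action) context."""
    counts = Counter()
    for clause, action, gaze in samples:
        counts[(clause, action, gaze)] += 1
    return counts

def predict_offline(counts, clause, action):
    """Eq. (3): the gaze code j in {0, 1} maximizing the estimated
    conditional probability Pr{g_t = j | clause_t, m_t}."""
    return max((0, 1), key=lambda j: counts[(clause, action, j)])

# Toy usage: clause 0 ("describe piece") while the worker moves in the
# pieces bay mostly co-occurs with gaze at the pieces bay in this sample.
counts = train_offline([(0, 1, 1), (0, 1, 1), (0, 1, 0), (1, 0, 0)])
print(predict_offline(counts, clause=0, action=1))  # -> 1 (pieces bay)
```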
5.3 Online Prediction
As our objective is to control the camera automatically in a video
system (such as [22]) for remote collaborative physical tasks, the
helper’s focus of attention has to be predicted in advance based on
previous information from the dialogue and worker’s actions.
Using the predictive model, the camera shifts between the
workspace and the pieces bay. We do not need to predict camera
shift to the target puzzle (solution), as it is always available at the
helper’s side. Moreover, in online prediction, the system does not
have supervised knowledge of clause boundaries and coding.
We formulate online prediction as follows: at each sampling point t, given the previous words (w1, w2, …, wi) and the previous worker’s actions (m1, m2, …, mt) as input, classify the next gaze code $\hat{g}_{t+1}$ as 0 (workspace) or 1 (pieces bay). The classification problem is illustrated in Figure 8.
Figure 7. Demonstration of the three sources of data in a 12-second period. The helper was giving the instruction “OK UMM NOW UMM THE DARKEST BLACK PIECE UMM AND THEN PUT THAT UMM TO THE RIGHT OF THE LAST ONE YOU JUST PUT DOWN”. Starting and ending times of each word are aligned with the worker’s action (-1: Not-Moving, 0: Moving-in-Workspace, 1: Moving-in-Pieces-Bay) and the helper’s gaze (0: Workspace, 1: Pieces-Bay, 2: Target).
Figure 8. A multimodal classifier. w1, w2, …, wi and m1, m2, …, mt are dialogue and action data collected until point t, respectively; $\hat{g}_{t+1}$ is the classified gaze code.
Winner-Takes-All Strategy
Compared with the worker’s action data (of which the majority is
-1, or Not-Moving), the helper’s utterances are a much richer
source of information. Due to the difficulty in capturing the
fluctuations of the helper’s gaze within the start and end times of a
single word, we don’t expect to predict the gaze well at every
sampling time. Therefore we apply a winner-takes-all strategy and
smooth the helper’s gaze and worker’s action data based on the
boundary of words. That is, the decision is only made at the end
of each word. Let wi, si, ei be the ith word and its starting and
ending time, and define the smoothed action Mi and gaze Gi as:
$$M_i = \begin{cases} -1, & \text{if no movements occurred between times } e_{i-1} \text{ and } e_i \\ \arg\max_{j=0 \text{ or } 1} \left|\{m_t \mid m_t = j,\ e_{i-1} < t \le e_i\}\right|, & \text{otherwise} \end{cases} \qquad (4)$$

$$G_i = \arg\max_{j=0 \text{ or } 1} \left|\{g_t \mid g_t = j,\ e_{i-1} < t \le e_i\}\right| \qquad (5)$$

$M_i$ and $G_i$ are interpreted as the majorities of the action and gaze codes between times $e_{i-1}$ and $e_i$ (ignoring the target area). This process is graphically shown in Figure 9.
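A minimal sketch of this smoothing step, assuming the 60 Hz codes and the per-word end times are available as plain Python sequences:

```python
from collections import Counter

def majority(codes, keep=(0, 1), default=None):
    """Majority vote over the codes in one word window, considering
    only workspace/pieces-bay codes (the `keep` set)."""
    votes = Counter(c for c in codes if c in keep)
    return votes.most_common(1)[0][0] if votes else default

def smooth(word_end_times, times, m, g):
    """Eq. (4)/(5): collapse the 60 Hz action codes m and gaze codes g
    into one smoothed code per word, using word end times as boundaries."""
    M, G, prev = [], [], 0.0
    for e in word_end_times:
        window = [i for i, t in enumerate(times) if prev < t <= e]
        # Eq. (4): -1 (Not-Moving) if no movement occurred in the window.
        M.append(majority([m[i] for i in window], default=-1))
        # Eq. (5): majority gaze code, ignoring the target area.
        G.append(majority([g[i] for i in window]))
        prev = e
    return M, G
```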
Now the problem becomes: given w1, w2, …, wi and M1, M2, …,
Mi as input features, output the prediction of Gi+1 as 0
(workspace) or 1 (pieces bay).
A Conditional Markov Model Classifier
Since clause boundaries and coding have proven to be very useful
for gaze prediction and are not available in online settings, we
predict the clause coding of each word. To capture the
dependencies between current word/action and previous
word/action more directly we propose a conditional Markov
model. Pairing gaze (G = 0 or 1) and clause coding (Clause = 0, 1,
2, or 3), we formed a sequence of 8 possible states. Let $\vec{W}$ and $\vec{M}$ be the word sequence and action sequence respectively. The probability of a state sequence $\vec{S}$ conditioned on the observation sequences $\vec{W}$ and $\vec{M}$ is inferred through factors $\alpha$, $\beta$, and $\gamma$:

$$\Pr\{\vec{S} \mid \vec{W}, \vec{M}\} \propto \prod_i \alpha(s_i, s_{i-1}) \cdot \beta(w_i, w_{i-1}, w_{i-2}, s_i) \cdot \gamma(M_i, M_{i-1}, M_{i-2}, s_i), \qquad (6)$$
and we define:
$$\begin{aligned} \alpha(s_i, s_{i-1}) &= \Pr\{s_i \mid s_{i-1}\}, \\ \beta(w_i, w_{i-1}, w_{i-2}, s_i) &= \Pr\{w_i, w_{i-1}, w_{i-2} \mid s_i\}, \\ \gamma(M_i, M_{i-1}, M_{i-2}, s_i) &= \Pr\{M_i, M_{i-1}, M_{i-2} \mid s_i\}. \end{aligned} \qquad (7)$$
$\alpha$ captures the relationship between the current state and the previous state, while $\beta$ and $\gamma$ characterize the features $\vec{W}$ and $\vec{M}$ using trigrams. Figure 10 shows the factor graph representation of the conditional Markov model.
Figure 9. The smoothed action and gaze data based on the
Winner-Takes-All strategy (Eq. (4) and Eq. (5)).
Figure 10. A conditional Markov model.
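As a rough illustration of Eq. (6), the score of a candidate state sequence can be accumulated in log space from the three factors; alpha, beta, and gamma are assumed to be callables returning the probabilities defined in Eq. (7):

```python
import math

def sequence_score(states, words, actions, alpha, beta, gamma):
    """Unnormalized log-probability of a state sequence under Eq. (6):
    the product over i of alpha(s_i, s_{i-1}), the word-trigram factor
    beta, and the action-trigram factor gamma."""
    score = 0.0
    for i in range(2, len(states)):
        score += math.log(alpha(states[i], states[i - 1]) + 1e-12)
        score += math.log(beta(states[i], words[i - 2:i + 1]) + 1e-12)
        score += math.log(gamma(states[i], actions[i - 2:i + 1]) + 1e-12)
    return score
```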
To decrease the number of parameters, we classify each word as
one of the following 13 categories: Color_Name (e.g., “red”),
Preposition (e.g., “above”), Adjective (e.g., “darkish”), Verb (e.g.,
“take”), Linking_Verb (e.g. “are”), Noun (e.g., “box”), Pronoun
(e.g., “you”), Positive_Feedback (e.g., “yes”), Negative_Feedback
(e.g., “wrong”), Adverb (e.g., “very”), Conjunction (e.g., “and”),
Non_Word_Utt (e.g., “umm”), and Other. $\beta$ is approximated by:
$$\beta(w_i, w_{i-1}, w_{i-2}, s_i) = \Pr\{w_i, w_{i-1}, w_{i-2} \mid s_i\} \approx \Pr\{c_i, c_{i-1}, c_{i-2} \mid s_i\}, \qquad (8)$$
where $c_i$ is the category of $w_i$.
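A rough sketch of the category mapping and of estimating the approximated $\beta$ of Eq. (8) from counts; the keyword lists are hypothetical, since the paper does not publish its full lexicon:

```python
from collections import Counter

# Hypothetical fragments of the 13-category lexicon.
CATEGORIES = {
    "red": "Color_Name", "green": "Color_Name",
    "above": "Preposition", "right": "Preposition",
    "take": "Verb", "put": "Verb",
    "yes": "Positive_Feedback", "umm": "Non_Word_Utt",
}

def categorize(word):
    return CATEGORIES.get(word.lower(), "Other")

def train_beta(samples):
    """samples: iterable of (state, (w_{i-2}, w_{i-1}, w_i)) pairs.
    Returns an estimate of Pr{c_i, c_{i-1}, c_{i-2} | s_i} (Eq. (8))."""
    tri_counts, state_counts = Counter(), Counter()
    for state, words in samples:
        tri = tuple(categorize(w) for w in words)
        tri_counts[(state, tri)] += 1
        state_counts[state] += 1

    def beta(state, words):
        tri = tuple(categorize(w) for w in words)
        return tri_counts[(state, tri)] / max(state_counts[state], 1)

    return beta
```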
Since training is supervised, we employ a maximum likelihood method to learn the parameters, and Good-Turing smoothing to estimate unseen trigrams in the training data. Moreover, as
discussed in Section 4, the gaze distribution varies as a function of task characteristics. Therefore we estimate two sets of parameters, one for solid puzzles and the other for shaded puzzles. In testing we use the Viterbi algorithm to find the optimal path given the parameters and observations. The performance of the classifier was evaluated according to Eq. (2).
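A minimal sketch of such a Viterbi pass over the 8 states, assuming the factors of Eq. (6) have already been evaluated into arrays; the packing s = 4*gaze + clause is a hypothetical convention, not the paper’s stated encoding:

```python
import numpy as np

def viterbi(alpha, betas, gammas):
    """Most likely state sequence for the conditional Markov model of
    Eq. (6).  alpha: (S, S) transition matrix alpha[prev, cur];
    betas, gammas: (T, S) per-step word- and action-trigram factors."""
    T, S = betas.shape
    logp = np.log(betas[0] * gammas[0] + 1e-12)  # initial step
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        # For each current state, pick the best previous state.
        scores = logp[:, None] + np.log(alpha + 1e-12)
        back[t] = scores.argmax(axis=0)
        logp = scores.max(axis=0) + np.log(betas[t] * gammas[t] + 1e-12)
    # Backtrack the optimal path from the best final state.
    path = [int(logp.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    path.reverse()
    # Under the hypothetical packing s = 4 * gaze + clause,
    # the predicted gaze at each step is s // 4.
    return path
```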
5.4 Experimental Results and Discussion
To test our classifier, we used the data described earlier (i.e., 216
puzzle tasks from 12 pairs of subjects). We trained and tested with
solid and shaded puzzles separately. Given puzzles of the same
color differentiability, we used half of the data for training and the
other half for testing. Then we switched the training set and
testing set and performed the experiment again.
Offline and online gaze prediction accuracies for solid and shaded
puzzles are listed in Table 2. In predicting gaze, the accuracies for
shaded puzzles are significantly higher than those for solid
puzzles. This can be explained by the statistical analysis in Section
4.2 (Figure 5), showing that data from workers’ actions are more
discriminative for shaded puzzles. Online prediction is not as
good as offline prediction because it is a more challenging
problem given that clause boundaries and coding are unknown.
Table 2. Test set accuracies in predicting helpers’ focus of attention

Task Property    Offline Prediction Accuracy    Online Prediction Accuracy
Solid Puzzles    69.81%                         65.40%
Shaded Puzzles   76.62%                         74.25%
Tables 3 and 4 show the confusion matrices. For solid puzzles, the system classified the workspace better, most likely because the prior probability of gaze toward the workspace is higher than that of gaze toward the pieces bay (see Section 4.3). In contrast, for
shaded puzzles, the pieces bay had a higher prior probability and
the classifier was better at classifying it.
Table 3. The confusion matrix for solid puzzles.

                     0 (Workspace)   1 (Pieces Bay)
Offline Prediction
  0 (Workspace)      79.56%          20.44%
  1 (Pieces Bay)     43.13%          56.87%
Online Prediction
  0 (Workspace)      75.17%          24.83%
  1 (Pieces Bay)     47.24%          52.76%
Table 4. The confusion matrix for shaded puzzles.

                     0 (Workspace)   1 (Pieces Bay)
Offline Prediction
  0 (Workspace)      61.11%          38.89%
  1 (Pieces Bay)     12.73%          87.27%
Online Prediction
  0 (Workspace)      57.56%          42.44%
  1 (Pieces Bay)     15.42%          85.58%
To examine the success of the conditional Markov classifier in
predicting dialogue content, we define the instructional coding
prediction accuracies as the percentage of correctly classified
words. The accuracies for solid-color and shaded-color puzzles
were 59.00% and 48.37%, respectively. Prediction of instructional
coding for solid-color puzzles was much better because helpers
used simpler language to describe the puzzle pieces.
6. CONCLUSION AND FUTURE WORK
In this paper, we have demonstrated the feasibility of predicting
focus of attention in remote collaborative tasks. The statistical
analyses demonstrate that in remote collaboration, the percentage
of time a helper looks at different targets is predictable from task
properties, the worker’s actions, and message content. The results
are consistent with a conversational grounding view of
collaboration (cf. [4]): When a helper lacks confidence that his/her
instructions were understood in the context of previous
interactions and a shared common vocabulary, he/she seeks
additional visual evidence of understanding from the worker's
environment.
Based on our analysis, we formulated the problem of predicting
gaze in remote collaboration as a multimodal classification
problem. We further employed a conditional Markov model to
predict gaze as well as clause coding in real time. The
experimental results show that overall accuracy is 65.40% for
solid color puzzles and 74.25% for shaded puzzles. These results
indicate the feasibility of developing intelligent video camera
systems that predict where a helper wants to look in real time
during remote collaboration. Such a system can optimally use
network resources and enhance remote collaboration.
Our future research will follow up on three aspects of our findings.
First, our classifier was more accurate when discriminative
workers’ action data was available. While it is easy to obtain this
data during online collaborative tasks, it is much more difficult in
3D tasks. This suggests the need to monitor and interpret the
worker’s actions in physical collaborations. Second, our current
model does not take workers’ messages into account.
Incorporating this information should enhance overall accuracies
of prediction. Finally, our accuracy rates were higher than random
guessing but still far from perfect. We plan to conduct behavioral
research to determine how good an intelligent video camera
system must be in order to be beneficial in practical use.
7. ACKNOWLEDGMENTS
This work was funded by National Science Foundation Grant
#0208903. Any opinions, findings, and conclusions or
recommendations expressed in this material are those of the
authors and do not necessarily reflect the views of the National
Science Foundation. We thank Darren Gergle for his statistical
advice, and Jeffrey Wong for the discussion sessions.
8. REFERENCES
[1] Argyle, M. & Cook, M. (1976). Gaze and Mutual Gaze.
Cambridge University Press.
[2] Brumitt B., Krumm J., Meyers B., & Shafer S. (2000). Let
there be light: Comparing interfaces for homes of the future.
IEEE Personal Communications, August 2000.
[3] Campana, E., Baldridge, J., Dowding, J., Hockey, B. A.,
Remington, R. W., & Stone, L. S. (2001). Using eye
movements to determine referents in a spoken dialogue
system. Proceedings of the 2001 Workshop on Perceptive
User Interfaces, pp. 1–5.
[4] Clark, H. H. (1996). Using Language. Cambridge, England:
Cambridge University Press.
[5] Clark, H. H., & Brennan, S. E. (1991). Grounding in
Communication. In L. B. Resnick, R. M. Levine, & S. D.
Teasley (Eds.). Perspectives on socially shared cognition
(pp. 127-149). Washington, DC: APA.
[6] Clark, H. H. & Wilkes-Gibbs, D. (1986). Referring as a
collaborative process. Cognition, 22, 1-39.
[7] Frey L. A., White K. P. Jr., Hutchinson T. E. (1990). Eye-
gaze word processing. IEEE Transactions on Systems, Man
and Cybernetics, 20(4), 944–950.
[8] Fussell, S. R., Kraut, R. E., & Siegel, J. (2000). Coordination
of communication: Effects of shared visual context on
collaborative work. Proceedings of CSCW 2000 (pp. 21-30).
NY: ACM Press.
[9] Fussell, S. R., Setlock, L. D., & Kraut, R. E. (2003). Effects
of head-mounted and scene-oriented video systems on
remote collaboration on physical tasks. Proceedings of CHI
2003 (pp. 513-520). NY: ACM Press.
[10] Gaver, W., Sellen, A., Heath, C., & Luff, P. (1993). One is
not enough: Multiple views in a media space. Proceedings of
Interchi '93 (pp. 335-341). NY: ACM Press.
[11] Gergle, D., Millan, D. R., Kraut, R. E., & Fussell, S. R.
(2004). Persistence matters: Making the most of chat in
tightly-coupled work. CHI 2004 (pp. 431-438). NY: ACM
Press.
[12] Hutchinson T. E., White K. P. Jr., Martin W. N., Reichert K.
C., Frey L. A. (1989). Human-computer interaction using
eye-gaze input. IEEE Transaction on Systems, Man, and
Cybernetics, 19, pp. 1527–1534.
[13] Jacob, R. J. K. (1993). Eye-movement-based human-
computer interaction techniques. In H. R. Hartson & D. Hix
(Eds.), Advances in Human-Computer Interaction, Vol. 4
(pp. 151–190). Norwood, NJ: Ablex.
[14] Kraut, R. E., Fussell, S. R., Brennan, S., & Siegel, J. (2003).
A framework for understanding effects of proximity on
collaboration: Implications for technologies to support
remote collaborative work. In P. Hinds & S. Kiesler (Eds.).
Technology and Distributed Work.
[15] Kraut, R. E., Fussell, S. R., & Siegel, J. (2003). Visual
information as a conversational resource in collaborative
physical tasks. Human Computer Interaction, 18, 13-49.
[16] Kraut, R. E., Gergle, D., & Fussell, S. R. (2002). The use of
visual information in shared visual spaces: Informing the
development of virtual co-presence. Proceedings of CSCW
2002 (pp. 31-40). NY: ACM Press.
[17] Kuzuoka, H., Kosuge, T., & Tanaka, K. (1994). GestureCam:
A video communication system for sympathetic remote
collaboration. Proceedings of CSCW 1994 (pp. 35-43). NY:
ACM.
[18] Kuzuoka, H., Oyama, S., Yamazaki, K., Suzuki, K., &
Mitsuishi, M. (2000). GestureMan: A mobile robot that
embodies a remote instructor's actions. Proceedings of
CSCW 2000 (pp. 155-162). NY: ACM Press.
[19] Maglio, P. P., Matlock T., Campbell C. S., Zhai S., Smith B.
A. (2000). Gaze and speech in attentive user interfaces. In
Proceedings of the International Conference on Multimodal
Interfaces, volume 1948 from LNCS. Springer, 2000.
[20] Oh, A., Fox, H., Kleek, M. V., Adler, A., Gajos, K., Morency, L., & Darrell, T. (2002). Evaluating Look-to-Talk: A gaze-aware interface in a collaborative environment. In Proceedings of CHI '02 Extended Abstracts on Human Factors in Computing Systems, pp. 650-651.
[21] Ou, J. (unpublished). DOVE-2: Combining gesture with
remote camera control.
[22] Ou, J., Fussell, S. R., Chen, X., Setlock, L. D., & Yang, J.
(2003). Gestural communication over video stream:
Supporting multimodal interaction for remote collaborative
physical tasks. In Proceedings of International Conference
on Multimodal Interfaces, Nov. 5-7, 2003, Vancouver,
Canada.
[23] Ou, J., Oh, L.M., Yang, J., & Fussell, S. R. (2005). Effects of
task properties, partner actions, and message content on eye
gaze patterns in a collaborative task. Proceedings of the
SIGCHI conference on Human factors in computing systems,
pp. 231 – 240. ACM Press.
[24] Salvucci, D. D. (1999). Inferring Intent in Eye-Based Interfaces: Tracing Eye Movements with Process Models. In Human Factors in Computing Systems: CHI 99, 1999.
[25] Sibert, L. E., and Jacob, R. J. (2000). Evaluation of Eye Gaze
Interaction, In Proceedings of the SIGCHI conference on
Human factors in computing systems, pp. 281 – 288.
[26] Stiefelhagen, R., Yang, J. (1997). Gaze Tracking for
Multimodal Human-Computer Interaction. In Proceedings of
International Conf. on Acoustics, Speech, and Signal
Processing, April 1997.
[27] Stiefelhagen, R., Yang, J., & Waibel, A. (2002). Modeling
focus of attention for meeting indexing based on multiple
cues. IEEE Transactions on Neural Networks, 13, 928-938.
[28] Vertegaal, R., Slagter, R., van der Veer, G., & Nijholt, A.
(2001). Eye gaze patterns in conversations: There is more to
conversational agents than meets the eyes. Proceedings of
CHI 2001 (pp. 301-308). NY: ACM Press.