Complexities of Practical Web Automation

Yury Puzis, Yevgen Borodin, I.V. Ramakrishnan
Stony Brook University
2015
NSF Grant No. IIS-1218570

Contents
Goal: help design practical web automation tools by sharing
observational experience
❖ Human-Computer Interaction Perspective
❖ Technical Perspective
❖ Example: Automation Assistant
❖ Conclusion

Why Web Automation?
❖ Problem: non-visual browsing is hard
❖ It is hard (or impossible) to find relevant information
and easy to become overwhelmed by what is irrelevant
❖ There are many shortcuts (gestures) to learn, and non-trivial
tasks are hard (or impossible) to accomplish
❖ Web automation has the potential to enable visually
impaired users to breeze through Web browsing tasks that
were previously slow, hard, or even impossible to achieve

Observation
[Diagram, observation: User, Environment, Web Automation Tool; flows: Browsing Actions, Events]
[Diagram, automation: User, Environment, Web Automation Tool; flows: Events, Automation Instructions]

Maximizing Trust
❖ Gaining and maintaining user trust is the cornerstone of
web automation: even a few disasters are a big problem
❖ The user needs to know and influence what will happen
(review, parameterize, choose) and what has happened
(review, revert, recover) at all times
❖ Failure is inevitable and has to be graceful: terminate
automation, ignore the failed action, take corrective action,
or suggest that the user take corrective action
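
The four graceful-failure options above can be sketched as a per-step policy. This is only an illustration under stated assumptions: the names `OnFailure`, `Step` and `run` are hypothetical and do not come from any tool described in this talk. The returned log stands in for the "what has happened" record that the user can review.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class OnFailure(Enum):
    TERMINATE = "terminate"  # stop the automation
    IGNORE = "ignore"        # skip the failed action and continue
    CORRECT = "correct"      # try a corrective (fallback) action
    SUGGEST = "suggest"      # stop and ask the user to correct


@dataclass
class Step:
    name: str
    action: Callable[[], bool]  # returns True on success
    on_failure: OnFailure = OnFailure.TERMINATE
    fallback: Optional[Callable[[], bool]] = None  # used only with CORRECT


def run(steps):
    """Execute steps in order and return a reviewable log of outcomes."""
    log = []
    for step in steps:
        if step.action():
            log.append((step.name, "ok"))
            continue
        if step.on_failure is OnFailure.IGNORE:
            log.append((step.name, "failed, ignored"))
        elif (step.on_failure is OnFailure.CORRECT
              and step.fallback is not None and step.fallback()):
            log.append((step.name, "failed, corrected"))
        elif step.on_failure is OnFailure.SUGGEST:
            log.append((step.name, "failed, user asked to correct"))
            break
        else:
            # TERMINATE, or a corrective action that itself failed
            log.append((step.name, "failed, terminated"))
            break
    return log
```

Every outcome, including the failures, ends up in the log, so nothing happens silently.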

Minimizing End-To-End Cost
❖ Cognitive load and operation time must be end-to-end
lower when using automation than otherwise
❖ Web automation costs: managing creation, execution and
consequences of automation; context switching
❖ Screen-reader and browser costs: many and well known,
including the need to plan complex sequences of actions from
memory or to execute exhaustive search and guess, guess,
guess
❖ Minimizing cost is in conflict with the need to maximize trust

Dealing with Uncertainty
Goal: automate user intent without resorting to handcrafting
scripts (programming), and interpret the environment's reaction
Problem: we can only guess the
❖ Semantics of user browsing actions
❖ Semantics of environment events
❖ Semantics of webpage elements

Making Observations
❖ Goal: make meaningful observations from events
❖ Problem: browsing actions can trigger multiple
(including cascading) events, and there are different
types of events: e.g., shortcut press -> JavaScript call ->
DOM mutation -> virtual cursor movement
❖ Problem: over time, an event may change its semantics
(same event - different results) or implementation
(different event - same results)
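
One way to turn a cascade such as the one above into a single observation is to attach every follow-up event to the most recent user-initiated event inside a short time window. The sketch below is an assumption-laden illustration, not the approach of any particular tool: the event kinds, the `USER_KINDS` set and the 0.5-second window are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    kind: str         # e.g. "keypress", "js_call", "dom_mutation", "cursor_move"
    timestamp: float  # seconds


@dataclass
class Observation:
    trigger: Event
    consequences: list = field(default_factory=list)


# Assumed set of event kinds that are initiated directly by the user.
USER_KINDS = {"keypress", "click"}


def group_events(events, window=0.5):
    """Group cascading events under the user event that most likely caused them.

    Returns (observations, orphans); orphans are events with no user event
    within `window` seconds before them (e.g. scheduled scripts).
    """
    observations, orphans = [], []
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.kind in USER_KINDS:
            observations.append(Observation(trigger=ev))
        elif (observations
              and ev.timestamp - observations[-1].trigger.timestamp <= window):
            observations[-1].consequences.append(ev)
        else:
            orphans.append(ev)
    return observations, orphans
```

A fixed window is itself a guess: a slow page can push a real consequence past the window, and an unrelated scheduled script can fall inside it.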

Addressing Webpage Elements
❖ Goal: identify target webpage element
❖ Problem: most addressing approaches are designed to
query the DOM for elements at the specified address, but we
need to query the DOM for the address of the specified element
❖ Solutions: sloppy programming, machine learning, etc.,
but no unbreakable approaches exist
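
The "reverse" query, going from an element to its address, can be illustrated with a purely positional path. This is a minimal sketch using Python's standard `xml.etree.ElementTree` on a well-formed toy document; the function name `address_of` is hypothetical, and positional paths are only one simple addressing scheme, not one of the solutions listed above.

```python
import xml.etree.ElementTree as ET


def address_of(root, target):
    """Return an ElementTree-style path from `root` to `target`, or None."""
    # ElementTree nodes have no parent pointer, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    if target is not root and target not in parent_of:
        return None  # the element is not part of this document
    steps = []
    node = target
    while node is not root:
        parent = parent_of[node]
        # 1-based position among siblings that share the same tag
        same_tag = [c for c in parent if c.tag == node.tag]
        steps.append(f"{node.tag}[{same_tag.index(node) + 1}]")
        node = parent
    return "./" + "/".join(reversed(steps)) if steps else "."
```

For example, the third link in `<html><body><div><a/></div><div><a/><a/></div></body></html>` gets the address `./body[1]/div[2]/a[2]`, and `root.find(address)` returns the same element. Such an address breaks as soon as the page inserts a sibling before the target.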

Detecting Action Completion
❖ Goal: wait for action to complete (succeed or fail) before
continuing to interact with the user & the environment
❖ Problem: no standard way to specify action completion;
cascading, asynchronous and scheduled JavaScript
events make things harder
❖ Solutions: listen to all relevant JavaScript events
through callback functions; timeout; wait for predefined
DOM mutations / value changes (success or failure)
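
The "wait for a predefined change, with a timeout" idea can be sketched as a polling loop. This is an illustration under assumptions: `succeeded` and `failed` are hypothetical caller-supplied checks (for example, "the confirmation text appeared" and "an error message appeared"), and a real tool would more likely react to callbacks than poll.

```python
import time


def wait_for_completion(succeeded, failed, timeout=5.0, poll=0.05,
                        clock=time.monotonic, sleep=time.sleep):
    """Poll until the action succeeds, fails, or the timeout expires.

    `succeeded` and `failed` are zero-argument callables returning bool.
    Returns "success", "failure" or "timeout".
    """
    deadline = clock() + timeout
    while True:
        if succeeded():
            return "success"
        if failed():
            return "failure"
        if clock() >= deadline:
            return "timeout"
        sleep(poll)
```

A "timeout" result says only that neither predefined change was seen, so the caller still has to decide whether to treat it as a failure.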

Example: Automation Assistant
❖ Observes everything the user is doing (no macros)
❖ Guides the user through browsing tasks step-by-step
❖ suggests several alternative browsing actions based on
user’s prior actions
❖ automates only one action at a time
❖ each set of suggestions is explicitly requested, each
action is explicitly chosen, each outcome is reviewed
❖ No context switch between automation and screen-reading
Puzis Y., Borodin Y., Puzis R., Ramakrishnan I. V.
Predictive Web Automation Assistant for People with Vision Impairments. WWW '13.
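
To make "suggests several alternative browsing actions based on the user's prior actions" concrete, here is a toy suggester that ranks next actions by how often they followed the last action in the observed history. It is not the model from the cited paper; the class name, the action labels and the simple frequency count are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


class NextActionSuggester:
    """Toy frequency model: which action tends to follow which."""

    def __init__(self):
        self._following = defaultdict(Counter)

    def observe(self, history):
        """Record every (previous action, next action) pair in a history."""
        for prev, nxt in zip(history, history[1:]):
            self._following[prev][nxt] += 1

    def suggest(self, last_action, k=3):
        """Return up to k candidate next actions, most frequent first."""
        counts = self._following.get(last_action)
        if not counts:
            return []
        return [action for action, _ in counts.most_common(k)]
```

The suggester only proposes candidates. Choosing one, executing it and reviewing its outcome stay with the user, one action at a time.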

Conclusion
❖ There are some successes, but automation is not there yet
❖ The biggest technical challenge is uncertainty, which stems
from a lack of standardization
❖ The biggest HCI challenges are building trust and keeping
things “cheap”
❖ The HCI aspect of this talk is, to a large extent, applicable
to all automation tools, not just web automation. It is also
applicable to all users, not just visually impaired users
(think handheld, wrist devices)
