2. Abstract
A review of recent publications in the field of aesthetics reveals growing interest in the aesthetics of interaction.
Many previous works state that, besides beauty of appearance, things can also be perceived as
beautiful in use. However, fewer studies address the evaluation of interaction aesthetics (or aesthetics of
use) apart from aesthetics of appearance.
The current study tries to fill this gap by proposing questionnaire items for evaluating interaction aesthetics.
Users' aesthetic perceptions were studied during interactions with touch devices. The repertory grid technique
(RGT) was used to elicit aesthetics-related constructs after users had tried out 9 selected interaction
episodes. As a result, 21 participants elicited 134 personal constructs. These constructs were
then grouped by similarity and named with semantic differentials suitable for use in an aesthetics
evaluation questionnaire.
A further research perspective is to derive sets of modified attributes suitable for different
fields of interaction, e.g. tangible interaction, interactive art, interactions of people with impairments, implicit
interactions, etc.
3. Introduction
Problem: evaluations of aesthetics mostly address the appearance of the interface. There are no tools for
evaluating the aesthetics of interaction apart from the aesthetics of appearance.
Question: Why is an interaction process perceived as aesthetic?
Goal: describe the attributes of interaction aesthetics.
(Mõttus2015)
4. Concerns: diversity of UX
Individual aesthetic perception is diverse. See
figure (Karapanos 2010).
How does diversity affect the results?
Diverse individual experiences make
aesthetics hard to explain for larger groups.
Which factors were addressed in the current study?
● Individual - all data analysed individually
● Product - diverse episodes and apps selected (stimuli)
Time and situation were intentionally kept constant.
5. Stimuli
Requirements
● Touch interactions in popular free mobile applications
○ Why touch devices?
○ Why popular and free?
● Short episodes of interaction with no specific task or goal, e.g. tap to select, swipe to scroll, slide to
navigate
● A set of 9, 12 or 15 experiences that are as diverse as possible
○ Why 9, 12 or 15?
○ How to evaluate diversity?
6. Stimuli
Selection
● Google Play store's top 100 free apps retrieved
● 230 episodes extracted
● Episodes evaluated: 11 interaction attributes (Lenz2013) used to assess diversity of UX
● 50 most suitable episodes extracted
● 50 selected episodes evaluated by a second expert
● Interrater agreement calculated and the 28 episodes with the highest agreement extracted
● Factor analysis for determining the 9 most suitable episodes
● MDS over interaction attributes was used to visualize the diversity of these episodes
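The slides do not say which agreement statistic was used. As a minimal sketch, assuming both experts rated every episode on the same attributes on a numeric scale, per-episode agreement could be scored as the mean absolute difference between the two rating vectors, keeping the episodes with the smallest difference. The function name `most_agreed`, the disagreement measure and the toy data are all hypothetical.

```python
from statistics import mean

def most_agreed(ratings_a, ratings_b, keep):
    """Return the `keep` episode IDs on which two raters disagree least.

    ratings_a / ratings_b: dict mapping episode ID -> list of attribute ratings.
    Disagreement is the mean absolute difference (an assumed measure,
    not necessarily the one used in the study).
    """
    disagreement = {
        ep: mean(abs(a - b) for a, b in zip(ratings_a[ep], ratings_b[ep]))
        for ep in ratings_a
    }
    # Sort by disagreement, then by ID so that ties resolve deterministically.
    ranked = sorted(disagreement, key=lambda ep: (disagreement[ep], ep))
    return ranked[:keep]

# Toy data: 4 episodes, 3 attributes each (the study had 50 episodes, 11 attributes).
expert_1 = {1: [1, 2, 3], 2: [5, 5, 5], 3: [2, 2, 2], 4: [7, 1, 4]}
expert_2 = {1: [1, 2, 4], 2: [1, 1, 1], 3: [2, 2, 2], 4: [6, 2, 4]}
selected = most_agreed(expert_1, expert_2, keep=2)
print(selected)  # [3, 1]
```

With the real data, `keep=28` would correspond to the step that extracts the 28 most agreed episodes.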
7. Factor analysis
The 11 interaction attributes were reduced to 5 factors (PCA).
Episodes were considered diverse if they contribute strongly to
one factor but do not load much on the others.
Both maximum positive and maximum negative scores
were considered.
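The selection idea can be sketched as follows, assuming an episodes-by-attributes rating matrix and PCA computed with NumPy's SVD. The random data, the function name `extreme_episodes` and the standardisation step are assumptions. The sketch only picks the largest positive and negative score per component and does not implement the additional check that an episode loads little on the other factors.

```python
import numpy as np

def extreme_episodes(X, n_factors):
    """For each principal component, find the rows with the largest
    positive and the largest negative score.

    X: array of shape (n_episodes, n_attributes).
    Returns (scores, picks) where picks holds (factor, row_max, row_min).
    """
    # Standardise each attribute so PCA reflects the correlation structure.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal axes; U * S gives the component scores.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = (U * S)[:, :n_factors]
    picks = [
        (f, int(scores[:, f].argmax()), int(scores[:, f].argmin()))
        for f in range(n_factors)
    ]
    return scores, picks

rng = np.random.default_rng(0)
# Hypothetical data: 28 episodes rated on 11 attributes on a 7-point scale.
X = rng.integers(1, 8, size=(28, 11)).astype(float)
scores, picks = extreme_episodes(X, n_factors=5)
for factor, pos, neg in picks:
    print(f"factor {factor}: max positive = row {pos}, max negative = row {neg}")
```

The sign of each component is arbitrary, which is one reason to look at both ends of every factor rather than only the positive one.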
8. MDS results
The final selection of 9 episodes
is marked in yellow.
The visualization shows
diversity as the distance
between episodes in a
two-dimensional projection.
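The slides do not state which MDS variant was used. As an illustration only, classical (Torgerson) MDS can be sketched with NumPy: squared Euclidean distances between episodes are double-centred and the two leading eigenvectors give the 2D coordinates. The data and the function name `classical_mds` are hypothetical.

```python
import numpy as np

def classical_mds(X, n_dims=2):
    """Project the rows of X to n_dims so that Euclidean distances
    between rows are approximated (classical / Torgerson MDS)."""
    # Squared Euclidean distances between all pairs of rows.
    diff = X[:, None, :] - X[None, :, :]
    D2 = (diff ** 2).sum(axis=2)
    n = D2.shape[0]
    # Double centring turns squared distances into an inner-product matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D2 @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    # eigh returns eigenvalues in ascending order; keep the largest n_dims.
    order = np.argsort(eigvals)[::-1][:n_dims]
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))

rng = np.random.default_rng(1)
# Hypothetical data: 9 episodes described by 11 interaction attributes.
X = rng.integers(1, 8, size=(9, 11)).astype(float)
coords = classical_mds(X)
print(coords.round(2))  # one (x, y) point per episode
```

Episodes that lie far apart in `coords` differ most on the attributes, which is the sense in which distance in the plot stands for diversity.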
9. RGT study
Stimuli and devices
3 Sliding left or right anywhere on the screen to select a menu item
5 Using slide-gesture anywhere on screen for viewing a vehicle in 3D
12 Course of play. Using tap gesture on right or left edge of screen to steer a car accordingly
14 Tap on buttons at lower edge of the screen to select menu items
20 Tap on buttons to select the items in settings menu
27 Turn the device to change screen orientation
42 Slide down and release anywhere on screen to refresh the list
43 Swipe up or down anywhere on screen to scroll the list
48 Slide right or left on playback timeline to rewind or forward the track
10. Participants
Considerations
● Aesthetic experts
○ Art background
○ Design background
○ Psychology background
○ UX background
● Use of lay people
A pilot study was conducted. Findings:
● Use of lay people is fine
● IT developers showed poor results and tended to be pragmatically oriented
● The ability to explain aesthetic perception is individual rather than related to expert background
● Genre of interaction (e.g. game type) influences aesthetic appraisal
● Participants with art and psychology backgrounds showed better results
11. RGT study
Procedure
1. Each participant was invited individually and asked to focus on the aesthetics of the interaction process
rather than on appearance, and to follow emotions rather than analyse.
2. Collecting demographic data: age, nationality, gender, residence, familiarity with touch devices,
native OS, expertise or profession (e.g. art, psychology, etc.)
3. All episodes were tried out thoroughly and an aesthetic appraisal of each was collected on a 7-point Likert scale.
4. Episodes were combined into groups of three. Cards representing the episodes were used for eliciting
personal constructs.
a. Question 1: which one of the three episodes differs aesthetically from the other two?
b. Question 2: why is the selected episode different from the other two?
5. Evaluation of all episodes on the newly elicited construct on a 7-point Likert scale.
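How the groups of three were composed is not described. A minimal sketch, assuming triads are drawn at random without repetition from the 9 episode IDs listed on the stimuli slide; the function name `draw_triads`, the number of triads and the seed are hypothetical.

```python
import random
from itertools import combinations

EPISODES = [3, 5, 12, 14, 20, 27, 42, 43, 48]  # episode IDs from the stimuli list

def draw_triads(episodes, n_triads, seed=None):
    """Draw n_triads distinct groups of three episodes at random."""
    all_triads = list(combinations(episodes, 3))  # 84 possible triads for 9 episodes
    return random.Random(seed).sample(all_triads, n_triads)

triads = draw_triads(EPISODES, n_triads=6, seed=42)
for triad in triads:
    print("Which of", triad, "differs aesthetically from the other two, and why?")
```

Each answer to the "why" question yields one personal construct, on which all 9 episodes are then rated.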
12. Results
Qualitative:
21 participants (9 male, average age 32.0, ranging from 17 to 52) yielded 134 personal constructs.
Each personal construct was elicited as an adjective, and participants were also asked to name the semantic
opposite of that adjective. Elicitation interviews were audio recorded: 15 of the 21 interviews were
conducted in Estonian and 6 in English.
Quantitative:
● All participants gave an aesthetic rating to each episode.
● The scale of each construct was agreed with the participant, and the episodes were rated on it.
13. Results
Examples of quantitative data
● Individual - table at the top
shows 6 personal constructs,
elicited by one individual and
the values given by the same
individual to all 9 episodes
● Collective - table at the bottom
shows aesthetic values, given
by study participants to all 9
episodes. Bottom row shows
average aesthetic values of
these episodes.
Episode ID       48   43   12   20   27    5   14   42    3
Construct #23     3    4    5    5    6    3    3    7    3
Construct #24     2    3    5    6    6    2    3    3    4
Construct #25     3    4    4    2    1    6    3    2    3
Construct #26     4    7    5    2    1    6    2    2    2
Construct #27     3    3    5    7    6    1    2    3    5
Construct #28     3    5    6    4    6    5    5    7    2

Participant       3    5   12   14   20   27   42   43   48
1                 6    6    5    6    5    4    4    5    7
2                 5    6    4    3    5    6    6    5    6
3                 5    7    4    6    4    6    4    6    6
4                 5    7    6    5    2    5    6    6    7
...             ...  ...  ...  ...  ...  ...  ...  ...  ...
6                 6    6    5    7    4    6    6    7    7
7                 4    6    4    7    4    5    4    7    7
19                5    6    5    3    4    4    5    4    6
20                4    3    5    2    7    6    3    4    6
21                4    5    2    6    4    6    1    6    7
Avg:            5.3  5.8  4.4  4.9  4.7  5.5  4.7  5.4  6.4
14. Results
Qualitative
Constructs were
named as semantic
differentials.
Descriptions were
transcribed from the audio
recordings of the interviews.
1 "smooth" vs jaggy: The construct reportedly relates to user action, e.g. finger movement: sliding movement (episodes 5 and 14) was
perceived as smooth, while tapping (20) felt unsmooth or jaggy.
2 "precise" vs imprecise: Precision in coordination means accuracy of reaction and absence of ambiguity of actions. Precision in time
also means absence of delay. Pushing the buttons (20) was considered precise. Sliding the control (48) and
being able to stop wherever wanted was perceived as precise. However, scrolling the list (43) with a swipe gesture
was considered less precise due to inertia.
3 "fast" vs slow: Slow means delay in reaction, but the interaction also feels slow when it seems normal to act slowly in this
situation. There were no big differences in the tested triad, but pushing buttons (20) and flipping pages (3) were
perceived as slightly faster than sliding the control (48).
4 "playful" vs sedate: Playfulness is explained as the possibility/opportunity/freedom to manipulate things. It adds a joyful dimension. It is
also connected to the fun of tinkering (näppida) and playing around without purpose or goal. Selecting settings via push
buttons (20) is perceived as less playful (there is an obvious goal and no particular freedom to play around). The
episodes with a swipe gesture (3, 43) were perceived as more playful, even though episode #3 has an obvious goal in
sight, because it also offers the freedom to complete the action anywhere on the screen.
5 "boring" vs exciting: Exciting means creativeness of the design. Boring relates to classical and well-known solutions. Episode #48
was considered very exciting, and #14 and #27 extremely boring.
15. Analysis
Qualitative:
1. Describing constructs according to interview audio
2. Expert sorting of constructs, done individually by 2 independent experts
a. A letter explaining the procedure and the format of results
b. Preparation of cards for sorting
c. As a result, 2 sets of groupings were proposed (one set also included a higher-level grouping)
3. Analysing the expert sortings, detecting inter-rater discrepancies
4. Reaching consensus between the experts via bilateral agreements
a. Naming the aesthetic attributes
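One way to detect discrepancies between two card sorts is to compare them pair by pair: a pair of constructs is disputed when one expert put the two cards in the same group and the other did not. The sketch below assumes each sort is stored as a mapping from construct ID to group label. It reports the share of consistently treated pairs (the Rand index) and the disputed pairs that would need discussion. The data and the function name `pair_discrepancies` are hypothetical, and the study may have compared the sorts differently.

```python
from itertools import combinations

def pair_discrepancies(sort_a, sort_b):
    """Compare two card sorts given as dicts: construct ID -> group label.

    Returns (agreement, disputed): agreement is the share of construct pairs
    both experts treat alike (together or apart), and disputed lists the
    pairs grouped together by only one of the experts.
    """
    pairs = list(combinations(sorted(sort_a), 2))
    disputed = []
    for x, y in pairs:
        together_a = sort_a[x] == sort_a[y]
        together_b = sort_b[x] == sort_b[y]
        if together_a != together_b:
            disputed.append((x, y))
    agreement = 1 - len(disputed) / len(pairs)
    return agreement, disputed

# Hypothetical sorts of five constructs; group labels need not match between experts.
expert_1 = {1: "flow", 2: "flow", 3: "speed", 4: "speed", 5: "fun"}
expert_2 = {1: "A", 2: "A", 3: "B", 4: "C", 5: "C"}
agreement, disputed = pair_discrepancies(expert_1, expert_2)
print(round(agreement, 2), disputed)  # 0.8 [(3, 4), (4, 5)]
```

Because only co-membership is compared, the two experts do not need to use the same group names or the same number of groups.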
17. Analysis
Quantitative:
● MDS - mapping of episodes according to newly elicited constructs
○ Individual basis
○ Overall basis
● Factor analysis (PCA)
○ Calculating loadings for proposed attributes
○ Further groupings for detecting possible higher level factors
○ Biplot of
● Aesthetic correlates, relevance of proposed aesthetic attributes according to
RGT data
○ Between proposed attributes and aesthetic appraisal
○ Between possible factors and aesthetic appraisal
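As a small illustration of an aesthetic correlate, the sketch below takes the six personal constructs and the average aesthetic values shown in the results tables, aligns them by episode ID (the two tables list episodes in different orders) and computes Pearson correlations. Correlating one individual's constructs with the group average is an assumption made here for illustration, since the tables do not identify which participant produced the constructs. With only 9 ordinal data points the coefficients are rough indications at best.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Individual grid: episode order as in the top results table.
GRID_ORDER = [48, 43, 12, 20, 27, 5, 14, 42, 3]
constructs = {
    23: [3, 4, 5, 5, 6, 3, 3, 7, 3],
    24: [2, 3, 5, 6, 6, 2, 3, 3, 4],
    25: [3, 4, 4, 2, 1, 6, 3, 2, 3],
    26: [4, 7, 5, 2, 1, 6, 2, 2, 2],
    27: [3, 3, 5, 7, 6, 1, 2, 3, 5],
    28: [3, 5, 6, 4, 6, 5, 5, 7, 2],
}

# Average aesthetic values: episode order as in the bottom results table.
AVG_ORDER = [3, 5, 12, 14, 20, 27, 42, 43, 48]
AVG_VALUES = [5.3, 5.8, 4.4, 4.9, 4.7, 5.5, 4.7, 5.4, 6.4]

# Re-order the averages so that both series refer to the same episodes.
avg_by_episode = dict(zip(AVG_ORDER, AVG_VALUES))
aligned_avg = [avg_by_episode[ep] for ep in GRID_ORDER]

correlations = {cid: pearson(r, aligned_avg) for cid, r in constructs.items()}
for cid, r in correlations.items():
    print(f"construct #{cid}: r = {r:.2f}")
```

The same function could be applied per participant, correlating each construct with that participant's own aesthetic ratings, once the full grids are available.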
19. MDS overall
Average aesthetic value in red
Qualitative analysis suggests meanings
of dimensions:
● Left-right, e.g.
○ Smooth-unsmooth
○ Slow-fast
○ Delayed-immediate
● Up-down
○ Logical-illogical
○ Predictable-unpredictable
○ Natural-unnatural
More beautiful episodes tend to be in the
upper left, while less beautiful ones are in the
lower right.
20. Continuation of study
Questionnaire for evaluating the aesthetics of interaction.
● Item generation is complete: the interaction attributes are formulated as semantic
differentials.
● Validation of scales: an online study for validating the items is being planned
○ Experimental study
○ Use of mobile (touch screen) interactions
○ Specially prepared stimuli (A beautiful and B ugly interaction episodes)
○ Participants > 300
● Refining the scales
○ Cleaning items
○ Regrouping if necessary
21. Perspective
Widen the scope of the questionnaire's application to tangible interactions (industrial
design) and interactions with desktop computers.