Efficacy of Technical Illustrations in a Technical
Communication Environment
Masato Nozawa s1170033 Supervised by Prof. Debopriyo Roy
University of Aizu, Graduation Thesis. March, 2012 s1170033
Abstract
User manuals for physical performance help us
understand how a task is actually performed in a 3-d
space. Literature on spatial information
comprehension is scant on the topic of
identifying factors that lead to spatial comprehension
of physical tasks. The literature on mental imagery
and rotation has been discussed in this context of an
experiment where body rotations, object height and
action combinations have been studied to understand
how mental rotation tasks are performed. The
experiment reported in this thesis focused on
matching body rotation-action-object height
combinations shown from body height with overhead
images. Two types of activities were used: holding a
bat and swinging a bat. Five body rotations from full
front to back views were used with the bat being held
at chest and waist height. Results show that canonical
viewpoints and angles across the display plane are
somewhat preferred, although accuracy with
non-canonical viewpoints and angles into the display
plane was also high. The study thus suggests
that with more practice and time spent, mental
rotation tasks could be better performed.
1 Introduction
Mental imagery is an experience and an important
aspect of our general understanding of how different
objects function in space without direct visualization
[1]. In a complex spatial world, mental imagery can
present some complex cases of comprehension
involving mental rotation. Mental rotation is the
ability to rotate two-dimensional and three-
dimensional objects in space but as an internal
representation of the mind. It is basically about how
the brain moves objects in the physical space in a
manner that helps with spatial understanding and
intelligence (including structural and functional
attributes) of objects in space [2][3][4].
Research in psychology has provided enough
literature demonstrating how people develop and
customize mental models and perform mental
rotation towards performing procedural actions in
space. This is where technical illustrations can
actually help develop guidelines in a way that might
help users perform mental rotations in a predefined or
expected sequence. This leads us to the question of
why we need technical illustrations for
communicating visually complex information.
Technical illustration is the use of illustration to
visually communicate complex information [5]. The
main purpose of any technical illustration is to create
expressive images that have meaning to the observer's
senses. The accuracy of technical illustrations in terms
of dimensions and proportions helps readers with visual
comprehension of the structural and functional
aspects of a given object in space. Naturally, this is
very important for showing body positions.
In this context, one must introduce the concept of
kinesthetic learning. It is a learning style whereby the
performer learns by carrying out a physical activity in
actual physical space, rather than thinking and
conjecturing about a physical action. The theory of
multiple intelligence by Gardner [6] has mentioned
kinesthetic learning. Kinesthetic learners are thought
to be the ones who prefer to physically try out and
perform the action involving their own bodily
experience.
Technical illustrations are designed to act as visual
aids that help replicate physical actions in a way
intended by instructors of the act. Technical
illustrations show actions from the point of view of
performers, especially if performers’ bodies are
required to be positioned a particular way to perform
actions. One can choose to understand complex
information by using technical illustrations as an aid.
2 Review of the Literature
2.1 Overview
The technical communication literature is not very
rich with studies, other than those by Krull et al.
(2001; 2003; 2004), Szlichcinski (1980), Heiser and
Tversky (2002), and a few others, which focus primarily
on comprehension of procedural illustrations and less
on body positions in space [7][8][9][10][11].
Traditionally, studies of mental imagery and rotation
in experimental psychology have addressed this issue
of object positions in space, but with the aim of
comprehending the human ability to perform mental
rotation tasks in space. This review of the literature is
designed to explain two major factors that might help
in comprehending physical actions and performing
mental rotation in a 3-d environment.
● How do perception of depth and body/object-
centered viewpoints influence comprehension of
physical actions in a 3-d space?
● How do motor skills and learning influence the
way we design technical illustrations?
Such understanding will help us comprehend what it
takes to design technical illustrations of physical
actions performed in a 3-d environment. What factors
should technical illustrators consider for designing
user actions in space?
2.2 PERCEPTION OF DEPTH IN 2-D
TECHNICAL ILLUSTRATION
Extensive research by Krull et al. (2004) suggests
that graphics for physical tasks need to take into
account the needs of users who will carry out actions
in a physical environment [9]. That research suggests
that graphics need to show tasks from the users'
viewpoint, and need to make clear how tools are to be
used and the direction in which actions are to be
exerted. The paper provides some sample graphic
design guidelines.
Technical illustrations are useful only when readers
are actually able to use their vision systems when
performing tasks in the three dimensional space [5].
Readers might scan through physical illustrations
showing physical actions and use one type of vision
system primarily for object identification purposes,
while the other type of vision system could be used
for orienting their bodies in space [8]. However,
people are often able to comprehend well the distance
between objects or body parts across the display
plane where the space between objects is visible.
Conversely, people find it relatively difficult to
judge object positions when distances must be
judged into the display plane [8].
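The contrast between distances across and into the display plane can be made concrete with a little geometry. The sketch below is illustrative only and is not taken from the cited studies: it assumes an orthographic (parallel) camera and a straight segment, such as a bat held horizontally, that lies across the display plane at 0 degrees and is rotated about the vertical axis. The function name `projected_fraction` is hypothetical.

```python
import math

def projected_fraction(rotation_deg: float) -> float:
    """Fraction of a segment's true length still visible across the
    display plane after rotating it about the vertical axis.

    Assumes an orthographic projection; perspective effects are
    ignored in this sketch.
    """
    return abs(math.cos(math.radians(rotation_deg)))

# Illustrative angles: 0 (fully across the plane) to 90 (straight into it).
for angle in (0, 30, 60, 90):
    print(f"{angle:3d} deg -> {projected_fraction(angle):.2f} of true length visible")
```

Under these assumptions, a segment pointing straight along the line of sight shows none of its length in the picture, which is consistent with the difficulty of judging distances into the display plane.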
The problem is that while showing different variants
of body positions and physical actions as in sports,
people often perceive positions, objects, movements,
and forces along the line of sight into the display
plane, thereby obscuring the vision necessary to
comprehend or copy the action, exactly as it should
be executed.
Research suggests that monocular vision dominates
binocular vision in experiencing depth from 2D
pictures, and speculates that binocular vision did not
develop as a separate visual system but as an add-on
to monocular vision [12]. Hochberg (1978) suggested
that readers of 2-D illustrations in print or electronic
media have only monocular vision to help them
interpret what they see [13]. Krull concluded that
having only monocular cues reduces depth perception
for 2-D illustrations, thereby making interpretation
more difficult, and that this makes the choice of an
illustration's perspective (body-centered vs.
object-centered) a central consideration. Depth
perception arises from a variety of depth cues. These
are typically classified into binocular cues, which
require input from both eyes, and monocular cues,
which require input from just one eye [14].
So, an optimal illustration should always
help readers see as many viewpoints as are available
in the scene, and show objects in such a way that
almost no part of the body involved in the task is
obscured to an extent that hampers understanding and
execution of the task. This is what we call an
object-centered view, where objects are placed across the
display plane.
2.3 USER-CENTERED VS.
OBJECT-CENTERED
PERSPECTIVE
An illustration with an object-centered point of view
positions objects across a display plane. This
viewpoint, which could also be called a spectator’s
view, allows objects to be placed so as to direct
viewers’ attention without obscuring important parts
of objects [6].
When we have to show a man pushing a cart, should
we show the scene where we see the back of the man
pushing the cart? Although this fits the user-centered
perspective, the cart will be obscured from
direct viewing; nor would it be possible to gauge
the hand placement so as to replicate exactly how the
cart is being pushed. However, if a 1/3rd front or a
1/3rd side view is shown, at waist height, it might
be a lot easier to see most of the body and parts of the
cart, including the hand placement of the man
pushing the cart. Research by Heiser and Tversky
(2002) with a furniture assembly task and by
Szlichcinski (1979; 1980) with hand positions
supported the efficacy of partially rotated objects as
compared to objects shown head-on or from the
full back [11][10].
Psychological research [8] concurs that
two-dimensional representations of physical actions
performed in a three-dimensional world are best
shown in canonical views, with objects in a
three-quarter view from slightly below the camera
position.
Although canonical views (slightly rotated viewpoints
that show the maximum number of angles) are generally
preferred, when it comes to replicating tasks, the choice
between a spectator's viewpoint (seeing the action as
an observer and not as a doer) and a user-centered
viewpoint (seeing the action as a doer and not as an
observer) is less clear-cut and more context-driven.
If the question is to judge the distance between the legs
when pushing a heavy cart, then a complete side view
might be the preferred option. However, if we
need to see the grip and arm movements (stretching)
when pushing the cart, side and zoomed-in front
views might both be effective. This is important to
understand because there are individual differences in
the way people prioritize objects in space vis-a-vis
the orientation of their bodies in space, in how they
interpret visual information [15], and in their
performance levels on the task [16].
2.4 UNDERSTANDING MOTOR
SKILLS FOR TECHNICAL
ILLUSTRATIONS DESIGN
While designing illustrations of physical actions in a
user manual, technical illustrators should consider
two important things.
● How are motor learning and performance
developed?
● What are the best possible strategies for drawing
technical illustrations (for different tasks) such that they
help readers understand the physical actions: not
only what needs to be accomplished, but exactly how
it needs to be done?
Skills classified by task: A specific task, based on
specific skills, could be classified in terms of how
well defined the movement is along a discrete, serial,
and continuous continuum.
Skills classified by cognitive elements: While
netting the ball in a basketball game, a player needs
cognitive strategies for deciding on the precise nature
of the jump and the throw (how high to jump and how
far to throw). Perfecting the jump and throw to a
certain level of efficiency reflects fine motor
skill, the strategy behind such efficiency is
cognitive skill, and the combination leads to the
constant adaptation needed to reach a certain level of
efficiency.
Skills Classified by Environmental Factors: With
more environmental conditions and related
unpredictability, the levels of cognitive skills might
have more impact [17]. For example, when playing
baseball, how to swing the bat to hit the ball when the
ball swings in the air due to windy conditions is a
valid consideration.
3 MAJOR RESEARCH QUESTION
AND HYPOTHESES
What might be the optimal viewpoint for
comprehending a two-dimensional illustration
showing physical actions in a three-dimensional
space?
Hypotheses:
● Objects shown from a performer’s point of view
should be easier to understand.
● Illustrations showing more angles across the
display plane might be easier to understand.
● Levels of comprehension based on a two-
dimensional illustration should differ based on
whether the objects are shown at or below the camera
position.
The purpose of the experiment designed for the
reported study is not to measure motor skills and
performance, but to identify them as factors influencing
performance and learning, and most importantly to
explore the extent to which readers are able to
comprehend illustrations when they are presented in
print media from different perspectives and depth
perceptions.
Sample and Context: Forty-one students who are
non-native speakers of English (native Japanese
speakers) participated in this study.
Procedure
The experiment aims to understand how ordinary
people understand images and relate them to images
shown from different perspectives and camera
positions. We asked test subjects to evaluate body
images via matching tasks and asked them to rate
their confidence in their choices.
4 Method
41 subjects took part in the experiment and each
subject rated 40 image types, divided into two blocks
of 20 each. As part of its robust design, the
experiment considered two sets of images. For the
experiment, we generated images of body positions
for two kinds of activities: a man holding a bat and a man
hitting with a bat. The purpose for using two different
types of objects relates to the exploration of whether
object types influence how decisions about depth
perceptions and display planes and viewpoints
(object-centered or performer-centered) are made.
This paper only discusses the results generated from
the image set related to the man with the bat. The
other set (man with ball) has been discussed as part of
another paper.
Each participant was handed two different sets as
part of an in-class graded assignment, with each set
having 20 test sheets. Each participant was first
handed an instruction sheet in Japanese, and was
then told orally in Japanese what was
expected of them in the experiment.
The volunteers explained to them the purpose of
the experiment, what it aims to achieve and how each
participant should approach the test. At that point, the
participants were allowed to ask questions related to
the experiment, and voice any question or concern.
The volunteers were also available throughout the
experiment to answer queries related to the
experiment. There was no time limit set for the
participants to complete the experiment, but they
were expected to complete their responses within 90
minutes. However, they were allowed to retain the
answer sheet with them until the next class meeting
exactly a week later. There were two reasons why
there was no time limit maintained.
(1) Students were allowed to think and re-think
about illustrations and were allowed to change their
responses if they wanted to.
(2) Students had to complete a series of questions
related to the experiment in Moodle as a graded
assignment, and retaining the test sheets and
referring back to those while answering the
questions in Moodle were naturally thought of as
more enriching.
In each test sheet, participants were asked to circle
the correct choice. The three options were
presented as Picture A, B and C. Participants also went
on to report their second-best choice for each test sheet
and their level of confidence for each response.
Instruments:
Using a computer program called POSER Figure
Artist, which sustains accurate three-dimensional
relationships among body parts, the experimenter
produced variations of viewpoints and body positions.
Each position included two heights for each activity:
chest and waist. The man-holding-bat version shows a
man holding a bat centered in front of the chest or waist
with the hands gripping the bat from both sides. The
man-hitting-with-bat version shows hitting with the bat
at chest or waist height. These action gestures were
captured for five positions where the body moves with
the camera position remaining constant: Front - 0
degrees (the man holding or hitting with the bat and
facing the camera head on), 1/3 Side - 30 degrees, Side - 90
degrees, 1/3 Back - 120 degrees, Back - 180 degrees.
For all these images, the camera was positioned
slightly above the waist height.
Each set had five images and there were 4 sets in total.
Every set was rotated in five angles as mentioned
before. The first set showed a man holding a bat at
chest height; the second set showed hitting at chest
height. Two other sets showed a man holding and
hitting with a bat at waist height respectively.
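The four sets of five images described above amount to a full crossing of action, height and rotation. As a minimal sketch of that design (the names `ACTIONS`, `HEIGHTS`, `ROTATIONS` and `build_stimuli` are hypothetical and are not part of the original materials), the 20 combinations can be enumerated as follows:

```python
from itertools import product

ACTIONS = ("holding", "hitting")
HEIGHTS = ("chest", "waist")
# Rotation labels and angles as listed in the text.
ROTATIONS = (("front", 0), ("1/3 side", 30), ("side", 90),
             ("1/3 back", 120), ("back", 180))

def build_stimuli():
    """Return one dict per action-height-rotation combination."""
    return [
        {"action": a, "height": h, "rotation": name, "degrees": deg}
        for a, h, (name, deg) in product(ACTIONS, HEIGHTS, ROTATIONS)
    ]

stimuli = build_stimuli()
print(len(stimuli))  # 2 actions x 2 heights x 5 rotations
```

Each entry corresponds to one waist-height image for which a matching top view was later generated.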
Once these images were generated, the camera was
then positioned to capture images from the top for the
above-mentioned sets. A matching top image was
generated for each image in the sets
above, with a displacement along the y- and z-axes to
position the camera exactly on top of the head. Each of
the images generated for the 4 sets was tested to see
whether readers could identify the same pose when shown
from the top. Each test sheet had an image from the
above sets, with three top views, of which only one
correctly represented the view shown from
slightly over the waist height. Each test sheet had three
questions, and questions 1 and 3 were answered on a
Likert scale.
1 Identify the most appropriate picture shown from the
top that matches the picture shown from the waist
height. (Three options provided).
2 Which illustration shown from the top stands the
second best?
3 How confident are you about your response?
Findings:
A comprehensive review of the data allows us to see
that there are some significant differences among the 20
different body position-height-action combinations
that were used for this analysis. Subsequent analysis
revealed whether the differences in mean
accuracy between body positions, as
discussed in the following paragraphs, are statistically
significant.
Data show the highest mean values for man
holding bat at chest height for 1/3rd side rotation at .93
(meaning 93% of the participants completed the
matching task accurately), and for holding waist 1/3rd back,
holding waist back and front at over .90 mean score.
Only one score from the hitting category, hitting waist
back rotation, is marked at .93. Interestingly, almost all
the highest levels of accuracy for any given matching
task are recorded for holding bat positions; hitting
positions for any angle (except hitting waist back
rotation) have lower levels of accuracy.
Further, data show that all the highest frequencies
are recorded for 1/3rd side, back, and 1/3rd back positions.
The lowest mean accuracy scores were recorded for
hitting chest 1/3rd back positions at .66 and hitting waist
1/3rd side positions at .68, with other hitting positions
also recording lower mean accuracy scores. Where
accuracy scores are over 80%, frequency data show over 30
individuals performing the matching task correctly.
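The link between the mean accuracy scores and the frequency counts is simple arithmetic. The sketch below assumes that all 41 participants answered every test sheet, which the text does not state explicitly; the helper name `implied_correct` is hypothetical.

```python
N_PARTICIPANTS = 41  # sample size reported in the text

def implied_correct(mean_accuracy: float, n: int = N_PARTICIPANTS) -> int:
    """Nearest whole number of correct responses for a reported mean."""
    return round(mean_accuracy * n)

for label, mean in (("holding chest 1/3 side", 0.93),
                    ("hitting waist 1/3 side", 0.68),
                    ("hitting chest 1/3 back", 0.66)):
    k = implied_correct(mean)
    print(f"{label}: about {k} of {N_PARTICIPANTS} correct ({k / N_PARTICIPANTS:.2f})")
```

Under that assumption, an accuracy above 80% corresponds to more than 32 of the 41 participants, in line with the "over 30 individuals" noted above.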
I then performed a non-parametric Cochran’s Q test
for binary data (0 = inaccurate; 1 = accurate) for the 5
angular rotations at “holding bat at chest height”.
Results show that with Cochran’s Q value at 4.182 and
p = .382 > .05, there is no significant difference
between the different matching tasks at holding chest
height for 5 angular rotations.
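For readers unfamiliar with the test, the following is a minimal sketch of the standard Cochran's Q statistic for matched binary scores, together with a chi-square tail probability. It is not the software used in the study, which is not named; the function names and the toy data are hypothetical. The closed-form tail probability used here is valid only for an even number of degrees of freedom, which covers the five-rotation groups (df = 4).

```python
import math

def cochran_q(rows):
    """Cochran's Q for a list of per-subject lists of 0/1 scores.

    Each inner list holds one subject's scores on the k matched tasks.
    Returns (Q, degrees_of_freedom).
    """
    k = len(rows[0])
    col_totals = [sum(r[j] for r in rows) for j in range(k)]
    row_totals = [sum(r) for r in rows]
    n = sum(row_totals)
    denom = k * n - sum(t * t for t in row_totals)
    if denom == 0:
        raise ValueError("Q is undefined when every subject scores all 0s or all 1s")
    q = (k - 1) * (k * sum(c * c for c in col_totals) - n * n) / denom
    return q, k - 1

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

# Made-up scores for 5 subjects on 3 tasks (not data from the study).
toy = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [1, 0, 0], [0, 1, 0]]
q, df = cochran_q(toy)
print(q, df, chi2_sf_even_df(q, df))

# Reported statistic for the holding-chest group: Q = 4.182 with df = 5 - 1 = 4.
print(round(chi2_sf_even_df(4.182, 4), 3))
```

With Q = 4.182 and 4 degrees of freedom, the tail probability rounds to .382, which agrees with the p value reported for the holding-chest group.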
An overall Cochran’s Q test for all the 20
combinations of data (5 angular rotations, height and
bat action) shows a value of 17.968, with Asymp. Sig
= .525.
For hitting at chest position for 5 angular rotations,
data shows a value of 6.552 with Asymp. Sig = .162.
Although data shows statistically insignificant
difference between the 5 matching tasks in the group,
it certainly shows more diverse data when compared to
the “holding chest” group.
Data show that for the “holding bat at the waist height”
combinations for the 5 angular rotations there is
no significant difference between the mean accuracy
scores, with a Cochran’s Q value of 2.615 and p = .624.
However, data do
indicate that accuracy is less varied
for the “holding-waist” group than it is for the “hitting-chest”
group.
Data further show that there is a statistically
significant difference in accuracy scores between the 5
angular body positions for the 5 “hitting-waist”
combinations. With a Cochran’s Q value of 14.122
and p = .007 < .05, we see that the angular
rotations for hitting waist positions did not produce the
same levels of accuracy. This evidence shows
that, when compared to all other groups, the data are
more varied between these 5 matching tasks.
Comparative accuracy scores for the four front
positions across the chest and waist heights and
actions (holding or hitting) show mean
accuracy scores between 78 ~ 90%. The comparative
accuracy scores for the 1/3rd side positions
across the chest and waist heights and actions
(holding or hitting) show mean
accuracy scores between 68 ~ 93%. Results suggest
that hitting waist 1/3rd side, with 68% accuracy, was
much lower than any other position combination
discussed so far. The other hitting position at chest
height had much higher accuracy at 83%. But overall, it
looks like the holding positions were relatively easier
to complete. Results suggest that hitting positions at
side angles have relatively lower levels of accuracy,
around 76 ~ 78%, but the holding positions (chest and
waist) have higher accuracy scores at 85%.
For the confidence self-reports on the 5 angular
rotations for the holding chest positions, we see a
variation in self-confidence levels on a 1 ~ 5 scale
between 3.50 ~ 3.98. Interestingly, front and 1/3rd side
positions clearly show higher levels of comfort and
confidence.
For hitting chest positions we clearly see
a lowered confidence level, around 3.5, for all
the given angular rotations. Although the confidence
levels are lower than for the holding chest
positions, within this hitting-chest group we still see
quite a difference in confidence levels between the front
position at 3.43 and the 1/3rd back position at 3.63.
Surprisingly, we see higher confidence levels for 1/3rd
back positions, whereas for the back position the
confidence is considerably lower.
These data are not conclusive or indicative of a clear
pattern, but there are some indications that
canonical viewpoints show a strong correlation
between actual accuracy scores and confidence.
4 Discussions
In the review of the literature, we had a section on
how motor skills and related performance develop
for physical tasks. We wanted readers to be conscious
of the fact that the types of actions shown (discrete, serial
or continuous), cognitive information processing by
the actor, the link between motor skills and cognitive
elements, and environmental factors (e.g., wind
while bowling in a game of cricket, or ground slope
when playing golf) technically and practically have a
bearing on how the physical task is completed.
However, in this study we could not consider these to
be factors that influence understanding of
illustrations. Rather, we wanted readers to know that
they become factors when readers try to emulate the
action based on their
comprehension of the illustration. Comprehension of
how the task is to be completed and actual
implementation of the task are different factors and
readers should be aware of the fact that actual
implementation needs more calculation and judgment
based on specific context of action which probably
can’t always be designed as part of technical
illustrations. Motor learning is based on motor
performance and is different from learning about an
action from a technical illustration with visible
viewpoints. Technical illustrations will work for
initial comprehension of action patterns, but beyond
that, motor learning and performance should
complement each other.
5 Conclusions
This study is aimed at carrying forward the studies
performed by Krull et al. (2004) [9]. As compared to
previous studies by Krull et al., this study aimed at
including more variations in actions and body heights
and making those positional features more explicit
and detailed. Further, with this study the aim was to
include a serious group of participants who actually
participated in this exercise for a grade. Future
studies should continue to include more variations in
body height and action types, with more details and
objects in and across the line of sight. This study does
allow us to see the importance of different variables
and how they influence performance. More testing is
needed before we could definitely reach a conclusion
about the preferences that readers might have for
visualization purposes. Finally, besides testing with
different variations on body height – action
combinations, future testing should also make
alterations to the way the current experiment has been
designed to more systematically include more options
for test sheets.
References
[1] Michel-Ange Amorim et al., “Embodied Spatial
Transformations: Body Analogy for the Mental
Rotation of Objects” American Psychological
Association, Vol. 135, no. 3, pp. 327-347, 2006.
[2] Johnson A.M., “The speed of mental rotation as a
function of problem-solving strategies.” Perceptual
and Motor Skills, Vol. 71, no. 3, pp. 803-806,
Dec. 1990.
[3] Jones B et al., “Effects of sex, handedness,
stimulus and visual field on “mental rotation”.”
Cortex, Vol. 18, no. 4, pp. 501-514, Dec. 1982.
[4] Hertzog C., “Age differences in components of
mental-rotation task performance.” Bulletin of the
Psychonomic Society, Vol. 29, no. 3, pp. 209-212,
May. 1991
[5] Viola I., Kanitsar A., Groller M.E., “Importance-
driven feature enhancement in volume visualization”
IEEE, Vol.11, No.4, pp.408-418, July-Aug, 2005.
[6] H. Gardner, Frames of Mind: The Theory of
Multiple Intelligence New York: Basic Books, 1983
[7] Krull R., “Writing for Bodies in Space”
Proceedings of the IEEE Professional
Communication Society, September, 2001.
[8] Robert Krull, Debopriyo Roy, Shreyas D’Souza,
Marilyn Morgan, “User Perceptions and Point of
View in Technical Illustrations”, STC Proceedings,
2003.
[9] Robert Krull, Shereyas J. D'souza, Debopriyo Roy,
AND D. Michael Sharp, “Designing Procedural
Illustrations” IEEE TRANSACTIONS ON
PROFESSIONAL COMMUNICATION, VOL. 47,
NO. 1, MARCH 2004
[10] Szlichcinski, “The syntax of pictorial
instructions” In P.A. Kolers, M.E. Wrolstad, and H.
Bouma (Eds.) Processing of Visible Language, Vol. 2,
pp. 113-124. 1980
[11] Heiser J. and B. Tversky, “Diagrams and
Descriptions in Acquiring Complex Systems.”
Proceedings of the 24th Annual Meeting of the
Cognitive Science Society, Fairfax, VA, August, 2002.
[12] C.J. Erkelens “Interaction of monocular and
binocular vision” Perception 39 ECVP Abstract
Supplement. 2010.
[13] Kenneth J. Hochberg, “A SIGNED MEASURE
ON PATH SPACE RELATED TO WIENER
MEASURE” The Annals of Probability, Vol. 6, No. 3,
Jun 1978.
[14] H. Goldstein, “Communication Intervention for
Children” Journal of Autism Developmental
Disorders, Vol. 32, No. 5, October, 2002.
[15] A. David Milner and Melvyn A. Goodale, “The
visual Brain in Action” Great Clarendon Street,
Oxford OX2 6DP, 1995.
[16] Zacks et al., “Mental Spatial Transformations of
Objects and Perspective.” Spatial Cognition &
Computation, pp. 315-322, 2002.
[17] Schmidt Richard, & Wrisberg Craig, “Motor
Learning and Performance” Human Kinetics
Publishers, United States, 2008.