Lecture 5 in the COMP 4010 class on Augmented and Virtual Reality. This lecture was about AR Interaction and Prototyping methods. Taught by Mark Billinghurst on August 24th 2021 at the University of South Australia.
4. Common AR Optical Tracking Types
• Marker Tracking
• Tracking known artificial markers/images
• e.g. ARToolKit square markers
• Markerless Tracking
• Tracking from known features in real world
• e.g. Vuforia image tracking
• Unprepared Tracking
• Tracking in unknown environment
• e.g. SLAM tracking
5. Marker Tracking
• Available for more than 20 years
• Several open-source solutions exist
• ARToolKit, ARTag, ATK+, etc
• Fairly simple to implement
• Standard computer vision methods
• A rectangle provides 4 corner points
• Enough for pose estimation!
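The reason four corners suffice: they determine the plane-to-image homography, which factors into the camera pose. A minimal numpy sketch (illustrative, not ARToolKit's actual code), assuming known camera intrinsics K:

```python
import numpy as np

def homography_from_points(src, dst):
    # Direct Linear Transform: each correspondence gives two rows of A h = 0
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.array(A, dtype=float))
    H = Vt[-1].reshape(3, 3)      # null-space vector holds the homography entries
    return H / H[2, 2]

def pose_from_homography(H, K):
    # For a planar marker (z = 0), H ~ K [r1 r2 t] up to scale
    M = np.linalg.inv(K) @ H
    s = np.linalg.norm(M[:, 0])   # recover the unknown scale from the column norm
    r1, r2, t = M[:, 0] / s, M[:, 1] / s, M[:, 2] / s
    r3 = np.cross(r1, r2)         # third rotation axis from orthogonality
    return np.column_stack([r1, r2, r3]), t
```

In practice the recovered rotation is re-orthonormalized and the pose is refined by minimizing reprojection error over the corner points.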
7. Natural Feature Tracking
• Use Natural Cues of Real Elements
• Edges
• Surface Texture
• Interest Points
• Model or Model-Free
• No visual pollution
[Figure: natural feature types (contours, feature points, surfaces)]
9. Detection and Tracking
[State diagram: detect-then-track loop]
• Detection (on start): recognize target type, detect target, initialize camera pose; loops while the tracking target is not detected
• Incremental tracking (entered when the tracking target is detected): fast; robust to blur, lighting changes, and tilt; loops while incremental tracking is ok
• When the tracking target is lost, control returns to Detection
• Tracking and detection are complementary approaches.
• After successful detection, the target is tracked incrementally.
• If the target is lost, detection is activated again.
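The loop above is a small state machine; a toy Python sketch (names are illustrative, not any particular SDK's API):

```python
from enum import Enum, auto

class State(Enum):
    DETECTION = auto()
    TRACKING = auto()

class TrackerFSM:
    """Two-state detect-then-track loop: detect the target, track it
    incrementally, and fall back to detection when tracking fails."""
    def __init__(self):
        self.state = State.DETECTION

    def step(self, frame_has_target, tracking_ok=False):
        if self.state == State.DETECTION:
            if frame_has_target:            # target detected -> start incremental tracking
                self.state = State.TRACKING
        else:                               # TRACKING
            if not tracking_ok:             # target lost -> re-activate detection
                self.state = State.DETECTION
        return self.state
```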
10. NFT – Real-time tracking
• Search for keypoints in the video image
• Create the descriptors
• Match the descriptors from the live video against those in the database
• Remove the keypoints that are outliers
• Use the remaining keypoints to calculate the pose of the camera
[Pipeline: Camera Image → Keypoint detection → Descriptor creation and matching → Outlier removal → Pose estimation and refinement → Pose (plus target recognition)]
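A toy version of the matching step, assuming ORB-style packed binary descriptors; real trackers use optimized matchers and follow this with RANSAC-based outlier removal before pose estimation:

```python
import numpy as np

def match_descriptors(desc_db, desc_live, ratio=0.8):
    """Brute-force match binary descriptors with a ratio test.
    desc_db, desc_live: (N, B) uint8 arrays of packed binary descriptors."""
    matches = []
    for i, d in enumerate(desc_live):
        # Hamming distance from this live descriptor to every database descriptor
        dists = np.unpackbits(desc_db ^ d, axis=1).sum(axis=1)
        j1, j2 = np.argsort(dists)[:2]
        if dists[j1] < ratio * dists[j2]:   # ratio test rejects ambiguous matches
            matches.append((i, j1))
    return matches
```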
12. Tracking from an Unknown Environment
• What to do when you don’t know any features?
• Very important problem in mobile robotics - Where am I?
• SLAM
• Simultaneously Localize And Map the environment
• Goal: to recover both camera pose and map structure
while initially knowing neither.
• Mapping:
• Building a map of the environment which the robot is in
• Localisation:
• Navigating this environment using the map while keeping
track of the robot’s relative position and orientation
13. Parallel Tracking and Mapping
• Parallel tracking and mapping uses two concurrent threads, one for tracking and one for mapping, which run at different speeds
• Tracking thread: estimates the camera pose for every frame; passes new keyframes to the mapping thread
• Mapping thread: extends and improves the map at a slower update rate; passes map updates back to the tracking thread
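The architecture can be sketched with two Python threads and a keyframe queue (a toy illustration of the structure, not PTAM's actual implementation):

```python
import threading, queue, time

keyframes = queue.Queue()       # tracking -> mapping: new keyframes
map_lock = threading.Lock()
shared_map = []                 # mapping -> tracking: map points

def tracking_loop(n_frames=5):
    for f in range(n_frames):   # fast loop: runs for every frame
        with map_lock:
            _pose = len(shared_map)   # stand-in for pose estimation against the map
        if f % 2 == 0:
            keyframes.put(f)    # occasionally promote a frame to a keyframe
    keyframes.put(None)         # signal the mapper to shut down

def mapping_loop():
    # slow loop: only wakes up when a new keyframe arrives
    while (kf := keyframes.get()) is not None:
        time.sleep(0.01)        # stand-in for bundle adjustment / map refinement
        with map_lock:
            shared_map.append(kf)   # extend and improve the map

t1 = threading.Thread(target=tracking_loop)
t2 = threading.Thread(target=mapping_loop)
t1.start(); t2.start(); t1.join(); t2.join()
```

The queue decouples the two rates: tracking never blocks on mapping, and the map is updated asynchronously under the lock.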
14. How SLAM Works
• Three main steps
1. Tracking a set of points through successive camera frames
2. Using these tracks to triangulate their 3D position
3. Simultaneously use the estimated point locations to calculate
the camera pose which could have observed them
• By observing a sufficient number of points, we can solve for both structure and motion (camera path and scene structure).
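Step 2 (triangulating a tracked point's 3D position from two views) can be done linearly; a minimal DLT sketch, assuming the two 3×4 camera projection matrices are known:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two camera
    projection matrices P1, P2 (3x4) and image observations x1, x2."""
    # Each view contributes two linear constraints on the homogeneous point X
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                  # null-space vector = homogeneous 3D point
    return X[:3] / X[3]         # homogeneous -> Euclidean coordinates
```

A full SLAM system alternates this with pose estimation and jointly refines both via bundle adjustment.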
16. Combining Sensors and Vision
• Sensors
• Produce noisy output (= jittering augmentations)
• Are not sufficiently accurate (= wrongly placed augmentations)
• Give us initial information on where we are in the world and what we are looking at
• Vision
• Is more accurate (= stable and correct augmentations)
• Requires choosing the correct keypoint database to track from
• Requires registering our local coordinate frame (online-
generated model) to the global one (world)
17. Example: Outdoor Hybrid Tracking
• Combines
• computer vision
• inertial gyroscope sensors
• Both correct for each other
• Inertial gyro
• provides frame to frame prediction of camera
orientation, fast sensing
• drifts over time
• Computer vision
• Natural feature tracking, corrects for gyro drift
• Slower, less accurate
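The fusion idea can be sketched as a one-axis complementary filter: integrate the gyro every frame, and pull the estimate toward the (slower) vision measurement so gyro drift cannot accumulate. The names and blend factor below are illustrative:

```python
def complementary_filter(yaw, gyro_rate, vision_yaw, dt, alpha=0.98):
    """Blend fast-but-drifting gyro integration with slow-but-stable
    vision measurements (toy single-axis sketch)."""
    gyro_estimate = yaw + gyro_rate * dt       # frame-to-frame gyro prediction
    if vision_yaw is None:                     # no vision fix this frame
        return gyro_estimate
    # mostly trust the gyro short-term, but let vision correct the drift
    return alpha * gyro_estimate + (1 - alpha) * vision_yaw
```

With vision corrections the drift of a biased gyro stays bounded; without them the error grows without limit.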
18. ARKit – Visual Inertial Odometry
• Uses both computer vision + inertial sensing
• Tracking position twice
• Computer Vision – feature tracking, 2D plane tracking
• Inertial sensing – using the phone IMU
• Output combined via Kalman filter
• Determine which output is most accurate
• Pass pose to ARKit SDK
• Each system complements the other
• Computer vision – needs visual features
• IMU - drifts over time, doesn’t need features
19. ARKit – Visual Inertial Odometry
• Slow camera
• Fast IMU
• If camera drops out IMU takes over
• Camera corrects IMU errors
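This predict-with-IMU, correct-with-camera cycle can be sketched as a one-dimensional Kalman-style filter (toy noise values, not ARKit's actual filter):

```python
def kalman_step(x, P, u, z, q=0.01, r=0.25):
    """One predict/update cycle of a 1-D Kalman filter.
    x, P - position estimate and its variance
    u    - IMU-integrated displacement since the last frame (prediction)
    z    - camera position measurement, or None if vision dropped out
    q, r - assumed process / measurement noise variances (illustrative)."""
    # Predict with the fast IMU
    x, P = x + u, P + q
    if z is None:                  # camera dropped out: IMU takes over
        return x, P
    # Correct with the slower but more accurate camera
    K = P / (P + r)                # Kalman gain weighs the two sources
    return x + K * (z - x), (1 - K) * P
```

The gain automatically favors whichever source is currently more certain, which is the "determine which output is most accurate" step above.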
21. Different Types of AR Interaction
1. Browsing Interfaces
• simple (conceptually!), unobtrusive
2. 3D AR Interfaces
• expressive, creative, require attention
3. Tangible Interfaces
• embedded into conventional environments
4. Tangible AR
• combines TUI input + AR display
5. Natural AR Interfaces
• interact with gesture, speech and gaze
22. 1. AR Interfaces as Data Browsers
• 2D/3D virtual objects are
registered in 3D
• “VR in Real World”
• Interaction
• 2D/3D virtual viewpoint control
• Applications
• Visualization, training
23. Example: Google Maps AR Mode
• AR Navigation Aid
• GPS + compass, 2D/3D object placement
24. Advantages and Disadvantages
• Important class of AR interfaces
• Wearable computers
• AR simulation, training
• Limited interactivity
• Modification of virtual
content is difficult
Rekimoto, et al. 1997
25. 2. 3D AR Interfaces
• Virtual objects displayed in 3D
physical space and manipulated
• HMDs and 6DOF head-tracking
• 6DOF hand trackers for input
• Interaction
• Viewpoint control
• Traditional 3D user interface interaction:
manipulation, selection, etc.
Kiyokawa, et al. 2000
26. Advantages and Disadvantages
• Important class of AR interfaces
• Entertainment, design, training
• Advantages
• User can interact with 3D virtual
object everywhere in space
• Natural, familiar interaction
• Disadvantages
• Usually no tactile feedback
• User has to use different devices
for virtual and physical objects
Oshima, et al. 2000
28. 3. Augmented Surfaces and Tangible Interfaces
• Basic principles
• Virtual images are projected
on a surface
• Physical objects are used as
controls for virtual objects
• Support for collaboration
Wellner, P. (1993). Interacting with paper on the
DigitalDesk. Communications of the ACM, 36(7), 87-96.
29. Augmented Surfaces
• Rekimoto, et al. 1999
• Front projection
• Marker-based tracking
• Multiple projection surfaces
• Object interaction
Rekimoto, J., & Saitoh, M. (1999, May). Augmented
surfaces: a spatially continuous work space for hybrid
computing environments. In Proceedings of the SIGCHI
conference on Human Factors in Computing
Systems (pp. 378-385).
37. I/O Brush (Ryokai, Marti, Ishii) - 2004
Ryokai, K., Marti, S., & Ishii, H. (2004, April). I/O brush: drawing with everyday objects as ink.
In Proceedings of the SIGCHI conference on Human factors in computing systems (pp. 303-310).
39. Many Other Examples
• Triangles (Gorbert 1998)
• Triangular based story telling
• ActiveCube (Kitamura 2000-)
• Cubes with sensors
• Reactable (2007- )
• Cube based music interface
40. Lessons from Tangible Interfaces
• Physical objects make us smart
• Norman’s “Things that Make Us Smart”
• encode affordances, constraints
• Objects aid collaboration
• establish shared meaning
• Objects increase understanding
• serve as cognitive artifacts
41. But There are TUI Limitations
• Difficult to change object properties
• can’t tell state of digital data
• Limited display capabilities
• projection screen = 2D
• dependent on physical display surface
• Separation between object and display
• ARgroove – Interact on table, look at screen
42. Advantages and Disadvantages
•Advantages
• Natural - user’s hands are used for interacting
with both virtual and real objects.
• No need for special purpose input devices
•Disadvantages
• Interaction is limited to the 2D surface
• Full 3D interaction and manipulation is difficult
43. Orthogonal Nature of Interfaces
• Spatial Gap
• 3D AR interfaces: No – interaction is everywhere
• Tangible interfaces: Yes – interaction is only on 2D surfaces
• Interaction Gap
• 3D AR interfaces: Yes – separate devices for physical and virtual objects
• Tangible interfaces: No – same devices for physical and virtual objects
45. 4. Tangible AR: Back to the Real World
• AR overcomes display limitation of TUIs
• enhance display possibilities
• merge task/display space
• provide public and private views
• TUI + AR = Tangible AR
• Apply TUI methods to AR interface design
Billinghurst, M., Kato, H., & Poupyrev, I. (2008). Tangible augmented reality. ACM Siggraph Asia, 7(2), 1-10.
46. Space vs. Time - Multiplexed
• Space-multiplexed
• Many devices each with one function
• Quicker to use, more intuitive, clutter
• Real Toolbox
• Time-multiplexed
• One device with many functions
• Space efficient
• mouse
47. Tangible AR: Tiles (Space Multiplexed)
• Tiles semantics
• data tiles
• operation tiles
• Operation on tiles
• proximity
• spatial arrangements
• space-multiplexed
Poupyrev, I., Tan, D. S., Billinghurst, M., Kato, H., Regenbrecht, H., & Tetsutani, N. (2001,
July). Tiles: A Mixed Reality Authoring Interface. In Interact (Vol. 1, pp. 334-341).
51. Tangible AR: Time-multiplexed Interaction
• Use of natural physical object manipulations to control
virtual objects
• VOMAR Demo
• Catalog book:
• Turn over the page
• Paddle operation:
• Push, shake, incline, hit, scoop
Kato, H., Billinghurst, M., Poupyrev, I., Imamoto, K., & Tachibana, K. (2000, October). Virtual object manipulation on a table-top AR
environment. In Proceedings IEEE and ACM International Symposium on Augmented Reality (ISAR 2000) (pp. 111-119). IEEE.
54. Advantages and Disadvantages
•Advantages
• Natural interaction with virtual and physical tools
• No need for special purpose input devices
• Spatial interaction with virtual objects
• 3D manipulation with virtual objects anywhere in space
•Disadvantages
• Requires Head Mounted Display
55. 5. Natural AR Interfaces
• Goal:
• Interact with AR content the same
way we interact in the real world
• Using natural user input
• Body motion
• Gesture
• Gaze
• Speech
• Input recognition
• Natural gestures, gaze
• Multimodal input
FingARtips (2004)
Tinmith (2001)
56. External Fixed Cameras
• Overhead depth sensing camera
• Capture real time hand model
• Create point cloud model
• Overlay graphics on AR view
• Perform gesture interaction
Billinghurst, M., Piumsomboon, T., & Bai, H. (2014). Hands in space: Gesture interaction with
augmented-reality interfaces. IEEE computer graphics and applications, 34(1), 77-80.
58. Head Mounted Cameras
• Attach cameras/depth sensor to HMD
• Connect to high end PC
• Computer vision capture/processing on PC
• Perform tracking/gesture recognition on PC
• Use custom tracking hardware
• Leap Motion (Structured IR)
• Intel RealSense (Stereo depth)
Project NorthStar (2018)
Meta2 (2016)
62. Speech Input
• Reliable speech recognition
• Windows speech, Watson, etc.
• Indirect input with AR content
• No need for gesture
• Match with gaze/head pointing
• Look to select target
• Good for Quantitative input
• Numbers, text, etc.
• Keyword trigger
• “select”, “Hey Cortana”, etc. https://www.youtube.com/watch?v=eHMkOpNUtR8
63. Eye Tracking Interfaces
• Use IR light to find gaze direction
• IR sources + cameras in HMD
• Support implicit input
• Always look before interact
• Natural pointing input
• Multimodal Input
• Combine with gesture/speech
[Figure: HoloLens 2 eye tracking hardware – camera and IR light sources in the HMD, with the raw IR view and the processed image]
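Combining modalities can be as simple as letting gaze choose the target and speech trigger the action; a hypothetical helper (all names are illustrative):

```python
def multimodal_select(gaze_target, utterance, keywords=("select", "open")):
    """Toy multimodal fusion: gaze picks the object, a spoken keyword
    triggers the action on it. Returns (action, target) or None."""
    words = utterance.lower().split()
    for kw in keywords:
        if kw in words and gaze_target is not None:
            return (kw, gaze_target)
    return None   # no keyword heard, or the user was not looking at anything
```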
65. Evolution of AR Interfaces
• Browsing: simple input, viewpoint control
• 3D AR: 3D UI, dedicated controllers, custom devices
• Tangible UI: augmented surfaces, object interaction, familiar controllers, indirect interaction
• Tangible AR: tangible input, AR overlay, direct interaction
• Natural AR: freehand gesture, speech, gaze
(expressiveness and intuitiveness increase along this progression)
67. Interaction Design
“Designing interactive products to support
people in their everyday and working lives”
Preece, J., (2002). Interaction Design
• Design of User Experience with Technology
68. Bill Verplank on Interaction Design
https://www.youtube.com/watch?v=Gk6XAmALOWI
69. •Interaction Design involves answering three questions:
•What do you do? - How do you affect the world?
•What do you feel? – What do you sense of the world?
•What do you know? – What do you learn?
Bill Verplank
70. Typical Interaction Design Cycle
Develop alternative prototypes/concepts and compare them, and iterate, iterate, iterate....
82. Tom Chi’s Prototyping Rules
1. Find the quickest path to experience
2. Doing is the best kind of thinking
3. Use materials that move at the speed of
thought to maximize your rate of learning
83. How can we quickly prototype
XR experiences with little or no
coding?
91. Advantages/Disadvantages
• Low-fidelity prototype
• Advantages: low development cost; can evaluate multiple design concepts
• Disadvantages: limited error checking; navigation and flow limitations
• High-fidelity prototype
• Advantages: fully interactive; look and feel of final product; clearly defined navigation scheme
• Disadvantages: more expensive to develop; time-consuming to build; developers are reluctant to change something they have crafted for hours
98. Buxton’s Key Attributes of Sketching
• Quick
• Work at speed of thought
• Timely
• Always available
• Disposable
• Inexpensive, little investment
• Plentiful
• Easy to iterate
• A catalyst
• Evokes conversations
106. Wireframes
It’s about
- Functional specs
- Navigation and interaction
- Functionality and layout
- How interface elements work together
- Defining the interaction flow/experience
Leaving room for the design to be created
117. Microsoft Maquette
•Prototype AR/VR interfaces from inside VR
•3D UI for spatial prototyping
•Bring content into Unity with plug-in
•JavaScript support
119. Scene Assembly In AR
• Many tools for creating AR scenes
• Drag and drop your assets
• Develop on web, publish to mobile
• Examples
• Catchoom - CraftAR
• Blippar - Blipbuilder
• ARloopa - Arloopa studio
• Wikitude - Wikitude studio
• Zappar - ZapWorks Designer
120. CraftAR
•Web-based AR marker tracking
•Add 3D models, video, images to real print content
•Simple drag and drop interface
•Cloud based image recognition
•https://catchoom.com/augmented-reality-craftar/
127. AR Prototyping with Layers
● Separate world-stabilized and head stabilized
○ Draw world stabilized on background paper
○ Draw head stabilized on transparent plastic
● Simulate Field of View of AR HMD
Lauber, F., Böttcher, C., & Butz, A. (2014, September). Papar: Paper prototyping for augmented reality. In Adjunct Proceedings
of the 6th International Conference on Automotive User Interfaces and Interactive Vehicular Applications (pp. 1-6).
[Figure: paper AR prototype – FOV shown in red; head-stabilized foreground layer over a world-stabilized background]
128. Example: Mobile AR Prototyping
https://www.youtube.com/watch?v=Hrbflbct99o
138. Video Sketching
• Process
• Capture elements of real world
• Use series of still photos/sketches in a movie format.
• Act out using the product
• Benefits
• Demonstrates the product experience
• Discover where concept needs fleshing out.
• Communicate experience and interface
• You can use whatever tools you want, e.g. iMovie.
143. Tvori - Animating AR/VR Interfaces
• Animation based AR/VR prototyping tool
• https://tvori.co/
• Key features
• Model input, animation, etc
• Export 360 images and video
• Simulate AR views
• Multi-user support
• Present in VR
• Create VR executable