A Practical Guide to WebRTC
1. A Practical Guide to Building WebRTC Apps
Ben Strong
ben@vline.com
https://vline.com
@vlineinc
vLine
10/09/2013
2. My Background with WebRTC
8/2011
Founded vLine in early 2011 to build tools and infrastructure for WebRTC
Edge locations on 5 continents
Global Infrastructure
Developer SDKs
Built what was probably the first WebRTC app
8/2013
vLine
3. Topics
1. Making Mobile Work Well
2. Choosing a Strategy for Multi-Party Conferencing
3. UI Considerations
5. Mobile: The Basics
Android
• Works in Chrome 29+ and Firefox 24+
• Google maintains Java PeerConnection APIs
• Works pretty well
iOS
• Not supported in any browser (thanks to Apple’s browser engine policies)
• Google maintains Obj-C APIs
• Voice works well
• Video can be made to work well (this will get easier)
6. Android: Browser vs. Native
Browser Pros
• Re-use the app built for desktop browsers
• Supports the most powerful part of WebRTC: links
Native App Pros
• Push notifications!!!
• Other native APIs like address book and background services
• Easier to tune for particular devices
• Slightly better performance
7. Android: Browser vs. Native
Our Answer: Both
• Make your web app work well on mobile, so that users without the native app have a low-friction experience
• Run the same app in PhoneGap with push notifications, address book integration, etc.
8. Mobile: Video Quality
• Mobile devices are less powerful than laptops and desktops (duh)
• Wide range of capabilities among mobile devices (the gap between high-end devices and low-end laptops is smaller than the gap between low-end and high-end mobile devices)
• WebRTC methods for adapting to device capabilities don’t work well (yet)
9. Mobile: How WebRTC Adapts
Two Methods:
• Video Adapter reduces encode resolution if the CPU is overloaded
• Remote Bitrate Estimator tells the other side to reduce encode bitrate if the decoder isn’t keeping up
Problems
• Only kick in after quality is visibly degraded
• Encoder and decoder compete for CPU, with unpredictable interactions
• Result is often bitrates of 50kbps on devices and networks that could support much better
10. Mobile: Device Adaptation
The unfortunate reality
You’ll get much better results by imposing limits on resolution, framerate, and bitrate than by relying on the built-in adaptation methods.
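11. Device Adaptation: The Knobs
Framerate and resolution are controlled by MediaConstraints passed to getUserMedia().
function getStream(width, height, framerate, onStream, onFailure) {
  // Legacy Chrome constraint syntax: setting min and max to the same
  // value pins the capture size exactly.
  var constraints = {
    video: {
      mandatory: {
        minWidth: width,
        maxWidth: width,
        minHeight: height,
        maxHeight: height,
        maxFrameRate: framerate
      }
    }
  };
  // onStream receives the LocalMediaStream; onFailure receives the error.
  navigator.webkitGetUserMedia(constraints, onStream, onFailure);
}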
Note: On Chrome, if another LocalMediaStream is active, the constraints will not
take effect. Stop all other streams first!
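The note above can be sketched as a small helper that the app calls before requesting a stream with new constraints. This is a minimal sketch, not vLine's code: the name stopAllStreams and the idea that the app keeps its own array of active streams are assumptions made for illustration. It handles both the per-track stop() of current browsers and the stream-level stop() that Chrome exposed when this talk was given.

```javascript
// Stop every local stream the app is holding so that new constraints
// can take effect on the next getUserMedia() call.
// `streams` is an array of MediaStream objects tracked by the app (assumption).
function stopAllStreams(streams) {
  streams.forEach(function (stream) {
    if (typeof stream.getTracks === 'function') {
      // Current API: each track is stopped individually.
      stream.getTracks().forEach(function (track) {
        track.stop();
      });
    } else if (typeof stream.stop === 'function') {
      // Legacy API: the stream itself had a stop() method.
      stream.stop();
    }
  });
  // Return how many streams were processed, which is handy for logging.
  return streams.length;
}
```

A caller would run stopAllStreams(activeStreams), clear its array, and only then call getStream() with the new width, height, and framerate.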
12. Device Adaptation: The Knobs
Max bitrate and quantization are set by adding VP8 fmtp parameters to the SDP (Chrome only)
...
a=rtpmap:100 VP8/90000
a=fmtp:100 x-google-max-bitrate=700; x-google-max-quantization=20
...
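One way to add these parameters is to rewrite the SDP string before handing it to setLocalDescription() or sending it to the peer. The sketch below is illustrative only: the function name limitVp8 is hypothetical, it assumes the bitrate value is in kbps, and it assumes the SDP has no existing a=fmtp line for the VP8 payload type (if it does, the parameters should be merged into that line instead).

```javascript
// Insert Chrome-specific VP8 limits into an SDP blob (string in, string out).
function limitVp8(sdp, maxBitrateKbps, maxQuantization) {
  var out = [];
  sdp.split('\r\n').forEach(function (line) {
    out.push(line);
    // Find the payload type assigned to VP8, e.g. "a=rtpmap:100 VP8/90000".
    var match = /^a=rtpmap:(\d+) VP8\/90000/.exec(line);
    if (match) {
      // Add the fmtp line directly after the matching rtpmap line.
      out.push('a=fmtp:' + match[1] +
               ' x-google-max-bitrate=' + maxBitrateKbps +
               '; x-google-max-quantization=' + maxQuantization);
    }
  });
  return out.join('\r\n');
}
```

Looking up the payload type rather than hard-coding 100 matters because the number is assigned per session and can differ between browsers and versions.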
13. Mobile: Choosing Parameters
Two Strategies:
1. If at least one party is on mobile, set very low limits on resolution, framerate, and bitrate (e.g., 320x240 at 15fps and 400kbps)
2. Detect device capabilities, signal them to peers, and adapt encoding parameters appropriately
• Set encoding parameters to the minimum of what the local device can encode and the remote device can decode
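The second strategy boils down to taking a field-by-field minimum of two capability records. The sketch below assumes a hypothetical capability object with width, height, framerate, and bitrateKbps fields that each side has already signaled to the other; the talk does not specify a format. Note that taking the minimum of width and height independently can change the aspect ratio, so a real app may prefer to pick from a fixed list of resolutions.

```javascript
// Pick encoding limits that both ends can handle.
// localEncode: what this device can encode; remoteDecode: what the peer can decode.
function negotiateEncoding(localEncode, remoteDecode) {
  return {
    width: Math.min(localEncode.width, remoteDecode.width),
    height: Math.min(localEncode.height, remoteDecode.height),
    framerate: Math.min(localEncode.framerate, remoteDecode.framerate),
    bitrateKbps: Math.min(localEncode.bitrateKbps, remoteDecode.bitrateKbps)
  };
}
```

For example, a laptop that can encode 1280x720 at 30fps talking to a phone that can decode 320x240 at 15fps and 400kbps would end up encoding at the phone's limits.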
14. Mobile: Choosing Parameters
Detecting Device Capabilities
1. For native apps, Android and iOS APIs will give you fine-grained capabilities
2. For browser-based apps, run a JavaScript benchmark
• Run it once in a single web worker.
• Run it in parallel on 4 web workers.
• Infer CPU clock rate and number of cores.
3. Never encode at a higher resolution than the remote device’s screen!
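The inference step of the benchmark can be sketched with simple arithmetic. This is an assumption about how the numbers might be combined, not the method used in the talk: if one worker finishes the benchmark in singleMs, and N workers each running the same benchmark in parallel finish in parallelMs, then N * singleMs / parallelMs approximates the number of cores, capped at N. The function names are hypothetical, and estimating clock rate would additionally need calibration against known devices, which is not shown. The second helper applies the screen-size rule from point 3.

```javascript
// Estimate core count from two benchmark timings (both in milliseconds).
function estimateCores(singleMs, parallelMs, workerCount) {
  var ratio = workerCount * singleMs / parallelMs;
  // Clamp to the range the test can actually distinguish: 1..workerCount.
  return Math.max(1, Math.min(workerCount, Math.round(ratio)));
}

// Never encode at a higher resolution than the remote device's screen.
function capToScreen(encode, remoteScreen) {
  return {
    width: Math.min(encode.width, remoteScreen.width),
    height: Math.min(encode.height, remoteScreen.height)
  };
}
```

With 4 workers, a device whose parallel run takes twice as long as its single run would be estimated at 2 cores.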
15. A Mobile Bonus: Mass-Market Telepresence
With HDMI outputs, 1080p-capable hardware codecs, and support for Bluetooth conference mics, Android devices make the perfect embedded telepresence systems, starting at $100 per room.
• HP Slate 21 - $400. Tegra 4 processor and 21” screen. Perfect on a desktop or wall-mounted in a small conference room.
• Galaxy NX - $1600. Exynos-5 Octa processor plus APS-C sensor. Mount it on a wall and plug in a TV for a room system with broadcast-quality video.
• HDMI Dongle - $50. RK3188 processor. Plug in a USB camera and a TV for a 1080p-capable room system.
• Note 10 - $500. Exynos-5 Octa processor. Put it on a conference table and plug in a wall-mounted TV and USB camera to power a larger room.
18. Multi-Party: Mesh Topology
Mesh Pros
• No server (simpler and cheaper infrastructure)
Mesh Cons
• More complex session management
• Only scales to about 4 participants
• Poor results on mobile devices (which are often hard-pressed to support a single peer)
19. Multi-Party: Star Topology
Star Pros
• Less processing and network load on clients
• Higher quality on low-end devices
• Potentially unlimited number of participants
• Good place to record, do speaker detection, etc.
Star Cons
• Have to run lots of servers (complex and expensive)
• For fast machines on fast network connections, quality may be worse (mostly remedied with geo-distributed servers)
20. Multi-Party: Star Topology
How the Star Topology Works
• All clients connect to a central server (usually called an MCU or Media Router)
• Clients encode and send one stream. The MCU fans it out.
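The load difference between the two topologies is easy to quantify. The sketch below counts streams per client and total connections for a call with a given number of participants. It assumes the star server forwards every other participant's stream to each client (a mixing MCU would send only one), and the function name streamCounts is made up for this example.

```javascript
// Compare per-client stream counts for mesh and star topologies.
function streamCounts(participants) {
  var n = participants;
  return {
    mesh: {
      sentPerClient: n - 1,            // one encode/upload per peer
      receivedPerClient: n - 1,
      totalConnections: n * (n - 1) / 2
    },
    star: {
      sentPerClient: 1,                // one upload to the server
      receivedPerClient: n - 1,        // assumes forwarding, not mixing
      totalConnections: n
    }
  };
}
```

At 4 participants each mesh client already encodes and uploads 3 streams, which is consistent with mesh scaling to only about 4 participants and struggling on mobile.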
21. Multi-Party: Star Topology
MCU/Router Strategies
• Mix/composite once and re-encode a single stream for each client (best quality, lowest load on client, very little flexibility on client, high load on server)
• Re-encode each stream for each client (high quality, lots of flexibility on client, more load on client, very high load on server)
• Restrict bandwidth, resolution, and framerate to the lowest common denominator and just forward (low load, potentially poor quality)
• Advanced strategies: temporal scaling and simulcast
22. Multi-Party: Star Topology
Mixed approach is probably best
• Send two streams to the server from each client (low and high res)
• For small differences, adapt to the lowest common denominator and forward.
• For large differences, re-encode.
• Use temporal scaling to fine-tune (if you control the encoder)
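The forward-or-re-encode decision can be sketched as a comparison of what the receivers can handle. This is one possible reading of the bullets above, not the speaker's implementation: "difference" is modeled as the ratio between the highest and lowest receiver bitrate, the threshold is a hypothetical tuning value, and the two-stream upload and temporal scaling are not modeled. The input is assumed to be a non-empty list of positive values.

```javascript
// Decide whether the server can simply forward or needs to re-encode.
// receiverKbps: bitrate each receiver can handle; ratioThreshold: hypothetical cutoff.
function planForwarding(receiverKbps, ratioThreshold) {
  var lowest = Math.min.apply(null, receiverKbps);
  var highest = Math.max.apply(null, receiverKbps);
  if (highest / lowest <= ratioThreshold) {
    // Small difference: limit the sender to the weakest receiver and forward.
    return { action: 'forward', targetKbps: lowest };
  }
  // Large difference: forwarding would penalize the stronger receivers.
  return { action: 're-encode', targetKbps: null };
}
```

With a threshold of 2, receivers at 400, 450, and 500kbps share one forwarded stream at 400kbps, while receivers at 200 and 1500kbps trigger re-encoding.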
23. UI Considerations
For Video, Bigger is Better
• Use full window
• Encourage full-screen (especially on mobile)
• HD is best of all, but make sure devices on both ends can handle the CPU load, or you’ll get a worse experience.
24. UI Considerations
No one notices the getUserMedia permissions UI!
• Show an arrow to point at it and don’t let them do anything else before accepting
• Be aware:
• In Chrome, button positions vary by platform.
• Firefox is different.
• On Android phones, buttons are at the bottom.
25. UI Considerations
Feedback for states and errors
• If you have a mute button, make it obvious when you're muted
• Give the user feedback that you are trying to connect (or reconnect)
• If a session ends, distinguish between the other party hanging up and losing the connection
• Use the stats API to detect poor network conditions and provide feedback. The user can then do something about it (e.g., move closer to the Wi-Fi access point)
• If video bitrate stays at 50kbps, encourage (or force) the user to switch to voice only
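The last point can be sketched as a small monitor fed with byte counters polled from the stats API. The sketch assumes the app reads a cumulative received-bytes counter for the video stream at regular intervals; how that counter is obtained is left out, and the names createBitrateMonitor and addSample, as well as the "N low samples in a row" rule, are assumptions made for illustration.

```javascript
// Track video bitrate from cumulative byte counters and flag a stuck-low call.
function createBitrateMonitor(thresholdKbps, samplesNeeded) {
  var last = null;   // previous sample: { bytes, timeMs }
  var lowCount = 0;  // consecutive samples at or below the threshold
  return function addSample(bytesReceived, timeMs) {
    var suggestVoiceOnly = false;
    if (last !== null && timeMs > last.timeMs) {
      // Bits per millisecond is numerically equal to kilobits per second.
      var kbps = (bytesReceived - last.bytes) * 8 / (timeMs - last.timeMs);
      lowCount = kbps <= thresholdKbps ? lowCount + 1 : 0;
      suggestVoiceOnly = lowCount >= samplesNeeded;
    }
    last = { bytes: bytesReceived, timeMs: timeMs };
    return suggestVoiceOnly;
  };
}
```

An app might create the monitor with createBitrateMonitor(50, 5), poll once per second, and show a "switch to voice only" prompt the first time addSample() returns true.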