This document discusses GStreamer support and integration in WebKit. It covers:
1. The current integration of GStreamer for HTML5 audio and video playback, as well as WebAudio.
2. Plans for next-generation video rendering using GstGL to leverage hardware acceleration.
4. Support for adaptive streaming using Media Source Extensions and DASH playback via GStreamer.
5. Encrypted media playback support via Encrypted Media Extensions for DRM-protected content.
5. Progress on WebRTC integration for real-time communication capabilities.
2. Talk outline
1. Brief introduction about WebKit
2. Current integration of GStreamer in WebKit
3. Next generation video rendering with GstGL
4. Adaptive streaming with Media Source Extensions
5. Protected content playback with Encrypted Media Extensions
6. WebRTC: are we there yet?
4. WebKit: web content rendering
● Originally a KHTML fork by Apple, open-sourced in 2005
● Contributions from other big companies: Samsung, Intel, Adobe, Canon, …
● Forked by Google in 2012: Blink
● Ports maintained upstream: Mac, GTK, EFL, Windows
● Downstream ports:
○ QtWebKit: deprecated, replaced by QtWebEngine (powered by Blink)
○ WebKitForWayland: tight integration with modern graphics APIs. Soon upstream!
5. WebKit’s building blocks
● WTF (Web Template Framework): utility/commodity classes
● JavaScriptCore engine
● WebCore: internal library transforming HTML to render layers
○ DOM management, HTML5 specification implementations
○ platform integration for network support, graphics rendering, multimedia, etc…
● WebKit2: high level API and multi-process architecture
○ WebProcess running most of WebCore
○ NetworkProcess managing resources loading
○ UIProcess: graphical rendering in a widget
○ Widget Toolkit public API for GTK+, EFL, etc
7. <audio> and <video> HTML5 elements
● Media player based on playbin
● One source element for reading data over HTTP(S) with WebCore’s network resource loader
● Dedicated video sink for rendering to a video layer
● Basic hardware acceleration support (GLTextureUploadMeta)
● Audio/Video/Text tracks integration
● Trick modes, frame-accurate seeking, on-disk buffering
● Codec installer support
8. WebAudio
● Use case: fine-grained control over sound samples. Examples: games, music production apps.
GStreamer backend:
● One decoder pipeline to inject audio from files/memory into WebCore’s WebAudio framework
● One pipeline for playback of generated audio samples
● Integration with the media player to inject <video>/<audio> audio samples into WebAudio
10. Current approach: WebCore’s internal sink
● Allocation query management and handling of the various Meta negotiations
● Currently only supports GLTextureUploadMeta
● Passes video frames to the player using a signal
=> Basically a Video AppSink
11. New approach using glimagesink
● Handles all the exotic Meta negotiation and allocation for us
● Handles texture conversions and zero-copy within playbin
● Passes textures to WebKit for rendering using the client-draw signal
● Double-buffers within the MediaPlayer and passes textures directly to the layer compositor
● Hardware acceleration and zero-copy support for free
● Reduced code maintenance in the long term!
13. Media Source Extensions
● Gives the web app full control over the media content
○ JavaScript feeds data blocks to the player
○ → Buffered ranges available for playback
○ Web-app is aware of the playback lifecycle and of the streams embedded in the media
● Use-cases:
○ Adaptive streaming (example: DASH), needed for YouTube 2015 conformance
○ Time shifting live streams
○ Rich UI showing buffering/bitrate statistics
○ ...
14. How it works
● JavaScript API:
○ MediaPlayer: Associated with the <video> tag, configured with a MediaSource
○ MediaSource: Aggregates all the streams of a particular video
○ SourceBuffer:
■ Feedable data source for a particular stream (audio, video, text)
■ Aware of buffered time ranges (Samples)
○ Track: Exposes metadata, can get Samples from the SourceBuffer
● GStreamer implementation:
○ One “append” pipeline per SourceBuffer → generate Samples → stored in the MSE layer
○ One playback pipeline → getting Samples from the MSE layer
15. “Append” Pipeline
appsrc → typefind → qtdemux → appsink
● appsrc receives the raw data appended to a SourceBuffer in the MSE layer
● appsink delivers the demuxed samples to the audio/video/text track in the MSE layer
● Works independently of the player state
● Work in progress: multiple demuxed streams per raw data stream
● Work in progress: better demuxer reconfiguration (caps changes) support
16. Playback Pipeline - playbin-based
● WebKitMediaSrc is plugged into playbin as the source element
● Inside WebKitMediaSrc, one appsrc → parser bin branch per stream (Stream 1 … Stream N), each fed with Samples from its track in the MSE layer
● Downstream, decodebin decodes each stream and an inputselector per output type (Audio, Video) feeds playsink
18. Encrypted Media Extensions use-case
● Initial spec from Netflix, Microsoft and Google. Apple also involved.
● Implemented in Chrome, Safari, Edge
● Decryption algorithms not mandated by the spec
● Protected content playback:
1. Decryption key negotiation: need-key HTML media event
2. Key acquisition from a license server (CDM interface)
3. Media content decryption
4. Secure video rendering (Protected access to raw video frames)
19. GStreamer’s protection event signaling
1. Demuxer detects encrypted content (ex: pssh box, PIFF UUID box).
2. Demuxer sends protection GstEvent(s).
3. Demuxer exposes protection system in src pad caps.
4. Decodebin hooks up a decryptor element supporting those caps before the parser and decoder
No upstream decryptor implementation provided. Only signaling.
21. “Secure” video rendering - ugly approach
● Main requirement: protection of raw video frames from user space / main memory
● glimagesink and webkit videosink don’t comply
● Usual hack in a surprising number of downstream WebKit forks:
○ render a transparent area instead of the actual video
○ defer video rendering to a separate player / process
○ render in an “under-layer”
○ ⇒ integration with CSS/WebGL/Canvas lost
○ ⇒ bad scrolling support
● Not acceptable for WebKit upstream
22. “Secure” video rendering with OP-TEE
● Open Portable Trusted Execution Environment on ARM’s TrustZone
● Decryption performed in OP-TEE OS
● Video4Linux2 mem-to-mem support
● DMA-BUF fds exchange between decoder and renderer
24. Getting there!
● OpenWebRTC is maturing
● Ericsson / Centricular / Igalia / Temasys http://www.webrtcinwebkit.org
● PeerConnection backend improving a lot, video calls now possible!
● DataChannel support also prototyped
● Soon in WebKitGTK upstream