3. What we support
Out of the box, Firefox can decode the following codecs:
• Video: VP8 (ffvp8), VP9 (ffvp9) and Theora (libtheora)
• Audio: Vorbis (libvorbis), Opus (libopus) and FLAC (ffmpeg)
Firefox relies on installed system frameworks for H264, AAC and MP3:
• Windows: Media Foundation Transform (MFT), which supports hardware
acceleration in combination with D3D9 and D3D11. Not available on
XP. The N (European) and KN (Korean) editions require installing
extra packages.
• Mac: Video Toolbox, supports hardware acceleration; CoreMedia.
• Linux and others: FFmpeg. Software decoding only.
4. Media Source Extensions
Everything in the current draft specification is supported except:
• MPEG-TS
• raw AAC and MP3 streams
• Anything related to the Track elements
Limitations:
• All multichannel audio tracks are downmixed to stereo.
• Only one source buffer type (audio or video) at once.
5. Media Source Extensions
Always supported when we have local decoders:
video/mp4: H264, AAC, MP3. Opus and FLAC coming soon.
video/webm: VP8, VP9, Vorbis and Opus
Note for WebM:
The VP8 and VP9 codecs are only available by default if at least one
of the following conditions is true:
• No H264 decoder is found.
• No hardware acceleration is available (typically blacklisted drivers).
• The machine is deemed fast enough.
• The media.mediasource.webm.enabled preference is set to true.
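Because WebM availability under MSE depends on the conditions above, a page should probe for support rather than assume it. The following is a minimal sketch, not taken from the slides: the function name `pickSupportedType`, the candidate list and the codec strings are illustrative assumptions. The probe is injected so the logic can run outside a browser; in a page it would typically wrap `MediaSource.isTypeSupported`.

```typescript
// Candidate MIME types in order of preference. These codec strings are
// common examples chosen for illustration, not a list from the slides.
const CANDIDATE_TYPES: string[] = [
  'video/webm; codecs="vp9, opus"',
  'video/webm; codecs="vp8, vorbis"',
  'video/mp4; codecs="avc1.42E01E, mp4a.40.2"',
];

// Returns the first type the probe accepts, or null if none is accepted.
// `isSupported` is a parameter so the function does not depend on browser
// globals.
function pickSupportedType(
  candidates: string[],
  isSupported: (mimeType: string) => boolean,
): string | null {
  for (const type of candidates) {
    if (isSupported(type)) {
      return type;
    }
  }
  return null;
}

// In a browser the call would look like:
//   pickSupportedType(CANDIDATE_TYPES, (t) => MediaSource.isTypeSupported(t));
// Here a stand-in probe simulates a profile where only MP4 is usable in MSE.
const mp4OnlyProbe = (t: string): boolean => t.indexOf("video/mp4") === 0;
const chosenType = pickSupportedType(CANDIDATE_TYPES, mp4OnlyProbe);
```

With the stand-in probe, the WebM candidates are rejected and the MP4 entry is selected, which is the fallback path a player needs when none of the WebM conditions hold.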
6. HTML5 Media Element Architecture (Plain)
All operations between the media element and the media stacks
are asynchronous and use a Promise-like communication mechanism.
[Diagram: JS (currentTime, readyState, Load, Play / Pause, Seek); HTML Media Element (manages events and user operations); Media Stack (loading, demuxing, decoding); Video Compositor; Audio Renderer]
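To illustrate what an asynchronous, promise-like exchange between the element and the stack means in practice, here is a small sketch. It is not Gecko code (Gecko is written in C++); the classes `MediaStackSketch` and `MediaElementSketch` and their methods are hypothetical names made up for this example.

```typescript
// Hypothetical stand-in for the media stack: every request returns a Promise.
class MediaStackSketch {
  seek(target: number): Promise<number> {
    // Resolve on a later microtask; a real stack would resolve once its
    // demuxing and decoding work for the seek has completed.
    return Promise.resolve().then(() => target);
  }
}

// Hypothetical stand-in for the media element, which tracks user-visible state.
class MediaElementSketch {
  currentTime = 0;
  seeking = false;
  private stack: MediaStackSketch;

  constructor(stack: MediaStackSketch) {
    this.stack = stack;
  }

  // Starts a seek and returns immediately; state is updated only when the
  // stack's promise resolves.
  seekTo(target: number): Promise<void> {
    this.seeking = true;
    return this.stack.seek(target).then((reached) => {
      this.currentTime = reached;
      this.seeking = false;
    });
  }
}

const sketchElement = new MediaElementSketch(new MediaStackSketch());
const pendingSeek = sketchElement.seekTo(42);
// At this point the call has returned, but currentTime is still 0 and
// seeking is still true: the element never blocks on the stack.
```

The design point is that the element only records the request and reacts when the promise settles, so user operations never wait on loading or decoding.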
7. Media Stack (Plain)
Asynchronous
Heavily multi-threaded
[Diagram: MediaResource; MediaCache; MediaDecoder; State Machine; MediaFormatReader; MediaDataDemuxer; MediaDataDecoder (×3); Platform Module]
8. HTML5 Media Element Architecture (MSE)
All operations between the media element and the media stacks
are asynchronous and use a Promise-like communication mechanism.
[Diagram: JS (currentTime, readyState, Load, Play / Pause, Seek); MediaSource; SourceBuffer (×2); HTML Media Element (manages events and user operations); Media Stack (loading, demuxing, decoding); Video Compositor; Audio Renderer]
9. Media Stack (MSE)
Asynchronous
Heavily multi-threaded
[Diagram: MediaSourceResource; TrackBuffer; MediaDecoder; State Machine; MediaFormatReader; MediaDataDemuxer; MediaDataDecoder (×3); Platform Module]
10. Implementation Notes
• Mostly written in C++
• All demuxers are written in-house. While we often use external
libraries to provide core features, we control the entire demuxing
chain.
11. MSE Implementation Notes
Eviction strategies:
• In 50 and earlier: 100MB video source buffer, 30MB audio source
buffer (both were 100MB in 48 and earlier).
• In 51 and later: 100MB video, 10MB audio.
First, attempt to evict data located before currentTime.
Second, attempt to evict future data located after a discontinuity.
In the future, we are considering dropping the fixed size and instead
basing eviction on the duration of buffered data (e.g. 30s for both
audio and video).
Combined maximum buffer size shared across all source buffers.
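The two-pass eviction order described above can be sketched as follows. This is an illustration under stated assumptions, not Firefox's implementation: the `Chunk` model, the `evictSketch` function, the `GAP_EPSILON` value, and the choice to evict the furthest future data first are all hypothetical.

```typescript
// A buffered piece of media: time range in seconds plus its size.
interface Chunk {
  start: number;
  end: number;
  bytes: number;
}

interface EvictionResult {
  kept: Chunk[];
  freed: number;
}

// Assumed threshold for treating two chunks as non-contiguous.
const GAP_EPSILON = 0.001;

function evictSketch(chunks: Chunk[], currentTime: number, bytesToFree: number): EvictionResult {
  const entries = chunks
    .slice()
    .sort((a, b) => a.start - b.start)
    .map((chunk) => ({ chunk, removed: false }));
  let freed = 0;

  // Pass 1: drop chunks that end at or before currentTime, oldest first.
  for (const entry of entries) {
    if (freed >= bytesToFree) break;
    if (entry.chunk.end <= currentTime) {
      entry.removed = true;
      freed += entry.chunk.bytes;
    }
  }

  // Pass 2: collect chunks that sit after the first discontinuity ahead of
  // currentTime, then drop them starting with the furthest one (assumption).
  const afterGap: typeof entries = [];
  let prevEnd = Number.NEGATIVE_INFINITY;
  let gapSeen = false;
  for (const entry of entries) {
    if (!gapSeen && prevEnd >= currentTime && entry.chunk.start - prevEnd > GAP_EPSILON) {
      gapSeen = true;
    }
    if (gapSeen) afterGap.push(entry);
    prevEnd = entry.chunk.end;
  }
  for (const entry of afterGap.reverse()) {
    if (freed >= bytesToFree) break;
    entry.removed = true;
    freed += entry.chunk.bytes;
  }

  return { kept: entries.filter((e) => !e.removed).map((e) => e.chunk), freed };
}

// Example: playback at 15s, contiguous data up to 30s, an isolated chunk at 60s.
const sampleChunks: Chunk[] = [
  { start: 0, end: 10, bytes: 10 },
  { start: 10, end: 20, bytes: 10 },
  { start: 20, end: 30, bytes: 10 },
  { start: 60, end: 70, bytes: 10 },
];
const evictionResult = evictSketch(sampleChunks, 15, 15);
```

In the example, the chunk before currentTime is evicted first; because that is not enough, the isolated chunk at 60s goes next, while the contiguous data around the playback position is kept.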
12. Most Common Media Issues
• Buggy video drivers
Solutions: blacklisting, out-of-process decoding.
• Unsupported media file
Solutions: Decoder: tough luck. Demuxer: fix it.
• Security
Solutions: rewriting some components in Rust.
13. Most Common MSE Issues
• Bad muxing, in particular invalid tagging of keyframes.
• Invalid timestamps and gaps in data (in 51 and earlier, Firefox will
not play over a gap larger than 125ms; 500ms in 52).
• Relying on platform decoder limitations or unique behaviour,
especially on Windows.
• Chrome-centric code, or code relying on invalid Chrome behaviour.
• Not listening to appendBuffer events, especially buffer full.
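Gaps in appended data can be checked before they stall playback. The sketch below scans a list of buffered ranges for gaps larger than a tolerance. The two constants come from the figures above; `BufferedRange`, `Gap` and `findBlockingGaps` are hypothetical names, and in a page the ranges would be copied out of `SourceBuffer.buffered`.

```typescript
interface BufferedRange {
  start: number; // seconds
  end: number; // seconds
}

interface Gap {
  from: number;
  to: number;
  length: number;
}

// Gap tolerances quoted above, in seconds.
const GAP_TOLERANCE_FX51 = 0.125;
const GAP_TOLERANCE_FX52 = 0.5;

// Returns every gap between consecutive ranges that exceeds the tolerance.
function findBlockingGaps(ranges: BufferedRange[], toleranceSec: number): Gap[] {
  const sorted = ranges.slice().sort((a, b) => a.start - b.start);
  const gaps: Gap[] = [];
  let prevEnd: number | null = null;
  for (const range of sorted) {
    if (prevEnd !== null && range.start - prevEnd > toleranceSec) {
      gaps.push({ from: prevEnd, to: range.start, length: range.start - prevEnd });
    }
    // Track the furthest end seen so overlapping ranges do not create false gaps.
    prevEnd = prevEnd === null ? range.end : Math.max(prevEnd, range.end);
  }
  return gaps;
}

// Example: a 0.1s gap at 10s and a 0.3s gap at 20s.
const sampleRanges: BufferedRange[] = [
  { start: 0, end: 10 },
  { start: 10.1, end: 20 },
  { start: 20.3, end: 30 },
];
const gapsFx51 = findBlockingGaps(sampleRanges, GAP_TOLERANCE_FX51);
const gapsFx52 = findBlockingGaps(sampleRanges, GAP_TOLERANCE_FX52);
```

With the 125ms tolerance only the 0.3s gap is reported; with the 500ms tolerance neither gap is, which matches the difference between 51 and 52 described above.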
14. HTML5 Media Element Architecture (EME)
EME only works in combination with MSE.
[Diagram: JS (currentTime, readyState, Load, Play / Pause, Seek); MediaKeys; MediaKey session (×2); HTML Media Element (manages events and user operations); Media Stack (loading, demuxing, decoding); Video Compositor; Audio Renderer]
15. Media Stack (EME)
The CDM runs in its own child process, within a sandbox.
Decrypted and decoded data is fed back into our media stack for
rendering.
[Diagram: MediaSourceResource; TrackBuffer; MediaDecoder; State Machine; MediaFormatReader; MediaDataDemuxer; EMEVideoDataDecoder; EMEAudioDataDecoder; MediaKey session (×2); Platform Module]
16. EME Support
• Currently only supporting the Google Widevine, Adobe Primetime and
ClearKey CDMs.
• No access to Microsoft PlayReady or Apple FairPlay. This prevents us
from having access to hardware decoding for encrypted content.
• Netflix only delivers 720p; same for Amazon with some content.
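Since the set of available CDMs is limited, a player usually probes key systems in order of preference. The sketch below only builds the probe requests so that it runs outside a browser. The helper `buildKeySystemRequests` and the interface names are hypothetical, and the key system identifiers (in particular `com.adobe.primetime`) and the `cenc` init data type are assumptions not stated in the slides.

```typescript
// Minimal local shapes mirroring the fields an EME probe needs.
interface CapabilitySketch {
  contentType: string;
}

interface KeySystemConfigSketch {
  initDataTypes: string[];
  videoCapabilities: CapabilitySketch[];
  audioCapabilities: CapabilitySketch[];
}

interface KeySystemRequestSketch {
  keySystem: string;
  configs: KeySystemConfigSketch[];
}

// Assumed identifiers for the three CDMs named above, in preference order.
const KEY_SYSTEM_IDS: string[] = [
  "com.widevine.alpha",
  "com.adobe.primetime",
  "org.w3.clearkey",
];

// Builds one probe request per key system for the given content types.
function buildKeySystemRequests(videoType: string, audioType: string): KeySystemRequestSketch[] {
  return KEY_SYSTEM_IDS.map((keySystem) => ({
    keySystem,
    configs: [
      {
        initDataTypes: ["cenc"],
        videoCapabilities: [{ contentType: videoType }],
        audioCapabilities: [{ contentType: audioType }],
      },
    ],
  }));
}

const keySystemRequests = buildKeySystemRequests(
  'video/mp4; codecs="avc1.42E01E"',
  'audio/mp4; codecs="mp4a.40.2"',
);
// In a browser, each request would be tried in turn with
//   navigator.requestMediaKeySystemAccess(r.keySystem, r.configs)
// which returns a promise that rejects when the key system is unavailable.
```

Trying the requests in order and stopping at the first promise that resolves gives the page a working CDM without hard-coding a single vendor.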
17. Gecko Future Improvements
• Out-of-process GPU decoding. When a driver crashes we can
immediately recover with zero visible consequences.
• Suspend decoding for videos in the background to reduce CPU
usage and increase battery life.
• E10S: increasing the number of content processes.
18. How you can help yourselves
• Test using Firefox!
• MSE implementation is very rigorous and 100% per spec.
• If it works in Firefox it will work with other compliant browsers. It’s
also more likely to work with all other browsers.
• You’re better off testing with Firefox