OPSE: Online Per-Scene Encoding for Adaptive HTTP Live
Streaming
Vignesh V Menon¹, Hadi Amirpour¹, Christian Feldmann², Mohammad Ghanbari¹,³, and Christian Timmerer¹
¹ Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität, Klagenfurt, Austria
² Bitmovin, Klagenfurt, Austria
³ School of Computer Science and Electronic Engineering, University of Essex, UK
21 July 2022
Vignesh V Menon OPSE: Online Per-Scene Encoding for Adaptive HTTP Live Streaming 1
Outline
1 Introduction
2 OPSE
3 Evaluation
4 Q & A
Introduction
Motivation
Per-scene encoding schemes build on the observation that, within a scene, each resolution outperforms the others over a particular bitrate range, and that these ranges depend on the video complexity.
The goal is to increase the Quality of Experience (QoE) or decrease the bitrate of the representations, as introduced for VoD services [1].
Figure: The bitrate ladder prediction envisioned using OPSE.
[1] J. De Cock et al. “Complexity-based consistent-quality encoding in the cloud”. In: 2016 IEEE International Conference on Image Processing (ICIP). 2016, pp. 1484–1488. doi: 10.1109/ICIP.2016.7532605.
Why not in live yet?
Though per-title encoding schemes [2] enhance the quality of video delivery, determining the convex hull is computationally expensive, which makes them suitable only for VoD streaming applications.
Some methods pre-analyze the video content [3].
Katsenou et al. [4] introduced a content-gnostic method that employs machine learning to find, for each resolution, the bitrate range in which it outperforms the other resolutions.
Bhat et al. [5] proposed a Random Forest (RF) classifier to decide the encoding resolution best suited to different quality ranges, and studied machine-learning-based adaptive resolution prediction.
However, these approaches still yield latency much higher than what is accepted in live streaming.
[2] De Cock et al., “Complexity-based consistent-quality encoding in the cloud”; Hadi Amirpour et al. “PSTR: Per-Title Encoding Using Spatio-Temporal Resolutions”. In: 2021 IEEE International Conference on Multimedia and Expo (ICME). 2021, pp. 1–6. doi: 10.1109/ICME51207.2021.9428247.
[3] https://bitmovin.com/whitepapers/Bitmovin-Per-Title.pdf, last access: May 10, 2022.
[4] A. V. Katsenou et al. “Content-gnostic Bitrate Ladder Prediction for Adaptive Video Streaming”. In: 2019 Picture Coding Symposium (PCS). 2019. doi: 10.1109/PCS48520.2019.8954529.
[5] Madhukar Bhat et al. “Combining Video Quality Metrics To Select Perceptually Accurate Resolution In A Wide Quality Range: A Case Study”. In: 2021 IEEE International Conference on Image Processing (ICIP). 2021, pp. 2164–2168. doi: 10.1109/ICIP42928.2021.9506310.
OPSE
Figure: OPSE architecture. Blocks: Input Video, Video Complexity Feature Extraction, Scene Detection, Resolution Prediction, Per-Scene Encoding. Labeled signals: (E, h, ϵ), (E, h), Scenes, Resolutions (R), Bitrates (B), (r̂, b).
The E, h, and ϵ features are extracted using VCA, an open-source video complexity analyzer [6].
[6] Vignesh V Menon et al. “VCA: Video Complexity Analyzer”. In: Proceedings of the 13th ACM Multimedia Systems Conference. 2022. isbn: 9781450392839. doi: 10.1145/3524273.3532896. url: https://doi.org/10.1145/3524273.3532896.
Phase 1: Feature Extraction
Compute texture energy per block
A DCT-based energy function is used to determine the block-wise texture feature of each frame, defined as:
$$H_k = \sum_{i=0}^{w-1} \sum_{j=0}^{w-1} e^{\left|\left(\frac{ij}{wh}\right)^2 - 1\right|} \, |DCT(i, j)| \tag{1}$$
where w × w is the size of the block, k is the block index (written H_{p,k} for block k of frame p), and DCT(i, j) is the (i, j)th DCT component when i + j > 0, and 0 otherwise.
The energy values of the blocks in a frame are averaged to determine the energy per frame [7]:
$$E = \sum_{k=0}^{C-1} \frac{H_{p,k}}{C \cdot w^2} \tag{2}$$
[7] Michael King et al. “A New Energy Function for Segmentation and Compression”. In: 2007 IEEE International Conference on Multimedia and Expo. 2007, pp. 1647–1650. doi: 10.1109/ICME.2007.4284983.
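As a rough illustration of Eqs. (1) and (2), the following pure-Python sketch computes the texture energy of a block and the average energy of a frame. It rests on assumptions that the slides do not state: an orthonormal DCT-II is used (the transform and scaling inside VCA may differ), the wh in the exponent is read as w² for a square w × w block, and the helper names dct2, block_energy, and frame_energy are hypothetical.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block (assumed normalisation)."""
    w = len(block)
    # Precompute the cosine basis: cos_tab[u][x] = cos((2x + 1) * u * pi / (2w)).
    cos_tab = [[math.cos((2 * x + 1) * u * math.pi / (2 * w)) for x in range(w)]
               for u in range(w)]

    def alpha(u):
        return math.sqrt(1.0 / w) if u == 0 else math.sqrt(2.0 / w)

    out = [[0.0] * w for _ in range(w)]
    for u in range(w):
        for v in range(w):
            acc = 0.0
            for x in range(w):
                for y in range(w):
                    acc += block[x][y] * cos_tab[u][x] * cos_tab[v][y]
            out[u][v] = alpha(u) * alpha(v) * acc
    return out


def block_energy(block):
    """Eq. (1): exponentially weighted sum of |DCT(i, j)|, DC term excluded."""
    w = len(block)
    coeffs = dct2(block)
    total = 0.0
    for i in range(w):
        for j in range(w):
            if i + j == 0:
                continue  # DCT(0, 0) is treated as 0
            # Assumption: wh in the exponent equals w * w for a square block.
            weight = math.exp(abs((i * j / (w * w)) ** 2 - 1.0))
            total += weight * abs(coeffs[i][j])
    return total


def frame_energy(blocks):
    """Eq. (2): sum of block energies divided by C * w^2."""
    w = len(blocks[0])
    return sum(block_energy(b) for b in blocks) / (len(blocks) * w * w)
```

A flat block has no AC coefficients, so its energy is essentially zero, while a textured block such as a checkerboard yields a clearly positive value.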
h_p: the SAD (sum of absolute differences) of the block-level energy values of frame p relative to those of the previous frame p − 1:
$$h_p = \sum_{k=0}^{C-1} \frac{|H_{p,k} - H_{p-1,k}|}{C \cdot w^2} \tag{3}$$
where C denotes the number of blocks in frame p.
The gradient of h per frame p, ϵ_p, is also defined:
$$\epsilon_p = \frac{h_{p-1} - h_p}{h_{p-1}} \tag{4}$$
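Eqs. (3) and (4) can be sketched in a few lines of Python, given the per-block energies of two consecutive frames. The function names temporal_energy and gradient are hypothetical, and returning 0.0 when the previous h is zero is an assumption made here to avoid a division by zero; the slides do not say how that case is handled.

```python
def temporal_energy(prev_blocks, cur_blocks, w):
    """Eq. (3): SAD of block energies between frames p-1 and p, over C * w^2."""
    c = len(cur_blocks)
    sad = sum(abs(cur - prev) for cur, prev in zip(cur_blocks, prev_blocks))
    return sad / (c * w * w)


def gradient(h_prev, h_cur):
    """Eq. (4): relative change of h from frame p-1 to frame p."""
    if h_prev == 0:
        return 0.0  # assumption: undefined ratio is mapped to 0
    return (h_prev - h_cur) / h_prev
```

For example, two blocks of size 4 × 4 whose energies each change by 32 give h = 64 / (2 · 16) = 2.0.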
Latency
Speed of feature extraction: 1480 fps for Full HD (1080p) video with 8 CPU threads and x86 SIMD optimization.
Phase 2: Scene Detection
Objective:
Detect the first picture of each shot and encode it as an Instantaneous Decoder Refresh (IDR) frame.
Encode the subsequent frames of the new shot based on the first one via motion compensation and prediction.
Shot transitions come in two forms:
hard shot-cuts
gradual shot transitions
Gradual transitions are much harder to detect because the change in visual information is difficult to quantify.
Step 1: while parsing all video frames do
    if ϵ_k > T1 then
        k ← IDR-frame, a new shot.
    else if ϵ_k ≤ T2 then
        k ← P-frame or B-frame, not a new shot.
Notation:
    T1, T2: maximum and minimum thresholds for ϵ_k
    f: video fps
    Q: set of frames where T1 ≥ ϵ > T2 and ∆h > T3
    q0: current frame number in the set Q
    q−1: previous frame number in the set Q
    q1: next frame number in the set Q
Step 2: while parsing Q do
    if q0 − q−1 > f and q1 − q0 > f then
        q0 ← IDR-frame, a new shot.
    Eliminate q0 from Q.
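The two-step procedure above can be sketched in Python as follows. This is an interpretation rather than the authors' implementation: the function name detect_shots is hypothetical, ∆h is taken as a per-frame input because the slides do not define it, a missing neighbour at either end of Q is treated as being farther than f frames away, and neighbours are looked up in the original candidate set rather than in a set that shrinks during parsing.

```python
def detect_shots(eps, delta_h, t1, t2, t3, fps):
    """Return the sorted frame indices marked as IDR frames (new shots)."""
    idr = set()
    candidates = []  # the set Q, kept in frame order

    # Step 1: clear-cut decisions, plus collection of ambiguous frames.
    for k, e in enumerate(eps):
        if e > t1:
            idr.add(k)  # new shot
        elif e > t2 and delta_h[k] > t3:
            candidates.append(k)  # T1 >= eps > T2 and delta_h > T3
        # otherwise the frame is not a new shot (P- or B-frame)

    # Step 2: keep a candidate only if it is isolated by more than fps frames.
    for n, q0 in enumerate(candidates):
        q_prev = candidates[n - 1] if n > 0 else None
        q_next = candidates[n + 1] if n + 1 < len(candidates) else None
        far_prev = q_prev is None or q0 - q_prev > fps  # assumed boundary rule
        far_next = q_next is None or q_next - q0 > fps  # assumed boundary rule
        if far_prev and far_next:
            idr.add(q0)
    return sorted(idr)
```

With fps = 30, two candidates only two frames apart suppress each other, while a candidate more than 30 frames from its neighbours is promoted to an IDR frame.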
Phase 3: Resolution Prediction
For each detected scene, the optimized bitrate ladder is predicted using the E and h features of the first GOP of the scene and the sets R and B. The optimized resolution r̂ is predicted for each target bitrate b ∈ B. The resolution scaling factor s is defined as:
$$s = \frac{r}{r_{max}}; \quad r \in R \tag{5}$$
where r_max is the maximum resolution in R.
Figure: Neural network structure to predict the optimized resolution scaling factor ŝ for a maximum resolution r_max and framerate f. Input layer (∈ ℝ³): E, h, log(b); two hidden layers (each ∈ ℝ⁴); output layer (∈ ℝ¹): ŝ.
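A minimal sketch of how such a 3-4-4-1 network could be evaluated and its output mapped back to a resolution in R is given below. Several details are assumptions, since the slides give neither activations nor trained weights: ReLU is used in the hidden layers with a linear output, resolutions are represented by their heights in pixels, ŝ is snapped to the closest r / r_max from Eq. (5), and the names dense, predict_scale, nearest_resolution, and the params layout are hypothetical.

```python
import math


def dense(x, weights, biases, relu=True):
    """One fully connected layer; each row of weights feeds one output unit."""
    out = []
    for row, b in zip(weights, biases):
        z = sum(w * xi for w, xi in zip(row, x)) + b
        out.append(max(0.0, z) if relu else z)
    return out


def predict_scale(e, h, bitrate, params):
    """Forward pass: inputs (E, h, log(b)) -> 4 -> 4 -> 1 output s_hat."""
    x = [e, h, math.log(bitrate)]
    x = dense(x, *params["hidden1"])  # 3 -> 4, ReLU (assumed)
    x = dense(x, *params["hidden2"])  # 4 -> 4, ReLU (assumed)
    return dense(x, *params["output"], relu=False)[0]  # 4 -> 1, linear (assumed)


def nearest_resolution(s_hat, resolutions):
    """Pick r in R whose scaling factor r / r_max is closest to s_hat."""
    r_max = max(resolutions)
    return min(resolutions, key=lambda r: abs(r / r_max - s_hat))
```

Calling predict_scale once per target bitrate b ∈ B and passing each result through nearest_resolution would yield one (r̂, b) pair per rung of the ladder for the scene.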
Evaluation
R = {360p, 432p, 540p, 720p, 1080p}
B = {145, 300, 600, 900, 1600, 2400, 3400, 4500, 5800, 8100}.
Figure: BDR_V results for scenes characterized by various average E and h.
BDR_V: the Bjøntegaard delta rate [8], i.e., the average increase in bitrate of the representations compared with that of the fixed bitrate ladder encoding to maintain the same VMAF.
[8] G. Bjontegaard. “Calculation of average PSNR differences between RD-curves”. In: VCEG-M33 (2001).
Figure: Comparison of RD curves for encoding two sample scenes, (a) Scene1 (E = 31.96, h = 11.12) and (b) Scene2 (E = 67.96, h = 5.12), using the fixed bitrate ladder and OPSE.
Q & A
Thank you for your attention!
Vignesh V Menon (vignesh.menon@aau.at)