SlideShare une entreprise Scribd logo
1  sur  35
Télécharger pour lire hors ligne
July 14, 2021 @QCOMResearch
Qualcomm Technologies, Inc.
How AI research is
enabling next-gen codecs
3
Agenda
The demand for
improved data
compression
is growing
1
AI is a
compelling
tool for
compression
2
Our latest
AI voice and
video codec
research
3
Our
on-device
neural video
decoder demo
4
Future
research
work and
challenges
5
4
4
The scale of video and voice
being created and consumed is massive
1M
Minutes of video
crossing the internet
per second
15B
Minutes of talking
per day on WhatsApp
calls
82%
Of all consumer
internet traffic is
online video
76
Minutes per day watching
video on digital devices
by US adults
8B
Average daily
video views
on Facebook
Cisco Visual Networking Index: Forecast and Trends, 2017–2022; WhatsApp blog 4/28/20 4
5
5
Video streaming
Extended reality Smart cities
Smart factories
Increasingly, video is all around us — providing entertainment,
enhancing collaboration, and transforming industries
Autonomous vehicles
Video conferencing
Sports
Smartphone
6
6
of internet traffic will
be video by 20221
Video technology revolutionized how we create and consume media
Enhanced video quality with less bits led to broad adoption across a wide range of devices and services
1. Cisco Annual Internet Report, 2018–2023
7
1. On a 1080p video at 30 frames per second; EVC = Essential Video Coding, VVC = Versatile Video Coding,
Technical innovation drives video evolution
A regular cadence of technical advancement in video codecs has led to massive reduction in file size
The first evolution
Introduced flexibility
Focused initially on optical disc media
Increased efficiency
Introduced internet video
Emerging video applications and technology
Robust network efficiency
Enabling user experiences as demand surges
Continued technical advancement
2013
HEVC
2020
EVC and VVC
Video
Compression
2003
AVC
1995
MPEG-2
Evolution transformed
Enriched experiences and proliferation
Video
Compression
Reduction in
file size with
VVC versus no
compression1
~1000x
DVD, digital TV Blu-ray, HDTV, internet
~50% compression ↑
Streaming, 4k, HDR, 360˚
~50% compression ↑
8k, immersive video
~30% ~40%
compression ↑
8
8
Visual quality is much more than resolution and frame rate
Preserving color accuracy, contrast, and brightness is also crucial
Broad interoperability
across all our screens
Resolution
Increased definition and sharpness
Frame rate
Reduced blurring and latency
Pixel quantity Pixel fidelity
Color accuracy
More realistic colors through an expanded color
gamut, depth, and temperature
Contrast and brightness
Increased detail through a larger dynamic range
and lighting enhancements
9
Variational auto
encoder (VAE)*
Invertible
Auto-regressive
Generative adversarial
network (GAN)
Generative
models
Extract features by
learning a low-dimension
feature representation
Sampling to generate,
restore, predict, or
compress data
Powerful
capabilities
Broad
applications
Computational photography
Text to speech
Graphics rendering
Speech/video compression
Voice UI
* VAE first introduced by D. Kingma and M. Welling in 2013
Deep generative
model research
for unsupervised
learning
Given unlabeled training data,
generate new samples from the
same distribution
10
Variational auto
encoder (VAE)*
Invertible
Auto-regressive
Generative adversarial
network (GAN)
Generative
models
Deep generative
model research
for unsupervised
learning
Given unlabeled training data,
generate new samples from the
same distribution
* VAE first introduced by D. Kingma and M. Welling in 2013
Encoder part
Decoder part
Input unlabeled
data, such as
images
Desired output
as close as
possible to input
Extract features by
learning a low-dimension
feature representation
Sampling to generate,
restore, predict or
compress data
11
AI-based
compression
has compelling
benefits
No special-purpose
hardware required, other
than an AI acceleration
Easy to upgrade, standardize,
and deploy new codecs
Specialized to a specific
data distribution
Easy to develop new codecs
for new modalities
Improved rate-distortion
trade-off
Optimized for advanced
perceptual quality metrics
Semantics aware for
human visual perception
Can generate
visual details not
in the bitstream
12
12
Achieving state-of-the-art data compression with AI
1 Comparison between state-of-the-art traditional speech coding algorithm and AI speech compression
encoder decoder
Input
speech
Output
speech
Compressed
speech
Bit-rate compression at same
speech quality using AI1
2.6X
13
13
Feedback Recurrent Autoencoder for Video Compression (Golinski, et al., arXiv 2020)
Video used in images is produced by Netflix, with CC BY-NC-ND 4.0 license: https://media.xiph.org/video/derf/ElFuente/Netflix_Tango_Copyright.txt
Novel machine learning-based video codec research
Neural network based I-frame and P-frame compression
Achieving state-of-the-art rate-distortion compared
with other learned video compression solutions
End-to-end deep learning
Reconstruction
Ground truth
Motion vector
Residual
T=3 Motion
estimation
P-frame
encoder
Motion
estimation
Entropy
encoding
bits Entropy
decoding P-frame
decoder Sum
Warp
Deep
neural net
T=1 I-frame
encoder
Entropy
encoding
bits Entropy
decoding I-frame
decoder
T=2 Motion
estimation
P-frame
encoder
Entropy
encoding
bits Entropy
decoding P-frame
decoder
Warp
Sum
14
14
Video compression utilizes temporal and spatial redundancy
Group of pictures (GOP) is a sequence of video frame types used to efficiently encode data
GOP structures
I-frame
(intra)
P-frame
(predicted)
B-frame
(bi-directional)
Independently code
each frame
Exploit spatial
redundancy
Code changes based
on previous frame
Exploit spatial and temporal
redundancy
Code changes based on
previous and next frame
Exploit spatial and temporal
redundancy
| | | | | | | | | | ● ● ● P ● ● ● | ● ● ● P ● ● ● | | ● ● ● B ● ● ● P ● ● ● B ● ● ● P
Increasing complexity and bitrate compression
15
Existing research for B-frame codecs has limitations
Our solution combines the best of existing P-frame and B-frame codecs while allowing them to share weights
Existing P-frame codec Existing B-frame codecs Our B-frame codec
Residual
compression
Motion
compensation
Motion
compression
𝑅𝑒𝑓
𝐼𝑛𝑝𝑢𝑡
𝑂𝑢𝑡𝑝𝑢𝑡
𝑃𝑟𝑒𝑑
Residual
compression
Bidir motion
compensation
Bidir motion
compression
𝑃𝑟𝑒𝑑
𝑅𝑒𝑓1
𝑅𝑒𝑓0
𝐼𝑛𝑝𝑢𝑡
𝑂𝑢𝑡𝑝𝑢𝑡
High correlation between the
two flows (motion constancy)
Residual
compression
Frame
interpolation
𝑃𝑟𝑒𝑑
𝑅𝑒𝑓1
𝑅𝑒𝑓0
𝐼𝑛𝑝𝑢𝑡
𝑂𝑢𝑡𝑝𝑢𝑡
Does not work well
for large motion
P-frame
codec
Frame
interpolation
𝑃𝑟𝑒𝑑
𝑅𝑒𝑓1
𝑅𝑒𝑓0
𝐼𝑛𝑝𝑢𝑡
𝑂𝑢𝑡𝑝𝑢𝑡
P-frame and B-frame
codecs can share weights
16
16
Extending Neural P-frame Codecs for B-frame Coding (Pourreza, et al., arXiv 2021)
Illustration of different inter-frame coding methods
B-Frame compression through Extended P-frame & Interpolation Codec (B-EPIC)
Actual
object
location
Object location
Prediction
Motion vector
Prediction error
Interpolation
Interpolation guide
𝐱𝑡
෤
𝐱𝑡
ො
𝐱ref
Input frame
The corresponding prediction
The reference
𝐱𝑡 ො
𝐱ref1
ො
𝐱ref0
෤
𝐱𝑡
B-frame prediction based on
bidirectional flow/warp: two motion
vectors with respect to two references
are transmitted.
ො
𝐱ref ෤
𝐱𝑡
P-frame prediction:
a motion vector with
respect to a single
reference is transmitted
ො
𝐱ref0
෤
𝐱𝑡 ො
𝐱ref1
B-frame prediction based on frame
interpolation: the interpolation result is
treated as the prediction. No motion
information is transmitted.
ො
𝐱ref0
෤
𝐱𝑡 ො
𝐱ref1
Our B-frame prediction approach:
the interpolation result is corrected
using a unidirectional motion vector
similar to P-frame
17
17
R. Pourezza, T. Cohen, Extending Neural P-frame Codecs for B-frame Coding, Under review at ICCV 2021
Our B-frame coding provides state-of-the-art results
Improved rate-distortion tradeoff by extending neural p-frame codecs for B-frame coding
32
33
34
35
36
37
38
39
40
41
0 0.1 0.2 0.3 0.4 0.5 0.6
PSNR
(dB)
Rate (bits per pixel)
PSNR
0.92
0.93
0.94
0.95
0.96
0.97
0.98
0.99
1
0 0.1 0.2 0.3 0.4 0.5 0.6
MS-SSIM
Rate (bits per pixel)
MS-SSIM
Ours Google [CVPR-20] H.265 (FFmpeg)
18
18
shared knowledge
𝜃𝒟 𝜃𝒟
sender receiver
Overfitting through instance-adaptive video compression
Improved compression via continual learning
Overfit a model on one video instance
and encode weight-deltas in the bitstream
Model
delta < baseline
bitstream
Encoded
bitstream
+
Send weight-deltas
based on overfitting
Send smaller
encoded bitstream
based on overfitting
encoder
𝑞𝜑(𝒛|𝒙)
0
0
0 1 1
1
ෝ
𝒙
0
0
0 1 1
1
𝒙
𝒃𝒛
E9
E9
E11
D4
D4
𝒛 𝒛
𝒃𝒛
D3
latent prior latent prior decoder
𝑝𝜃(𝒙|𝒛)
EE ED
E11 EE ED
0
0
1 0
0
1
𝒃ഥ
δ
E10
D1
𝒃ഥ
δ
⊖
ത
δ
⊕
ത
δ
D2
model prior model prior
decoder
𝑝𝜃(𝒙|𝒛)
T. van Rozendaal*, I.A.M. Huijben*, T. Cohen, Overfitting for Fun and Profit: Instance-Adaptive Data Compression, ICLR 2021
T. Van Rozendaal, Y. Zhang, J. Brehmer, T. Cohen, Instance-Adaptive Video Compression: Improving Neural Codecs by Training on the Test Set, Under review at ICCV 2021
19
Mobile friendly
deployment
Decoding complexity can be reduced by
while still maintaining
SOTA results
72%
SOTA results from
instance-adaptive
video compression
24%
BD-rate savings
over leading neural
codec by Google
29%
BD-rate savings
over FFmpeg
H.265
(FFmpeg)
20
20
Variable bitrate image compression for simpler deployment
Three levels of solutions with the goal of single model producing a single bitstream
Level 0:
Multi-model multi-bitstream
encoder1 decoder1
encoder2 decoder2
encoder3 decoder3
Each model is
optimized for
a fixed 𝛽
𝑁 models are needed
for 𝑁 rate-distortion
tradeoffs
Level 1:
Single-model multi-bitstream
encoder decoder
encoder decoder
encoder decoder
Single
encoder-decoder
model
Certain params /input
are used to steer
R-D tradeoff
𝛽1
𝛽1 𝛽2 𝛽3 𝛽2
𝛽3
Level 2:
Single-model single-bitstream
encoder decoder
decoder
decoder
Single
encoder-decoder
model
Encode once,
embed all
bitrates
A.k.a. embedded/
progressive/
scalable coding
Increasing
order of design
difficulty
and ease of
deployment
21
1/s × s
1/s
𝑦
+
- EC
𝑥
𝑦(𝑠)
Hyper Codec
ED
𝒈𝑎 Analysis transform / encoder
𝒈s Synthesis transform / decoder
𝑥 Input image
s Scaling factor that determines the bitrate
µ, ℴ Prior parameters
EC Entropy coding
ED Input image
𝑔𝑎
𝑔𝑠
µ
𝑥
^
⎿
⎿
•
N(µ, ℴ)
Level 1 approach that applies
a scaling factor to the latent for
a different trade-off between rate
and distortion
Latent scaling
for a
single-model
multi-bitstream
solution
22
Level 2 approach that uses multiple
quantization levels and conditional
probability to create a single bitstream
with multiple quality levels
Nested
quantization
for a
single-model
single-bitstream
solution
Fine
scale s1
Coarse
scale s2
Finer
bins for
increased
quality
23
Y. Lu, Y. Zhu, Y. Yang, A. Said, T. Cohen, Progressive Neural Image Compression with Nested Quantization and Latent Ordering, Accepted at ICIP 2021
Variable-bitrate
progressive
neural image
compression
Achieves comparable
performance to HEVC
Intra but uses only a
single model and a
single bitstream
HEVC Intra
Separate bitstreams,
one for each quality
[N-model, N-bitstream] Level 0
[1-model, 1-bitstream] Level 2 - Ours
[1-model, N-bitstream] Level 1 - HEVC Intra
[1-model, 1-bitstream] Level 2
Our scheme
A single bitstream
where the prefixes
decode to different
qualities
24
24
Semantic-aware image compression for improved quality
Allocate more bits to regions of interest
Source Baseline bit allocation Semantic-aware bit allocation
25
25
Semantic
compression
focuses on regions
of interest
End-to-end deep learning
on foreground/background
weighted distortion loss
Mask code
Mask
encoder
Code
Decoder
Encoder
Mask
decode
Foreground /
background loss
Segmentation mask Input image
Gain
Reconstructed
image
26
Next step
Extend to video compression
Semantic-aware
image compression
improves quality
where it matters
State-of-the-art results for
rate-distortion tradeoff
28
30
32
34
36
38
40
42
0.2 0.4 0.6 0.8 1 1.2 1.4
RGB
PSNR
(dB)
Rate (bits per pixel)
Rate distortion
Segmentation AE (ROI) Segmentation AE (non-ROI)
Hyperprior AE (ROI) Hyperprior AE (non-ROI)
5dB gain
in region
of interest
(ROI)
2dB loss
at non-ROI
27
27
The tradeoff between perception and distortion metrics
Optimizing distortion and/or perceptual quality leads to different reconstructions
Original Rate + Distortion Rate + Perception Rate + Distortion + Perception
Distortion
Perceptual quality
Very good
Very bad
Very bad
Very good
Good
Good
28
28
How GAN-based codecs work
GAN-based codecs can produce much more visually appealing content at very low bits per pixel
Encoder Generator
Discriminator
1
2
1
0
Data distribution
P(𝑧)
Prior P(real | 𝑧)
P(real)
29
29
GAN
14.5 kB
0.221 bpp
145x reduction
Raw
2097.15 kB
8.0 bpp
512x1024
BPG QP=40
16.4 kB
0.250 bpp
128x reduction
30
Parallel
Entropy
Decoding
Snapdragon is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.
Real-time on-device neural video decode is now possible
Research demo showcases all-intra neural video decode on a smartphone. Inter-frame decoding is next
1280 × 704
(Comparable to HD 720p)
Mobile device powered by
Snapdragon® 888 Mobile Platform
CPU
Parallel
Entropy
Encoding
30+
Frames /
second
Offline processing
Encoder AI accelerator
Decoder
Bitstream
31
31
8-bit model with
efficient decoding
PEC
Decoder
AI Model Efficiency Toolkit
(AIMET)
https://github.com/quic/aimet
Step-3
AIMET quantization
aware training
AIMET is a product of Qualcomm Innovation Center, Inc.
Efficient on-device neural decoding
Limit to I-frame (intra-frame) component of neural video decoding; Inter-frame decoding is coming soon
Step-1
Decoder architecture
refinement
Decoder
Floating-point
neural codec
Step-2
Parallel Entropy Coding
(PEC) PEC
Entropy
coding
Decoder
Encoder
Parallel entropy coding + architecture refinement + 8-bit  Fast inference
AIMET quantization aware training  Improve quantized model accuracy
32
Real-time neural video decoding
on a mobile device
demo video
Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc.
33
Perceptual quality
Need to develop differentiable perceptual
quality metrics
GANs are promising, but not perfect
Video GANs are challenging to train due
to computational requirements
Computational efficiency
Real-time on-device video encoding
and decoding is still a challenge
Rate distortion improvement
Bitrate needs to continue to get smaller for the
foreseeable future as video demand increases
New modalities
Lots of work to do on new modalities
like point clouds, omnidirectional video,
multi-camera setups (e.g. in AV), etc.
Challenges
for mass
deployment of
neural video
compression
34
34
AI-based compression has the
potential to more efficiently share
video and voice across devices
We are conducting leading
research and development
in end-to-end deep learning
for video and speech coding
We are solving implementation
challenges and demonstrating
a real-time neural video decoder
running on a smartphone
35
www.qualcomm.com/ai
@QCOMResearch
www.qualcomm.com/news/onq
https://www.youtube.com/qualcomm?
http://www.slideshare.net/qualcommwirelessevolution
Connect with Us
Questions?
Follow us on:
For more information, visit us at:
www.qualcomm.com & www.qualcomm.com/blog
Thank you
Nothing in these materials is an offer to sell any of the
components or devices referenced herein.
©2018-2021 Qualcomm Technologies, Inc. and/or its
affiliated companies. All Rights Reserved.
Qualcomm and Snapdragon are trademarks or registered
trademarks of Qualcomm Incorporated. Other products
and brand names may be trademarks or registered
trademarks of their respective owners.
References in this presentation to “Qualcomm” may mean Qualcomm
Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries
or business units within the Qualcomm corporate structure, as
applicable. Qualcomm Incorporated includes our licensing business,
QTL, and the vast majority of our patent portfolio. Qualcomm
Technologies, Inc., a subsidiary of Qualcomm Incorporated, operates,
along with its subsidiaries, substantially all of our engineering,
research and development functions, and substantially all of our
products and services businesses, including our QCT semiconductor
business.

Contenu connexe

Tendances

Cellular V2X is Gaining Momentum
Cellular V2X is Gaining MomentumCellular V2X is Gaining Momentum
Cellular V2X is Gaining MomentumQualcomm Research
 
Ericsson 5 g platform
Ericsson 5 g platformEricsson 5 g platform
Ericsson 5 g platformEricsson
 
David Soldani, Huawei
David Soldani, HuaweiDavid Soldani, Huawei
David Soldani, HuaweiHilary Ip
 
LTE-Advanced Pro from Qualcomm
LTE-Advanced Pro from QualcommLTE-Advanced Pro from Qualcomm
LTE-Advanced Pro from QualcommLow Hong Chuan
 
F01 beam forming_srs
F01 beam forming_srsF01 beam forming_srs
F01 beam forming_srsLuciano Motta
 
How will sidelink bring a new level of 5G versatility.pdf
How will sidelink bring a new level of 5G versatility.pdfHow will sidelink bring a new level of 5G versatility.pdf
How will sidelink bring a new level of 5G versatility.pdfQualcomm Research
 
Expanding the 5G NR (New Radio) ecosystem
Expanding the 5G NR (New Radio) ecosystemExpanding the 5G NR (New Radio) ecosystem
Expanding the 5G NR (New Radio) ecosystemQualcomm Research
 
5G Core Network - ZTE 5g Cloude ServCore
5G Core Network - ZTE 5g Cloude ServCore5G Core Network - ZTE 5g Cloude ServCore
5G Core Network - ZTE 5g Cloude ServCoreITU
 
6G Training Course Part 1: Introduction
6G Training Course Part 1: Introduction6G Training Course Part 1: Introduction
6G Training Course Part 1: Introduction3G4G
 
Ericsson 5G Radio Dot Launch
Ericsson 5G Radio Dot LaunchEricsson 5G Radio Dot Launch
Ericsson 5G Radio Dot LaunchEricsson
 
6G Training Course Part 3: 6G Use Cases & Applications
6G Training Course Part 3: 6G Use Cases & Applications6G Training Course Part 3: 6G Use Cases & Applications
6G Training Course Part 3: 6G Use Cases & Applications3G4G
 
Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022Qualcomm Research
 
Beginners: Different Types of RAN Architectures - Distributed, Centralized & ...
Beginners: Different Types of RAN Architectures - Distributed, Centralized & ...Beginners: Different Types of RAN Architectures - Distributed, Centralized & ...
Beginners: Different Types of RAN Architectures - Distributed, Centralized & ...3G4G
 
How to build high performance 5G networks with vRAN and O-RAN
How to build high performance 5G networks with vRAN and O-RANHow to build high performance 5G networks with vRAN and O-RAN
How to build high performance 5G networks with vRAN and O-RANQualcomm Research
 
Qualcomm 5g-vision-presentation
Qualcomm 5g-vision-presentationQualcomm 5g-vision-presentation
Qualcomm 5g-vision-presentationYali Wang
 
Getting to the Edge – Exploring 4G/5G Cloud-RAN Deployable Solutions
Getting to the Edge – Exploring 4G/5G Cloud-RAN Deployable SolutionsGetting to the Edge – Exploring 4G/5G Cloud-RAN Deployable Solutions
Getting to the Edge – Exploring 4G/5G Cloud-RAN Deployable SolutionsRadisys Corporation
 
5G Services Story
5G Services Story5G Services Story
5G Services StoryEricsson
 
Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18Qualcomm Research
 

Tendances (20)

Cellular V2X is Gaining Momentum
Cellular V2X is Gaining MomentumCellular V2X is Gaining Momentum
Cellular V2X is Gaining Momentum
 
Ericsson 5 g platform
Ericsson 5 g platformEricsson 5 g platform
Ericsson 5 g platform
 
David Soldani, Huawei
David Soldani, HuaweiDavid Soldani, Huawei
David Soldani, Huawei
 
LTE-Advanced Pro from Qualcomm
LTE-Advanced Pro from QualcommLTE-Advanced Pro from Qualcomm
LTE-Advanced Pro from Qualcomm
 
F01 beam forming_srs
F01 beam forming_srsF01 beam forming_srs
F01 beam forming_srs
 
How will sidelink bring a new level of 5G versatility.pdf
How will sidelink bring a new level of 5G versatility.pdfHow will sidelink bring a new level of 5G versatility.pdf
How will sidelink bring a new level of 5G versatility.pdf
 
Expanding the 5G NR (New Radio) ecosystem
Expanding the 5G NR (New Radio) ecosystemExpanding the 5G NR (New Radio) ecosystem
Expanding the 5G NR (New Radio) ecosystem
 
5G Core Network - ZTE 5g Cloude ServCore
5G Core Network - ZTE 5g Cloude ServCore5G Core Network - ZTE 5g Cloude ServCore
5G Core Network - ZTE 5g Cloude ServCore
 
Making 5G NR a reality
Making 5G NR a realityMaking 5G NR a reality
Making 5G NR a reality
 
6G Training Course Part 1: Introduction
6G Training Course Part 1: Introduction6G Training Course Part 1: Introduction
6G Training Course Part 1: Introduction
 
Ericsson 5G Radio Dot Launch
Ericsson 5G Radio Dot LaunchEricsson 5G Radio Dot Launch
Ericsson 5G Radio Dot Launch
 
6G Training Course Part 3: 6G Use Cases & Applications
6G Training Course Part 3: 6G Use Cases & Applications6G Training Course Part 3: 6G Use Cases & Applications
6G Training Course Part 3: 6G Use Cases & Applications
 
Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022Why and what you need to know about 6G in 2022
Why and what you need to know about 6G in 2022
 
Beginners: Different Types of RAN Architectures - Distributed, Centralized & ...
Beginners: Different Types of RAN Architectures - Distributed, Centralized & ...Beginners: Different Types of RAN Architectures - Distributed, Centralized & ...
Beginners: Different Types of RAN Architectures - Distributed, Centralized & ...
 
How to build high performance 5G networks with vRAN and O-RAN
How to build high performance 5G networks with vRAN and O-RANHow to build high performance 5G networks with vRAN and O-RAN
How to build high performance 5G networks with vRAN and O-RAN
 
5G Network Slicing
5G Network Slicing5G Network Slicing
5G Network Slicing
 
Qualcomm 5g-vision-presentation
Qualcomm 5g-vision-presentationQualcomm 5g-vision-presentation
Qualcomm 5g-vision-presentation
 
Getting to the Edge – Exploring 4G/5G Cloud-RAN Deployable Solutions
Getting to the Edge – Exploring 4G/5G Cloud-RAN Deployable SolutionsGetting to the Edge – Exploring 4G/5G Cloud-RAN Deployable Solutions
Getting to the Edge – Exploring 4G/5G Cloud-RAN Deployable Solutions
 
5G Services Story
5G Services Story5G Services Story
5G Services Story
 
Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18Setting off the 5G Advanced evolution with 3GPP Release 18
Setting off the 5G Advanced evolution with 3GPP Release 18
 

Similaire à How AI research is enabling next-gen codecs

COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...ijma
 
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...ijma
 
Comparison of Cinepak, Intel, Microsoft Video and Indeo Codec for Video Compr...
Comparison of Cinepak, Intel, Microsoft Video and Indeo Codec for Video Compr...Comparison of Cinepak, Intel, Microsoft Video and Indeo Codec for Video Compr...
Comparison of Cinepak, Intel, Microsoft Video and Indeo Codec for Video Compr...ijma
 
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...ijma
 
Thesis arpan pal_gisfi
Thesis arpan pal_gisfiThesis arpan pal_gisfi
Thesis arpan pal_gisfiArpan Pal
 
10.1.1.184.6612
10.1.1.184.661210.1.1.184.6612
10.1.1.184.6612NITC
 
Introduction to Video Compression Techniques - Anurag Jain
Introduction to Video Compression Techniques - Anurag JainIntroduction to Video Compression Techniques - Anurag Jain
Introduction to Video Compression Techniques - Anurag JainVideoguy
 
Paper id 2120148
Paper id 2120148Paper id 2120148
Paper id 2120148IJRAT
 
IBM VideoCharger and Digital Library MediaBase.doc
IBM VideoCharger and Digital Library MediaBase.docIBM VideoCharger and Digital Library MediaBase.doc
IBM VideoCharger and Digital Library MediaBase.docVideoguy
 
Ch07_-_Multimedia_Element-Video_1_.ppt
Ch07_-_Multimedia_Element-Video_1_.pptCh07_-_Multimedia_Element-Video_1_.ppt
Ch07_-_Multimedia_Element-Video_1_.pptdjempol
 
New coding techniques, standardisation, and quality metrics
New coding techniques, standardisation, and quality metricsNew coding techniques, standardisation, and quality metrics
New coding techniques, standardisation, and quality metricsTouradj Ebrahimi
 
Encoding stored video for stremming applications ieee paper ppt
Encoding stored video for stremming applications ieee paper pptEncoding stored video for stremming applications ieee paper ppt
Encoding stored video for stremming applications ieee paper pptNavin Kumar
 
Video Streaming Compression for Wireless Multimedia Sensor Networks
Video Streaming Compression for Wireless Multimedia Sensor NetworksVideo Streaming Compression for Wireless Multimedia Sensor Networks
Video Streaming Compression for Wireless Multimedia Sensor NetworksIOSR Journals
 
Video Conferencing : Fundamentals and Application
Video Conferencing : Fundamentals and ApplicationVideo Conferencing : Fundamentals and Application
Video Conferencing : Fundamentals and ApplicationVideoguy
 
Video Transcoding Terms Explained
Video Transcoding Terms Explained Video Transcoding Terms Explained
Video Transcoding Terms Explained nerodude
 
Video Conferencing, The Enterprise and You
Video Conferencing, The Enterprise and YouVideo Conferencing, The Enterprise and You
Video Conferencing, The Enterprise and YouVideoguy
 
Video Compression Algorithm Based on Frame Difference Approaches
Video Compression Algorithm Based on Frame Difference Approaches Video Compression Algorithm Based on Frame Difference Approaches
Video Compression Algorithm Based on Frame Difference Approaches ijsc
 

Similaire à How AI research is enabling next-gen codecs (20)

COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
 
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
 
Comparison of Cinepak, Intel, Microsoft Video and Indeo Codec for Video Compr...
Comparison of Cinepak, Intel, Microsoft Video and Indeo Codec for Video Compr...Comparison of Cinepak, Intel, Microsoft Video and Indeo Codec for Video Compr...
Comparison of Cinepak, Intel, Microsoft Video and Indeo Codec for Video Compr...
 
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
COMPARISON OF CINEPAK, INTEL, MICROSOFT VIDEO AND INDEO CODEC FOR VIDEO COMPR...
 
Thesis arpan pal_gisfi
Thesis arpan pal_gisfiThesis arpan pal_gisfi
Thesis arpan pal_gisfi
 
10.1.1.184.6612
10.1.1.184.661210.1.1.184.6612
10.1.1.184.6612
 
Introduction to Video Compression Techniques - Anurag Jain
Introduction to Video Compression Techniques - Anurag JainIntroduction to Video Compression Techniques - Anurag Jain
Introduction to Video Compression Techniques - Anurag Jain
 
Paper id 2120148
Paper id 2120148Paper id 2120148
Paper id 2120148
 
IBM VideoCharger and Digital Library MediaBase.doc
IBM VideoCharger and Digital Library MediaBase.docIBM VideoCharger and Digital Library MediaBase.doc
IBM VideoCharger and Digital Library MediaBase.doc
 
Ch07_-_Multimedia_Element-Video_1_.ppt
Ch07_-_Multimedia_Element-Video_1_.pptCh07_-_Multimedia_Element-Video_1_.ppt
Ch07_-_Multimedia_Element-Video_1_.ppt
 
Real Time Video Processing in FPGA
Real Time Video Processing in FPGA Real Time Video Processing in FPGA
Real Time Video Processing in FPGA
 
New coding techniques, standardisation, and quality metrics
New coding techniques, standardisation, and quality metricsNew coding techniques, standardisation, and quality metrics
New coding techniques, standardisation, and quality metrics
 
Encoding stored video for stremming applications ieee paper ppt
Encoding stored video for stremming applications ieee paper pptEncoding stored video for stremming applications ieee paper ppt
Encoding stored video for stremming applications ieee paper ppt
 
Video Streaming Compression for Wireless Multimedia Sensor Networks
Video Streaming Compression for Wireless Multimedia Sensor NetworksVideo Streaming Compression for Wireless Multimedia Sensor Networks
Video Streaming Compression for Wireless Multimedia Sensor Networks
 
Multimedia-Lecture-6.pptx
Multimedia-Lecture-6.pptxMultimedia-Lecture-6.pptx
Multimedia-Lecture-6.pptx
 
Video Conferencing : Fundamentals and Application
Video Conferencing : Fundamentals and ApplicationVideo Conferencing : Fundamentals and Application
Video Conferencing : Fundamentals and Application
 
Video Transcoding Terms Explained
Video Transcoding Terms Explained Video Transcoding Terms Explained
Video Transcoding Terms Explained
 
Video Conferencing, The Enterprise and You
Video Conferencing, The Enterprise and YouVideo Conferencing, The Enterprise and You
Video Conferencing, The Enterprise and You
 
Scct2013 topic4 video
Scct2013 topic4 videoScct2013 topic4 video
Scct2013 topic4 video
 
Video Compression Algorithm Based on Frame Difference Approaches
Video Compression Algorithm Based on Frame Difference Approaches Video Compression Algorithm Based on Frame Difference Approaches
Video Compression Algorithm Based on Frame Difference Approaches
 

Plus de Qualcomm Research

Generative AI at the edge.pdf
Generative AI at the edge.pdfGenerative AI at the edge.pdf
Generative AI at the edge.pdfQualcomm Research
 
Understanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdfUnderstanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdfQualcomm Research
 
Enabling the metaverse with 5G- web.pdf
Enabling the metaverse with 5G- web.pdfEnabling the metaverse with 5G- web.pdf
Enabling the metaverse with 5G- web.pdfQualcomm Research
 
Presentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AIPresentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AIQualcomm Research
 
Bringing AI research to wireless communication and sensing
Bringing AI research to wireless communication and sensingBringing AI research to wireless communication and sensing
Bringing AI research to wireless communication and sensingQualcomm Research
 
Realizing mission-critical industrial automation with 5G
Realizing mission-critical industrial automation with 5GRealizing mission-critical industrial automation with 5G
Realizing mission-critical industrial automation with 5GQualcomm Research
 
AI firsts: Leading from research to proof-of-concept
AI firsts: Leading from research to proof-of-conceptAI firsts: Leading from research to proof-of-concept
AI firsts: Leading from research to proof-of-conceptQualcomm Research
 
5G positioning for the connected intelligent edge
5G positioning for the connected intelligent edge5G positioning for the connected intelligent edge
5G positioning for the connected intelligent edgeQualcomm Research
 
Enabling on-device learning at scale
Enabling on-device learning at scaleEnabling on-device learning at scale
Enabling on-device learning at scaleQualcomm Research
 
The essential role of AI in the 5G future
The essential role of AI in the 5G futureThe essential role of AI in the 5G future
The essential role of AI in the 5G futureQualcomm Research
 
Role of localization and environment perception in autonomous driving
Role of localization and environment perception in autonomous drivingRole of localization and environment perception in autonomous driving
Role of localization and environment perception in autonomous drivingQualcomm Research
 
Intelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiencyIntelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiencyQualcomm Research
 
What's in the future of 5G millimeter wave?
What's in the future of 5G millimeter wave? What's in the future of 5G millimeter wave?
What's in the future of 5G millimeter wave? Qualcomm Research
 
Efficient video perception through AI
Efficient video perception through AIEfficient video perception through AI
Efficient video perception through AIQualcomm Research
 
Enabling the rise of the smartphone: Chronicling the developmental history at...
Enabling the rise of the smartphone: Chronicling the developmental history at...Enabling the rise of the smartphone: Chronicling the developmental history at...
Enabling the rise of the smartphone: Chronicling the developmental history at...Qualcomm Research
 
5G spectrum innovations and global update
5G spectrum innovations and global update5G spectrum innovations and global update
5G spectrum innovations and global updateQualcomm Research
 
Transforming enterprise and industry with 5G private networks
Transforming enterprise and industry with 5G private networksTransforming enterprise and industry with 5G private networks
Transforming enterprise and industry with 5G private networksQualcomm Research
 
The essential role of technology standards
The essential role of technology standardsThe essential role of technology standards
The essential role of technology standardsQualcomm Research
 

Plus de Qualcomm Research (20)

Generative AI at the edge.pdf
Generative AI at the edge.pdfGenerative AI at the edge.pdf
Generative AI at the edge.pdf
 
The future of AI is hybrid
The future of AI is hybridThe future of AI is hybrid
The future of AI is hybrid
 
Understanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdfUnderstanding the world in 3D with AI.pdf
Understanding the world in 3D with AI.pdf
 
Enabling the metaverse with 5G- web.pdf
Enabling the metaverse with 5G- web.pdfEnabling the metaverse with 5G- web.pdf
Enabling the metaverse with 5G- web.pdf
 
Presentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AIPresentation - Model Efficiency for Edge AI
Presentation - Model Efficiency for Edge AI
 
Bringing AI research to wireless communication and sensing
Bringing AI research to wireless communication and sensingBringing AI research to wireless communication and sensing
Bringing AI research to wireless communication and sensing
 
Realizing mission-critical industrial automation with 5G
Realizing mission-critical industrial automation with 5GRealizing mission-critical industrial automation with 5G
Realizing mission-critical industrial automation with 5G
 
AI firsts: Leading from research to proof-of-concept
AI firsts: Leading from research to proof-of-conceptAI firsts: Leading from research to proof-of-concept
AI firsts: Leading from research to proof-of-concept
 
5G positioning for the connected intelligent edge
5G positioning for the connected intelligent edge5G positioning for the connected intelligent edge
5G positioning for the connected intelligent edge
 
Enabling on-device learning at scale
Enabling on-device learning at scaleEnabling on-device learning at scale
Enabling on-device learning at scale
 
The essential role of AI in the 5G future
The essential role of AI in the 5G futureThe essential role of AI in the 5G future
The essential role of AI in the 5G future
 
Role of localization and environment perception in autonomous driving
Role of localization and environment perception in autonomous drivingRole of localization and environment perception in autonomous driving
Role of localization and environment perception in autonomous driving
 
Pioneering 5G broadcast
Pioneering 5G broadcastPioneering 5G broadcast
Pioneering 5G broadcast
 
Intelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiencyIntelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiency
 
What's in the future of 5G millimeter wave?
What's in the future of 5G millimeter wave? What's in the future of 5G millimeter wave?
What's in the future of 5G millimeter wave?
 
Efficient video perception through AI
Efficient video perception through AIEfficient video perception through AI
Efficient video perception through AI
 
Enabling the rise of the smartphone: Chronicling the developmental history at...
Enabling the rise of the smartphone: Chronicling the developmental history at...Enabling the rise of the smartphone: Chronicling the developmental history at...
Enabling the rise of the smartphone: Chronicling the developmental history at...
 
5G spectrum innovations and global update
5G spectrum innovations and global update5G spectrum innovations and global update
5G spectrum innovations and global update
 
Transforming enterprise and industry with 5G private networks
Transforming enterprise and industry with 5G private networksTransforming enterprise and industry with 5G private networks
Transforming enterprise and industry with 5G private networks
 
The essential role of technology standards
The essential role of technology standardsThe essential role of technology standards
The essential role of technology standards
 

Dernier

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 

Dernier (20)

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 

How AI research is enabling next-gen codecs

  • 1. July 14, 2021 @QCOMResearch Qualcomm Technologies, Inc. How AI research is enabling next-gen codecs
  • 2. 3 Agenda The demand for improved data compression is growing 1 AI is a compelling tool for compression 2 Our latest AI voice and video codec research 3 Our on-device neural video decoder demo 4 Future research work and challenges 5
  • 3. 4 4 The scale of video and voice being created and consumed is massive 1M Minutes of video crossing the internet per second 15B Minutes of talking per day on WhatsApp calls 82% Of all consumer internet traffic is online video 76 Minutes per day watching video on digital devices by US adults 8B Average daily video views on Facebook Cisco Visual Networking Index: Forecast and Trends, 2017–2022; WhatsApp blog 4/28/20 4
  • 4. 5 5 Video streaming Extended reality Smart cities Smart factories Increasingly, video is all around us — providing entertainment, enhancing collaboration, and transforming industries Autonomous vehicles Video conferencing Sports Smartphone
  • 5. 6 6 of internet traffic will be video by 20221 Video technology revolutionized how we create and consume media Enhanced video quality with less bits led to broad adoption across a wide range of devices and services 1. Cisco Annual Internet Report, 2018–2023
  • 6. 7 1. On a 1080p video at 30 frames per second; EVC = Essential Video Coding, VVC = Versatile Video Coding, Technical innovation drives video evolution A regular cadence of technical advancement in video codecs has led to massive reduction in file size The first evolution Introduced flexibility Focused initially on optical disc media Increased efficiency Introduced internet video Emerging video applications and technology Robust network efficiency Enabling user experiences as demand surges Continued technical advancement 2013 HEVC 2020 EVC and VVC Video Compression 2003 AVC 1995 MPEG-2 Evolution transformed Enriched experiences and proliferation Video Compression Reduction in file size with VVC versus no compression1 ~1000x DVD, digital TV Blu-ray, HDTV, internet ~50% compression ↑ Streaming, 4k, HDR, 360˚ ~50% compression ↑ 8k, immersive video ~30% ~40% compression ↑
  • 7. 8 8 Visual quality is much more than resolution and frame rate Preserving color accuracy, contrast, and brightness is also crucial Broad interoperability across all our screens Resolution Increased definition and sharpness Frame rate Reduced blurring and latency Pixel quantity Pixel fidelity Color accuracy More realistic colors through an expanded color gamut, depth, and temperature Contrast and brightness Increased detail through a larger dynamic range and lighting enhancements
  • 8. 9 Variational auto encoder (VAE)* Invertible Auto-regressive Generative adversarial network (GAN) Generative models Extract features by learning a low-dimension feature representation Sampling to generate, restore, predict, or compress data Powerful capabilities Broad applications Computational photography Text to speech Graphics rendering Speech/video compression Voice UI * VAE first introduced by D. Kingma and M. Welling in 2013 Deep generative model research for unsupervised learning Given unlabeled training data, generate new samples from the same distribution
  • 9. 10 Variational auto encoder (VAE)* Invertible Auto-regressive Generative adversarial network (GAN) Generative models Deep generative model research for unsupervised learning Given unlabeled training data, generate new samples from the same distribution * VAE first introduced by D. Kingma and M. Welling in 2013 Encoder part Decoder part Input unlabeled data, such as images Desired output as close as possible to input Extract features by learning a low-dimension feature representation Sampling to generate, restore, predict or compress data
  • 10. 11 AI-based compression has compelling benefits No special-purpose hardware required, other than an AI acceleration Easy to upgrade, standardize, and deploy new codecs Specialized to a specific data distribution Easy to develop new codecs for new modalities Improved rate-distortion trade-off Optimized for advanced perceptual quality metrics Semantics aware for human visual perception Can generate visual details not in the bitstream
  • 11. 12 12 Achieving state-of-the-art data compression with AI 1 Comparison between state-of-the-art traditional speech coding algorithm and AI speech compression encoder decoder Input speech Output speech Compressed speech Bit-rate compression at same speech quality using AI1 2.6X
  • 12. 13 13 Feedback Recurrent Autoencoder for Video Compression (Golinski, et al., arXiv 2020) Video used in images is produced by Netflix, with CC BY-NC-ND 4.0 license: https://media.xiph.org/video/derf/ElFuente/Netflix_Tango_Copyright.txt Novel machine learning-based video codec research Neural network based I-frame and P-frame compression Achieving state-of-the-art rate-distortion compared with other learned video compression solutions End-to-end deep learning Reconstruction Ground truth Motion vector Residual T=3 Motion estimation P-frame encoder Motion estimation Entropy encoding bits Entropy decoding P-frame decoder Sum Warp Deep neural net T=1 I-frame encoder Entropy encoding bits Entropy decoding I-frame decoder T=2 Motion estimation P-frame encoder Entropy encoding bits Entropy decoding P-frame decoder Warp Sum
  • 13. 14 14 Video compression utilizes temporal and spatial redundancy Group of pictures (GOP) is a sequence of video frame types used to efficiently encode data GOP structures I-frame (intra) P-frame (predicted) B-frame (bi-directional) Independently code each frame Exploit spatial redundancy Code changes based on previous frame Exploit spatial and temporal redundancy Code changes based on previous and next frame Exploit spatial and temporal redundancy | | | | | | | | | | ● ● ● P ● ● ● | ● ● ● P ● ● ● | | ● ● ● B ● ● ● P ● ● ● B ● ● ● P Increasing complexity and bitrate compression
  • 14. 15 Existing research for B-frame codecs has limitations Our solution combines the best of existing P-frame and B-frame codecs while allowing them to share weights Existing P-frame codec Existing B-frame codecs Our B-frame codec Residual compression Motion compensation Motion compression 𝑅𝑒𝑓 𝐼𝑛𝑝𝑢𝑡 𝑂𝑢𝑡𝑝𝑢𝑡 𝑃𝑟𝑒𝑑 Residual compression Bidir motion compensation Bidir motion compression 𝑃𝑟𝑒𝑑 𝑅𝑒𝑓1 𝑅𝑒𝑓0 𝐼𝑛𝑝𝑢𝑡 𝑂𝑢𝑡𝑝𝑢𝑡 High correlation between the two flows (motion constancy) Residual compression Frame interpolation 𝑃𝑟𝑒𝑑 𝑅𝑒𝑓1 𝑅𝑒𝑓0 𝐼𝑛𝑝𝑢𝑡 𝑂𝑢𝑡𝑝𝑢𝑡 Does not work well for large motion P-frame codec Frame interpolation 𝑃𝑟𝑒𝑑 𝑅𝑒𝑓1 𝑅𝑒𝑓0 𝐼𝑛𝑝𝑢𝑡 𝑂𝑢𝑡𝑝𝑢𝑡 P-frame and B-frame codecs can share weights
  • 15. 16 16 Extending Neural P-frame Codecs for B-frame Coding (Pourreza, et al., arXiv 2021) Illustration of different inter-frame coding methods B-Frame compression through Extended P-frame & Interpolation Codec (B-EPIC) Actual object location Object location Prediction Motion vector Prediction error Interpolation Interpolation guide 𝐱𝑡 ෤ 𝐱𝑡 ො 𝐱ref Input frame The corresponding prediction The reference 𝐱𝑡 ො 𝐱ref1 ො 𝐱ref0 ෤ 𝐱𝑡 B-frame prediction based on bidirectional flow/warp: two motion vectors with respect to two references are transmitted. ො 𝐱ref ෤ 𝐱𝑡 P-frame prediction: a motion vector with respect to a single reference is transmitted ො 𝐱ref0 ෤ 𝐱𝑡 ො 𝐱ref1 B-frame prediction based on frame interpolation: the interpolation result is treated as the prediction. No motion information is transmitted. ො 𝐱ref0 ෤ 𝐱𝑡 ො 𝐱ref1 Our B-frame prediction approach: the interpolation result is corrected using a unidirectional motion vector similar to P-frame
  • 16. 17 17 R. Pourezza, T. Cohen, Extending Neural P-frame Codecs for B-frame Coding, Under review at ICCV 2021 Our B-frame coding provides state-of-the-art results Improved rate-distortion tradeoff by extending neural p-frame codecs for B-frame coding 32 33 34 35 36 37 38 39 40 41 0 0.1 0.2 0.3 0.4 0.5 0.6 PSNR (dB) Rate (bits per pixel) PSNR 0.92 0.93 0.94 0.95 0.96 0.97 0.98 0.99 1 0 0.1 0.2 0.3 0.4 0.5 0.6 MS-SSIM Rate (bits per pixel) MS-SSIM Ours Google [CVPR-20] H.265 (FFmpeg)
  • 17. 18 18 shared knowledge 𝜃𝒟 𝜃𝒟 sender receiver Overfitting through instance-adaptive video compression Improved compression via continual learning Overfit a model on one video instance and encode weight-deltas in the bitstream Model delta < baseline bitstream Encoded bitstream + Send weight-deltas based on overfitting Send smaller encoded bitstream based on overfitting encoder 𝑞𝜑(𝒛|𝒙) 0 0 0 1 1 1 ෝ 𝒙 0 0 0 1 1 1 𝒙 𝒃𝒛 E9 E9 E11 D4 D4 𝒛 𝒛 𝒃𝒛 D3 latent prior latent prior decoder 𝑝𝜃(𝒙|𝒛) EE ED E11 EE ED 0 0 1 0 0 1 𝒃ഥ δ E10 D1 𝒃ഥ δ ⊖ ത δ ⊕ ത δ D2 model prior model prior decoder 𝑝𝜃(𝒙|𝒛) T. van Rozendaal*, I.A.M. Huijben*, T. Cohen, Overfitting for Fun and Profit: Instance-Adaptive Data Compression, ICLR 2021 T. Van Rozendaal, Y. Zhang, J. Brehmer, T. Cohen, Instance-Adaptive Video Compression: Improving Neural Codecs by Training on the Test Set, Under review at ICCV 2021
  • 18. 19 Mobile friendly deployment Decoding complexity can be reduced by while still maintaining SOTA results 72% SOTA results from instance-adaptive video compression 24% BD-rate savings over leading neural codec by Google 29% BD-rate savings over FFmpeg H.265 (FFmpeg)
  • 19. 20 20 Variable bitrate image compression for simpler deployment Three levels of solutions with the goal of single model producing a single bitstream Level 0: Multi-model multi-bitstream encoder1 decoder1 encoder2 decoder2 encoder3 decoder3 Each model is optimized for a fixed 𝛽 𝑁 models are needed for 𝑁 rate-distortion tradeoffs Level 1: Single-model multi-bitstream encoder decoder encoder decoder encoder decoder Single encoder-decoder model Certain params /input are used to steer R-D tradeoff 𝛽1 𝛽1 𝛽2 𝛽3 𝛽2 𝛽3 Level 2: Single-model single-bitstream encoder decoder decoder decoder Single encoder-decoder model Encode once, embed all bitrates A.k.a. embedded/ progressive/ scalable coding Increasing order of design difficulty and ease of deployment
  • 20. 21 1/s × s 1/s 𝑦 + - EC 𝑥 𝑦(𝑠) Hyper Codec ED 𝒈𝑎 Analysis transform / encoder 𝒈s Synthesis transform / decoder 𝑥 Input image s Scaling factor that determines the bitrate µ, ℴ Prior parameters EC Entropy coding ED Input image 𝑔𝑎 𝑔𝑠 µ 𝑥 ^ ⎿ ⎿ • N(µ, ℴ) Level 1 approach that applies a scaling factor to the latent for a different trade-off between rate and distortion Latent scaling for a single-model multi-bitstream solution
  • 21. 22 Level 2 approach that uses multiple quantization levels and conditional probability to create a single bitstream with multiple quality levels Nested quantization for a single-model single-bitstream solution Fine scale s1 Coarse scale s2 Finer bins for increased quality
  • 22. 23 Y. Lu, Y. Zhu, Y. Yang, A. Said, T. Cohen, Progressive Neural Image Compression with Nested Quantization and Latent Ordering, Accepted at ICIP 2021 Variable-bitrate progressive neural image compression Achieves comparable performance to HEVC Intra but uses only a single model and a single bitstream HEVC Intra Separate bitstreams, one for each quality [N-model, N-bitstream] Level 0 [1-model, 1-bitstream] Level 2 - Ours [1-model, N-bitstream] Level 1 - HEVC Intra [1-model, 1-bitstream] Level 2 Our scheme A single bitstream where the prefixes decode to different qualities
  • 23. 24 24 Semantic-aware image compression for improved quality Allocate more bits to regions of interest Source Baseline bit allocation Semantic-aware bit allocation
  • 24. 25 25 Semantic compression focuses on regions of interest End-to-end deep learning on foreground/background weighted distortion loss Mask code Mask encoder Code Decoder Encoder Mask decode Foreground / background loss Segmentation mask Input image Gain Reconstructed image
  • 25. 26 Next step Extend to video compression Semantic-aware image compression improves quality where it matters State-of-the-art results for rate-distortion tradeoff 28 30 32 34 36 38 40 42 0.2 0.4 0.6 0.8 1 1.2 1.4 RGB PSNR (dB) Rate (bits per pixel) Rate distortion Segmentation AE (ROI) Segmentation AE (non-ROI) Hyperprior AE (ROI) Hyperprior AE (non-ROI) 5dB gain in region of interest (ROI) 2dB loss at non-ROI
  • 26. 27 27 The tradeoff between perception and distortion metrics Optimizing distortion and/or perceptual quality leads to different reconstructions Original Rate + Distortion Rate + Perception Rate + Distortion + Perception Distortion Perceptual quality Very good Very bad Very bad Very good Good Good
  • 27. 28 28 How GAN-based codecs work GAN-based codecs can produce much more visually appealing content at very low bits per pixel Encoder Generator Discriminator 1 2 1 0 Data distribution P(𝑧) Prior P(real | 𝑧) P(real)
  • 28. 29 29 GAN 14.5 kB 0.221 bpp 145x reduction Raw 2097.15 kB 8.0 bpp 512x1024 BPG QP=40 16.4 kB 0.250 bpp 128x reduction
  • 29. 30 Parallel Entropy Decoding Snapdragon is a product of Qualcomm Technologies, Inc. and/or its subsidiaries. Real-time on-device neural video decode is now possible Research demo showcases all-intra neural video decode on a smartphone. Inter-frame decoding is next 1280 × 704 (Comparable to HD 720p) Mobile device powered by Snapdragon® 888 Mobile Platform CPU Parallel Entropy Encoding 30+ Frames / second Offline processing Encoder AI accelerator Decoder Bitstream
  • 30. 31 31 8-bit model with efficient decoding PEC Decoder AI Model Efficiency Toolkit (AIMET) https://github.com/quic/aimet Step-3 AIMET quantization aware training AIMET is a product of Qualcomm Innovation Center, Inc. Efficient on-device neural decoding Limit to I-frame (intra-frame) component of neural video decoding; Inter-frame decoding is coming soon Step-1 Decoder architecture refinement Decoder Floating-point neural codec Step-2 Parallel Entropy Coding (PEC) PEC Entropy coding Decoder Encoder Parallel entropy coding + architecture refinement + 8-bit  Fast inference AIMET quantization aware training  Improve quantized model accuracy
  • 31. 32 Real-time neural video decoding on a mobile device demo video Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc.
  • 32. 33 Perceptual quality Need to develop differentiable perceptual quality metrics GANs are promising, but not perfect Video GANs are challenging to train due to computational requirements Computational efficiency Real-time on-device video encoding and decoding is still a challenge Rate distortion improvement Bitrate needs to continue to get smaller for the foreseeable future as video demand increases New modalities Lots of work to do on new modalities like point clouds, omnidirectional video, multi-camera setups (e.g. in AV), etc. Challenges for mass deployment of neural video compression
  • 33. 34 34 AI-based compression has the potential to more efficiently share video and voice across devices We are conducting leading research and development in end-to-end deep learning for video and speech coding We are solving implementation challenges and demonstrating a real-time neural video decoder running on a smartphone
  • 35. Follow us on: For more information, visit us at: www.qualcomm.com & www.qualcomm.com/blog Thank you Nothing in these materials is an offer to sell any of the components or devices referenced herein. ©2018-2021 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved. Qualcomm and Snapdragon are trademarks or registered trademarks of Qualcomm Incorporated. Other products and brand names may be trademarks or registered trademarks of their respective owners. References in this presentation to “Qualcomm” may mean Qualcomm Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries or business units within the Qualcomm corporate structure, as applicable. Qualcomm Incorporated includes our licensing business, QTL, and the vast majority of our patent portfolio. Qualcomm Technologies, Inc., a subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of our engineering, research and development functions, and substantially all of our products and services businesses, including our QCT semiconductor business.