Evaluating Wavelet Transforms for Video Conferencing Applications
ICT R&D Funded Project
Second quarter report
(Oct – Dec, 2008)
Principal Investigators:
Dr. Shahid Masud and Dr. Nadeem Khan
Dept of Computer Science
Lahore University of Management Sciences
Executive Summary:
The research project titled ‘Evaluating Wavelet Transforms for Video Conferencing
Applications’ was approved by the ICT R&D Fund in March 2008 and was initiated
in July 2008. Second-quarter activities mainly revolved around developing
in-depth knowledge of Dirac, both in terms of theory and software implementation. Three
aspects of the project are being addressed simultaneously: (a) developing a detailed
understanding of the DIRAC wavelet-based video codec, (b) developing a software flow
diagram of the DIRAC software and (c) investigating the development of a SIP-based
Open Phone type application around DIRAC. This report includes the complete theoretical
details of the DIRAC codec and its software flow diagram.
This report is divided into four sections as follows:
1.0 Summary of DIRAC Wavelet Based Video Codec
2.0 Software Architecture of DIRAC
3.0 SIP based video-over-IP and integration of DIRAC
4.0 Future Work
1.0 Summary of DIRAC Wavelet Based Video Codec
Dirac is an experimental open-source video codec based on the wavelet transform, initially
developed by BBC Research. The aim of Dirac was to build a high-performance video
compression codec with a simple, modular design, both conceptually and in
implementation.
The main elements or modules of the coder are as follows:
• Transform and scaling involves taking frame data, applying a transform (in
this case the wavelet transform) and scaling the coefficients in preparation for
subsequent quantization;
• Quantization
• Entropy coding is applied to quantized transform coefficients and to motion
vector (MV) data and performs lossless compression on them;
• Motion estimation (ME) involves finding matches for frame data from previously
coded frames, trading off accuracy with motion vector bit rate. Motion
compensation (MC) involves using the motion vectors to predict the current
frame, in such a way as to minimize the cost of encoding the residual data.
• Rate-distortion framework is used throughout the encoder. Unlike MPEG codecs,
there is no macroblock-by-macroblock switching.
• VLC – Variable length coding
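The transform and quantization stages listed above can be illustrated with a minimal sketch. It is not taken from the Dirac sources: it assumes a single level of an integer Haar lifting step on one row of samples and a simple dead-zone quantizer, and the function names and sample values are hypothetical. Dirac's actual filters and quantizer are described in the ‘Summary Dirac’ document.

```python
def haar_forward(samples):
    """One level of an integer Haar lifting step on an even-length list."""
    low, high = [], []
    for a, b in zip(samples[0::2], samples[1::2]):
        d = b - a          # detail (high-pass) coefficient
        s = a + (d >> 1)   # approximation (low-pass) coefficient
        low.append(s)
        high.append(d)
    return low, high


def haar_inverse(low, high):
    """Exactly undo haar_forward."""
    out = []
    for s, d in zip(low, high):
        a = s - (d >> 1)
        b = d + a
        out.extend([a, b])
    return out


def quantize(coeffs, step):
    """Dead-zone quantizer: magnitudes smaller than `step` map to zero."""
    return [(abs(c) // step) * (1 if c >= 0 else -1) for c in coeffs]


# Hypothetical row of pixel values, chosen only for illustration.
row = [10, 12, 14, 13, 50, 52, 51, 49]
low, high = haar_forward(row)
restored = haar_inverse(low, high)
quantized_high = quantize(high, 4)
print(low, high, quantized_high)
```

In this sketch the transform itself is lossless (the inverse restores the row exactly); information is lost only in the quantizer, which here sets every small detail coefficient to zero. Entropy coding would then be applied to the quantized values.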
Dirac uses macroblock structures to introduce a degree of adaptation into motion
estimation by allowing the size of the blocks to vary. In software, the mode decision
is made by trying combinations of block size and prediction mode, scoring each with an
RDO block-matching metric and adopting the best solution macroblock by macroblock.
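The rate-distortion choice described above can be sketched as follows. This is an illustrative example rather than Dirac's implementation: it assumes the sum of absolute differences (SAD) as the distortion measure, a hand-picked bit estimate for each candidate and a Lagrangian cost of the form distortion + lambda × bits. All names and numbers are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def best_candidate(current, candidates, lam):
    """Return (label, cost) of the candidate with the lowest RD cost.

    candidates: list of (label, predicted_block, estimated_bits).
    """
    best = None
    for label, predicted, bits in candidates:
        cost = sad(current, predicted) + lam * bits
        if best is None or cost < best[1]:
            best = (label, cost)
    return best


current = [[10, 10], [10, 10]]
candidates = [
    # Coarse split: cheap to signal, slightly inaccurate prediction.
    ("one large block", [[9, 10], [10, 12]], 4),
    # Fine split: perfect prediction, but more motion vector bits.
    ("four small blocks", [[10, 10], [10, 10]], 16),
]
high_lambda_choice = best_candidate(current, candidates, 2)
low_lambda_choice = best_candidate(current, candidates, 0.1)
print(high_lambda_choice[0], "/", low_lambda_choice[0])
```

With a large lambda the cheaper coarse split wins; with a small lambda the more accurate fine split wins. This is the trade-off between accuracy and motion vector bit rate mentioned in the list of modules.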
Dirac can use any block size, although the block parameters do have to meet some
constraints so that the overlapping process works properly, especially in conjunction
with subsampled chroma components (for which the blocks will be correspondingly
smaller). For example, the block separations and the corresponding block lengths must
differ by a multiple of four, so that the overlap is symmetric for both luma and
subsampled chroma.
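The multiple-of-four constraint can be checked with a short sketch. It assumes chroma subsampled by a factor of two in each direction, and the function names and example block sizes are hypothetical, chosen only for illustration.

```python
def overlap_is_valid(block_length, block_separation):
    """True if length and separation differ by a non-negative multiple of four."""
    overlap = block_length - block_separation
    return overlap >= 0 and overlap % 4 == 0


def chroma_overlap(block_length, block_separation, factor=2):
    """Overlap of the corresponding chroma blocks (assumed subsampling factor)."""
    return block_length // factor - block_separation // factor


# Length 12, separation 8: luma overlap 4, chroma overlap 2.
# Both are even, so each can be split equally between the two sides.
print(overlap_is_valid(12, 8), chroma_overlap(12, 8))

# Length 10, separation 8: luma overlap 2, chroma overlap 1.
# An overlap of one sample cannot be shared symmetrically.
print(overlap_is_valid(10, 8), chroma_overlap(10, 8))
```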
The attached document titled ‘Summary Dirac’ covers the theoretical and algorithmic
details of the Dirac codec, including explanations of the wavelet transform filters,
the quantization operation, motion estimation, entropy coding and VLC.
The ‘Summary Dirac’ document also contains a detailed bibliography of the available
literature on Dirac, its performance and its comparison with other codecs.
2.0 Developing Software flow diagram of DIRAC
The attached document ‘Software Architecture’ includes the implementation details of
the Dirac codec. The Dirac codec is an order of magnitude more complex than
commonly available codecs such as H.263 / H.264. It is therefore very important to
develop an overview of the software flow, classes, objects and interfaces in order to
use this codec effectively within the video conferencing environment.
Important aspects of the software flow diagram include:
• High-level Input and Output of Dirac Encoder
• Input of Motion Estimation and Compensation
• Output of Motion Estimation and Compensation
• Input of Quantization
• Output of Quantization
• Input of Wavelet Transform
• Output of Wavelet Transform
• Input and Output of Entropy Coding
3.0 SIP based video-over-IP and integration of DIRAC
We are also studying and documenting the SIP-based Open Phone software, which we
want to use as a test-bed video conferencing framework for our codec. Some details
are given below; its technical and compilation details are provided in the attached
document “Developing a Wavelet Based Video Conferencing System – Open Phone and
DIRAC Integration”.
SIP:
SIP [1, 2] stands for Session Initiation Protocol. It is a signaling protocol used
for establishing sessions in an IP network. A session could be a simple two-way
telephone call or a collaborative multimedia conference session. SIP uses
UDP or TCP as its transport protocol and can handle unicast or multicast sessions. Using
SIP, telephony becomes another web application and integrates easily with other Internet
services.
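To make the signaling step concrete, the sketch below assembles the text of a SIP INVITE request, the message that asks the other party to join a session. It is a minimal illustration and not code from Open Phone or OPAL: the helper name, addresses, tag, branch and Call-ID values are all hypothetical, and a real client would also attach a session description and actually send the message over UDP or TCP.

```python
def build_invite(caller, callee, local_host, call_id, branch, tag, body=""):
    """Assemble a minimal SIP INVITE request as text (illustrative only)."""
    user = caller.split("@")[0]
    lines = [
        f"INVITE sip:{callee} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host};branch={branch}",
        "Max-Forwards: 70",
        f"To: <sip:{callee}>",
        f"From: <sip:{caller}>;tag={tag}",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{user}@{local_host}>",
    ]
    if body:
        # The body would normally be an SDP description of the media streams.
        lines.append("Content-Type: application/sdp")
    lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    # Header lines end with CRLF; an empty line separates headers and body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


message = build_invite(
    caller="alice@example.com",
    callee="bob@example.org",
    local_host="pc33.example.com",
    call_id="a84b4c76e66710@pc33.example.com",
    branch="z9hG4bK776asdhds",
    tag="1928301774",
)
print(message)
```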
OPAL:
OPAL [2, 3] stands for Open Phone Abstraction Library. It supports SIP and H.323.
OPAL is the next generation of the OpenH323 library. It is designed to be an
infrastructure for any call protocol, such as SIP and H.323, and it includes current
implementations of the SIP and H.323 stacks. To use the OPAL library, PWLib is needed
as the interface layer to the operating system.
OPEN PHONE:
The Open Phone [2] application is a basic softphone built with the OPAL library. Open
Phone uses the OPAL manager. Initializing the OPAL manager involves, among other things,
setting up the UDP or TCP port, jitter values, preferred media formats and the SIP proxy
server. In the main function, upon user permission, a SIP call is made, or a new incoming
call is accepted or rejected.
WORK IN THIS QUARTER:
The attached document “Developing a Wavelet Based Video Conferencing System –
Open Phone and DIRAC Integration” describes possible ways in which DIRAC
could be integrated into an Open Phone type application. Software classes that may
make this integration possible are being developed.
References:
[1] http://www.sipcenter.com/sip.nsf/html/What+Is+SIP+Introduction
[2] www.voxgratia.org/docs/opal/Thesis_Taha.pdf
[3] http://www.opalvoip.org/
4.0 Future Work
Investigation of DIRAC and Open Phone has given us some understanding of the
issues likely to crop up in the future. The main problems with these, as with many
other open source projects, are the complexity and the variation in depth and breadth
of the software, the lack of detailed documentation, the absence of support and unclear
instructions. It is hoped that the documents prepared so far will help us streamline
further efforts.
The work in the next quarter will revolve around fixing the issues related to the
integration of DIRAC into Open Phone. We will also try to run DIRAC through the VTune
profiling tool to estimate the computational load of the different modules, which will
help in later stages to improve the run-time speed of the codec. At the moment, DIRAC
executes far too slowly (around fifty frames per minute) for CIF-size videos. It is
hoped that we will have a draft design of our proposed solution available in the next
quarter.
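The size of the speed gap can be estimated with simple arithmetic. The sketch below uses the measured figure of around fifty frames per minute; the real-time target of 25 frames per second is an assumption made only for this illustration and is not a figure from the project.

```python
measured_frames_per_minute = 50   # approximate figure quoted above
target_fps = 25                   # assumed real-time target (hypothetical)

seconds_per_frame = 60 / measured_frames_per_minute
measured_fps = measured_frames_per_minute / 60
speedup_needed = target_fps * 60 / measured_frames_per_minute

print(f"time per frame: {seconds_per_frame:.2f} s")
print(f"measured rate:  {measured_fps:.2f} frames per second")
print(f"speed-up needed for {target_fps} fps: {speedup_needed:.0f}x")
```

Under this assumption the codec would need to run roughly thirty times faster to keep up with a live video stream, which is why profiling the individual modules is planned before any optimization.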