2. Topics to Cover:
• Facial Animation Parameters (FAP)
• Facial Definition Parameters (FDP)
• Face Model
• Coding of FAPs
• Integration of Face Animation and Text-to-Speech (TTS) synthesis
• Binary Format for Scenes (BIFS) for Facial Animation
3. • What are Facial Animation Parameters (FAPs)?
They are based on the minimal perceptual actions of human beings, such as expressions and emotions, and are closely related to muscle actions.
• What are Facial Definition Parameters (FDPs)?
They allow the user to configure the 3D facial model to be used at the receiver (either by adapting the model already at the decoder or by introducing a fresh model).
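The FAP/FDP split above can be pictured as two kinds of data: FAPs carry per-frame animation values, while FDPs carry one-time model configuration. The following is an illustrative sketch only; the class and field names are invented here, not taken from the MPEG-4 syntax.

```python
from dataclasses import dataclass, field

# Hypothetical containers illustrating the FAP/FDP split; the actual
# bitstream syntax is defined by the MPEG-4 standard, not by these names.

@dataclass
class FAPFrame:
    """One time instant of animation: per-parameter displacements."""
    time_ms: int
    values: dict  # FAP index -> displacement (in FAP units)

@dataclass
class FDP:
    """One-time configuration of the receiver's face model."""
    feature_points: dict = field(default_factory=dict)  # point id -> (x, y, z)
    texture_coords: dict = field(default_factory=dict)  # point id -> (u, v)

# A FAP stream is then just a sequence of FAPFrame objects over time,
# while at most one FDP configures the model before animation starts.
frame = FAPFrame(time_ms=0, values={3: 120, 14: -40})
</imports>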
5. Face Model:
• Every MPEG-4 terminal that is able to decode FAP streams has to provide a face model for animation.
• This model is proprietary to the decoder.
• The encoder does not know what the decoder's face model looks like.
• Using an FDP node, MPEG-4 allows the encoder to completely specify the face model to animate.
• The FDP node can also be used to calibrate the proprietary model of the decoder.
7. • The encoder may choose to specify the locations of all or some feature points.
• Given these feature points, the decoder adapts its own proprietary face model so that it conforms to the feature point positions.
• Face model adaptation also allows texture maps for the face to be downloaded.
• Texture coordinates are associated with each feature point.
• To specify the mapping of the texture map onto the face model, the encoder sends these texture coordinates for each feature point.
8. • This process is encoder-specific.
• Adapting the feature point locations of the face model according to encoder specifications is referred to as Face Model Calibration.
• It is sometimes also called Face Model Adaptation.
10. Simplified scene graph for a head model:
Root Group
 └─ Head Transform X
     └─ Head Transform Y
         └─ Head Transform Z
             ├─ Face, Hair, Tongue, Teeth
             ├─ Left Eye Transform X → Left Eye Transform Y → Left Eye
             └─ Right Eye Transform X → Right Eye Transform Y → Right Eye
11. • A root node is a collection of objects.
• For objects to move together as a group, they need to be in the same transform group.
• When nested transform nodes contain different transforms, the transformations have a cumulative effect.
• A transform node defines geometric 3D transformations such as scaling and rotation.
• An IndexedFaceSet is used to define the geometry and the surface attributes (color and texture) of an object.
• The rotations for the left eye and right eye are also embedded in their transform nodes.
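The cumulative effect of nested transform nodes can be sketched as follows. This is not BIFS syntax; it is a minimal illustration (in 2D, with invented helper names) of how a point in a child node is transformed by the innermost transform first and then by each ancestor's transform in turn.

```python
import math

def rotate_z(theta):
    """Return a function rotating a 2D point by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return lambda p: (c * p[0] - s * p[1], s * p[0] + c * p[1])

def translate(dx, dy):
    """Return a function translating a 2D point by (dx, dy)."""
    return lambda p: (p[0] + dx, p[1] + dy)

def apply_chain(transforms, point):
    """Apply a root-to-leaf chain of transforms: innermost first."""
    for t in reversed(transforms):
        point = t(point)
    return point

# A head transform (translation) wrapping an eye transform (rotation):
# the eye point is rotated in its local frame, then moved with the head.
p = apply_chain([translate(10, 0), rotate_z(math.pi / 2)], (1.0, 0.0))
```

Changing only the outer (head) transform moves every child with it, which is exactly why grouped objects must share a transform group.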
12. Coding of Facial Animation Parameters (FAPs):
• Tools used for coding:
1) Arithmetic coding (low delay)
2) DCT-based coding (high delay)
15. • The first set of FAP values, FAP(0), is coded without prediction (at time instant zero).
• The value of a FAP at time instant k, FAP(k), is predicted from the previously encoded value FAP(k-1).
• The prediction error e is quantized with a step size equal to QP multiplied by the quantization parameter FAP_QUANT, where 0 ≤ FAP_QUANT ≤ 31.
• The quantized prediction error e' is arithmetically encoded using a separate adaptive probability model for each FAP.
• FAP_QUANT > 15 is usually not used because the quality of the animation degrades.
• At the decoder, the received data is arithmetically decoded, dequantized, and added to the previously decoded value.
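The predictive scheme above can be sketched as follows. The arithmetic coding step is omitted (the quantized errors would be the symbols fed to the arithmetic coder), and the QP value here is an arbitrary illustrative step size, not one taken from the standard. Note that the encoder predicts from the decoder's reconstruction, not from the original value, so quantization errors do not accumulate.

```python
def encode(fap_values, qp=2, fap_quant=3):
    """Quantized prediction errors for one FAP (arithmetic coding omitted)."""
    step = qp * fap_quant
    symbols, recon = [], 0
    for k, v in enumerate(fap_values):
        e = v if k == 0 else v - recon      # FAP(0) is coded without prediction
        q = round(e / step)                 # quantized prediction error e'
        symbols.append(q)                   # would be arithmetically coded
        recon = q * step if k == 0 else recon + q * step  # decoder's view
    return symbols

def decode(symbols, qp=2, fap_quant=3):
    """Dequantize each error and add it to the previously decoded value."""
    step = qp * fap_quant
    out, recon = [], 0
    for k, q in enumerate(symbols):
        recon = q * step if k == 0 else recon + q * step
        out.append(recon)
    return out

decoded = decode(encode([10, 16, 4]))
```

With step size 6, each reconstructed value stays within half a step of a reachable level, and a larger FAP_QUANT coarsens the animation in exactly the way the slide warns about.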
16. DCT:
• Applied to 16 consecutive values of each FAP.
• Hence, it introduces a significant delay in the coding and decoding processes.
• After computing the DCT of 16 consecutive values of one FAP, the DC and AC coefficients are coded separately.
• DC coefficients are coded using the prediction method.
• AC coefficients are coded directly.
• DC and AC coefficients are quantized separately.
• The quantized coefficients are encoded with one VLC word giving the number of zero coefficients preceding the next non-zero coefficient, and another VLC for the amplitude of that non-zero coefficient.
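The transform step can be sketched with an orthonormal DCT-II over a 16-sample segment of one FAP's values. Quantization and the VLC run-length coding are omitted; this only shows why the scheme must buffer 16 frames (hence the delay) and how the DC coefficient separates from the AC coefficients.

```python
import math

N = 16  # temporal segment length used for FAP DCT coding

def dct(x):
    """Orthonormal DCT-II of a length-16 segment of one FAP's values."""
    out = []
    for k in range(N):
        s = sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for n in range(N))
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(scale * s)
    return out

# 16 consecutive values of one FAP must be collected before any output:
segment = [float(i % 4) for i in range(N)]
coeffs = dct(segment)
dc, ac = coeffs[0], coeffs[1:]  # coded separately, as described above
```

Because the transform is orthonormal, it preserves the segment's energy, and a slowly varying FAP concentrates that energy in the DC and first few AC coefficients, which is what makes the run-length VLC effective.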
19. Integration of Face Animation and Text-to-Speech (TTS) synthesis
• Synchronization of a FAP stream with a TTS synthesizer through the TTS Interface (TTSI) is only possible if the encoder sends timing information.
• This is because a conventional TTS synthesizer is an asynchronous source.
• DECODER: decodes the text and passes it to the proprietary speech synthesizer.
20. • SYNTHESIZER: creates speech samples that are handed to the compositor.
• COMPOSITOR: provides the audio or video output to the user.
• The second output interface of the synthesizer sends the phonemes of the synthesized speech, along with the start time and duration of each phoneme, to the FAP converter.
• The converter translates the phonemes and timing information into FAPs that the face renderer can use to animate the face model.
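The FAP converter's job can be sketched as a lookup from phonemes to visemes carrying the timing through. The phoneme symbols and viseme groupings below are illustrative examples only; MPEG-4 defines its own fixed viseme set for the high-level viseme FAP.

```python
# Illustrative phoneme-to-viseme table: phonemes with the same visible
# mouth shape (e.g. the bilabials p/b/m) share one viseme index.
PHONEME_TO_VISEME = {
    "p": 1, "b": 1, "m": 1,
    "f": 2, "v": 2,
    "a": 10,
}

def phonemes_to_faps(phonemes):
    """Turn (phoneme, start_ms, duration_ms) triples into timed viseme FAPs."""
    faps = []
    for ph, start, dur in phonemes:
        viseme = PHONEME_TO_VISEME.get(ph, 0)  # 0: neutral/unknown shape
        faps.append({"fap": 1, "viseme": viseme,
                     "start_ms": start, "duration_ms": dur})
    return faps

# Timing from the synthesizer's second output interface drives the faces:
timed = phonemes_to_faps([("m", 0, 80), ("a", 80, 120)])
```

The per-phoneme start times and durations are exactly the timing information the slide says the encoder must supply; without them the converter could produce the right mouth shapes but not place them on the audio timeline.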
21. • Bookmarks in the TTS text are used to animate facial expressions.
• When the TTS synthesizer finds a bookmark in the text, it sends the bookmark to the FAP converter.
• The FAP converter transforms the phonemes into visemes and, using the timing information, into FAPs.
• The bookmark defines the start point and the duration of the transition to a FAP amplitude.
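A bookmark-driven transition can be sketched as ramping a FAP amplitude from its current value to the bookmark's target over the given duration. The linear ramp here is an assumption for illustration; the renderer is free to use a different transition curve.

```python
def amplitude_at(t_ms, start_ms, duration_ms, from_amp, to_amp):
    """FAP amplitude at time t during a bookmark transition (linear ramp)."""
    if t_ms <= start_ms:
        return from_amp                      # transition not started yet
    if t_ms >= start_ms + duration_ms:
        return to_amp                        # transition finished
    frac = (t_ms - start_ms) / duration_ms   # fraction of the way through
    return from_amp + frac * (to_amp - from_amp)

# A bookmark at 100 ms requesting a 100 ms transition to full amplitude:
mid = amplitude_at(150, 100, 100, 0.0, 1.0)
```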
23. Integration with MPEG-4 Systems (BIFS):
• To use face animation in an MPEG-4 system, a BIFS scene graph has to be transmitted.
• The minimum scene graph should contain a Face node and a FAP node.
• The FAPs may include the high-level FAPs, such as visemes and expressions.
• This scene graph enables the encoder to animate the proprietary face model of the decoder.
• Downloading a face model to the decoder additionally requires an FDP node.
• The FDP node is further divided into children, viz. the Face Definition Table (Fdef), Face Definition Mesh (FDM), and Face Definition Transform (FDT).
24. Nodes of the BIFS scene that are used to describe and animate a face