Unit 3 Dictionary based Compression Techniques
1. Lecture Notes on Dictionary Based Compression Techniques
for Open Educational Resource on Data Compression (CA209)
by Dr. Piyush Charan
Assistant Professor
Department of Electronics and Communication Engg.
Integral University, Lucknow
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
2. • Dictionary-based compression algorithms
usually build a dictionary (a collection of
character patterns) in memory as the data is
scanned for repeated information. Some
implementations use a static dictionary, so it
does not have to be built dynamically.
4/22/2021 2
Dr. Piyush Charan, Dept. of ECE, Integral University, Lucknow
3. LZ77/LZ1/ Sliding Window
• In many applications, the output of the source consists of
recurring patterns.
• A very reasonable approach to encode such sources is to
keep a list, or dictionary, of frequently occurring patterns.
• The input is split into two classes, frequently occurring and
infrequently occurring patterns.
• There are static and adaptive dictionary techniques. Most
adaptive techniques have their roots in two papers by Ziv
and Lempel, published in 1977 (LZ77) and 1978 (LZ78).
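The static case can be illustrated with a small sketch. The dictionary contents, the function name `encode_static`, and the `("D", index)` / `("L", symbol)` output format are all assumptions made for illustration only. Frequently occurring patterns are replaced by their dictionary index, and everything else is sent as a literal symbol.

```python
# Hypothetical static dictionary of frequently occurring patterns.
DICTIONARY = ["abra", "rar"]

def encode_static(text, dictionary=DICTIONARY):
    """Split the input into dictionary hits ("D") and literal symbols ("L")."""
    out, pos = [], 0
    while pos < len(text):
        for index, pattern in enumerate(dictionary):
            if text.startswith(pattern, pos):
                out.append(("D", index))   # frequently occurring pattern
                pos += len(pattern)
                break
        else:
            out.append(("L", text[pos]))   # infrequent: send the symbol itself
            pos += 1
    return out

print(encode_static("cabracadabra"))
# [('L', 'c'), ('D', 0), ('L', 'c'), ('L', 'a'), ('L', 'd'), ('D', 0)]
```

An adaptive technique such as LZ77 avoids fixing the dictionary in advance by using recently seen data as the dictionary.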
4. LZ77 Approach
• LZ77 is a dynamic (adaptive) dictionary technique built around a sliding
window.
• The window consists of two parts:
– Search Buffer (SB)
– Look Ahead Buffer (LAB)
• The size of the sliding window is the sum of the two buffer sizes:
Window Size = SB + LAB
[Figure: a sliding window of 13 positions, numbered 1 to 13, divided into the Search Buffer followed by the Look Ahead Buffer.]
5. • Search Buffer: contains a portion of the recently encoded
sequence.
• Look-Ahead Buffer: contains the next portion of the sequence,
which is still to be encoded.
• To encode the sequence in look-ahead buffer, the encoder moves a
search pointer back through the search buffer until it encounters a
match to the first symbol in the look-ahead buffer.
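• Any two of the three sizes (window size, search buffer size and
look-ahead buffer size) must be given in the problem to encode a given
sequence of text; the third follows from Window Size = SB + LAB.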
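As a minimal sketch of the last point, the helper below recovers the missing size from the other two. The function name `buffer_sizes` and its keyword arguments are illustrative assumptions, not part of the original notes.

```python
def buffer_sizes(window=None, search=None, lookahead=None):
    """Return (window, search, lookahead) given any two of the three sizes."""
    given = [v is not None for v in (window, search, lookahead)]
    if sum(given) < 2:
        raise ValueError("need at least two of the three sizes")
    # Fill in whichever size is missing using Window Size = SB + LAB.
    if window is None:
        window = search + lookahead
    elif search is None:
        search = window - lookahead
    elif lookahead is None:
        lookahead = window - search
    if window != search + lookahead or min(search, lookahead) <= 0:
        raise ValueError("sizes are inconsistent")
    return window, search, lookahead

print(buffer_sizes(window=13, lookahead=6))  # (13, 7, 6)
```

With a window of 13 and a look-ahead buffer of 6, the search buffer holds 7 symbols.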
6. Process of LZ77 Compression
• Let's see the process below:
• Triplets: <o, l, c>
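[Figure: the sequence c a b r a c a d a b r a r r a…… under a sliding window of Window Size = 13, split into the Search Buffer and the Look-Ahead Buffer; the three fields of the triplet are labelled Offset, Length of match and Codeword.]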
7. • Offset (o): The distance between the search pointer and the
look-ahead buffer is called the offset.
• Length of match (l): The number of consecutive symbols in
the search buffer that match the consecutive symbols in the
look-ahead buffer, starting with the first symbol, is called the
length of match.
• Codeword (c): It is the codeword corresponding to the symbol
in the look-ahead buffer that follows the match.
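The three definitions above can be sketched as a single search routine. This is an illustrative sketch, not the notes' own implementation: the name `longest_match`, the rule that ties go to the nearest match, and the cap of the match length at one less than the look-ahead buffer size (so that a following symbol is always available for c) are assumptions.

```python
def longest_match(data, pos, search_size, lookahead_size):
    """Return (offset, length, next_symbol) for the look-ahead buffer at pos."""
    # Reserve one symbol for the codeword c: the match may cover at most
    # lookahead_size - 1 symbols and must stop before the end of the data.
    max_len = min(lookahead_size, len(data) - pos) - 1
    best_offset, best_length = 0, 0
    # Move the search pointer back through the search buffer, nearest first.
    for offset in range(1, min(search_size, pos) + 1):
        start = pos - offset
        length = 0
        # The match is allowed to run on into the look-ahead buffer itself.
        while length < max_len and data[start + length] == data[pos + length]:
            length += 1
        if length > best_length:  # assumption: ties keep the smaller offset
            best_offset, best_length = offset, length
    return best_offset, best_length, data[pos + best_length]

# Search buffer "abracad", look-ahead buffer "abrarr" (positions 8 to 13).
print(longest_match("cabracadabrarrarrad", 8, 7, 6))  # (7, 4, 'r')
```

Here the pointer has to move back 7 symbols to find "abra", which matches 4 symbols, and the symbol following the match in the look-ahead buffer is r.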
8. LZ77 Example
• Encode the message:
c a b r a c a d a b r a r r a r r a d
• Here Window Size = 13
• and Size of Look Ahead Buffer = 6,
so the Search Buffer holds 13 − 6 = 7 symbols.
9. LZ77 Example contd..
Search buffer | Look Ahead Buffer | Remaining input | Codeword
(empty) | c a b r a c | a d a…… | <0,0,c(c)>
c | a b r a c a | d a b…… | <0,0,c(a)>
c a | b r a c a d | a b r…… | <0,0,c(b)>
10. LZ77 Example contd..
Search buffer | Look Ahead Buffer | Remaining input | Codeword
c a b | r a c a d a | b r a…… | <0,0,c(r)>
c a b r | a c a d a b | r a r…… | <3,1,c(c)>
c a b r a c | a d a b r a | r r a…… | <2,1,c(d)>
11. LZ77 Example contd..
Already left the window | Search buffer | Look Ahead Buffer | Remaining input | Codeword
c | a b r a c a d | a b r a r r | a r r…… | <7,4,c(r)>
c a b r a c | a d a b r a r | r a r r a d | (none) | <3,5,c(d)>
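The steps worked through above can be reproduced with a short encoder. This is a sketch under stated assumptions, not a definitive implementation: the name `lz77_encode` is illustrative, ties between equally long matches go to the nearest one, and the match length is capped at one less than the look-ahead buffer size so that a next symbol is always available.

```python
def lz77_encode(data, window_size, lookahead_size):
    """Encode data as a list of (offset, length, next_symbol) triplets."""
    search_size = window_size - lookahead_size  # Window Size = SB + LAB
    pos, triplets = 0, []
    while pos < len(data):
        # Leave room for the symbol that follows the match.
        max_len = min(lookahead_size, len(data) - pos) - 1
        best_offset, best_length = 0, 0
        # Scan back through the search buffer, nearest symbol first.
        for offset in range(1, min(search_size, pos) + 1):
            length = 0
            while (length < max_len
                   and data[pos - offset + length] == data[pos + length]):
                length += 1
            if length > best_length:  # assumption: ties keep the nearest match
                best_offset, best_length = offset, length
        triplets.append((best_offset, best_length, data[pos + best_length]))
        pos += best_length + 1  # slide the window past match and next symbol
    return triplets

for triplet in lz77_encode("cabracadabrarrarrad", 13, 6):
    print(triplet)
```

In the last step the match of length 5 starts in the search buffer and runs on into the look-ahead buffer, which the inner loop permits because it compares symbols by absolute position in the sequence.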
12. LZ77 Example contd..
• The encoded message, in the form of triplets, is as
follows:
<0, 0, c(c)>, <0, 0, c(a)>, <0, 0, c(b)>, <0, 0, c(r)>,
<3, 1, c(c)>, <2, 1, c(d)>, <7, 4, c(r)>, <3, 5, c(d)>
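As a check on the triplets, a decoder sketch can rebuild the message from them. The name `lz77_decode` and the representation of each triplet as a Python tuple (with the symbol itself standing in for its codeword) are assumptions made for illustration.

```python
def lz77_decode(triplets):
    """Rebuild the message from (offset, length, symbol) triplets."""
    out = []
    for offset, length, symbol in triplets:
        start = len(out) - offset
        # Copy one symbol at a time so that a match overlapping the
        # symbols being produced (as in <3, 5, c(d)>) still works.
        for i in range(length):
            out.append(out[start + i])
        out.append(symbol)
    return "".join(out)

triplets = [
    (0, 0, "c"), (0, 0, "a"), (0, 0, "b"), (0, 0, "r"),
    (3, 1, "c"), (2, 1, "d"), (7, 4, "r"), (3, 5, "d"),
]
print(lz77_decode(triplets))  # cabracadabrarrarrad
```

The decoded string is the original 19-symbol message, which confirms that the eight triplets account for every symbol.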