Lecture Notes on Arithmetic Coding
for
Open Educational Resource
on
Data Compression (CA209)
by
Dr. Piyush Charan
Assistant Professor
Department of Electronics and Communication Engg.
Integral University, Lucknow
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
UNIT-III Syllabus
• Arithmetic Coding: Coding a sequence,
• Generating a Binary code,
• Comparison of Arithmetic and Huffman coding.
• Dictionary Techniques: Introduction, Static Dictionary:
• Digram Coding, Adaptive Dictionary:
• The LZ77 Approach, The LZ78 Approach.
• Applications: File Compression, Image Compression
• Lossless Image Compression: Multi-resolution Approaches.
• Context Based Compression: Dynamic Markov Compression.
4/22/2021 Dr. Piyush Charan, Dept. of ECE, Integral University, Lucknow
 Coding rate is the average number of bits used to represent a symbol from a source.
 For a given probability model, the entropy is the lowest rate at which the source can
be coded.
 Huffman coding will generate a code whose rate is within p_max + 0.086 of the entropy,
where p_max is the probability of the most frequent symbol.
 Therefore, in Huffman coding, when the alphabet size is large (and p_max is small), the
deviation from the entropy is quite small, and vice versa.
 One solution to this problem is blocking: it is more efficient to generate codewords
for groups or sequences of symbols than to generate a separate codeword for each symbol
in a sequence.
 However, to find the Huffman code for sequences of length m, we need codewords for all
possible sequences of length m.
 This causes exponential growth in the size of the codebook.
 We need a way of assigning codewords to particular sequences without having to
generate codes for all sequences of that length.
 Rather than separating the input into component symbols and replacing each with a code,
arithmetic coding encodes the entire message as a single number (tag).
 First, a unique identifier (tag) is generated for the sequence; second, this tag is
given a unique binary code.
Arithmetic Coding
• Entropy encoding • Lossless data compression • Variable-length coding
 Arithmetic coding is based on the concept of interval subdivision.
– In arithmetic coding, a source ensemble is represented by an interval between 0
and 1 on the real number line.
– Each symbol of the ensemble narrows this interval.
– As the interval becomes smaller, the number of bits needed to specify it grows.
– Arithmetic coding assumes an explicit probabilistic model of the source.
– It uses the probabilities of the source messages to successively narrow the
interval used to represent the ensemble.
 A high probability message narrows the interval less than a low
probability message, so that high probability messages contribute fewer
bits to the coded ensemble.
 Assume we know the probabilities of each symbol of the data source.
 We can then allocate to each symbol an interval whose width is proportional to its
probability, such that the intervals do not overlap.
 This can be done by using the cumulative probabilities as the two ends of each
interval.
 Therefore, the two ends of the interval for symbol x are Q[x−1] and Q[x], where Q is
the cumulative probability.
 Symbol x is said to own the range [Q[x−1], Q[x]).
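As a small illustrative sketch (the three-symbol alphabet and its probabilities are invented here, not taken from the slides), the cumulative probabilities Q partition [0, 1) into non-overlapping intervals:

```python
# Sketch: assigning each symbol the interval [Q[x-1], Q[x]) derived
# from its probability. The alphabet below is illustrative only.
probs = {"a": 0.5, "b": 0.3, "c": 0.2}

intervals = {}
low = 0.0
for sym, p in probs.items():
    intervals[sym] = (low, low + p)  # symbol "owns" [Q[x-1], Q[x])
    low += p
# intervals: a owns [0, 0.5), b owns [0.5, 0.8), c owns [0.8, 1.0)
```

Because the intervals tile [0, 1) without overlap, every value in [0, 1) identifies exactly one symbol.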
 We begin with the interval [0, 1) and subdivide the interval iteratively.
 For each symbol entered, the current interval is divided according to the
probabilities of the alphabet.
 The interval corresponding to the symbol is picked as the interval to be subdivided
further.
 The procedure continues until all symbols in the message have been processed.
 Since each symbol's interval does not overlap with others, for each possible
message there is a unique interval assigned.
 We can represent the message with the interval's two ends [L, H). In fact, taking
any single value in the interval as the encoded code is enough, and usually the
left end L is selected.
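The subdivision loop described above can be sketched as follows, using an invented three-symbol model purely for illustration:

```python
# Sketch of iterative interval subdivision for arithmetic coding.
# `intervals` maps each symbol to its [Q[x-1], Q[x]) range.
def narrow(intervals, message):
    """Return the final [low, high) interval for `message`."""
    low, high = 0.0, 1.0
    for sym in message:
        s_low, s_high = intervals[sym]
        width = high - low
        # Rescale the symbol's sub-interval into the current interval.
        high = low + width * s_high
        low = low + width * s_low
    return low, high

intervals = {"a": (0.0, 0.5), "b": (0.5, 0.8), "c": (0.8, 1.0)}
low, high = narrow(intervals, "ab")
# "a" narrows [0,1) to [0,0.5); "b" narrows that to [0.25, 0.4)
```

Any single value in the final interval (conventionally the left end L) suffices to identify the whole message.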
Once the character probabilities are
known, the individual symbols need
to be assigned a range along a
"probability line," which is nominally
0 to 1. It doesn't matter which
characters are assigned which
segment of the range, as long as it is
done in the same manner by both
the encoder and the decoder. The
nine-character symbol set used here
would look like Figure 2.
Each character is assigned the
portion of the 0 - 1 range that
corresponds to its probability of
appearance. Note also that each
character "owns" everything up to,
but not including, the higher number.
So the letter T in fact has the
range [0.90, 1.0).
After the first character is encoded, we
also know that the range for our output
number is bounded by the low and high
numbers. During the rest of the encoding
process, each new symbol to be encoded
will further restrict the possible range of
the output number. The next character to
be encoded, I, owns the range 0.50
through 0.60. If this were the first
character in our message, we would set these as our
low- and high-range values. But I is the
second character; therefore, we say that I
owns the range corresponding to 0.50 -
0.60 within the new subrange of 0.2 - 0.3. This
means that the new encoded number will
have to fall somewhere in the 50 to 60th
percentile of the currently established
range.
Binary Codeword
Decoding Algorithm
Decoding BILL GATES
Huffman vs. Arithmetic Codes
Comparison of Arithmetic vs. Huffman Coding

Arithmetic Coding                               Huffman Coding
Does not need the probability distribution      Needs a probability distribution
  in advance (can adapt on the fly)
No need to keep and send a codeword table       Needs to store the codeword table
Decompression speed is slow                     Decompression speed is fast
Compression speed is slow                       Compression speed is fast
Compression ratio is very good                  Compression ratio is poorer
No compressed-pattern matching                  Compressed-pattern matching is possible
Fractional codeword length                      Minimum codeword length is 1 bit
Does not produce a prefix code                  Produces a prefix code
Lossless Compression Techniques
 Statistical techniques: each symbol or group of symbols is encoded with a
variable-length code, according to some probability distribution (e.g., Huffman coding,
Dynamic Markov Compression).
 Dictionary techniques: based on the use of a dictionary, which can be static or
dynamic; they code each symbol or group of symbols with an element of the dictionary
(e.g., Lempel-Ziv-Welch).
Lempel-Ziv-Welch (LZW)
Created by Abraham Lempel, Jacob Ziv, and Terry Welch. It was published by Welch in
1984 as an improved implementation of the LZ78 algorithm, published by Lempel and Ziv
in 1978.
 a universal, adaptive* lossless data compression algorithm
 builds a translation table (also called a dictionary) from the text being compressed
 the string translation table maps message strings to fixed-length codes
* Adaptive: the coding scheme used for the kth character of a message is based on the
characteristics of the preceding k − 1 characters in the message.
Dictionary Based Techniques
Lempel-Ziv Coding
Lempel-Ziv-Welch (LZW) Compression Algorithm
 As mentioned earlier, static coding schemes require some knowledge
about the data before encoding takes place.
 Universal coding schemes, like LZW, do not require advance
knowledge and can build such knowledge on-the-fly.
 LZW is the foremost technique for general purpose data compression
due to its simplicity and versatility.
 It is the basis of many PC utilities that claim to “double the capacity of
your hard drive”
 LZW compression uses a code table, with 4096 as a common choice for
the number of table entries.
LZW (cont'd)
 Codes 0-255 in the code table are always assigned to represent single bytes
from the input file.
 When encoding begins the code table contains only the first 256 entries,
with the remainder of the table being blanks.
 Compression is achieved by using codes 256 through 4095 to represent
sequences of bytes.
 As the encoding continues, LZW identifies repeated sequences in the data,
and adds them to the code table.
 Decoding is achieved by taking each code from the compressed file, and
translating it through the code table to find what character or characters it
represents.
LZW Encoding Algorithm
1 Initialize table with single character strings
2 P = first input character
3 WHILE not end of input stream
4 C = next input character
5 IF P + C is in the string table
6 P = P + C
7 ELSE
8 output the code for P
9 add P + C to the string table
10 P = C
11 END WHILE
12 output code for P
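As a check on the pseudocode, here is a direct Python transcription (a character-keyed dictionary stands in for the fixed 4096-entry table, and the 12-bit code packing is omitted for brevity):

```python
def lzw_encode(data):
    """LZW encoding following the pseudocode above: emit the code for P
    whenever P + C is not yet in the string table, then add P + C."""
    table = {chr(i): i for i in range(256)}  # codes 0-255: single chars
    next_code = 256
    p = data[0]                              # P = first input character
    out = []
    for c in data[1:]:                       # C = next input character
        if p + c in table:
            p = p + c
        else:
            out.append(table[p])             # output the code for P
            table[p + c] = next_code         # add P + C to the table
            next_code += 1
            p = c
    out.append(table[p])                     # output code for final P
    return out

print(lzw_encode("BABAABAAA"))  # [66, 65, 256, 257, 65, 260]
```

Running it on the string from Example 1 reproduces the six codes derived step by step in the following slides.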
Example 1: Compression using LZW
Example 1: Use the LZW algorithm to compress the string
BABAABAAA
Example 1: LZW Compression Step 1
BABAABAAA        (P = A, C = empty)

STRING TABLE              ENCODER OUTPUT
string    codeword        representing    output code
BA        256             B               66
Example 1: LZW Compression Step 2
BABAABAAA        (P = B, C = empty)

STRING TABLE              ENCODER OUTPUT
string    codeword        representing    output code
BA        256             B               66
AB        257             A               65
Example 1: LZW Compression Step 3
BABAABAAA        (P = A, C = empty)

STRING TABLE              ENCODER OUTPUT
string    codeword        representing    output code
BA        256             B               66
AB        257             A               65
BAA       258             BA              256
Example 1: LZW Compression Step 4
BABAABAAA        (P = A, C = empty)

STRING TABLE              ENCODER OUTPUT
string    codeword        representing    output code
BA        256             B               66
AB        257             A               65
BAA       258             BA              256
ABA       259             AB              257
Example 1: LZW Compression Step 5
BABAABAAA        (P = A, C = A)

STRING TABLE              ENCODER OUTPUT
string    codeword        representing    output code
BA        256             B               66
AB        257             A               65
BAA       258             BA              256
ABA       259             AB              257
AA        260             A               65
Example 1: LZW Compression Step 6
BABAABAAA        (P = AA, C = empty)

STRING TABLE              ENCODER OUTPUT
string    codeword        representing    output code
BA        256             B               66
AB        257             A               65
BAA       258             BA              256
ABA       259             AB              257
AA        260             A               65
                          AA              260
LZW Decompression
 The LZW decompressor creates the same string table during
decompression.
 It starts with the first 256 table entries initialized to single characters.
 The string table is updated for each character in the input stream, except
the first one.
 Decoding is achieved by reading codes and translating them through the code table
as it is being built.
Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 48
4/22/2021
LZW Decompression Algorithm
1 Initialize table with single character strings
2 OLD = first input code
3 output translation of OLD
4 WHILE not end of input stream
5 NEW = next input code
6 IF NEW is not in the string table
7 S = translation of OLD
8 S = S + C
9 ELSE
10 S = translation of NEW
11 output S
12 C = first character of S
13 add OLD + C to the string table
14 OLD = NEW
15 ENDWHILE
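A direct Python transcription of the decoder (same illustrative simplifications as the encoder sketch: unbounded dictionary, no 12-bit code packing):

```python
def lzw_decode(codes):
    """LZW decoding following the pseudocode above, rebuilding the
    string table on the fly."""
    table = {i: chr(i) for i in range(256)}  # codes 0-255: single chars
    next_code = 256
    old = codes[0]
    s = table[old]
    out = [s]                                # output translation of OLD
    for new in codes[1:]:
        if new in table:
            s = table[new]                   # S = translation of NEW
        else:
            # NEW not yet in table: S = translation of OLD + its own
            # first character (equivalent to the S = S + C branch).
            s = table[old] + table[old][0]
        out.append(s)                        # output S
        table[next_code] = table[old] + s[0]  # add OLD + C to the table
        next_code += 1
        old = new
    return "".join(out)

print(lzw_decode([66, 65, 256, 257, 65, 260]))  # BABAABAAA
```

Feeding it the output of Example 1 recovers the original string, matching the step-by-step trace in Example 2.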
Example 2: LZW Decompression 1
Example 2: Use LZW to decompress the output sequence of
Example 1:
<66><65><256><257><65><260>.
Example 2: LZW Decompression Step 1
<66><65><256><257><65><260>        (Old = 65, New = 66, S = A, C = A)

STRING TABLE              DECODER OUTPUT
string    codeword        string
                          B
BA        256             A
Example 2: LZW Decompression Step 2
<66><65><256><257><65><260>        (Old = 256, New = 256, S = BA, C = B)

STRING TABLE              DECODER OUTPUT
string    codeword        string
                          B
BA        256             A
AB        257             BA
Example 2: LZW Decompression Step 3
<66><65><256><257><65><260>        (Old = 257, New = 257, S = AB, C = A)

STRING TABLE              DECODER OUTPUT
string    codeword        string
                          B
BA        256             A
AB        257             BA
BAA       258             AB
Example 2: LZW Decompression Step 4
<66><65><256><257><65><260>        (Old = 65, New = 65, S = A, C = A)

STRING TABLE              DECODER OUTPUT
string    codeword        string
                          B
BA        256             A
AB        257             BA
BAA       258             AB
ABA       259             A
Example 2: LZW Decompression Step 5
<66><65><256><257><65><260>        (Old = 260, New = 260, S = AA, C = A)

STRING TABLE              DECODER OUTPUT
string    codeword        string
                          B
BA        256             A
AB        257             BA
BAA       258             AB
ABA       259             A
AA        260             AA
LZW: Some Notes
 This algorithm compresses repetitive sequences of data well.
 Since the codewords are 12 bits, any single encoded character will expand the data
size rather than reduce it.
 In this example, 72 bits of input (9 characters × 8 bits) are represented by 72 bits
of output (6 codes × 12 bits). Once a reasonable string table is built, compression
improves dramatically.
 Advantages of LZW over Huffman:
 LZW requires no prior information about the input data stream.
 LZW can compress the input stream in one single pass.
 Another advantage of LZW is its simplicity, allowing fast execution.
LZW: Limitations
 What happens when the dictionary gets too large (i.e., when all the 4096 locations have
been used)?
 Here are some options usually implemented:
 Simply forget about adding any more entries and use the table as is.
 Throw the dictionary away when it reaches a certain size.
 Throw the dictionary away when it is no longer effective at compression.
 Clear entries 256-4095 and start building the dictionary again.
 Some clever schemes rebuild a string table from the last N input characters.
Lossless Image Compression: Multi-resolution
Approaches.
Image compression is a type of data compression applied to digital images, to reduce their cost
for storage or transmission.
Image compression may be lossy or lossless. Lossless compression is preferred for archival purposes and
often for medical imaging, technical drawings, clip art, or comics.
Methods for lossless compression:
Run-length encoding – used as the default method in PCX and as one of the possible methods in BMP, TGA, TIFF
Area image compression
Predictive coding – used in DPCM
Entropy encoding – the two most common entropy encoding techniques are arithmetic coding and Huffman
coding
Adaptive dictionary algorithms such as LZW – used in GIF and TIFF
DEFLATE – used in PNG, MNG, and TIFF
Chain codes
Context Based Compression: Dynamic Markov
Compression.
 developed by Gordon Cormack and Nigel Horspool (1987)
 an adaptive lossless data compression algorithm
 based on modelling the binary source to be encoded by means of a Markov chain,
which describes the transition probabilities between the symbol “0” and the symbol “1”
 the built model is used to predict the next bit of the message; the predicted bit is
then coded using arithmetic coding
Dynamic Markov compression (DMC) uses predictive arithmetic coding similar to prediction
by partial matching (PPM), except that the input is predicted one bit at a time (rather
than one byte at a time). DMC has a good compression ratio and moderate speed, similar
to PPM, but requires somewhat more memory and is not widely implemented; it uses Markov
chains to model the patterns represented in a file.
Each circle represents a state, and each arrow
represents a transition. In this example, we have
two states, rainy and sunny, a deliberately
simplified model of the weather. Each state has
two possible transitions: it can transition to
itself again, or it can transition to the other
state. The likelihood of each transition is
defined by a percentage representing the
probability that the transition occurs.
Now let’s say it’s sunny and we’re following this
model. According to the model there’s a 50%
chance it’s sunny again tomorrow or a 50%
chance it’s rainy tomorrow. If it becomes rainy,
then there’s a 25% chance it’s rainy the day after
that or a 75% chance it’s sunny the day after that.
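The two-state chain just described can be captured directly as a transition table (the probabilities are the ones given in the text: sunny stays sunny with 0.50, rainy stays rainy with 0.25):

```python
# Transition probabilities for the weather example: from each state,
# the chance of staying in it or switching to the other state.
transitions = {
    "sunny": {"sunny": 0.50, "rainy": 0.50},
    "rainy": {"rainy": 0.25, "sunny": 0.75},
}

# Probability of the path sunny -> rainy -> sunny described above:
p = transitions["sunny"]["rainy"] * transitions["rainy"]["sunny"]
print(p)  # 0.375
```

DMC builds exactly this kind of table, but over bits ("0"/"1") instead of weather, updating the counts as the message is read.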
Unit 2: Programming Language Tools
 
Unit 4 Arrays
Unit 4 ArraysUnit 4 Arrays
Unit 4 Arrays
 
Unit 3 Lecture Notes on Programming
Unit 3 Lecture Notes on ProgrammingUnit 3 Lecture Notes on Programming
Unit 3 Lecture Notes on Programming
 
Unit 3 introduction to programming
Unit 3 introduction to programmingUnit 3 introduction to programming
Unit 3 introduction to programming
 
Forensics and wireless body area networks
Forensics and wireless body area networksForensics and wireless body area networks
Forensics and wireless body area networks
 
Final PhD Defense Presentation
Final PhD Defense PresentationFinal PhD Defense Presentation
Final PhD Defense Presentation
 
Unit 1 Introduction to Data Compression
Unit 1 Introduction to Data CompressionUnit 1 Introduction to Data Compression
Unit 1 Introduction to Data Compression
 
Unit 1 Introduction to Non-Conventional Energy Resources
Unit 1 Introduction to Non-Conventional Energy ResourcesUnit 1 Introduction to Non-Conventional Energy Resources
Unit 1 Introduction to Non-Conventional Energy Resources
 
Unit 5-Operational Amplifiers and Electronic Measurement Devices
Unit 5-Operational Amplifiers and Electronic Measurement DevicesUnit 5-Operational Amplifiers and Electronic Measurement Devices
Unit 5-Operational Amplifiers and Electronic Measurement Devices
 
Unit 1 Introduction to Data Compression
Unit 1 Introduction to Data CompressionUnit 1 Introduction to Data Compression
Unit 1 Introduction to Data Compression
 
Unit 4 Switching Theory and Logic Gates
Unit 4 Switching Theory and Logic GatesUnit 4 Switching Theory and Logic Gates
Unit 4 Switching Theory and Logic Gates
 
Unit 1 Numerical Problems on PN Junction Diode
Unit 1 Numerical Problems on PN Junction DiodeUnit 1 Numerical Problems on PN Junction Diode
Unit 1 Numerical Problems on PN Junction Diode
 
Unit 4_Part 1_Number System
Unit 4_Part 1_Number SystemUnit 4_Part 1_Number System
Unit 4_Part 1_Number System
 
Unit 5 Global Issues- Early life of Prophet Muhammad
Unit 5 Global Issues- Early life of Prophet MuhammadUnit 5 Global Issues- Early life of Prophet Muhammad
Unit 5 Global Issues- Early life of Prophet Muhammad
 
Unit 4 Engineering Ethics
Unit 4 Engineering EthicsUnit 4 Engineering Ethics
Unit 4 Engineering Ethics
 
Unit 3 Professional Responsibility
Unit 3 Professional ResponsibilityUnit 3 Professional Responsibility
Unit 3 Professional Responsibility
 

Dernier

FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756dollysharma2066
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdfKamal Acharya
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . pptDineshKumar4165
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaOmar Fathy
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptMsecMca
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfRagavanV2
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapRishantSharmaFr
 
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...tanu pandey
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayEpec Engineered Technologies
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptxJIT KUMAR GUPTA
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueBhangaleSonal
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptDineshKumar4165
 

Dernier (20)

FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdf
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . ppt
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
 
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdf
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 

Unit 3 Arithmetic Coding
  • 1. Lecture Notes on Arithmetic Coding for Open Educational Resource on Data Compression(CA209) by Dr. Piyush Charan Assistant Professor Department of Electronics and Communication Engg. Integral University, Lucknow This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
  • 2. UNIT-III Syllabus • Arithmetic Coding: Coding a sequence, • Generating a Binary code, • Comparison of Arithmetic and Huffman coding. • Dictionary Techniques: Introduction, Static Dictionary: • Digram Coding, Adaptive Dictionary: • The LZ77 Approach, The LZ78 Approach. • Applications: File Compression, Image Compression • Lossless Image Compression: Multi-resolution Approaches. • Context Based Compression: Dynamic Markov Compression. 4/22/2021 Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 2
  • 3.  Coding rate is the average number of bits used to represent a symbol from a source.  For a given probability model, the entropy is the lowest rate at which the source can be coded.  Huffman coding will generate a code whose rate is within p_max + 0.086 of the entropy, where p_max is the probability of the most frequent symbol.  Therefore, in Huffman coding, when the alphabet size is large, p_max tends to be small and the deviation from the entropy is quite small, and vice versa.  One solution to this problem is blocking: it is more efficient to generate codewords for groups or sequences of symbols than to generate a separate codeword for each symbol in a sequence.  However, to find the Huffman code for sequences of length m, we need codewords for all possible sequences of length m.  This causes an exponential growth in the size of the codebook. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 3 4/22/2021
  • 4.  We need a way of assigning codewords to particular sequences without having to generate codes for all sequences of that length.  Rather than separating the input into component symbols and replacing each with a code, arithmetic coding encodes the entire message with a single number (tag).  First, a unique identifier or tag is generated for the sequence. Second, this tag is given a unique binary code. • Entropy encoding • Lossless data compression • Variable length coding Arithmetic Coding 4/22/2021 Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 4
  • 5. Arithmetic Coding  Arithmetic coding is based on the concept of interval subdividing. – In arithmetic coding a source ensemble is represented by an interval between 0 and 1 on the real number line. – Each symbol of the ensemble narrows this interval. – As the interval becomes smaller, the number of bits needed to specify it grows – Arithmetic coding assumes an explicit probabilistic model of the source. – It uses the probabilities of the source messages to successively narrow the interval used to represent the ensemble.  A high probability message narrows the interval less than a low probability message, so that high probability messages contribute fewer bits to the coded ensemble. 4/22/2021 Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 5
  • 6.  Assume we know the probabilities of each symbol of the data source.  We can then allocate to each symbol an interval whose width is proportional to its probability, such that the intervals do not overlap.  This can be done by using the cumulative probabilities as the two ends of each interval.  Therefore, the interval for symbol x has endpoints Q[x-1] and Q[x], where Q is the cumulative probability.  Symbol x is said to own the range [Q[x-1], Q[x]). Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 6 4/22/2021
  • 7. 4/22/2021 Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 7  We begin with the interval [0, 1) and subdivide the interval iteratively.  For each symbol entered, the current interval is divided according to the probabilities of the alphabet.  The interval corresponding to the symbol is picked as the interval to be further proceeded with.  The procedure continues until all symbols in the message have been processed.  Since each symbol's interval does not overlap with others, for each possible message there is a unique interval assigned.  We can represent the message with the interval's two ends [L, H). In fact, taking any single value in the interval as the encoded code is enough, and usually the left end L is selected.
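The iterative subdivision described above can be sketched in Python. This is a floating-point illustration only (the alphabet, probabilities, and message below are hypothetical); practical arithmetic coders use integer arithmetic with rescaling to avoid precision loss.

```python
def cum_ranges(probs):
    """Map each symbol to its [low, high) slice of [0, 1)."""
    ranges, low = {}, 0.0
    for sym, p in probs.items():
        ranges[sym] = (low, low + p)
        low += p
    return ranges

def encode(message, probs):
    """Return the final [L, H) interval for the whole message."""
    ranges = cum_ranges(probs)
    low, high = 0.0, 1.0
    for sym in message:
        width = high - low
        s_low, s_high = ranges[sym]
        # Narrow the current interval to the symbol's sub-range.
        high = low + width * s_high
        low = low + width * s_low
    return low, high

probs = {"a": 0.5, "b": 0.3, "c": 0.2}
L, H = encode("ab", probs)   # "a" narrows [0,1) to [0,0.5); "b" to [0.25,0.4)
```

Any value in [L, H), typically L itself, identifies the message uniquely.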
  • 8. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 8 4/22/2021
  • 9. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 9 4/22/2021
  • 10. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 10 4/22/2021
  • 11. Once the character probabilities are known, the individual symbols need to be assigned a range along a "probability line," which is nominally 0 to 1. It doesn't matter which characters are assigned which segment of the range, as long as it is done in the same manner by both the encoder and the decoder. The nine-character symbol set used here would look like Figure 2. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 11 4/22/2021
  • 12. Each character is assigned the portion of the 0 - 1 range that corresponds to its probability of appearance. Note also that the character "owns" everything up to, but not including the higher number. So the letter T in fact has the range 0.90 - 0.9999 .... Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 12 4/22/2021
  • 13. After the first character is encoded, we also know that the range for our output number is bounded by the low and high numbers. During the rest of the encoding process, each new symbol to be encoded will further restrict the possible range of the output number. The next character to be encoded, I, owns the range 0.50 through 0.60. If this was the first number in our message, we would set these as our low- and high-range values. But I is the second character; therefore, we say that I owns the range corresponding to 0.50 - 0.60 in the new subrange of 0.2 - 0.3. This means that the new encoded number will have to fall somewhere in the 50 to 60th percentile of the currently established range. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 13 4/22/2021
  • 14. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 14 4/22/2021
  • 15. Binary Codeword 4/22/2021 Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 15
  • 16. Decoding Algorithm 4/22/2021 Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 16
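The decoding side mirrors the encoder: at each step, the decoder finds which symbol's interval contains the tag, emits that symbol, and rescales the tag back into [0, 1). A minimal floating-point sketch (the probabilities and tag below are hypothetical examples):

```python
def decode(tag, probs, n):
    """Decode n symbols from a tag value in [0, 1)."""
    # Build each symbol's [low, high) slice of [0, 1) from the
    # cumulative probabilities.
    ranges, low = {}, 0.0
    for sym, p in probs.items():
        ranges[sym] = (low, low + p)
        low += p
    out = []
    for _ in range(n):
        for sym, (lo, hi) in ranges.items():
            if lo <= tag < hi:
                out.append(sym)
                tag = (tag - lo) / (hi - lo)   # rescale tag into [0, 1)
                break
    return "".join(out)

# With probs {"a": 0.5, "b": 0.3, "c": 0.2}, the tag 0.3 lies inside the
# interval [0.25, 0.4) that encoding "ab" produces.
decoded = decode(0.3, {"a": 0.5, "b": 0.3, "c": 0.2}, 2)
```

Note the decoder needs the message length n (or a terminating symbol) to know when to stop.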
  • 17. Decoding BILL GATES 4/22/2021 Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 17
  • 18. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 18 4/22/2021
  • 19. Huffman vs. Arithmetic Codes 4/22/2021 Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 19
  • 20. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 20 4/22/2021
  • 21. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 21 4/22/2021 Huffman vs. Arithmetic Codes
  • 22. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 22 4/22/2021
  • 23. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 23 4/22/2021
  • 24. Arithmetic Coding Huffman Coding Does not need the probability distribution Need a probability distribution No need to keep and send codeword table Need to store the codeword table Decompression speed is slow Decompression speed is Fast Compression Speed is low Compression speed is Fast Compression ratio is very good Compression ratio is poor No compressed pattern matching Compressed pattern matching Fractional codeword length Minimum codeword length is 1 bit Does not produce Prefix code Produce Prefix code Comparison of Arithmetic vs. Huffman Coding 4/22/2021 Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 24
  • 25. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 25 4/22/2021
  • 26. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 26 4/22/2021
  • 27. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 27 4/22/2021
  • 28. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 28 4/22/2021
  • 29.  each symbol or group of symbols is encoded with a variable length code, according to some probability distribution.  based on the use of a dictionary, which can be static or dynamic, and they code each symbol or group of symbols with an element of the dictionary. Huffman Dynamic Markov Compression Lempel-Ziv-Welch Lossless Compression Techniques 4/22/2021 Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 29
  • 30. Lempel-Ziv-Welch (LZW) Created by Abraham Lempel, Jacob Ziv, and Terry Welch. It was published by Welch in 1984 as an improved implementation of the LZ78 algorithm, published by Lempel and Ziv in 1978. A universal, adaptive1, lossless data compression algorithm. It builds a translation table (also called a dictionary) from the text being compressed; the string translation table maps message strings to fixed-length codes. 1 The coding scheme used for the kth character of a message is based on the characteristics of the preceding k − 1 characters in the message Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 30 4/22/2021
  • 31. Dictionary Based Techniques 4/22/2021 Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 31
  • 32. Lempel –Ziv Coding 4/22/2021 Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 32
  • 33. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 33 4/22/2021 Lempel –Ziv Coding
  • 34. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 34 4/22/2021
  • 35. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 35 4/22/2021
  • 36. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 36 4/22/2021
  • 37. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 37 4/22/2021
  • 38. Lempel-Ziv-Welch (LZW) Compression Algorithm  As mentioned earlier, static coding schemes require some knowledge about the data before encoding takes place.  Universal coding schemes, like LZW, do not require advance knowledge and can build such knowledge on-the-fly.  LZW is the foremost technique for general purpose data compression due to its simplicity and versatility.  It is the basis of many PC utilities that claim to “double the capacity of your hard drive”  LZW compression uses a code table, with 4096 as a common choice for the number of table entries. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 38 4/22/2021
  • 39. LZW (cont'd)  Codes 0-255 in the code table are always assigned to represent single bytes from the input file.  When encoding begins the code table contains only the first 256 entries, with the remainder of the table being blanks.  Compression is achieved by using codes 256 through 4095 to represent sequences of bytes.  As the encoding continues, LZW identifies repeated sequences in the data, and adds them to the code table.  Decoding is achieved by taking each code from the compressed file, and translating it through the code table to find what character or characters it represents. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 39 4/22/2021
  • 40. LZW Encoding Algorithm 1 Initialize table with single character strings 2 P = first input character 3 WHILE not end of input stream 4 C = next input character 5 IF P + C is in the string table 6 P = P + C 7 ELSE 8 output the code for P 9 add P + C to the string table 10 P = C 11 END WHILE 12 output code for P Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 40 4/22/2021
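The encoding pseudocode above translates directly into Python. A minimal sketch (the function name is mine; single characters map to their byte values, and new strings get codes from 256 upward, matching the examples that follow):

```python
def lzw_compress(text):
    # Initialize the table with single-character strings (codes 0-255).
    table = {chr(i): i for i in range(256)}
    next_code = 256
    p, out = "", []
    for c in text:
        if p + c in table:
            p = p + c                 # extend the current match
        else:
            out.append(table[p])      # output the code for P
            table[p + c] = next_code  # add P + C to the string table
            next_code += 1
            p = c
    if p:
        out.append(table[p])          # output the code for the final P
    return out

codes = lzw_compress("BABAABAAA")
# codes == [66, 65, 256, 257, 65, 260], matching Example 1
```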
  • 41. Example 1: Compression using LZW Example 1: Use the LZW algorithm to compress the string BABAABAAA Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 41 4/22/2021
  • 42. Example 1: LZW Compression Step 1 BABAABAAA P=A C=empty Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 42 STRING TABLE ENCODER OUTPUT string codeword representing output code BA 256 B 66 4/22/2021
  • 43. Example 1: LZW Compression Step 2 BABAABAAA P=B C=empty Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 43 STRING TABLE ENCODER OUTPUT string codeword representing output code BA 256 B 66 AB 257 A 65 4/22/2021
  • 44. Example 1: LZW Compression Step 3 BABAABAAA P=A C=empty Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 44 STRING TABLE ENCODER OUTPUT string codeword representing output code BA 256 B 66 AB 257 A 65 BAA 258 BA 256 4/22/2021
  • 45. Example 1: LZW Compression Step 4 BABAABAAA P=A C=empty Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 45 STRING TABLE ENCODER OUTPUT string codeword representing output code BA 256 B 66 AB 257 A 65 BAA 258 BA 256 ABA 259 AB 257 4/22/2021
  • 46. Example 1: LZW Compression Step 5 BABAABAAA P=A C=A Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 46 STRING TABLE ENCODER OUTPUT string codeword representing output code BA 256 B 66 AB 257 A 65 BAA 258 BA 256 ABA 259 AB 257 AA 260 A 65 4/22/2021
  • 47. Example 1: LZW Compression Step 6 BABAABAAA P=AA C=empty Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 47 STRING TABLE ENCODER OUTPUT string codeword representing output code BA 256 B 66 AB 257 A 65 BAA 258 BA 256 ABA 259 AB 257 AA 260 A 65 AA 260 4/22/2021
  • 48. LZW Decompression  The LZW decompressor creates the same string table during decompression.  It starts with the first 256 table entries initialized to single characters.  The string table is updated for each character in the input stream, except the first one.  Decoding achieved by reading codes and translating them through the code table being built. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 48 4/22/2021
  • 49. LZW Decompression Algorithm 1 Initialize table with single character strings 2 OLD = first input code 3 output translation of OLD 4 WHILE not end of input stream 5 NEW = next input code 6 IF NEW is not in the string table 7 S = translation of OLD 8 S = S + C 9 ELSE 10 S = translation of NEW 11 output S 12 C = first character of S 13 add translation of OLD + C to the string table 14 OLD = NEW 15 ENDWHILE Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 49 4/22/2021
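The decompression pseudocode can likewise be sketched in Python (function name is mine). The IF branch handles the case where a code arrives before the decoder has added it to its table; the new string must then be the translation of OLD plus its own first character:

```python
def lzw_decompress(codes):
    # Initialize the table with single-character strings (codes 0-255).
    table = {i: chr(i) for i in range(256)}
    next_code = 256
    old = codes[0]
    result = table[old]               # output translation of OLD
    for new in codes[1:]:
        if new in table:
            s = table[new]
        else:
            # NEW not yet in the table: it must be OLD's translation
            # followed by that translation's first character.
            s = table[old] + table[old][0]
        result += s                   # output S
        # Add translation of OLD + first character of S to the table.
        table[next_code] = table[old] + s[0]
        next_code += 1
        old = new
    return result

text = lzw_decompress([66, 65, 256, 257, 65, 260])
# text == "BABAABAAA", recovering the string of Example 1
```

In Example 2, code 260 triggers the IF branch: it is read before the decoder has defined it.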
  • 50. Example 2: LZW Decompression 1 Example 2: Use LZW to decompress the output sequence of Example 1: <66><65><256><257><65><260>. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 50 4/22/2021
  • 51. Example 2: LZW Decompression Step 1 <66><65><256><257><65><260> Old = 65 S = A New = 66 C = A Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 51 STRING TABLE ENCODER OUTPUT string codeword string B BA 256 A 4/22/2021
  • 52. Example 2: LZW Decompression Step 2 <66><65><256><257><65><260> Old = 256 S = BA New = 256 C = B Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 52 STRING TABLE ENCODER OUTPUT string codeword string B BA 256 A AB 257 BA 4/22/2021
  • 53. Example 2: LZW Decompression Step 3 <66><65><256><257><65><260> Old = 257 S = AB New = 257 C = A Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 53 STRING TABLE ENCODER OUTPUT string codeword string B BA 256 A AB 257 BA BAA 258 AB 4/22/2021
  • 54. Example 2: LZW Decompression Step 4 <66><65><256><257><65><260> Old = 65 S = A New = 65 C = A Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 54 STRING TABLE ENCODER OUTPUT string codeword string B BA 256 A AB 257 BA BAA 258 AB ABA 259 A 4/22/2021
  • 55. Example 2: LZW Decompression Step 5 <66><65><256><257><65><260> Old = 260 S = AA New = 260 C = A Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 55 STRING TABLE ENCODER OUTPUT string codeword string B BA 256 A AB 257 BA BAA 258 AB ABA 259 A AA 260 AA 4/22/2021
  • 56. LZW: Some Notes  This algorithm compresses repetitive sequences of data well.  Since the codewords are 12 bits, any single encoded character will expand the data size rather than reduce it.  In this example, the 72-bit input (9 characters × 8 bits) is represented with 72 bits of output (6 codes × 12 bits). After a reasonable string table is built, compression improves dramatically.  Advantages of LZW over Huffman:  LZW requires no prior information about the input data stream.  LZW can compress the input stream in one single pass.  Another advantage of LZW is its simplicity, allowing fast execution. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 56 4/22/2021
  • 57. LZW: Limitations  What happens when the dictionary gets too large (i.e., when all the 4096 locations have been used)?  Here are some options usually implemented:  Simply forget about adding any more entries and use the table as is.  Throw the dictionary away when it reaches a certain size.  Throw the dictionary away when it is no longer effective at compression.  Clear entries 256-4095 and start building the dictionary again.  Some clever schemes rebuild a string table from the last N input characters. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 57 4/22/2021
  • 58. Lossless Image Compression: Multi-resolution Approaches. Image compression is a type of data compression applied to digital images to reduce their cost of storage or transmission. Image compression may be lossy or lossless. Lossless compression is preferred for archival purposes and often for medical imaging, technical drawings, clip art, or comics. Methods for lossless compression: Run-length encoding – the default method in PCX and one of the possible methods in BMP, TGA, and TIFF Area image compression Predictive coding – used in DPCM Entropy encoding – the two most common entropy encoding techniques are arithmetic coding and Huffman coding Adaptive dictionary algorithms such as LZW – used in GIF and TIFF DEFLATE – used in PNG, MNG, and TIFF Chain codes Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 58 4/22/2021
  • 59. Context Based Compression: Dynamic Markov Compression.  developed by Gordon Cormack and Nigel Horspool (1987)  adaptive, lossless data compression algorithm  based on modeling the binary source to be encoded by means of a Markov chain, which describes the transition probabilities between the symbol “0” and the symbol “1”  the built model is used to predict the next bit of the message; the predicted bit is then coded using arithmetic coding Dynamic Markov compression (DMC) is a lossless data compression algorithm developed by Gordon Cormack and Nigel Horspool. It uses predictive arithmetic coding similar to prediction by partial matching (PPM), except that the input is predicted one bit at a time (rather than one byte at a time). DMC has a good compression ratio and moderate speed, similar to PPM, but requires somewhat more memory and is not widely implemented. Dynamic Markov Compression is an obscure form of compression that uses Markov chains to model the patterns represented in a file. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 59 4/22/2021
  • 60. Each circle represents a state, and each arrow represents a transition. In this example, we have two states, raining and sunny, a perfect representation of true weather. Each state has two possible transitions, it can transition to itself again or it can transition to another state. The likelihood of each transition is defined by a percentage representing the probability that the transition occurs. Now let’s say it’s sunny and we’re following this model. According to the model there’s a 50% chance it’s sunny again tomorrow or a 50% chance it’s rainy tomorrow. If it becomes rainy, then there’s a 25% chance it’s rainy the day after that or a 75% chance it’s sunny the day after that. Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow 60 4/22/2021
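The two-state weather chain above can be written as a transition matrix; multiplying transition probabilities along a path gives the likelihood of that sequence of states. A small illustrative sketch (the function name is mine, the percentages are those from the slide):

```python
# Transition probabilities: P[current][next], matching the example above.
P = {
    "sunny": {"sunny": 0.50, "rainy": 0.50},
    "rainy": {"sunny": 0.75, "rainy": 0.25},
}

def prob_of_path(start, path):
    """Probability of following a given sequence of states from `start`."""
    p, state = 1.0, start
    for nxt in path:
        p *= P[state][nxt]
        state = nxt
    return p

# It's sunny today: chance of rain tomorrow and sun the day after.
p = prob_of_path("sunny", ["rainy", "sunny"])   # 0.50 * 0.75 = 0.375
```

DMC builds exactly this kind of model over bits "0" and "1", updating the transition counts adaptively as the file is read.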