2.
Introduction
Compression is used to:
reduce the volume of information to be
stored
reduce the communication bandwidth
required for its transmission over
networks
4.
Compression Principles
Entropy Encoding
1. Run-length encoding
Lossless and independent of the type of source
information
Used when the source information comprises
long substrings of the same character or
binary digit
Encoded as (string or bit pattern, # of occurrences)
pairs, as in FAX
e.g. 000000011111111110000011……
⇒ 0,7 1,10 0,5 1,2…… ⇒ 7,10,5,2……
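The run-length scheme above can be sketched in a few lines of Python; the function name `rle_encode` is my own, not from the slides:

```python
def rle_encode(bits):
    """Scan a string and emit (symbol, run length) pairs, as in the slide's example."""
    runs = []
    i = 0
    while i < len(bits):
        j = i
        while j < len(bits) and bits[j] == bits[i]:
            j += 1  # extend the run while the symbol repeats
        runs.append((bits[i], j - i))
        i = j
    return runs

# The slide's bit string: 7 zeros, 10 ones, 5 zeros, 2 ones
print(rle_encode("0" * 7 + "1" * 10 + "0" * 5 + "1" * 2))
# → [('0', 7), ('1', 10), ('0', 5), ('1', 2)]
```

When both parties know the stream alternates starting with 0, only the counts 7, 10, 5, 2 need to be sent, as the slide shows.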
5.
Entropy Encoding
2. Statistical encoding
Based on the probability of occurrence of a
pattern
The more probable the pattern, the shorter its codeword
6.
Compression Principles
Huffman Encoding
Entropy, H: theoretical minimum average number of bits required
to transmit a particular stream

H = -Σ_{i=1..n} P_i log2(P_i)

where n: # of symbols, P_i: probability of symbol i
Efficiency, E = H/H'
where H' = average # of bits per codeword = Σ_{i=1..n} N_i P_i
N_i: # of bits in the codeword for symbol i
7.
E.g. symbols M(10), F(11), Y(010), N(011), 0(000),
1(001) with probabilities 0.25, 0.25, 0.125, 0.125,
0.125, 0.125

H' = Σ_{i=1..6} N_i P_i = 2×(2×0.25) + 4×(3×0.125) = 2.5 bits/codeword
H = -Σ_{i=1..6} P_i log2(P_i) = -(2×(0.25 log2 0.25) + 4×(0.125 log2 0.125)) = 2.5
E = H/H' = 100%
A fixed-length code for six symbols would need 3 bits/codeword
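The arithmetic on this slide can be checked directly; this is a small sketch using the slide's probabilities and codeword lengths:

```python
import math

def entropy(probs):
    """H = -sum(P_i * log2(P_i)), the theoretical minimum bits per symbol."""
    return -sum(p * math.log2(p) for p in probs)

def avg_codeword_len(lengths, probs):
    """H' = sum(N_i * P_i), the average codeword length of a given code."""
    return sum(n * p for n, p in zip(lengths, probs))

# Slide's example: M, F get 2-bit codes; Y, N, 0, 1 get 3-bit codes
probs = [0.25, 0.25, 0.125, 0.125, 0.125, 0.125]
lengths = [2, 2, 3, 3, 3, 3]

H = entropy(probs)                      # 2.5 bits/symbol
Hp = avg_codeword_len(lengths, probs)   # 2.5 bits/codeword
print(H, Hp, H / Hp)                    # efficiency E = 1.0, i.e. 100%
```

Since H equals H' here, this code reaches the entropy bound, whereas a fixed 3-bit code would waste 0.5 bits per symbol on average.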
8.
Huffman Algorithm (Variable-Length
Encoding)
Method: construction of the encoding tree
• Full binary tree representation
• Each edge of the tree has a value
(0 for the left child, 1 for the right child)
• Data is at the leaves, not at internal nodes
• Result: an encoding tree
9.
Huffman Algorithm
• 1. Maintain a forest of trees
• 2. Weight of a tree = sum of the frequencies of
its leaves
• 3. Repeat N−1 times (N = # of symbols)
– Select the two smallest-weight trees
– Merge them into a new tree
10.
• Huffman coding
• a variable-length code whose codeword length is inversely
related to the character’s frequency
• must satisfy the prefix property to be uniquely
decodable
• two-pass algorithm
– the first pass accumulates the character frequencies
and generates the codebook
– the second pass does the compression with the
codebook
11.
• create codes by constructing a binary tree
1. consider all characters as free nodes
2. assign the two free nodes with the lowest frequencies to
a parent node with a weight equal to the sum of
their frequencies
3. remove the two free nodes and add the newly
created parent node to the list of free nodes
4. repeat steps 2 and 3 until there is one free node
left; it becomes the root of the tree
Huffman coding
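The four tree-construction steps above can be sketched with a min-heap; this is one possible implementation, and the tie-breaking counter (which codeword a symbol gets when frequencies are equal) is an implementation choice, not something the slides specify:

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build a Huffman code from a {symbol: frequency} dict (sketch)."""
    tick = count()  # tie-breaker so heapq never has to compare tree tuples
    heap = [(f, next(tick), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Steps 2-3: merge the two lowest-frequency free nodes into a parent
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tick), (left, right)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: recurse, 0 left / 1 right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: data lives only at the leaves
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

codes = huffman_codes({"A": 4, "B": 2, "C": 1, "D": 1})
print({s: len(c) for s, c in codes.items()})  # → {'A': 1, 'B': 2, 'C': 3, 'D': 3}
```

The exact bit patterns depend on left/right assignment, but the codeword lengths (1, 2, 3, 3) and hence the total cost are the same for any valid Huffman tree over these frequencies.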
12.
• Right branch of the binary tree: 1
• Left branch of the binary tree: 0
• Prefix violation (example)
– e: ”01”, b: “010”
– “01” is a prefix of “010”, so the bits “010” could be
read as “e” plus a dangling “0”; such a code is not
uniquely decodable
• symbols with the same frequency: need a consistent
left/right (tie-breaking) rule
13.
Static Huffman Coding
Huffman (Code) Tree
Count the symbols or characters and their relative
probabilities beforehand
Codes must hold the “prefix property”
Symbol   Occurrence   Code
A        4/8          1
B        2/8          01
C        1/8          001
D        1/8          000

4×1 + 2×2 + 1×3 + 1×3 = 14 bits are
required to transmit
“AAAABBCD”
[Figure: Huffman code tree. The root node (weight 8) has leaf A on its 1-branch and a branch node (weight 4) on its 0-branch; that node has leaf B on its 1-branch and a branch node (weight 2) on its 0-branch, whose leaves are C (1) and D (0). Prefix property!]
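The prefix property is exactly what lets a decoder walk the bit stream left to right and emit a symbol the moment a codeword matches. A minimal sketch using the slide's code table (function names `encode`/`decode` are my own):

```python
# Code table from the slide: A=1, B=01, C=001, D=000
code = {"A": "1", "B": "01", "C": "001", "D": "000"}

def encode(text, code):
    """Concatenate the codeword of each character."""
    return "".join(code[ch] for ch in text)

def decode(bits, code):
    """Greedy left-to-right decode; unambiguous because no codeword
    is a prefix of another (the prefix property)."""
    rev = {v: k for k, v in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in rev:
            out.append(rev[buf])
            buf = ""
    return "".join(out)

bits = encode("AAAABBCD", code)
print(len(bits), decode(bits, code))  # → 14 AAAABBCD
```

14 bits matches the slide's count of 4×1 + 2×2 + 1×3 + 1×3, versus 24 bits for a fixed 3-bit code over 8 characters.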
14.
• Example (data with 64 characters)
• R K K K K K K K
• K K K R R K K K
• K K R R R R G G
• K K B C C C R R
• G G G M C B R R
• B B B M Y B B R
• G G G G G G G R
• G R R R R G R R
15.
• Character   Frequency   Huffman code
• ====================================
• R           19          00
• K           17          01
• G           14          10
• B           7           110
• C           4           1110
• M           2           11110
• Y           1           11111
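Multiplying each frequency by its codeword length gives the total cost of this code for the 64-character example; a quick check (the 152-bit total is computed here, not stated on the slides):

```python
# Frequencies and codeword lengths from the table above
freq     = {"R": 19, "K": 17, "G": 14, "B": 7, "C": 4, "M": 2, "Y": 1}
code_len = {"R": 2,  "K": 2,  "G": 2,  "B": 3, "C": 4, "M": 5, "Y": 5}

huffman_bits = sum(freq[s] * code_len[s] for s in freq)
fixed_bits = sum(freq.values()) * 3  # 7 symbols need 3 bits each in a fixed code

print(huffman_bits, fixed_bits)  # → 152 192
```

So the Huffman code transmits the 64 characters in 152 bits instead of the 192 bits a fixed-length 3-bit code would need.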
17.
The goal of data compression is to
represent digital data with as few
bits as possible.
Exercise:
Determine the code for each character in the following text
using a Huffman code