Data Compression
Introduction
• Compression is used to:
  – reduce the volume of information to be stored
  – reduce the communication bandwidth required to transmit it over a network
Compression Principles
Entropy Encoding
1. Run-length encoding
• Lossless and independent of the type of source information
• Used when the source information contains long runs of the same character or binary digit
• Each run is sent as a pair (character or bit value, number of occurrences), as in fax transmission
• E.g. 000000011111111110000011…
  ⇒ (0,7) (1,10) (0,5) (1,2)… ⇒ 7,10,5,2…
  (when the runs alternate between 0 and 1 and the first value is known, the run lengths alone are enough)
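The run-length example above can be reproduced with a short Python sketch. The function names `run_length_encode` and `run_length_decode` are illustrative choices, not something defined in the slides, and the input is only the visible part of the bit string (the slide elides the rest).

```python
from itertools import groupby

def run_length_encode(bits: str) -> list[tuple[str, int]]:
    """Collapse each run of identical characters into (character, run length)."""
    return [(symbol, len(list(run))) for symbol, run in groupby(bits)]

def run_length_decode(pairs: list[tuple[str, int]]) -> str:
    """Expand (character, run length) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in pairs)

# Visible part of the bit string from the slide (the trailing dots are omitted).
source = "000000011111111110000011"
pairs = run_length_encode(source)
# If the receiver knows the runs alternate and start with 0, lengths suffice.
lengths = [count for _, count in pairs]

print(pairs)    # [('0', 7), ('1', 10), ('0', 5), ('1', 2)]
print(lengths)  # [7, 10, 5, 2]
```

Decoding the pairs with `run_length_decode(pairs)` gives back `source`, which is what makes the scheme lossless.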
Entropy Encoding
2. Statistical encoding
• Based on the probability of occurrence of a pattern
• The more probable the pattern, the shorter its codeword
Compression Principles
• Huffman Encoding
• Entropy, H: the theoretical minimum average number of bits per symbol required to transmit a particular stream
  $H = -\sum_{i=1}^{n} P_i \log_2 P_i$
  where n: number of symbols, P_i: probability of symbol i
• Efficiency, $E = H / H'$
  where $H' = \sum_{i=1}^{n} N_i P_i$ is the average number of bits per codeword
  and N_i: number of bits in the codeword of symbol i
• E.g. symbols M(10), F(11), Y(010), N(011), 0(000), 1(001) with probabilities 0.25, 0.25, 0.125, 0.125, 0.125, 0.125
• $H' = \sum_{i=1}^{6} N_i P_i = 2(2 \times 0.25) + 4(3 \times 0.125) = 2.5$ bits/codeword
• $H = -\sum_{i=1}^{6} P_i \log_2 P_i = -\left(2(0.25 \log_2 0.25) + 4(0.125 \log_2 0.125)\right) = 2.5$
• $E = H / H' = 100\%$
• Fixed-length codewords for six symbols would need 3 bits/codeword
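As a check on the arithmetic, here is a minimal Python sketch that recomputes H, H' and E for the six-symbol example. The variable names are illustrative; the codewords and probabilities are copied from the slide.

```python
from math import log2

# Codewords and probabilities copied from the six-symbol example.
codebook = {"M": "10", "F": "11", "Y": "010", "N": "011", "0": "000", "1": "001"}
probability = {"M": 0.25, "F": 0.25, "Y": 0.125, "N": 0.125, "0": 0.125, "1": 0.125}

# H: entropy, the lower bound on the average number of bits per symbol.
entropy = -sum(p * log2(p) for p in probability.values())
# H': average codeword length actually achieved by this codebook.
average_length = sum(len(codebook[s]) * p for s, p in probability.items())
# E: efficiency of the code.
efficiency = entropy / average_length

print(entropy, average_length, efficiency)  # 2.5 2.5 1.0
```

The efficiency reaches 100% here because every probability is a power of 1/2, so each codeword length can equal -log2 of its probability exactly.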
Huffman Algorithm (Variable-Length Encoding)
Method: constructing the encoding tree
• Full binary tree representation
• Each edge of the tree has a value (0 for the left child, 1 for the right child)
• Data is at the leaves, not at internal nodes
• Result: encoding tree
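Huffman Algorithm
1. Maintain a forest of trees (initially one single-node tree per symbol)
2. Weight of a tree = sum of the frequencies of its leaves
3. Repeat N-1 times, where N is the number of symbols:
   – Select the two trees with the smallest weights
   – Merge them into a new tree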
Huffman coding
• Variable-length code: the more frequent a character, the shorter its codeword
• Must satisfy the prefix property (no codeword is a prefix of another) so that the bit stream can be decoded unambiguously
• Two-pass algorithm
  – the first pass accumulates the character frequencies and generates the codebook
  – the second pass compresses the data with the codebook
Huffman coding
• Create the codes by constructing a binary tree:
  1. Consider all characters as free nodes
  2. Assign the two free nodes with the lowest frequencies to a parent node whose weight equals the sum of their frequencies
  3. Remove the two free nodes and add the newly created parent node to the list of free nodes
  4. Repeat steps 2 and 3 until only one free node is left; it becomes the root of the tree
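The tree-building steps can be sketched in Python with a priority queue holding the free nodes. This is one possible implementation under stated assumptions, not the slides' own code: `build_huffman_codes` is a hypothetical name, the input string "AAAABBCD" is borrowed from the static Huffman example in this deck, and ties are broken by insertion order, so the exact bit patterns may differ from a hand-built tree even though the codeword lengths agree.

```python
import heapq
from collections import Counter
from itertools import count

def build_huffman_codes(frequencies: dict[str, int]) -> dict[str, str]:
    """Return a {symbol: bit string} codebook by merging the two lightest trees."""
    tie_breaker = count()  # unique sequence number so equal weights never compare nodes
    # Step 1: every character starts as a free node (a one-leaf tree).
    heap = [(weight, next(tie_breaker), symbol) for symbol, weight in frequencies.items()]
    heapq.heapify(heap)
    if not heap:
        return {}
    if len(heap) == 1:  # a lone symbol still needs a one-bit codeword
        return {heap[0][2]: "0"}
    # Steps 2-4: merge the two lowest-weight free nodes until one root remains.
    while len(heap) > 1:
        w_left, _, left = heapq.heappop(heap)
        w_right, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w_left + w_right, next(tie_breaker), (left, right)))

    codes: dict[str, str] = {}

    def walk(node, prefix: str) -> None:
        if isinstance(node, tuple):  # internal node: 0 on the left edge, 1 on the right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                        # leaf: the path from the root is the codeword
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes

frequencies = Counter("AAAABBCD")  # first pass: count the characters
codes = build_huffman_codes(frequencies)
total_bits = sum(frequencies[s] * len(code) for s, code in codes.items())
print(codes, total_bits)
```

With N symbols the loop performs N-1 merges, and each merge costs O(log N) on the heap.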
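• Right branch of the binary tree: 1
• Left branch of the binary tree: 0
• Prefix conflict (example)
  – e: "01", b: "010"
  – "01" is a prefix of "010", so the bits 010 could be read either as "b" or as "e" followed by 0
• Equal frequencies: break ties consistently (always left or always right)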
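The prefix conflict can be detected mechanically. The following Python sketch uses a hypothetical helper, `is_prefix_free`, to compare every ordered pair of codewords. It is a simple quadratic check meant for small codebooks, and the second codebook is the A/B/C/D table from the static Huffman example in this deck.

```python
from itertools import permutations

def is_prefix_free(codebook: dict[str, str]) -> bool:
    """True if no codeword is a prefix of a different entry's codeword."""
    return not any(
        longer.startswith(shorter)
        for shorter, longer in permutations(codebook.values(), 2)
    )

conflicting = {"e": "01", "b": "010"}                      # example from the slide
valid = {"A": "1", "B": "01", "C": "001", "D": "000"}      # static Huffman table

print(is_prefix_free(conflicting))  # False
print(is_prefix_free(valid))        # True
```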
Static Huffman Coding
• Huffman (code) tree
• Count the symbols or characters and their relative probabilities beforehand
• The codes must satisfy the "prefix property"

Symbol  Occurrence  Code
A       4/8         1
B       2/8         01
C       1/8         001
D       1/8         000

4×1 + 2×2 + 1×3 + 1×3 = 14 bits are required to transmit "AAAABBCD"
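[Figure: Huffman code tree for "AAAABBCD". The root node has weight 8 and the branch nodes have weights 4 and 2; A, B, C and D are the leaf nodes. Each edge is labeled 0 or 1, and the codes obey the prefix property.]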
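To make the 14-bit figure concrete, here is a small Python sketch that encodes and decodes "AAAABBCD" with the code table from this slide. The helper names `encode` and `decode` are illustrative assumptions.

```python
# Code table copied from the static Huffman example.
codebook = {"A": "1", "B": "01", "C": "001", "D": "000"}

def encode(text: str, codebook: dict[str, str]) -> str:
    """Concatenate the codeword of every character."""
    return "".join(codebook[ch] for ch in text)

def decode(bits: str, codebook: dict[str, str]) -> str:
    """Read bits left to right, emitting a symbol as soon as a codeword matches."""
    reverse = {code: symbol for symbol, code in codebook.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:  # prefix property: the first match is the only one
            decoded.append(reverse[current])
            current = ""
    return "".join(decoded)

bits = encode("AAAABBCD", codebook)
print(bits, len(bits))         # 11110101001000 14
print(decode(bits, codebook))  # AAAABBCD
```

The decoder needs no separators between codewords, which is exactly what the prefix property buys.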
Example (data with 64 characters)
R K K K K K K K
K K K R R K K K
K K R R R R G G
K K B C C C R R
G G G M C B R R
B B B M Y B B R
G G G G G G G R
G R R R R G R R
Character  Frequency  Huffman code
R          19         00
K          17         01
G          14         10
B          7          110
C          4          1110
M          2          11110
Y          1          11111
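The frequency table can be verified against the 64-character data with a few lines of Python. The sketch also totals the compressed size, a figure the slides do not state; the 3-bit fixed-length baseline is an assumption based on there being seven distinct symbols.

```python
from collections import Counter

# The 8x8 block of characters from the example, one string per row.
rows = [
    "RKKKKKKK",
    "KKKRRKKK",
    "KKRRRRGG",
    "KKBCCCRR",
    "GGGMCBRR",
    "BBBMYBBR",
    "GGGGGGGR",
    "GRRRRGRR",
]
# Huffman codes copied from the table.
codebook = {"R": "00", "K": "01", "G": "10", "B": "110",
            "C": "1110", "M": "11110", "Y": "11111"}

data = "".join(rows)
frequency = Counter(data)
huffman_bits = sum(frequency[s] * len(code) for s, code in codebook.items())
fixed_bits = len(data) * 3  # assumption: 7 distinct symbols need 3 bits each

print(frequency.most_common())
print(huffman_bits, fixed_bits, huffman_bits / len(data))  # 152 192 2.375
```

Under these assumptions the Huffman code uses 152 bits instead of 192, an average of 2.375 bits per character.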
The goal of data compression is to represent digital data with as few bits as possible.
Exercise:
Determine the code for each character in the following text using Huffman coding.

Komdat-Kompresi Data
