2. Information Axioms (6.2)
I(p) = the amount of information in the occurrence of an event of probability p — a quantitative measure of the amount of information any probabilistic event represents.
A. I(p) ≥ 0 for any event of probability p
B. I(p1 ∙ p2) = I(p1) + I(p2) when p1 and p2 are independent events
C. I(p) is a continuous function of p
Existence: I(p) = log(1/p) satisfies the axioms (axiom B is a Cauchy functional equation, and the logarithm is its continuous solution).
Units of information: base 2 = a bit, base e = a nat, base 10 = a Hartley.
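A minimal sketch (not from the slides; the function name info_content is my own) of this definition in Python, evaluating the same event in the three units named above:

```python
import math

def info_content(p: float, base: float = 2.0) -> float:
    """I(p) = log_base(1/p): information in an event of probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return math.log(1.0 / p, base)

p = 1.0 / 8.0
print(info_content(p, 2))        # 3.0 bits
print(info_content(p, math.e))   # ~2.079 nats
print(info_content(p, 10))       # ~0.903 Hartleys
```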
6. Gibbs Inequality (6.4)
Basic information about the log function:
The tangent line to y = ln x at x = 1 is y − ln 1 = (ln x)′|x=1 ∙ (x − 1), i.e. y = x − 1.
(ln x)″ = (1/x)′ = −1/x² < 0, so ln x is concave down.
Therefore ln x ≤ x − 1 for all x > 0, with equality only at x = 1.
[Figure: plot of y = ln x lying below its tangent line y = x − 1.]
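The slide's title suggests this bound feeds the Gibbs inequality itself; a minimal sketch of the standard derivation (my own notation, not copied from the slides), for probability distributions (p_i) and (q_i):

```latex
\begin{aligned}
\sum_i p_i \ln\frac{q_i}{p_i}
  &\le \sum_i p_i\!\left(\frac{q_i}{p_i} - 1\right)
      && \text{since } \ln x \le x - 1 \\
  &= \sum_i q_i - \sum_i p_i = 1 - 1 = 0.
\end{aligned}
```

Equality holds only if q_i = p_i for every i (because ln x = x − 1 only at x = 1), and dividing by ln r converts the statement to logarithms in any base r.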
10. Shannon-Fano Coding (6.6)
The simplest variable-length method. Less efficient than Huffman, but it lets one code symbol s_i with a length l_i computed directly from p_i.
Given source symbols s_1, …, s_q with probabilities p_1, …, p_q, pick l_i = ⌈log_r(1/p_i)⌉.
Hence log_r(1/p_i) ≤ l_i, i.e. r^(−l_i) ≤ p_i.
Summing this inequality over i: Σ_i r^(−l_i) ≤ Σ_i p_i = 1.
The Kraft inequality is satisfied, therefore there is an instantaneous code with these lengths.
11. Example (6.6)
p's: ¼, ¼, ⅛, ⅛, ⅛, ⅛
l's: 2, 2, 3, 3, 3, 3
K = Σ 2^(−l_i) = 1, H_2(S) = 2.5, L = Σ p_i l_i = 5/2.
[Figure: binary decoding tree assigning the six codewords.]
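A small Python check of this example (a sketch I added, not part of the slides): it computes the Shannon-Fano lengths, the Kraft sum K, the entropy H_2(S), and the average length L for the probabilities above.

```python
import math

def shannon_fano_lengths(probs, r=2):
    # l_i = ceil(log_r(1/p_i)); beware float rounding when 1/p_i is an exact power of r
    return [math.ceil(math.log(1.0 / p, r)) for p in probs]

probs = [1/4, 1/4, 1/8, 1/8, 1/8, 1/8]
lengths = shannon_fano_lengths(probs)            # [2, 2, 3, 3, 3, 3]
K = sum(2.0 ** -l for l in lengths)              # 1.0 -> Kraft inequality holds
H = sum(p * math.log2(1.0 / p) for p in probs)   # 2.5 bits
L = sum(p * l for p, l in zip(probs, lengths))   # 2.5 = 5/2
print(lengths, K, H, L)
```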
12. The Entropy of Code Extensions (6.8)
Recall: the nth extension of a source S = {s_1, …, s_q} with probabilities p_1, …, p_q is the set of symbols
T = S^n = { s_{i1} ∙∙∙ s_{in} | s_{ij} ∈ S, 1 ≤ j ≤ n }
where t_i = s_{i1} ∙∙∙ s_{in} (concatenation) has probability p_{i1} ∙∙∙ p_{in} = Q_i (multiplication), assuming independent probabilities.
The entropy is H(S^n) = Σ_i Q_i ∙ log(1/Q_i), summed over all q^n extension symbols.
[Letting i = (i_1, …, i_n), an n-digit number base q.]
13. (6.8)
H(S^n) = n ∙ H(S).
Hence the average S-F code length L_n for T satisfies:
H(T) ≤ L_n < H(T) + 1
n ∙ H(S) ≤ L_n < n ∙ H(S) + 1
H(S) ≤ L_n / n < H(S) + 1/n
So the per-source-symbol code length L_n / n can be made arbitrarily close to the entropy H(S) by taking n large enough.
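A numeric illustration (my own sketch, not from the slides) that builds the nth extension of a small example source and shows H(S^n) = n ∙ H(S) and H(S) ≤ L_n/n < H(S) + 1/n for Shannon-Fano lengths:

```python
import itertools
import math

def entropy(probs):
    return sum(p * math.log2(1.0 / p) for p in probs)

def sf_lengths(probs):
    return [math.ceil(math.log2(1.0 / p)) for p in probs]

S = [0.7, 0.2, 0.1]                      # an arbitrary example source
for n in (1, 2, 3):
    # Q_i = product of the component probabilities (independence assumed)
    ext = [math.prod(c) for c in itertools.product(S, repeat=n)]
    Ln = sum(q * l for q, l in zip(ext, sf_lengths(ext)))
    print(n, entropy(ext), n * entropy(S), Ln / n)
```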
17. Example 6.11
A second-order Markov source: the next binary digit depends on the two previous digits, so the states are (0,0), (0,1), (1,0), (1,1).
[Figure: four-state transition diagram with transition probabilities .8, .5, and .2 between the states.]
Equilibrium probabilities: p(0,0) = 5/14 = p(1,1), p(0,1) = 2/14 = p(1,0).

previous state          next
s_{i-1}  s_{i-2}  s_i   p(s_i | s_{i-1}, s_{i-2})   p(s_{i-1}, s_{i-2})   p(s_{i-1}, s_{i-2}, s_i)
   0        0      0              0.8                      5/14                  4/14
   0        0      1              0.2                      5/14                  1/14
   0        1      0              0.5                      2/14                  1/14
   0        1      1              0.5                      2/14                  1/14
   1        0      0              0.5                      2/14                  1/14
   1        0      1              0.5                      2/14                  1/14
   1        1      0              0.2                      5/14                  1/14
   1        1      1              0.8                      5/14                  4/14
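A quick check (my own sketch) that the stated equilibrium probabilities are the stationary distribution of this chain over the four states, taken in the order 00, 01, 10, 11:

```python
# P[i][j] = probability of moving from state i to state j when one digit is emitted.
P = [
    [0.8, 0.2, 0.0, 0.0],   # from 00: emit 0 -> 00 (0.8), emit 1 -> 01 (0.2)
    [0.0, 0.0, 0.5, 0.5],   # from 01: emit 0 -> 10, emit 1 -> 11
    [0.5, 0.5, 0.0, 0.0],   # from 10: emit 0 -> 00, emit 1 -> 01
    [0.0, 0.0, 0.2, 0.8],   # from 11: emit 0 -> 10 (0.2), emit 1 -> 11 (0.8)
]

pi = [0.25] * 4
for _ in range(2000):                          # power iteration to the fixed point
    pi = [sum(pi[i] * P[i][j] for i in range(4)) for j in range(4)]

print([round(x, 4) for x in pi])               # ~ [5/14, 2/14, 2/14, 5/14]
print(5 / 14, 2 / 14)                          # 0.3571..., 0.1428...
```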
18. Base Fibonacci
The golden ratio φ = (1 + √5)/2 is a solution of x² − x − 1 = 0 and equals the limit of the ratio of adjacent Fibonacci numbers.
A zero-memory source with r equally likely symbols 0, …, r − 1 (each of probability 1/r) has entropy H_2 = log_2 r.
First-order Markov process: after a 1 the next digit is always 0 (no two adjacent 1s); after a 0, emit 0 with probability 1/φ or 1 with probability 1/φ². Note 1/φ + 1/φ² = 1.
Think of the source as emitting the variable-length symbols '0' (probability 1/φ) and '10' (probability 1/φ²):
Entropy = (1/φ) ∙ log φ + ½ ∙ (1/φ²) ∙ log φ² = (1/φ + 1/φ²) ∙ log φ = log φ, which is maximal.
(The factor ½ takes into account that '10' is a two-digit symbol.)
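A small numeric check (my sketch) of the identities this slide uses: 1/φ + 1/φ² = 1, and the per-digit entropy of the '0' / '10' block source equals log₂ φ. Here I divide the per-block entropy by the average block length, which is equivalent to the slide's ½ weighting.

```python
import math

phi = (1 + math.sqrt(5)) / 2          # golden ratio, root of x^2 - x - 1 = 0
p0, p10 = 1 / phi, 1 / phi ** 2       # probabilities of the blocks '0' and '10'
print(p0 + p10)                       # 1.0: the two block probabilities sum to one

# Per-block entropy divided by the average block length (1 digit vs 2 digits)
H_block = p0 * math.log2(1 / p0) + p10 * math.log2(1 / p10)
avg_len = p0 * 1 + p10 * 2
print(H_block / avg_len, math.log2(phi))   # both ~0.694 bits per digit
```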
19. The Adjoint System (6.12) (skip)
For simplicity, consider a first-order Markov system S.
Goal: bound its entropy by that of a zero-memory source whose probabilities are the equilibrium probabilities (the adjoint source S̄).
Let p(s_i) = equilibrium prob. of s_i, p(s_j) = equilibrium prob. of s_j, and p(s_j, s_i) = equilibrium probability of getting the pair s_j s_i. Note p(s_j, s_i) = p(s_i | s_j) ∙ p(s_j).
By the Gibbs inequality,
H(S) = Σ_{i,j} p(s_j, s_i) ∙ log(1 / p(s_i | s_j)) ≤ Σ_{i,j} p(s_j, s_i) ∙ log(1 / p(s_i)) = Σ_i p(s_i) ∙ log(1 / p(s_i)) = H(S̄),
with equality only if p(s_j, s_i) = p(s_i) ∙ p(s_j).
Editor's notes
Use natural logarithms, but works for any base!
How do we know we can get arbitrarily close in all other cases?
If every p_i is an exact negative power of r (so l_i = log_r(1/p_i) and K = 1), then the average code length equals the entropy (put on final exam).