arithmetic encoding
Arithmetic coding is a technique for achieving near-optimal entropy encoding.
An arithmetic encoder takes a string of symbols as input and produces a rational number in the interval as output. As each symbol is processed, the encoder will restrict the output to a smaller interval.
Let be the number of distinct symbols in the input; let represent the symbols, and let represent the probability of each symbol appearing. At each step in the process, the output is restricted to the current interval . Partition this interval into disjoint subintervals:
Therefore the of is . If the next symbol is , then restrict the output to the new interval .
Note that at each stage, all the possible intervals are pairwise disjoint. Therefore a specific sequence of symbols produces exactly one unique output range, and the process can be reversed.
Since arithmetic encoders are typically implemented on binary computers, the actual output of the encoder is generally the shortest sequence of bits representing the fractional part of a rational number in the final interval.
Suppose our entire input string contains symbols: then appears exactly times in the input. Therefore, the size of the final interval will be
The number of bits necessary to write a binary fraction in this range is
By Shannon’s theorem, this is the total entropy of the original message. Therefore arithmetic encoding is near-optimal entropy encoding.
Title | arithmetic encoding |
---|---|
Canonical name | ArithmeticEncoding |
Date of creation | 2013-03-22 12:32:20 |
Last modified on | 2013-03-22 12:32:20 |
Owner | vampyr (22) |
Last modified by | vampyr (22) |
Numerical id | 4 |
Author | vampyr (22) |
Entry type | Definition |
Classification | msc 68P30 |
Classification | msc 94A24 |
Synonym | arithmetic coding |
Synonym | arithmetic encoder |
Synonym | arithmetic coder |
Related topic | HuffmanCoding |