After quantization and zigzag scanning, we get a sequence of 64 values for each 8x8 block. The first value in the sequence usually has the largest magnitude and represents the mean value of the block. This value is called the DC coefficient. The remaining 63 values are smaller, and most of them are zeros. These values are called the AC coefficients.
The DC coefficient is encoded slightly differently from the AC coefficients. Because the DC values of neighboring blocks are correlated, the whole DC coefficient is stored only for the first block. Each later block encodes only the difference from the preceding block's DC coefficient; this is done separately for each component. The AC and DC coefficients use different Huffman tables.
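The differential coding of DC values described above can be sketched as follows. This is a minimal illustration, not code from any JPEG library: the function names `dc_differences` and `dc_restore` are made up for this example, and the sample values 29 and 22 are taken from the worked example in this article. The predictor is assumed to start at 0, so the first block is effectively stored in full.

```python
def dc_differences(dc_values):
    """Return the values that are entropy-coded for the DC coefficients
    of consecutive blocks of ONE component."""
    diffs = []
    previous = 0  # assumed initial predictor: the first block is coded in full
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def dc_restore(diffs):
    """Inverse step on the decoder side: accumulate the differences."""
    values = []
    previous = 0
    for diff in diffs:
        previous += diff
        values.append(previous)
    return values


# DC values of two consecutive blocks (29, then 22).
print(dc_differences([29, 22]))  # [29, -7]
print(dc_restore([29, -7]))      # [29, 22]
```

Each component (for example luminance and the two chrominance channels) would keep its own `previous` value.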
The AC coefficients are encoded using run-length encoding. In this method, values are represented by pairs of numbers. The first number is the number of preceding zeros, and the second is the non-zero value. There are two special codes: eob (0,0), which means that only zeros remain until the end of the block, and zrl (15,0), which represents 16 consecutive zeros (the longest run a single code can express; longer runs need several zrl codes).
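The run-length step can be sketched like this. It is a simplified illustration under stated assumptions: the function name `run_length_encode` and the sample data are hypothetical, and the input is assumed to be the 63 AC coefficients already in zigzag order.

```python
def run_length_encode(ac):
    """Turn zigzag-ordered AC coefficients into (zeros_before, value) pairs."""
    # Index of the last non-zero coefficient; everything after it is covered by eob.
    last = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    pairs = []
    run = 0
    for v in ac[:last + 1]:
        if v == 0:
            run += 1
            if run == 16:
                pairs.append((15, 0))  # zrl: sixteen consecutive zeros
                run = 0
        else:
            pairs.append((run, v))
            run = 0
    if last < len(ac) - 1:
        pairs.append((0, 0))  # eob: only zeros remain
    return pairs


# Hypothetical block: 3, two zeros, -1, twenty zeros, 2, then zeros to the end.
ac = [3, 0, 0, -1] + [0] * 20 + [2] + [0] * 38
print(run_length_encode(ac))
# [(0, 3), (2, -1), (15, 0), (4, 2), (0, 0)]
```

Stopping at the last non-zero value first means a zrl code is only emitted when a non-zero value still follows; trailing zeros are always handled by eob.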
Let's consider an example of two consecutive 8x8 blocks:
The first element is the DC coefficient, so we encode the difference between the two blocks (22 - 29 = -7). The remaining elements are encoded using the run-length method. After encoding, we get the following sequence of symbols:
The values marked in red represent the number of preceding zeros. (15,0) is the zrl code and (0,0) is the eob code.
In the next stage, each value is paired with the number of bits required to represent it.
|Number of bits||Range|
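The relation between the number of bits and the range of values can be computed directly. The sketch below is only an illustration; the helper names `size_category` and `category_range` are made up for this example.

```python
def size_category(value):
    """Number of bits needed for the magnitude of value (0 for value == 0)."""
    return abs(value).bit_length()


def category_range(size):
    """Values belonging to a category, as ((neg_min, neg_max), (pos_min, pos_max))."""
    low, high = 1 << (size - 1), (1 << size) - 1
    return (-high, -low), (low, high)


for size in range(1, 5):
    print(size, category_range(size))
# 1 ((-1, -1), (1, 1))
# 2 ((-3, -2), (2, 3))
# 3 ((-7, -4), (4, 7))
# 4 ((-15, -8), (8, 15))

print(size_category(-7))  # 3
```

For the DC difference -7 from the example, this gives 3 bits.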
Now the sequence is:
The square brackets contain the number of preceding zeros followed by the number of bits for the value (marked in green).
Let's convert the values to binary. Negative values should be written in U1 (one's complement), using the same number of bits as the corresponding positive value. If the algorithm works in U2 (two's complement), we can subtract one from the value to get its U1 form (U1 + 1 = U2) and keep only the required number of low-order bits.
Now the pairs in square brackets (number of preceding zeros and number of bits) are encoded. There are two coding methods: Huffman coding and arithmetic coding.
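The binary representation of a value can be sketched as below. The names `value_bits`, `value_bits_u2` and `decode_value` are hypothetical helpers written for this illustration; the second function shows the "subtract one in two's complement" shortcut and should give the same result as the first.

```python
def value_bits(value):
    """Bits of a coefficient: plain binary if positive, one's complement if negative."""
    size = abs(value).bit_length()
    if size == 0:
        return ""  # zero carries no extra bits
    if value < 0:
        value += (1 << size) - 1  # same as inverting every bit of |value|
    return format(value, "0{}b".format(size))


def value_bits_u2(value):
    """Same result via two's complement: subtract one, keep the low `size` bits."""
    size = abs(value).bit_length()
    if size == 0:
        return ""
    if value < 0:
        value = (value - 1) & ((1 << size) - 1)
    return format(value, "0{}b".format(size))


def decode_value(bits):
    """Inverse mapping: a leading 0 bit marks a negative value."""
    if not bits:
        return 0
    value = int(bits, 2)
    if bits[0] == "0":
        value -= (1 << len(bits)) - 1
    return value


print(value_bits(7), value_bits(-7))   # 111 000
print(value_bits(2), value_bits(-2))   # 10 01
print(decode_value("000"))             # -7
```

Because every positive value in a category starts with a 1 bit, the leading bit is enough for the decoder to tell the sign.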
The Huffman method is an entropy coding method that uses variable-length codes. The length of a code depends on the probability of the symbol occurring in the data stream: the higher the probability, the shorter the code. The JPEG specification provides example Huffman tables, but there is no requirement to use them in JPEG compression. In a JFIF file, the Huffman tables are stored in DHT marker segments. There are different tables for the DC and AC components.
Let's consider our example using the Huffman tables for chrominance from the JPEG standard. In the AC entries below, the first digit is the number of preceding zeros and the second is the number of bits.
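To show how symbol probabilities turn into code lengths, here is a minimal sketch of the classic Huffman construction. It is not the procedure a JPEG encoder must follow: real JPEG tables are additionally limited to 16-bit codes, which this sketch ignores. The function name `huffman_code` and the symbol frequencies are invented for the illustration.

```python
import heapq
from itertools import count


def huffman_code(frequencies):
    """Build a prefix code {symbol: bit string} from {symbol: frequency}."""
    tiebreak = count()  # unique counter so the heap never has to compare dicts
    heap = [(freq, next(tiebreak), {sym: ""}) for sym, freq in frequencies.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {sym: "0" for sym in frequencies}
    while len(heap) > 1:
        # Merge the two least frequent subtrees; one gets prefix 0, the other 1.
        f0, _, c0 = heapq.heappop(heap)
        f1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (f0 + f1, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical symbol counts, NOT measured from a real image.
counts = {"eob": 40, "01": 30, "02": 15, "11": 10, "zrl": 5}
codes = huffman_code(counts)
for symbol in counts:
    print(symbol, len(codes[symbol]), codes[symbol])
```

With these counts the most frequent symbol gets a 1-bit code and the two rarest symbols get 4-bit codes.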
DC: 3 -> 110
02 -> 100
01 -> 01
31 -> 11011
11 -> 1011
zrl -> 1111111010
eob -> 00
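Putting the pieces together, the codes listed above can be applied to a block as in the sketch below. The two dictionaries contain only the codes quoted in this article, the function names are hypothetical, and the list of AC pairs is an invented input used just to exercise every code; it is not the sequence from the article's example.

```python
# Only the codes quoted in the article (chrominance tables).
DC_CODES = {3: "110"}
AC_CODES = {
    (0, 2): "100",
    (0, 1): "01",
    (3, 1): "11011",
    (1, 1): "1011",
    (15, 0): "1111111010",  # zrl
    (0, 0): "00",           # eob
}


def value_bits(value):
    """Plain binary for positive values, one's complement for negative ones."""
    size = abs(value).bit_length()
    if size == 0:
        return ""
    if value < 0:
        value += (1 << size) - 1
    return format(value, "0{}b".format(size))


def encode_block(dc_diff, ac_pairs):
    """Huffman code for (zeros, size), followed by the value bits."""
    parts = [DC_CODES[abs(dc_diff).bit_length()] + value_bits(dc_diff)]
    for zeros, value in ac_pairs:
        size = abs(value).bit_length()
        parts.append(AC_CODES[(zeros, size)] + value_bits(value))
    return "".join(parts)


# DC difference -7 from the article; the AC pairs are hypothetical.
pairs = [(0, 2), (3, -1), (15, 0), (1, 1), (0, 0)]
bits = encode_block(-7, pairs)
print(bits, len(bits))
```

A symbol that is missing from the dictionaries raises `KeyError`; a full encoder would carry the complete tables.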
The final sequence is:
The encoded block takes 48 bits. The uncompressed 8x8 block takes 512 bits (64 values of 8 bits each). This gives a compression ratio of 512/48 = 10.67.