Home

About PCS

About Us
Outsourcing
Evaluation package

IP cores

JPEG-EC
STD8085

CSC

CSC_S
CSC_P

Education

JPEG
Contact Us

JPEG

 


Entropy Encoding

After quantization process and zigzag scanning, in result we get the sequence of 64 values for 8x8 block. The first byte in the sequence has the highest value and represents mean value of the block. This byte is called DC. The rest of the bytes has smaller values with majority of zeros. These bytes are called AC.

The DC coefficient will be decoded slightly different than the AC coefficients. Respecting the correlation to the neighboring blocks, just for the first block the whole DC coefficient is processed. Later blocks will only encode the difference to the preceding block’s DC component, this applies for each component separately. AC and DC coefficients have different Huffman tables.

The AC coefficients are decoded using run length encoding technique. In this method values are presented by pair of number. The first number is the information about number of preceding zeros, the second is the non-zero value.There are two special codes: eob (0,0), which represents tailing zeros and zrl (15,0), which represents 16 subsequent zeros (maximum allowed number of subsequent zeros).

Let's consider the example of two subsequent 8x8 blocks:

29,-2,3,1,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0...

followed by:

22,-2,3,1,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0...

 

The first element is DC, so we use the difference between two blocks for encoding the value (22-29 = -7). Next elements are encoded by using run length method. After encoding, we get the following sequence of symbols:

-7;(0,-2);(0,3);(0,1);(0,1);(0,1);(3,1);(15,0);(1,1);(0,0)

The red marked values represent number of preceding zeros. The (15,0) is the zrl code and (0,0) is eob code.

Next stage involves values with number of bits, required for representing data.

 

Number of bits Range
0 0
1 -1,1
2 -3…-2; 2…,3
3 -7…,-4; 4…7
4 -15…-8; 8…15
5 -31…-16; 16…31
6 -63…-32; 32…63
7 -127…-64; 64…127
8 -255…-128; 128…255
9 -511…-256; 256…511
10 -1023…512; 512…1023
11 -2047…-1024; 1024…2047

 

Now the sequence is:

[3]-7;([0,2],-2);([0,2],3);([0,1],1);([0,1],1);([0,1],1);([3,1],1);zrl;([1,1],1);eob

 

In square brackets are the number of precedeing zeros folowed by number of bits for the value (marked as green).

Let's change the values to binary. The negative values should be presented in U1 (one's complement) system. Of course if we're using U2 (two's complement) in algorithm, we may subtract one from the value to get U1 (U1+1 => U2).

 

[3]000;([0,2],01);([0,2],11);([0,1],1);([0,1],1);([0,1],1);([3,1],1);zrl;([1,1],1);eob

 

Now the pairs in square brackets (number of preceding zeros and number of bits) are encoded. There are two methods of coding: Huffman method or arithmetic method.

The Huffman method is used for encoding data with maximum entropy. This method uses variable length symbols. The length of symbol depends on probability of occuring the symbol in a data stream. The higher is the probability, the smaller length of the symbol is. The JPEG specification defines an example of Huffman encoding symbols, but there is no requirement for using them in JPEG compression. Generally, the Huffman symbols are defined in DHT marker segment in JFIF file type. There are diffrent tables for DC an AC components.

Let's consider our example using Huffman tables for chrominance from JPEG standard.

DC: 3 -> 110

AC:

     02 -> 100

     01 -> 01

     31 -> 11011

     11 -> 1011

     zrl -> 1111111010

     eob -> 00

 

The final sequence is:

110000100011001101101101111011111111110101011100

 

It is 48 bits. The whole 8x8 block has 512 bits. It gives compression level 512/48 =  10,67.