Color Image Compression System by using Block Categorization Based on Spatial Details and DCT Followed by Improved Entropy Encoder

In this paper, a new high-performance lossy compression technique based on the DCT is proposed. The image is partitioned into blocks of size NxN (where N is a multiple of 2), and each block is categorized as high frequency (uncorrelated block) or low frequency (correlated block) according to its spatial details. This is done by calculating the energy of the block, taking the absolute sums of the differential pulse code modulation (DPCM) differences between pixels and comparing them against a specified threshold value to determine the level of correlation. The image blocks are then scanned and converted into 1D vectors using a horizontal scan order, and the 1D-DCT is applied to each vector to produce transform coefficients. The transformed coefficients are quantized with different quantization values according to the energy of the block. Finally, an enhanced entropy encoder is applied to store the quantized coefficients. The quantitative measures of peak signal-to-noise ratio (PSNR) and compression ratio (CR) are used to assess the effectiveness of the suggested system. The PSNR values of the reconstructed images lie in the intermediate range from 28 dB to 40 dB, and the best attained compression gain on the standard Lena image is around 96.60%. The results were also compared with those of the standard JPEG encoder of the "ACDSee Ultimate 2020" software to evaluate the performance of the proposed system.


1. Introduction
Data compression means compressing data to reduce the storage needed to store it and the transmission time needed to transmit it over the Internet. Data consists of information and redundancy. The information must be preserved in its original form so that the meaning of the data can be properly understood, while redundancy can be removed when it is not important, to reduce the data size, and reinserted to return the data to its original form when needed. The technique of removing redundancy is called compression and the technique of reinserting redundancy is called decompression [1].
There are several varieties of redundancy that can be eliminated from a color image: spectral redundancy, inter-pixel redundancy, psycho-visual redundancy, and coding redundancy. One of the main issues of the RGB system is the spectral redundancy that usually exists between pixels of the three primary color channels. To provide a more efficient representation of color images, the YCbCr (luminance, blue chrominance, and red chrominance) color system, also known as YUV, was proposed in the ITU-R BT.601 recommendation [2]. Inter-pixel redundancy is present because of the correlation between pixels of an image: each pixel is related to its neighboring pixels and its value can be predicted from its neighbors; this sort of redundancy is also called spatial redundancy. Psycho-visual redundancy occurs in an image because of the human visual system (HVS): the human eye cannot distinguish all image colors because "it is more sensitive to the lower frequencies than to the higher frequencies in the visual spectrum", so removing the less important frequencies (information) in the image may be acceptable to the human visual system. Coding redundancy occurs in an image because some intensity values occur more often than others; to reduce this type of redundancy, shorter code words are used for the most frequent intensity values and longer code words for the least frequent ones, which is called variable-length coding [3], [4]. In this work, a high-performance color image compression system was implemented; the system comprises four stages to deal with the different types of redundancy mentioned above.
The first stage transforms the color space from RGB to YCbCr to reduce spectral redundancy. The second stage is transform coding, which deals with inter-pixel redundancy. The third stage is quantization, which deals with psycho-visual redundancy, and the final stage, which is the most important stage in this work, is entropy coding, which deals with coding redundancy.

2. Literature Survey
Ashwaq T. Hashim et al. proposed a compression system based on DPCM, DCT, DWT, and quadtree coding techniques. First, the RGB space is converted into the YUV color space to reduce spectral redundancy. Second, the image is partitioned into blocks and the DPCM technique is used to separate the blocks into correlated and non-correlated blocks by finding the maximum absolute value. The DCT is then used to transform the correlated blocks and the DWT to transform the non-correlated blocks. A quadtree coding technique is applied to the uncorrelated blocks to produce the output sequence. The sequence resulting from quadtree coding and the coefficients resulting from the DCT are coded and stored using a shift coding technique. The proposed algorithms produce better image quality in terms of PSNR with a higher CR compared to standalone DCT or DWT [5].
Yangming Zhou et al. proposed a lossy compression algorithm using a new efficient lossless encoder. First, the image is converted into the YCbCr color space; then the DCT is applied to produce the transformed coefficients. An iterative process based on the bisection method is used to control the quality of the compressed image: after the target peak signal-to-noise ratio is set, the bisection method is applied repeatedly to automatically select the threshold. Adaptive block scanning with four different scan orders (zigzag, horizontal, vertical, and Hilbert) is applied to each block of transform coefficients, choosing the path with the longest run of zeros. The next step is to encode the transformed block using a modified lossless encoder that exploits the statistical characteristics of the DCT coefficients. The proposed system performs better than JPEG and CDABS in both subjective and objective measurements [6].
Abdelhamid Messaoudi et al. presented a compression algorithm based on the DCT. First, the RGB color space is transformed into the YCbCr color space. Next, the image is partitioned into blocks of equal size and the DCT is applied to each block. Four scan orders are applied to each block (zig-zag, horizontal, vertical, and Hilbert), and an index vector containing the runs of zero sequences is formed for each scan. The shortest of the four index vectors determines the best scan for a block. The non-zero values and the chosen index vector are coded and stored. The proposed system shows better performance than several techniques, including the JPEG standard [7].
Mohammed M. Siddeq et al. proposed a new image compression system based on the DCT. The method splits the image into blocks and applies the DCT to each block to produce the transformed coefficients. Each block is then converted into a 1D vector. After that, the zero values are removed from the AC coefficients, keeping only the non-zero values, and a high-frequency minimization method is applied to the non-zero values, reducing each block by 2/3 and producing a minimized array. A differential operator is applied to the vector of DC components. Finally, the resulting DC and AC coefficients are stored using arithmetic coding. The attained results show that the proposed compression system is perceptually superior to JPEG, with quality equivalent to JPEG2000 [8].
Issam Dagher et al. used a combination of the DCT and the Haar wavelet transform to exploit the advantages of the two transforms. The image is partitioned into equal-sized blocks; the DCT is used to transform the upper-left corner of each block and the Haar transform is used for the remaining parts of the block. The results show an improvement in PSNR at the same CR compared to the DCT, while permitting better edge recovery than the Haar transform [9].

3. The Proposed System Scheme
First, the color transformation from RGB to YCbCr is applied. Then, the input image is split into blocks of size NxN (where N is a multiple of 2), and each block is categorized according to its spatial details as either a high-frequency block (its pixels are uncorrelated) or a low-frequency block containing correlated pixels. This is done by applying DPCM in three different directions (horizontal, vertical, and diagonal) and then calculating the energy of each block, which is compared against a specified threshold value to determine the correlation level between its nearby pixels. The image blocks are then scanned and converted into 1D vectors using a horizontal scan order, and the 1D-DCT is applied to each vector to produce the transform coefficients. Adaptive scalar quantization is then applied to both correlated and uncorrelated blocks. The quantization values for each block differ according to the block feature, whether it is a high-frequency detailed block or a low-frequency correlated block, taking advantage of the fact that high-frequency blocks can be treated separately from low-frequency blocks to produce better compression. Finally, the proposed entropy encoder is applied to the quantized coefficients to store each coefficient with an optimal number of bits. Figure-1 illustrates the process.

3.1 RGB to YCbCr Conversion
First, the RGB color space is converted to the YCbCr space. One of the main issues of the RGB color space is the spectral redundancy that usually exists between pixels of the three primary color channels. The YCbCr space provides a more efficient representation of color images: Y represents the luminance information and Cb, Cr represent the chrominance information. The logic behind this process is that most of the image energy lies within the Y component, and the human eye is more sensitive to luminance changes than to color changes, so it is beneficial to work in the YCbCr space and treat the three components separately. The transform from the RGB color space to the YCbCr space is given by the following equations [10]:

Y  = 0.299 R + 0.587 G + 0.114 B
Cb = 128 - 0.168736 R - 0.331264 G + 0.5 B
Cr = 128 + 0.5 R - 0.418688 G - 0.081312 B

The inverse transform from the YCbCr color space to the RGB space is given by the following equations:

R = Y + 1.402 (Cr - 128)
G = Y - 0.344136 (Cb - 128) - 0.714136 (Cr - 128)
B = Y + 1.772 (Cb - 128)

3.2 Block Classification
Image blocks are categorized by applying differential pulse code modulation (DPCM) to each block to determine the correlation between its pixels and hence the importance of the block. The DPCM differences for each block are calculated in three directions (horizontal, vertical, and diagonal) to measure the correlation of each pixel with its neighboring pixels. The correlation level of the block is then determined by taking the maximum of the absolute sums of these differences. If the correlation level of the block is greater than a specified threshold, the block is treated as a high-frequency uncorrelated block; otherwise, it is treated as a low-frequency correlated block. The correlation level of each block is calculated using the following equations:

D_h = Σ_{i=0..h-1} Σ_{j=1..w-1} |Y(i, j) - Y(i, j-1)|
D_v = Σ_{i=1..h-1} Σ_{j=0..w-1} |Y(i, j) - Y(i-1, j)|
D_d = Σ_{i=1..h-1} Σ_{j=1..w-1} |Y(i, j) - Y(i-1, j-1)|
E   = max(D_h, D_v, D_d)

where i, j are the coordinates of each pixel in the block, Y is the pixel value at coordinates i, j, and w, h are the width and height of the block.
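The two preprocessing steps above can be sketched in Python. This is an illustrative sketch, not the authors' code: the JPEG-style full-range BT.601 coefficients and the example threshold value are assumptions.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Forward full-range BT.601 (JPEG-style) transform; rgb is a float array."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def block_energy(block):
    """Maximum of the absolute DPCM-difference sums over three directions."""
    h = np.abs(np.diff(block, axis=1)).sum()           # horizontal differences
    v = np.abs(np.diff(block, axis=0)).sum()           # vertical differences
    d = np.abs(block[1:, 1:] - block[:-1, :-1]).sum()  # diagonal differences
    return max(h, v, d)

def classify_block(block, threshold):
    """True -> high-frequency (uncorrelated); False -> low-frequency (correlated)."""
    return block_energy(block) > threshold
```

A flat block yields zero energy and is classified as correlated, while a checkerboard-like block exceeds any reasonable threshold and is classified as uncorrelated.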

3.3 Horizontal Scanning Order
The horizontal scan is used to convert a 2D block into a 1D vector. It works by scanning the rows in snake form: in the first row, the scan runs from the first pixel to the last pixel; in the second row, it runs from the last pixel to the first pixel; and so on. The horizontal scan is shown in Figure-2.

3.4 Discrete Cosine Transform (DCT)
The DCT implemented in this system is the 1D-DCT, because processing a block in one dimension is faster than processing it in two dimensions, which accelerates the compression and decompression process. The resulting DCT coefficients in each block are arranged so that the coefficient holding the most significant information, called the DC component, resides in the upper-left corner of the block, while the rest of the coefficients, called the AC components, hold the less important information [11]. The 1D-DCT is obtained by applying equation (10) to transform the image, and the transform is reversed by applying equation (11) to obtain the reconstructed image [12]:

F(u) = α(u) Σ_{x=0..N-1} f(x) cos[(2x + 1) u π / 2N]          (10)

f(x) = Σ_{u=0..N-1} α(u) F(u) cos[(2x + 1) u π / 2N]          (11)

where N is the length of the input block, u = 0, 1, ..., N-1, f(x) is the input vector, and α(0) = √(1/N), α(u) = √(2/N) for u > 0.
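The snake scan and the orthonormal 1D-DCT pair can be sketched as follows; this is a minimal illustrative sketch (a direct evaluation of the sums, not an optimized fast transform):

```python
import numpy as np

def snake_scan(block):
    """Horizontal snake scan: even rows left-to-right, odd rows right-to-left."""
    out = block.copy()
    out[1::2] = block[1::2, ::-1]  # reverse every odd-indexed row
    return out.ravel()

def dct_1d(f):
    """Forward DCT-II with orthonormal scaling (equation (10))."""
    n = len(f)
    x = np.arange(n)
    F = np.empty(n)
    for u in range(n):
        alpha = np.sqrt(1.0 / n) if u == 0 else np.sqrt(2.0 / n)
        F[u] = alpha * np.sum(f * np.cos((2 * x + 1) * u * np.pi / (2 * n)))
    return F

def idct_1d(F):
    """Inverse DCT (DCT-III, equation (11))."""
    n = len(F)
    u = np.arange(n)
    alpha = np.where(u == 0, np.sqrt(1.0 / n), np.sqrt(2.0 / n))
    return np.array([np.sum(alpha * F * np.cos((2 * x + 1) * u * np.pi / (2 * n)))
                     for x in range(n)])
```

For a constant input vector, only the DC coefficient is non-zero, and applying the inverse transform to the forward transform recovers the input.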

3.5 Adaptive Scalar Quantization
Quantization is a simple process: each coefficient is divided by a quantization step value and the result is rounded to the nearest integer; de-quantization is carried out by multiplying by the quantization step. In non-uniform scalar quantization, a quantization matrix of size NxN is calculated, and each transformed block of size NxN from the DCT step is quantized non-uniformly by dividing the coefficients of the block by the corresponding values in the quantization matrix. The quantization process is applied by the following equation [13]:

Q(u, v) = round( F(u, v) / Q_s(u, v) )

where the quantization step Q_s is calculated from the control parameters Q0, Q1, and α, u and v represent the coordinates within each block, and Q0 is derived from the block size N.
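The divide-round-multiply cycle can be sketched as below. The paper's derivation of the step matrix from Q0, Q1, and α is not reproduced here; the step is passed in as a parameter (a scalar or a per-coefficient array), which is an assumption of this sketch.

```python
import numpy as np

def quantize(coeffs, q_step):
    # Divide each coefficient by its quantization step, round to nearest integer.
    return np.round(coeffs / q_step).astype(int)

def dequantize(q_coeffs, q_step):
    # Approximate reconstruction: multiply back by the quantization step.
    return q_coeffs * q_step
```

Larger steps (used for high-frequency blocks) discard more precision and hence compress more aggressively; the reconstruction error per coefficient is bounded by half the step.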

4. Entropy Coding
Entropy coding is applied to store the quantized coefficients as the compressed image, reducing the storage requirement. The entropy encoder works using a variable-length coding (VLC) technique, which calculates the optimal number of bits needed for each coefficient according to its probability. Each quantized block is passed to the encoder to produce a stream of bits. The steps of the proposed entropy encoder are implemented as follows:
1- Split the image into blocks of size MxM, where M is the block size N used by the DCT stage divided by 2.
2- Check whether the block values are all zeroes or contain both zero and non-zero values; a one-bit tag (0 or 1) is stored for each block so that the decompression stage can determine whether the corresponding block elements are all zeroes.

3- If the block values are all zero, the block is discarded; if the block contains non-zero values, create a vector containing the block values sequentially. For a DCT block, take the values from the DC value up to the last non-zero value and discard the rest of the block.
4- Map the negative values to positive values using the mapping operation.
5- Move the non-zero values of the resulting vector into a new vector and replace their locations in the original vector with ones.
6- Apply the run-length encoder to the original vector (which now contains only zeroes and ones) to reduce the long run sequences.
7- Find the mean value of each of the two resulting vectors (the non-zero vector and the run vector).
8- For each vector, move the values that are greater than or equal to its mean into a new vector and replace their original locations with zero.
9- Subtract the mean value from the values in the resulting vectors.
10- Apply adaptive shift coding to each vector to determine the optimal number of bits needed to store each value.
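Steps 4 through 9 above can be sketched as small helper functions. This is an illustrative sketch under stated assumptions: the function names are hypothetical, and the mean is truncated to an integer, a detail the paper does not specify.

```python
import numpy as np

def map_to_nonnegative(v):
    """Step 4: negatives -> positive odd numbers, non-negatives -> even numbers."""
    return [2 * p if p >= 0 else -2 * p - 1 for p in v]

def decompose_zeros(v):
    """Step 5: pull out the non-zero values; mark their positions with ones."""
    nonzero = [x for x in v if x != 0]
    mask = [1 if x != 0 else 0 for x in v]
    return nonzero, mask

def run_lengths(mask):
    """Step 6: run-length encode the 0/1 mask into (bit, count) pairs."""
    runs = []
    for bit in mask:
        if runs and runs[-1][0] == bit:
            runs[-1][1] += 1
        else:
            runs.append([bit, 1])
    return [tuple(r) for r in runs]

def mean_decompose(v):
    """Steps 7-9: split around the mean; large values are stored as (value - mean)."""
    mean = int(np.mean(v)) if v else 0
    small = [x if x < mean else 0 for x in v]
    large = [x - mean for x in v if x >= mean]
    return mean, small, large
```

Each helper feeds the next: the mapped vector is decomposed into a non-zero vector and a 0/1 mask, the mask is run-length encoded, and both resulting vectors are then split around their means before shift coding.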

4.1 Mapping
The mapping process is required to convert the negative values into positive numbers, to remove the negative sign and avoid coding complexity when storing these numbers. This is done by mapping all negative values to positive odd numbers while the positive values are mapped to even numbers. The mapped numbers are produced by applying the following equation [14]:

M_i = 2 P_i          if P_i ≥ 0
M_i = -2 P_i - 1     if P_i < 0

where P_i is the coefficient value in the incoming sequence. The inverse mapping process is implemented using the following equation:

P_i = M_i / 2             if M_i is even
P_i = -(M_i + 1) / 2      if M_i is odd

4.2 Zeroes and Non-Zeroes Decomposition
Each vector is decomposed into two parts: the non-zero values are moved into a separate vector, and their positions in the original vector are replaced with ones, producing a binary vector of zeroes and ones that records where the non-zero values occurred.

4.3 Run Length Encoding (RLE)
RLE is applied to 1D vectors that contain many consecutive zeroes. The size of the vector can be reduced by replacing a sequence of identical values with a single value along with its count. Consider the following sequence of zero and non-zero values:
57, 45, 0, 0, 0, 0, 23, 0, -30, -16, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, ..., 0
The run-length encoding is applied in the following steps:
1- Replace the non-zero values with ones and store them in a separate vector, obtaining:
1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, ..., 0
57, 45, 23, -30, -16, 1
2- Apply run-length encoding to the sequence of zeroes and ones, as follows:
(1,2), (0,4), (1,1), (0,1), (1,2), (0,2), (1,1), (0,EOB)
3- Discard the last zero sequence, because it is already known that the values of the remaining blocks are zeroes:
(1,2), (0,4), (1,1), (0,1), (1,2), (0,2), (1,1)
4- Only the non-zero values and the length of each run are coded and stored:
57, 45, 23, -30, -16, 1 (non-zero values)
2, 4, 1, 1, 2, 2, 1 (run length of each sequence)

4.4 Mean Value Decomposition
In this stage, the vectors resulting from the previous stages (the non-zero vector and the run vector) are further separated into two sub-vectors according to the mean value. The first vector contains the values less than the mean, and the second vector contains the values greater than or equal to the mean; the mean is then subtracted from the second vector to reduce its scaling range.
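For completeness, a decoder sketch for this run-length scheme is shown below (an illustrative sketch, not the paper's decoder). The discarded trailing zero run is restored by padding the output to the known vector length.

```python
def rle_decode(values, runs, total_len):
    """Rebuild the original vector from the non-zero values and (bit, count) runs.
    values: non-zero values in order; runs: alternating (1/0, count) pairs."""
    out, it = [], iter(values)
    for bit, count in runs:
        # A '1' run consumes values from the non-zero vector; a '0' run emits zeroes.
        out.extend([next(it) if bit == 1 else 0 for _ in range(count)])
    out.extend([0] * (total_len - len(out)))  # restore the implied trailing zeroes
    return out
```

Feeding the decoder the run list and non-zero vector of the worked example reproduces the original sequence exactly.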

4.5 Adaptive Shift Coding
Shift coding is a variable-length coding (VLC) technique. It efficiently removes coding redundancy in a sequence by assigning an optimal number of bits per value according to the values' probabilities: the most frequently occurring coefficients are assigned fewer bits than the less frequently occurring ones. Shift coding works by determining the minimum total number of bits needed to store all values in the input sequence. The following equations calculate the total number of consumed bits according to the nature of the input data [15]:

T_Bits = (n_s + 1) Σ_{i=0..2^{n_s}-1} His(i) + (n_l + 1) Σ_{i=2^{n_s}..max} His(i)          (16)

T_Bits = n_s Σ_{i=0..2^{n_s}-2} His(i) + (n_s + n_l) Σ_{i=2^{n_s}-1..max} His(i)          (17)

where n_s is the number of bits required for short values, n_l is the number of bits required for long values, and His() represents the histogram array of the input sequence. Equation (16) is applied to find the minimum T_Bits using the shift keys {1, 0} as key indicators (i.e., the values 1 and 0 are stored as an additional bit indicating whether the value is stored with a short or long number of bits). Equation (17) is employed in the case of using the value 2^{n_s} - 1 as a key value, without the need for extra key indicators. The values of n_s and n_l are chosen such that they lead to the minimum T_Bits. After determining the optimal values of n_s and n_l, the most frequently appearing data are stored with a short number of bits (n_s) while the less frequently appearing data are stored with a long number of bits (n_l), along with the key indicator.
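The parameter search can be sketched as a brute-force scan over candidate (n_s, n_l) pairs. This sketch assumes the key-indicator variant (one extra bit per value) and non-negative input values (i.e., after the mapping stage); the function name and the bit-width limit are assumptions.

```python
def best_shift_params(values, max_bits=16):
    """Search (n_s, n_l) minimizing total bits, one key-indicator bit per value.
    Values are assumed non-negative (post-mapping)."""
    best = None
    vmax = max(values)
    for ns in range(1, max_bits):
        for nl in range(ns + 1, max_bits + 1):
            if vmax >= 2 ** nl:
                continue  # the long code must be able to represent the largest value
            # Short values (< 2^ns) cost ns+1 bits; long values cost nl+1 bits.
            total = sum((ns + 1) if v < 2 ** ns else (nl + 1) for v in values)
            if best is None or total < best[0]:
                best = (total, ns, nl)
    return best  # (total_bits, n_s, n_l)
```

For a sequence dominated by small values with one large outlier, the search assigns a short code just wide enough for the common values and a long code for the outlier.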

5. Evaluation Metrics
The compression ratio (CR) is defined as the ratio between the size of the original image and the size of the compressed image; the sizes can be measured in bits, bytes, kilobytes, etc. The higher the CR value, the better the compression technique used to compress the image [16]:

CR = original image size / compressed image size          (18)

Mean square error (MSE) is one of the "full-reference" evaluation metrics and the simplest estimator of image quality. The MSE is computed by averaging the cumulative squared intensity differences between the original image and the reconstructed image. An MSE value close to zero is better; the reconstructed image quality is poor when the MSE is large, so the MSE must be as low as possible for effective compression. The MSE is calculated through the following equation [17]:

MSE = (1 / (m n)) Σ_{i=1..m} Σ_{j=1..n} [X(i, j) - Y(i, j)]²          (19)

where X is the original image, Y is the reconstructed image, and the dimension of the images is m x n pixels.
Peak signal-to-noise ratio (PSNR) is the ratio between the maximum signal power, taken from the original image, and the power of the distorting noise, obtained from the MSE. The PSNR approximates the human visual response to image quality: the higher the PSNR, the better the quality of the reconstructed image. Typical values for lossy image compression are between 28 dB and 40 dB. The PSNR is given by the following equation [18]:

PSNR = 10 log10( MAX² / MSE )          (20)

where MAX² is the power of the maximum intensity value in the original image X. Bits per pixel (BPP) is defined as the number of bits required to store each pixel of the image; pixels must be coded efficiently to reduce redundancy and hence the storage requirement. The BPP is calculated using the following equation [19]:

BPP = B / (M x N)          (21)

where B is the number of bits after compression and M x N is the total number of pixels in the image. The compression gain (CG) is defined as the amount of compression gained after the image is compressed. The CG of a digital image is calculated by the following equation:

CG = (1 - 1 / CR) x 100%          (23)

The structural similarity index measure (SSIM) aims to measure quality by capturing the similarity of two images; three aspects of similarity (luminance, contrast, and structure) are determined and their product is taken.
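The scalar metrics above can be sketched directly (an illustrative sketch; the function names are not from the paper):

```python
import numpy as np

def mse(x, y):
    """Mean squared intensity difference between two equal-sized images."""
    return float(np.mean((x.astype(float) - y.astype(float)) ** 2))

def psnr(x, y, max_val=255):
    """Peak signal-to-noise ratio in dB (undefined for identical images)."""
    return 10 * np.log10(max_val ** 2 / mse(x, y))

def compression_ratio(original_size, compressed_size):
    """CR: how many times smaller the compressed representation is."""
    return original_size / compressed_size

def bpp(compressed_bits, width, height):
    """Bits per pixel of the compressed representation."""
    return compressed_bits / (width * height)

def compression_gain(original_size, compressed_size):
    """CG as a percentage of storage saved."""
    return (1 - compressed_size / original_size) * 100
```

For example, compressing an image to 3.4% of its original size corresponds to a compression gain of 96.6%, the figure quoted for the Lena image in the abstract.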

6. Test Materials
A set of standard true-color images has been utilized to test and evaluate the system performance. These images are in "Bitmap" format, in which each pixel is stored as 24-bit true color. The image files are "Lena.bmp", "Barbara.bmp", and "Peppers.bmp" (smooth images) and "Baboon.bmp" (a sharp-edged image). All test images are 256x256 pixels in width and height. Figure-5 presents these images.

Figure 5 - Test color images: (A) color Lena, (B) color Barbara, (C) color Peppers, (D) color Baboon.

7. Experimental Results
In this section, a set of result tables is presented for evaluation. Each table shows the effect of each control parameter used in the proposed system on the resulting compressed image. The test results are evaluated and compared based on fidelity criteria (i.e., MSE, PSNR, SSIM, CR, CG, and BPP). The PSNR values in each table range from 28 dB (low quality) to 40 dB (very high quality). The real time of the encoding and decoding processes is also presented. The adopted system and all additional programs used for testing were built using Visual Studio (C# programming language).
The effect of the following control parameters has been investigated to test the results of the proposed system:
1. BS: the block size used in the DCT.
2. Thr.: the threshold value used to determine the importance of each block.
3. Q0, Q1, and α: the quantization control parameters used to calculate the quantization step value.
The selection of each parameter depends on the characteristics of the tested image; the range of each control parameter is shown in the table below.

Comparisons with the JPEG Standard
Table 6 shows the compression results (in terms of PSNR, SSIM, CR, and BPP) attained by the proposed scheme alongside those of the standard encoder when applied to the color Lena, color Barbara, color Peppers, and color Baboon images, respectively, taking into consideration that different