Image Compression Using Deep Learning: Methods and Techniques

In recent years, images have been used widely by online social network providers and by numerous organizations such as governments, police departments, colleges, universities, and private companies, and they are held in vast databases. Efficient storage of such images is therefore advantageous, and their compression is an appealing application. Image compression represents the significant information in an image compactly, in a smaller number of bytes, while insignificant information (redundancy) is removed. For this reason, image compression plays an important role in data transfer and storage, especially given the rapid growth in data volumes. It is a challenging task, since there are highly complex, unknown correlations between pixels. As a result, it is hard to find and recover a well-compressed representation for images, and it is also hard to design and test networks that are able to recover images successfully in a lossless or lossy way. Several neural network and deep learning methods have been used to compress images. This article surveys the most common techniques and methods of image compression, focusing on the deep learning auto-encoder.


Lossy Compression
Lossy compression is a way of minimizing data size while maintaining the usable or important data [5]. Lossy compression methods remove redundant bits, minimizing the file size. Because some of the data is lost during compression, it is not possible to convert the compressed file back to the original file; the file recovered after decompression is an approximation of the original, restored according to the compression program's model of the data. Since a lot of redundant data is removed, the compression ratio of lossy methods is usually higher than that of lossless image and video compression; for example, a group of pixels can be approximated by a single value using transform or differential encoding. Formats such as the Joint Photographic Experts Group (JPEG) standard and MPEG Audio Layer III (MP3) are examples of lossy compression [6]. Applications of lossy image compression include transmitting images over networks and general image storage, whereas remote sensing images are usually compressed with lossless methods to preserve image quality [6]. The methods that fall under the lossy compression technique are [7]:

Transformation Coding
Transform coding is one of the standard approaches for compressing natural data such as audio signals or photographic images, and it is typically used in lossy compression, producing a lower-quality version of the input image. In transform coding, the compression technique selects which information to discard, thereby reducing the required bandwidth, and the remaining (residual) information can be compressed further by other methods. When the output is decoded, the result may not be identical to the original, but it is intended to be as close as the application requires [8].
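To illustrate the general idea (not any particular codec), the following sketch applies an 8x8 block Discrete Cosine Transform, discards small coefficients, and reconstructs an approximate image; the block size and threshold are arbitrary choices for the example, and edge regions not covered by full blocks are simply left empty.

    import numpy as np
    from scipy.fft import dctn, idctn  # type-II DCT and its inverse

    def transform_code(img, block=8, threshold=10.0):
        """Toy transform coder: block DCT, zero out small coefficients, invert."""
        h, w = img.shape
        out = np.zeros_like(img, dtype=float)
        for y in range(0, h - h % block, block):
            for x in range(0, w - w % block, block):
                coeffs = dctn(img[y:y+block, x:x+block], norm='ortho')
                coeffs[np.abs(coeffs) < threshold] = 0.0   # discard "insignificant" info
                out[y:y+block, x:x+block] = idctn(coeffs, norm='ortho')
        return out

    # Usage: reconstructed = transform_code(gray_image.astype(float))

The reconstruction is only approximate, which is exactly the lossy trade-off described above: discarding coefficients saves bits at the cost of fidelity.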

Vector Quantization (VQ):
The VQ method extends the fundamental concept of scalar quantization to several dimensions. It builds a dictionary (codebook) of fixed-size vectors called code vectors. Image vectors are then created by partitioning the given image into non-overlapping blocks. For each image vector, the closest matching vector in the dictionary is found, and its index in the dictionary is used to encode the original image vector. Because decoding only requires a fast table lookup, VQ-based coding is especially convenient for multimedia applications [9].
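A minimal sketch of this idea, assuming a grayscale NumPy image and using k-means (scikit-learn) to learn the codebook from the image's own blocks; the 4x4 block size and 256-entry codebook are illustrative choices only, and the image must supply at least as many blocks as codebook entries.

    import numpy as np
    from sklearn.cluster import KMeans

    def vq_encode(img, block=4, codebook_size=256):
        """Toy vector quantizer: codebook of block-vectors plus one index per block."""
        h, w = img.shape
        h, w = h - h % block, w - w % block
        # Partition into non-overlapping blocks, flattened to vectors.
        blocks = (img[:h, :w]
                  .reshape(h // block, block, w // block, block)
                  .swapaxes(1, 2)
                  .reshape(-1, block * block))
        km = KMeans(n_clusters=codebook_size, n_init=4, random_state=0).fit(blocks)
        indices = km.predict(blocks)          # one small index per image block
        return indices, km.cluster_centers_, (h, w, block)

    def vq_decode(indices, codebook, shape):
        h, w, block = shape
        blocks = codebook[indices].reshape(h // block, w // block, block, block)
        return blocks.swapaxes(1, 2).reshape(h, w)

Decoding is just the table lookup `codebook[indices]`, which is why VQ decoders are fast, as noted above.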

Fractal Coding:
The basic concept here is to break the image down into parts using standard image processing methods such as color separation, edge detection, and spectrum and texture analysis, and then to look each part up in a fractal library. The library contains codes called Iterated Function System (IFS) codes, which are compact sets of numbers. Using a systematic procedure, a set of codes is determined for a given image such that, when the IFS codes are applied to a suitable set of image blocks, the generated image is a very close approximation of the original. This approach is highly successful for compressing images with reasonable regularity and self-similarity [10].
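A heavily simplified sketch of the block-matching flavor of this idea (not the full IFS library machinery described above): each small range block is approximated as an affine transform s*D + o of a larger, downsampled domain block from the same image. The block sizes and search strategy are illustrative assumptions only.

    import numpy as np

    def fractal_encode(img, range_size=8):
        """Toy self-similarity coder: for each range block, find the best
        affine match s*D + o among downsampled domain blocks."""
        img = img.astype(float)
        h, w = img.shape
        dsize = range_size * 2
        domains = []
        for y in range(0, h - dsize + 1, dsize):
            for x in range(0, w - dsize + 1, dsize):
                d = img[y:y+dsize, x:x+dsize]
                # Downsample the domain block to the range-block size by averaging.
                d = d.reshape(range_size, 2, range_size, 2).mean(axis=(1, 3))
                domains.append(((y, x), d.ravel()))
        codes = []
        for y in range(0, h - range_size + 1, range_size):
            for x in range(0, w - range_size + 1, range_size):
                r = img[y:y+range_size, x:x+range_size].ravel()
                best = None
                for pos, d in domains:
                    s, o = np.polyfit(d, r, 1)      # least-squares contrast/brightness
                    err = np.sum((s * d + o - r) ** 2)
                    if best is None or err < best[0]:
                        best = (err, pos, s, o)
                codes.append(((y, x), best[1], best[2], best[3]))
        return codes   # one compact (position, scale, offset) code per block

The compressed representation is only the small list of codes; the decoder would reconstruct the image by repeatedly applying these mappings, which works well precisely when the image exhibits the self-similarity mentioned above.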

Lossless Compression
In lossless compression, the process does not cause any loss of data or information. Compression is achieved by building a file with fewer bits, without any loss of data, with the aid of statistical and mathematical methods such as entropy coding, which allow the compressed file to be transformed back into the original uncompressed file. Lossless compression methods are used to compress files where all information is required, as in the case of executable files [11]. Lossless compression is primarily used in applications such as medical imaging, where full image quality is required [11]. Lossless image compression has wide-ranging applications, such as archiving business or medical files and digital radiography, where any loss of data in the original image may result in an incorrect diagnosis. Other lossless compression applications include image storage and transmission, for example of camera images captured by nano-satellites for remote sensing applications such as forest fire monitoring and soil moisture measurement [12]. The following are the methods that come under lossless compression.

Run Length Encoding (RLE):
RLE is one of the simplest methods for image compression. It operates by replacing a sequence (run) of identical symbols with a pair holding the symbol and the length of the run [11]. RLE is used as the fundamental compression method in the one-dimensional (1D) Consultative Committee for International Telephone and Telegraph (CCITT) Group 3 fax standard and, in combination with other methods, in the JPEG image compression standard [12].
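A minimal, self-contained sketch of the run/pair idea described above, operating on any Python sequence of symbols:

    def rle_encode(data):
        """Run-length encode a sequence as [(symbol, run_length), ...]."""
        runs = []
        for symbol in data:
            if runs and runs[-1][0] == symbol:
                runs[-1][1] += 1            # extend the current run
            else:
                runs.append([symbol, 1])    # start a new run
        return [tuple(r) for r in runs]

    def rle_decode(runs):
        out = []
        for symbol, length in runs:
            out.extend([symbol] * length)
        return out

    # Usage: rle_encode([7, 7, 7, 0, 0, 5]) -> [(7, 3), (0, 2), (5, 1)]

RLE pays off only when long runs are common, which is why it is combined with other methods in standards such as JPEG.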

Huffman Encoding:
Huffman coding is one of the most common entropy coding algorithms used for lossless data compression. It uses a variable-length code that assigns short code words to more frequent values or symbols in the data and longer code words to less frequent values. The Huffman algorithm produces minimum-redundancy codes relative to other algorithms and has been used successfully in video, text, and image compression and conferencing systems such as JPEG, Moving Picture Experts Group (MPEG-2), and MPEG-4 [13].
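A compact sketch of the code construction (frequent symbols end up with shorter bitstrings); the heap-based merging below is the textbook procedure, written here only as an illustration.

    import heapq
    from collections import Counter

    def huffman_codes(data):
        """Build a Huffman code table {symbol: bitstring} from symbol frequencies."""
        freq = Counter(data)
        if len(freq) == 1:                       # degenerate case: one distinct symbol
            return {next(iter(freq)): '0'}
        # Each heap entry: (weight, unique_id, {symbol: code_so_far}).
        heap = [(w, i, {sym: ''}) for i, (sym, w) in enumerate(freq.items())]
        heapq.heapify(heap)
        next_id = len(heap)
        while len(heap) > 1:
            w1, _, t1 = heapq.heappop(heap)      # two least frequent subtrees
            w2, _, t2 = heapq.heappop(heap)
            merged = {s: '0' + c for s, c in t1.items()}
            merged.update({s: '1' + c for s, c in t2.items()})
            heapq.heappush(heap, (w1 + w2, next_id, merged))
            next_id += 1
        return heap[0][2]

    # Usage: codes = huffman_codes("abracadabra")  # 'a' receives the shortest code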

Lempel, Ziv and Welch (LZW) Coding:
LZW coding is a universal algorithm for lossless data compression developed by Abraham Lempel, Jacob Ziv, and Terry Welch. It is a dictionary-based coding method in which the dictionary can be either static or dynamic. In static dictionary coding, the dictionary is fixed throughout the encoding and decoding processes, whereas in dynamic dictionary coding the LZW dictionary is updated on the fly. LZW has been used widely in practice, for example as the basis of the compress command on the UNIX platform [13].
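A minimal LZW encoder sketch showing the dynamic dictionary growth described above; for simplicity it seeds the dictionary with the characters actually present in the input rather than a full byte alphabet.

    def lzw_encode(text):
        """LZW encoding with a dynamically grown dictionary (character input)."""
        dictionary = {ch: i for i, ch in enumerate(sorted(set(text)))}
        next_code = len(dictionary)
        current = ''
        output = []
        for ch in text:
            candidate = current + ch
            if candidate in dictionary:
                current = candidate                 # keep extending the match
            else:
                output.append(dictionary[current])  # emit code for longest match
                dictionary[candidate] = next_code   # grow the dictionary dynamically
                next_code += 1
                current = ch
        if current:
            output.append(dictionary[current])
        return output, dictionary

    # Usage: codes, _ = lzw_encode("TOBEORNOTTOBEORTOBEORNOT")

Repeated substrings are replaced by single dictionary indices, which is where the compression comes from.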

Predictive Coding:
In lossless predictive image compression, inter-pixel redundancy is removed by predicting the current pixel value from closely spaced, previously coded pixel values. The prediction error, obtained by subtracting the predicted value from the original value, is what gets encoded [14].
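A tiny sketch of this idea using the simplest possible predictor (the left neighbor); the residuals are typically small and highly compressible, and the decoder recovers the image exactly.

    import numpy as np

    def predictive_encode(img):
        """Toy lossless predictive coder: predict each pixel from its left
        neighbor and keep only the prediction error (residual)."""
        img = img.astype(np.int32)
        pred = np.zeros_like(img)
        pred[:, 1:] = img[:, :-1]              # left-neighbor predictor
        return img - pred                       # residuals, to be entropy coded

    def predictive_decode(residual):
        img = np.zeros_like(residual)
        for x in range(residual.shape[1]):
            prev = img[:, x - 1] if x > 0 else 0
            img[:, x] = prev + residual[:, x]   # undo the prediction column by column
        return img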

Image Compression Techniques:
The most popular compression algorithms are described below.

JPEG Standard:
JPEG was initially developed in 1987 under the supervision of the International Standards Organization (ISO) to establish an appropriate image compression standard, and in 1992 JPEG became an International Standard [14] [15]. JPEG is described in several modes, including baseline, lossless, hierarchical, and progressive modes [16]. Depending on the application requirements, the baseline mode is the most common; it supports only lossy coding and can be extended with different options. JPEG has been used extensively in many applications, for instance digital photography, medical imaging, wireless imaging, document imaging, GIS, pre-press, science and digital cinema, industrial image archives and databases, scanning and printing, monitoring, television, etc. [16].

JPEG-2000:
The JPEG-2000 image compression standard was developed through the ISO and the International Telecommunication Union Telecommunication Standardization Sector (ISO/ITU-T) to complement the JPEG standard by offering improved compression efficiency and new features [15]. The core coding algorithm, JPEG-2000 Part I, became an international standard in December 2000 and provides a broad range of functionalities in a single bit-stream, such as progressive transmission by resolution or quality, random spatial access, and lossless-to-lossy compression. The JPEG-2000 algorithm is based on embedded block coding with optimized truncation applied to the Discrete Wavelet Transform (DWT) [17]. Scalar quantization, context-based arithmetic coding, and post-compression rate allocation further improve the compression performance of JPEG-2000 [17]. Figure 2 shows the general structure of JPEG-2000. The advantages and disadvantages of several algorithms are shown in the table below [18].
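To illustrate only the wavelet side of this design (not EBCOT or rate allocation), here is a minimal one-level 2D Haar DWT in NumPy; actual JPEG-2000 uses the 5/3 or 9/7 wavelets and several decomposition levels, so this is a sketch of the principle, not of the standard.

    import numpy as np

    def haar_dwt2(img):
        """One-level 2D Haar DWT returning approximation and detail sub-bands.
        Assumes the image has even height and width."""
        img = img.astype(float)
        # Transform rows: pairwise averages (low-pass) and differences (high-pass).
        lo = (img[:, 0::2] + img[:, 1::2]) / 2.0
        hi = (img[:, 0::2] - img[:, 1::2]) / 2.0
        # Transform columns of each result.
        LL = (lo[0::2, :] + lo[1::2, :]) / 2.0   # approximation
        LH = (lo[0::2, :] - lo[1::2, :]) / 2.0   # horizontal details
        HL = (hi[0::2, :] + hi[1::2, :]) / 2.0   # vertical details
        HH = (hi[0::2, :] - hi[1::2, :]) / 2.0   # diagonal details
        return LL, (LH, HL, HH)

    # Most of the image energy concentrates in LL; the detail sub-bands can be
    # quantized coarsely, which is where the lossy savings come from.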

Introduction to Deep Learning:
Deep learning is widely used because it has the capability to learn patterns from an input image and then convert them into another pattern with fewer components. A neural network learns by adjusting the weights (synapses) of the interconnections between layers. In general, two different types of learning mechanism can be used to train a neural network: supervised learning and unsupervised learning. One of the preferred unsupervised learning methods used across different research domains is the auto-encoder-based method [19].

Auto Encoder:
An Auto Encoder is an Artificial Neural Network that is used in an unsupervised way to learn an effective coding of data. An Auto Encoder's primary aim is to learn an encoding, or representation, for a dataset; this is usually achieved by teaching the network to ignore noise, performing dimensionality reduction. Along with this reduction process, a reconstruction process is also involved: in the reconstruction step, the AutoEncoder attempts to produce, from the encoding, an output that is as close as possible to the original input. The auto-encoder concept is now common and is also widely used in generative data models. The AutoEncoder compresses the information from the input layer into a short code and then decompresses this code into a form comparable to the original data [20] [21]. Figure 3 demonstrates the Simple AutoEncoder (SAE) architecture with its Encoder and Decoder layers.
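A minimal sketch of this encode/decode structure in PyTorch; the 784-dimensional input and 32-dimensional code are illustrative choices (e.g. for flattened 28x28 grayscale images) and are not taken from the surveyed papers.

    import torch
    import torch.nn as nn

    class SimpleAutoEncoder(nn.Module):
        def __init__(self, input_dim=784, code_dim=32):
            super().__init__()
            # Encoder: compress the input into a short code (the bottleneck).
            self.encoder = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU(),
                                         nn.Linear(128, code_dim))
            # Decoder: reconstruct something comparable to the original input.
            self.decoder = nn.Sequential(nn.Linear(code_dim, 128), nn.ReLU(),
                                         nn.Linear(128, input_dim), nn.Sigmoid())

        def forward(self, x):
            code = self.encoder(x)       # compressed representation
            return self.decoder(code)    # reconstruction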

AutoEncoder Application:
1) Image compression: an AutoEncoder may be used for image compression. In this case, the size of the hidden layer of the AutoEncoder is strictly smaller than the input and output layer sizes.
Training such an AutoEncoder with the same values as input and target, using back-propagation, forces the AutoEncoder to learn a low-dimensional representation of the input data; the bottleneck hidden layer thus compresses the data [22]. In Figure 4 below, a simplified form of the image compression process is shown. 2) Image de-noising: another application of the AutoEncoder is image de-noising. In this task the AutoEncoder is treated as a non-linear filter that can remove the effect of noise in an image: the network is trained by feeding it images corrupted with random (e.g. Gaussian) noise, while the original noise-free image is the output target [22]. In Figure 5 below, a simplified form of the image de-noising process is shown.
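A sketch of the de-noising training setup just described, assuming the SimpleAutoEncoder above and a standard image data loader; the noise level, optimizer, and loader are assumptions for illustration, and setting the noisy input equal to the clean image turns the same loop into the compression setup.

    import torch
    import torch.nn as nn

    def train_denoiser(model, loader, epochs=10, noise_std=0.1, lr=1e-3):
        """Train an autoencoder to map noisy inputs back to clean targets."""
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        criterion = nn.MSELoss()
        for _ in range(epochs):
            for clean, _ in loader:                    # labels are ignored
                clean = clean.view(clean.size(0), -1)  # flatten images to vectors
                noisy = clean + noise_std * torch.randn_like(clean)
                loss = criterion(model(noisy), clean)  # target is the clean image
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
        return model

    # Usage (loader is assumed): train_denoiser(SimpleAutoEncoder(), image_loader)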

AutoEncoder Architecture:
Two different AutoEncoders are introduced in this section: the Simple AutoEncoder (SAE) and the Convolutional AutoEncoder (CAE) [22]. The SAE: the SAE is a three-layer feed-forward network in which consecutive layers are fully connected, i.e. every unit in one layer is linked to every unit in the next layer. The input layer and output layer have the same size as the image. Because the output target is the original image, the AutoEncoder is compelled to learn a compressed representation without losing data [22]. The CAE: the CAE extends the basic structure of the SAE by replacing the fully connected layers with convolutional layers. As in the SAE, the input layer size matches the output layer size, but the encoder network is changed to convolutional layers and the decoder network is changed to transposed convolutional layers [22].
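A minimal CAE sketch along these lines in PyTorch; the 28x28 single-channel input and the channel counts are illustrative assumptions, not the architectures of the works surveyed below.

    import torch.nn as nn

    class ConvAutoEncoder(nn.Module):
        """Convolutional encoder + transposed-convolutional decoder (28x28 inputs)."""
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 28x28 -> 14x14
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 14x14 -> 7x7
            )
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),
                nn.ReLU(),                                             # 7x7 -> 14x14
                nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1),
                nn.Sigmoid(),                                          # 14x14 -> 28x28
            )

        def forward(self, x):
            return self.decoder(self.encoder(x))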

Literature Review:
This section presents a brief study of works related to image compression. Many works have been carried out in the fields of image compression and encryption using the AutoEncoder.
In 2017, H. Kubra Cilingir et al. [23] presented work aimed at finding a well-compressed representation of images and at building and testing networks that can effectively recover it in a lossless or lossy way, using a compression network implemented with Generative Adversarial Networks (GANs), in which visually good images are obtained at high compression rates. The proposed compressive AutoEncoders, which use an entropy coding approach involving convolutional AutoEncoders, were evaluated with the Structural Similarity Index Metric (SSIM) and Peak Signal-to-Noise Ratio (PSNR) performance metrics. The dataset included three test images for both training and evaluation; these images can be found in the GitHub repository for the lossless compression comparison metric [23]. For all training and lossy compression steps, the Canadian Institute For Advanced Research (CIFAR) dataset was used. The results of the Generative Adversarial Network AutoEncoder (GAN-AE) showed great success (PSNR 32.53 and SSIM 0.987). According to SSIM, the Convolutional Neural Network AutoEncoder with fine-tuning (CNN-AE-FT) appears to be very successful, with an index very close to JPEG-2000 (PSNR 33.9 and SSIM 0.99), and it beat the GAN-AE network, which shows that image-specific compression using fine-tuning holds great promise. Also notable is the performance increase of JPEG from 36.1 to 43.6 PSNR and 0.993 SSIM on the High Resolution (HR) image [23]. The drawback of this algorithm is the use of a limited network; alternatives would be to make the network bigger and deeper, to compress the whole image at once rather than dividing it into parts, and to use a deterministic AutoEncoder in all of the networks to encode and decode the images; although the deterministic AutoEncoder is more appropriate here, there is also work on variational AutoEncoders because of their generative power. Lucas Theis et al. (2017) [24] proposed a new way to address the problem of optimizing autoencoders for lossy image compression: a simple but efficient way of dealing with non-differentiability when training AutoEncoders for lossy compression was implemented. The compressive AutoEncoders were trained on high-quality images, 434 images collected from flickr.com, and tested on the widely used Kodak Picture Compact Disc (CD) dataset. The results of the various approaches were evaluated with PSNR and SSIM [25][26][27]. The results illustrate that this method performs similarly to JPEG-2000 in terms of PSNR, slightly worse at low and medium bitrates and slightly better at high bitrates, while the system outperforms all other tested approaches in terms of SSIM; for all techniques, except at very low bitrates, MS-SSIM produces very close ratings [24]. Average Mean Opinion Score (MOS) outcomes at each bit rate for each algorithm are reported (95%); it was found that Convolutional AutoEncoders and JPEG-2000 achieved greater MOS than JPEG [28]. The CAE was also found to outperform JPEG-2000 substantially at 0.375 bpp (less than 0.05) and at 0.5 bpp (less than 0.001) [28]. The drawback of this work is that the compressive AutoEncoders were optimized for specific metrics; a way to improve them is to make the compressive AutoEncoders work with different metrics, such as metrics based on neural networks trained for image classification, which have achieved interesting super-resolution results [28]. Prakash et al.
(2017) [29] presented a model that is able to learn to select multiple objects at any scale and create a map of the salient regions of an image. This provides enough data to implement image compression with variable quality without requiring accurate semantic segmentation. The model was tested on the Kodak Photo CD and the Saliency Benchmark datasets from the Massachusetts Institute of Technology (MIT), and the results over the whole dataset are presented in Table 2. The limitation of this research is that only the JPEG method is used for lossy compression, although other lossy methods such as JPEG-2000 could be used [29]. Cheng et al. (2018) [30] proposed a lossy image compression architecture that exploits the advantages of the CAE to achieve high coding efficiency. Experimental findings show that this approach outperforms conventional image coding algorithms, achieving a 13.7 percent BD-rate decrease on the Kodak database images in comparison with JPEG-2000, while retaining moderate complexity similar to JPEG-2000. To train the CAE network, 5500 images from the ImageNet database were used [31]. An open issue in this research is that perceptual quality metrics, such as MS-SSIM or quality scores predicted by neural networks, need to be incorporated into the loss function in order to increase MS-SSIM performance. Y. Zhang (2018) [32] implemented and compared two AutoEncoders with different architectures: the first is the SAE with one hidden layer, and the other is the Convolutional AutoEncoder (CAE) [33]. These AutoEncoders were compared in two separate applications (image denoising and image compression), and the results show that the CAE works better than the SAE. The chosen dataset is Labelled Faces in the Wild (LFW) [33], which comprises more than 13000 pictures of different people's faces; two types of AutoEncoder are used in this work. In addition, the Variational AutoEncoder (VAE) is another AutoEncoder worth investigating for its distribution modelling, which would be interesting for the results. Zhengxue Cheng et al. (2019) [34] proposed an image compression architecture based on energy compaction using a CAE in order to achieve high-quality coding. For the CAE network training, the ImageNet database consisting of 5500 images was used, cropped into millions of samples [35]. For the MS-SSIM metric, the test results show that the suggested method is better than Better Portable Graphics (BPG) and High Efficiency Video Coding (HEVC) intra coding; it also achieves better performance than existing bit allocation techniques and provides greater coding reliability in comparison with known learned compression techniques at high bitrates [35]. All the results are shown in the Tables below. The limitation of this research is the use of only a small dataset for training with lossless compression and a network design that cannot be reused for other types of data. D. Alexandre et al. (2019) [37] suggested a lossy image compression method using a deep-learning AutoEncoder structure and provided an AutoEncoder-based learned image compressor with the notion of bit allocation and rate-importance maps. The dataset used is from the ImageNet Large Scale Visual Recognition Competition (ILSVRC) [38]. The results compare the compression efficiency of the system with that of JPEG and BPG at 0.15 bpp for PSNR and MS-SSIM [38]; the compression performance results are shown in Table 7. The weakness of this research is the performance gap relative to BPG.
The work in [39] proposed an enhanced hybrid layered image compression system by integrating deep learning with conventional image codecs at the encoder. This compression system is a simplified model of the recent Deep Semantic Segmentation based Layered Image (DSSLIC) scheme, without using a semantic segmentation layer [40].
Two datasets were used, the Kodak Picture CD dataset and the Tecnick dataset [41]; the evaluation results for the Kodak dataset and the Tecnick dataset are shown in Table 8. The issue in this study is that it deals with fixed-size blocks, and this type of dataset image requires a large amount of memory; to increase the performance of the proposed model, the CNN model could be used to realize color (Red Green Blue, RGB) image compression because of its ability to manage multiple image components [45]. The output values for image compression using 4x4 blocks are shown in Table 9.

Conclusion:
The aim of this research is to provide a comprehensive overview of the studies carried out by researchers, and the results they have reached, in the field of image compression using deep learning techniques and neural network algorithms, and to identify the importance of compression, especially in the transmission of data through networks, in saving storage space, and in achieving high transmission speed. Researchers can observe the importance of using deep learning techniques in the field of compression at the present time; one of the most popular techniques used is the deep Auto-Encoder, which remains an important direction for future work.