Video Colorization Methods: a Survey

Nowadays, a huge number of video colorization methods exist. This is because colorization is inherently ambiguous: a single gray value must be mapped to three corresponding values (RGB). In this paper, some of these methods are presented and discussed. Different comparisons are then established between them, and the results demonstrate the efficiency of each method.


Introduction
Colorization means coloring a gray-scale image or video. It involves deriving three values from only one value, such as deriving red, green, and blue values from a gray-scale value. Therefore, the mapping is not unique and colorization is ambiguous; it needs user interaction or external information. Thus, colorization can be performed in different ways, which results in a large number of colorization methods [1]. Many applications have used colorization in a variety of ways, for example, modification and enhancement of color images and videos, restoration of old movies, etc. [2]. Manual colorization is a time-consuming and expensive process, especially when the video consists of a very large number of images. Thus, it is better to use an automatic colorization algorithm [2].
In still-image colorization, the image is first segmented into regions, and then a color is assigned to each region. Fuzzy or complex region boundaries, such as the boundary between a subject's face and her hair, cannot be identified using automatic segmentation; in such cases, these boundaries must be identified manually. Colorization of movies involves tracking regions across the frames of a shot. Non-rigid regions cannot be tracked using existing tracking algorithms, so manual intervention is again required [3].

A-Monem and Hammood, Iraqi Journal of Science, 2020, Vol. 61, No. 3, pp: 675-686

While the quality of colorization outputs is currently very high, the complexity of colorization methods has also increased [4]. Semi-automatic colorization typically consists of the following steps [4]:
1. The user indicates scribbles.
2. The user observes the outputs.
3. The user modifies or inserts additional scribbles in order to achieve the desired final effect.
The main goals of video colorization methods are [4]:
1. Simplicity of implementation.
2. Reduction of computational complexity.
3. Preservation of high output quality.
The objective of this paper is to briefly describe some methods for colorizing videos and to compare them in order to identify the best one(s).

Classification of colorization methods
There are different classifications of colorization methods; some of them are described below.
Sykora [5] classifies video colorization methods into the following approaches:
1. Luminance keying: It uses a user-defined look-up table to transform each gray value into a specified hue, brightness, and saturation. One problem of this method is that pixels with the same intensity located in different parts of the image cannot receive different colors; to solve this problem, segmentation is used. Another problem is luminance fluctuation (assigning different colors to similar intensities), which is common in old movies. R. C. Gonzalez et al. [6] proposed a method that handles this problem. Luminance keying is extensively used to colorize videos.
2. Color-by-example: Welsh [7] proposed a technique that uses local textural information to resolve the ambiguity of the color-to-intensity assignment. One problem of this method is that it is not suitable for cartoons, where the image consists of a set of homogeneous regions.
3. Motion estimation: Used when changes between two consecutive frames are small. Optical flow estimates pixel-to-pixel correspondences, and chromatic information is then transferred directly between corresponding pixels. When optical flow estimation fails (motion between two consecutive frames is large), many key-frames are colorized manually to cover the changes and correct the colorization. The method proposed by Z. Pan et al. [8] is an example of these methods.
4. Color propagation: Used when homogeneity in the color domain indicates homogeneity in the gray-scale domain and vice versa. Color is propagated from several user-defined seed pixels to the rest of the image. An example is the method proposed by Levin et al. [3].
5. Segmentation: Chen [9] divides the original gray image into a set of layers using manual image segmentation.
Then, Bayesian image matting is used to estimate the alpha channel of each layer. This allows applying colorization (or other techniques) to each layer separately and then reconstructing the final image using alpha-blending.
Popowicz [4] classifies video colorization methods into the following approaches:
1. Automatic: These methods compare a gray-scale image with a given color reference and try to estimate the color of each image region according to this comparison. An example is the method proposed by Z. Zhen et al. [10].
2. Semi-automatic: These methods require manual intervention by the user, who inserts color scribbles that serve as starting points for the color flooding and also indicate the regions of a given color. An example is the method proposed by T. Horiuchi et al. [11].
Pierre [12] classifies video colorization methods into the following approaches:
1. Three-dimensional diffusion: These methods use natural temporal consistency to compute the chrominance channels of the video and can deal with occlusion and dis-occlusion problems. Because of the difficult interactivity and computational burden, the user processes and checks the whole video sequence after scribbling, and concatenating colorized sequences may produce temporal inconsistencies. An example is the method proposed by B. Sheng et al. [13].
2. Frame-to-frame propagation: Colors are propagated from one frame to its adjacent frames until the whole video sequence is colorized. All these methods use specific motion estimation and are subject to inaccuracy, since results are re-used and unsuitable colors may propagate. An example is the method proposed by Welsh [7].
Guillaume classifies video colorization methods into the following approaches:
1. Color transferring: This uses the luminance keying approach. Color is transferred to the gray image using the luminance and textures of either another image or a color-generator image. These methods give acceptable colorization performance, but the input image must have distinct textures or luminance values across object boundaries. An example is the method proposed by E. Reinhard et al. [14].
2. Propagation based: The user assigns colors to some pixels (source pixels), and those colors are then propagated to the remaining pixels. These methods assume that the image geometry is provided by the gray-scale information. They generally perform more reliably than the previous approaches, but they may cause color blurring, and their performance may be significantly affected by the location of the scribbles. An example is the method proposed by G. Sapiro [15].
Cheng [16] classifies video colorization methods into the following approaches:
1. Scribble-based: The user provides scribbles in the target gray images, which is time-consuming, especially for a novice user. An example is the method proposed by Q. Luan et al. [17].
2. Example-based: These methods use a similar reference image to transfer color information to the target gray-scale image. One problem is that the user may find it difficult to locate a suitable reference image. An example is the method proposed by R. Irony et al. [18].

Colorization Methods
Generally, video colorization methods can be classified into two categories: first, methods that depend on the user's experience (user scribbles); second, methods that depend on one or more similar images. Some of these methods are described in the following subsections.

Colorization using Optimization
Levin's method [3] is based on optimization. The main steps of this method are:
1. The user inserts some scribbles inside some frames (constraints).
2. The frame is converted to the YUV model.
3. The squared difference between a pixel's color and a weighted average of the colors in a small neighborhood (a quadratic cost function) is minimized. The assumption is that nearby pixels in space-time that have similar gray levels also have similar colors; thus, pixels in the neighborhood with similar intensities receive more weight than others.
4. Minimization yields a large sparse system of linear equations, which is solved using a multigrid solver.
Figure-1 shows an example of coloring a video consisting of 83 frames, in which only 7 frames are marked. This method uses a small number of color scribbles, but colors may spread wrongly over different regions because edges between regions are not considered [19].
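Steps 3 and 4 above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it constrains each unscribbled pixel to the gray-level-similarity-weighted average of its 4-neighbors and solves the resulting sparse system with a direct solver (SciPy's `spsolve`) rather than the multigrid solver the paper uses. The function and parameter names are illustrative.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def colorize_channel(gray, scribble_mask, scribble_vals, sigma=0.1):
    """Solve a Levin-style quadratic cost for one chrominance channel.

    gray: HxW luminance in [0, 1]; scribble_mask: HxW bool (True = scribbled);
    scribble_vals: HxW chrominance values, valid where the mask is True.
    """
    H, W = gray.shape
    n = H * W
    idx = np.arange(n).reshape(H, W)
    rows, cols, vals = [], [], []
    b = np.zeros(n)
    for y in range(H):
        for x in range(W):
            r = idx[y, x]
            rows.append(r); cols.append(r); vals.append(1.0)
            if scribble_mask[y, x]:
                b[r] = scribble_vals[y, x]   # hard constraint: U_r = scribble
                continue
            nbrs = [(y + dy, x + dx)
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1))
                    if 0 <= y + dy < H and 0 <= x + dx < W]
            # weights favor neighbors whose gray level is similar
            w = np.array([np.exp(-(gray[y, x] - gray[ny, nx]) ** 2
                                 / (2 * sigma ** 2)) for ny, nx in nbrs])
            w = w / w.sum()
            for (ny, nx), wi in zip(nbrs, w):
                rows.append(r); cols.append(idx[ny, nx]); vals.append(-wi)
    A = sparse.csr_matrix((vals, (rows, cols)), shape=(n, n))
    return spsolve(A, b).reshape(H, W)
```

Because every unscribbled pixel is a convex combination of its neighbors, the solution stays within the range of the scribbled values, which is why colors can still leak across unmarked region boundaries.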

Colorization using Chrominance Blending
Yatziv's method [1] is based on luminance-weighted chrominance blending and fast intrinsic (geodesic) distance computations. The main steps of this method are:
1. The user inserts some scribbles into only one frame.
2. The frame is converted to the YCbCr color space. The method assumes that if two points are very close to each other, then they must have similar chrominance.
3. The geodesic distance is computed from each uncolored pixel to several of the nearest seed pixels with different assigned colors.
4. The final color is computed as a weighted average of the seed colors, where each weight decreases with the corresponding geodesic distance.
This method can be extended to video using 3D geodesic distances. The results of this method on images and videos are shown in Figures-(2, 3), respectively. If scribbles are incorrectly positioned, some regions of the image will not be colorized or the final image will not look natural. Also, if the luminance of a scribble is too different from the luminance of the gray-image pixels at the same position, erroneous color propagation and color blurring will appear in the results [2].
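The blending in step 4 can be sketched as below, assuming the intrinsic distance maps from step 3 have already been computed. The decay exponent `blend_factor` and the function name are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def blend_chrominance(distances, seed_colors, blend_factor=2.0):
    """Weighted-average blend of seed chrominances.

    distances: (k, H, W) precomputed intrinsic (geodesic) distance from each
    of k scribbles to every pixel; seed_colors: (k, 2) Cb/Cr per scribble.
    The weight of a seed decays with its distance, so nearby scribbles dominate.
    """
    w = 1.0 / (distances + 1e-6) ** blend_factor
    w = w / w.sum(axis=0, keepdims=True)            # normalize over seeds
    return np.einsum('khw,kc->hwc', w, seed_colors)
```

A pixel lying much closer (in the geodesic sense) to one scribble than to the others receives essentially that scribble's chrominance; pixels between scribbles receive a smooth blend.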

Colorization using Isolines
Popowicz's method [4] depends on isolines. The main steps of this method are:
1. The user inserts some scribbles into a number of frames in the video sequence.
2. A distance map (assignment) is created for each uncolored pixel in the gray-scale image.
3. The isoline distance is then computed for each gray-image pixel using a double-scan method.
4. Finally, the color of each pixel is determined using chrominance-weighted averaging performed in the YCbCr color space.
Figure-4 shows the results obtained from this method. Its main advantages are that it does not depend on the number of user scribbles and that it is simple to implement; these two advantages make the method very attractive [4].
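A double-scan (two-pass) distance computation of the kind used in step 3 can be sketched as follows. This is a generic chamfer-style approximation with an assumed luminance-difference cost, not the paper's exact isoline distance; the cost term is an illustrative choice.

```python
import numpy as np

def double_scan_distance(gray, seed_mask):
    """Approximate a luminance-weighted distance from the scribbled pixels
    using one forward and one backward raster scan."""
    H, W = gray.shape
    d = np.where(seed_mask, 0.0, np.inf)

    def relax(y, x, ny, nx):
        # try to reach (y, x) from already-scanned neighbor (ny, nx)
        if 0 <= ny < H and 0 <= nx < W:
            cost = 1.0 + abs(gray[y, x] - gray[ny, nx])  # penalize edges
            if d[ny, nx] + cost < d[y, x]:
                d[y, x] = d[ny, nx] + cost

    for y in range(H):                 # forward: top-left to bottom-right
        for x in range(W):
            relax(y, x, y - 1, x)
            relax(y, x, y, x - 1)
    for y in range(H - 1, -1, -1):     # backward: bottom-right to top-left
        for x in range(W - 1, -1, -1):
            relax(y, x, y + 1, x)
            relax(y, x, y, x + 1)
    return d
```

Two raster scans suffice for this monotone cost, which is what makes the double-scan approach cheap compared with a full shortest-path search.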

Colorization using prioritized source propagation
Heu's method [2] is based on prioritized source propagation: the non-source pixel with the highest priority is colorized first by interpolating its color from neighboring pixels, and this process is repeated until all non-source pixels are colorized. The main steps of this method are:
1. The user inserts some scribbles into the first frame.
2. A color accuracy in the range [0, 1] is defined for each gray-image pixel, indicating its reliability: the accuracy is 1 if the pixel is a scribbled color pixel (a source pixel) and 0 otherwise (a non-source pixel).
3. The priorities of the non-source pixels are defined.
4. Each non-source pixel is colorized using its neighbors.
5. The accuracy of the colored pixel is updated to 1 (it becomes a source pixel), and the priorities of the neighboring pixels are updated.
6. The non-source pixels are colorized in priority order until all pixels are colorized.
7. For video colorization, frames are colorized sequentially using motion estimation from the previous frame.
Figure-5 shows the results obtained from this method.
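The propagation loop of steps 3-6 can be sketched with a priority queue. This is a simplified single-channel illustration: the priority here is the smallest luminance gap to an already-colored neighbor, a stand-in for the paper's reliability-based priority, and all names are illustrative.

```python
import heapq
import numpy as np

def prioritized_propagation(gray, src_mask, src_color):
    """Colorize non-source pixels in priority order from scribbled sources."""
    H, W = gray.shape
    color = np.where(src_mask, src_color, 0.0)
    accuracy = src_mask.astype(float)          # 1 = source, 0 = non-source
    heap = []

    def neighbors(y, x):
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            if 0 <= y + dy < H and 0 <= x + dx < W:
                yield y + dy, x + dx

    def push(y, x):
        # small luminance gap to a colored neighbor -> colorize early
        gaps = [abs(gray[y, x] - gray[ny, nx]) for ny, nx in neighbors(y, x)
                if accuracy[ny, nx] == 1.0]
        if gaps:
            heapq.heappush(heap, (min(gaps), y, x))

    for y in range(H):
        for x in range(W):
            if accuracy[y, x] == 0.0:
                push(y, x)
    while heap:
        _, y, x = heapq.heappop(heap)
        if accuracy[y, x] == 1.0:              # stale entry, already colored
            continue
        nb = [(ny, nx) for ny, nx in neighbors(y, x) if accuracy[ny, nx] == 1.0]
        if not nb:
            continue
        w = np.array([np.exp(-abs(gray[y, x] - gray[ny, nx])) for ny, nx in nb])
        color[y, x] = np.dot(w, [color[ny, nx] for ny, nx in nb]) / w.sum()
        accuracy[y, x] = 1.0                   # pixel becomes a source pixel
        for ny, nx in neighbors(y, x):         # update neighbor priorities
            if accuracy[ny, nx] == 0.0:
                push(ny, nx)
    return color
```

Stale heap entries are simply skipped when popped, which avoids the need for an explicit decrease-key operation.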

Colorization using competitive propagation paths
Kawulok's method [20] has the following steps:
1. The user inserts some scribbles into some frames.
2. Propagation paths from each scribble to every pixel in the image are determined so as to minimize the cost of propagating the desired chrominance along the path; Dijkstra's algorithm performs this minimization.
3. Propagation paths are computed using two types of distance, a gradient distance and a plain distance, and a competitive approach selects between them: the method keeps the smaller of the two distances.
4. Chrominance blending is performed in the YCbCr color space to color each pixel.
Figure-6 shows the results of this method.
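Steps 2-3 can be sketched as two Dijkstra passes with different edge costs, keeping the smaller result per pixel. The specific cost functions here (a constant step cost versus a luminance-difference cost) are a simplified reading of the gradient and plain distances, and the function names are illustrative.

```python
import heapq
import numpy as np

def dijkstra_dist(gray, seeds, cost_fn):
    """Shortest-path distance from a set of seed pixels on the 4-connected
    pixel grid, with per-step cost given by cost_fn."""
    H, W = gray.shape
    d = np.full((H, W), np.inf)
    heap = []
    for y, x in seeds:
        d[y, x] = 0.0
        heapq.heappush(heap, (0.0, y, x))
    while heap:
        dist, y, x = heapq.heappop(heap)
        if dist > d[y, x]:
            continue                       # stale entry
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < H and 0 <= nx < W:
                nd = dist + cost_fn(gray[y, x], gray[ny, nx])
                if nd < d[ny, nx]:
                    d[ny, nx] = nd
                    heapq.heappush(heap, (nd, ny, nx))
    return d

def competitive_distance(gray, seeds):
    plain = dijkstra_dist(gray, seeds, lambda a, b: 1.0)       # plain distance
    grad = dijkstra_dist(gray, seeds, lambda a, b: abs(a - b)) # gradient distance
    return np.minimum(plain, grad)     # competitive: keep the smaller one
```

On smooth regions the gradient distance wins (near-zero cost), while across strong edges the plain distance can win, which is the intent of the competitive selection.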

Colorization by example
Irony's method [18] transfers color from a segmented example image to a gray-scale image. The main steps of this method are:
1. A partially segmented reference color image is used; "partially" means that each region must be uniform in texture and color (the segmentation need not cover the whole image). Such segmentation can be performed automatically or manually by the user.
2. A training set containing the reference-image luminance and the partial segmentation is provided as input to a supervised learning algorithm.
3. The region used as a color reference for each gray-image pixel is determined by voting among the pixel's nearest neighbors in feature space.
4. A confidence measure determines the color of each pixel: color is transferred only to pixels whose confidence in their label is sufficiently large.
5. The colored pixels are provided as constraints to Levin's optimization [3], so the classification and color-transfer stages may be viewed as a method for automatic generation of color micro-scribbles.
Figure-7 shows the results obtained from this method.
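The voting of steps 3-4 can be sketched as a plain k-nearest-neighbor vote, with the vote fraction serving as a simple confidence score. The actual paper uses richer texture features and a more elaborate confidence measure; this sketch and its names are illustrative.

```python
import numpy as np

def knn_vote(features, train_feats, train_labels, k=5):
    """Label each feature vector by majority vote among its k nearest
    training pixels; confidence = fraction of neighbors that agree."""
    labels, confidence = [], []
    for f in np.atleast_2d(features):
        dist = np.linalg.norm(train_feats - f, axis=1)
        nn = train_labels[np.argsort(dist)[:k]]      # labels of k nearest
        vals, counts = np.unique(nn, return_counts=True)
        best = counts.argmax()
        labels.append(vals[best])
        confidence.append(counts[best] / k)
    return np.array(labels), np.array(confidence)
```

Pixels whose confidence falls below a threshold would be left unlabeled, to be filled in later by the micro-scribble propagation stage.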

Colorization by transferring color
The main steps of Welsh's method [7] are:
1. A similar color image is used as input to the method.
2. Both images are converted to the lαβ color space.
3. Each pixel in the gray-scale image is visited in scan-line order, and the best matching sample in the color image is selected using neighborhood statistics (the standard deviation of the neighbors' luminance values).
4. The best matching color for each gray-scale pixel is selected based on the neighborhood statistics and the weighted average of luminance.
5. The α and β chromaticity values are transferred to the target pixel while the original luminance value is retained.
Figure-8 shows the results obtained from this method.
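Steps 3-5 can be sketched as below. This is a simplification: the full method samples a jittered grid of the source image, while this sketch samples all interior source pixels, uses a 3x3 window, and stores generic (α, β)-like channels in `src_ab`; the 50/50 weighting and all names are assumptions.

```python
import numpy as np

def transfer_color(target_gray, src_lum, src_ab):
    """For each target pixel, pick the source sample whose luminance and
    3x3 neighborhood standard deviation best match, then copy its
    chromaticity while keeping the target's own luminance."""
    H, W = src_lum.shape
    ys, xs = [a.ravel() for a in np.mgrid[1:H - 1, 1:W - 1]]
    samp_lum = src_lum[ys, xs]
    samp_std = np.array([src_lum[y - 1:y + 2, x - 1:x + 2].std()
                         for y, x in zip(ys, xs)])
    th, tw = target_gray.shape
    pad = np.pad(target_gray, 1, mode='edge')
    out = np.zeros((th, tw, src_ab.shape[2]))
    for y in range(th):                       # scan-line order
        for x in range(tw):
            local_std = pad[y:y + 3, x:x + 3].std()
            # 50/50 weighting of luminance match and neighborhood statistics
            score = (0.5 * np.abs(samp_lum - target_gray[y, x])
                     + 0.5 * np.abs(samp_std - local_std))
            best = int(np.argmin(score))
            out[y, x] = src_ab[ys[best], xs[best]]
    return out
```

Only chromaticity is copied; recombining `out` with `target_gray` as the luminance channel yields the colorized result.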

Colorization using machine learning
Z. Frenette [22] describes how to use machine learning for video colorization. The main steps are:
1. A small training set of colored images similar to the gray-scale image is chosen.
2. Each image in the training set is converted from RGB to Lab.
3. The size of the color space of each image is reduced through a process called color-space quantization, performed on the (a, b) components of each pixel.
4. Each image is classified into k clusters using the k-means method.
5. The k clusters are colorized using either a support vector machine or linear logistic regression.
Figure-9 shows the results obtained from this method (sample coloring of a small food tray). One problem of these methods is the inability to differentiate between regions that have similar texture but different colors, as shown in Figure-10.
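The quantization/clustering of steps 3-4 can be sketched with plain k-means on the (a, b) pairs. This is a generic Lloyd's-algorithm sketch with illustrative names, not the report's exact procedure.

```python
import numpy as np

def quantize_ab(ab_pixels, k=2, iters=20, seed=0):
    """Cluster the (a, b) chrominance pairs of the training pixels into
    k color bins, each represented by its mean chrominance."""
    ab = np.asarray(ab_pixels, dtype=float)
    rng = np.random.default_rng(seed)
    centers = ab[rng.choice(len(ab), k, replace=False)].copy()
    labels = np.zeros(len(ab), dtype=int)
    for _ in range(iters):
        # assign every pixel to its nearest bin center
        dists = np.linalg.norm(ab[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):        # keep empty bins unchanged
                centers[j] = ab[labels == j].mean(axis=0)
    return centers, labels
```

Each pixel's chrominance is then replaced by its bin center, turning colorization into a k-class classification problem over luminance/texture features.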

Colorization using global and local priors
Iizuka's method [23] uses Convolutional Neural Networks (CNNs). The main steps of this method are shown in Figure-

Comparisons
Some of the video colorization methods mentioned in Section 3 are compared and the best method is determined. These comparisons are described in the following subsections.

From Figure-13, CUI gives better results than the other methods.

Figure 15 shows close-ups of the results in Figure 14 for CUO, CUP, and CUE. From Figures-(14, 15), CUE gives better results than the other two methods.

Comparison 4
In this subsection, a comparison is made between Deep Colorization (DC) [16] and Colorization Using Global and Local Image Priors (CUG) [24]. The results of this comparison are shown in Figure-17.

Conclusions
In this paper, multiple video colorization methods have been compared. These methods fall into two classes: the first involves methods that use scribbles inserted by the user, so the result depends on the user's experience; the other involves methods that use one or more similar images stored in a database. Many classifications are used to categorize video colorization methods. CUI gives better results than CUO, CBC, and DCT; CUE gives better results than CUO and CUP; CUC gives better results than CUO and CBC; and CUG gives better results than DC.