Linear Feedback Shift Registers-Based Randomization for Image Steganography

Steganography conceals information by embedding data within cover media, and it can be categorized into two main domains: spatial and frequency. This paper presents two distinct methods. The first operates in the spatial domain and uses the least significant bits (LSBs) of the cover pixels to conceal a secret message. The second operates in the frequency domain and hides the secret message within the LSBs of the middle-frequency band of the discrete cosine transform (DCT) coefficients. Both methods enhance obfuscation through two layers of randomness: random pixel embedding and random bit embedding within each pixel. Unlike other available methods, which embed a fixed amount of data in sequential order, these methods embed a random amount of data at random locations, further enhancing the level of obfuscation. This randomness is controlled by a pseudo-random binary key generated through a nonlinear combination of eight Linear Feedback Shift Registers (LFSRs). The experiments use various 512x512 cover images. The first method achieves an average PSNR of 43.5292 dB with a payload capacity of up to 16% of the cover image. In contrast, the second method yields an average PSNR of 38.4092 dB with a payload capacity of up to 8%. The performance analysis demonstrates that the LSB-based method can conceal more data with less visibility; however, it is vulnerable to simple image manipulation. The DCT-based method, on the other hand, offers lower capacity with increased visibility, but it is more robust.


Introduction
Due to the widespread use of digital networks for exchanging and transmitting information through various communication channels, information security has become essential for protecting data from manipulation or theft by an intruder [1]. Among the most widely used methods in information security are encryption and steganography [2,3]. These two methods secure data either by encrypting it with a key or by hiding it in a secret way [4]. Steganography is the science and art of covert communication and involves two procedures. First, the required message is concealed in a particular carrier (e.g., image, audio, or text) that is called a steganographic cover. The second procedure concerns transmitting the cover to the message recipient without drawing suspicion. Fundamentally, the goal of steganography is not to hinder the adversary from decoding a hidden message, but to prevent the adversary from suspecting the existence of covert communication [5]. Historical examples of steganography include invisible inks and messages written on envelopes in the area covered by the postage stamp. Benedict Arnold used codes and steganography to communicate with the British during the American Revolutionary War. His coded messages were written in invisible ink (though now visible) and interspersed between the lines of a normal letter written by his wife, Peggy Shippen Arnold [6].
Digital steganography uses five domains, each with techniques that help to improve the hiding process. These domains are the spatial domain, transform domain, spread spectrum domain, statistical domain, and distortion domain [7]. This paper focuses on the spatial and transform domains. In the spatial domain, the least significant bit (LSB) method is used, where the message is directly embedded in the cover media [8]. In the transform domain, the discrete cosine transform (DCT) method is used, where the cover media is transformed and the message is then embedded in the transformed representation [9].

Related work
Numerous researchers have contributed ideas and approaches in both the spatial and transform domains. This section reviews studies from recent years to outline the advances they have made. Ahd Aljarf and John Filippas introduced an algorithm that embeds data within carefully chosen clean images to generate stego images containing one or more embedded data files. The procedure used several statistical tools to conceal single and multiple data files using masking techniques. A comprehensive analysis then assessed the differences between the initial clean images and their corresponding stego versions [10]. Huda Najeeb and Israa Ali proposed a steganography method that uses the Least Significant Bit (LSB) to embed text files in conjunction with the associated image within a gray-scale image. They also explored the concept of bit planes, the eight separate segments that, when merged, form the actual image [11]. Beenish Siddiqui and Sudhir Goswami described various techniques that use LSB substitution to hide data in images and proposed a new transform-domain approach based on NSGA (Non-dominated Sorting Genetic Algorithm) to improve stego image quality [12]. Mohammed Mahdi et al. summarized current spatial-domain image steganography techniques and analyzed the problems and drawbacks of the methods introduced in the last few years. Some of these works focus on better image quality, while others focus on data hiding capacity or security [13]. Sonali K. Powar et al. concluded that spatial-domain techniques provide good capacity but are not robust against different attacks, whereas frequency-domain techniques provide good robustness with less capacity [14].
Many researchers employ sequential embedding, hiding the same number of bits in each pixel designated for embedding. This regular behavior makes such methods more susceptible to detection. To address this issue, this paper introduces a two-layer randomness strategy: one layer selects the pixel that stores the secret message, and the other determines the number of bits to be hidden in that pixel. This randomness is regulated by a pseudo-random key, as depicted in Figure 1, which is generated by a random key generator. The details of the generating process are discussed in Section 1.2.
Moreover, two embedding methods are proposed. The first is based on LSBs, with details provided in Section 2.2. The second is based on DCT, with specifics outlined in Section 2.3.
The experimental part of this study and the analysis of results are covered in Section 3 and its sub-sections.

The Proposed Methods
Ensuring the secure concealment of information within images is crucial for data security, as previously mentioned. This paper presents two methods for hiding information randomly, both of which rely on a random binary sequence generated with Linear Feedback Shift Registers (LFSRs). This sequence is used as a key for hiding the information. The first proposed method operates in the spatial domain and hides a varying number of bits within the Least Significant Bits (LSBs) of a chosen cover pixel. The second proposed method operates in the transform domain and hides a varying number of bits within the LSBs of a chosen coefficient from the middle-frequency range of the Discrete Cosine Transform (DCT) coefficients.

Random Key Generator
The proposed methods use a random binary sequence as a key, generated by connecting eight LFSRs as shown in Figure 2. The connection follows the equation
$$K = x_1 \oplus (x_2 \oplus (x_3 \oplus (x_4 \oplus (x_5 \oplus (x_6 \oplus (x_7 \oplus x_8))))))$$
where $K$ is the final output of the random key generator, $x_i$ is the output of the $i$-th LFSR ($i = 1, 2, \ldots, 8$), and the symbol $\oplus$ represents the XOR operation. The lengths of these LFSRs are selected to be distinct and to satisfy the condition $\gcd(n_i, n_{i+1}) = 1$, where $n_i$ is the length of the $i$-th LFSR ($i = 1, 2, \ldots, 7$). To achieve a maximal sequence length [15], the feedback polynomial of each LFSR is chosen to be a primitive polynomial of degree $n_i$, which guarantees a period of $2^{n_i} - 1$ for that LFSR [16]. The length of the key sequence should be sufficiently greater than 8 times the length of the message to accommodate the entire plaintext without repeating the key stream. Each LFSR needs a primitive feedback polynomial and an initial state to operate. The selection of these two factors can be made by the sender. The sender chooses a text key $T$, which is converted into a binary sequence $B$. The first $N$ bits of $B$ are used as initial states for the LFSRs and are distributed according to the needs of each of them, where
$$N = \sum_{i=1}^{8} n_i$$
This process is shown in Figure 3; see also Figure 4.
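As an illustration of the generator described above, the following minimal Python sketch seeds eight LFSRs from a text key and XORs their outputs. The register lengths and tap positions in `LFSR_SPECS` are assumptions chosen for this example (each pair stands for a well-known primitive trinomial $x^n + x^k + 1$); they are not the polynomials used in the paper, and the names `LFSR`, `text_to_bits`, and `keystream` are hypothetical.

```python
from math import gcd

# Hypothetical (length n, tap k) pairs, each standing for the primitive
# trinomial x^n + x^k + 1. Illustrative only, not the paper's choices.
LFSR_SPECS = [(7, 1), (9, 4), (10, 3), (11, 2), (15, 1), (17, 3), (23, 5), (25, 3)]

# Adjacent register lengths must be coprime, as the text requires.
assert all(gcd(a[0], b[0]) == 1 for a, b in zip(LFSR_SPECS, LFSR_SPECS[1:]))


def text_to_bits(text):
    """Convert an ASCII string into a list of bits, 8 bits per character."""
    return [int(b) for byte in text.encode("ascii") for b in format(byte, "08b")]


class LFSR:
    """Fibonacci LFSR implementing s[t+n] = s[t] XOR s[t+k]."""

    def __init__(self, n, k, seed_bits):
        assert len(seed_bits) == n
        self.k = k
        self.state = list(seed_bits)
        if not any(self.state):  # the all-zero state would lock the register
            self.state[-1] = 1

    def step(self):
        out = self.state[0]
        feedback = self.state[0] ^ self.state[self.k]
        self.state = self.state[1:] + [feedback]
        return out


def keystream(text_key, length):
    """Seed the eight LFSRs from the text key and XOR their outputs."""
    bits = text_to_bits(text_key)
    total = sum(n for n, _ in LFSR_SPECS)  # number of seed bits required
    if len(bits) < total:
        raise ValueError(f"text key must supply at least {total} bits")
    registers, pos = [], 0
    for n, k in LFSR_SPECS:
        registers.append(LFSR(n, k, bits[pos:pos + n]))
        pos += n
    stream = []
    for _ in range(length):
        bit = 0
        for reg in registers:
            bit ^= reg.step()
        stream.append(bit)
    return stream
```

With these assumed lengths the seed consumes 117 bits, so the text key needs at least 15 ASCII characters; for example, `keystream("correct horse battery", 256)` returns 256 key bits. Replacing an all-zero seed with a nonzero one is a design choice of this sketch, made because an LFSR never leaves the all-zero state.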

LSB-Based Steganography using Variable-Length Embedding
To begin the process, the cover image is converted into a binary sequence $C$ by converting each pixel's integer value into an 8-bit binary number. Simultaneously, the text message to be hidden is transformed into a binary sequence $M$ of length $8L_1$, where $L_1$ is the number of characters in the original text message. This is done by converting the ASCII code of each character into an 8-bit binary number.
Next, the key generated in section 3.1 is divided into blocks $b_j$, each consisting of 2 bits. These blocks are converted into decimal numbers $d_j$, where $0 \le d_j \le 3$. Each $d_j$ then determines how many bits of the sequence $M$ are hidden in the LSBs of the corresponding pixel of the cover image. Note that the number of modified bits differs from one pixel to another; when $d_j = 0$, the pixel is skipped and no bits are hidden in it. This process produces a binary sequence $S$.
Finally, the sequence $S$ is converted to decimal numbers and then reshaped to match the dimensions of the original image. The steps of this method are summarized in Algorithm 1, while Figure 5 provides a block diagram that illustrates the process.
Step 1: Convert the cover image into a binary sequence $C$ by converting each pixel's integer value into an 8-bit binary number.
Step 2: Convert the message into a binary sequence $M$ of length $8L_1$, where $L_1$ is the number of characters in the message. This is done by converting the ASCII code of each character into an 8-bit binary number.
Step 3: Divide the key into blocks $b_j$, each consisting of 2 bits.
Step 4: Convert each $b_j$ into a decimal number $d_j$, where $0 \le d_j \le 3$.
Step 5: For each pixel in $C$:
a. If $d_j = 0$, skip to the next pixel.
b. Otherwise, replace the $d_j$ least significant bits of the pixel's binary value with the next bits from the sequence $M$, up to a maximum of $d_j$ bits. (The resulting sequence is the binary stego sequence $S$.)
Step 6: Convert $S$ into decimal numbers and reshape the resulting sequence to produce the stego-image, which is the same size as the original image.
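The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: pixels are given as a flat list of 8-bit integers, the receiver is assumed to know the message length in bits, and the helper names (`key_to_counts`, `embed_lsb`, `extract_lsb`) are hypothetical.

```python
def text_to_bits(text):
    """Convert an ASCII string into a list of bits, 8 bits per character."""
    return [int(b) for byte in text.encode("ascii") for b in format(byte, "08b")]


def key_to_counts(key_bits):
    """Group key bits into 2-bit blocks and read each block as an integer 0..3."""
    return [2 * key_bits[i] + key_bits[i + 1] for i in range(0, len(key_bits) - 1, 2)]


def embed_lsb(pixels, message_bits, key_bits):
    """Hide message_bits in the LSBs of pixels; the key sets the bits per pixel."""
    stego = list(pixels)
    pos = 0
    for idx, d in zip(range(len(stego)), key_to_counts(key_bits)):
        if pos >= len(message_bits):
            break
        d = min(d, len(message_bits) - pos)  # last chunk may be shorter
        if d == 0:
            continue  # key block 00: this pixel is skipped
        value = int("".join(map(str, message_bits[pos:pos + d])), 2)
        stego[idx] = (stego[idx] >> d << d) | value  # overwrite the d lowest bits
        pos += d
    if pos < len(message_bits):
        raise ValueError("cover or key too short for this message")
    return stego


def extract_lsb(stego, n_bits, key_bits):
    """Recover n_bits message bits using the same key (assumed shared)."""
    bits = []
    for pixel, d in zip(stego, key_to_counts(key_bits)):
        if len(bits) >= n_bits:
            break
        d = min(d, n_bits - len(bits))
        if d == 0:
            continue
        value = pixel & ((1 << d) - 1)
        bits.extend(int(b) for b in format(value, f"0{d}b"))
    return bits
```

For example, with pixels `[255, 255, 255, 255]`, key bits `[0,0, 0,1, 1,0, 1,1]` (counts 0, 1, 2, 3), and message bits `[1, 0, 1, 1, 0, 0]`, the first pixel is skipped and the others receive `1`, `01`, and `100`, giving `[255, 255, 253, 252]`.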

DCT-Based Steganography using Variable-Length Embedding
In the second proposed method, the LSBs of the Discrete Cosine Transform (DCT) coefficients of the cover image are used for hiding the message. When an image block is transformed with the DCT, its frequencies are arranged as low, middle, and high, respectively, from the top-left corner to the bottom-right corner. Most of the visually significant content of the image is represented by the low frequencies.
Thus, changing the low-frequency range can have a significant impact on the final stego-image, and hiding information in this part should be avoided. Meanwhile, the high-frequency range is susceptible to loss of its values when the image is compressed (e.g., JPEG compression), so hiding information in this part can result in information loss. Based on these two points, the middle-frequency area is chosen for hiding the message. As illustrated in Figure 6, the process of generating the message sequence $M$ and the bit counts $d_j$ is identical to that of the first proposed method. However, the second method conceals the message in the LSBs of the middle-frequency DCT coefficients instead of the LSBs of the spatial-domain pixels. The cover image is first split into multiple sub-images of size 8x8. Next, the DCT is computed for each sub-image. Following this, the quantization operation is performed to produce integer values. Then, the message is concealed in the LSBs of the DCT coefficients in the middle-frequency part. The parameter $d_j$ determines the number of bits to be hidden in each selected coefficient. This process results in a set of modified sub-images. Finally, the inverse DCT is computed for each sub-image to generate the stego image. The steps of this method are summarized in Algorithm 2.
Step 1: Convert the cover image into a binary sequence $C$ by converting each pixel's integer value into an 8-bit binary number.
Step 2: Convert the message into a binary sequence $M$ of length $8L_1$, where $L_1$ is the number of characters in the message. This is done by converting the ASCII code of each character into an 8-bit binary number.
Step 3: Divide the key into blocks $b_j$, each consisting of 2 bits.
Step 4: Convert each $b_j$ into a decimal number $d_j$, where $0 \le d_j \le 3$.
Step 5: Split the cover image into multiple sub-images of size 8x8.
Step 6: For each sub-image:
a. Compute the DCT coefficients.
b. Quantize the coefficients to produce integer values.
c. Determine the middle-frequency coefficients and select a set of coefficients to hide the message.
Step 7: For each chosen coefficient:
a. If $d_j = 0$, skip to the next coefficient.
b. Otherwise, replace the $d_j$ least significant bits of the coefficient with the next bits from the sequence $M$, up to a maximum of $d_j$ bits.
Step 8: Compute the inverse DCT for each sub-image to generate the stego-image.
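Steps 6 to 8 can be sketched for a single 8x8 sub-image as follows. Several details here are assumptions made only for illustration: a uniform quantization step `Q_STEP` replaces the quantization table used in the paper, the middle band `MID_BAND` is taken as the coefficients with $6 \le u + v \le 8$, bits are written into the magnitude of each coefficient while its sign is kept, and the function names are hypothetical. The sketch returns real-valued pixels; rounding to 8-bit values is left out.

```python
import math

N = 8
# Orthonormal DCT-II matrix: D[u][x] = c(u) * cos((2x + 1) * u * pi / 16)
D = [[(math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N))
      * math.cos((2 * x + 1) * u * math.pi / (2 * N)) for x in range(N)]
     for u in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(block):
    """2-D DCT of an 8x8 block: D * block * D^T."""
    return matmul(matmul(D, block), transpose(D))


def idct2(coeffs):
    """Inverse 2-D DCT: D^T * coeffs * D."""
    return matmul(matmul(transpose(D), coeffs), D)


Q_STEP = 16  # hypothetical uniform quantization step
# Assumed middle-frequency band: anti-diagonals u + v = 6, 7, 8
MID_BAND = [(u, v) for u in range(N) for v in range(N) if 6 <= u + v <= 8]


def embed_block(block, bits, counts):
    """Embed bits into mid-band coefficients; returns (stego block, bits used)."""
    q = [[round(c / Q_STEP) for c in row] for row in dct2(block)]
    pos = 0
    for (u, v), d in zip(MID_BAND, counts):
        d = min(d, len(bits) - pos)
        if d == 0:
            continue  # skipped coefficient, or message already embedded
        value = int("".join(map(str, bits[pos:pos + d])), 2)
        sign = -1 if q[u][v] < 0 else 1
        q[u][v] = sign * ((abs(q[u][v]) >> d << d) | value)
        pos += d
    stego = idct2([[c * Q_STEP for c in row] for row in q])
    return stego, pos


def extract_block(stego, n_bits, counts):
    """Recover n_bits from the mid-band coefficients of a stego block."""
    q = [[round(c / Q_STEP) for c in row] for row in dct2(stego)]
    bits = []
    for (u, v), d in zip(MID_BAND, counts):
        d = min(d, n_bits - len(bits))
        if d == 0:
            continue
        value = abs(q[u][v]) & ((1 << d) - 1)
        bits.extend(int(b) for b in format(value, f"0{d}b"))
    return bits
```

Because every embedded coefficient is an integer multiple of `Q_STEP`, re-quantizing the unrounded stego block recovers the same integers, which is what makes extraction possible in this sketch.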

Experimental Results and Analysis
The tests and performance evaluation are presented in two parts: the first is a randomness test of the key sequence, and the second is a performance test of the proposed embedding methods.

Evaluating Key Randomness
To verify the statistical characteristics of the key, the SP800-22 test suite developed by the National Institute of Standards and Technology (NIST) is used to assess randomness [17]. SP800-22 is selected because it was used in assessing the AES cipher and is frequently applied in formal certification or approvals. In the tests, a keystream sequence of 1,000,000 bits generated by the proposed keystream generator is examined. Table 1 presents the results of the tests. Each row of the table gives the name of the test, the P-value, and the test result. The results show no deviation from a truly random sequence, as all P-values are greater than the significance level $\alpha = 1\%$.
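To make the P-value criterion concrete, the sketch below implements only the first test of the SP800-22 suite, the frequency (monobit) test. It is an illustration of how one P-value is obtained and compared with the 1% level, not a replacement for the full suite; the function name `monobit_p_value` is hypothetical.

```python
import math


def monobit_p_value(bits):
    """Frequency (monobit) test: P-value for the balance of ones and zeros."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)  # map 1 -> +1, 0 -> -1 and sum
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


# A sequence passes this test at the 1% level when the P-value is >= 0.01.
sample = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
print(monobit_p_value(sample) >= 0.01)
```

A perfectly balanced sequence gives a P-value of 1, while a heavily biased one (for example, all ones) gives a P-value far below 0.01 and fails.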

Performance Metric on Spatial and Transform Domain
Several metrics are commonly used for evaluating performance and ensuring image quality. Among the most important are the peak signal-to-noise ratio (PSNR), mean squared error (MSE), and normalized cross-correlation (NCC) [18,19].
• The Peak Signal-to-Noise Ratio (PSNR) evaluates the resemblance between two images (original and stego images) and is derived from the Mean Squared Error [20,21]. The equation for PSNR is as follows:
$$\mathrm{PSNR} = 10 \log_{10}\left[\frac{R^2}{\mathrm{MSE}}\right]$$
In this equation, $R$ represents the dynamic range of pixel values, or the maximum possible value for a pixel. For 8-bit images, $R$ is equal to 255. MSE refers to the mean squared error.
• The Mean Squared Error (MSE) quantifies the difference between two images; a lower MSE value indicates higher image quality [22,23]. The equation for MSE is as follows:
$$\mathrm{MSE} = \frac{1}{HW}\sum_{i=1}^{H}\sum_{j=1}^{W}\bigl(f(i,j) - g(i,j)\bigr)^2$$
In this equation, $f(i,j)$ represents the original image, while $g(i,j)$ denotes the stego image. The variables $H$ and $W$ correspond to the dimensions of the image.
• Normalized Cross Correlation (NCC) assesses the level of similarity (or difference) between two images being compared. Its primary advantage is its reduced sensitivity to linear changes in illumination amplitude within the compared images [24,25]. A common form of the NCC equation is as follows:
$$\mathrm{NCC} = \frac{\sum_{i}\sum_{j} f(i,j)\, g(i,j)}{\sqrt{\sum_{i}\sum_{j} f(i,j)^2 \,\sum_{i}\sum_{j} g(i,j)^2}}$$
where $f(i,j)$ and $g(i,j)$ denote the original and stego images, respectively.
Figure 7 illustrates a set of standard images, both before and after incorporating a hidden binary message, along with their respective histograms. The final column displays the randomization map. In this map, red points mark pixels with 1 embedded bit, green points mark pixels with 2 embedded bits, and blue points mark pixels with 3 embedded bits. Black points denote skipped pixels. Table 2 lists the numerical values of the three performance metrics discussed earlier: PSNR, MSE, and NCC. Embedding in the frequency domain using the proposed method is illustrated in Figure 8, and the numerical values for the three metrics are provided in Table 3.
It should be noted that the modified quantization table proposed by Li and Wang was used for the quantization step [26]. Furthermore, the results of the proposed method for frequency-domain embedding have been compared with both the widely recognized Jsteg method and the method proposed by Senthooran and Ranathunga [27]. For comparison purposes, the same images and payloads used in [27] were used. The images involved in this comparison can be seen in Figure 9. The average results presented in Table 4 indicate that the proposed method outperforms the other two methods, as it exhibits the lowest error and the highest PSNR. Additionally, a graphical comparison based on MSE and PSNR is shown in Figures 10 and 11, respectively. Figure 12 illustrates the relationship between payload and PSNR, MSE, and NCC. PSNR and NCC decrease as the payload increases, while MSE increases with the payload.
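The three metrics used in this section can be sketched in Python as follows. Images are assumed to be given as equally sized lists of rows of pixel values, and the NCC is computed in its energy-normalized form, which is an assumption of this sketch; the function names are illustrative.

```python
import math


def mse(original, stego):
    """Mean squared error between two equally sized grayscale images."""
    rows, cols = len(original), len(original[0])
    total = sum((original[i][j] - stego[i][j]) ** 2
                for i in range(rows) for j in range(cols))
    return total / (rows * cols)


def psnr(original, stego, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(original, stego)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)


def ncc(original, stego):
    """Normalized cross-correlation (energy-normalized form, assumed here)."""
    pairs = [(f, g) for ro, rs in zip(original, stego) for f, g in zip(ro, rs)]
    num = sum(f * g for f, g in pairs)
    energy_f = sum(f * f for f, _ in pairs)
    energy_g = sum(g * g for _, g in pairs)
    return num / math.sqrt(energy_f * energy_g)
```

As a quick check, two 8-bit images that differ by exactly 1 in every pixel have an MSE of 1 and a PSNR of about 48.13 dB, and an image compared with itself has an NCC of 1.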

Conclusions
The objective of steganography is to conceal secret messages within cover images. Most existing techniques embed information sequentially in the image and use a fixed number of bits for each pixel, which makes the hidden message more susceptible to attacks. This study proposes enhancing obfuscation by randomly selecting the pixel used for embedding and by randomly determining the number of bits hidden within the chosen pixel. For this purpose, a random binary key generated from a non-linear combination of eight LFSRs is employed. This generator is selected for its speed, simplicity, determinism, and affordability.
Two approaches are proposed: hiding the data in the spatial domain and hiding it in the frequency domain. In the first approach, a large volume of data can be concealed, but preserving the data becomes challenging if the cover image is subject to external influences. The second approach therefore hides the data in the frequency domain, specifically in the middle range of frequencies, which offers greater resilience against influences on the cover image. However, the amount of hidden data is smaller than in the first approach. Experimental results indicate that both proposed methods provide high image quality and substantial message capacity, in addition to the obfuscation achieved through the two layers of randomness.