Human Skin Detection and Segmentation Based on Convolutional Neural Networks
DOI: https://doi.org/10.24996/ijs.2024.65.2.40

Keywords: classification, convolution neural network, detection, image processing, human skin

Abstract
Human skin detection is the process of classifying image pixels or regions as skin or non-skin. Skin detection has many applications, such as face tracking, skin disease diagnosis, nudity recognition, hand gesture recognition, video surveillance, web content filtering, and pornographic content filtering. It is a challenging problem because of variations in skin color, ethnicity, age, gender, and makeup, as well as complex backgrounds. This paper proposes detecting the skin regions in an image and locating them using a convolutional neural network (CNN). In the proposed method, the CNN is modified by adding two layers: one before its first layer and one after its last layer. The purpose of these layers is to prepare the input image using a sliding window, which feeds small, indexed blocks of the image into the CNN. The network classifies each block as skin or non-skin and passes the result to the second added layer. After all the image pixels have been processed, the non-skin blocks are mapped onto the original image as black regions, so the final image contains the skin regions against a black background. The contribution of the proposed method is its ability to detect skin on any part of the human body, unlike previous works, which focused on a single body part. The input image is processed as blocks rather than as a whole, as in previous works, and the original image is then reconstructed at the output. The method handles most of the challenges facing skin detection, and the designed network enables nearly accurate localization and segmentation of the skin regions, whereas previous networks focused on classifying the entire image as skin or non-skin. When tested on images different from the training images, the skin detection accuracy was 95.4%.
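The block-based pipeline described above (slide a window over the image, classify each block, and black out the non-skin blocks during reconstruction) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the block size, the non-overlapping stride, and the `toy_is_skin` threshold classifier standing in for the trained CNN are all assumptions made for the example.

```python
import numpy as np

def segment_skin(image, block_size, is_skin):
    """Slide a non-overlapping window over the image, keep the blocks the
    classifier accepts, and black out the rest in the reconstructed image."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)  # non-skin regions stay black
    for y in range(0, h, block_size):
        for x in range(0, w, block_size):
            block = image[y:y + block_size, x:x + block_size]
            if is_skin(block):  # stand-in for the trained CNN's decision
                out[y:y + block_size, x:x + block_size] = block
    return out

def toy_is_skin(block):
    """Hypothetical placeholder classifier: mean red-channel threshold."""
    return block[..., 0].mean() > 128

# Tiny synthetic image: "skin-like" top half, dark bottom half.
img = np.zeros((8, 8, 3), dtype=np.uint8)
img[:4] = [200, 120, 100]
result = segment_skin(img, block_size=4, is_skin=toy_is_skin)
```

In the paper's method, `is_skin` would be the modified CNN, and the two added layers handle the windowing and the mapping of rejected blocks back onto the output image.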