A Survey on Image Caption Generation in Various Languages

Authors

  • Haneen serag Ibrahim, Department of Computer Science, College of Sciences, Mustansiriyah University, Baghdad, Iraq. ORCID: https://orcid.org/0000-0001-7452-1691
  • Narjis Mezaal Shati, Department of Computer Science, College of Sciences, Mustansiriyah University, Baghdad, Iraq

DOI:

https://doi.org/10.24996/ijs.2024.65.7.38

Keywords:

CNN, Computer Vision, Image Captioning, LSTM, Natural Language Processing

Abstract

      Image captioning is the process of producing an explicit, coherent description of the contents of an image. It relies on recent deep learning techniques from computer vision and natural language processing to understand the contents of the image and generate an appropriate caption. Multiple datasets suitable for many applications have been proposed. The biggest challenge on the natural language processing side is that these datasets do not cover all languages. Researchers have therefore translated the best-known English datasets with Google Translate so that image content can be described in their native languages. This review aims to improve the understanding of image captioning strategies. It surveys previous research on image captioning, examines the most popular datasets in different languages (mostly English datasets translated into other languages), covers the latest models for describing images, and summarizes and compares evaluation measures.

Published

2024-07-30

Section

Computer Science

How to Cite

A Survey on Image Caption Generation in Various Languages. (2024). Iraqi Journal of Science, 65(7), 4030-4046. https://doi.org/10.24996/ijs.2024.65.7.38
