Word Embedding Methods for Word Representation in Deep Learning for Natural Language Processing

Authors

DOI:

https://doi.org/10.24996/ijs.2022.63.3.37

Keywords:

Word embedding, NLP, FastText, Deep Learning, local and pretrained word vector

Abstract

    Natural Language Processing (NLP) deals with analysing, understanding, and generating language the way humans do. One of the challenges of NLP is training computers to learn and use a language as humans do. Every training session involves several types of sentences with different contexts and linguistic structures. The meaning of a sentence depends on the meanings of its main words and on their positions. The same word can act as a noun, an adjective, or another part of speech depending on its position. In NLP, word embedding is a powerful method that is trained on a large collection of texts and encodes general semantic and syntactic information about words. Choosing the right word embedding produces better results than the alternatives. Most papers use pretrained word embedding vectors in deep learning for NLP. However, the major issue with pretrained word embedding vectors is that they cannot be used for all types of NLP tasks. In this paper, a process for forming local word embedding vectors is proposed, and pretrained and local word embedding vectors are compared for the Bengali language. The Keras framework is used in Python to implement the local word embedding, and the analysis section of this paper shows that the proposed model achieves 87.84% accuracy, which is better than the 86.75% accuracy obtained with fastText pretrained word embedding vectors. Using the proposed method, NLP researchers working on the Bengali language can easily build specific word embedding vectors for word representation in Natural Language Processing.
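
    As a rough illustration of what a "local" word embedding involves, the sketch below builds a vocabulary from a corpus and creates a randomly initialised lookup table with one vector per word. It is not the authors' implementation: the toy two-sentence corpus, the helper names (build_vocab, init_embeddings, encode), and the vector size are all assumptions made for illustration. In a real model the table would be a trainable layer (for example, an embedding layer in Keras) whose values are learned during training rather than left random.

```python
import random

def build_vocab(sentences):
    """Map each distinct token to an integer id; id 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for sentence in sentences:
        for token in sentence.split():
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab

def init_embeddings(vocab_size, dim, seed=0):
    """Create a randomly initialised lookup table: one vector per token id."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(vocab_size)]

def encode(sentence, vocab, max_len):
    """Turn a sentence into a fixed-length list of token ids, padded with 0."""
    ids = [vocab[t] for t in sentence.split() if t in vocab][:max_len]
    return ids + [0] * (max_len - len(ids))

# Toy corpus (hypothetical, not from the paper).
corpus = ["আমি ভাত খাই", "আমি বই পড়ি"]

vocab = build_vocab(corpus)                 # word -> id, built from local data only
table = init_embeddings(len(vocab), dim=4)  # untrained embedding table
ids = encode(corpus[1], vocab, max_len=5)   # second sentence as padded ids
vectors = [table[i] for i in ids]           # embedding lookup, one vector per position
```

    Because the vocabulary comes only from the local corpus, every word in the training data has its own vector, whereas a pretrained table may lack domain-specific words.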

Published

2022-03-30

How to Cite

Wadud, M. A. H., Mridha, M. F., & Rahman, M. M. (2022). Word Embedding Methods for Word Representation in Deep Learning for Natural Language Processing. Iraqi Journal of Science, 63(3), 1349–1361. https://doi.org/10.24996/ijs.2022.63.3.37

Section

Computer Science