The Effect of False Predictions of Machine Learning on the Security of the Big Data Environment

Authors

  • Ammar Hatem Farhan, Computer Center, University of Fallujah, Anbar, Iraq
  • Omar Salah F. Shareef, Computer Center, University of Fallujah, Anbar, Iraq, https://orcid.org/0000-0002-2171-5203
  • Rehab Flaih Hasan, Computer Sciences Department, University of Technology, Baghdad, Iraq

DOI:

https://doi.org/10.24996/ijs.2025.66.1.29

Keywords:

SQLI, Logistic Regression, Big Data, Confidentiality, Integrity, Availability

Abstract

The exchange of data between customers and organizations has become a major target for hackers who seek to access this data illegally, compromising the three main components of information security: confidentiality, integrity, and availability (CIA). Structured query language injection (SQLI) is one of the most common forms of cyberattack. However, most previous research has examined only SQLI attacks that target web-based applications. Little attention has been given to the time needed to classify the type of SQLI payload sent by the client within the vast amounts of data needed to create machine learning models. Additionally, no study has examined the risk of machine learning models making false predictions and how such errors affect the three information security principles.

To address this gap, this research aims to create a model that serves as an intermediate protective interface linking the customer's layers and the database server, securing communication against SQLI attacks. It also aims to shorten the time required to identify the client's request type and to study the impact of false predictions of machine learning algorithms on CIA. The proposed method trains a model using logistic regression (LR) with the Spark ML library, which processes big data containing SQL payloads (both harmful and benign).
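The idea of an intermediate protective interface can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names protective_interface, placeholder_classifier, and Decision are hypothetical, and the keyword heuristic merely stands in for the trained Spark ML logistic regression model so that the example runs without Spark. The sketch shows only the control flow: classify the client's payload, time the decision, and forward the request to the database layer only when it is judged benign.

```python
import time
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in for the trained model. The source trains a logistic
# regression model with Spark ML; this keyword heuristic is only a placeholder
# so the sketch runs on its own.
SUSPICIOUS_FRAGMENTS = ("' or ", "union select", "--", ";drop ", "; drop ")


def placeholder_classifier(payload: str) -> bool:
    """Return True if the payload looks malicious (illustrative heuristic only)."""
    lowered = payload.lower()
    return any(fragment in lowered for fragment in SUSPICIOUS_FRAGMENTS)


@dataclass
class Decision:
    forwarded: bool          # True if the request reached the database layer
    elapsed_seconds: float   # time spent classifying the payload


def protective_interface(
    payload: str,
    classifier: Callable[[str], bool],
    forward: Callable[[str], None],
) -> Decision:
    """Sit between the client and the database: classify, then forward or block."""
    start = time.perf_counter()
    malicious = classifier(payload)
    elapsed = time.perf_counter() - start
    if malicious:
        # Blocked: the payload never reaches the database server.
        return Decision(forwarded=False, elapsed_seconds=elapsed)
    forward(payload)
    return Decision(forwarded=True, elapsed_seconds=elapsed)


if __name__ == "__main__":
    reached_database = []
    for request in ("SELECT name FROM users WHERE id = 7", "1' OR '1'='1' --"):
        decision = protective_interface(
            request, placeholder_classifier, reached_database.append
        )
        print(request, "->", "forwarded" if decision.forwarded else "blocked")
```

Passing the classifier as a parameter keeps the interface independent of how the model is trained, so a Spark ML model's prediction function could be substituted for the placeholder.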

Compared with previous studies, the proposed model achieved strong results, with an accuracy of 98.10%, a precision of 98.13%, a recall of 98.10%, and an F1 score of 98.10%. The results also showed that the time needed to detect and prevent such attacks was only 0.09 seconds.
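For readers unfamiliar with these four metrics, the following Python sketch shows how they can be derived from confusion-matrix counts, where false negatives are attacks let through and false positives are benign requests blocked. The counts are hypothetical and are not the paper's data. The abstract does not say how the metrics were averaged; the sketch assumes support-weighted averaging over the two classes, under which recall always equals accuracy, which would be consistent with the identical 98.10% figures reported.

```python
def weighted_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy plus support-weighted precision, recall, and F1 for two classes.

    tp: malicious payloads flagged as malicious
    fp: benign payloads flagged as malicious (false alarms)
    fn: malicious payloads classified as benign (missed attacks)
    tn: benign payloads classified as benign
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total

    sums = {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    # First tuple: malicious class. Second tuple: benign class.
    # Each tuple is (correct hits, wrongly assigned to class, missed from class).
    for hit, wrongly_assigned, missed in ((tp, fp, fn), (tn, fn, fp)):
        support = hit + missed  # number of true members of this class
        predicted = hit + wrongly_assigned
        precision = hit / predicted if predicted else 0.0
        recall = hit / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        sums["precision"] += support * precision
        sums["recall"] += support * recall
        sums["f1"] += support * f1

    result = {name: value / total for name, value in sums.items()}
    result["accuracy"] = accuracy
    return result


if __name__ == "__main__":
    # Hypothetical counts chosen only for illustration.
    for name, value in weighted_metrics(tp=90, fp=5, fn=10, tn=95).items():
        print(f"{name}: {value:.4f}")
```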

 

Published

2025-01-30

Issue

Vol. 66, No. 1 (2025)

Section

Computer Science

How to Cite

The Effect of False Predictions of Machine Learning on the Security of the Big Data Environment. (2025). Iraqi Journal of Science, 66(1), 361-374. https://doi.org/10.24996/ijs.2025.66.1.29
