An Improved Outlier Detection Model for Detecting Intrinsic Plagiarism

Authors

  • Nasreen J. Kadhim Department of Computer Science, College of Science, University of Baghdad, Baghdad, Iraq
  • Maysaa I Abdulhussain Almulla khalaf Department of Computer Science, College of Science, University of Baghdad, Baghdad, Iraq https://orcid.org/0000-0002-0996-4952

DOI:

https://doi.org/10.24996/ijs.2022.63.12.42

Keywords:

Intrinsic plagiarism detection, document representation, weight vectors, main content vectors

Abstract

Intrinsic plagiarism detection addresses cases in which no reference corpus is available, so the task relies entirely on inconsistencies within a given document. It has been treated as a classification problem that can be solved using only information derived from the document itself.

The core contribution of this work lies in the document representation. The document and the disjoint segments generated from it are represented as weight vectors that capture their main content. For each element of these vectors, its average weight is used instead of its frequency.
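
The representation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the weighting scheme, so the per-sentence relative frequency used as a term's weight, the fixed-size sentence segments, the cosine distance, and the mean-plus-k-standard-deviations outlier rule are all assumptions, and every function name is hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def average_weight_vector(sentences):
    """Map each term to its average weight over the sentences containing it.

    ASSUMPTION: a term's weight in one sentence is its relative frequency
    there; the paper's actual weighting scheme may differ.
    """
    totals = defaultdict(float)
    occurrences = defaultdict(int)
    for sentence in sentences:
        tokens = tokenize(sentence)
        if not tokens:
            continue
        for term, count in Counter(tokens).items():
            totals[term] += count / len(tokens)
            occurrences[term] += 1
    return {term: totals[term] / occurrences[term] for term in totals}


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two sparse weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def flag_outlier_segments(sentences, segment_size=2, k=1.0):
    """Return indices of disjoint segments that deviate from the document.

    ASSUMPTION: a segment is an outlier when its distance to the whole
    document vector exceeds the mean distance by k standard deviations.
    """
    document_vector = average_weight_vector(sentences)
    segments = [sentences[i:i + segment_size]
                for i in range(0, len(sentences), segment_size)]
    distances = [cosine_distance(average_weight_vector(segment), document_vector)
                 for segment in segments]
    if not distances:
        return []
    mean = sum(distances) / len(distances)
    std = math.sqrt(sum((d - mean) ** 2 for d in distances) / len(distances))
    return [i for i, d in enumerate(distances) if d > mean + k * std]
```

Calling `flag_outlier_segments` on a list of sentences returns the indices of the segments whose content vector is farthest from that of the whole document; those segments would be the candidates for plagiarized passages under the assumptions stated above.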

The proposed method was evaluated in terms of Precision, Recall, F-measure, Granularity, and Plagdet. The results are comparable to those of the best state-of-the-art methods. Applied to PAN-PC-09 and PAN-PC-11 for intrinsic plagiarism detection, the method recorded Recall scores of 0.4503 and 0.4303, respectively, although Precision (0.3308 and 0.2806) and Granularity (1.1765 and 1.1111) still need improvement. The F-measure was 0.3814 and 0.3397. Plagdet, which measures the overall performance of a plagiarism detection approach, was 0.3399 and 0.3151.
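
The relationship between the reported scores can be checked with a short sketch. It assumes the usual PAN definitions, which the abstract does not restate: F-measure as the harmonic mean of Precision and Recall, and Plagdet as F-measure divided by log2(1 + Granularity). The function names are illustrative only.

```python
import math


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def plagdet(precision, recall, granularity):
    """F-measure penalized by granularity, as assumed from the PAN framework."""
    return f_measure(precision, recall) / math.log2(1 + granularity)


# Precision, Recall, and Granularity as reported in the abstract.
reported = {
    "PAN-PC-09": (0.3308, 0.4503, 1.1765),
    "PAN-PC-11": (0.2806, 0.4303, 1.1111),
}

for corpus, (p, r, g) in reported.items():
    print(corpus, round(f_measure(p, r), 4), round(plagdet(p, r, g), 4))
# PAN-PC-09 0.3814 0.3399
# PAN-PC-11 0.3397 0.3151
```

Under these definitions the computed values match the reported F-measure and Plagdet scores to four decimal places. A Granularity of exactly 1 would leave Plagdet equal to the F-measure, so the gap between the two reflects the penalty for detections split into multiple fragments.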

Published

2022-12-30

Section

Computer Science

How to Cite

An Improved Outlier Detection Model for Detecting Intrinsic Plagiarism. (2022). Iraqi Journal of Science, 63(12), 5581-5588. https://doi.org/10.24996/ijs.2022.63.12.42
