Fake News Detection in low-resourced languages “Kurdish language” using Machine learning algorithms

Rania Azad, Bilal Mohammed, Rawaz Mahmud, Lanya Zrar, Shajwan Sdiqa

doi:10.17762/turcomat.v12i6.8393

Published: 2021-06-05

Keywords:

Fake News Detection, Kurdish Language, Machine learning, Classifiers, SVM, TF-IDF.

Abstract

With the growing use of the internet and the large amount of real-time information created and shared over social media platforms, the risk of spreading malicious activity, carrying out illegal acts, abusing other people, and publicizing fake news has increased dramatically. Fake news detection, which covers the nature of fake news and its detection or prevention, is a well-studied research problem for highly resourced languages such as Arabic, English, and other European languages. Less-resourced languages remain out of focus because of the absence of labeled fake news corpora, the absence of fact-checking websites, or the unavailability of NLP tools. To date, no research has been conducted on fake news detection in the Kurdish language. This paper presents the creation of a novel Kurdish fake news corpus that has been made publicly available [1]. It contains two sets of news: the first contains crawled fake news, and the second contains text manipulated from real news. Several classifiers were then applied to the corpus after using TF-IDF for feature selection. The results show that Support Vector Machine (SVM) achieved the highest accuracy, 88.71%, among the classifiers on set 1, while LR outperformed the other algorithms on set 2. This work can be considered a baseline for future studies.
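
To make the TF-IDF step concrete, the following is a minimal sketch of the textbook weighting (term frequency multiplied by log(N / document frequency)). The abstract does not state which TF-IDF variant, tokenizer, or library the authors used, so the function name `tf_idf`, the formula variant, and the toy English tokens are all assumptions for illustration and are not drawn from the Kurdish corpus.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    The paper's exact weighting is not given in the abstract.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy, made-up tokens used only to exercise the function.
docs = [
    ["minister", "announces", "budget"],
    ["minister", "denies", "rumor"],
    ["shocking", "rumor", "about", "budget"],
]
vectors = tf_idf(docs)
```

Terms that occur in fewer documents receive larger weights, and a term present in every document receives a weight of zero. Vectors of this kind are what a classifier such as SVM would be trained on.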

Issue

Vol. 12 No. 6 (2021)

Section

Research Articles

You are free to:

Share — copy and redistribute the material in any medium or format for any purpose, even commercially.
Adapt — remix, transform, and build upon the material for any purpose, even commercially.
The licensor cannot revoke these freedoms as long as you follow the license terms.

Under the following terms:

Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

Notices:

You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.

No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.

How to Cite

Fake News Detection in low-resourced languages “Kurdish language” using Machine learning algorithms. (2021). Turkish Journal of Computer and Mathematics Education (TURCOMAT), 12(6), 4219-4225. https://doi.org/10.17762/turcomat.v12i6.8393
