Misinformation detection in Luganda-English code-mixed social media text
Date
2021
Publisher
arXiv preprint arXiv:2104.00124
Abstract
The increasing occurrence, forms, and negative effects of misinformation on social media platforms have necessitated more misinformation detection tools. Work is currently being done to address COVID-19 misinformation; however, there are no misinformation detection tools for any of the 40 distinct indigenous Ugandan languages. This paper addresses this gap by presenting basic language resources and a misinformation detection data set based on code-mixed Luganda-English messages sourced from the Facebook and Twitter social media platforms. Several machine learning methods are applied to the misinformation detection data set to develop classification models for detecting whether a code-mixed Luganda-English message contains misinformation or not. A 10-fold cross-validation evaluation of the classification methods in an experimental misinformation detection task shows that a Discriminative Multinomial Naïve Bayes (DMNB) method achieves the highest accuracy and F-measure, 78.19% and 77.90% respectively. Support Vector Machine and Bagging ensemble classification models achieve comparable results. These results are promising since the machine learning models are based on n-gram features from only the misinformation detection data set.
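
For readers unfamiliar with the evaluation setup named in the abstract, the following is a minimal sketch of word n-gram features combined with k-fold cross-validation. It is not the authors' implementation: it uses a plain multinomial Naive Bayes with add-one smoothing as a stand-in for DMNB, reports the F-measure for the positive class only (the abstract does not say how the F-measure was averaged), and runs on hypothetical English placeholder messages rather than the Luganda-English data set. All function names are illustrative.

```python
import math
from collections import Counter, defaultdict


def word_ngrams(text, n_max=2):
    """Return word n-grams (n = 1..n_max) of a lowercased, whitespace-split message."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def train_nb(texts, labels):
    """Collect the counts for a plain multinomial Naive Bayes (stand-in, not DMNB)."""
    class_counts = Counter(labels)
    gram_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        grams = word_ngrams(text)
        gram_counts[label].update(grams)
        vocab.update(grams)
    return class_counts, gram_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest add-one-smoothed log posterior."""
    class_counts, gram_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total = sum(gram_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for gram in word_ngrams(text):
            if gram in vocab:  # n-grams never seen in training are ignored
                count = gram_counts[label][gram]
                score += math.log((count + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(texts, labels, k=10, positive=1):
    """k-fold CV where fold i holds out items i, i+k, i+2k, ...

    Returns (accuracy, F-measure of the positive class).
    """
    predictions = [None] * len(texts)
    for fold in range(k):
        test_idx = set(range(fold, len(texts), k))
        train_texts = [t for i, t in enumerate(texts) if i not in test_idx]
        train_labels = [y for i, y in enumerate(labels) if i not in test_idx]
        model = train_nb(train_texts, train_labels)
        for i in test_idx:
            predictions[i] = predict_nb(model, texts[i])
    pairs = list(zip(predictions, labels))
    tp = sum(p == positive and y == positive for p, y in pairs)
    fp = sum(p == positive and y != positive for p, y in pairs)
    fn = sum(p != positive and y == positive for p, y in pairs)
    accuracy = sum(p == y for p, y in pairs) / len(pairs)
    f_measure = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return accuracy, f_measure


# Hypothetical placeholder messages (1 = misinformation, 0 = not); not real data.
misinfo = ["garlic cures the virus", "hot tea cures the virus"]
reliable = ["wash hands with soap", "wear a mask in town"]
texts, labels = [], []
for i in range(10):
    texts += [misinfo[i % 2], reliable[i % 2]]
    labels += [1, 0]

accuracy, f_measure = cross_validate(texts, labels, k=10)
print(f"accuracy={accuracy:.2%}  F-measure={f_measure:.2%}")
```

The placeholder messages are trivially separable, so the printed scores say nothing about real performance; the sketch only shows how n-gram counts feed a classifier and how the held-out folds produce the accuracy and F-measure figures.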
Citation
Nabende, P., Kabiito, D., Babirye, C., Tusiime, H., & Nakatumba-Nabende, J. (2021). Misinformation detection in Luganda-English code-mixed social media text. arXiv preprint arXiv:2104.00124.