A predictive machine learning application in agriculture: Cassava disease detection and classification with imbalanced dataset using convolutional neural networks
Date
2021
Publisher
Egyptian informatics journal
Abstract
This work is inspired by a Kaggle competition in which we participated, held as part of the Fine-Grained Visual Categorization
workshop at CVPR 2019 (Conference on Computer Vision and Pattern Recognition).
The competition aimed at detecting cassava diseases using 5 fine-grained cassava leaf disease categories with 10,000
labeled images collected during a regular survey in Uganda. Traditionally, this detection is done mostly
through physical inspection and supervision of cassava plants in the garden by farmers or agricultural
extension workers from NAADS (National Agricultural Advisory Services) and then reported to NARO
(National Agricultural Research Organisation) for further analysis. However, this process can be tiresome and capital
intensive, and it cannot detect cassava infection early enough for farmers to apply preventive techniques to non-infected cassava plants. Timely detection would improve yields, which in turn would enlarge
the African food basket and strengthen food security against famine. Using the dataset provided to train
CNNs (Convolutional Neural Networks) to achieve high accuracy was very challenging for two reasons: the dataset was small, and it had a high class imbalance, being heavily biased towards the CMD
(Cassava Mosaic Disease) and CBSD (Cassava Brown Streak Virus Disease) classes. Class imbalance is problematic in machine learning and exists in many domains.
Real-world datasets are rarely perfectly balanced. In recent
years, a lot of research has been done on two-class problems such as credit card fraud detection and tumor
detection, among others. Interestingly, class imbalance in multi-class image datasets has received little
attention. This paper, therefore, focused on techniques for achieving an accuracy score of over 93% using class
weights, SMOTE (Synthetic Minority Over-sampling Technique), and focal loss with deep convolutional
neural networks trained from scratch. The goal was to counter the high class imbalance so that the model can accurately predict underrepresented classes.
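As an illustration of two of the techniques named in the abstract, the following is a minimal NumPy sketch of inverse-frequency class weights and a multi-class focal loss. It is not the authors' implementation: the abstract does not give their formulas or hyperparameters. The weighting rule (n_samples / (n_classes * class_count)) is the common "balanced" heuristic, the loss follows the standard focal loss form -alpha * (1 - p_t)^gamma * log(p_t), and the function names, gamma = 2, and the label counts below are all assumptions chosen for demonstration.

```python
import numpy as np


def balanced_class_weights(labels, num_classes):
    """Inverse-frequency weights: n_samples / (num_classes * count_c).

    Assumes every class occurs at least once in `labels`.
    """
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    return counts.sum() / (num_classes * counts)


def focal_loss(probs, labels, gamma=2.0, alpha=None):
    """Mean multi-class focal loss over a batch.

    probs  : (n_samples, num_classes) predicted class probabilities
    labels : (n_samples,) integer class indices
    gamma  : focusing parameter; gamma = 0 reduces to cross-entropy
    alpha  : optional (num_classes,) per-class weights
    """
    # Probability assigned to the true class of each sample.
    p_t = probs[np.arange(len(labels)), labels]
    p_t = np.clip(p_t, 1e-7, 1.0)  # avoid log(0)
    # Well-classified samples (p_t near 1) are down-weighted.
    loss = -((1.0 - p_t) ** gamma) * np.log(p_t)
    if alpha is not None:
        loss = loss * alpha[labels]
    return loss.mean()


# Hypothetical imbalanced labels for 5 classes (NOT the real dataset counts).
labels = np.array([0] * 50 + [1] * 30 + [2] * 10 + [3] * 7 + [4] * 3)
weights = balanced_class_weights(labels, num_classes=5)

# Hypothetical model output: a uniform guess over the 5 classes.
probs = np.full((len(labels), 5), 0.2)

print("class weights:", np.round(weights, 3))
print("cross-entropy (gamma=0):", focal_loss(probs, labels, gamma=0.0))
print("focal loss (gamma=2):", focal_loss(probs, labels, gamma=2.0))
print("weighted focal loss:", focal_loss(probs, labels, gamma=2.0, alpha=weights))
```

Rare classes receive larger weights, so their errors contribute more to the loss. The focusing term (1 - p_t)^gamma shrinks the contribution of samples the model already classifies confidently. SMOTE, the third technique mentioned, instead synthesizes new minority-class samples and is not shown here.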
Keywords
Agriculture, Cassava mosaic detection, Rectifier Linear Unit, Synthetic minority over-sampling technique, Stochastic gradient descent
Citation
Sambasivam, G., & Opiyo, G. D. (2021). A predictive machine learning application in agriculture: Cassava disease detection and classification with imbalanced dataset using convolutional neural networks. Egyptian Informatics Journal, 22(1), 27-34. https://doi.org/10.1016/j.eij.2020.02.007