Browsing by Author "Bbosa, Francis Fuller"
On the Goodness of Fit of Parametric and Non‑Parametric Data Mining Techniques: The Case of Malaria Incidence Thresholds in Uganda
(Health and Technology, 2021) Bbosa, Francis Fuller; Nabukenya, Josephine; Nabende, Peter; Wesonga, Ronald
The study aimed to identify which class of data mining technique (parametric or non-parametric) best fits predictions on an imbalanced malaria incidence dataset. The researchers compared two parametric techniques (naïve Bayes and logistic regression) against two non-parametric techniques (support vector machines and artificial neural networks). Goodness of fit and predictive performance were assessed using 10-fold and 5-fold cross-validation on an independent validation dataset. The 10-fold cross-validation outperformed the 5-fold cross-validation on all performance metrics, with the following results. The naïve Bayes classifier attained an accuracy of 69%, a sensitivity of 90.9%, a specificity of 55.6%, a precision of 55.6% and an F-measure of 69.0%. Logistic regression achieved an accuracy of 65.5%, a sensitivity of 83.3%, a specificity of 52.9%, a precision of 55.6% and an F-measure of 66.7%. Support vector machines achieved an accuracy of 82.8%, a sensitivity of 88.2%, a specificity of 75.0%, a precision of 83.3% and an F-measure of 85.7%. Artificial neural networks registered an accuracy of 89.7%, a sensitivity of 94.1%, a specificity of 83.3%, a precision of 88.9% and an F-measure of 91.4%.
The non-parametric techniques (artificial neural networks and support vector machines) outperformed the parametric naïve Bayes technique in making predictions on the imbalanced malaria incidence dataset, registering higher F-measure values of 91.4% and 85.7% respectively.

Reliability of Predictions Using Hybrid Models: The Case of Malaria Incidence Rates in Uganda
(Journal of Health Informatics in Africa, 2020) Nabende, Peter; Bbosa, Francis Fuller; Wesonga, Ronald; Nabukenya, Josephine
Background and purpose: The reliability of estimates from independent predictive data mining techniques is a complex problem. It can be attributed to cross-cutting weaknesses of individual techniques, such as collinearity due to the high dimensionality of attributes in a dataset, bias due to underfitting and overfitting of data, and noise accumulation due to outliers, all of which affect the reliability of predictions from these models. This study therefore aimed to develop a hybrid data mining technique for predicting reliable malaria incidence rate thresholds. Methods: The decision tree and naïve Bayes classifiers were used to build a hybrid prediction model. Results of the hybrid model were compared with those of the independent data mining models using 10-fold cross-validation on a previously unseen data set. Accuracy, F-measure and the area under the receiver operating characteristic curve (AUC) were the key performance metrics used to evaluate the generalizability of the hybrid model in comparison to the independent models. Results: The hybrid classifier attained an accuracy of 79.3% and an F-measure of 84.2%, the naïve Bayes classifier achieved an accuracy and an F-measure of 69% each, and the decision tree classifier registered an accuracy of 72.4% and an F-measure of 80%.
Conclusions: The developed hybrid model outperformed both the independent decision tree and the independent naïve Bayes models. Hence, merging several independent homogeneous predictive data mining techniques enhances the accuracy of the estimates, making them more reliable.
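
The F-measure values quoted in the first abstract above (On the Goodness of Fit of Parametric and Non‑Parametric Data Mining Techniques) can be related to the sensitivity and precision figures reported alongside them. The sketch below assumes, since the abstract does not say so explicitly, that the F-measure is the balanced F1 score, i.e. the harmonic mean of precision and recall, with recall equal to sensitivity. The function and variable names are illustrative only and do not come from the papers.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstracts use the balanced (beta = 1) form.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (sensitivity %, precision %, reported F-measure %) as quoted in the abstract.
REPORTED = {
    "naive Bayes": (90.9, 55.6, 69.0),
    "logistic regression": (83.3, 55.6, 66.7),
    "support vector machine": (88.2, 83.3, 85.7),
    "artificial neural network": (94.1, 88.9, 91.4),
}


def check_reported(table, tolerance=0.1):
    """Recompute F1 from sensitivity and precision and compare with the
    reported value, allowing for rounding to one decimal place."""
    results = {}
    for name, (sensitivity, precision, reported_f) in table.items():
        computed = f_measure(precision, sensitivity)
        results[name] = (round(computed, 1), abs(computed - reported_f) <= tolerance)
    return results


if __name__ == "__main__":
    for name, (computed, consistent) in check_reported(REPORTED).items():
        print(f"{name}: computed F1 = {computed}%, matches reported: {consistent}")
```

Under that assumption, all four recomputed values agree with the reported F-measures to within rounding, which suggests the balanced form was the one used. Because the F-measure ignores true negatives, it is less flattering to a classifier that favours the majority class than accuracy is, which is consistent with the abstract ranking the techniques by F-measure on an imbalanced dataset.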