Clustering and Classification of Cotton Lint Using Principle Component Analysis, Agglomerative Hierarchical Clustering, and K-Means Clustering

Abstract
Cotton from the three cotton-growing regions of Uganda was characterized for 13 quality parameters using the High Volume Instrument (HVI). Principal Component Analysis (PCA), Agglomerative Hierarchical Clustering (AHC), and k-means clustering were used to model the cotton quality parameters. Using factor analysis, cotton yellowness and short fiber index were found to account for the highest variability. At the 5% significance level, the highest correlation (0.73) was found between short fiber index and yellowness. Based on Cotton Outlook’s world classification and USDA standards, the cotton under test was deemed to be of high and uniform quality, falling between the Middling and Good Middling grades. Our suggested classification integrates all lint quality parameters, unlike traditional methods, which consider only selected parameters.
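The kind of analysis summarized in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. It assumes the parameters are standardized first, takes the first principal component from the correlation matrix by power iteration, and groups samples with plain Lloyd's k-means. Agglomerative clustering is omitted for brevity. The six samples, the three column meanings, and all function names are hypothetical and were invented for illustration; they are not the study's HVI data.

```python
import math
import random

def standardize(rows):
    """Scale each column to zero mean and unit variance (z-scores)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def correlation_matrix(z):
    """Pearson correlations between the columns of standardized data."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(corr, iters=500):
    """Leading eigenvector and eigenvalue of corr by power iteration.

    Assumes the uniform start vector is not orthogonal to the leading
    eigenvector, which holds for the toy data below.
    """
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm with initial centroids sampled from the data."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(pt, centroids[c]))
                  for pt in points]
        # Recompute each centroid as the mean of its members.
        new = []
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            new.append([sum(col) / len(members) for col in zip(*members)]
                       if members else centroids[c])
        if new == centroids:
            break
        centroids = new
    return labels, centroids

# Synthetic, made-up samples (NOT the study's measurements). Hypothetical
# columns: yellowness (+b), short fiber index, fiber strength.
samples = [
    [9.1, 7.2, 30.5], [9.3, 7.5, 30.1], [9.0, 7.1, 30.8],
    [11.2, 9.8, 27.0], [11.5, 10.1, 26.6], [11.0, 9.6, 27.3],
]

z = standardize(samples)
corr = correlation_matrix(z)
loadings, eigenvalue = first_component(corr)
explained = eigenvalue / len(corr)  # share of total variance carried by PC1
labels, centroids = kmeans(z, k=2)
```

Here `corr[0][1]` plays the role of the yellowness versus short fiber index correlation reported in the abstract, `loadings` shows which parameters dominate the first component, and `labels` gives the cluster assigned to each sample. With real HVI data, a library such as scikit-learn or SciPy would be the more practical choice.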
Keywords
Agglomerative hierarchical clustering (AHC), Classification, Cotton quality, High volume instrument (HVI), k-means clustering, Principal component analysis (PCA)
Citation
Edwin Kamalha, Jovan Kiberu, Ildephonse Nibikora, Josphat Igadwa Mwasiagi & Edison Omollo (2017): Clustering and Classification of Cotton Lint Using Principle Component Analysis, Agglomerative Hierarchical Clustering, and K-Means Clustering, Journal of Natural Fibers, DOI: 10.1080/15440478.2017.1340220