Abstract
This study evaluates how five indicators of dataset complexity affect the performance of 24 machine learning (ML) and deep learning (DL) classifiers across eight publicly available agriculture-related datasets. The indicators were cardinality (320-13,611 instances), dimensionality (7-35 features), class imbalance (Imbalance Ratio [IR] = 1-109.9), number of classes (2-40), and feature types (numeric and ordinal). Performance measures, including sensitivity, specificity, balanced accuracy (BA), precision, F1-score, and the Matthews Correlation Coefficient (MCC), were derived from confusion matrices generated via a 10-fold cross-validation procedure. Macro- and weighted-averages were included as overall measures. Nonparametric tests (Friedman-Nemenyi, p < 0.05; Cliff's delta) were performed on weighted-average sensitivity and BA. Across 192 analyses, the ensembles (GBM, XGBoost, RF) and C5.0 significantly outperformed the other classifiers on 5 out of 8 datasets, achieving values greater than 0.91. Artificial Neural Networks (ANNs) proved ineffective for tabular data (BA < 0.50). Extreme imbalance (White Wine: IR = 109.9) degraded classifier performance, mainly for distance-based and probabilistic classifiers (MCC < 0.34), and even the ensembles only partially mitigated the bias (BA < 0.65). High dimensionality (Date Fruits: 34 features) favored LDA and RF (BA >= 0.93). Conversely, the largest multiclass problem (Soybean Cultivars: 40 classes) yielded the best performance for IBk (BA = 0.87). Sixty paired comparisons confirmed significant differences (p < 0.00001) and strong effects (delta = -0.57 to 0.18) between the ensembles and the underperforming classifiers, confirming that dimensionality, IR, and the number of classes directly determine classifier performance. To the best of our knowledge, this is the first large-scale comparison of 24 ML/DL classifiers on eight agricultural datasets.
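For reference, the per-class measures cited above follow the standard confusion-matrix definitions; a minimal sketch is given below for the binary case, with true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) as assumed notation, and with macro- and weighted-averages obtained by aggregating these per-class values across classes.

\[
\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad
\text{Specificity} = \frac{TN}{TN + FP}, \qquad
\text{BA} = \frac{\text{Sensitivity} + \text{Specificity}}{2}
\]
\[
\text{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)\,(TP + FN)\,(TN + FP)\,(TN + FN)}}
\]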