Abstract
This study introduces a feature selection technique called dMeans, evaluated on 10 two-class datasets of moderate dimensionality, ranging from 168 to 10,000 attributes. The technique estimates the relevance of each feature from the difference between its mean values in the two classes. The output of dMeans is a relevance permutation of the features, providing a nuanced understanding of feature importance based on their mean values across the classes. A significant positive impact on the performance of classifiers, including 1-NN, 3-NN, 5-NN, Naïve Bayes, Support Vector Machine, Random Forest, and AdaBoost, was observed. Applying dMeans yields notable improvements in the predictive efficiency of these classifiers, as reflected in metrics such as F1-score, recall, and precision. Importantly, dMeans substantially reduces the number of features needed to match the performance of classifiers trained on all features, and often surpasses it. This study offers an effective strategy and provides valuable insights into the importance of feature selection in datasets of moderate dimensionality: dMeans enhances the performance of classification models while optimizing computational resources by reducing the number of features without compromising predictive quality. © 2024 Copyright held by the owner/author(s).
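For illustration, the following is a minimal sketch of a difference-of-means relevance ranking consistent with the description above. The function name `dmeans_ranking` and the use of an unnormalized absolute difference of class means are assumptions, since the abstract does not give the exact formula or any scaling the authors may apply.

```python
import numpy as np

def dmeans_ranking(X, y):
    """Rank features by the absolute difference of per-class means.

    Minimal sketch of a difference-of-means relevance score for a
    two-class problem; the exact dMeans formulation (e.g. any
    normalization by feature scale) is assumed, not taken from the paper.
    """
    classes = np.unique(y)
    assert len(classes) == 2, "dMeans as described targets two-class datasets"
    mean_a = X[y == classes[0]].mean(axis=0)   # per-feature mean, class A
    mean_b = X[y == classes[1]].mean(axis=0)   # per-feature mean, class B
    relevance = np.abs(mean_a - mean_b)        # one score per feature
    return np.argsort(relevance)[::-1]         # permutation, most relevant first

# Usage: keep only the top-k features before training a classifier.
# X, y = load_dataset(...)            # hypothetical loader
# order = dmeans_ranking(X, y)
# X_reduced = X[:, order[:50]]        # e.g. the 50 most relevant features
```

Under this reading, the ranking alone is cheap to compute (one pass over the data per class), which is consistent with the abstract's claim that dMeans optimizes computational resources while preserving predictive quality.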