Abstract
Adversarial machine learning has emerged as a critical research area aimed at securing machine learning models against security threats. In this work, we propose a method for evaluating the adversarial robustness of classification models, focusing specifically on convolutional neural networks (CNNs) trained with feedback alignment and direct random target projection algorithms. Through experimentation and comparative analysis, we evaluate the robustness of these models against adversarial attacks, contrasting them with models trained using standard backpropagation. Our findings reveal that models trained with feedback alignment and direct random target projection exhibit superior resilience to gradient-based adversarial attacks, requiring significantly larger perturbations across multiple pixels for a successful compromise. Conversely, models trained with backpropagation are vulnerable to attacks with smaller perturbations. Additionally, we design an evaluation method (REM) tailored to assess the robustness of these models under various attack scenarios. This work contributes to the academic discourse by offering insights into the factors influencing model vulnerability and advancing the understanding of adversarial machine learning, thereby fostering the development of more secure machine learning systems.
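As a rough illustration of the kind of gradient-based attack referenced above, the sketch below implements a one-step FGSM-style perturbation and a robust-accuracy measurement in PyTorch. It is a minimal sketch under generic assumptions, not the paper's exact evaluation protocol or REM implementation; the model, data loader, and epsilon value are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, images, labels, epsilon):
    """Fast Gradient Sign Method: perturb each pixel by epsilon along the
    sign of the input gradient of the loss. A generic gradient-based attack
    of the kind discussed above, not the authors' specific attack setup."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    adv_images = images + epsilon * images.grad.sign()
    # Clip back to the valid input range (assumed here to be [0, 1]).
    return adv_images.clamp(0.0, 1.0).detach()

def robust_accuracy(model, loader, epsilon, device="cpu"):
    """Fraction of test samples still classified correctly after the attack."""
    model.eval()
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        adv = fgsm_attack(model, images, labels, epsilon)
        with torch.no_grad():
            preds = model(adv).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)
    return correct / total

# Hypothetical usage: compare a backprop-trained CNN against an FA/DRTP-trained
# one by measuring robust accuracy at the same perturbation budget.
# acc_bp = robust_accuracy(model_backprop, test_loader, epsilon=0.05)
# acc_fa = robust_accuracy(model_feedback_alignment, test_loader, epsilon=0.05)
```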