SABER

Authors
Abiola Olalekan Totulope
Ojo Olumide Ebenezer
Bizuneh Tewodros Achamaleh
Adebanji Olaronke Oluwayemisi
Sidorov Grigori
Kolesnikova Olga

Title	CIC-NLP at GenAI Detection Task 1: Leveraging DistilBERT for Detecting Machine-Generated Text in English
Type	Conference
Subtype	Proceedings
Description	1st Workshop on GenAI Content Detection, GenAIDetect 2025
Abstract	As machine-generated texts (MGT) become increasingly similar to human writing, these distinctions are harder to identify. In this paper, we as the CIC-NLP team present our submission to the Gen-AI Content Detection Workshop at COLING 2025 for Task 1 Subtask A, which involves distinguishing between text generated by LLMs and text authored by humans, with an emphasis on detecting English-only MGT. We applied the DistilBERT model to this binary classification task using the dataset provided by the organizers. Fine-tuning the model effectively differentiated between the classes, resulting in a micro-average F1-score of 0.70 on the evaluation test set. We provide a detailed explanation of the fine-tuning parameters and steps involved in our analysis. © 2025 International Conference on Computational Linguistics.
Notes
Location	Abu Dhabi
Country	United Arab Emirates
Pages	271-277
Vol. / Ch.
Start	2025-01-19
End
ISBN/ISSN	9798891762053