Authors
Kolesnikova Olga
Shahiki Tash Moein
Ahani Zahra
Sidorov Grigori
Title Advanced machine learning techniques for social support detection on social media
Type Journal
Subtype JCR
Description Heliyon
Abstract The widespread use of social media highlights the need to understand its impact, particularly the role of online social support. In this study, we present a dataset of YouTube comments, initially comprising 66,272 entries, which was refined to 42,695, with a subset of 10,000 comments selected for detailed analysis without additional filtering. The dataset is annotated for three classification tasks: (1) distinguishing supportive from non-supportive comments, (2) determining whether the support is directed at an individual or a group, and (3) further categorizing group support into six subtypes (Nation, LGBTQ, Black Community, Women, Religion, and Other). To address data imbalances in these tasks, we employed K-means clustering to balance the dataset and compared the results with the original unbalanced data. We use state-of-the-art transformer models and zero-shot learning techniques, including GPT-3, GPT-4, and GPT-4o. Our approach, evaluated using macro F1-scores, demonstrates strong performance on the imbalanced dataset, with our transformer-based model (roberta-base) achieving scores of 0.78, 0.84, and 0.80, respectively, across the three classification tasks. These results also demonstrate a macro F1-score improvement of 0.2% for Task 2 and 0.7% for Task 3 compared to previous work that used CNN with GloVe embeddings and traditional machine learning baselines based on TF-IDF and LIWC features. © 2025 The Author(s)
Notes DOI 10.1016/j.heliyon.2025.e43437
Place Cambridge
Country United States
No. of pages Article number e43437
Vol. / Ch. v. 11 no. 10
Start 2025-05-01
End
ISBN/ISSN