Authors
Tamayo Herrera Antonio Jesús
Gelbukh Alexander
Title Augmenting a Spanish clinical dataset for transformer-based linking of negations and their out-of-scope references
Type Journal
Subtype JCR
Description Natural Language Processing
Abstract A negated statement consists of three main components: the negation cue, the negation scope, and the negation reference. The negation cue is the indicator of negation, while the negation scope defines the extent of the negation. The negation reference, which may or may not be within the negation scope, is the part of the statement being negated. Although there has been considerable research on the negation cue and scope, little attention has been given to identifying negation references outside the scope, even though they make up almost half of all negations. In this study, our goal is to identify out-of-scope references (OSRs) to restore the meaning of truncated negated statements identified by negation detection systems. To achieve this, we augment the largest available Spanish clinical dataset by adding annotations for OSRs. Additionally, we fine-tune five robust BERT-based models using transfer learning to address negation detection, uncertainty detection, and OSR identification and linking with their respective negation scopes. Our best model achieves state-of-the-art performance in negation detection while also establishing a competitive baseline for OSR identification (Macro F1 = 0.56) and linking (Macro F1 = 0.86). We support these findings with relevant statistics from the newly annotated dataset and an extensive review of existing literature.
Notes DOI 10.1017/nlp.2024.10
Place Cambridge
Country United Kingdom
Pages 56-89
Vol. / Ch. v. 31 no. 1
Start 2025-01-01
End
ISBN/ISSN