Resumen |
This research explores the sense disambiguation of polysemous verbs in the definitions of a digital dictionary (WordNet) through two similarity metrics: Cosine Similarity and the Modified Simplified Lesk Algorithm. These methods were applied with the aim of identifying the three senses of greatest correspondence for each verb, thus facilitating the selection of the most appropriate sense according to the context in which it appears. Verb sense disambiguation is a fundamental task in the field of Natural Language Processing (NLP), with broad applications including automatic translation, information retrieval and semantic analysis, and it has been evaluated in competitions such as SensEval and SemEval. In this research, WordNet was employed to build a dataset that includes verbs with multiple definitions, representing a common and challenging scenario in lexical disambiguation. To validate the results, a manual evaluation by experts was used, allowing to establish a reliable reference on the correct meaning of each verb. Accuracy results were 82.78% for the Cosine Similarity method and 73.12% for the Modified Simplified Lesk Algorithm, evidencing the relative effectiveness of each method and providing a starting point for improving disambiguation models in advanced PLN tasks. © 2024 Instituto Politecnico Nacional. All rights reserved. |