| Abstract |
Sarcasm detection is important in sentiment analysis and social media analysis because the literal meaning of a sarcastic statement does not agree with its intended sentiment. This paper addresses sarcasm detection in Roman Urdu. The dataset was originally introduced for Urdu sarcasm detection and was subsequently transliterated into Roman Urdu using a systematic phonetic mapping. We evaluated twelve popular machine-learning models with GloVe and Word2Vec embeddings and fine-tuned several large language models (LLMs), including LLaMA 2 (7B), LLaMA 3 (8B), and Mistral. Among the machine-learning models, Gradient Boosting and Support Vector Machines performed best, with F1-scores of 96.62% and 95.32%, respectively, using GloVe embeddings. Among the LLMs, LLaMA 3 achieved the best result of all evaluated classifiers with an F1-score of 97.15%, followed by Mistral with 96.32% and LLaMA 2 with 95.43%. This work highlights the potential of Roman Urdu for advanced sarcasm detection and compares the performance of traditional machine-learning techniques with state-of-the-art LLMs. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026. |