EXPLORING THE EFFECTIVENESS OF PRE-TRAINED TRANSFORMER MODELS FOR TURKISH QUESTION ANSWERING
Abstract
Recent advancements in Natural Language Processing (NLP) and Artificial Intelligence (AI) have been propelled by the emergence of Transformer-based Large Language Models (LLMs), which have demonstrated outstanding performance across various tasks, including Question Answering (QA). However, the adoption and performance of these models in low-resource and morphologically rich languages like Turkish remain underexplored. This study addresses this gap by systematically evaluating several state-of-the-art Transformer-based LLMs on a curated, gold-standard Turkish QA dataset. The models evaluated include BERTurk, XLM-RoBERTa, ELECTRA-Turkish, DistilBERT, and T5-Small, with a focus on their ability to handle the unique linguistic challenges posed by Turkish. The experimental results indicate that the BERTurk model outperforms the other models, achieving an F1-score of 0.8144, an Exact Match score of 0.6351, and a BLEU score of 0.4035. The study highlights the importance of language-specific pre-training and the need for further research to improve the performance of LLMs in low-resource languages. The findings provide valuable insights for future efforts in enhancing Turkish NLP resources and advancing QA systems in underrepresented linguistic contexts.
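The paper's evaluation code is not reproduced here, but the metrics named in the abstract (Exact Match and token-level F1) are the standard SQuAD-style QA scores. The minimal Python sketch below shows how both could be computed; the normalizer is simplified (the official SQuAD script also strips punctuation, and plain casefolding only approximates Turkish İ/ı casing), and the example answer pair is invented for illustration rather than taken from the paper's dataset.

```python
from collections import Counter

def normalize(text: str) -> str:
    # Lowercase and collapse whitespace. NOTE: casefold() only
    # approximates Turkish-specific I/ı and İ/i casing rules.
    return " ".join(text.casefold().split())

def exact_match(prediction: str, gold: str) -> float:
    # 1.0 if the normalized strings are identical, else 0.0.
    return float(normalize(prediction) == normalize(gold))

def f1_score(prediction: str, gold: str) -> float:
    # Token-level F1 over the bag-of-words overlap between
    # the predicted span and the gold answer.
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: a partially correct span prediction.
gold = "Mustafa Kemal Atatürk"
pred = "Kemal Atatürk"
print(f"EM = {exact_match(pred, gold):.2f}, F1 = {f1_score(pred, gold):.2f}")
# -> EM = 0.00, F1 = 0.80
```

As the example shows, Exact Match penalizes any deviation from the gold span, while token-level F1 gives partial credit, which is why the reported F1 (0.8144) exceeds the reported EM (0.6351).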
References
- Bilgin, M., Bozdemir, M., and Demir, E. (2024). Performance Analysis of Large Language Models on Turkish Question-Answer Texts. Proceedings of the 2024 Electrical-Electronics and Biomedical Engineering Conference (ELECO 2024), 1–5. Bursa, Türkiye: IEEE. https://doi.org/10.1109/ELECO64362.2024.10847201
- Bonov, P. (2025). DeepSeek climbs to top spot of the App Store, beats ChatGPT in the process. Retrieved February 6, 2025, from GSMArena website: https://www.gsmarena.com/deepseek_climbs_to_top_spot_of_the_app_store_beats_chatgpt_in_the_process-news-66286.php
- Celebi, E., Gunel, B., and Sen, B. (2011). Automatic Question Answering for Turkish with Pattern Parsing. Proceedings of the 2011 International Symposium on INnovations in Intelligent SysTems and Applications, 389–393. Istanbul, Türkiye: IEEE. https://doi.org/10.1109/INISTA.2011.5946098
- Clark, K., Luong, M. T., Le, Q. V., and Manning, C. D. (2020). ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. Proceedings of the 8th International Conference on Learning Representations (ICLR 2020), 1–18. Online.
- Conneau, A., Khandelwal, K., Goyal, N., Chaudhary, V., Wenzek, G., Guzmán, F., … Stoyanov, V. (2020). Unsupervised Cross-lingual Representation Learning at Scale. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 8440–8451. Online: ACL. https://doi.org/10.18653/v1/2020.acl-main.747
- Datasets. (2025). Retrieved April 7, 2025, from Hugging Face website: https://huggingface.co/docs/datasets/en/index
- Derici, C., Çelik, K., Kutbay, E., Aydın, Y., Güngör, T., Özgür, A., and Kartal, G. (2015). Question Analysis for a Closed Domain Question Answering System. Proceedings of the 16th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2015), Lecture Notes in Computer Science, Vol. 9042. Cairo, Egypt: Springer. https://doi.org/10.1007/978-3-319-18117-2_35
- Devlin, J., Chang, M. W., Lee, K., and Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2019), Volume 1 (Long and Short Papers), 4171–4186. Minneapolis, Minnesota, USA: ACL.
Details
Primary Language: English
Subjects: Deep Learning
Journal Section: Research Article
Authors:
Publication Date: June 3, 2025
Submission Date: March 2, 2025
Acceptance Date: April 12, 2025
Published in Issue: Year 2025, Volume 28, Number 2