THE ROLE OF PROMPT ENGINEERING IN THE PERFORMANCE OF LARGE LANGUAGE MODELS: ANALYSIS AND APPLICATION EXAMPLES

Fatma Gülşah Tan; Asım Sinan Yüksel; Muhammed Abdulhamid Karabıyık

doi:10.17780/ksujes.1480838

Research Article

Year 2024, Volume 27, Issue 4, pp. 1401-1420, 03.12.2024

Abstract

Prompt engineering has emerged as a critical technique for enhancing the capabilities of large language models. By making it possible to adapt a model through instructions called prompts, without changing its parameters, it allows these models to perform well across a variety of tasks. The main aim of this study is to show how prompt engineering can be used effectively to improve large language model performance, reduce computational costs, and enhance the user experience. The study analyzes 15 state-of-the-art prompt engineering techniques, categorized by application area. These techniques range from zero-shot and few-shot prompting to chain-of-thought and automatic chain-of-thought prompting. The advantages and disadvantages of each technique are evaluated in detail, and example scenarios show how the performance gains are achieved. The results indicate that prompt engineering plays an important role in improving the performance of large language models across a variety of tasks and applications. In particular, innovative prompt engineering techniques were found to perform well at increasing efficiency in low-data learning scenarios and at mitigating challenges such as bias and inconsistency. These findings will serve as a guide for researchers and practitioners and will broaden the applicability of large language models. Our study will contribute to a better understanding of prompt engineering and shed light on future research.

Keywords

Large Language Models, Prompt Engineering, Natural Language Processing, Artificial Intelligence

THE ROLE OF PROMPT ENGINEERING IN THE PERFORMANCE OF LARGE LANGUAGE MODELS: ANALYSIS AND APPLICATION EXAMPLES

Abstract

Prompt engineering has emerged as a critical technique for increasing the capabilities of large language models. Through instructions called prompts, it makes it possible to adapt a model without changing its parameters, which lets these models perform strongly on a variety of tasks. The main goal of this work is to show how to use prompt engineering effectively to improve large language model performance, reduce computational costs, and improve the user experience. The study analyzes 15 state-of-the-art prompt engineering techniques, categorized by application area. These techniques range from zero-shot and few-shot prompts to chain-of-thought and automatic chain-of-thought prompts. The advantages and disadvantages of each technique are evaluated in detail, and example scenarios illustrate how the performance increase is achieved. The results show that prompt engineering plays an important role in improving the performance of large language models across a variety of tasks and applications. Innovative prompt engineering techniques were observed to perform well at increasing efficiency, especially in low-data learning scenarios, and at reducing difficulties such as bias and inconsistency. These findings will serve as a guiding resource for researchers and practitioners and will broaden the applicability of large language models. Our study will contribute to a better understanding of prompt engineering and shed light on future research.
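
As a rough illustration of the first techniques the abstract names, the sketch below builds zero-shot, few-shot, and zero-shot chain-of-thought prompts as plain strings. The `Q:`/`A:` layout, the `Example` type, the function names, and the arithmetic questions are assumptions made for this example and are not taken from the paper. The trigger phrase "Let's think step by step." follows Kojima et al. (2022). No model is called; the code only shows how the prompt text differs between the techniques.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """One solved demonstration for a few-shot prompt (hypothetical type)."""
    question: str
    answer: str


def zero_shot(question: str) -> str:
    """Ask the question directly, with no demonstrations."""
    return f"Q: {question}\nA:"


def few_shot(question: str, examples: list[Example]) -> str:
    """Prepend solved demonstrations so the model can infer the task format."""
    if not examples:
        # With no demonstrations, few-shot degenerates to zero-shot.
        return zero_shot(question)
    demos = "\n\n".join(f"Q: {ex.question}\nA: {ex.answer}" for ex in examples)
    return f"{demos}\n\nQ: {question}\nA:"


def zero_shot_cot(question: str) -> str:
    """Append a reasoning trigger so the model writes intermediate steps."""
    return f"Q: {question}\nA: Let's think step by step."


if __name__ == "__main__":
    # Illustrative data only; not from the paper.
    demos = [Example("What is 2 + 3?", "5"), Example("What is 7 - 4?", "3")]
    question = "What is 6 + 9?"
    prompts = {
        "zero-shot": zero_shot(question),
        "few-shot": few_shot(question, demos),
        "zero-shot chain-of-thought": zero_shot_cot(question),
    }
    for name, prompt in prompts.items():
        print(f"--- {name} ---\n{prompt}\n")
```

In all three cases the model's parameters stay fixed and only the input text changes, which is the property the abstract highlights. Few-shot prompts cost more input tokens per call than zero-shot prompts, so the choice between them is a trade-off between accuracy and computational cost.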

Keywords

Large Language Models, Prompt Engineering, Natural Language Processing, Artificial Intelligence

There are 28 references in total.

Details

Primary Language	Turkish
Subjects	Artificial Intelligence (Other)
Section	Computer Engineering
Authors	Fatma Gülşah Tan (ORCID 0000-0002-2748-0396); Asım Sinan Yüksel (ORCID 0000-0003-1986-5269); Muhammed Abdulhamid Karabıyık (ORCID 0000-0001-7927-8790)
Publication Date	December 3, 2024
Submission Date	May 8, 2024
Acceptance Date	June 25, 2024
Published in Issue	Year 2024

Cite

APA	Tan, F. G., Yüksel, A. S., & Karabıyık, M. A. (2024). İSTEM MÜHENDİSLİĞİNİN BÜYÜK DİL MODELLERİNİN PERFORMANSINDAKİ ROLÜ: ANALİZ VE UYGULAMA ÖRNEKLERİ. Kahramanmaraş Sütçü İmam Üniversitesi Mühendislik Bilimleri Dergisi, 27(4), 1401-1420. https://doi.org/10.17780/ksujes.1480838
