THE ROLE OF PROMPT ENGINEERING IN THE PERFORMANCE OF LARGE LANGUAGE MODELS: ANALYSIS AND APPLICATION EXAMPLES

Fatma Gülşah Tan; Asım Sinan Yüksel; Muhammed Abdulhamid Karabıyık

doi:10.17780/ksujes.1480838

Abstract

Prompt engineering has emerged as a critical technique for extending the capabilities of large language models. Through instructions called prompts, it makes it possible to adapt a model to a task without changing the model's parameters, which allows these models to perform well across a variety of tasks. The main goal of this study is to show how prompt engineering can be used effectively to improve large language model performance, reduce computational costs, and improve the user experience. The study analyzes 15 state-of-the-art prompt engineering techniques, categorized by application area. These techniques range from zero-shot and few-shot prompting to chain-of-thought and automatic chain-of-thought prompting. The advantages and disadvantages of each technique are evaluated in detail, and example scenarios show how each one achieves its performance gains. The results show that prompt engineering plays an important role in improving the performance of large language models across a variety of tasks and applications. In particular, innovative prompt engineering techniques were found to perform well at increasing efficiency in low-data learning scenarios and at reducing problems such as bias and inconsistency. These findings are intended to serve as a guide for researchers and practitioners and to broaden the applicability of large language models. The study aims to contribute to a better understanding of prompt engineering and to inform future research.
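As a rough illustration of three of the techniques named in the abstract, the sketch below builds a zero-shot prompt, a few-shot prompt, and a zero-shot chain-of-thought prompt as plain strings. The function names, the sentiment task, and the example reviews are hypothetical and are not taken from the article. The cue "Let's think step by step." is the phrase commonly associated with zero-shot chain-of-thought prompting. No model is called; the code only shows how the prompt text differs between techniques.

```python
from typing import List, Tuple


def zero_shot_prompt(task: str, query: str) -> str:
    """Zero-shot: only the task instruction and the input, with no solved examples."""
    return f"{task}\n\nInput: {query}\nOutput:"


def few_shot_prompt(task: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Few-shot: the same instruction, followed by a handful of solved examples."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{task}\n\n{shots}\n\nInput: {query}\nOutput:"


def zero_shot_cot_prompt(question: str) -> str:
    """Zero-shot chain-of-thought: append a cue asking for intermediate reasoning."""
    return f"Q: {question}\nA: Let's think step by step."


if __name__ == "__main__":
    # Hypothetical task and examples, chosen only for illustration.
    task = "Classify the sentiment of the review as positive or negative."
    examples = [
        ("Great battery life.", "positive"),
        ("Broke after two days.", "negative"),
    ]
    print(zero_shot_prompt(task, "The screen is too dim."))
    print("---")
    print(few_shot_prompt(task, examples, "The screen is too dim."))
    print("---")
    print(zero_shot_cot_prompt("A shop sells 3 pens for 6 lira. How much do 5 pens cost?"))
```

In all three cases the model's parameters are untouched. Only the text placed in front of the model changes, which is the sense in which the abstract describes adapting a model through prompts.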

Details

Primary Language

Turkish

Subjects

Artificial Intelligence (Other)

Journal Section

Research Article

Authors

Fatma Gülşah Tan *
0000-0002-2748-0396
Türkiye

Asım Sinan Yüksel
0000-0003-1986-5269
Türkiye

Muhammed Abdulhamid Karabıyık
0000-0001-7927-8790
Türkiye

Publication Date

December 3, 2024

Submission Date

May 8, 2024

Acceptance Date

June 25, 2024

Published in Issue

Year 2024 Volume: 27 Number: 4

DOI

https://doi.org/10.17780/ksujes.1480838

IZ

https://izlik.org/JA86DH24EZ

Cite

APA

Tan, F. G., Yüksel, A. S., & Karabıyık, M. A. (2024). İSTEM MÜHENDİSLİĞİNİN BÜYÜK DİL MODELLERİNİN PERFORMANSINDAKİ ROLÜ: ANALİZ VE UYGULAMA ÖRNEKLERİ. Kahramanmaraş Sütçü İmam Üniversitesi Mühendislik Bilimleri Dergisi, 27(4), 1401-1420. https://doi.org/10.17780/ksujes.1480838

Cited By

The development and evaluation of agricultural question-answering systems based on large language models

Scientific Reports

https://doi.org/10.1038/s41598-026-35003-9