Research Article

THE ROLE OF PROMPT ENGINEERING IN THE PERFORMANCE OF LARGE LANGUAGE MODELS: ANALYSIS AND APPLICATION EXAMPLES

Year 2024, Volume: 27 Issue: 4, 1401 - 1420, 03.12.2024
https://doi.org/10.17780/ksujes.1480838

Abstract

Prompt engineering has emerged as a critical technique for extending the capabilities of large language models. Through instructions called prompts, it steers model behavior without modifying model parameters, enabling strong performance across a variety of tasks. The main goal of this work is to show how prompt engineering can be used effectively to improve large language model performance, reduce computational costs, and enhance the user experience. The study analyzes 15 state-of-the-art prompt engineering techniques, categorized by application area. These techniques range from zero-shot and few-shot prompting to chain-of-thought and automatic chain-of-thought prompting. The advantages and disadvantages of each technique are evaluated in detail, and example scenarios show how the performance gains are achieved. The results show that prompt engineering plays an important role in improving the performance of large language models across diverse tasks and applications. Innovative prompt engineering techniques proved particularly effective at increasing efficiency and mitigating challenges such as bias and inconsistency in low-data learning scenarios. These findings serve as a guiding resource for researchers and practitioners and broaden the applicability of large language models. Our study contributes to a better understanding of prompt engineering and sheds light on future research.
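As a concrete illustration of the prompting styles named above, the short Python sketch below builds example zero-shot, few-shot, and chain-of-thought prompts. It is a minimal sketch rather than code from the study: the arithmetic question and worked examples are invented for demonstration, and the resulting strings would be sent to whichever large language model API the reader has available.

# A minimal, hypothetical sketch of three prompting styles analyzed in the
# article: zero-shot, few-shot, and chain-of-thought. The question and the
# worked examples are invented; the strings below would be passed to any
# LLM completion API.

QUESTION = "A shop sells pens in packs of 12. How many pens are in 7 packs?"

# Zero-shot: the task is stated directly, with no worked examples.
zero_shot = f"Q: {QUESTION}\nA:"

# Few-shot: a handful of worked examples precede the target question,
# letting the model infer the task format (Brown et al., 2020).
few_shot = (
    "Q: Eggs come in boxes of 6. How many eggs are in 3 boxes?\nA: 18\n"
    "Q: A crate holds 24 bottles. How many bottles fill 5 crates?\nA: 120\n"
    f"Q: {QUESTION}\nA:"
)

# Zero-shot chain-of-thought: a trigger phrase elicits intermediate
# reasoning steps before the final answer (Kojima et al., 2022).
chain_of_thought = f"Q: {QUESTION}\nA: Let's think step by step."

for name, prompt in [("zero-shot", zero_shot),
                     ("few-shot", few_shot),
                     ("chain-of-thought", chain_of_thought)]:
    print(f"--- {name} ---\n{prompt}\n")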

References

  • Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., … Amodei, D. (2020). Language Models are Few-Shot Learners. In H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, & H. Lin (Eds.), Advances in Neural Information Processing Systems (Vol. 33, pp. 1877-1901). Curran Associates, Inc. Retrieved from https://proceedings.neurips.cc/paper_files/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf
  • Chen, S., Wang, W., Chen, X., Lu, P., Yang, Z., & Du, Y. (2024). LLaMA-LoRA Neural Prompt Engineering: A Deep Tuning Framework for Automatically Generating Chinese Text Logical Reasoning Thinking Chains. Data Intelligence, 1-53. https://doi.org/10.1162/dint_a_00251
  • Kojima, T., Gu, S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large Language Models are Zero-Shot Reasoners. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, & A. Oh (Eds.), Advances in Neural Information Processing Systems (Vol. 35, pp. 22199-22213). Curran Associates, Inc. Retrieved from https://proceedings.neurips.cc/paper_files/paper/2022/file/8bb0d291acd4acf06ef112099c16f326-Paper-Conference.pdf
  • Lester, B., Al-Rfou, R., & Constant, N. (2021). The Power of Scale for Parameter-Efficient Prompt Tuning. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 3045-3059. Stroudsburg, PA, USA: Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.emnlp-main.243
  • Liu, J., Liu, A., Lu, X., Welleck, S., West, P., Le Bras, R., … Hajishirzi, H. (2021). Generated Knowledge Prompting for Commonsense Reasoning. arXiv. https://doi.org/10.48550/arXiv.2110.08387
  • Long, J. (2023). Large Language Model Guided Tree-of-Thought. arXiv. https://doi.org/10.48550/arXiv.2305.08291
  • Ma, R., Zhou, X., Gui, T., Tan, Y., Li, L., Zhang, Q., & Huang, X. (2021). Template-free Prompt Tuning for Few-shot NER. arXiv. https://doi.org/10.48550/arXiv.2109.13532
  • Paranjape, B., Lundberg, S., Singh, S., Hajishirzi, H., Zettlemoyer, L., & Ribeiro, M. T. (2023). ART: Automatic multi-step reasoning and tool-use for large language models. arXiv. https://doi.org/10.48550/arXiv.2303.09014
  • Polverini, G., & Gregorcic, B. (2024). How understanding large language models can inform the use of ChatGPT in physics education. European Journal of Physics, 45(2), 025701. https://doi.org/10.1088/1361-6404/ad1420
  • Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., & others. (2019). Language models are unsupervised multitask learners. OpenAI Blog, 1(8), 9.
  • Sahoo, P., Singh, A. K., Saha, S., Jain, V., Mondal, S., & Chadha, A. (2024). A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications. arXiv. https://doi.org/10.48550/arXiv.2402.07927
  • Seo, J., Moon, H., Lee, C., Eo, S., Park, C., Kim, J., … Lim, H. (2022). Plain Template Insertion: Korean-Prompt-Based Engineering for Few-Shot Learners. IEEE Access, 10, 107587-107597. https://doi.org/10.1109/ACCESS.2022.3213027
  • Shaier, S., Bennett, K., Hunter, L. E., & von der Wense, K. (2024). Comparing Template-based and Template-free Language Model Probing. arXiv. https://doi.org/10.48550/arXiv.2402.00123
  • Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K., & Yao, S. (2023). Reflexion: language agents with verbal reinforcement learning. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, & S. Levine (Eds.), Advances in Neural Information Processing Systems (Vol. 36, pp. 8634-8652). Curran Associates, Inc. Retrieved from https://proceedings.neurips.cc/paper_files/paper/2023/file/1b44b878bb782e6954cd888628510e90-Paper-Conference.pdf
  • Shu, M., Nie, W., Huang, D.-A., Yu, Z., Goldstein, T., Anandkumar, A., & Xiao, C. (2022). Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, & A. Oh (Eds.), Advances in Neural Information Processing Systems (Vol. 35, pp. 14274-14289). Curran Associates, Inc. Retrieved from https://proceedings.neurips.cc/paper_files/paper/2022/file/5bf2b802e24106064dc547ae9283bb0c-Paper-Conference.pdf
  • Snell, J., Swersky, K., & Zemel, R. (2017). Prototypical Networks for Few-shot Learning. In I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in Neural Information Processing Systems (Vol. 30). Curran Associates, Inc. Retrieved from https://proceedings.neurips.cc/paper_files/paper/2017/file/cb8da6767461f2812ae4290eac7cbc42-Paper.pdf
  • Taveekitworachai, P., Abdullah, F., Dewantoro, M. F., Thawonmas, R., Togelius, J., & Renz, J. (2023). ChatGPT4PCG Competition: Character-like Level Generation for Science Birds. 2023 IEEE Conference on Games (CoG), 1-8. IEEE. https://doi.org/10.1109/CoG57401.2023.10333206
  • Vinyals, O., Blundell, C., Lillicrap, T., Kavukcuoglu, K., & Wierstra, D. (2016). Matching Networks for One Shot Learning. In D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, & R. Garnett (Eds.), Advances in Neural Information Processing Systems (Vol. 29). Curran Associates, Inc. Retrieved from https://proceedings.neurips.cc/paper_files/paper/2016/file/90e1357833654983612fb05e3ec9148c-Paper.pdf
  • Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., … Zhou, D. (2022). Self-Consistency Improves Chain of Thought Reasoning in Language Models. arXiv. https://doi.org/10.48550/arXiv.2203.11171
  • Wu, P. (2022). A Survey of Few-Shot Learning Research Based on Deep Neural Network. Frontiers in Computing and Intelligent Systems, 2(1), 110-115. https://doi.org/10.54097/fcis.v2i1.3177
  • Yang, J., Guo, X., Li, Y., Marinello, F., Ercisli, S., & Zhang, Z. (2022). A survey of few-shot learning in smart agriculture: developments, applications, and challenges. Plant Methods, 18(1), 28. https://doi.org/10.1186/s13007-022-00866-2
  • Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T., Cao, Y., & Narasimhan, K. (2023). Tree of Thoughts: Deliberate Problem Solving with Large Language Models. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, & S. Levine (Eds.), Advances in Neural Information Processing Systems (Vol. 36, pp. 11809-11822). Curran Associates, Inc. Retrieved from https://proceedings.neurips.cc/paper_files/paper/2023/file/271db9922b8d1f4dd7aaef84ed5ac703-Paper-Conference.pdf
  • Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2022). ReAct: Synergizing Reasoning and Acting in Language Models. arXiv. https://doi.org/10.48550/arXiv.2210.03629
  • Zhang, Z., Zhang, A., Li, M., & Smola, A. (2022). Automatic Chain of Thought Prompting in Large Language Models. arXiv. https://doi.org/10.48550/arXiv.2210.03493
  • Zheng, H. S., Mishra, S., Chen, X., Cheng, H.-T., Chi, E. H., Le, Q. V., & Zhou, D. (2024). Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models. The Twelfth International Conference on Learning Representations. Retrieved from https://openreview.net/forum?id=3bq3jsvcQ1
  • Zhou, K., Yang, J., Loy, C. C., & Liu, Z. (2022). Learning to Prompt for Vision-Language Models. International Journal of Computer Vision, 130(9), 2337-2348. https://doi.org/10.1007/s11263-022-01653-1
  • Zhou, Y., Muresanu, A. I., Han, Z., Paster, K., Pitis, S., Chan, H., & Ba, J. (2022). Large Language Models Are Human-Level Prompt Engineers. arXiv. https://doi.org/10.48550/arXiv.2211.01910

Details

Primary Language Turkish
Subjects Artificial Intelligence (Other)
Journal Section Computer Engineering
Authors

Fatma Gülşah Tan 0000-0002-2748-0396

Asım Sinan Yüksel 0000-0003-1986-5269

Muhammed Abdulhamid Karabıyık 0000-0001-7927-8790

Publication Date December 3, 2024
Submission Date May 8, 2024
Acceptance Date June 25, 2024
Published in Issue Year 2024, Volume: 27 Issue: 4

Cite

APA Tan, F. G., Yüksel, A. S., & Karabıyık, M. A. (2024). İSTEM MÜHENDİSLİĞİNİN BÜYÜK DİL MODELLERİNİN PERFORMANSINDAKİ ROLÜ: ANALİZ VE UYGULAMA ÖRNEKLERİ. Kahramanmaraş Sütçü İmam Üniversitesi Mühendislik Bilimleri Dergisi, 27(4), 1401-1420. https://doi.org/10.17780/ksujes.1480838