政大機構典藏-National Chengchi University Institutional Repository(NCCUR):Item 140.119/158579
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/158579


    Title: 應用自動化提示工程與RAG機制於問答系統之優化
    Application of automated prompt engineering and the RAG mechanism to the optimization of question answering systems
    Authors: 葉柏皓
    Yeh, Bo-Hao
    Contributors: 陳恭
    Chen, Kung
    葉柏皓
    Yeh, Bo-Hao
    Keywords: 生成式AI
    RAG
    自動化提示工程
    PE2
    BERT Score
    Generative AI
    RAG
    Automated Prompt Engineering
    PE2
    BERT Score
    Date: 2025
    Issue Date: 2025-08-04 14:27:45 (UTC+8)
    Abstract: 生成式AI近年快速崛起,其中結合RAG(Retrieval-Augmented Generation)問答系統的應用更受到許多不同產業界廣泛關注。然而,作為問答系統核心的大型語言模型(LLM),其回答品質直接影響系統效能。傳統上透過微調LLM的方法,往往需投入大量硬體資源與專業技術,導致推廣困難。因此,本研究以自動化提示工程方法 PE2(Prompt Engineering a Prompt Engineer)為基礎框架,並根據實際應用情境進行調整與設計,將其有效融合至生成式 AI 的 RAG問答系統中。透過自動化調整與優化查詢(Query)的方式,在無需對模型進行額外微調的前提下,有效提升LLM的回答品質,並降低系統建置所需的資源成本與技術門檻。
    實驗結果顯示,本研究所提出之方法能有效提高LLM的回答品質,並改善語意相關度評估指標(BERT Score)。此外,本研究亦自行設計了一套客觀的 Query 評估標準,取代以往缺乏統一客觀指標,僅能依靠人工主觀判斷 Query 品質之不足,進一步提升了提示詞評估的一致性與可靠性。本研究最後亦提出未來的研究方向,聚焦於進一步強化生成式 AI 問答系統的穩定性與準確性,期望透過持續優化與擴展,使其更能因應多元且複雜的應用情境,提升實務運用價值。
    Generative AI has rapidly emerged in recent years, with RAG (Retrieval-Augmented Generation) QA systems receiving growing attention across industries. As the core of these systems, the response quality of large language models (LLMs) greatly affects system performance. However, improving LLMs through fine-tuning requires substantial hardware resources and expertise, limiting its adoption. This study adopts the automated prompt engineering method PE2 (Prompt Engineering a Prompt Engineer) as its base framework, tailoring it to real-world scenarios and integrating it into a generative AI-based RAG QA system. By automatically adjusting and optimizing prompt queries, our method improves response quality without additional fine-tuning, reducing technical and resource costs.
    Experiments show that the proposed approach effectively increases answer quality and improves semantic relevance (BERT Score). Additionally, we design an objective query evaluation standard to replace subjective judgment and enhance prompt consistency. Finally, this study proposes future directions for improving the robustness and precision of generative AI QA systems, aiming to enhance their adaptability to diverse and complex application scenarios.
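    The query-optimization mechanism summarized above can be pictured as a loop in which an optimizer LLM iteratively rewrites the user's query before retrieval, keeping whichever rewrite scores best under an objective query-quality criterion. The sketch below is a hypothetical illustration only: the `llm()` stub and the `score_query()` heuristic are placeholders and do not reflect the thesis's actual PE2-based components or its evaluation standard.

    ```python
    # Hypothetical sketch of an iterative query-rewriting loop for a RAG pipeline.
    # Both llm() and score_query() are toy stand-ins, not the thesis's method.

    def llm(prompt: str) -> str:
        """Stand-in for a real LLM call (e.g. a hosted or local model API)."""
        # Trivial deterministic "rewrite": echo the text after the last colon
        # and append extra retrieval-oriented detail.
        query = prompt.rsplit(":", 1)[-1].strip()
        return query + " in retrieval-augmented generation (definition, components)"

    def score_query(query: str) -> float:
        """Placeholder query-quality score (the thesis defines its own standard)."""
        # Toy heuristic: longer, more specific queries score higher, capped at 1.0.
        return min(len(query.split()) / 20.0, 1.0)

    def optimize_query(user_query: str, rounds: int = 3) -> str:
        """Iteratively ask the optimizer LLM for a better query; keep the best."""
        best_query, best_score = user_query, score_query(user_query)
        for _ in range(rounds):
            candidate = llm(f"Improve this retrieval query: {best_query}")
            s = score_query(candidate)
            if s > best_score:  # greedy hill-climbing over successive rewrites
                best_query, best_score = candidate, s
        return best_query

    optimized = optimize_query("What is RAG?")
    ```

    The optimized query, rather than the raw user input, would then be passed to the retriever, which is how such a loop can raise answer quality without touching the model's weights.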
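    The BERT Score metric mentioned above rates semantic relevance by greedily matching each token of the candidate answer to its most similar reference token (and vice versa) in contextual-embedding space, then combining the two directions into an F1. A minimal sketch of that computation, with small fixed vectors standing in for the contextual embeddings a real BERT model would produce:

    ```python
    import numpy as np

    def bertscore_f1(cand_emb: np.ndarray, ref_emb: np.ndarray) -> float:
        """Toy BERTScore: greedy cosine matching between token-embedding matrices.

        cand_emb, ref_emb: (num_tokens, dim) arrays standing in for the
        contextual embeddings a BERT model would produce for each token.
        """
        # Normalize rows so dot products become cosine similarities.
        c = cand_emb / np.linalg.norm(cand_emb, axis=1, keepdims=True)
        r = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
        sim = c @ r.T                       # pairwise cosine similarities
        precision = sim.max(axis=1).mean()  # each candidate token -> best reference match
        recall = sim.max(axis=0).mean()     # each reference token -> best candidate match
        return 2 * precision * recall / (precision + recall)

    # Identical embeddings give a perfect score of 1.0.
    emb = np.array([[1.0, 0.0], [0.0, 1.0]])
    perfect = bertscore_f1(emb, emb)
    ```

    In practice one would use the `bert-score` package with a pretrained model rather than hand-built embeddings; this sketch only shows the shape of the metric.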
    Reference: Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877-1901.
    Chase, H. (2022). LangChain [Software]. https://github.com/langchain-ai/langchain
    Gao, Y., Xiong, Y., Gao, X., Jia, K., Pan, J., Bi, Y., ... & Wang, H. (2023). Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997.
    Gupta, S., Ranjan, R., & Singh, S. N. (2024). A comprehensive survey of Retrieval-Augmented Generation (RAG): Evolution, current landscape and future directions. arXiv preprint arXiv:2410.12837.
    Guu, K., Lee, K., Tung, Z., Pasupat, P., & Chang, M. (2020, November). Retrieval augmented language model pre-training. In International conference on machine learning (pp. 3929-3938). PMLR.
    Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., ... & Fung, P. (2023). Survey of hallucination in natural language generation. ACM Computing Surveys, 55(12), 1-38.
    Karpukhin, V., Oguz, B., Min, S., Lewis, P. S., Wu, L., Edunov, S., ... & Yih, W. T. (2020, November). Dense passage retrieval for open-domain question answering. In EMNLP (1) (pp. 6769-6781).
    Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., ... & Zettlemoyer, L. (2019). BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461.
    Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., ... & Kiela, D. (2020). Retrieval-augmented generation for knowledge-intensive nlp tasks. Advances in neural information processing systems, 33, 9459-9474.
    PromptEngineering.org. (2024). What is prompt engineering? Retrieved May 10, 2025, from https://promptengineering.org/what-is-prompt-engineering/
    Pryzant, R., Iter, D., Li, J., Lee, Y. T., Zhu, C., & Zeng, M. (2023). Automatic prompt optimization with "gradient descent" and beam search. arXiv preprint arXiv:2305.03495.
    Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., ... & Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of machine learning research, 21(140), 1-67.
    Sahoo, P., Singh, A. K., Saha, S., Jain, V., Mondal, S., & Chadha, A. (2024). A systematic survey of prompt engineering in large language models: Techniques and applications. arXiv preprint arXiv:2402.07927.
    Schulhoff, S., Ilie, M., Balepur, N., Kahadze, K., Liu, A., Si, C., ... & Resnik, P. (2024). The prompt report: A systematic survey of prompting techniques. arXiv preprint arXiv:2406.06608.
    Suzgun, M., Scales, N., Schärli, N., Gehrmann, S., Tay, Y., Chung, H. W., ... & Wei, J. (2022). Challenging big-bench tasks and whether chain-of-thought can solve them. arXiv preprint arXiv:2210.09261.
    Vatsal, S., & Dubey, H. (2024). A survey of prompt engineering methods in large language models for different nlp tasks. arXiv preprint arXiv:2407.12994.
    Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., ... & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems, 35, 24824-24837.
    Ye, Q., Axmed, M., Pryzant, R., & Khani, F. (2023). Prompt engineering a prompt engineer. arXiv preprint arXiv:2311.05661.
    Zhao, W. X., Zhou, K., Li, J., Tang, T., Wang, X., Hou, Y., ... & Wen, J. R. (2023). A survey of large language models. arXiv preprint arXiv:2303.18223.
    Zheng, L., Chiang, W. L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., ... & Stoica, I. (2023). Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. Advances in Neural Information Processing Systems, 36, 46595-46623.
    Description: Master's thesis
    National Chengchi University (國立政治大學)
    Department of Management Information Systems (資訊管理學系)
    112356038
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0112356038
    Data Type: thesis
    Appears in Collections:[Department of MIS] Theses

    Files in This Item:

    File: 603801.pdf
    Size: 3585Kb
    Format: Adobe PDF

