Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/155989
Title: 基於小型特定領域語言模型建構數位導遊學習問答系統 —以台灣達悟族觀光文化為例 (Development of a Digital Tour Guide Learning and Q&A System Based on a Small-Scale, Domain-Specific Language Model: A Case Study of the Tourism Culture of Taiwan's Tao Tribe)
Authors: 孫詩傑 (Sun, Shih-Jie)
Contributors: 蔡子傑 (advisor); 孫詩傑 (Sun, Shih-Jie)
Keywords: 大型語言模型; 小型特定領域; 量化微調; 檢索增強生成; 數位導遊; 問答系統; Large Language Models; Small-Scale and Domain-Specific; Fine-Tuning and Quantization; RAG (Retrieval-Augmented Generation); Digital Tour Guide; Q&A System
Date: 2025
Issue Date: 2025-03-03 14:28:40 (UTC+8)
Abstract:
With the rapid advancement of technology, large language models have made significant strides in natural language processing, particularly through their broad and exceptional language-generation capabilities, which yield strong performance on general-knowledge question-answering tasks. However, although general-purpose models perform adequately across broad knowledge domains, in practice they answer the domain-specific questions of most industries poorly. If a language model could respond accurately and at a high level within a small, specialized domain, such as tourism and cultural knowledge, it would deliver considerable benefits for both tourism and learning.

At present, tourism and cultural information is often available only through physical signage, printed materials, and similar media. Even where digital learning resources exist, they typically consist of long, itemized texts; however well organized, such content cannot satisfy an individual's real-time information needs on specific questions. A digital tour guide question-answering system that tailors its replies to each user's questions and curiosities would substantially improve learning efficiency and effectiveness, make cultural knowledge easier to obtain, and make travel more satisfying.

This thesis proposes a digital tour guide learning and question-answering system built on a small-scale, domain-specific language model. Using modest, consumer-grade computing resources rather than costly infrastructure, the study applies fine-tuning, quantization, and retrieval-augmented generation; explores a distinctive integrated architecture; and prepares training data with a custom preprocessing pipeline. The system is evaluated on three metrics: accuracy, readability, and richness. Experimental results show that the proposed Tao-culture digital tour guide delivers responses of satisfactory quality, meeting the needs of tourism and cultural users.
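The abstract combines three techniques: quantized loading of a small open chat model, parameter-efficient fine-tuning, and retrieval-augmented generation at inference time. The sketch below shows how such a pipeline is commonly assembled with the Hugging Face transformers, peft, and bitsandbytes libraries plus scikit-learn. It is not the thesis implementation: the base model name, the LoRA hyperparameters, the TF-IDF retriever, the placeholder corpus, and the example question are all illustrative assumptions.

```python
# Minimal sketch of the pipeline the abstract describes: 4-bit quantized
# loading, LoRA fine-tuning setup, and a naive retrieval step for RAG.
# NOT the thesis code; model choice and hyperparameters are assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

MODEL = "taide/Llama3-TAIDE-LX-8B-Chat-Alpha1"  # assumed base: an open Traditional-Chinese chat model

# QLoRA-style load: 4-bit NF4 weights with bf16 compute, small enough for a consumer GPU.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, quantization_config=bnb, device_map="auto"
)

# LoRA adapters: only a small fraction of parameters becomes trainable.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # fine-tuning on domain Q&A pairs would follow here

# RAG step: rank curated domain passages against the user question with
# character-n-gram TF-IDF (handles unsegmented Chinese text) and prepend
# the best match as context. The one-entry corpus is a placeholder.
passages = ["(curated Tao-culture passages go here)"]
question = "達悟族的拼板舟有什麼文化意義?"
vec = TfidfVectorizer(analyzer="char", ngram_range=(2, 3)).fit(passages + [question])
sims = cosine_similarity(vec.transform([question]), vec.transform(passages))[0]
prompt = f"參考資料:{passages[sims.argmax()]}\n\n問題:{question}\n\n回答:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

A production retriever would more likely use dense sentence embeddings and a vector store than TF-IDF; the thesis describes the component only as retrieval-augmented generation, so the simplest standalone retriever is shown here.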
Description: Master's thesis, National Chengchi University, In-service Master's Program in Computer Science (資訊科學系碩士在職專班); student ID 110971006
Source URI: http://thesis.lib.nccu.edu.tw/record/#G0110971006
Data Type: thesis
Appears in Collections: [資訊科學系碩士在職專班] Theses (In-service Master's Program in Computer Science)
Files in This Item:
File | Description | Size | Format
100601.pdf | | 1532 KB | Adobe PDF