    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/159317


    Title: 大型語言模型的非檢索式上下文延展機制研究:從鍵值緩存到微積分AI教師
    RAG-Free Contextual Extension for LLMs: A Study on KV-Cache and Calculus AI Tutoring
    Authors: 孫翊珈
    Sun, Yi-Jia
    Contributors: 蔡炎龍
    Tsai, Yen-Lung
    孫翊珈
    Sun, Yi-Jia
    Keywords: 大型語言模型
    鍵值快取
    上下文延展
    AI 教學系統
    非檢索生成
    Large Language Models
    Key-Value Cache
    Context Extension
    AI Tutoring System
    Retrieval-Free Generation
    Date: 2025
    Issue Date: 2025-09-01 16:30:03 (UTC+8)
    Abstract: 隨著大型語言模型(Large Language Models, LLMs)在自然語言處理領域的快速發展,其應用已逐漸擴展至教育場域。然而,現有 LLM 面臨上下文長度(context window)受限的挑戰,使其在處理長篇教材與多輪教學問答時難以維持語境連貫性與邏輯一致性。傳統解法如檢索增強生成(Retrieval-Augmented Generation, RAG)雖能引入外部知識,但也易產生檢索偏誤及語境斷裂,影響教學應用的效能。

    本研究提出一種基於鍵值快取(Key-Value Cache, KV-Cache)的非檢索式上下文延展策略,並設計實作了一套以微積分教材為基礎的 AI 教師系統。系統透過分段預填充(chunked prefill)將教材內容逐步輸入模型,並快取中間計算結果,讓模型在後續教學問答中能延續語境、節省運算資源並提升語義一致性。實驗比較了 KV-Cache 系統、RAG 系統與無快取系統,評估其記憶體使用與回應延遲。

    實驗結果顯示,所提出的 KV-Cache 機制在長文本教學場景下能有效提升語境連貫性,並顯著降低回應延遲,展現其於 AI 教學應用中的潛力。
    With the advancement of Large Language Models (LLMs), their integration into educational applications has attracted increasing attention. However, LLMs are constrained by their fixed context window size, making it difficult to handle long instructional materials and maintain coherent multi-turn teaching dialogues. While Retrieval-Augmented Generation (RAG) alleviates some knowledge limitations by incorporating external retrieval, it often introduces retrieval bias and context fragmentation, reducing its effectiveness in educational scenarios.

    This study proposes a retrieval-free context extension approach based on Key-Value Cache (KV-Cache) and implements a calculus-focused AI tutoring system. The system incrementally feeds LaTeX-based calculus textbooks into the model using a chunked prefill strategy, caching intermediate computations to enable consistent context retention and improved semantic coherence in subsequent teaching interactions. The experiments compare the proposed system with RAG-based and non-caching baselines, focusing on response latency and teaching continuity.

    Experimental results demonstrate that the KV-Cache mechanism effectively enhances contextual coherence and significantly reduces response latency in long-text teaching scenarios, showing great potential for future AI-driven educational systems.
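
    The chunked-prefill strategy summarized in the abstract can be sketched in a few lines against the Hugging Face transformers cache interface: the textbook is encoded once, chunk by chunk, into a reusable KV cache, and each student question is then generated on top of that cache instead of re-encoding the material. This is a minimal illustration under assumed names (the model, chunk size, and calculus_notes.tex file are placeholders), not the system actually built in the thesis.

    # Minimal sketch of chunked prefill + KV-cache reuse with Hugging Face
    # transformers. Model name, chunk size, and the textbook file below are
    # illustrative assumptions, not the thesis's implementation.
    import copy
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL = "Qwen/Qwen2.5-1.5B-Instruct"  # assumed; any causal LM with use_cache works
    tok = AutoTokenizer.from_pretrained(MODEL)
    model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.float16)
    model.eval()

    def prefill_in_chunks(context_ids: torch.Tensor, chunk: int = 512):
        """Run long material through the model chunk by chunk, carrying the
        KV cache forward, so the prefill cost is paid once up front."""
        past = None
        with torch.no_grad():
            for i in range(0, context_ids.size(-1), chunk):
                out = model(input_ids=context_ids[:, i : i + chunk],
                            past_key_values=past, use_cache=True)
                past = out.past_key_values  # keys/values for everything seen so far
        return past

    def answer(question: str, context_ids: torch.Tensor, cache):
        """Answer a question on top of the cached context: generate() receives
        the full prefix + question ids but skips re-encoding the cached prefix."""
        q_ids = tok(question, return_tensors="pt").input_ids
        full = torch.cat([context_ids, q_ids], dim=-1)
        # generate() mutates the cache in place, so copy it to keep it reusable
        out = model.generate(full, past_key_values=copy.deepcopy(cache),
                             use_cache=True, max_new_tokens=128)
        return tok.decode(out[0, full.size(-1):], skip_special_tokens=True)

    textbook = open("calculus_notes.tex").read()        # hypothetical LaTeX source
    ctx_ids = tok(textbook, return_tensors="pt").input_ids
    cache = prefill_in_chunks(ctx_ids)                  # one-time prefill cost
    print(answer("State the chain rule and give one example.", ctx_ids, cache))

    The one-time prefill is the trade the abstract describes: the cache holds keys and values for every textbook token, so later turns avoid re-encoding the material and answer with lower latency, at the cost of the memory the cache occupies, which is why the experiments track memory usage alongside response time.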
    Reference: [1] Jie Hu, Shengnan Wang, Yutong He, Ping Gong, Jiawei Yi, Juncheng Zhang, Youhui Bai, Renhai Chen, Gong Zhang, Cheng Li, et al. Efficient long-context LLM inference via KV cache clustering. arXiv preprint arXiv:2506.11418, 2025.
    [2] Neusha Javidnia, Bita Darvish Rouhani, and Farinaz Koushanfar. Key, value, compress: A systematic exploration of KV cache compression techniques. In 2025 IEEE Custom Integrated Circuits Conference (CICC), pages 1–3. IEEE, 2025.
    [3] Jushi Kai, Boyi Zeng, Yixuan Wang, Haoli Bai, Ziwei He, Bo Jiang, and Zhouhan Lin. FreqKV: Frequency domain key-value compression for efficient context window extension. arXiv preprint arXiv:2505.00570, 2025.
    [4] Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. Advances in Neural Information Processing Systems, 33:9459–9474, 2020.
    [5] Guangda Liu, Chengwei Li, Jieru Zhao, Chenqi Zhang, and Minyi Guo. ClusterKV: Manipulating LLM KV cache in semantic space for recallable compression. arXiv preprint arXiv:2412.03213, 2024.
    [6] A. Palu and B. Smith. KV-cache compression with low-rank projection. In International Conference on Learning Representations (ICLR), 2024.
    [7] Aurick Qiao, Zhewei Yao, Samyam Rajbhandari, and Yuxiong He. SwiftKV: Fast prefill-optimized inference with knowledge-preserving model transformation. arXiv preprint arXiv:2410.03960, 2024.
    [8] Hanshi Sun, Li-Wen Chang, Wenlei Bao, Size Zheng, Ningxin Zheng, Xin Liu, Harry Dong, Yuejie Chi, and Beidi Chen. ShadowKV: KV cache in shadows for high-throughput long-context LLM inference. arXiv preprint arXiv:2410.21465, 2024.
    [9] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.
    [10] Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, and Furu Wei. Multilingual E5 text embeddings: A technical report. arXiv preprint arXiv:2402.05672, 2024.
    [11] Jialong Wu, Zhenglin Wang, Linhai Zhang, Yilong Lai, Yulan He, and Deyu Zhou. SCOPE: Optimizing key-value cache compression in long-context generation. arXiv preprint arXiv:2412.13649, 2024.
    [12] Jingbo Yang, Bairu Hou, Wei Wei, Yujia Bao, and Shiyu Chang. KVLink: Accelerating large language models via efficient KV cache reuse. arXiv preprint arXiv:2502.16002, 2025.
    Description: Master's thesis
    National Chengchi University
    Department of Mathematical Sciences
    111751001
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0111751001
    Data Type: thesis
    Appears in Collections: [Department of Mathematical Sciences] Theses

    Files in This Item:

    File          Size      Format
    100101.pdf    1103 KB   Adobe PDF

