    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/158575


    Title: 針對KGW風格水印技術在大型語言模型中的輕量化增強方法
    A Lightweight Enhancement for KGW-Style Watermarking in Large Language Models
    Authors: 陳彥邦 (Chen, Yen-Pang)
    Contributors: 郁方 (Yu, Fang)
    洪智鐸 (Hong, Chih-Duo)
    陳彥邦 (Chen, Yen-Pang)
    Keywords: NCCU (政治大學)
    LLM Watermarking (LLM水印技術)
    Generative AI (生成式人工智慧)
    Machine-Generated Text Detection (機器生成文本偵測)
    Date: 2025
    Issue Date: 2025-08-04 14:26:58 (UTC+8)
    Abstract: As large language models (LLMs) grow ever more capable of producing fluent, natural text, concerns have mounted over their misuse in misinformation, impersonation, and academic dishonesty. Soft watermarking was introduced to mark such AI-generated content: by subtly biasing token selection during generation, it raises the likelihood that machine-generated text can later be identified. Existing methods, however, perform poorly on low-variation content such as code, formatted writing, and repetitive phrasing, mainly because the limited usable vocabulary yields a weak detection signal; and since watermark strength is typically kept low to preserve fluency, detection performance degrades further. This thesis proposes a simple and effective enhancement: red tokens or n-grams that are sampled frequently during generation are collected and excluded at detection time, removing high-probability fragments that contribute little statistical evidence and thereby amplifying the watermark signal. The method has low computational overhead, applies to any KGW-style watermarking scheme, and, as experiments show, maintains high detection accuracy even at low watermark strength while keeping the false positive rate low.
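    For intuition, here is a minimal Python sketch of the kind of KGW-style detection the abstract describes, extended with the proposed filtering of frequently repeated red fragments. It is illustrative only: the keyed-hash green/red split, the bigram granularity, and the FREQ_THRESHOLD cutoff are assumptions made for this sketch, not the thesis's actual implementation or settings.

    ```python
    import hashlib
    from collections import Counter
    from math import sqrt

    GAMMA = 0.5          # assumed green-list fraction of the vocabulary
    FREQ_THRESHOLD = 3   # assumed cutoff: red bigrams repeating this often are filtered

    def is_green(prev_tok: str, tok: str, key: str = "secret-key") -> bool:
        """Toy KGW-style partition: a keyed hash of the previous token and the
        current token decides whether the current token is on the green list."""
        digest = hashlib.sha256(f"{key}|{prev_tok}|{tok}".encode()).digest()
        return digest[0] < 256 * GAMMA

    def z_score(green: int, total: int, gamma: float = GAMMA) -> float:
        """Under the null (unwatermarked text) each scored token is green with
        probability gamma: z = (green - gamma*T) / sqrt(T * gamma * (1 - gamma))."""
        if total == 0:
            return 0.0
        return (green - gamma * total) / sqrt(total * gamma * (1 - gamma))

    def detect(tokens: list[str], filter_frequent_red: bool = True) -> float:
        """Return the detection z-score; optionally exclude frequently repeated
        red bigrams, the enhancement sketched in the abstract."""
        bigrams = list(zip(tokens, tokens[1:]))

        # A bigram's color is deterministic, so a repeated red bigram inflates
        # the red count without contributing independent statistical evidence.
        red_counts = Counter(bg for bg in bigrams if not is_green(*bg))
        frequent_red = {bg for bg, n in red_counts.items() if n >= FREQ_THRESHOLD}

        green = total = 0
        for bg in bigrams:
            if filter_frequent_red and bg in frequent_red:
                continue  # drop low-evidence, high-frequency red fragments
            total += 1
            green += is_green(*bg)
        return z_score(green, total)
    ```

    On low-variation watermarked text (e.g., code, where the same keyword bigrams recur), a repeated red bigram is red every time it appears, so it piles up red counts without adding independent evidence; excluding such fragments concentrates the statistic on informative tokens and lifts the z-score at a fixed watermark strength.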
    Reference: Aaronson, S. and Kirchner, H. (2022). Watermarking GPT outputs. https://www.scottaaronson.com/talks/watermark.ppt. Presentation slides.
    Christ, M., Gunn, S., and Zamir, O. (2023). Undetectable watermarks for language models.
    Dathathri, S., See, A., Ghaisas, S., Huang, P.-S., McAdam, R., Welbl, J., Bachani, V., Kaskasoli, A., Stanforth, R., Matejovicova, T., et al. (2024). Scalable watermarking for identifying large language model outputs. Nature, 634(8035):818–823.
    Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Burstein, J., Doran, C., and Solorio, T., editors, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.
    He, Z., Zhou, B., Hao, H., Liu, A., Wang, X., Tu, Z., Zhang, Z., and Wang, R. (2024). Can watermarks survive translation? On the cross-lingual consistency of text watermark for large language models.
    Hou, A. B., Zhang, J., He, T., Wang, Y., Chuang, Y.-S., Wang, H., Shen, L., Durme, B. V., Khashabi, D., and Tsvetkov, Y. (2024). SemStamp: A semantic watermark with paraphrastic robustness for text generation.
    Kirchenbauer, J., Geiping, J., Wen, Y., Katz, J., Miers, I., and Goldstein, T. (2024a). A watermark for large language models.
    Kirchenbauer, J., Geiping, J., Wen, Y., Shu, M., Saifullah, K., Kong, K., Fernando, K., Saha, A., Goldblum, M., and Goldstein, T. (2024b). On the reliability of watermarks for large language models.
    Kuditipudi, R., Thickstun, J., Hashimoto, T., and Liang, P. (2024). Robust distortion-free watermarks for language models.
    Lee, T., Hong, S., Ahn, J., Hong, I., Lee, H., Yun, S., Shin, J., and Kim, G. (2024). Who wrote this code? Watermarking for code generation.
    Li, Z. (2025). BiMarker: Enhancing text watermark detection for large language models with bipolar watermarks.
    Liu, A., Pan, L., Hu, X., Meng, S., and Wen, L. (2024a). A semantic invariant robust watermark for large language models.
    Liu, A., Pan, L., Lu, Y., Li, J., Hu, X., Zhang, X., Wen, L., King, I., Xiong, H., and Yu, P. (2024b). A survey of text watermarking in the era of large language models. ACM Computing Surveys, 57(2):1–36.
    Lu, Y., Liu, A., Yu, D., Li, J., and King, I. (2024). An entropy-based text watermarking detection method.
    Miller, G. A. (1994). WordNet: A lexical database for English. In Human Language Technology: Proceedings of a Workshop held at Plainsboro, New Jersey, March 8-11, 1994.
    Pan, L., Liu, A., He, Z., Gao, Z., Zhao, X., Lu, Y., Zhou, B., Liu, S., Hu, X., Wen, L., King, I., and Yu, P. S. (2024). MarkLLM: An open-source toolkit for LLM watermarking.
    Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., and Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1–67.
    Sun, Z., Du, X., Song, F., and Li, L. (2023). CodeMark: Imperceptible watermarking for code datasets against neural code completion models. In Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE ’23, pages 1561–1572. ACM.
    Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., Davison, J., Shleifer, S., von Platen, P., Ma, C., Jernite, Y., Plu, J., Xu, C., Scao, T. L., Gugger, S., Drame, M., Lhoest, Q., and Rush, A. M. (2020). HuggingFace’s Transformers: State-of-the-art natural language processing.
    Xu, H., Xiang, L., Yang, B., Ma, X., Chen, S., and Li, B. (2025). TokenMark: A modality-agnostic watermark for pre-trained transformers.
    Zhao, X., Ananth, P., Li, L., and Wang, Y.-X. (2023). Provable robust watermarking for ai-generated text.
    Description: 碩士 (Master's thesis)
    國立政治大學 (National Chengchi University)
    資訊管理學系 (Department of Management Information Systems)
    112356028
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0112356028
    Data Type: thesis
    Appears in Collections: [資訊管理學系] 學位論文 ([Department of Management Information Systems] Theses)

    Files in This Item:

    File          Size     Format
    602801.pdf    1647 KB  Adobe PDF


    All items in 政大典藏 are protected by copyright, with all rights reserved.

