Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/148475
Title: | Knowledge Update in Full Text Triggered by a News Event (新聞事件觸發之全文知識更新) |
Authors: | Lee, Yu-Ting (李昱廷) |
Contributors: | Li, Tsai-Yen (李蔡彥); Huang, Hen-Hsen (黃瀚萱); Lee, Yu-Ting (李昱廷) |
Keywords: | Text Generation; Temporal Knowledge Modeling; Update Summarization; Natural Language Generation; Knowledge Update; Large Language Model; Text Revision; News Event |
Date: | 2023 |
Issue Date: | 2023-12-01 10:34:04 (UTC+8) |
Abstract: | With the rapid growth of online information, daily news events and knowledge acquisition have become the primary channels through which people access information. New knowledge emerges every second, and keeping written content up to date requires considerable human effort and time. In this research, we propose a new natural language generation task, "Knowledge Update in Full Text Triggered by a News Event". The objective is to generate an updated article from an existing article (or an older version of its content) on a topic together with a news event about that topic. To support this objective, we construct a multi-granularity news dataset suitable for the task. The data are primarily sourced from Wikipedia articles, crawled and aligned with news events in multiple languages, and the dataset includes citations, first paragraphs, and full-text articles. We present an improved model architecture tailored to updating knowledge in full-length articles and validate the effectiveness of our framework with multiple large language models. |
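To make the task formulation in the abstract concrete, the sketch below shows the input/output structure it describes: an existing article on a topic plus a news event about that topic are combined into a prompt asking a large language model to produce the updated article. This is a minimal illustration only; the dataclass fields, the example topic, the prompt wording, and the commented-out model call are assumptions for exposition, not the thesis's actual dataset schema, prompt design, or model interface.

from dataclasses import dataclass

@dataclass
class KnowledgeUpdateSample:
    """One hypothetical example of news-event-triggered knowledge update.

    Field names are illustrative; they mirror the elements mentioned in the
    abstract (topic, old article, triggering news event, updated article),
    not the dataset's actual schema.
    """
    topic: str            # Wikipedia article title
    old_article: str      # existing article text (old version)
    news_event: str       # news report describing the triggering event
    updated_article: str  # human-written updated article (reference text)

def build_update_prompt(sample: KnowledgeUpdateSample) -> str:
    """Format the old article and the news event into a single prompt
    asking a large language model to generate the updated article."""
    return (
        f"Topic: {sample.topic}\n\n"
        f"Existing article:\n{sample.old_article}\n\n"
        f"News event:\n{sample.news_event}\n\n"
        "Rewrite the existing article so that it incorporates the new "
        "information from the news event. Return only the updated article."
    )

if __name__ == "__main__":
    sample = KnowledgeUpdateSample(
        topic="Example topic",
        old_article="Old version of the article text...",
        news_event="A news report describing what has changed...",
        updated_article="",  # reference text, unused in this sketch
    )
    prompt = build_update_prompt(sample)
    # updated = some_llm.generate(prompt)  # hypothetical model call
    print(prompt)

In this framing, the reference (human-written) updated article serves as the target against which a model's generated update can be scored with the usual text-generation metrics.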
Description: | Master's thesis, National Chengchi University, Department of Computer Science, 110753204 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0110753204 |
Data Type: | thesis |
Appears in Collections: | [Department of Computer Science] Theses |
Files in This Item:
File | Size | Format
320401.pdf | 6819 KB | Adobe PDF
All items in 政大典藏 are protected by copyright, with all rights reserved.