Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/141816
Title: | 基於事件常識的合理故事改寫文字生成模型 (A Text Generation Model Based on Event Commonsense Knowledge for Reasonable Story Revision)
Authors: | Chuang, Ching-Yun (莊景雲)
Contributors: | Huang, Hen-Hsen (黃瀚萱); Chen, Yihsiu (陳宜秀); Chuang, Ching-Yun (莊景雲)
Keywords: | Natural language processing; Story rewriting; Commonsense knowledge extraction; Story generation
Date: | 2022 |
Issue Date: | 2022-09-02 15:42:52 (UTC+8) |
Abstract: | With the development of natural language processing technology, beyond training models to generate fluent text, increasing attention has been paid to how machines learn human knowledge and understand common sense. In recent years, more and more natural language processing tasks have incorporated knowledge bases containing human common sense, endowing machines with background knowledge so that they can perform tasks grounded in the common sense implied by the text. Commonsense knowledge has commonly been applied to story writing, story-ending prediction, and answering questions after reading a passage. This study combines background commonsense data from an external knowledge base to carry out a story rewriting task. Given only the beginning and ending of a story, the goal is to incorporate a commonsense knowledge dataset into the model so that, by understanding the commonsense knowledge implied behind the sentences, the rewritten story connects the beginning and ending more logically and fluently than previous models. Experimental results show that the stories generated by our model, which incorporates the external commonsense dataset, outperform the rewriting results of previous models. Human evaluation likewise shows that adding commonsense knowledge improves the causal relationships between generated sentences and the plausibility of the connection between the beginning and ending. We also find from the evaluation that linguistic factors such as word choice and sentence construction may influence human judgments.
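The abstract describes the task setup at a high level: given only a story's first and last sentences, generate the intervening sentences while conditioning on commonsense inferences about the events. The sketch below is a rough illustration of that setup, not the thesis's actual pipeline; the model name (t5-base), the prompt format, and the hand-written commonsense strings are all illustrative assumptions.

```python
# Minimal sketch of the story-infilling setup described in the abstract:
# given only the first and last sentences of a story, generate the middle,
# conditioning on commonsense inferences about the ending event.
# Model name, prompt format, and the commonsense strings are assumptions,
# not the author's actual method.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

beginning = "Jenna signed up for her first marathon."
ending = "She crossed the finish line in tears of joy."
# In the thesis, such inferences would come from an external commonsense
# knowledge base; here they are hard-coded for illustration.
commonsense = "effect: feels proud; need: to train for months"

prompt = (
    f"beginning: {beginning} ending: {ending} "
    f"commonsense: {commonsense} generate the middle of the story:"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In practice such a model would be fine-tuned on story data before its generations are useful; the snippet only shows how the beginning, ending, and commonsense inferences could be packed into a single conditioning input.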
Description: | Master's thesis, National Chengchi University, Master's Program in Digital Content, 109462007
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0109462007 |
Data Type: | thesis |
DOI: | 10.6814/NCCU202201499 |
Appears in Collections: | [Master's Program in Digital Content] Theses
Files in This Item:
File | Description | Size | Format
200701.pdf | | 7259 KB | Adobe PDF