Reference: | [1] Y. T. Lee, Y. J. Tang, Y. C. Cheng, P. L. Chen, T. Y. Li, and H. H. Huang, "A Multi-grained Dataset for News Event Triggered Knowledge Update." pp. 4158-4162. [2] S. F. Chen, and J. Goodman, "An empirical study of smoothing techniques for language modeling." [3] L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989. [4] R. J. Williams, and D. Zipser, “A Learning Algorithm for Continually Running Fully Recurrent Neural Networks,” Neural Computation, vol. 1, no. 2, pp. 270-280, 1989. [5] S. Hochreiter, and J. Schmidhuber, “Long Short-Term Memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, 1997. [6] D. Bahdanau, K. Cho, and Y. Bengio, “Neural Machine Translation by Jointly Learning to Align and Translate,” ArXiv, vol. 1409, 09/01, 2014. [7] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, California, USA, 2017, pp. 6000–6010. [8] X. Liu, H.-F. Yu, I. S. Dhillon, and C.-J. Hsieh, “Learning to encode position for transformer with continuous dynamical model,” in Proceedings of the 37th International Conference on Machine Learning, 2020, pp. Article 587. [9] X. Tannier, and V. Moriceau, "Building event threads out of multiple news articles." pp. 958-967. [10] M. A. Hearst, S. T. Dumais, E. Osuna, J. Platt, and B. Scholkopf, “Support vector machines,” IEEE Intelligent Systems and their Applications, vol. 13, no. 4, pp. 18-28, 1998. [11] J. Platt, Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines, MSR-TR-98-14 ,, Microsoft, 1998. [12] S. Tarnpradab, F. Jafariakinabad, and K. A. Hua, “Improving Online Forums Summarization via Hierarchical Unified Deep Neural Network,” 2021. [13] H. T. Dang, and K. Owczarzak, "Overview of the TAC 2008 update summarization task." [14] J. Aslam, F. Diaz, M. Ekstrand-Abueg, R. McCreadie, V. Pavlu, and T. Sakai, TREC 2014 Temporal Summarization Track Overview, 2015. [15] J. A. Aslam, M. Ekstrand-Abueg, V. Pavlu, F. Diaz, and T. Sakai, "TREC 2013 Temporal Summarization." [16] S. Panthaplackel, A. Benton, and M. Dredze, "Updated Headline Generation: Creating Updated Summaries for Evolving News Stories." pp. 6438-6461. [17] F. Dernoncourt, M. M. Ghassemi, and W. Chang, "A Repository of Corpora for Summarization." [18] M. Banko, V. O. Mittal, and M. J. Witbrock, “Headline generation based on statistical translation,” in Proceedings of the 38th Annual Meeting on Association for Computational Linguistics, Hong Kong, 2000, pp. 318–325. [19] B. Dorr, D. Zajic, and R. Schwartz, Hedge Trimmer: A Parse-and-Trim Approach to Headline Generation, 2003. [20] K. Matsumaru, S. Takase, and N. Okazaki, “Improving Truthfulness of Headline Generation,” 2020. [21] S. Takase, J. Suzuki, N. Okazaki, T. Hirao, and M. Nagata, "Neural headline generation on abstract meaning representation." pp. 1054-1059. [22] D. Z. R. Schwartz, B. E. Door, and R. M. Schwartz, "Automatic Headline Generation for Newspaper Stories." [23] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, “RoBERTa: A Robustly Optimized BERT Pretraining Approach,” ArXiv, vol. abs/1907.11692, 2019. [24] W. Xiao, I. Beltagy, G. Carenini, and A. Cohan, "PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization," Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). pp. 5245-5263. [25] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” ArXiv, vol. abs/1810.04805, 2019. [26] T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean, “Distributed representations of words and phrases and their compositionality,” in Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, Lake Tahoe, Nevada, 2013, pp. 3111–3119. [27] A. Radford, and K. Narasimhan, "Improving Language Understanding by Generative Pre-Training." [28] Z. Lan, M. Chen, S. Goodman, K. Gimpel, P. Sharma, and R. Soricut, “Albert: A lite bert for self-supervised learning of language representations,” arXiv preprint arXiv:1909.11942, 2019. [29] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov, and L. Zettlemoyer, “Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension,” arXiv preprint arXiv:1910.13461, 2019. [30] J. Zhang, Y. Zhao, M. Saleh, and P. J. Liu, “PEGASUS: pre-training with extracted gap-sentences for abstractive summarization,” in Proceedings of the 37th International Conference on Machine Learning, 2020, pp. Article 1051. [31] B. Guo, Y. Gong, Y. Shen, S. Han, H. Huang, N. Duan, and W. Chen, “GENIUS: Sketch-based Language Model Pre-training via Extreme and Selective Masking for Text Generation and Augmentation,” arXiv preprint arXiv:2211.10330, 2022. [32] R. Campos, V. Mangaravite, A. Pasquali, A. Jorge, C. Nunes, and A. Jatowt, “YAKE! Keyword extraction from single documents using multiple local features,” Information sciences, vol. 509, pp. 257-289, 2020. [33] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov, and L. Zettlemoyer, “BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension,” arXiv pre-print server, 2019-10-29, 2019. [34] D. G. Ghalandari, C. Hokamp, N. T. Pham, J. Glover, and G. Ifrim, “A Large-Scale Multi-Document Summarization Dataset from the Wikipedia Current Events Portal,” 2020. [35] H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. e. Lacroix, B. Rozi"ere, N. Goyal, E. Hambro, F. Azhar, A. Rodriguez, A. Joulin, E. Grave, and G. Lample, “LLaMA: Open and Efficient Foundation Language Models,” arXiv pre-print server, 2023-02-27, 2023. [36] W.-L. Chiang, Z. Li, Z. Lin, Y. Sheng, Z. Wu, H. Zhang, L. Zheng, S. Zhuang, Y. Zhuang, J. E. Gonzalez, I. Stoica, and E. P. Xing, “Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90% ChatGPT Quality,” 2023. [37] G. Penedo, Q. Malartic, D. Hesslow, R. Cojocaru, A. Cappelli, H. Alobeidli, B. Pannier, E. Almazrouei, and J. Launay, “The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only,” arXiv pre-print server, 2023-06-01, 2023. [38] J. Wei, M. Bosma, Vincent, K. Guu, Adams, B. Lester, N. Du, Andrew, and Quoc, “Finetuned Language Models Are Zero-Shot Learners,” arXiv pre-print server, 2021-09-03, 2021. [39] Edward, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen, “LoRA: Low-Rank Adaptation of Large Language Models,” arXiv pre-print server, 2021-10-16, 2021. [40] C.-Y. Lin, "ROUGE: A Package for Automatic Evaluation of Summaries." [41] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, "Bleu: a method for automatic evaluation of machine translation." pp. 311-318. [42] S. Banerjee, and A. Lavie, "METEOR: An automatic metric for MT evaluation with improved correlation with human judgments." pp. 65-72. [43] T. Zhang, V. Kishore, F. Wu, K. Q. Weinberger, and Y. Artzi, “Bertscore: Evaluating text generation with bert,” arXiv preprint arXiv:1904.09675, 2019. [44] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, and P. J. Liu, “Exploring the limits of transfer learning with a unified text-to-text transformer,” Journal of machine learning research, vol. 21, 2020. [45] Y. Li, “Deep Reinforcement Learning,” arXiv pre-print server, 2018-10-15, 2018. |