|
Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/99806
|
Title: | 《全唐詩》的初步分析: 版本比對、詩歌對應與社群網絡 Some Studies of the Complete Tang Poems: Version Comparison, Word Alignment, and Social Network Analysis |
Authors: | 羅國峯 Luo, Kuo Feng |
Contributors: | 劉昭麟 Liu, Chao Lin 羅國峯 Luo, Kuo Feng |
Keywords: | 唐詩 數位人文 文字探勘 社群網絡 Tang poem Digital Humanities text mining social network |
Date: | 2016 |
Issue Date: | 2016-08-09 11:24:53 (UTC+8) |
Abstract: | As digitization accelerates, more and more historical material that once existed only on paper is being preserved in digital form. Digital humanities, which applies computer science techniques to humanities corpora, has become one of the major trends in the use of digital technology. This thesis takes the Complete Tang Poems (《全唐詩》), compiled during the Qing Dynasty, as its corpus and uses it for version comparison, word alignment between poems, information retrieval, and social network analysis. The Complete Tang Poems exists in several versions. We collected multiple versions, compared them, and built a retrieval system that lets users search the text and see how the versions differ in the search results. To support the subsequent analysis we also produced an integrated version, although we do not claim that it is the best version. Using the integrated version, we identified exchange poems, in which a poet mentions other poets, as well as shared-vocabulary relations between poets' works, built a social network of Tang poets, and analyzed the vocabulary of the poets in that network.
For the results, we divided the Tang poets into four periods of the Tang Dynasty, compiled statistics on word usage in each period, built a social network for the poets of a single period, and computed the degree of each node. To understand how poets in the network use words and how those words are used across the Complete Tang Poems, we provide a tool that helps researchers find the frequently used words that poets share. Finally, based on the word correspondences between poets, we provide a tool for studying the history of words: a queried word is labeled with the birth and death years of the authors of the poems in which it appears, and the labeled occurrences are plotted as a time series so that changes in the word's usage can be observed. |
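The network-building step described in the abstract (linking a poet to the poets mentioned in his or her poems, then computing node degree) can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy poems, the `poets` set, and the function names `build_mention_network` and `degrees` are assumptions made for illustration, and simple substring matching on full names stands in for whatever name recognition the thesis actually uses.

```python
from collections import defaultdict

# Hypothetical toy data: (author, poem text) pairs, not drawn from the thesis corpus.
poems = [
    ("李白", "贈孟浩然 吾愛孟夫子"),
    ("杜甫", "春日憶李白 白也詩無敵"),
    ("杜甫", "夢李白 故人入我夢"),
    ("孟浩然", "春曉 春眠不覺曉"),
]
# Hypothetical list of known poet names to look for in poem text.
poets = {"李白", "杜甫", "孟浩然"}


def build_mention_network(poems, poets):
    """Return an undirected adjacency map: an author is linked to
    every other known poet whose name appears in the author's poems."""
    graph = defaultdict(set)
    for author, text in poems:
        for other in poets:
            if other != author and other in text:
                graph[author].add(other)
                graph[other].add(author)
    return graph


def degrees(graph):
    """Degree of a node = number of distinct neighbours."""
    return {poet: len(neighbours) for poet, neighbours in graph.items()}


network = build_mention_network(poems, poets)
degree_by_poet = degrees(network)
print(degree_by_poet)
```

Repeated mentions between the same two poets collapse into a single edge here because neighbours are stored in a set; a weighted variant would count them instead.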
Description: | Master's thesis, National Chengchi University, Department of Computer Science, 103753026 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0103753026 |
Data Type: | thesis |
Appears in Collections: | [Department of Computer Science] Theses
|
Files in This Item:
File | Size | Format |
302601.pdf | 3161Kb | Adobe PDF |
|
|