    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/146904


    Title: 《聯合報》及《人民日報》報導風格比較
    Comparing Writing Styles of United Daily News and People’s Daily
    Authors: 廖靖芸 (Liao, Ching-Yun)
    Contributors: 陳怡如 (Chen, Yi-Ju)
    余清祥 (Yue, Ching-Syang)
    廖靖芸 (Liao, Ching-Yun)
    Keywords: Text Mining
    Writing Style
    Species Diversity
    Keywords
    Association
    Date: 2023
    Issue Date: 2023-09-01 14:57:14 (UTC+8)
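    Abstract: As the saying goes, "each place's water and soil nurture its own people": because of differences in environment, institutions, lifestyles, and ideas, residents of two places that share the same language and ethnicity may still differ sharply in their cultural traits. China and Taiwan both belong to the Huaxia people and share a similar language, culture, and family system, but since the 1950s the two sides of the Taiwan Strait have adopted different political systems. Together with foreign cultural influence and ethnic integration, this has made the differences in customs between Taiwan and China more evident over time. This study uses the text of newspaper reports from China and Taiwan as the basis for comparison, analyzing writing style with text mining methods to identify clear differences in wording and ideas across the strait. For China, we select front-page reports from People's Daily (《人民日報》) from 1946 to 2021; People's Daily is the official newspaper of the Chinese Communist Party and records major news from the founding of the People's Republic of China to the present. For Taiwan, we select editorials from United Daily News (《聯合報》) from 1960 to 2021; United Daily News is one of Taiwan's three major newspapers and has the longest history among them.
    This study adopts Exploratory Data Analysis and brings in the concepts of biodiversity and habitat, treating single characters and two-character words as biological species in order to explore writing style and the associations and clusters of keywords. First, we consider the text and structure of the two newspapers, including punctuation marks, function words, and sentence structure, and use richness and unevenness measures such as Entropy and TTR (Type-Token Ratio) to extract the important features of each newspaper. We find clear differences in the text and structure of the two newspapers: diversity, unevenness, and sentence length all reveal characteristics that each newspaper shows under different historical events. Next, we use the keyword detection methods TF-IDF and TextRank, together with word frequency, to select antecedent words; we add the features obtained from the text and structure analysis to search for keyword clusters; and we use the chi-square test of independence and association measures to find the subordinate words most strongly associated with each antecedent word. For example, the antecedent word 「臺灣」 (Taiwan) in People's Daily often appears with 「西藏」 (Tibet) and 「問題」 (issue), and in the first period (1946~1945 [sic]) two-character words related to the Chinese Civil War, such as 「殘匪」 (remnant bandits) and 「消滅」 (eliminate), also appear. In United Daily News, 「臺灣」 begins to co-occur frequently with 「獨立」 (independence) from the second period (1979~1987), and in the fourth period (2002~2021) 「領土」 (territory), a two-character word that places more emphasis on sovereignty, appears. Deconstructing the text and structure of the newspapers therefore shows that changes in word groups track cross-strait historical events and important contemporary issues, and the changes in the parts of speech of subordinate words further reveal differences in wording between the newspapers of the two sides.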
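    The two measures named above, TTR and Entropy, can be illustrated with a minimal sketch. It assumes the usual textbook definitions (TTR as distinct tokens over total tokens, Entropy as Shannon entropy in bits of the token frequency distribution); the thesis may define or normalize them differently. The function names and the sample string are hypothetical and are not taken from the thesis data.

```python
import math
from collections import Counter


def type_token_ratio(tokens):
    """Number of distinct tokens (types) divided by the total token count."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def shannon_entropy(tokens):
    """Shannon entropy, in bits, of the token frequency distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def char_bigrams(text):
    """Overlapping two-character sequences, a rough stand-in for two-character words."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


# Hypothetical sample text, for illustration only.
sample = "臺灣問題是中國問題"
single_chars = list(sample)
bigrams = char_bigrams(sample)

print("single-character TTR:", type_token_ratio(single_chars))
print("single-character entropy:", shannon_entropy(single_chars))
print("two-character TTR:", type_token_ratio(bigrams))
print("two-character entropy:", shannon_entropy(bigrams))
```

    Treating each distinct character or two-character sequence as a "species" means that any diversity index from ecology can be computed from the same frequency counts. A real analysis would segment words properly rather than take raw overlapping bigrams.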
    Residents of Taiwan and China share the same language and ethnicity, but their cultural and social characteristics are very different. These differences can be caused by education, economy, and lifestyle. Since the 1950s, Taiwan and China have adopted different political systems; coupled with factors such as foreign culture and ethnic integration, the differences in customs between Taiwan and China have become more obvious. This study examines reports from newspapers in China and Taiwan, analyzing their writing style and identifying distinct disparities in word usage across the Taiwan Strait. For China, we select front-page reports from People’s Daily (1946-2021); for Taiwan, we select editorials from United Daily News (1960-2021), one of the three major newspapers in Taiwan.
    This research adopts Exploratory Data Analysis and introduces the concept of species diversity into text mining by treating single characters and two-character words as biological species. We also consider punctuation marks, function words, and sentence structure in the data analysis. We use indicators such as Entropy, TTR (Type-Token Ratio), and other measures of richness and unevenness to extract the important features of the two newspapers. We found that the two newspapers show their own characteristics under different historical events. In addition, we use the keyword detection methods TF-IDF and TextRank, together with word frequency, to select antecedent words, and then use the chi-square test of independence and association indices to find the words most strongly associated with them. Notable results include that “Taiwan” is often mentioned with “Tibet” in People’s Daily, and that “Taiwan” often co-occurs with “Independence” in United Daily News from the 1979-1987 period onward.
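    The association step can be sketched as a chi-square test of independence on a 2x2 co-occurrence table. This is a minimal illustration under stated assumptions, not the thesis's implementation: each document is assumed to be reduced to a set of tokens, co-occurrence is counted at the document level, and the standard Pearson statistic without continuity correction is used. The function names and the toy documents are hypothetical.

```python
def cooccurrence_table(docs, antecedent, candidate):
    """Count documents by presence/absence of two words.

    Returns (a, b, c, d): both present, antecedent only,
    candidate only, neither present.
    """
    a = b = c = d = 0
    for doc in docs:
        has_antecedent = antecedent in doc
        has_candidate = candidate in doc
        if has_antecedent and has_candidate:
            a += 1
        elif has_antecedent:
            b += 1
        elif has_candidate:
            c += 1
        else:
            d += 1
    return a, b, c, d


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table (1 degree of freedom)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin means the table carries no evidence of association.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical toy corpus: each document is a set of tokens.
docs = [
    {"taiwan", "independence", "policy"},
    {"taiwan", "independence"},
    {"economy", "trade"},
    {"trade", "policy"},
]

CRITICAL_5_PERCENT = 3.841  # chi-square critical value, 1 df, 5% level

for candidate in ("independence", "policy"):
    stat = chi_square_2x2(*cooccurrence_table(docs, "taiwan", candidate))
    print(candidate, stat, stat > CRITICAL_5_PERCENT)
```

    With only a handful of documents the chi-square approximation is unreliable; the toy corpus is there only to show the mechanics. Ranking candidate words by the statistic for a fixed antecedent word gives one simple way to surface its most strongly associated words.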
    Description: Master's thesis
    National Chengchi University
    Department of Statistics
    110354019
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0110354019
    Data Type: thesis
    Appears in Collections: [Department of Statistics] Theses


