National Chengchi University Institutional Repository (NCCUR): Item 140.119/73570
RC Version 6.0 © Powered By DSPACE, MIT. Enhanced by NTU Library IR team.
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/73570


    Title: 運用社會網絡技術由文集中探勘觀念:以新青年為例
    Concept Discovery from Essays based on Social Network Mining: Using New Youth as an Example
    Authors: 陳柏聿
    Chen, Po Yu
    Contributors: 沈錳坤
    Shan, Man Kwan
    陳柏聿
    Chen, Po Yu
    Keywords: 社會網絡分析
    觀念探勘
    文字探勘
    Social Network Analysis
    Concept Mining
    Text Mining
    Date: 2014
    Issue Date: 2015-03-02 10:13:20 (UTC+8)
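    Abstract: Scholars in the humanities and history have traditionally studied and analyzed their sources by hand. This is workable when the volume of material is small, but with the progress of digital archiving and the rise of big data, books, classical texts, and documents are being digitized in large numbers, and continuing to analyze them item by item costs a great deal of time and labor. Because the material is now digital, however, researchers in information science can assist with computational techniques.
    In the history of ideas, the study of keyword clusters is one of the central topics: a concept can be expressed by a keyword or by sentences containing that keyword, so studying keywords helps humanities scholars understand the meaning behind historical documents and grasp the context of the period. This thesis therefore takes a collection of many essays and examines how pairs of terms occur together in the essays. Using five kinds of co-occurrence relationship, it brings social network concepts into text analysis: each term is a node and the association between two terms is an edge, forming a term network from which the concepts formed by the terms are identified. Finally, we implement a system for mining concepts from an essay collection, which provides three analysis functions: multi-keyword concept query, single-keyword concept query, and latent concept mining.
    The study uses the magazine New Youth (《新青年》) as its main corpus and experimental case. The concepts in New Youth shifted from liberalism to Marxism-Leninism, and with the system we were indeed able to trace the trajectory of that change and to mine the key terms under each of the two concepts.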
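As a rough illustration of the term network described in the abstract (each term a node, each association an edge), the sketch below builds a weighted co-occurrence network from essays that are assumed to be already segmented into tokens, then compares a keyword's strongest neighbours in two periods. The abstract does not say which five co-occurrence relationships the thesis uses, so the Jaccard coefficient over essays is an assumption made here for illustration only. The function names and the toy essays are hypothetical and are not data from New Youth.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(docs, min_weight=0.0):
    """Build a weighted keyword network from tokenised essays.

    docs: list of token lists, one per essay (segmentation assumed done).
    Returns {(a, b): weight} with a < b. The weight is the Jaccard
    coefficient of the sets of essays containing a and b, which is an
    assumed measure, not necessarily one used in the thesis.
    """
    doc_freq = Counter()   # number of essays containing each term
    pair_freq = Counter()  # number of essays containing both terms
    for tokens in docs:
        vocab = sorted(set(tokens))
        doc_freq.update(vocab)
        pair_freq.update(combinations(vocab, 2))
    edges = {}
    for (a, b), both in pair_freq.items():
        weight = both / (doc_freq[a] + doc_freq[b] - both)
        if weight > min_weight:
            edges[(a, b)] = weight
    return edges


def top_neighbours(edges, keyword, k=3):
    """Return the k terms most strongly linked to keyword (ties: alphabetical)."""
    scored = []
    for (a, b), w in edges.items():
        if a == keyword:
            scored.append((w, b))
        elif b == keyword:
            scored.append((w, a))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [word for _, word in scored[:k]]


# Invented toy essays for two periods (hypothetical, for illustration only).
early = [
    ["youth", "freedom", "individual"],
    ["youth", "freedom", "science"],
    ["youth", "science", "democracy"],
]
late = [
    ["youth", "class", "labour"],
    ["youth", "class", "revolution"],
    ["youth", "revolution", "labour"],
]

early_top = top_neighbours(cooccurrence_network(early), "youth", k=2)
late_top = top_neighbours(cooccurrence_network(late), "youth", k=2)
print(early_top, late_top)
```

Comparing the neighbour lists of the same keyword across periods is one simple way to make a shift in associated vocabulary visible. A real corpus would also need word segmentation and stop-word filtering before this step.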
    With the development of digital archives, essays have been digitized. Analyzing the contents of essays by hand takes considerable time, so computer-assisted analysis is beneficial. This thesis investigates an approach to discovering the concepts in essays based on social network mining techniques. Since a concept can be represented as a set of keywords, the proposed approach measures the co-occurrence relationships between pairs of keywords and represents the relationships among keywords as keyword networks. Social network mining techniques are then employed to discover the concepts in the essays. We also develop a concept discovery system that provides discovery by multiple keywords, discovery by a single keyword, and latent concept mining. New Youth is taken as an example to demonstrate the capability of the developed system.
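The three functions named in the abstract (discovery by multiple keywords, discovery by a single keyword, and latent concept mining) could be approximated on a keyword network along the following lines. This is a minimal sketch under stated assumptions, not the thesis's system: the edge weights are invented, the threshold is arbitrary, and connected components are used as a deliberately simple stand-in for a proper community detection algorithm such as the modularity-based methods listed in the references. All function names are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges, threshold):
    """Keep edges with weight >= threshold as an undirected adjacency map."""
    adj = defaultdict(set)
    for (a, b), weight in edges.items():
        if weight >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def query_single(adj, keyword):
    """Single-keyword query: terms directly linked to the keyword."""
    return sorted(adj.get(keyword, set()))


def query_multi(adj, keywords):
    """Multi-keyword query: terms linked to every one of the keywords."""
    neighbour_sets = [adj.get(k, set()) for k in keywords]
    if not neighbour_sets:
        return []
    common = set.intersection(*neighbour_sets)
    return sorted(common - set(keywords))


def latent_concepts(adj):
    """Group terms into connected components.

    Stand-in for community detection (assumption): each component is
    treated as one candidate concept.
    """
    seen, groups = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adj[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


# Invented edge weights (hypothetical, for illustration only).
toy_edges = {
    ("freedom", "individual"): 0.8,
    ("freedom", "science"): 0.6,
    ("individual", "science"): 0.5,
    ("class", "labour"): 0.9,
    ("class", "revolution"): 0.7,
    ("labour", "revolution"): 0.6,
    ("freedom", "revolution"): 0.1,  # weak link, dropped by the threshold
}

adj = build_adjacency(toy_edges, threshold=0.3)
single = query_single(adj, "class")
multi = query_multi(adj, ["freedom", "science"])
concepts = latent_concepts(adj)
print(single, multi, concepts)
```

Thresholding before grouping matters here: with the weak "freedom"/"revolution" edge kept, connected components would merge everything into one group, which is one reason modularity-based community detection is usually preferred on dense co-occurrence networks.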
    Reference: [1] R. Agrawal, T. Imielinski, and A. Swami, “Mining Association Rules between Sets of Items in Large Databases,” Proceedings of the ACM SIGMOD International Conference on Management of Data, Washington, D.C., May 1993.
    [2] R. Agrawal and R. Srikant, “Fast Algorithms for Mining Association Rules,” Proceedings of the 20th International Conference on Very Large Data Bases, 1994.
    [3] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast Unfolding of Communities in Large Networks,” Journal of Statistical Mechanics: Theory and Experiment, P10008, 2008.
    [4] P. Bonacich, “Factoring and Weighting Approaches to Status Scores and Clique Identification,” Journal of Mathematical Sociology, Vol. 2, No. 1, pp. 113-120, 1972.
    [5] R. L. Breiger, “The Analysis of Social Networks,” Handbook of Data Analysis, London: Sage Publications, pp. 505-526, 2004.
    [6] H. Cramér, “Mathematical Methods of Statistics,” Princeton University Press, Princeton, p. 282, 1946.
    [7] L. C. Freeman, “Centrality in Social Networks: Conceptual Clarification,” Social Networks, Vol. 1, No. 3, pp. 215-239, 1979.
    [8] M. Girvan and M. E. J. Newman, “Community Structure in Social and Biological Networks,” Proceedings of the National Academy of Sciences (PNAS), Vol. 99, No. 12, pp. 7821-7826, 2002.
    [9] J.-W. Huang, B.-R. Dai, and M.-S. Chen, “Twain: Two-End Association Miner with Precise Frequent Exhibition Periods,” ACM Transactions on Knowledge Discovery from Data, Vol. 1, No. 2, 2007.
    [10] K. S. Jones, “A Statistical Interpretation of Term Specificity and Its Application in Retrieval,” Journal of Documentation, Vol. 28, pp. 11-24, 1972.
    [11] C. D. Manning, P. Raghavan, and H. Schütze, “Introduction to Information Retrieval,” Cambridge University Press, Cambridge, 2008.
    [12] M. E. J. Newman, “Fast Algorithm for Detecting Community Structure in Networks,” Physical Review E, Vol. 69, No. 6, 066133, 2004.
    [13] M. E. J. Newman, “The Structure and Function of Complex Networks,” SIAM Review, Vol. 45, No. 2, 2003.
    [14] K. Pearson, “Note on Regression and Inheritance in the Case of Two Parents,” Proceedings of the Royal Society of London, Vol. 58, pp. 240-242, 1895.
    [15] K. Pearson, “On the Criterion that a given System of Deviations from the Probable in the Case of a Correlated System of variables is such that it can be reasonably supposed to have arisen from Random Sampling,” Philosophical Magazine, Series 5, Vol. 50, No. 302, pp. 157-175, 1900.
    [16] C. Spearman, “The Proof and Measurement of Association Between Two Things,” American Journal of Psychology, Vol. 15, pp. 72-101, 1904.
    [17] G. Salton, A. Wong, and C. S. Yang, “A Vector Space Model for Automatic Indexing,” Communications of the ACM, Vol. 18, No. 11, pp. 613-620, 1975.
    [18] S. Wasserman and K. Faust, “Social Network Analysis: Methods and Applications,” Cambridge University Press, Cambridge, 1994.
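    [19] 項潔 and 涂豐恩, “Introduction: What Is Digital Humanities” (〈導論——什麼是數位人文〉), in From Preservation to Creation: Opening Up Digital Humanities Research (《從保存到創造:開啟數位人文研究》), pp. 9-28, Taipei: National Taiwan University Press, 2011.
    [20] 金觀濤 and 劉青峰, Studies in the History of Ideas: The Formation of Important Modern Chinese Political Terms (〈觀念史研究:中國現代重要政治術語的形成〉), Chinese University Press, 2008.
    [21] 金觀濤, 梁穎誼, 姚育松, and 劉昭麟, “Applying Statistical Outlier Analysis to Humanities Research: The Case of New Youth” (〈統計偏離值分析於人文研究上的應用:以《新青年》為例〉), 4th International Conference of Digital Archives and Digital Humanities, 2012.
    [22] 金觀濤, 邱偉雲, and 劉昭麟, “Co-occurrence Word Frequency Analysis and Its Application: The Origin of the Concept of ‘Huaren’ (華人) as an Example” (〈「共現」詞頻分析及其運用-以「華人」觀念起源為例〉), 4th International Conference of Digital Archives and Digital Humanities, 2012.
    [23] 余清祥, “Applications of Statistics to Dream of the Red Chamber” (〈統計在紅樓夢的應用〉), 《政人學報》, No. 76, pp. 303-327, 1998.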
    [24] Database for the Study of Modern Chinese Thought and Literature, 1830-1930 (中國近現代思想及文學史專業數據), http://digibase.ssic.nccu.edu.tw/?m=2302&wsn=0101
    [25] Introduction to the New Youth documents (《新青年》文獻簡介), http://digibase.ssic.nccu.edu.tw/?m=2302&wsn=0304
    [26] Jieba Chinese word segmentation package, https://github.com/fxsjy/jieba
    [27] community package, https://bitbucket.org/taynaud/python-louvain
    [28] NetworkX package, https://networkx.github.io/
    Description: Master's
    National Chengchi University
    Department of Computer Science
    100753013
    103
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0100753013
    Data Type: thesis
    Appears in Collections: [Department of Computer Science] Theses

    Files in This Item:

    File          Size    Format
    index.html    0Kb     HTML      2400    View/Open


    All items in the NCCU Institutional Repository (政大典藏) are protected by copyright, with all rights reserved.



    Copyright Announcement
    1. The digital content of this website is part of the National Chengchi University Institutional Repository. It is provided free of charge for non-commercial uses such as academic research and public education. Please use it in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

    2. Every effort has been made in building this website to avoid infringing the rights of copyright owners. If you believe that any material on the website infringes copyright, please contact our staff (nccur@nccu.edu.tw). We will remove the work from the repository and investigate your claim.