政大機構典藏-National Chengchi University Institutional Repository(NCCUR):Item 140.119/112196
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/112196


    Title: 支援數位人文研究之文本自動標註系統發展與使用評估研究
    Development and evaluation of an automatic text annotation system for supporting digital humanities research
    Authors: 劉鎮宇
    Liu, Chen Yu
    Contributors: 陳志銘
    Chen, Chih Ming
    劉鎮宇
    Liu, Chen Yu
    Keywords: 數位人文
    自動標註系統
    中文自動斷詞
    鏈結資料
    Digital humanities
    Automatic annotation system
    Automatic Chinese word segmentation
    Linked data
    Date: 2017
    Issue Date: 2017-08-28 11:37:33 (UTC+8)
    Abstract: 在傳統的人文研究中,人文學者大多以如古籍珍善本、歷史文獻等紙本出版形式之文本為主要研究文本型式,但是隨著資訊社會的來臨,許多研究機構陸續將這些紙本資料進行數位化並建置數位典藏資料庫,對人文研究環境與知識取得管道帶來巨大的改變,基於數位閱讀之文本研究型式也成為必然的發展趨勢。
    因此,本研究發展支援數位人文研究之「文本自動標註系統」,藉由Linked Data的概念匯集來自不同資料庫的資源,並加以整合後,替文本進行自動註解,讓使用者在解讀文本時能夠即時參照其他資料庫的資源,並提供友善的具文本標註之閱讀介面,以利於人文學者透過閱讀進行資料的解讀。本研究以實驗研究法比較本研究所發展之「文本自動標註系統」與「MARKUS文本半自動標註系統」在支援人文學者進行文本資料解讀之閱讀成效與科技接受度是否具有顯著差異,並輔以半結構式深度訪談了解人文學者對於本研究發展之「文本自動標註系統」的看法及感受,也進一步分析「文本自動標註系統」閱讀成效、科技接受度及使用者行為歷程之間是否具有關聯性。
    實驗結果發現,採用本研究發展之文本自動標註系統的閱讀成效高於MARKUS文本半自動標註系統,但未達顯著差異;而科技接受度分析結果則顯示文本自動標註系統之科技接受度顯著優於MARKUS文本半自動標註系統。另外,從訪談結果歸納得知,文本自動標註系統閱讀介面簡潔明瞭,比MARKUS文本半自動標註系統更適合閱讀,而閱讀介面是否易於使用與是否有用,是影響人文學者能否接受採用系統輔助數位人文研究的重要因素。此外,在兩個系統類似功能比較分析後也發現,文本自動標註系統在查詢詞彙功能、連結到來源網站功能及新增標註功能都比MARKUS文本半自動標註系統更為直覺易用。另外人文學者普遍認為斷句功能比自動斷詞功能更重要,鏈結來源資料庫則以萌典最有幫助。最後,採用文本自動標註系統之閱讀成效與使用者行為歷程之間無顯著關聯性。
    Traditional humanities research has relied mainly on paper-based texts such as rare ancient books and historical documents. With the arrival of the information society, many research institutions have digitized these materials and built digital archive databases, which has greatly changed the humanities research environment and the channels for acquiring knowledge. Text research based on digital reading has therefore become an inevitable trend.
    For this reason, this study develops an “automatic text annotation system” to support digital humanities research. Drawing on the concept of Linked Data, the system gathers and integrates resources from different databases and uses them to annotate texts automatically, so that users can consult resources from other databases in real time while interpreting a text. It also provides a user-friendly reading interface with text annotation to help humanities scholars interpret materials through reading. Using an experimental research method, the study compares the “automatic text annotation system” with the “MARKUS semi-automatic text annotation system” to determine whether the two differ significantly in reading achievement and technology acceptance when supporting humanities scholars in interpreting texts. Semi-structured in-depth interviews are also conducted to understand humanities scholars’ opinions and perceptions of the “automatic text annotation system”, and the study further analyzes whether reading achievement, technology acceptance, and user behavior processes are correlated for that system.
    The experimental results show that reading achievement with the automatic text annotation system is higher than with the MARKUS semi-automatic text annotation system, but the difference is not statistically significant. Technology acceptance of the automatic text annotation system, however, is significantly higher than that of MARKUS. The interviews indicate that the reading interface of the automatic text annotation system is simple and clear and better suited to reading than that of MARKUS, and that whether the reading interface is easy to use and useful is an important factor in whether humanities scholars accept a system as an aid to digital humanities research. A comparison of similar functions in the two systems also finds that term lookup, linking to the source website, and adding annotations are more intuitive and easier to use in the automatic text annotation system than in MARKUS. In addition, humanities scholars generally consider the sentence segmentation function more important than the automatic word segmentation function, and among the linked source databases they find Moedict the most helpful. Finally, there is no significant correlation between reading achievement and user behavior processes with the automatic text annotation system.
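The annotation idea summarized in the abstract (merge entries from several source databases, then attach them to terms found in a text) can be illustrated with a minimal sketch. This is not the thesis's implementation: the matching strategy (forward maximum matching), the names `Annotation`, `merge_lexicons`, and `annotate`, and the toy lexicons are all assumptions made for illustration, and the small in-memory dictionaries stand in for whatever linked sources the real system queries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int       # index of the first character of the matched term
    end: int         # index one past the last character of the term
    term: str
    sources: tuple   # (source name, note) pairs merged from all lexicons


def merge_lexicons(*lexicons):
    """Merge (source_name, {term: note}) pairs into term -> [(source, note)]."""
    merged = {}
    for source_name, entries in lexicons:
        for term, note in entries.items():
            merged.setdefault(term, []).append((source_name, note))
    return merged


def annotate(text, merged):
    """Forward maximum matching: at each position, take the longest known term."""
    max_len = max((len(term) for term in merged), default=0)
    annotations = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink one character at a time.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in merged:
                match = candidate
                break
        if match is None:
            i += 1  # no known term starts here; move to the next character
            continue
        annotations.append(
            Annotation(i, i + len(match), match, tuple(merged[match]))
        )
        i += len(match)
    return annotations


# Hypothetical toy data standing in for two linked source databases.
dictionary = ("dictionary", {"蘇軾": "Song-dynasty writer", "蘇": "a surname"})
biographies = ("biographies", {"蘇軾": "person record (placeholder)"})

merged = merge_lexicons(dictionary, biographies)
result = annotate("蘇軾遊赤壁", merged)
for a in result:
    print(a.start, a.end, a.term, a.sources)
```

Preferring the longest match keeps a two-character name from being split into a shorter entry that happens to share its first character. A production system would more plausibly use a trained word segmenter and remote queries in place of these in-memory lookups.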
    Description: Master's thesis
    National Chengchi University
    Graduate Institute of Library, Information and Archival Studies
    104155014
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0104155014
    Data Type: thesis
    Appears in Collections:[Graduate Institute of Library, Information and Archival Studies] Theses



