National Chengchi University Institutional Repository (NCCUR): Item 140.119/132125
    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/132125


    Title: 基於主動式學習之古漢語斷句系統發展與應用研究
    Development and Application of an Ancient Chinese Sentence Segmentation System Based on Active Learning
    Authors: Hsu, Chih-Fan (徐志帆); Chang, Chung (張鐘)
    Contributor: 圖資與檔案學刊
    Keywords: Digital humanities; Active learning; Machine learning; Automatic ancient Chinese sentence segmentation; Human-computer interaction
    Date: 2019-12
    Uploaded: 2020-10-07 11:54:23 (UTC+8)
    Abstract: This study develops an active-learning-based sentence segmentation system for ancient Chinese texts in support of digital humanities research. By combining active learning with machine learning algorithms in a human-machine cooperation mode, the system aims to reduce the training corpus needed to build an automated ancient Chinese sentence segmentation model and to help humanities researchers segment previously uninterpreted texts more efficiently. Two experiments were conducted for system development and evaluation. The first experiment compared segmentation models built with different algorithms and feature templates under two text selection methods, sequential selection and active learning selection, to identify the most suitable algorithm and feature template for the system. The results show that conditional random fields combined with a three-character feature template (三字詞特徵模板) learn effectively under active learning and are suitable for building the active learning sentence segmentation model. In the second experiment, six humanities researchers were invited to use the system to segment assigned ancient Chinese texts. The segmentation models built from each researcher's annotations were compared and analyzed, and semi-structured interviews were conducted to gather an in-depth understanding of their experience with the system and their suggestions for it. The results show that the system effectively learns from humanities researchers' segmentation annotations and that the model's predictions improve steadily through human-machine cooperation. According to the interviews, the researchers evaluated the system's workflow and interface positively, and most of them reported that the segmentation prediction function effectively assisted their sentence segmentation work. Future work could further improve prediction by incorporating a named entity model or feature templates based on other properties of ancient Chinese, such as phonological features or POS tagging. The system could also be developed into a digital humanities learning platform for training in ancient Chinese sentence segmentation.
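The abstract names two building blocks, a character-window feature template for a conditional random field and an active learning step that chooses which texts to annotate next, without giving their details. The sketch below is one plausible reading and not the authors' implementation. The `/` break marker, the `E`/`O` label scheme, the exact features in `window_features`, and the mean-entropy selection rule are all assumptions made for illustration.

```python
import math

BREAK = "/"  # assumed marker for a sentence break in annotated text


def to_labels(annotated):
    """Split annotated text into characters and per-character labels.

    A character is labelled "E" when a break marker follows it, else "O".
    """
    chars, labels = [], []
    for ch in annotated:
        if ch == BREAK:
            if labels:
                labels[-1] = "E"
        else:
            chars.append(ch)
            labels.append("O")
    return chars, labels


def window_features(chars, i):
    """Features for position i drawn from a three-character window.

    This is a hypothetical template: single characters, the two
    adjacent pairs, and the full three-character span around i.
    """
    prev_ch = chars[i - 1] if i > 0 else "<s>"
    next_ch = chars[i + 1] if i + 1 < len(chars) else "</s>"
    cur = chars[i]
    return {
        "c0": cur,
        "c-1": prev_ch,
        "c+1": next_ch,
        "c-1c0": prev_ch + cur,
        "c0c+1": cur + next_ch,
        "c-1c0c+1": prev_ch + cur + next_ch,
    }


def binary_entropy(p):
    """Entropy in bits of a yes/no decision with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def passage_uncertainty(break_probs):
    """Mean entropy of the per-character break probabilities."""
    if not break_probs:
        return 0.0
    return sum(binary_entropy(p) for p in break_probs) / len(break_probs)


def select_for_annotation(pool, k):
    """Return the ids of the k passages the model is least sure about.

    pool maps a passage id to the model's break probability for each
    character; how those probabilities are produced is left open here.
    """
    ranked = sorted(pool, key=lambda pid: passage_uncertainty(pool[pid]),
                    reverse=True)
    return ranked[:k]
```

In a loop of this kind, the passages returned by `select_for_annotation` would go to a human annotator, their labels would be added to the training set, and the model would be retrained before the next round. The per-character probabilities could come from the marginals of a trained CRF.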
    Relation: 圖資與檔案學刊, 95, 117-145
    Type: article
    DOI link: https://doi.org/10.6575/JILA.201912_(95).0004
    DOI: 10.6575/JILA.201912_(95).0004
    Appears in collections: [圖資與檔案學刊] Journal Articles

    Files in this item:

    File      Description   Size     Format      Views
    142.pdf                 1053Kb   Adobe PDF   2221


    All items in NCCUR are protected by the original copyright.



    Copyright Announcement
    1. The digital content of this website is part of the National Chengchi University Institutional Repository. It is provided free of charge for non-commercial uses such as academic research and public education. Please use it in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

    2. Every effort has been made in building the NCCU Institutional Repository to avoid infringing the rights of copyright owners. If you believe that any material on the website infringes copyright, please contact our staff (nccur@nccu.edu.tw). We will remove the work from the repository and investigate your claim.