National Chengchi University Institutional Repository (NCCUR): Item 140.119/147028


    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/147028


    Title: Using Machine Learning and Pattern-Based Methods for Identifying Elements and Issues in Chinese Judgment Documents of Civil Cases: A Case Study with Alimony Cases
    (original title: 以機器學習與規則方法辨識中文民事裁判書結構與爭點 : 以給付扶養費為例)
    Author: Lin, Hong-Ren (林泓任)
    Contributors: Liu, Chao-Lin (劉昭麟)
    Lin, Hong-Ren (林泓任)
    Keywords: Natural language processing
    Machine learning
    Deep learning
    Legal information
    Date: 2023
    Upload time: 2023-09-01 15:23:33 (UTC+8)
    Abstract: In recent years, alimony cases have made up a growing share of family cases. To support later research in legal informatics, the social sciences, and related fields, and to make it easier for general readers to read judgment documents and extract their key points, we chose judgment documents whose cause of action is the payment of alimony as our corpus and applied text mining and analysis to them.

    Besides the basic information of a case, namely the claims raised by the applicant and the rebuttals or claims raised by the respondent, a judgment document also contains the court's decision and the judge's explanation of how that decision was reached. This research therefore uses machine learning techniques to identify four kinds of content in the text of a judgment document: the "applicant's claims", the "respondent's claims", the court's view of the case at hand ("case subsumption"), and the statutes cited together with their explanation ("legal opinions").

    To find these four categories, we treat the task as a classification problem, using paragraphs and sentences as the units to be classified. We try traditional machine learning models, the popular deep learning model BERT, and architectures that use BERT as an embedding layer connected to other deep learning models. With deep learning models, the four-category classification reaches an F1 score of 0.816.

    Beyond these four categories, a judgment document records another important kind of information: the "issues", that is, the claims raised by the two parties to the lawsuit or the points of law they dispute. After building the four-category classifier, this research goes on to identify the issues, treating the task either as a sentence-level five-category classification problem or as a sentence generation problem.

    Finally, the four-category classifier and the issue-sentence classifier are applied to predicting whether a judgment grants or denies the petition. We compare a prediction model that uses only the information stated by the parties with one that is also given the issues. The former reaches an accuracy of only 0.623, equivalent to random guessing, while the latter reaches 0.877.
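The two metrics quoted in the abstract, an F1 score for the four-category task and accuracy for judgment prediction, can be illustrated with a minimal sketch. The abstract does not say how the F1 score is averaged across categories, so the macro average used here is an assumption. The label names and the toy data are hypothetical and are not taken from the thesis.

```python
from collections import Counter

# Hypothetical label names for the four categories named in the abstract.
LABELS = ["applicant_claim", "respondent_claim", "legal_opinion", "case_subsumption"]


def per_class_f1(y_true, y_pred, labels):
    """Return a dict mapping each label to its F1 score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = per_class_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy data for illustration only: eight sentences, two per category.
A, R, L, C = LABELS
toy_true = [A, A, R, R, L, L, C, C]
toy_pred = [A, R, R, R, L, C, C, C]

if __name__ == "__main__":
    print(per_class_f1(toy_true, toy_pred, LABELS))
    print(round(macro_f1(toy_true, toy_pred, LABELS), 4))
    print(accuracy(toy_true, toy_pred))
```

If the thesis reports a micro or weighted average instead, only `macro_f1` would need to change; the per-class counts stay the same.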
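    Description: Master's thesis
    National Chengchi University
    Department of Computer Science
    109753156
    Source: http://thesis.lib.nccu.edu.tw/record/#G0109753156
    Type: thesis
    Appears in collections: [Department of Computer Science] Theses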

    Files in this item:

    File          Description   Size     Format      Views
    315601.pdf                  4265Kb   Adobe PDF   20


    All items in NCCUR are protected by the original copyright.

