Title: | Automatic Classification of Chinese Unknown Verbs (中文動詞自動分類研究) |
Authors: | 曾慧馨 (Tseng, Hui-Hsin) |
Contributors: | 高照明 (Gao, Zhao-Ming); 劉昭麟 (Liu, Chao-Lin); 曾慧馨 (Tseng, Hui-Hsin) |
Keywords: | unknown words; lexical similarity measurement; verbs |
Date: | 2001 |
Issue Date: | 2016-04-15 15:58:06 (UTC+8) |
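Abstract: | This thesis presents two methods, one rule-based and one similarity-based (instance-based), for automatically classifying Chinese unknown verbs into the verb categories defined by the CKIP group at Academia Sinica (中研院詞庫小組, 1993). The rule-based method uses rules learned from a training corpus, supplemented with reduplication patterns of unknown verbs. It covers about 25% of unknown verbs with an accuracy of 86.86% to 91.32%; its strength is high accuracy, and its weakness is that it handles too few unknown verbs. The similarity-based method guesses the likely category of an unknown verb from similar examples, computing similarity from word-internal information: the part of speech and semantic class of the word's bases, and the word's morphological structure. It can handle all unknown verbs, but it is easily misled by mistagged examples in the training corpus and is sensitive to the size of that corpus. Its accuracy is 67.86% to 70.92%. Combining the two methods yields a classification accuracy of about 72%. |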
Description: | Master's thesis; National Chengchi University; Graduate Institute of Linguistics; 88555011 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#A2002001006 |
Data Type: | thesis |
Appears in Collections: | [Graduate Institute of Linguistics] Theses