National Chengchi University Institutional Repository (NCCUR): Item 140.119/71721
    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/71721


    Title: Multilingual Scene Text Detection (多語言的場景文字偵測)
    Author: Liang, Yi Hsuan (梁苡萱)
    Contributors: Liao, Wen Hung (廖文宏)
    Liang, Yi Hsuan (梁苡萱)
    Keywords: Scene text detection
    Bilateral filter
    Maximally Stable Extremal Region (MSER)
    Date: 2014
    Upload time: 2014-12-01 14:19:48 (UTC+8)
    Abstract: Text in an image usually carries important information about the scene, such as locations, names, directions, and warnings. Robust and efficient scene text detection has therefore attracted increasing attention in computer vision. However, most existing scene text detection methods target English and other Latin-script languages; studies on Chinese text are few, and their reported detection rates fall well below those for English.
    In this thesis, we propose a multilingual scene text detection algorithm for both Chinese and English. The method comprises four stages: 1. Preprocessing: a bilateral filter is applied to make text regions more stable. 2. Candidate extraction: the Canny edge detector and Maximally Stable Extremal Regions (MSER) extract text edges and text regions, respectively, and the two features are combined to refine the extracted candidates. 3. Character linking: taking the structure of Chinese characters and both horizontal and vertical writing directions into account, character candidates are grouped into candidate text strings using geometric constraints. 4. Candidate classification: a support vector machine (SVM) using features of text in images separates text strings from non-text strings. Experimental results show that the proposed method detects both Chinese and English text and achieves satisfactory performance compared with approaches designed only for English.
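    The character-linking stage described in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's actual geometric conditions, so everything below is an assumption made for illustration: the `Box` type, the `link_characters` function, and the three thresholds (`max_gap`, `max_size_ratio`, `max_offset`) are hypothetical. The sketch links two character boxes when they have similar size, lie close together along the writing direction, and are roughly aligned across it, then groups linked boxes with a small union-find.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one character candidate (hypothetical type)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width, assumed > 0
    h: float  # height, assumed > 0


def _linked(a, b, horizontal, max_gap=1.0, max_size_ratio=2.0, max_offset=0.5):
    """Return True if boxes a and b look like neighbours along one writing direction.

    All three thresholds are illustrative guesses, not values from the thesis.
    """
    if horizontal:
        size_a, size_b = a.h, b.h
        # Gap along the x axis; negative when the boxes overlap horizontally.
        gap = max(a.x, b.x) - min(a.x + a.w, b.x + b.w)
        # Distance between vertical centres (misalignment across the line).
        offset = abs((a.y + a.h / 2) - (b.y + b.h / 2))
    else:
        size_a, size_b = a.w, b.w
        # Gap along the y axis; negative when the boxes overlap vertically.
        gap = max(a.y, b.y) - min(a.y + a.h, b.y + b.h)
        # Distance between horizontal centres (misalignment across the column).
        offset = abs((a.x + a.w / 2) - (b.x + b.w / 2))
    ref = max(size_a, size_b)
    similar = ref <= max_size_ratio * min(size_a, size_b)
    return similar and gap <= max_gap * ref and offset <= max_offset * ref


def link_characters(boxes, horizontal=True, **thresholds):
    """Group character boxes into candidate text strings of two or more boxes."""
    parent = list(range(len(boxes)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of boxes that satisfies the geometric test.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if _linked(boxes[i], boxes[j], horizontal, **thresholds):
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)

    # Order each string along its writing direction and drop isolated boxes.
    def position(box):
        return box.x if horizontal else box.y

    return [sorted(g, key=position) for g in groups.values() if len(g) >= 2]
```

    Calling `link_characters(boxes, horizontal=True)` and `link_characters(boxes, horizontal=False)` on the same candidates yields horizontal and vertical string candidates separately, which a later classification stage could then accept or reject. Running both directions independently is a design choice of this sketch, not a detail confirmed by the abstract.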
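    Description: Master's
    National Chengchi University
    Department of Computer Science
    101753021
    103
    Source: http://thesis.lib.nccu.edu.tw/record/#G0101753021
    Type: thesis
    Appears in collections: [Department of Computer Science] Theses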

    Files in this item:

    File          Size      Format       Views
    302101.pdf    4005Kb    Adobe PDF    21593


    All items in NCCUR are protected by the original copyright.

