National Chengchi University Institutional Repository (NCCUR): Item 140.119/32707
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/32707


    Title: 多人虛擬環境中互動式語音界面的實現
    Realizing the Interactive Speech Interface in a Multi-user Virtual Environment
    Authors: 廖峻鋒
    Liao, Chun-Feng
    Contributors: 李蔡彥
    Li, Tsai-Yen
    廖峻鋒
    Liao, Chun-Feng
    Keywords: Virtual Environment
    VoiceXML
    Dialog Management
    Speech
    Date: 2003
    Issue Date: 2009-09-17 14:06:12 (UTC+8)
    Abstract: Applications of 3D virtual environments and voice user interfaces (VUIs) on personal computers have received significant attention in recent years. Because speech is the most natural form of human communication, incorporating a VUI into a virtual environment can make interaction between characters more fluid and enhance immersion. Although many studies have addressed the integration of VUIs with 3D virtual environments, most of the proposed solutions do not provide an effective mechanism for dialog management in a multi-user setting. The objective of this research is to provide a solution for VUI integration and dialog management, and to realize a speech interaction mechanism in a multi-user virtual environment. To address the synchronization of speech and animation, dialog management, and speech processing in multi-user environments, we designed a dialog scripting language called XAML-V (eXtensible Animation Markup Language – Voice Extension), based on the VoiceXML standard. We implemented the language and deployed it in a multi-user virtual environment system to evaluate the feasibility and effectiveness of the design.
    Description: Master's
    National Chengchi University
    Department of Computer Science
    91753004
    92
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0917530041
    Data Type: thesis
    Appears in Collections: [Department of Computer Science] Theses

    Files in This Item:

    File            Size      Format
    53004101.pdf    79Kb      Adobe PDF
    53004102.pdf    105Kb     Adobe PDF
    53004103.pdf    107Kb     Adobe PDF
    53004104.pdf    153Kb     Adobe PDF
    53004105.pdf    174Kb     Adobe PDF
    53004106.pdf    815Kb     Adobe PDF
    53004107.pdf    525Kb     Adobe PDF
    53004108.pdf    652Kb     Adobe PDF
    53004109.pdf    1940Kb    Adobe PDF
    53004110.pdf    133Kb     Adobe PDF
    53004111.pdf    144Kb     Adobe PDF
    53004112.pdf    93Kb      Adobe PDF


    All items in the NCCU Institutional Repository are protected by copyright, with all rights reserved.



    Copyright Announcement
    1. The digital content of this website is part of the National Chengchi University Institutional Repository. It is provided free of charge for non-commercial, public-interest uses such as academic research and public education. Please use it in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

    2. Every effort has been made in building this website to avoid infringing the rights of copyright owners. If you believe that any material on the website infringes copyright, please contact our staff (nccur@nccu.edu.tw). We will remove the work from the repository and investigate your claim.