National Chengchi University Institutional Repository (NCCUR): Item 140.119/136959
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/136959


    Title: 無人機環境感知與階層式手勢控制於人機協作任務應用
    UAV Environment Perception and Hierarchical Gesture Control in Human-Robot Collaboration Applications
    Authors: 邱庭毅
    Chiu, Ting-Yi
    Contributors: 劉吉軒
    Liu, Jyi-Shane
    邱庭毅
    Chiu, Ting-Yi
    Keywords: UAV
    Gesture Recognition
    Human-Robot Collaboration
    v-SLAM (vision-based simultaneous localization and mapping)
    Instance Segmentation
    Date: 2021
    Issue Date: 2021-09-02 16:52:03 (UTC+8)
    Abstract: The application of UAVs has gradually expanded from early military missions to today's civilian services. Because UAVs are easy to deploy, inexpensive to maintain, and highly maneuverable, they are popular across many fields. However, physical joystick control is not operator-friendly: it is a human-computer interaction method with a high learning threshold that requires professional training to master. In addition, automated UAV control is difficult to apply effectively to real-world tasks, mainly because real environments are usually unstructured and may contain exceptional situations that are undefined, or cannot be accurately defined, for an automated controller.
    To establish a natural and intuitive way of controlling a UAV, this study proposes a human-robot collaboration method that combines UAV environment perception with hierarchical gesture control, using a hierarchical framework in which gestures regulate semi-automated flight procedures. For gesture recognition, hand tracking and localization are based on MediaPipe, and the open/closed state and pointing direction of the fingers are computed from geometric vectors. For environment perception, ORB-SLAM2 visual SLAM is combined with Detectron2 instance segmentation, so that specific targets defined by a custom dataset can be perceived and the volume and coordinates of 3D objects can be estimated from instance-segmented images. Finally, an analysis of experimental data from several participants confirms that the proposed control method outperforms physical joystick control: tasks are completed faster and more efficiently, and the view of the target during surround inspection flight is more stable.
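    The gesture-recognition component described above derives finger open/closed states and a pointing direction from geometric vectors over hand landmarks. The sketch below illustrates that idea, assuming MediaPipe Hands' standard 21-landmark layout; the 160° straightness threshold and the wrist-to-palm direction vector are illustrative choices, not the exact rules used in the thesis.

```python
# Sketch: finger open/closed states and pointing direction from 21 hand
# landmarks (MediaPipe Hands layout). Thresholds are illustrative only.
import numpy as np

# MediaPipe landmark indices per finger: (MCP, PIP, TIP); the thumb uses (MCP, IP, TIP).
FINGERS = {
    "thumb":  (2, 3, 4),
    "index":  (5, 6, 8),
    "middle": (9, 10, 12),
    "ring":   (13, 14, 16),
    "pinky":  (17, 18, 20),
}
WRIST, MIDDLE_MCP = 0, 9

def angle_deg(a, b, c):
    """Angle at point b formed by a-b-c, in degrees."""
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def finger_states(landmarks, open_thresh=160.0):
    """landmarks: (21, 3) array of (x, y, z). A finger counts as 'open' when nearly straight."""
    pts = np.asarray(landmarks, dtype=float)
    return {name: angle_deg(pts[i], pts[j], pts[k]) > open_thresh
            for name, (i, j, k) in FINGERS.items()}

def pointing_direction(landmarks):
    """Unit vector from the wrist toward the middle-finger MCP (rough palm heading)."""
    pts = np.asarray(landmarks, dtype=float)
    v = pts[MIDDLE_MCP] - pts[WRIST]
    return v / (np.linalg.norm(v) + 1e-9)
```

    The perception component combines ORB-SLAM2 map points with Detectron2 instance masks to estimate a target's coordinates and volume. The sketch below assumes a boolean instance mask and the map points' image projections are already available; the axis-aligned bounding-box volume is a coarse simplification of the estimation procedure described in the thesis, and in practice the point cluster would typically be filtered for outliers (for example, with a density-based clustering step) first.

```python
# Sketch: group SLAM map points whose projections fall inside an instance mask,
# then approximate the object's position (centroid) and volume (bounding box).
import numpy as np

def object_from_mask(mask, map_points_3d, map_points_px):
    """
    mask:           (H, W) boolean instance mask from the segmentation network.
    map_points_3d:  (N, 3) SLAM map points in world coordinates.
    map_points_px:  (N, 2) their (u, v) projections into the same camera image.
    Returns (centroid_xyz, volume) for the points lying on the masked object.
    """
    h, w = mask.shape
    u = np.round(map_points_px[:, 0]).astype(int)
    v = np.round(map_points_px[:, 1]).astype(int)
    in_img = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    on_object = np.zeros(len(u), dtype=bool)
    on_object[in_img] = mask[v[in_img], u[in_img]]
    pts = map_points_3d[on_object]
    if len(pts) == 0:
        return None, 0.0
    centroid = pts.mean(axis=0)                      # estimated object coordinates
    extent = pts.max(axis=0) - pts.min(axis=0)       # per-axis spread of the cluster
    return centroid, float(np.prod(extent))          # bounding-box volume
```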
    Description: Master's thesis
    National Chengchi University
    Department of Computer Science
    106753010
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0106753010
    Data Type: thesis
    DOI: 10.6814/NCCU202101454
    Appears in Collections: [Department of Computer Science] Theses

    Files in This Item:

    File: 301001.pdf
    Size: 28334 KB
    Format: Adobe PDF

