Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/146900
Title: | Image Analysis and the Detection of Deepfake Videos / A Statistical Approach of Deepfake Video Detection
Authors: | Chen, Huei-Shuang (陳慧霜)
Contributors: | Chen, Yi-Ju (陳怡如); Yue, Jack C. (余清祥); Chen, Huei-Shuang (陳慧霜)
Keywords: | Deepfake Videos; Image Recognition; Facial Texture; First-order Differentiation; Background Analysis
Date: | 2023 |
Issue Date: | 2023-09-01 14:56:22 (UTC+8) |
Abstract: | The development of big data brings convenience, but its impact is double-edged: while it improves the convenience and health of people's lives, it can also harm the public. Deepfake videos are a prominent example. By exploiting deep learning techniques to fabricate photos and videos, they undermine the long-standing beliefs that "seeing is believing" and "a picture is worth a thousand words," making it increasingly difficult to judge the authenticity of information. Most current deepfake-detection research focuses on deep learning models, evaluating them chiefly by classification accuracy while paying little attention to variable interpretability or methodological meaning, in contrast to conventional statistical thinking. This study therefore proposes a video-detection method from a statistical perspective. Because forged videos are artificially modified, the resulting images tend to be overly smooth; following Xia et al. (2022), we apply first-order differentiation as a basis for judging authenticity. Beyond the three primary colors of light (RGB: Red, Green, Blue), we add two common color spaces as variables: HSV (Hue, Saturation, Value) and YCrCb (Luminance, Red-difference chroma, Blue-difference chroma). To gauge the smoothness of these color variables more effectively, we also apply image segmentation and use the Kolmogorov-Smirnov test to set suitable smoothness thresholds, then combine statistical and machine learning models to classify videos as real or fake. Using the Celeb-DF dataset, a common benchmark for deepfake videos, we evaluate the proposed approach by cross-validation. We find that adding HSV and YCrCb improves binary classification accuracy, exceeding 90% with 16×16 image segmentation (finer segmentation is not necessarily better). We also find that background pixels strongly affect classification and lower accuracy; however, measuring the spatial heterogeneity between background and face via Moran's I restores accuracy to the level achieved with facial data alone.
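The feature pipeline summarized in the abstract (first-order differences of RGB/HSV/YCrCb channels, a 16×16 grid split, and a Kolmogorov-Smirnov comparison of difference distributions) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the thesis code: it assumes OpenCV, NumPy, and SciPy are available, and the function names and the block-mean summary statistic are our own illustrative choices.

```python
# Illustrative sketch only (not the thesis code): first-order-difference
# smoothness features over RGB/HSV/YCrCb channels with a grid split.
import cv2                      # assumed dependency: opencv-python
import numpy as np
from scipy.stats import ks_2samp

def color_channels(bgr_img):
    """Stack the nine channels named in the abstract: RGB, HSV, YCrCb."""
    rgb = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2RGB)
    hsv = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2HSV)
    ycrcb = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2YCrCb)
    return np.dstack([rgb, hsv, ycrcb]).astype(np.float32)   # H x W x 9

def block_smoothness(channel, grid=16):
    """Mean absolute first-order difference inside each cell of a grid x grid
    split; smaller values indicate smoother (possibly over-smoothed) texture."""
    h, w = channel.shape
    dx = np.abs(np.diff(channel, axis=1))   # horizontal first-order difference
    dy = np.abs(np.diff(channel, axis=0))   # vertical first-order difference
    feats = np.empty((grid, grid))
    for i in range(grid):
        for j in range(grid):
            r0, r1 = i * h // grid, (i + 1) * h // grid
            c0, c1 = j * w // grid, (j + 1) * w // grid
            cell = np.concatenate([dx[r0:r1, c0:c1].ravel(),
                                   dy[r0:r1, c0:c1].ravel()])
            feats[i, j] = cell.mean()
    return feats                             # grid x grid smoothness map

def ks_distance(real_diffs, candidate_diffs):
    """Two-sample Kolmogorov-Smirnov statistic between two difference
    distributions, the kind of quantity a smoothness threshold rests on."""
    return ks_2samp(real_diffs.ravel(), candidate_diffs.ravel()).statistic
```

The flattened smoothness maps of all nine channels would then serve as features for a statistical or machine learning classifier (e.g., random forest or logistic regression) evaluated by cross-validation, mirroring the procedure described above.

The final step in the abstract quantifies the spatial heterogeneity between face and background with Moran's I. Below is a self-contained sketch using rook (4-neighbour) contiguity with unit weights on the block grid; again, this is an illustrative implementation of the general statistic, not the thesis code.

```python
def morans_i(grid_vals):
    """Moran's I with rook (4-neighbour) contiguity and unit weights:
    I = (n / W) * sum_ij w_ij z_i z_j / sum_i z_i^2, where z_i = x_i - mean(x)
    and W is the total weight (number of ordered neighbour pairs)."""
    z = np.array(grid_vals, dtype=float)     # copy, so the caller is untouched
    z -= z.mean()
    # each unordered neighbour pair contributes twice (w_ij is symmetric)
    num = 2.0 * (z[:, :-1] * z[:, 1:]).sum() + 2.0 * (z[:-1, :] * z[1:, :]).sum()
    W = 2.0 * (z[:, :-1].size + z[:-1, :].size)
    return (z.size / W) * num / (z ** 2).sum()

# Example: heterogeneity of one channel's smoothness map (face + background).
# smooth_map = block_smoothness(color_channels(frame)[:, :, 0])
# print(morans_i(smooth_map))
```

Values near +1 indicate strong spatial clustering (for example, a smooth face region against a textured background), while values near 0 indicate spatial randomness.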
Reference: |
[1] Afchar, D., Nozick, V., Yamagishi, J., & Echizen, I. (2018). "MesoNet: A Compact Facial Video Forgery Detection Network", IEEE International Workshop on Information Forensics and Security (WIFS), Hong Kong, China, 1–7.
[2] Arini, A., Bahaweres, R. B., & Haq, J. A. (2022). "Quick Classification of Xception and Resnet-50 Models on Deepfake Video Using Local Binary Pattern", International Seminar on Machine Learning, Optimization, and Data Science (ISMODE), Jakarta, Indonesia, 254–259.
[3] Ben-Hur, A., Horn, D., Siegelmann, H. T., & Vapnik, V. (2002). "Support Vector Clustering", Journal of Machine Learning Research, 2(2), 125–137.
[4] Breiman, L. (2001). "Random Forests", Machine Learning, 45(1), 5–32.
[5] Chen, T., Chuang, K., Wu, J., Chen, S. C., Hwang, I., & Jan, M. (2003). "A Novel Image Quality Index Using Moran I Statistics", Physics in Medicine and Biology, 48(8), 131–137.
[6] Choi, Y., Choi, M., Kim, M., Ha, J., Kim, S., & Choo, J. (2018). "StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation", 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 8789–8797.
[7] Fernández, A., Del Jesus, M. J., & Herrera, F. (2009). "On the Influence of an Adaptive Inference System in Fuzzy Rule Based Classification Systems for Imbalanced Data-Sets", Expert Systems with Applications, 36(6), 9805–9812.
[8] Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation", 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, 580–587.
[9] Gonzalez, R. C., & Woods, R. E. (2018). Digital Image Processing (4th ed.), New York, NY: Pearson.
[10] Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, New York, NY: Springer.
[11] He, K., Zhang, X., Ren, S., & Sun, J. (2016). "Deep Residual Learning for Image Recognition", 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 770–778.
[12] Hosang, J., Benenson, R., & Schiele, B. (2017). "Learning Non-Maximum Suppression", 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 6469–6477.
[13] Karras, T., Laine, S., & Aila, T. (2019). "A Style-Based Generator Architecture for Generative Adversarial Networks", 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 4396–4405.
[14] Kolkur, S., Kalbande, D., Shimpi, P., Bapat, C. N., & Jatakia, J. (2017). "Human Skin Detection Using RGB, HSV and YCbCr Color Models", Proceedings of the International Conference on Communication and Signal Processing 2016 (ICCASP 2016).
[15] Liu, M., Ding, Y., Xia, M., Liu, X., Ding, E., Zuo, W., & Wen, S. (2019). "STGAN: A Unified Selective Transfer Network for Arbitrary Image Attribute Editing", 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 3668–3677.
[16] Li, Y., Chang, M., & Lyu, S. (2018). "In Ictu Oculi: Exposing AI Created Fake Videos by Detecting Eye Blinking", IEEE International Workshop on Information Forensics and Security (WIFS), Hong Kong, China, 1–7.
[17] Li, Y., & Lyu, S. (2019). "Exposing DeepFake Videos by Detecting Face Warping Artifacts", CVPR Workshops, 46–52.
[18] Li, Y., Yang, X., Sun, P., Qi, H., & Lyu, S. (2020). "Celeb-DF: A Large-Scale Challenging Dataset for DeepFake Forensics", 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 3204–3213.
[19] Lugstein, F., Baier, S., Bachinger, G., & Uhl, A. (2021). "PRNU-Based Deepfake Detection", Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security (IH&MMSec '21), Association for Computing Machinery, New York, NY, USA, 7–12.
[20] Matern, F., Riess, C., & Stamminger, M. (2019). "Exploiting Visual Artifacts to Expose Deepfakes and Face Manipulations", 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW), Waikoloa, HI, USA, 83–92.
[21] McCloskey, S., & Albright, M. (2018). "Detecting GAN-Generated Imagery Using Color Cues", arXiv preprint arXiv:1812.08247.
[22] Moran, P. A. P. (1950). "A Test for the Serial Independence of Residuals", Biometrika, 37, 178–181.
[23] OValery. (2017). Swap-Face. https://github.com/OValery16/swap-face
[24] Qi, H., Guo, Q., Xu, F. J., Xie, X., Ma, L., Feng, W., Liu, Y., & Zhao, J. (2020). "DeepRhythm: Exposing Deepfakes with Attentional Visual Heartbeat Rhythms", Proceedings of the 28th ACM International Conference on Multimedia (MM '20), Association for Computing Machinery, New York, NY, USA, 4318–4327.
[25] Robertson, S. E., & Jones, K. S. (1976). "Relevance Weighting of Search Terms", Journal of the American Society for Information Science, 27(3), 129–146.
[26] Rössler, A., Cozzolino, D., Verdoliva, L., Riess, C., Thies, J., & Nießner, M. (2019). "FaceForensics++: Learning to Detect Manipulated Facial Images", 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea (South), 1–11.
[27] Tolosana, R., Romero-Tapiador, S., Fierrez, J., & Vera-Rodriguez, R. (2020). "DeepFakes Evolution: Analysis of Facial Regions and Fake Detection Performance", ICPR Workshops, 442–456.
[28] Tolosana, R., Vera-Rodriguez, R., Fierrez, J., Morales, A., & Ortega-Garcia, J. (2020). "Deepfakes and Beyond: A Survey of Face Manipulation and Fake Detection", Information Fusion, 64, 131–148.
[29] Wang, T., Liu, M., Cao, W., & Chow, K. (2022). "Deepfake Noise Investigation and Detection", Forensic Science International: Digital Investigation, 42, 301395.
[30] Xia, Z., Qiao, T., Xu, M., Zheng, N., & Xie, S. (2022). "Towards DeepFake Video Forensics Based on Facial Textural Disparities in Multi-Color Channels", Information Sciences, 607, 654–669.
[31] Yang, X., Li, Y., & Lyu, S. (2019). "Exposing Deep Fakes Using Inconsistent Head Poses", 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 8261–8265.
[32] Yap, B. W., & Sim, C. H. (2011). "Comparisons of Various Types of Normality Tests", Journal of Statistical Computation and Simulation, 81(12), 2141–2155.
[33] Zhang, K., Zhang, Z., Li, Z., & Qiao, Y. (2016). "Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks", IEEE Signal Processing Letters, 23(10), 1499–1503.
[34] Zhang, K., Zuo, W., Chen, Y., Meng, D., & Zhang, L. (2017). "Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising", IEEE Transactions on Image Processing, 26(7), 3142–3155.
[35] Zhu, J., Park, T., Isola, P., & Efros, A. A. (2017). "Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks", 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2242–2251.
Description: | Master's thesis, Department of Statistics, National Chengchi University (國立政治大學), 110354002
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0110354002 |
Data Type: | thesis |
Appears in Collections: | [Department of Statistics (統計學系)] Theses
Files in This Item:
File | Description | Size | Format
400201.pdf | | 6892 KB | Adobe PDF