A Research on Deepfake Face Detection Techniques Based on Multimodal Biometric Cross-Verification

Open Access

Issue		ITM Web Conf. Volume 78, 2025 International Conference on Computer Science and Electronic Information Technology (CSEIT 2025)


Article Number		02017
Number of page(s)		12
Section		Machine Learning Applications in Vision, Security, and Healthcare
DOI		https://doi.org/10.1051/itmconf/20257802017
Published online		08 September 2025

