[1] ZHANG Fuqiang, BAI Junyan, MU Hui. Human-machine Interaction Oriented Gesture Recognition Method Based on Improved GAN[J]. Journal of Zhengzhou University (Engineering Science), 2025, 46(02): 43-50. [doi:10.13705/j.issn.1671-6833.2025.02.012]

Human-machine Interaction Oriented Gesture Recognition Method Based on Improved GAN

Journal of Zhengzhou University (Engineering Science) [ISSN: 1671-6833 / CN: 41-1339/T]

Volume:
46
Issue:
2025, No. 02
Pages:
43-50
Column:
Publication date:
2025-03-10

Article Info

Title:
Human-machine Interaction Oriented Gesture Recognition Method Based on Improved GAN
Article ID:
1671-6833(2025)02-0043-08
Author(s):
ZHANG Fuqiang (1, 2), BAI Junyan (1, 2), MU Hui (3)
1. Key Laboratory of Road Construction Technology and Equipment of Ministry of Education, Chang’an University, Xi’an 710064, China; 2. Institute of Smart Manufacturing Systems Engineering, Chang’an University, Xi’an 710064, China; 3. School of Mechanical Manufacturing, Jinan Vocational College, Jinan 250002, China
Keywords:
human-machine interaction; generative adversarial networks; variational autoencoder; gesture recognition; conditional batch normalization
CLC number:
TP391.41
DOI:
10.13705/j.issn.1671-6833.2025.02.012
Document code:
A
Abstract:
Existing gesture recognition algorithms require large amounts of training data and suffer from low recognition accuracy and a complex recognition process. To address these problems, a gesture recognition method for human-machine interaction based on an improved GAN model is proposed, which combines a generative adversarial network (GAN) with a variational autoencoder and introduces label information. First, improved InceptionV2 and InceptionV2-trans structures are added to the encoder and decoder, respectively, to strengthen the feature restoration ability of the model. Second, conditional batch normalization (CBN) is applied in each component network to mitigate overfitting, and the Mish activation function replaces ReLU to improve network performance. Finally, experiments show that the proposed method achieves 100% classification accuracy with fewer samples and a short convergence time, which verifies its reliability.
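The two building blocks named in the abstract, the Mish activation and conditional batch normalization (CBN), can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not give the network details, so the sketch assumes the commonly used definitions, Mish(x) = x · tanh(softplus(x)) and a CBN whose scale and shift are looked up by class label. The function names and the `gamma`/`beta` lookup tables are hypothetical and chosen for illustration only.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable softplus: log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    # Mish(x) = x * tanh(softplus(x)); smooth, and non-zero for negative inputs.
    return x * math.tanh(softplus(x))


def relu(x: float) -> float:
    # ReLU clips every negative input to zero.
    return max(x, 0.0)


def conditional_batch_norm(batch, labels, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift per label.

    batch:  list of feature vectors (lists of floats), one per sample
    labels: list of integer class labels, one per sample
    gamma, beta: per-class lists of per-feature scale and shift values
                 (hypothetical lookup tables; assumed to be learned in practice)
    """
    n = len(batch)
    dim = len(batch[0])
    # Batch statistics are shared by all classes.
    mean = [sum(row[j] for row in batch) / n for j in range(dim)]
    var = [sum((row[j] - mean[j]) ** 2 for row in batch) / n for j in range(dim)]
    out = []
    for row, y in zip(batch, labels):
        # The label y selects which scale/shift pair is applied to this sample.
        out.append([
            gamma[y][j] * (row[j] - mean[j]) / math.sqrt(var[j] + eps) + beta[y][j]
            for j in range(dim)
        ])
    return out
```

For a negative input such as -1.0, `relu` returns zero while `mish` returns a small negative value, so some signal still passes through for negative pre-activations. In `conditional_batch_norm`, two samples with identical normalized features but different labels receive different outputs, which is how label information enters the normalization step.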



Last Update: 2025-03-13