CHEN Xiaopeng, XU Peng, WANG Zhantao, et al. Fast 3D Aiming Method of Manipulators Based on Monocular Visual Feedforward[J]. Journal of Zhengzhou University (Engineering Science), 2025, 46(02): 11-18. [doi:10.13705/j.issn.1671-6833.2025.02.001]

Fast 3D Aiming Method of Manipulators Based on Monocular Visual Feedforward

Journal of Zhengzhou University (Engineering Science) [ISSN: 1671-6833 / CN: 41-1339/T]

Volume:
46
Issue:
2025(02)
Pages:
11-18
Publication Date:
2025-03-10

Article Info

Title:
Fast 3D Aiming Method of Manipulators Based on Monocular Visual Feedforward
Article ID:
1671-6833(2025)02-0011-08
Author(s):
CHEN Xiaopeng1, XU Peng1, WANG Zhantao2, XING Cheng1, QIU Yuhan1
1. School of Mechatronical Engineering, Beijing Institute of Technology, Beijing 100081, China; 2. Beijing Aerospace Institute for Metrology and Measurement Technology, Beijing 100076, China
Keywords:
monocular vision; visual feedforward; autonomous aiming; deep learning; manipulator operation
CLC Number:
TP242.6; TP249
DOI:
10.13705/j.issn.1671-6833.2025.02.001
Document Code:
A
Abstract:
To address the lack of three-dimensional depth information and the slow convergence of monocular vision-based manipulator aiming, a fast aiming method based on monocular visual feedforward was proposed for a six-degree-of-freedom (6-DOF) robotic arm. First, the CSPDarknet backbone of YOLOv4 was pruned by reducing the number of Conv2d_BN_Mish units, simplifying the feature-extraction network and yielding a Lite YOLOv4 with faster object detection. Second, the pixel coordinates of the target detected in the monocular image were back-projected into normalized coordinates, giving the expected 3D aiming-ray equation for the laser sight relative to the camera center. Third, a multi-point calibration method was proposed to obtain the extrinsic parameters relating the monocular camera, the aiming laser, and the manipulator coordinate frames, and the expected joint poses of the arm were then derived from these extrinsic parameters together with the expected 3D ray equation. Finally, direct position control of the manipulator was used in place of image-based visual servoing to accelerate convergence, realizing fast 3D aiming of the manipulator based on monocular visual feedforward. Experiments showed an average monocular aiming response time of only 0.611 s and an aiming success rate of 95.238%, which is 4.762 percentage points higher than that of traditional image-based visual servoing.
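The two geometric steps summarized in the abstract, back-projecting a detected pixel into a 3D aiming ray and relating the camera and arm frames through common calibration points, can be illustrated with a short sketch. The Python code below is not the authors' implementation: the intrinsic matrix K, the extrinsics R_bc and t_bc, and all function names are hypothetical placeholders, and fit_rigid_transform uses the classic SVD (Kabsch) solution as a stand-in for the paper's multi-point calibration method.

```python
import numpy as np

# --- Hypothetical calibration values (placeholders, not from the paper) ---
K = np.array([[800.0,   0.0, 320.0],      # pinhole intrinsics: fx, fy (px),
              [  0.0, 800.0, 240.0],      # principal point (cx, cy)
              [  0.0,   0.0,   1.0]])
R_bc = np.eye(3)                          # rotation: camera frame -> arm base frame
t_bc = np.array([0.10, 0.00, 0.50])       # camera origin in the base frame (m)

def pixel_to_ray(u, v, K):
    """Back-project pixel (u, v) to a unit-norm viewing ray in the camera frame."""
    d = np.linalg.inv(K) @ np.array([u, v, 1.0])   # normalized image coordinates
    return d / np.linalg.norm(d)

def aiming_ray_in_base(u, v):
    """Expected aiming ray (origin, direction) expressed in the arm base frame."""
    return t_bc, R_bc @ pixel_to_ray(u, v, K)

def fit_rigid_transform(P, Q):
    """Least-squares rigid transform (R, t) such that Q ~ R @ P + t.

    P, Q: (N, 3) arrays of corresponding 3D points (N >= 3, non-degenerate),
    e.g. common points measured in the camera frame and in the arm base frame.
    Classic Kabsch/SVD solution, standing in for the paper's multi-point method.
    """
    p0, q0 = P.mean(axis=0), Q.mean(axis=0)
    H = (P - p0).T @ (Q - q0)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # fix reflection
    R = Vt.T @ D @ U.T
    return R, q0 - R @ p0

if __name__ == "__main__":
    origin, direction = aiming_ray_in_base(350.0, 260.0)
    print("aiming ray origin:", origin, "direction:", direction)
```

Once the extrinsics are known, the expected ray fully determines a target pose for the laser sight, so the arm can be commanded to that pose directly rather than iterating an image-based servo loop; this feedforward step is the source of the speed-up reported in the abstract.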

References:

[1]MATHESON E, MINTO R, ZAMPIERI E G G, et al. Human-robot collaboration in manufacturing applications: a review[J]. Robotics, 2019, 8(4): 100. 

[2]WANG D H, PING X L. Six-degree-of-freedom manipulator visual servo control based on prescribed performance[J]. Transducer and Microsystem Technologies, 2022, 41(3): 104-108.
[3]LING X, ZHAO Y S, GONG L, et al. Dual-arm cooperation and implementing for robotic harvesting tomato using binocular vision[J]. Robotics and Autonomous Systems, 2019, 114: 134-143. 
[4]TAN M X, PANG R M, LE Q V. EfficientDet: scalable and efficient object detection[C]∥2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2020: 10778-10787. 
[5]ZHU L, WANG X Q, LI P, et al. S3Net: self-supervised self-ensembling network for semi-supervised RGB-D salient object detection[J]. IEEE Transactions on Multimedia, 2023, 25: 676-689.
[6]ZHANG Z, WANG X J, JIN Z H, et al. Traffic sign detection based on lightweight YOLOv5[J]. Journal of Zhengzhou University (Engineering Science), 2024, 45(2): 12-19.
[7]ZHANG Z, CHEN K X, CHEN Y F. YOLOv5 with optimized clustering and CBAM for controlled knife detection[J]. Journal of Zhengzhou University (Engineering Science), 2023, 44(5): 40-45, 61.
[8]LI J J, JI W, BI Q, et al. Joint semantic mining for weakly supervised RGB-D salient object detection[C]∥ Proceedings of the 35th International Conference on Neural Information Processing Systems. New York: ACM, 2021: 11945-11959. 
[9]CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[C]∥European Conference on Computer Vision. Cham: Springer, 2020: 213-229.
[10] MENG D P, CHEN X K, FAN Z J, et al. Conditional DETR for fast training convergence[C]∥2021 IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway: IEEE, 2021: 3631-3640. 
[11] DONG J X, ZHANG J. A new image-based visual servoing method with velocity direction control[J]. Journal of the Franklin Institute, 2020, 357(7): 3993-4007. 
[12] RANGANATHAN G. An economical robotic arm for playing chess using visual servoing[J]. Journal of Innovative Image Processing, 2020, 2(3): 141-146. 
[13] SHIRLEY D R A, RANJANI K, ARUNACHALAM G, et al. Automatic distributed gardening system using object recognition and visual servoing[C]∥Inventive Communication and Computational Technologies. Berlin: Springer, 2021: 359-369. 
[14] PARADIS S, HWANG M, THANANJEYAN B, et al. Intermittent visual servoing: efficiently learning policies robust to instrument changes for high-precision surgical manipulation[C]∥2021 IEEE International Conference on Robotics and Automation (ICRA). Piscataway: IEEE, 2021: 7166-7173. 
[15] AL-SHANOON A, LANG H X. Robotic manipulation based on 3-D visual servoing and deep neural networks [J]. Robotics and Autonomous Systems, 2022, 152: 104041. 
[16] AL-SHANOON A, WANG Y J, LANG H X. DeepNet-based 3D visual servoing robotic manipulation[J]. Journal of Sensors, 2022, 2022: 3511265.
[17] LEE Y S, VUONG N, ADRIAN N, et al. Integrating force-based manipulation primitives with deep learning-based visual servoing for robotic assembly[C]∥ICRA 2022 Workshop: Reinforcement Learning for Contact-Rich Manipulation. Piscataway: IEEE, 2022.
[18] ZHONG X G, SHI C Q, LIN J, et al. Self-learning visual servoing for robot manipulation in unstructured environments[C]∥International Conference on Intelligent Robotics and Applications. Cham: Springer, 2021: 48-57. 
[19] PUANG E Y, TEE K P, JING W. KOVIS: keypoint-based visual servoing with zero-shot sim-to-real transfer for robotics manipulation[C]∥2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Piscataway: IEEE, 2020: 7527-7533.
[20] RIBEIRO E G, DE QUEIROZ MENDES R, GRASSI V J. Real-time deep learning approach to visual servo control and grasp detection for autonomous robotic manipulation[J]. Robotics and Autonomous Systems, 2021, 139: 103757. 
[21] DANIILIDIS K. Hand-eye calibration using dual quaternions[J]. The International Journal of Robotics Research, 1999, 18(3): 286-298. 
[22] ZHANG Z. A flexible new technique for camera calibration[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(11): 1330-1334. 
[23] SUTHERLAND I E. Three-dimensional data input by tablet[J]. Proceedings of the IEEE, 1974, 62(4): 453-461. 
[24] BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. (2020-05-23)[2024-08-15]. https://doi.org/10.48550/arXiv.2004.10934.
[25] HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916. 
[26] LIU S, QI L, QIN H F, et al. Path aggregation network for instance segmentation[C]∥2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 8759-8768. 
[27] YUEN H, PRINCEN J, ILLINGWORTH J, et al. Comparative study of Hough Transform methods for circle finding[J]. Image and Vision Computing, 1990, 8(1): 71-77.

Last Update: 2025-03-13