# Anti-occlusion Target Tracking Algorithm Based on Image Depth

(School of Information Engineering, Engineering University of People's Armed Police, Xi'an 710086, Shaanxi, China)

## 1 Anti-occlusion Target Tracking Algorithm Based on Image Depth

### 1.1 Target Tracking Algorithm Based on Siamese Networks

f(x, z) = φ(x) ⊗ φ(z); (1)

p = argmax[φ(x) ⊗ φ(z)], (2)

where z is the template image, x is the search region, φ(·) is the shared feature-extraction backbone, ⊗ denotes cross-correlation, and p is the position of the maximum response.
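Equations (1) and (2) can be sketched numerically: sliding the template feature map over the search feature map and taking the inner product at every offset yields the response map, whose peak locates the target. This is a minimal NumPy illustration of the cross-correlation idea, not the paper's network implementation.

```python
import numpy as np

def cross_correlate(search_feat, template_feat):
    """Slide the template feature map over the search feature map,
    recording the inner-product similarity at every offset (Eq. 1)."""
    H, W = search_feat.shape
    h, w = template_feat.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(search_feat[i:i+h, j:j+w] * template_feat)
    return out

# Toy features: the template pattern is embedded at offset (2, 3).
search = np.zeros((8, 8))
template = np.arange(9, dtype=float).reshape(3, 3)
search[2:5, 3:6] = template

response = cross_correlate(search, template)
# Eq. (2): the argmax of the response map gives the target position.
p = tuple(int(v) for v in np.unravel_index(np.argmax(response), response.shape))
print(p)  # (2, 3)
```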

SiamRPN recasts the similarity computation of conventional target tracking as a joint regression and classification problem. Its RPN module can be viewed as a fully convolutional network whose purpose is to propose candidate regions. The core of the RPN is the anchor mechanism, which introduces multi-scale handling by predefining anchor sizes and aspect ratios; SiamRPN uses five aspect ratios: 0.33, 0.5, 1, 2, and 3. The original anchors are then refined by translation and scaling so that they fit the true target window more closely.
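The anchor mechanism described above can be sketched as follows. This is an illustrative reconstruction, not SiamRPN's exact code: `make_anchors` produces equal-area boxes at the five aspect ratios, and `refine` applies the translation/scaling corrections predicted by a hypothetical regression branch (`dx, dy, dw, dh`).

```python
import numpy as np

# The five SiamRPN anchor aspect ratios (width / height).
RATIOS = [0.33, 0.5, 1.0, 2.0, 3.0]

def make_anchors(base_size):
    """Generate (w, h) pairs of equal area base_size**2, one per ratio."""
    anchors = []
    area = base_size ** 2
    for r in RATIOS:
        w = np.sqrt(area * r)   # from w / h = r and w * h = area
        h = w / r
        anchors.append((w, h))
    return anchors

def refine(anchor, dx, dy, dw, dh, cx=0.0, cy=0.0):
    """Translate and scale an anchor (cx, cy, w, h) with predicted
    offsets so it better fits the true target window."""
    w, h = anchor
    return (cx + dx * w, cy + dy * h, w * np.exp(dw), h * np.exp(dh))
```

With `base_size=8`, every anchor covers the same area (64 pixels) while spanning shapes from tall (ratio 0.33) to wide (ratio 3).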

### 1.2 Monocular Image Depth Estimation Algorithm


Figure 1 Monodepth2 algorithm effect

### 1.3 Anti-occlusion Target Tracking Algorithm Fusing Image Depth

Figure 2 Framework of anti-occlusion target tracking algorithm based on image depth

Di = mean(∑x,y Dt(x, y)); (4)
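Equation (4) averages the estimated depth map over the target region to obtain a single depth value per frame. A minimal sketch, assuming `depth_map` is the per-pixel depth Dt and `box` is the tracker's bounding box in `(x, y, w, h)` form:

```python
import numpy as np

def region_mean_depth(depth_map, box):
    """Eq. (4): mean of D_t(x, y) over the target bounding box."""
    x, y, w, h = box
    return float(depth_map[y:y+h, x:x+w].mean())

# Background at depth 5; a nearer surface (depth 2) covers the target box.
depth = np.full((10, 10), 5.0)
depth[2:5, 2:5] = 2.0
print(region_mean_depth(depth, (2, 2, 3, 3)))  # 2.0
```

A sudden drop in this mean relative to previous frames suggests that a nearer object has moved in front of the target, which is the cue the occlusion discrimination module exploits.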


S = S1·λ + S2·(1-λ). (7)
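Equation (7) fuses the anchor response score S1 with the occlusion discrimination score S2 by a weight λ, and the fused score is then used to reorder the tracker's anchors. An illustrative sketch (the array values are made up for demonstration; λ = 0.85 is the best setting from Table 1):

```python
import numpy as np

LAMBDA = 0.85  # best-performing weight from the ablation in Table 1

def fuse_scores(s1, s2, lam=LAMBDA):
    """Eq. (7): S = S1 * lambda + S2 * (1 - lambda)."""
    return s1 * lam + s2 * (1.0 - lam)

s1 = np.array([0.80, 0.78, 0.40])  # anchor response scores
s2 = np.array([0.10, 0.90, 0.50])  # occlusion discrimination scores

# Re-rank anchors by the fused score, best first. Anchor 0 has the
# highest raw response but a low occlusion score, so anchor 1 wins.
order = np.argsort(-fuse_scores(s1, s2))
```

Reordering by the fused score rather than the raw response is what lets the tracker avoid locking onto an occluder that happens to produce a strong appearance match.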

## 2 实验结果与分析

Table 1 The tracking results with different λ values

| λ | Tracking success rate | Tracking precision |
|------|-------|-------|
| 0.70 | 0.590 | 0.805 |
| 0.75 | 0.610 | 0.830 |
| 0.80 | 0.612 | 0.831 |
| 0.85 | 0.623 | 0.853 |
| 0.90 | 0.620 | 0.847 |
| 0.95 | 0.620 | 0.845 |
| 1.00 | 0.612 | 0.845 |
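Selecting λ from Table 1 amounts to a grid search over the fusion weight. Reading the table off as a dictionary, the best setting can be picked programmatically:

```python
# Table 1: lambda -> (tracking success rate, tracking precision).
results = {
    0.70: (0.590, 0.805), 0.75: (0.610, 0.830), 0.80: (0.612, 0.831),
    0.85: (0.623, 0.853), 0.90: (0.620, 0.847), 0.95: (0.620, 0.845),
    1.00: (0.612, 0.845),
}

# Compare (success rate, precision) lexicographically; 0.85 maximizes both.
best = max(results, key=lambda k: results[k])
print(best)  # 0.85
```

Note that λ = 1.00 recovers the baseline (occlusion score ignored), so the gap between the λ = 0.85 and λ = 1.00 rows isolates the contribution of the occlusion discrimination module.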

Table 2 Accuracy on 11 video sequences using different algorithms

Figure 3 Comparison of accuracy rates of video sequences with different attributes

Figure 4 Tracker performance comparison on 11 video sequences

Figure 5 Comparison of actual effects of algorithms in video sequences

## 3 Conclusion

[1] HENRIQUES J F, CASEIRO R, MARTINS P, et al. Exploiting the circulant structure of tracking-by-detection with kernels[C]// European Conference on Computer Vision. Berlin: Springer, 2012: 702-715.

[2] HENRIQUES J F, CASEIRO R, MARTINS P, et al. High-speed tracking with kernelized correlation filters [J]. IEEE transactions on pattern analysis and machine intelligence, 2015, 37(3): 583-596.

[3] TAO R, GAVVES E, SMEULDERS A W. Siamese instance search for tracking[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 1420-1429.

[4] BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional siamese networks for object tracking[C]// European Conference on Computer Vision. Berlin: Springer, 2016: 850-865.

[5] LI B, YAN J, WU W, et al. High performance visual tracking with siamese region proposal network[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 8971-8980.

[6] ZHU Z, WU W, ZOU W, et al. End-to-end flow correlation tracking with spatial-temporal attention[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 548-557.

[7] WU C L, ZHANG Y, ZHANG Y, et al. Motion guided siamese trackers for visual tracking [J]. IEEE access, 2020, 8:7473-7489.

[8] MUNARO M, BASSO F, MENEGATTI E. OpenPTrack: open source multi-camera calibration and people tracking for RGB-D camera networks [J]. Robotics & autonomous systems, 2016, 75:525-538.

[9] GODARD C, MAC AODHA O, FIRMAN M, et al. Digging into self-supervised monocular depth estimation[C]// Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE, 2019: 3828-3838.

[10] WU Y, LIM J, YANG M H. Object tracking benchmark [J]. IEEE transactions on pattern analysis and machine intelligence, 2015, 37(9): 1834-1848.

[11] MAO Xiaobo, ZHOU Xiaodong, LIU Yanhong. Improved TLD target tracking algorithm based on FAST feature points [J]. Journal of Zhengzhou University (engineering science), 2018, 39(2): 1-5, 17.

[12] LIU Minghua, WANG Chuansheng, HU Qiang, et al. Block-based target tracking with multi-model collaboration [J]. Journal of software, 2020, 31(2): 511-530.

[13] DANELLJAN M, HAGER G, SHAHBAZ KHAN F S, et al. Learning spatially regularized correlation filters for visual tracking[C]// Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 4310-4318.

[14] BERTINETTO L, VALMADRE J, GOLODETZ S, et al. Staple: complementary learners for real-time tracking[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 1401-1409.

[15] VALMADRE J, BERTINETTO L, HENRIQUES J F, et al. End-to-end representation learning for correlation filter based tracking[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 2805-2813.

[16] DANELLJAN M, HAGER G, KHAN F S, et al. Discriminative scale space tracking [J]. IEEE transactions on pattern analysis and machine intelligence, 2017, 39(8): 1561-1575.

# Anti-occlusion Target Tracking Algorithm Based on Image Depth

WANG Xipeng, LI Yong, LI Zhi, ZHANG Yan

(School of Information Engineering, Engineering University of People's Armed Police, Xi'an 710086, China)

Abstract: Owing to the limited information available in video, target tracking under occlusion remains a difficult problem. To address occlusion during tracking, this paper introduces image depth into a single-target tracking algorithm. First, a monocular depth estimation algorithm is applied to each frame to obtain the depth information of the image. Second, the tracking algorithm based on the siamese region proposal network is combined with the image depth to construct an occlusion discrimination module, which uses changes in the target's depth to detect occlusion. Finally, the occlusion discrimination score and the anchor response score are fused by weighting, and the anchors of the tracker are reordered according to the final response score to avoid interference from occluders. Experimental results on the OTB-2015 dataset show that the algorithm effectively mitigates the influence of occlusion on tracking performance, achieving an average success rate of 0.623 and an average tracking precision of 0.853, which are 1.7% and 0.9% higher than the benchmark algorithm, respectively.

Key words: siamese network; deep learning; target tracking; monocular depth estimation; anti-occlusion

doi:10.13705/j.issn.1671-6833.2021.05.011