Rui-Ze Han, Qing Guo, Wei Feng*
School of Computer Science and Technology, Tianjin University  
Figure 1. Pipeline of CRSR-based CF tracking. Static CRSR introduces the object saliency map of the first frame into the spatially variant weight map to highlight the target region. On this basis, temporal CRSR updates the weight map over time using the saliency map and the online learned filters, so that the filters adapt better to object variation during filter learning.
Spatial regularization (SR) is an effective tool for alleviating boundary effects and can significantly improve the accuracy and robustness of correlation filter (CF) based visual object tracking. The core of SR is a spatially variant weight map that regularizes the online learned correlation filters by selecting more meaningful samples. However, most existing trackers apply a data-independent SR weight map. In this paper, we show that content-related spatial regularization (CRSR) can further boost both tracking accuracy and robustness. Specifically, we propose to consider both frame saliency and spatial preference when generating the CRSR weight map online, and we propose a simple yet effective saliency-embedded CF objective function that simultaneously optimizes the filters and the CRSR weight map in the spatial-temporal domain. Extensive experiments validate that our content-related SR outperforms the classical SR, with higher tracking accuracy and nearly twice the speed.
Recently, correlation filter (CF) tracking, one of the best tracking frameworks, has shown continuous improvement in accuracy and robustness on various benchmarks. However, CF has an inherent drawback: boundary effects. Training samples are generated by circularly shifting a region centered at the target, and the resulting unrealistic samples lead to less discriminative filters. SRDCF addresses this problem by using a spatially variant weight map to regularize the correlation filters, and it has achieved the best performance on popular benchmarks such as OTB and VOT. However, its weight map for spatial regularization (SR) is generated from the bounding box of the target given at the first frame and stays fixed during the whole tracking process, so it ignores the object content information. Such a design is clearly not well suited to object tracking, which usually deals with irregular, nonrigid and temporally changing objects. To alleviate this problem, we propose content-related spatial regularization for correlation filters (CRSRCF), which introduces saliency information and the online learned filters into the SR weight map in order to account for target content information, i.e. the shape and its variation. As a result, CRSRCF can accurately track irregular, nonrigid and temporally changing targets. Specifically, we first propose static content-related SR, which introduces the target saliency map into the SR weight map at the first frame to highlight the target while suppressing its surroundings. We then propose a simple yet effective saliency-embedded CF objective function to simultaneously optimize the filters and the SR weight map. Experimental results show that our approach helps SRDCF track irregular, nonrigid and varying targets accurately and achieves much better performance than several state-of-the-art trackers on OTB-2015.
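To make the idea of static content-related SR more concrete, the following is a minimal sketch, not the paper's actual formulation. It assumes a quadratic, bounding-box-based weight map in the style commonly described for SRDCF, and then lowers the penalty wherever a saliency map marks the target. The fusion rule, the function names and the parameters `mu`, `eta` and `strength` are illustrative assumptions introduced here; the saliency-embedded objective that jointly optimizes the filters and the weight map is not reproduced.

```python
import numpy as np


def quadratic_sr_weights(height, width, target_h, target_w, mu=0.1, eta=3.0):
    """Data-independent SR weight map (assumed SRDCF-style quadratic form).

    The penalty is smallest at the patch center and grows quadratically
    with the distance from it, scaled by the target bounding-box size.
    """
    ys = np.arange(height) - (height - 1) / 2.0
    xs = np.arange(width) - (width - 1) / 2.0
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return mu + eta * ((yy / target_h) ** 2 + (xx / target_w) ** 2)


def content_related_weights(sr_weights, saliency, strength=0.9):
    """Hypothetical fusion of an SR weight map with a saliency map.

    The saliency map is rescaled to [0, 1]; salient pixels have their
    penalty reduced by up to `strength`, non-salient pixels keep the
    original penalty. This rule is an assumption for illustration only.
    """
    s = saliency.astype(float) - saliency.min()
    value_range = s.max()
    if value_range > 0:
        s = s / value_range
    return sr_weights * (1.0 - strength * s)


# Toy example: an off-center, box-shaped "salient" object in a 50x50 patch.
sr = quadratic_sr_weights(50, 50, target_h=20, target_w=10)
saliency = np.zeros((50, 50))
saliency[10:30, 20:35] = 1.0
crsr = content_related_weights(sr, saliency)
```

In this toy setting the resulting map follows the shape of the salient region rather than the bounding box alone, which is the intuition behind highlighting the target while suppressing its surroundings.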
Precision plots (left) and success plots (right) showing a comparison with state-of-the-art methods on OTB-2015. The legend contains the average distance precision score at 20 pixels and the AUC score for each tracker.
This video shows how the spatial weight map is updated over time via the saliency information and the online learned filters in CRSRCF. It also presents further qualitative results comparing our CRSRCF with four other CF-based trackers (KCF, Staple, CSR-DCF and SRDCF) on several scenarios from OTB-2015.
Content-Related Spatial Regularization for Visual Object Tracking
Rui-Ze Han, Qing Guo, Wei Feng. Content-Related Spatial Regularization for Visual Object Tracking.
In ICME 2018 (CCF-B).
[ PDF ]
[ BibTeX ]