Triple critical feature capture network: A triple critical feature capture network for weakly supervised object detection

Authors :
Zhoufeng Liu
Kaihua Wang
Chunlei Li
Shunmin Ding
Jiangtao Xi
Source :
IET Computer Vision, Vol 17, Iss 8, Pp 895-912 (2023)
Publication Year :
2023
Publisher :
Wiley, 2023.

Abstract

Weakly supervised object detection (WSOD) is becoming increasingly important for computer vision tasks, as it alleviates the burden of manual annotation. Most WSOD techniques rely on multiple instance learning (MIL), which tends to localise the discriminative parts of salient objects instead of the whole object. In addition, network training is often supervised using only image‐level annotations, without object counts or location information, which can make object instances hard to distinguish both in location and in semantics. To address these issues, an end‐to‐end triple critical feature capture network (TCFCNet) for WSOD is proposed. Specifically, a multi‐task branch, which performs fully supervised classification and regression, is integrated with proposal cluster learning (PCL) in an end‐to‐end network to refine object locations online. A cyclic parametric dropblock module (CPDM) is then designed to help the detector focus on contextual information: it uses cyclic masking to remove as much of the discriminative components of an object instance as possible, alleviating the part domination problem. Finally, a feature decoupling module (FDM), which contains a feature enhancement module and task‐specific polarisation functions, is proposed to further reduce the ambiguity between object instances by adaptively constructing robust critical features suited to the classification and regression tasks of the multi‐task branch. Comprehensive experiments are carried out on the challenging Pascal VOC 2007 and VOC 2012 datasets. The proposed method achieves 54.6% mAP and 44.3% mAP on Pascal VOC 2007 and VOC 2012, respectively, showing that it outperforms existing mainstream techniques by a considerable margin.
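
The cyclic masking idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' CPDM, whose parametric details are not given in this record. It only shows the general principle of repeatedly zeroing the strongest region of an activation map so that later passes must rely on the surrounding context. The function names `mask_peak_block` and `cyclic_mask`, the square window, and the parameters `block` and `cycles` are all assumptions made for illustration.

```python
from typing import List, Tuple

Grid = List[List[float]]


def mask_peak_block(feat: Grid, block: int) -> Tuple[Grid, Tuple[int, int]]:
    """Zero a block x block window centred on the strongest activation.

    Hypothetical helper: `block` is assumed to be odd so that the window
    is symmetric around the peak. The window is clipped at the borders.
    """
    h, w = len(feat), len(feat[0])
    # Locate the highest-valued cell (first one wins on ties).
    r0, c0 = max(
        ((r, c) for r in range(h) for c in range(w)),
        key=lambda rc: feat[rc[0]][rc[1]],
    )
    half = block // 2
    out = [row[:] for row in feat]  # copy so the input is left untouched
    for r in range(max(0, r0 - half), min(h, r0 + half + 1)):
        for c in range(max(0, c0 - half), min(w, c0 + half + 1)):
            out[r][c] = 0.0
    return out, (r0, c0)


def cyclic_mask(feat: Grid, block: int = 3, cycles: int = 2) -> Grid:
    """Apply peak masking `cycles` times, each time on the current peak."""
    for _ in range(cycles):
        feat, _ = mask_peak_block(feat, block)
    return feat
```

Each cycle suppresses the region that currently dominates the map, so a second cycle removes the next most prominent region instead of revisiting the first. In a real detector the mask would be applied to convolutional feature maps during training only.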

Details

Language :
English
ISSN :
1751-9640 and 1751-9632
Volume :
17
Issue :
8
Database :
Directory of Open Access Journals
Journal :
IET Computer Vision
Publication Type :
Academic Journal
Accession number :
edsdoj.1b408c7217c84ab5879af00220b243a3
Document Type :
article
Full Text :
https://doi.org/10.1049/cvi2.12203