Triple critical feature capture network: A triple critical feature capture network for weakly supervised object detection
- Source :
- IET Computer Vision, Vol 17, Iss 8, Pp 895-912 (2023)
- Publication Year :
- 2023
- Publisher :
- Wiley, 2023.
Abstract
- Weakly supervised object detection (WSOD) is becoming increasingly important for computer vision tasks, as it alleviates the burden of manual annotation. Most WSOD techniques rely on multiple instance learning (MIL), which tends to localise the discriminative parts of salient objects instead of the whole object. In addition, network training is often supervised using only image‐level annotations, without object quantities or location information, which can lead to ambiguous differentiation of object instances in terms of both location and semantics. To address these issues, an end‐to‐end triple critical feature capture network (TCFCNet) for WSOD is proposed. Specifically, a multi‐task branch, which performs fully supervised classification and regression, is integrated with PCL in an end‐to‐end network to refine object locations online. A cyclic parametric dropblock module (CPDM) is then designed to help the detector focus on contextual information: it uses cyclic masking to remove as much of the discriminative components of an object instance as possible, alleviating the part domination problem. Finally, a feature decoupling module (FDM), which contains a feature enhancement module and task‐specific polarisation functions, is proposed to further reduce the ambiguous distinction of object instances by adaptively constructing robust critical features suited to the classification and regression tasks of the multi‐task branch. Comprehensive experiments are carried out on the challenging Pascal VOC 2007 and VOC 2012 datasets. The proposed method achieves 54.6% mAP and 44.3% mAP on Pascal VOC 2007 and VOC 2012 respectively, outperforming existing mainstream techniques by a considerable margin.
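The cyclic masking idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' CPDM, whose parameters and training procedure are not given in this record. It is a minimal, hypothetical illustration that assumes a 2D activation map, a fixed square block size, and a fixed number of cycles. The names `mask_peak_block` and `cyclic_mask` are invented for this example.

```python
from typing import List

Grid = List[List[float]]


def mask_peak_block(act: Grid, block: int = 3) -> Grid:
    """Return a copy of `act` with a block x block window zeroed
    around the strongest activation (the assumed discriminative peak)."""
    h, w = len(act), len(act[0])
    # Locate the cell with the highest activation; ties go to the first cell found.
    pr, pc = max(
        ((r, c) for r in range(h) for c in range(w)),
        key=lambda rc: act[rc[0]][rc[1]],
    )
    half = block // 2
    out = [row[:] for row in act]  # copy so the input is left untouched
    for r in range(max(0, pr - half), min(h, pr + half + 1)):
        for c in range(max(0, pc - half), min(w, pc + half + 1)):
            out[r][c] = 0.0
    return out


def cyclic_mask(act: Grid, block: int = 3, cycles: int = 2) -> Grid:
    """Apply peak masking repeatedly, so each cycle hides the next
    most strongly activated region (hypothetical schedule)."""
    out = act
    for _ in range(cycles):
        out = mask_peak_block(out, block)
    return out
```

Each cycle suppresses the currently strongest region, so a detector trained on the masked map would have to rely on the remaining, less discriminative context. How the published module chooses block sizes and cycle counts is not stated here.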
Details
- Language :
- English
- ISSN :
- 1751-9640 and 1751-9632
- Volume :
- 17
- Issue :
- 8
- Database :
- Directory of Open Access Journals
- Journal :
- IET Computer Vision
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.1b408c7217c84ab5879af00220b243a3
- Document Type :
- article
- Full Text :
- https://doi.org/10.1049/cvi2.12203