Junsuk Choe, Dongyoon Han, Sangdoo Yun, Jung-Woo Ha, Seong Joon Oh, and Hyunjung Shim
• We propose RDAP, a new WSOL method that induces a model to learn the less discriminative parts of an object.
• The proposed method is more robust to hyperparameter selection than previous WSOL methods.
• We achieve state-of-the-art performance on two popular datasets with four architectures.

Weakly supervised object localization (WSOL) methods utilize the internal feature responses of a classifier trained only on image-level labels. Such classifiers tend to focus on the most discriminative part of the target object instead of covering its full extent. Adversarial erasing (AE) techniques have been proposed to ameliorate this problem: they erase the most discriminative part during training, thereby encouraging the classifier to learn the less discriminative parts of the object. Despite the success of AE-based methods, we have observed that their hyperparameters fail to generalize across model architectures and datasets, so a new set of hyperparameters must be determined for each combination. This hyperparameter selection frequently requires strong supervision (e.g., pixel-level annotations or human inspection). Because WSOL is premised on the assumption that such strong supervision is absent, the applicability of AE-based methods is limited. In this paper, we propose the region-based dropout with attention prior (RDAP) algorithm, which features hyperparameter transferability. We combine AE with regional dropout algorithms, which are more stable with respect to hyperparameter selection. We empirically confirm that RDAP achieves state-of-the-art localization accuracy on four architectures, namely VGG-GAP, InceptionV3, ResNet-50 SE, and PreResNet-18, and two datasets, namely CUB-200-2011 and ImageNet-1k, with a single set of hyperparameters.
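To make the adversarial-erasing idea described above concrete, the following is a minimal NumPy sketch of the core masking step: a class activation map is normalized, pixels above a drop threshold (the most discriminative region) are zeroed out, and the erased image is fed back to the classifier during training. The function name, the thresholding scheme, and the `drop_threshold` parameter are illustrative assumptions, not the paper's actual implementation; the threshold is exactly the kind of hyperparameter whose sensitivity motivates RDAP.

```python
import numpy as np

def erase_most_discriminative(image, cam, drop_threshold=0.8):
    """Zero out pixels whose normalized activation exceeds drop_threshold.

    image: (H, W, C) float array; cam: (H, W) attention map from the classifier.
    NOTE: this is a hedged sketch of generic adversarial erasing, not the
    authors' RDAP method; names and the threshold value are assumptions.
    """
    # Normalize the activation map to [0, 1] (epsilon guards a constant map).
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    # Keep only the less-activated (less discriminative) pixels.
    mask = (cam < drop_threshold).astype(image.dtype)
    return image * mask[..., None]

# Toy example: a 4x4 "image" whose activation map has one hot spot.
img = np.ones((4, 4, 3))
cam = np.zeros((4, 4))
cam[0, 0] = 1.0  # most discriminative location
erased = erase_most_discriminative(img, cam)
```

In a full AE pipeline this masking would be applied inside the training loop, with the erased image passed through the network again so the classifier is forced to respond to the remaining object regions.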