22 results for "fine-grained recognition"
Search Results
2. CEKD: Cross ensemble knowledge distillation for augmented fine-grained data.
- Author
-
Zhang, Ke, Fan, Jin, Huang, Shaoli, Qiao, Yongliang, Yu, Xiaofeng, and Qin, Feiwei
- Subjects
MACHINE learning, DATA augmentation, NETWORK performance
- Abstract
Data augmentation has proved effective in training deep models. Existing data augmentation methods tackle the fine-grained problem by blending image pairs and fusing the corresponding labels according to the statistics of the mixed pixels, which produces additional noise that harms network performance. Motivated by this, we present a simple yet effective cross ensemble knowledge distillation (CEKD) model for fine-grained feature learning. We propose a cross distillation module that provides additional supervision to alleviate the noise problem, and a collaborative ensemble module to overcome the target conflict problem. The proposed model can be trained in an end-to-end manner and requires only image-level label supervision. Extensive experiments on widely used fine-grained benchmarks demonstrate the effectiveness of the proposed model. Specifically, with a ResNet-101 backbone, CEKD achieves accuracies of 89.59%, 95.96% and 94.56% on three datasets respectively, outperforming the state-of-the-art API-Net by 0.99%, 1.06% and 1.16%. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
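As a rough illustration of the distillation idea in the CEKD abstract above (one network's softened predictions supervising another), here is a generic temperature-scaled distillation loss in NumPy. This is a sketch of standard knowledge distillation, not the authors' exact cross distillation module; the temperature value and the toy logits are assumptions.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    e = np.exp((z - z.max(axis=-1, keepdims=True)) / T)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Mean KL divergence between softened teacher and student
    distributions. The teacher's soft targets are the extra
    supervision that distillation adds on top of hard labels."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return float(np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1).mean())

# Toy check: the loss vanishes when student and teacher agree exactly.
logits = np.array([[2.0, 0.5, -1.0]])
assert abs(distillation_loss(logits, logits)) < 1e-9
```

Minimizing this loss pulls the student's distribution toward the teacher's soft targets, which is the sense in which the distilled supervision "alleviates the noise problem" of mixed labels.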
3. PlantNet: transfer learning-based fine-grained network for high-throughput plants recognition.
- Author
-
Yang, Ziying, He, Wenyan, Fan, Xijian, and Tjahjadi, Tardi
- Subjects
CONVOLUTIONAL neural networks, DEEP learning, PLANT breeding
- Abstract
In high-throughput phenotyping, recognizing individual plant categories is a vital support process for plant breeding. However, different plant categories have different fine-grained characteristics, i.e., intra-class variation and inter-class similarity, making recognition challenging. Existing deep learning-based recognition methods fail to address this task effectively, suffering from low accuracy and a lack of generalization robustness. To address these shortcomings, this paper proposes PlantNet, a fine-grained network for plant recognition based on transfer learning and a bilinear convolutional neural network, which achieves high recognition accuracy in high-throughput phenotyping scenarios. The network operates as follows. First, two deep feature extractors are constructed using transfer learning. The outer products of the two extractors' features at corresponding spatial locations are then calculated, and bilinear convergence is computed over the spatial locations. Finally, the fused bilinear vectors are normalized via maximum expectation to generate the network output. Experiments on a publicly available Arabidopsis dataset show that the proposed bilinear model performs better than related state-of-the-art methods. The inter-class recognition accuracies for the four Arabidopsis species Sf-2, Cvi, Landsberg and Columbia are found to be 98.48%, 96.53%, 96.79% and 97.33%, respectively, with an average accuracy of 97.25%. Thus, the network has good generalization ability and robust performance, satisfying the needs of fine-grained plant recognition in agricultural production. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
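The bilinear step in the PlantNet abstract, an outer product of two extractors' features accumulated over spatial locations and then normalized, can be sketched as the standard bilinear-CNN pooling below. The feature shapes, the random input, and the signed-square-root/L2 normalization follow the common bilinear-CNN recipe and are assumptions rather than details taken from the paper.

```python
import numpy as np

def bilinear_pool(feat_a, feat_b):
    """Bilinear pooling of two feature maps.

    feat_a: (c1, h, w) features from extractor A
    feat_b: (c2, h, w) features from extractor B
    Returns a normalized (c1 * c2,) bilinear vector.
    """
    c1, h, w = feat_a.shape
    c2 = feat_b.shape[0]
    a = feat_a.reshape(c1, h * w)
    b = feat_b.reshape(c2, h * w)
    # Sum of per-location outer products, averaged over locations.
    bilinear = a @ b.T / (h * w)              # (c1, c2)
    v = bilinear.reshape(-1)
    v = np.sign(v) * np.sqrt(np.abs(v))       # signed square root
    return v / (np.linalg.norm(v) + 1e-12)    # L2 normalization

rng = np.random.default_rng(0)
v = bilinear_pool(rng.normal(size=(8, 7, 7)), rng.normal(size=(16, 7, 7)))
assert v.shape == (128,)
```

The matrix product `a @ b.T` is exactly the sum of the outer products at each spatial location, which is why the whole pooling collapses to one line.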
4. Enhancing Mixture-of-Experts by Leveraging Attention for Fine-Grained Recognition.
- Author
-
Zhang, Lianbo, Huang, Shaoli, and Liu, Wei
- Abstract
Differentiating subcategories of a common visual category is challenging because of the similar appearance shared among different classes in fine-grained recognition. Existing mixture-of-expert based methods divide the fine-grained space into some specific regions and solve the integrated problem by conquering subspace ones. However, it is not feasible to learn diverse experts directly through data partition strategy because of limited data available for fine-grained recognition problems. To address the issue, we leverage visual attention to learn an enhanced experts’ mixture. Specifically, we introduce a gradually-enhanced learning strategy from model attention. The strategy promotes diversity among experts by feeding each expert with full-size data distinct in granularity. We further promote expert’s learning by providing it with a larger data space, which is achieved by swapping attentive regions within positive pairs. Our method learns new experts on the dataset with the prior knowledge from former experts sequentially and enforces the experts to learn more diverse but discriminative representation. These enhanced experts are finally combined to make stronger predictions. We conduct extensive experiments on fine-grained benchmarks. The results show that our method consistently outperforms the state-of-the-art method in both weakly supervised localization and fine-grained image classification. Our code is publicly available at https://github.com/lbzhang/Enhanced-Expert-FGVC-Pytorch.git. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
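The final stage described above, where the enhanced experts "are finally combined to make stronger predictions", can be sketched as a simple average of the experts' class distributions. This is a generic mixture-of-experts readout for illustration; the expert count and toy logits are made up, and the paper's actual combination may be weighted or learnt.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def combine_experts(expert_logits):
    """Average the class distributions of several experts.

    expert_logits: (n_experts, n_classes) raw scores, one row per expert.
    Returns the mixed distribution used for the final prediction.
    """
    probs = softmax(expert_logits)
    return probs.mean(axis=0)

# Three toy experts that disagree individually; the mixture still
# yields a single, properly normalized class distribution.
logits = np.array([[3.0, 1.0, 0.0],
                   [0.5, 2.5, 0.0],
                   [2.0, 0.0, 0.5]])
mixed = combine_experts(logits)
assert abs(mixed.sum() - 1.0) < 1e-9
```

Diversity among the experts is what makes such an average stronger than any single member, which is why the paper invests in granularity-distinct training data per expert.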
5. Vision-based Autonomous Vehicle Recognition: A New Challenge for Deep Learning-based Systems.
- Author
-
BOUKERCHE, AZZEDINE and XIREN MA
- Subjects
DEEP learning, INTELLIGENT transportation systems, FEATURE extraction, VEHICLE models, AUTONOMOUS vehicles
- Abstract
Vision-based Automated Vehicle Recognition (VAVR) has attracted considerable attention recently. Particularly given the reliance on emerging deep learning methods, which have powerful feature extraction and pattern learning abilities, vehicle recognition has made significant progress. VAVR is an essential part of Intelligent Transportation Systems. The VAVR system can quickly and accurately locate a target vehicle, which significantly helps improve regional security. A comprehensive VAVR system contains three components: Vehicle Detection (VD), Vehicle Make and Model Recognition (VMMR), and Vehicle Re-identification (VRe-ID). These components perform coarse-to-fine recognition tasks in three steps. In this article, we conduct a thorough review and comparison of the state-of-the-art deep learning-based models proposed for VAVR. We present a detailed introduction to the different vehicle recognition datasets used for a comprehensive evaluation of the proposed models. We also critically discuss the major challenges and future research trends involved in each task. Finally, we summarize the characteristics of the methods for each task. Our comprehensive model analysis will help researchers interested in VD, VMMR, and VRe-ID and provide them with possible directions for solving current challenges and further improving the performance and robustness of models. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
6. CarVideos: A Novel Dataset for Fine-Grained Car Classification in Videos
- Author
-
Alsahafi, Yousef, Lemmond, Daniel, Ventura, Jonathan, Boult, Terrance, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, and Latifi, Shahram, editor
- Published
- 2019
- Full Text
- View/download PDF
7. Playing to distraction: towards a robust training of CNN classifiers through visual explanation techniques.
- Author
-
Morales, David, Talavera, Estefania, and Remeseiro, Beatriz
- Subjects
DEEP learning, CONVOLUTIONAL neural networks, ARTIFICIAL neural networks, DISTRACTION
- Abstract
The field of deep learning is evolving in different directions, and more efficient training strategies are still needed. In this work, we present a novel and robust training scheme that integrates visual explanation techniques into the learning process. Unlike attention mechanisms that focus on the relevant parts of images, we aim to improve the robustness of the model by making it pay attention to other regions as well. Broadly speaking, the idea is to distract the classifier during learning by forcing it to focus not only on relevant regions but also on those that, a priori, are not so informative for discriminating the class. We tested the proposed approach by embedding it into the learning process of a convolutional neural network for the analysis and classification of two well-known datasets, namely Stanford Cars and FGVC-Aircraft. Furthermore, we evaluated our model on a real-case scenario for the classification of egocentric images, allowing us to obtain relevant information about people's lifestyles. In particular, we work on the challenging EgoFoodPlaces dataset, achieving state-of-the-art results with a lower level of complexity. The results indicate the suitability of the proposed training scheme for image classification, improving the robustness of the final model. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
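The "distraction" idea above, occluding the most relevant region found by a visual explanation so the classifier must also use less informative areas, can be sketched as masking the highest-activation patch of a saliency map. This is a simplified stand-in for the paper's pipeline; the patch size and the toy image/saliency map are assumptions.

```python
import numpy as np

def distract(image, saliency, patch=4):
    """Zero out the patch around the saliency peak, forcing the
    classifier to rely on other regions in the next training pass."""
    y, x = np.unravel_index(saliency.argmax(), saliency.shape)
    y0, x0 = max(0, y - patch // 2), max(0, x - patch // 2)
    out = image.copy()
    out[y0:y0 + patch, x0:x0 + patch] = 0.0
    return out

rng = np.random.default_rng(1)
img = rng.uniform(size=(16, 16))
sal = np.zeros((16, 16))
sal[8, 8] = 1.0                       # saliency peak at the center
masked = distract(img, sal)
assert masked[8, 8] == 0.0            # peak region is occluded
assert masked[0, 0] == img[0, 0]      # far-away pixels untouched
```

Training on such masked inputs alongside the originals is one simple way to realize the scheme's goal of spreading the model's attention beyond the single most discriminative region.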
8. Automated Visual Large Scale Monitoring of Faunal Biodiversity.
- Author
-
Radig, Bernd, Bodesheim, Paul, Korsch, Dimitri, Denzler, Joachim, Haucke, Timm, Klasen, Morris, and Steinhage, Volker
- Abstract
To observe biodiversity, the variety of plant and animal life in the world or in a particular habitat, human observers make the most common examinations, often assisted by technical equipment. Measuring objectively the number of different species of animals, plants, fungi, and microbes that make up the ecosystem can be difficult. In order to monitor changes in biodiversity, data have to be compared across space and time. Cameras are an essential sensor to determine the species range, abundance, and behavior of animals. The millions of recordings from camera traps set up in natural environments can no longer be analyzed by biologists. We started research on doing this analysis automatically without human interaction. The focus of our present sensor is on image capture of wildlife and moths. Special hardware elements for the detection of different species are designed, implemented, tested, and improved, as well as the algorithms for classification and counting of samples from images and image sequences, e.g., to calculate presence, absence, and abundance values or the duration of characteristic activities related to the spatial mobilities. For this purpose, we are developing stereo camera traps that allow spatial reconstruction of the observed animals. This allows three-dimensional coordinates to be recorded and the shape to be characterized. With this additional feature data, species identification and movement detection are facilitated. To classify and count moths, they are attracted to an illuminated screen, which is then photographed at intervals by a high-resolution color camera. To greatly reduce the volume of data, redundant elements and elements that are consistent from image to image are eliminated. All design decisions take into account that at remote sites and in fully autonomous operation, power supply on the one hand and possibilities for data exchange with central servers on the other hand are limited. 
Installation at hard-to-reach locations requires a sophisticated and demanding system design with an optimal balance between power requirements, bandwidth for data transmission, required service and operation in all environmental conditions for at least ten years. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
9. Fine-grained vehicle model recognition on augmented data based on AT-PGGAN (基于AT-PGGAN的增强数据车辆型号精细识别).
- Author
-
杨昌东, 余烨, 徐珑刀, 付源梓, and 路强
- Subjects
INTELLIGENT transportation systems, DEEP learning, COMPUTER vision, COMPUTER engineering, FEATURE extraction, VEHICLE models
- Abstract
Copyright of Journal of Image & Graphics is the property of Editorial Office of Journal of Image & Graphics and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2020
- Full Text
- View/download PDF
10. Fine-grained vehicle type recognition based on deep convolution neural networks
- Author
-
Hongcai CHEN, Yu CHENG, and Changyou ZHANG
- Subjects
computer neural network, vehicle recognition, convolution neural networks, fine-grained recognition, deep learning, Technology
- Abstract
Public security and traffic departments put forward higher requirements for the real-time performance and accuracy of vehicle type recognition in complex traffic scenes. Aiming at the problems of heavy occupation of police forces, low retrieval efficiency, and the lack of intelligent means for dealing with false licenses, fake-plate vehicles, and unplated vehicles, this paper proposes a fine-grained vehicle type recognition method based on the GoogLeNet deep convolutional neural network. The filter sizes and numbers of the convolutional neural network are designed, the activation function and vehicle type classifier are optimally selected, and a new network framework is constructed for fine-grained vehicle type recognition. The experimental results show that the proposed method achieves 97% accuracy for fine-grained vehicle type recognition, a clear improvement over the original GoogLeNet model. Moreover, the new model effectively reduces the number of training parameters and saves memory. Fine-grained vehicle type recognition can be used in intelligent traffic management and has important theoretical research value and practical significance.
- Published
- 2017
- Full Text
- View/download PDF
11. Fine-grained vehicle model recognition with region proposal networks (区域建议网络的细粒度车型识别).
- Author
-
杨娟, 曹浩宇, 汪荣贵, 薛丽霞, and 胡敏
- Published
- 2018
- Full Text
- View/download PDF
12. A fine-grained car model recognition method based on convolutional neural networks (基于卷积神经网络的轿车车型精细识别方法).
- Author
-
陈宏彩, 程煜, and 张常有
- Abstract
Copyright of Journal of Hebei University of Science & Technology is the property of Hebei University of Science & Technology, Journal of Hebei University of Science & Technology and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2017
- Full Text
- View/download PDF
13. Deep CNNs With Spatially Weighted Pooling for Fine-Grained Car Recognition.
- Author
-
Hu, Qichang, Wang, Huibing, Li, Teng, and Shen, Chunhua
- Abstract
Fine-grained car recognition aims to recognize the category information of a car, such as car make, car model, or even year of manufacture. A number of recent studies have shown that a deep convolutional neural network (DCNN) trained on a large-scale data set can achieve impressive results at a range of generic object classification tasks. In this paper, we propose a spatially weighted pooling (SWP) strategy, which considerably improves the robustness and effectiveness of the feature representation of most dominant DCNNs. More specifically, the SWP is a novel pooling layer, which contains a predefined number of spatially weighted masks or pooling channels. The SWP pools the extracted features of DCNNs with the guidance of its learnt masks, which measure the importance of the spatial units in terms of discriminative power. Like existing methods that apply uniform grid pooling on the convolutional feature maps of DCNNs, the proposed method can extract the convolutional features and generate the pooling channels from a single DCNN, so minimal modification is needed in terms of implementation. Moreover, the parameters of the SWP layer can be learned in the end-to-end training process of the DCNN. By applying our method to several fine-grained car recognition data sets, we demonstrate that the proposed method can achieve better performance than recent approaches in the literature. We advance the state-of-the-art results by improving the accuracy from 92.6% to 93.1% on the Stanford Cars-196 data set and from 91.2% to 97.6% on the recent CompCars data set. We have also tested the proposed method on two additional large-scale data sets with impressive results observed. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
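The SWP layer described above, a set of learnt spatial masks each producing one weighted-pooled descriptor, can be sketched in a few lines. The mask count, the shapes, and the random values standing in for learnt parameters are all assumptions for illustration.

```python
import numpy as np

def spatially_weighted_pool(features, masks):
    """Pool feature maps with a bank of spatial weight masks.

    features: (c, h, w) convolutional feature maps
    masks:    (k, h, w) spatial weight masks, one per pooling channel
              (learnt end-to-end in the real layer)
    Returns a (k, c) matrix: one mask-weighted average per channel.
    """
    c, h, w = features.shape
    k = masks.shape[0]
    f = features.reshape(c, h * w)    # flatten spatial dimensions
    m = masks.reshape(k, h * w)
    return m @ f.T / (h * w)          # (k, c) pooled descriptors

rng = np.random.default_rng(2)
pooled = spatially_weighted_pool(rng.normal(size=(32, 7, 7)),
                                 rng.uniform(size=(9, 7, 7)))
assert pooled.shape == (9, 32)
```

Compared with uniform grid pooling, each mask here can concentrate its weight on whichever spatial units are discriminative, which is the layer's stated advantage.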
14. Learn from each other to Classify better: Cross-layer mutual attention learning for fine-grained visual classification.
- Author
-
Liu, Dichao, Zhao, Longjiao, Wang, Yu, and Kato, Jien
- Subjects
VISUAL learning, DEEP learning, CONVOLUTIONAL neural networks, IMAGE recognition (Computer vision), SOURCE code
- Abstract
• A multi-step framework for improving the accuracy of fine-grained image recognition. • A design for different layers to learn from each other to boost overall performance. • Extensive experiments demonstrate and prove the effectiveness of the proposed idea. • The proposed method reports state-of-the-art results on three challenging datasets. Fine-grained visual classification (FGVC) is valuable yet challenging. The difficulty of FGVC mainly lies in its intrinsic inter-class similarity, intra-class variation, and limited training data. Moreover, with the popularity of deep convolutional neural networks, researchers have mainly used deep, abstract, semantic information for FGVC, while shallow, detailed information has been neglected. This work proposes a cross-layer mutual attention learning network (CMAL-Net) to solve the above problems. Specifically, this work views the shallow to deep layers of CNNs as "experts" knowledgeable about different perspectives. We let each expert give a category prediction and an attention region indicating the found clues. Attention regions are treated as information carriers among experts, bringing three benefits: (i) helping the model focus on discriminative regions; (ii) providing more training data; (iii) allowing experts to learn from each other to improve the overall performance. CMAL-Net achieves state-of-the-art performance on three competitive datasets: FGVC-Aircraft, Stanford Cars, and Food-11. The source code is available at https://github.com/Dichao-Liu/CMAL [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
15. Deep Kernelized Network for Fine-Grained Recognition
- Author
-
M. Amine Mahmoudi, Aladine Chetouani, Fatma Boufera, Hedi Tabia, University of Mustapha Stambouli Mascara, Laboratoire pluridisciplinaire de recherche en ingénierie des systèmes, mécanique et énergétique (PRISME), Université d'Orléans (UO)-Institut National des Sciences Appliquées - Centre Val de Loire (INSA CVL), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA), Informatique, BioInformatique, Systèmes Complexes (IBISC), and Université d'Évry-Val-d'Essonne (UEVE)-Université Paris-Saclay
- Subjects
[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], 0103 physical sciences, 0202 electrical engineering, electronic engineering, information engineering, Kernel function, [INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV], 020201 artificial intelligence & image processing, Deep learning, 02 engineering and technology, 010306 general physics, Facial expression recognition, 01 natural sciences, Fine-grained recognition, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
- Abstract
Convolutional Neural Networks (CNNs) are based on linear kernels at different levels of the network. Linear kernels are not efficient, particularly when the original data is not linearly separable. In this paper, we focus on this issue by investigating the impact of using higher-order kernels. For this purpose, we replace convolution layers with the Kervolution layers proposed in [28]. Similarly, we replace fully connected layers alternatively with the Kernelized Dense Layers (KDL) proposed in [16] and Kernel Support Vector Machines (SVMs) [1]. These kernel-based methods are more discriminative in that they can learn more complex patterns than linear ones. They first map the input data to a higher-dimensional space and then learn a linear classifier in that space, which is equivalent to a powerful non-linear classifier in the original space. We used fine-grained datasets, namely FGVC-Aircraft, StanfordCars and CVPRIndoor, as well as Facial Expression Recognition (FER) datasets, namely RAF-DB, ExpW and FER2013, to evaluate the performance of these methods. The experimental results demonstrate that these methods outperform ordinary linear layers when used in a deep network fashion.
- Published
- 2021
- Full Text
- View/download PDF
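The higher-order replacement for a linear layer can be illustrated with a polynomial-kernel dense unit, which responds to products of input features rather than a plain weighted sum. The degree, bias, weights, and toy data below are illustrative assumptions, not the exact KDL or Kervolution definitions the paper cites.

```python
import numpy as np

def polynomial_dense(x, W, c=0.0, degree=2):
    """Polynomial-kernel dense layer: ((x @ W) + c) ** degree.

    Unlike a plain linear layer, the response depends on products of
    input features, so it can separate non-linearly-separable data.
    """
    return (x @ W + c) ** degree

# XOR-like toy data: not linearly separable in the raw 2-D space.
x = np.array([[0.0, 0.0], [1.0, 1.0],      # class A
              [0.0, 1.0], [1.0, 0.0]])     # class B
W = np.array([[1.0], [-1.0]])
out = polynomial_dense(x, W)               # degree-2 unit, zero bias
# The degree-2 unit maps the two XOR classes to distinct responses:
assert np.allclose(out[:2], 0.0) and np.allclose(out[2:], 1.0)
```

No single linear unit can separate these four points, while one degree-2 unit does, which is the paper's core argument for higher-order kernels.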
16. Task-Driven Progressive Part Localization for Fine-Grained Object Recognition.
- Author
-
Huang, Chen, He, Zhihai, Cao, Guitao, and Cao, Wenming
- Abstract
The problem of fine-grained object recognition is very challenging due to the subtle visual differences between different object categories. In this paper, we propose a task-driven progressive part localization (TPPL) approach for fine-grained object recognition. Most existing methods follow a two-step approach that first detects salient object parts to suppress the interference from background scenes and then classifies objects based on features extracted from these regions. The part detector and object classifier are often independently designed and trained. In this paper, our major finding is that the part detector should be jointly designed and progressively refined with the object classifier so that the detected regions can provide the most distinctive features for final object recognition. Specifically, we develop a part-based SPP-net (Part-SPP) as our baseline part detector. We then establish a TPPL framework, which takes the predicted boxes of Part-SPP as an initial guess, and then examines new regions in the neighborhood using a particle swarm optimization approach, searching for more discriminative image regions to maximize the objective function and the recognition performance. This procedure is performed in an iterative manner to progressively improve the joint part detection and object classification performance. Experimental results on the Caltech-UCSD Birds-200-2011 dataset demonstrate that our method outperforms state-of-the-art fine-grained categorization methods both in part localization and classification, even without requiring a bounding box during testing. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
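The particle-swarm search over neighboring regions that the TPPL abstract describes can be sketched with a minimal PSO loop refining a part-box center against a scoring function. The quadratic toy objective, swarm size, and coefficients are assumptions; in the real framework the score would be the recognition objective driven by the classifier.

```python
import numpy as np

def pso_refine(score_fn, init_center, n_particles=20, iters=60, seed=0):
    """Refine a part-region center (x, y) with a minimal particle swarm.

    Particles start near the initial guess and move toward the best
    positions found so far, maximizing score_fn.
    """
    rng = np.random.default_rng(seed)
    pos = init_center + rng.normal(scale=5.0, size=(n_particles, 2))
    vel = np.zeros_like(pos)
    pbest = pos.copy()                                 # personal bests
    pbest_val = np.array([score_fn(p) for p in pos])
    gbest = pbest[pbest_val.argmax()].copy()           # global best
    for _ in range(iters):
        r1, r2 = rng.uniform(size=(2, n_particles, 1))
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([score_fn(p) for p in pos])
        improved = vals > pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        gbest = pbest[pbest_val.argmax()].copy()
    return gbest

# Toy objective: the most discriminative part center sits at (30, 40).
target = np.array([30.0, 40.0])
score = lambda p: -float(np.sum((p - target) ** 2))
best = pso_refine(score, np.array([10.0, 10.0]))
assert np.linalg.norm(best - target) < 1.0
```

Swapping the toy objective for the classifier's confidence on the cropped region recovers the spirit of the "task-driven" refinement: localization is steered directly by recognition performance.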
17. A novel part-level feature extraction method for fine-grained vehicle recognition.
- Author
-
Lu, Lei, Wang, Ping, and Cao, Yijie
- Subjects
ARTIFICIAL neural networks, CONVOLUTIONAL neural networks, FEATURE extraction, DEEP learning
- Abstract
In this paper, we propose a novel part-level feature extraction method to enhance the discriminative ability of deep convolutional features for the task of fine-grained vehicle recognition. Generally, the challenges for fine-grained vehicle recognition are mainly caused by the subtle visual differences between part regions of vehicles. Therefore, it is essential to extract discriminative features from part regions. Many existing methods, especially deep convolutional neural networks (D-CNNs), tend to detect the discriminative part regions explicitly or learn the part information implicitly through network restructuring and neglect the abundant part-level information contained in the high-level features generated by CNNs. In light of this, we propose a simple and effective part-level feature extraction method to enhance the representation of part-level features within the global features of target object generated by the backbone networks. The proposed method is built on the deep convolutional layers from which the discriminative part features could be integrated and extracted accordingly. More specifically, a basic feature grouping module is adopted to integrate the feature maps of deep convolutional layers into groups in each of which the related discriminative parts are assembled. The feature grouping process is performed in a multi-stage manner to ensure the integration process. Then a fusion module follows to model the coarse-to-fine relationship of the part features and further ensure the integrity and effectiveness of the part features. We conduct comparison experiments on public datasets, and the results show that the proposed method achieves comparable performance with state-of-the-art algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
18. Playing to distraction: towards a robust training of CNN classifiers through visual explanation techniques
- Author
-
Estefania Talavera, Beatriz Remeseiro, and David Aguilera Morales
- Subjects
FOS: Computer and information sciences, Computer science, Image classification, Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine learning, computer.software_genre, Convolutional neural network, Field (computer science), Artificial Intelligence, Robustness (computer science), Learning process, Classifier (linguistics), Egocentric vision, Visual explanation techniques, Contextual image classification, business.industry, Deep learning, Class (biology), Fine-grained recognition, Artificial Intelligence (cs.AI), A priori and a posteriori, Convolutional neural networks, Artificial intelligence, business, computer, Software
- Abstract
The field of deep learning is evolving in different directions, and more efficient training strategies are still needed. In this work, we present a novel and robust training scheme that integrates visual explanation techniques into the learning process. Unlike attention mechanisms that focus on the relevant parts of images, we aim to improve the robustness of the model by making it pay attention to other regions as well. Broadly speaking, the idea is to distract the classifier in the learning process to force it to focus not only on relevant regions but also on those that, a priori, are not so informative for the discrimination of the class. We tested the proposed approach by embedding it into the learning process of a convolutional neural network for the analysis and classification of two well-known datasets, namely Stanford Cars and FGVC-Aircraft. Furthermore, we evaluated our model on a real-case scenario for the classification of egocentric images, allowing us to obtain relevant information about people's lifestyles. In particular, we work on the challenging EgoFoodPlaces dataset, achieving state-of-the-art results with a lower level of complexity. The obtained results indicate the suitability of our proposed training scheme for image classification, improving the robustness of the final model. (20 pages, 3 figures, 4 tables)
- Published
- 2020
19. Fine-Grained Recognition of Surface Targets with Limited Data
- Author
-
Xiaotian Qiu, Runze Guo, Peng Wu, Zhen Zuo, Shaojing Su, and Bei Sun
- Subjects
Computer Networks and Communications, Computer science, business.industry, surface targets, Deep learning, lcsh:Electronics, lcsh:TK7800-8360, Pattern recognition, transfer learning, Residual, Expression (mathematics), Discriminative model, Hardware and Architecture, Control and Systems Engineering, Signal Processing, Metric (mathematics), Feature (machine learning), Artificial intelligence, multi-attention residual model, Electrical and Electronic Engineering, fine-grained recognition, business, Transfer of learning
- Abstract
Recognition of surface targets has a vital influence on the development of military and civilian applications such as maritime rescue patrols, illegal-vessel screening, and maritime operation monitoring. However, owing to the interference of visual similarity and environmental variations and the lack of high-quality datasets, accurate recognition of surface targets has always been a challenging task. In this paper, we introduce a multi-attention residual model based on deep learning methods, in which channel and spatial attention modules are applied for feature fusion. In addition, we use transfer learning to improve the feature expression capabilities of the model under conditions of limited data. A function based on metric learning is adopted to increase the distance between different classes. Finally, a dataset with eight types of surface targets is established. Comparative experiments on our self-built dataset show that the proposed method focuses more on discriminative regions, avoids problems like gradient disappearance, and achieves better classification results than B-CNN, RA-CNN, MAMC, MA-CNN, and DFL-CNN.
- Published
- 2020
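The channel and spatial attention modules mentioned in the abstract above can be illustrated with a minimal gating sketch. This is a generic CBAM-style simplification: the parameter-free sigmoid gates here stand in for the learnt MLP/convolution blocks a real attention module would use, and the shapes and random input are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(features):
    """Reweight channels by a gate computed from global average
    pooling (the real module passes this through a learnt MLP)."""
    gate = sigmoid(features.mean(axis=(1, 2)))     # (c,)
    return features * gate[:, None, None]

def spatial_attention(features):
    """Reweight spatial positions by a gate computed from the
    channel-wise mean (the real module uses a learnt convolution)."""
    gate = sigmoid(features.mean(axis=0))          # (h, w)
    return features * gate[None, :, :]

rng = np.random.default_rng(3)
f = rng.normal(size=(16, 7, 7))
fused = spatial_attention(channel_attention(f))    # channel, then spatial
assert fused.shape == f.shape
```

Applying the two gates in sequence is the usual fusion order: channel attention decides *what* features matter, spatial attention decides *where* they matter.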
20. Small-sample learning with salient-region detection and center neighbor loss for insect recognition in real-world complex scenarios.
- Author
-
Yang, Zhankui, Yang, Xinting, Li, Ming, and Li, Wenyong
- Subjects
RARE insects, CLASSIFICATION of insects, OBJECT recognition (Computer vision), INSECTS, DEEP learning, TEXT recognition
- Abstract
• A small sample learning method different from the traditional deep learning method. • Proposing focus-area locations to deal with fine-grained insect recognition. • Utilizing a Center Neighbor Loss function to achieve competitive performance. • We can predict new categories without retraining the model. Most real-world scenarios face the problems of small-sample learning and fine-grained recognition. For many rare insect classes, collecting a large number of training samples is infeasible or even impossible. In contrast, humans are able to recognize a new object class with little supervision. This motivates us to address the problems of small-sample recognition and fine-grained recognition for insects by combining recognition and localization; this can provide an effective remedy for data scarcity and the two techniques can bootstrap from each other. In this paper, we propose a saliency-detection model to localize the key regions that have the largest discriminative features for fine-grained insect classification. The learner learns to predict foreground and background masks for such localization, having been trained on a training set annotated with bounding boxes. Additionally, to further generate discriminative features, a center neighbor loss function is used to construct a robust feature-space distribution. The proposed model is trained end-to-end on our small-sample learning dataset, which comprises 220 insect categories from a real-world complex environment. Compared with the method using prototypical networks, the proposed method achieves a superior performance, with a mean recognition rate (top-5 accuracy) of 57.65%, and can effectively recognize insects under small-sample and complex-scene conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
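One plausible reading of the center neighbor loss mentioned above is a center-style objective: pull each feature toward its class center while pushing the centers of different classes at least a margin apart. This is an interpretation for illustration only, not the paper's exact formulation; the margin, toy features, and centers are assumptions.

```python
import numpy as np

def center_neighbor_loss(features, labels, centers, margin=1.0):
    """Pull term: squared distance of each feature to its class center.
    Push term: hinge on the distance from each center to its nearest
    neighboring center, encouraging a robust feature-space layout."""
    pull = np.mean(np.sum((features - centers[labels]) ** 2, axis=1))
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))     # pairwise center dists
    np.fill_diagonal(dist, np.inf)                 # ignore self-distance
    push = np.mean(np.maximum(0.0, margin - dist.min(axis=1)))
    return pull + push

feats = np.array([[0.1, 0.0], [1.9, 2.0]])
labels = np.array([0, 1])
centers = np.array([[0.0, 0.0], [2.0, 2.0]])
loss = center_neighbor_loss(feats, labels, centers)
# Centers are already margin-separated, so only the pull term remains.
assert abs(loss - 0.01) < 1e-9
```

A compact, well-separated feature-space distribution of this kind is what lets the method add new categories without retraining, since classification can fall back to nearest-center matching.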
21. Discriminative semantic region selection for fine-grained recognition.
- Author
-
Zhang, Chunjie, Wang, Da-Han, and Li, Haisheng
- Subjects
DIGITAL image correlation, IMAGE recognition (Computer vision), SEMANTIC computing, CONVOLUTIONAL neural networks, DEEP learning
- Abstract
• We combine DCNN with image regions for semantic representations, which bridges the semantic gap. • Image regions are combined with semantic distinctiveness and spatial-semantic correlations. • The proposed method can be combined with various pre-learned models to improve recognition accuracy. Performances of fine-grained recognition have been greatly improved thanks to the fast development of deep convolutional neural networks (DCNN). DCNN methods often treat each image region equally. Besides, researchers often rely on visual information alone for classification. To solve these problems, we propose a novel discriminative semantic region selection method for fine-grained recognition (DSRS). We first select a few image regions and then use pre-trained DCNN models to predict their semantic correlations with the corresponding classes. We represent image regions with both visual and semantic representations, which are then linearly combined for a joint representation. The combination parameters are determined by considering both semantic distinctiveness and spatial-semantic correlations. We use the joint representations for classifier training. A test image is classified by obtaining its visual and semantic representations, encoding them into the joint representation, and classifying it. Experiments on several publicly available datasets demonstrate the proposed method's superiority. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
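The linear combination of visual and semantic region representations described above can be sketched as a weighted sum of normalized vectors. The scalar weight `alpha` stands in for the combination parameters the method derives from semantic distinctiveness and spatial-semantic correlation, and the equal dimensionality of the two representations is an assumption made for illustration.

```python
import numpy as np

def joint_representation(visual, semantic, alpha):
    """Linearly combine L2-normalized visual and semantic region
    representations into one joint vector for classifier training.

    alpha: combination weight in [0, 1] (a stand-in for the learnt
    distinctiveness/correlation-based parameters).
    """
    v = visual / (np.linalg.norm(visual) + 1e-12)
    s = semantic / (np.linalg.norm(semantic) + 1e-12)
    return alpha * v + (1.0 - alpha) * s

rng = np.random.default_rng(4)
joint = joint_representation(rng.normal(size=64), rng.normal(size=64), 0.6)
assert joint.shape == (64,)
```

Normalizing both inputs first keeps either modality from dominating the joint vector purely through scale, leaving the balance entirely to the combination weight.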
22. Fine-Grained Recognition of Surface Targets with Limited Data.
- Author
-
Guo, Runze, Sun, Bei, Qiu, Xiaotian, Su, Shaojing, Zuo, Zhen, and Wu, Peng
- Subjects
DEEP learning
- Abstract
Recognition of surface targets has a vital influence on the development of military and civilian applications such as maritime rescue patrols, illegal-vessel screening, and maritime operation monitoring. However, owing to the interference of visual similarity and environmental variations and the lack of high-quality datasets, accurate recognition of surface targets has always been a challenging task. In this paper, we introduce a multi-attention residual model based on deep learning methods, in which channel and spatial attention modules are applied for feature fusion. In addition, we use transfer learning to improve the feature expression capabilities of the model under conditions of limited data. A function based on metric learning is adopted to increase the distance between different classes. Finally, a dataset with eight types of surface targets is established. Comparative experiments on our self-built dataset show that the proposed method focuses more on discriminative regions, avoids problems like gradient disappearance, and achieves better classification results than B-CNN, RA-CNN, MAMC, MA-CNN, and DFL-CNN. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF