174 results for "fine-grained recognition"
Search Results
2. A fine-grained recognition technique for identifying Chinese food images
- Author
-
Feng, Shuo, Wang, Yangang, Gong, Jianhong, Li, Xiang, and Li, Shangxuan
- Published
- 2023
- Full Text
- View/download PDF
3. An effective retrieval model for home textile images based on deep feature extraction
- Author
-
Miao, Ziyi, Yao, Lan, Zeng, Feng, Wang, Yi, and Hong, ZhiGuo
- Abstract
Home textile images have small inter-class differences and large intra-class differences, making home textile image retrieval face great technical challenges. In this paper, we design an effective retrieval model for home textile images, in which ResNet50 is used as the backbone network, and a hybrid maximal pooling spatial attention module is proposed to fuse local spatial information at different scales, thus focusing on key information and suppressing irrelevant information. Moreover, we propose a new loss function called SD-arcface for fine-grained feature recognition, which adopts a dynamic additive angular margin to improve the intra-class compactness and the inter-class separation of home textile images. In addition, we set up a large-scale dataset of home textile images, which contains 89k home textile images from 12k categories, and evaluate the image retrieval performance of the proposed model with two metrics, Recall@k and MAP@k. Finally, the experimental results show that the proposed model achieves better retrieval performance than other models. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
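The SD-arcface loss in result 3 is described only at a high level; for orientation, here is a minimal PyTorch sketch of the standard additive angular margin (ArcFace-style) head that such a loss extends. The dynamic margin schedule of SD-arcface is not specified in the abstract, so a fixed margin m is used; all names and values are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AngularMarginHead(nn.Module):
    """ArcFace-style head: apply cos(theta + m) to the target-class logit only."""
    def __init__(self, feat_dim: int, num_classes: int, s: float = 30.0, m: float = 0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(num_classes, feat_dim))
        nn.init.xavier_uniform_(self.weight)
        self.s, self.m = s, m

    def forward(self, feats: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between L2-normalized features and class weights.
        cosine = F.linear(F.normalize(feats), F.normalize(self.weight))
        theta = torch.acos(cosine.clamp(-1 + 1e-7, 1 - 1e-7))
        # Add the angular margin only on the ground-truth class.
        target = F.one_hot(labels, cosine.size(1)).bool()
        logits = torch.where(target, torch.cos(theta + self.m), cosine)
        return self.s * logits  # feed to cross-entropy

head = AngularMarginHead(feat_dim=512, num_classes=12000)  # 12k categories, as in the abstract
feats, labels = torch.randn(8, 512), torch.randint(0, 12000, (8,))
loss = F.cross_entropy(head(feats, labels), labels)
```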
4. A Fine-Grained Aircraft Target Recognition Algorithm for Remote Sensing Images Based on YOLOv8
- Author
-
Xiao-Nan Jiang, Xiang-Qian Niu, Fan-Lu Wu, Yao Fu, He Bao, Yan-Chao Fan, Yu Zhang, and Jun-Yan Pei
- Subjects
Feature fusion, fine-grained recognition, remote sensing images, YOLOv8, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
Fine-grained recognition plays a pivotal role in the field of remote sensing image analysis, particularly in critical applications such as reconnaissance and early warning, intelligence analysis, and intelligent interpretation. However, the extensive coverage of remote sensing images, the low pixel ratio of targets, and the subtlety of features pose significant challenges for fine-grained recognition of aircraft targets. This article addresses the issues of missed and false detections in existing aircraft target fine-grained recognition algorithms for remote sensing images by proposing an improved algorithm based on YOLOv8, called FD-YOLOv8 (Focus Detail-YOLOv8). Initially, this article designs a local detail feature module to tackle the problem of information loss in shallow networks. This module enhances the capture of semantic information while extracting shallow features, thereby preserving more fine-grained features and improving the network's feature extraction capability. Subsequently, a focus modulation mechanism is employed to enhance the network's interactive understanding of local and global features, thereby improving the recognition accuracy for small and challenging targets. Finally, a multitype feature fusion module is designed, which optimizes the generation of feature maps by integrating local features, high-level semantic information, and low-level texture information, enhancing the accuracy of fine-grained target recognition. Experiments conducted on the public remote sensing image dataset FAIR1M demonstrated that the YOLOv8n algorithm achieved a mean average precision (mAP) of 81.8% for aircraft category recognition tasks. In contrast, FD-YOLOv8 exhibited superior performance, with an mAP of 85.0%, indicating a significant advantage in fine-grained recognition.
- Published
- 2025
- Full Text
- View/download PDF
5. A fine-grained attributes recognition model for clothing based on the improved CSPDarknet and PAFPN network.
- Author
-
Pan, Bo, Xiang, Jun, Zhang, Ning, and Pan, Ruru
- Abstract
An efficient and accurate model for recognizing fine-grained clothing attributes has significant commercial potential and social impact. However, the inherent diversity and complexity of clothing make acquiring datasets with fine-grained attributes a costly endeavor. To address these challenges, we propose a lightweight clothing fine-grained attributes recognition model. First, the Ghost module is introduced into the CSPDarknet network to enhance the depth and expressiveness of feature learning while reducing the parameters and computational complexity. Then, the Conv module is replaced by the GSConv module in the PAFPN network to further reduce the network computational load, and the SE attention mechanism is also added to enhance key feature perception. Finally, the Detect module is employed to effectively recognize fine-grained clothing attributes. To evaluate performance, we constructed a clothing dataset containing 20 fine-grained attributes. Experimental results show that the model achieves precision, recall and mAP of 76.2%, 78.9% and 81.7%. Compared to the original model, the number of parameters is reduced by 26.2%, and the FPS improves by 25.4%. Our proposed model performs well on a small-scale dataset and improves performance in resource-constrained environments, making it highly applicable to clothing recommendation, virtual fitting, and personalization. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
6. RSNC-YOLO: A Deep-Learning-Based Method for Automatic Fine-Grained Tuna Recognition in Complex Environments.
- Author
-
Xu, Wenjie, Fang, Hui, Yu, Shengchi, Yang, Shenglong, Yang, Haodong, Xie, Yujia, and Dai, Yang
- Subjects
FISHERIES, TUNA fishing, TUNA, FISHERY management, IMAGE databases
- Abstract
Tuna accounts for 20% of the output value of global marine capture fisheries, and it plays a crucial role in maintaining ecosystem stability, ensuring global food security, and supporting economic stability. However, improper management has led to significant overfishing, resulting in a sharp decline in tuna populations. For sustainable tuna fishing, it is essential to accurately identify the species of tuna caught and to count their numbers, as these data are the foundation for setting scientific catch quotas. The traditional manual identification method suffers from several limitations and is prone to errors during prolonged operations, especially due to factors like fatigue, high-intensity workloads, or adverse weather conditions, which ultimately compromise its accuracy. Furthermore, the lack of transparency in the manual process may lead to intentional underreporting, which undermines the integrity of fisheries' data. In contrast, an intelligent, real-time identification system can reduce the need for human labor, assist in more accurate identification, and enhance transparency in fisheries' management. This system not only provides reliable data for refined management but also enables fisheries' authorities to dynamically adjust fishing strategies in real time, issue timely warnings when catch limits are approached or exceeded, and prevent overfishing, thus ultimately contributing to sustainable tuna management. In light of this need, this article proposes the RSNC-YOLO algorithm, an intelligent model designed for recognizing tuna in complex scenarios on fishing vessels. Based on YOLOv8s-seg, RSNC-YOLO integrates Reparameterized C3 (RepC3), Selective Channel Down-sampling (SCDown), a Normalization-based Attention Module (NAM), and C2f-DCNv3-DLKA modules. By utilizing a subset of images selected from the Fishnet Open Image Database, the model achieves a 2.7% improvement in mAP@0.5 and a 0.7% improvement in mAP@0.5:0.95. Additionally, the number of parameters is reduced by approximately 30%, and the model's weight size is reduced by 9.6 MB, while maintaining an inference speed comparable to that of YOLOv8s-seg. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Large scale multimodal fashion care recognition.
- Author
-
Su, Mingyue and Wang, Zhongfei
- Abstract
Smart Fashion is reshaping people’s lives, and affects people’s choices and outfits. Existing computer-vision-enabled fashion technology has covered many aspects, such as fashion detection, fashion recognition, fashion segmentation, virtual fitting, fashion recommendation and fashion compatibility, etc. However, there is a gap in the direction of intelligent care. The care process is closely related to the lifetime of clothing, and also plays a very important role in the health and well-being of humans. The care label inside the clothing indicates the recommended care operations, and usually contains multiple care symbols and multilingual textual descriptions. Repeated washing can lead to fading and deformation of labels. Care label recognition is a challenging task in wild scenes. In this paper, we propose a strong multi-modal multi-task baseline (abbreviated as MMFC), which combines visual textual features and visual symbol features into a unified framework. The Modality Mutual Transformation Module (MMTM) is employed to enhance the feature fusion. We refine the alignment of different modality features utilizing the methodology of contrastive learning and feature mapping. The lack of care label datasets has limited the development of intelligent care. Therefore, we introduce a new high-quality large-scale dataset called FashionCare, which has 30,477 images, a total of 157,907 fashion care symbols, six major categories, 66 subcategories and textual descriptions. To our knowledge, this is the first large-scale dataset of care labels. Extensive experiments on FashionCare show the effectiveness of MMFC. In order to demonstrate the few-shot recognition performance of MMFC, we build a sub-dataset called FashionCare-LT by constructing the tail subcategories. Both quantitative and qualitative results show that MMFC possesses exceptional few-shot recognition capabilities. We hope that FashionCare can serve as a new benchmark for large-scale fine-grained multimodal learning, and contribute to the development of multimodal recognition, understanding and analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
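Result 7 aligns modality features "utilizing the methodology of contrastive learning" without giving a formula. A common concrete choice for such cross-modal alignment is a symmetric InfoNCE objective over paired embeddings; the sketch below assumes that choice, and the tensor names and temperature are invented.

```python
import torch
import torch.nn.functional as F

def symmetric_info_nce(sym_emb: torch.Tensor, txt_emb: torch.Tensor,
                       temperature: float = 0.07) -> torch.Tensor:
    """CLIP-style contrastive alignment: matched (symbol, text) pairs attract,
    all other pairs in the batch repel."""
    sym = F.normalize(sym_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = sym @ txt.t() / temperature                 # (B, B) similarity matrix
    targets = torch.arange(sym.size(0), device=sym.device)
    # Symmetrize: symbols-to-text and text-to-symbols directions.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

loss = symmetric_info_nce(torch.randn(16, 256), torch.randn(16, 256))
```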
8. MT-ASM: a multi-task attention strengthening model for fine-grained object recognition.
- Author
-
Liu, Dichao, Wang, Yu, Mase, Kenji, and Kato, Jien
- Abstract
Fine-Grained Object Recognition (FGOR) equips intelligent systems with recognition capabilities at or even beyond the level of human experts, making it a core technology for numerous applications such as biodiversity monitoring systems and advanced driver assistance systems. FGOR is highly challenging, and recent research has primarily focused on identifying discriminative regions to tackle this task. However, these methods often require extensive manual labor or expensive algorithms, which may lead to irreversible information loss and pose significant barriers to their practical application. Instead of learning region capturing, this work enhances networks’ response to discriminative regions. We propose a multitask attention-strengthening model (MT-ASM), inspired by the human ability to effectively utilize experiences from related tasks when solving a specific task. When faced with an FGOR task, humans naturally compare images from the same and different categories to identify discriminative and non-discriminative regions. MT-ASM employs two networks during the training phase: the major network, tasked with the main goal of category classification, and a subordinate task that involves comparing images from the same and different categories to find discriminative and non-discriminative regions. The subordinate network evaluates the major network’s performance on the subordinate task, compelling the major network to improve its subordinate task performance. Once training is complete, the subordinate network is removed, ensuring no additional overhead during inference. Experimental results on CUB-200-2011, Stanford Cars, and FGVC-Aircraft datasets demonstrate that MT-ASM significantly outperforms baseline methods. Given its simplicity and low overhead, it remains highly competitive with state-of-the-art methods. The code is available at . [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Fine-grained recognition of untranscribed bronze inscriptions with multi-category morphologies.
- Author
-
刘可欣, 王慧琴, 王可, 王展, and 王宏
- Subjects
CONVOLUTIONAL neural networks, TEXT recognition, FEATURE extraction, INSCRIPTIONS, BRONZE
- Abstract
Fine-grained recognition of untranscribed bronze inscriptions has relied on traditional convolutional neural networks. However, this approach overlooks the relationship between localization and feature learning, making it difficult to accurately represent the complex structures of the text and resulting in recognition errors. This paper proposes a model, named MP-CNN, that addresses these issues through a pose-aligned multi-part fine-grained recognition approach. In the first stage, it employs a spatial transformer to guide inscriptions to adopt a consistent glyph posture, aiding the model in accurately locating key text regions. The second stage constructs a cascaded efficient channel attention (ECA) mechanism to guide the combination of feature channels, locating multiple independent discriminative regions and refining the extraction of morphological features for complex text structures. Finally, the third stage builds a feature fusion layer to obtain the recognition results. Experimental results demonstrate that the algorithm achieves recognition accuracies of 97.25% and 97.18% on standard and multi-category morphology datasets, respectively. Compared to the traditional convolutional network ResNet34, the method exhibits improvements of 4.63% and 8.89% on these datasets. The results indicate that the algorithm effectively adapts to the actual morphological variations in inscriptions, achieving fine-grained recognition of untranscribed bronze inscriptions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
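The cascaded efficient channel attention (ECA) used by MP-CNN in result 9 builds on the published ECA block, which gates channels with a 1-D convolution over globally pooled responses. A minimal PyTorch rendering of a single ECA block follows; the cascading and the rest of MP-CNN are not reproduced.

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: a per-channel gate computed by a 1-D conv
    over the globally average-pooled channel vector."""
    def __init__(self, k: int = 3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (B, C, H, W)
        w = self.pool(x).squeeze(-1).transpose(1, 2)       # (B, 1, C)
        w = torch.sigmoid(self.conv(w))                    # local cross-channel interaction
        return x * w.transpose(1, 2).unsqueeze(-1)         # rescale each channel

x = torch.randn(2, 64, 28, 28)
print(ECA()(x).shape)  # torch.Size([2, 64, 28, 28])
```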
10. V2MLP: an accurate and simple multi-view MLP network for fine-grained 3D shape recognition.
- Author
-
Zheng, Liang, Bai, Jing, Bai, Shaojin, Li, Wenjing, Peng, Bin, and Zhou, Tao
- Subjects
ANNOTATIONS
- Abstract
Fine-grained 3D shape recognition (FGSR) is crucial for real-world applications. Existing methods face challenges in achieving high accuracy for FGSR due to high similarity within sub-categories and low dissimilarity between them, especially in the absence of part location or attribute annotations. In this paper, we propose V2MLP, a multi-view representation-oriented MLP network dedicated to FGSR, using only class labels as supervision. V2MLP comprises two key modules: the cross-view interaction MLP (CVI-MLP) and the cross-view fusion MLP (CVF-MLP). The CVI-MLP module captures contextual information, including local and global contexts through cross-view interactions, to extract discriminative view features that reinforce subtle differences between sub-categories. Meanwhile, the CVF-MLP module performs cross-view aggregation from the space and view dimensions to obtain the final 3D shape features, minimizing information loss during the view feature fusion process. Extensive experiments on three categories from the FG3D dataset demonstrate the effectiveness of V2MLP in learning discriminative features for 3D shapes, achieving state-of-the-art accuracy for FGSR. Additionally, V2MLP performs competitively for meta-category recognition on the ModelNet40 dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Multimodal fine-grained grocery product recognition using image and OCR text.
- Author
-
Pettersson, Tobias, Riveiro, Maria, and Löfström, Tuwe
- Abstract
Automatic recognition of grocery products can be used to improve customer flow at checkouts and reduce labor costs and store losses. Product recognition is, however, a challenging task for machine learning-based solutions due to the large number of products and their variations in appearance. In this work, we tackle the challenge of fine-grained product recognition by first extracting a large dataset from a grocery store containing products that are only differentiable by subtle details. Then, we propose a multimodal product recognition approach that uses product images with extracted OCR text from packages to improve fine-grained recognition of grocery products. We evaluate several image and text models separately and then combine them using different multimodal models of varying complexities. The results show that image and textual information complement each other in multimodal models and enable a classifier with greater recognition performance than unimodal models, especially when the number of training samples is limited. Therefore, this approach is suitable for many different scenarios in which product recognition is used to further improve recognition performance. The dataset can be found at . [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
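Result 11 evaluates multimodal models "of varying complexities" without fixing one; the simplest baseline in that family, concatenation-based late fusion of an image embedding and an OCR-text embedding, can be sketched as follows. The encoders, dimensions, and class count are placeholders, not the paper's configuration.

```python
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    """Concatenate image and OCR-text embeddings, then classify jointly."""
    def __init__(self, img_dim: int, txt_dim: int, num_products: int):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(img_dim + txt_dim, 512),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(512, num_products),
        )

    def forward(self, img_emb: torch.Tensor, txt_emb: torch.Tensor) -> torch.Tensor:
        # Both embeddings come from frozen or jointly trained unimodal encoders.
        return self.head(torch.cat([img_emb, txt_emb], dim=-1))

model = LateFusionClassifier(img_dim=768, txt_dim=384, num_products=5000)
logits = model(torch.randn(4, 768), torch.randn(4, 384))
```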
12. Fine-Grained Ship Recognition Based on Visible and Near-Infrared Multimodal Remote Sensing Images: Dataset, Methodology and Evaluation.
- Author
-
Shiwen Song, Rui Zhang, Min Hu, and Feiyao Huang
- Subjects
OBJECT recognition algorithms, EVALUATION methodology, FEATURE extraction, IMAGE recognition (Computer vision), REMOTE sensing, SHIPS, SATELLITE-based remote sensing
- Abstract
Fine-grained recognition of ships based on remote sensing images is crucial to safeguarding maritime rights and interests and maintaining national security. Currently, with the emergence of massive high-resolution multi-modality images, the use of multi-modality images for fine-grained recognition has become a promising technology. Fine-grained recognition of multi-modality images imposes higher requirements on the dataset samples. The key to the problem is how to extract and fuse the complementary features of multi-modality images to obtain more discriminative fusion features. The attention mechanism helps the model to pinpoint the key information in the image, resulting in a significant improvement in the model's performance. In this paper, a dataset for fine-grained recognition of ships based on visible and near-infrared multi-modality remote sensing images is proposed first, named Dataset for Multimodal Fine-grained Recognition of Ships (DMFGRS). It includes 1,635 pairs of visible and near-infrared remote sensing images divided into 20 categories, collated from digital orthophoto models provided by commercial remote sensing satellites. DMFGRS provides two types of annotation format files, as well as segmentation mask images corresponding to the ship targets. Then, a Multimodal Information Cross-Enhancement Network (MICE-Net), fusing features of visible and near-infrared remote sensing images, is proposed. In the network, a dual-branch feature extraction and fusion module has been designed to obtain more expressive features. The Feature Cross Enhancement Module (FCEM) achieves the fusion enhancement of the two modal features by making the channel attention and spatial attention work cross-functionally on the feature map. A benchmark is established by evaluating state-of-the-art object recognition algorithms on DMFGRS. MICE-Net conducted experiments on DMFGRS, and the precision, recall, mAP0.5 and mAP0.5:0.95 reached 87%, 77.1%, 83.8% and 63.9%, respectively. Extensive experiments demonstrate that the proposed MICE-Net has more excellent performance on DMFGRS. Built on the lightweight YOLO network, the model has excellent generalizability, and thus has good potential for application in real-life scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. Fine-grained recognition algorithm of crop pests based on cross-layer bilinear aggregation and multi-task learning
- Author
-
Juquan Ruan, Shuo Liu, Wanjing Mao, Shan Zeng, Zhuoyi Zhang, and Guangsun Yin
- Subjects
Agricultural engineering, lightweight network, cross-layer bilinear aggregation, multi-task learning, fine-grained recognition, Agriculture, Agriculture (General), S1-972
- Abstract
Fine-grained recognition of crop pests is a crucial concern in the field of agriculture, as recognition accuracy and generalization ability directly affect the yield and quality of crops. Aiming at the characteristics of crop pests, namely a wide variety of species with small inter-class and large intra-class differences in external morphology, as well as the problems of uneven sample distribution and noisy labels in fine-grained image datasets under complex environments, we propose a fine-grained recognition model for crop pests (MT-MACLBPHSNet) based on cross-layer bilinear aggregation and multi-task learning, which consists of three key modules: the backbone network module, the cross-layer bilinear aggregation module, and the multi-task learning module. A new union loss function is designed in the primary task of the multi-task learning module, which is used to alleviate the two problems that arise when training the model on fine-grained image datasets. The experimental results show that the model effectively balances model complexity and recognition accuracy in a comparative analysis with several existing excellent network models on the IP102-CP13 dataset, with the recognition accuracy reaching 75.37%, which is 7.06% higher than the Baseline model, and the F1-score reaching 67.06%. Additionally, the generalization of the model is also verified on the IP102-VP16 dataset, where the model outperforms most models in terms of recognition accuracy and generalization ability, which can provide an effective reference for fine-grained recognition of crop pests.
- Published
- 2024
- Full Text
- View/download PDF
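The cross-layer bilinear aggregation named in result 13 is not detailed in the abstract; generically, bilinear aggregation pools the per-location outer product of feature maps from two layers into a single descriptor, usually followed by signed square-root and L2 normalization. A generic sketch under that reading, with invented shapes:

```python
import torch
import torch.nn.functional as F

def cross_layer_bilinear(fa: torch.Tensor, fb: torch.Tensor) -> torch.Tensor:
    """Pool the per-location outer product of two feature maps into one vector.

    fa: (B, C1, H, W) from a shallower layer; fb: (B, C2, H, W) from a deeper
    layer, spatially resized to match. Returns (B, C1*C2) descriptors.
    """
    b, c1, h, w = fa.shape
    c2 = fb.shape[1]
    bilinear = torch.einsum("bchw,bdhw->bcd", fa, fb) / (h * w)   # (B, C1, C2)
    z = bilinear.reshape(b, c1 * c2)
    z = torch.sign(z) * torch.sqrt(z.abs() + 1e-10)               # signed square root
    return F.normalize(z, dim=-1)                                  # L2 normalization

desc = cross_layer_bilinear(torch.randn(2, 64, 14, 14), torch.randn(2, 32, 14, 14))
print(desc.shape)  # torch.Size([2, 2048])
```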
14. Robust fine‐grained visual recognition with images based on internet of things.
- Author
-
Cai, Zhenhuang, Yan, Shuai, and Huang, Dan
- Subjects
INTERNET of things, IMAGE recognition (Computer vision), ARTIFICIAL neural networks, SOURCE code
- Abstract
Labeling fine‐grained objects manually is extremely challenging, as it is not only label‐intensive but also requires professional knowledge. Accordingly, robust learning methods for fine‐grained recognition with web images collected from Internet of Things have drawn significant attention. However, training deep fine‐grained models directly using untrusted web images is confronted by two primary obstacles: (1) label noise in web images and (2) domain variance between the online sources and test datasets. To this end, in this study, we mainly focus on addressing these two pivotal problems associated with untrusted web images. To be specific, we introduce an end‐to‐end network that collaboratively addresses these concerns in the process of separating trusted data from untrusted web images. To validate the efficacy of our proposed model, untrusted web images are first collected by utilizing the text category labels found within fine‐grained datasets. Subsequently, we employ the designed deep model to eliminate label noise and ameliorate domain mismatch. And the chosen trusted web data are utilized for model training. Comprehensive experiments and ablation studies validate that our method consistently surpasses other state‐of‐the‐art approaches for fine‐grained recognition tasks in real‐world scenarios, demonstrating a significant improvement margin (2.51% on CUB200‐2011 and 2.92% on Stanford Dogs). The source code and models can be accessed at: https://github.com/Codeczh/FGVC‐IoT. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. A dual-attention random-selection global-context network for fine-grained recognition.
- Author
-
徐胜军, 荆扬, 段中兴, 李明海, 李海涛, and 刘福友
- Published
- 2024
- Full Text
- View/download PDF
16. Spatial-aware collaborative region mining for fine-grained recognition.
- Author
-
Yang, Weiwei and Yin, Jian
- Abstract
Fine-grained recognition aims to classify images into hundreds of subcategorical labels under a generic category. The main challenge lies in the similar appearance between sub-categories, which pushes a model to explore the discriminative regions automatically. Most existing approaches either only mine the informative regions without considering the interclass relationship or focus on pairwise images but neglect the multiple-class relationship, which leads to incomplete information and the tendency to focus on a single region. Since the interclass correlations and the discriminative regions both play an important role in distinguishing one fine-grained category from others, we propose a new Spatial-aware Collaborative Region Mining (SCRIM) scheme by fully exploiting the relationships between inter- and intraclass regions. The proposed SCRIM scheme consists of two modules that collaboratively mine the spatially aware feature: the Coarse Parts Localization (CPL) module, which exploits the hierarchical inter- and intraclass correlations, and the Fine Parts Localization (FPL) module, which mines the multi-scale fine discriminative parts. Specifically, dual CPLs create two groups of contrastive part features separately by extracting contrastive features for each image. These features from the same class and module should have smaller distances. Given the extracted features, dual FPLs further mine and update the fine region features by ranking their informativeness scores with ground truth subcategorical labels. Through the collaboration between the CPL and FPL, our SCRIM scheme can take the hierarchical correlations between multiple samples into account and mine the multi-scale discriminative parts for final fine-grained classification. Extensive experiments on three popular benchmarks show that our proposed SCRIM outperforms the state-of-the-art methods by a large margin. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. Foodnet: multi-scale and label dependency learning-based multi-task network for food and ingredient recognition.
- Author
-
Shuang, Feng, Lu, Zhouxian, Li, Yong, Han, Chao, Gu, Xia, and Wei, Shidi
- Subjects
CONVOLUTIONAL neural networks, ALGORITHMS, LEARNING modules, SOURCE code
- Abstract
Image-based food pattern classification poses challenges of non-fixed spatial distribution and ingredient occlusion for mainstream computer vision algorithms. However, most current approaches classify food and ingredients by directly extracting abstract features of the entire image through a convolutional neural network (CNN), ignoring the relationship between food and ingredients and the ingredient occlusion problem. To address these issues, we propose FoodNet for both food and ingredient recognition, which uses a multi-task structure with a multi-scale relationship learning module (MSRL) and a label dependency learning module (LDL). As ingredients normally co-occur in an image, we present the LDL, which uses ingredient dependencies to alleviate the ingredient occlusion problem. MSRL aggregates multi-scale information of food and ingredients, then uses two relational matrices to model the food-ingredient matching relationship to obtain richer feature representations. The experimental results show that FoodNet achieves good performance on the Vireo Food-172 and UEC Food-100 datasets. It is worth noting that it reaches the state-of-the-art level in terms of ingredient recognition on Vireo Food-172 and UEC Food-100. The source code will be made available at https://github.com/visipaper/FoodNet. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. RF-AIRCGR: Lightweight Convolutional Neural Network-Based RFID Chinese Character Gesture Recognition Research
- Author
-
Yajun Zhang, Congcong Wang, Feng Li, Weiqian Yu, Yuankang Wang, and Jingying Chen
- Subjects
Gesture recognition, RFID, Markov transition field, fine-grained recognition, neural network, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Gesture recognition serves as a foundation for Human-Computer Interaction (HCI). Although Radio Frequency Identification (RFID) is gaining popularity due to its advantages (non-invasive, low-cost, and lightweight), most existing research has only addressed the recognition of simple sign language gestures or body movements. There is still a significant gap in the recognition of fine-grained gestures. In this paper, we propose RF-AIRCGR as a fine-grained hand gesture recognition system for Chinese characters. It enables information input and querying through gestures in contactless scenarios, which is of great significance for both medical and educational applications. This system has three main advantages: First, by designing a tag matrix and dual-antenna layout, it fully captures fine-grained gesture data for handwritten Chinese characters. Second, it uses a variance-based sliding window method to segment continuous gesture actions. Lastly, the phase signals of Chinese characters are innovatively transformed into feature images using the Markov Transition Field. After a series of preprocessing steps, the improved C-AlexNet model is employed for deep training and experimentation. Experimental results show that RF-AIRCGR achieves average recognition accuracies of 97.85% for new users and 97.15% for new scenarios. The accuracy and robustness of the system in recognizing Chinese character gestures have been validated.
- Published
- 2024
- Full Text
- View/download PDF
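The variance-based sliding window that RF-AIRCGR (result 18) uses to segment continuous gestures is a standard signal-processing step: slide a window over the phase stream and mark spans whose variance exceeds a threshold as gesture activity. A minimal NumPy sketch; the window size and threshold are made up, not the paper's values.

```python
import numpy as np

def segment_by_variance(signal: np.ndarray, win: int = 50, thresh: float = 0.02):
    """Return (start, end) index pairs where windowed variance indicates activity."""
    active = np.array([signal[i:i + win].var() > thresh
                       for i in range(len(signal) - win + 1)])
    segments, start = [], None
    for i, flag in enumerate(active):
        if flag and start is None:
            start = i                               # activity begins
        elif not flag and start is not None:
            segments.append((start, i + win - 1))   # activity ends
            start = None
    if start is not None:
        segments.append((start, len(signal) - 1))
    return segments

t = np.linspace(0, 10, 1000)
phase = 0.01 * np.random.randn(1000)
phase[300:450] += np.sin(8 * t[300:450])            # injected "gesture" span
print(segment_by_variance(phase))
```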
19. RSNC-YOLO: A Deep-Learning-Based Method for Automatic Fine-Grained Tuna Recognition in Complex Environments
- Author
-
Wenjie Xu, Hui Fang, Shengchi Yu, Shenglong Yang, Haodong Yang, Yujia Xie, and Yang Dai
- Subjects
tuna, fine-grained recognition, fishing vessel, complex scenarios, improved YOLOv8, segmentation, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract
Tuna accounts for 20% of the output value of global marine capture fisheries, and it plays a crucial role in maintaining ecosystem stability, ensuring global food security, and supporting economic stability. However, improper management has led to significant overfishing, resulting in a sharp decline in tuna populations. For sustainable tuna fishing, it is essential to accurately identify the species of tuna caught and to count their numbers, as these data are the foundation for setting scientific catch quotas. The traditional manual identification method suffers from several limitations and is prone to errors during prolonged operations, especially due to factors like fatigue, high-intensity workloads, or adverse weather conditions, which ultimately compromise its accuracy. Furthermore, the lack of transparency in the manual process may lead to intentional underreporting, which undermines the integrity of fisheries’ data. In contrast, an intelligent, real-time identification system can reduce the need for human labor, assist in more accurate identification, and enhance transparency in fisheries’ management. This system not only provides reliable data for refined management but also enables fisheries’ authorities to dynamically adjust fishing strategies in real time, issue timely warnings when catch limits are approached or exceeded, and prevent overfishing, thus ultimately contributing to sustainable tuna management. In light of this need, this article proposes the RSNC-YOLO algorithm, an intelligent model designed for recognizing tuna in complex scenarios on fishing vessels. Based on YOLOv8s-seg, RSNC-YOLO integrates Reparameterized C3 (RepC3), Selective Channel Down-sampling (SCDown), a Normalization-based Attention Module (NAM), and C2f-DCNv3-DLKA modules. By utilizing a subset of images selected from the Fishnet Open Image Database, the model achieves a 2.7% improvement in mAP@0.5 and a 0.7% improvement in mAP@0.5:0.95. Additionally, the number of parameters is reduced by approximately 30%, and the model’s weight size is reduced by 9.6 MB, while maintaining an inference speed comparable to that of YOLOv8s-seg.
- Published
- 2024
- Full Text
- View/download PDF
20. An intelligent decision-making method for fine-grained product form based on feature transfer.
- Author
-
李雄, 苏建宁, 张志鹏, and 李晓晓
- Published
- 2024
- Full Text
- View/download PDF
21. Fine-grained image recognition of hawthorn from different growing regions based on a multi-scale feature deep neural network.
- Author
-
谭超群, 秦中翰, 黄欣然, 陈虎, 黄永亮, 吴纯洁, and 游志胜
- Published
- 2024
- Full Text
- View/download PDF
22. MAR20: A dataset for military aircraft target recognition in remote sensing images.
- Author
-
禹文奇, 程塨, 王美君, 姚艳清, 谢星星, 姚西文, and 韩军伟
- Subjects
REMOTE sensing, MILITARY airplanes
- Published
- 2023
- Full Text
- View/download PDF
23. Saliency-Driven ‘Evidence CNN’ for Fine-Grained Recognition of Twins
- Author
-
de Loyola Furtado e Sardinha, Razia, Borah, Samarjeet, editor, Gandhi, Tapan K., editor, and Piuri, Vincenzo, editor
- Published
- 2023
- Full Text
- View/download PDF
24. PARTICUL: Part Identification with Confidence Measure Using Unsupervised Learning
- Author
-
Xu-Darme, Romain, Quénot, Georges, Chihani, Zakaria, Rousset, Marie-Christine, Rousseau, Jean-Jacques, editor, and Kapralos, Bill, editor
- Published
- 2023
- Full Text
- View/download PDF
25. Towards Better Guided Attention and Human Knowledge Insertion in Deep Convolutional Neural Networks
- Author
-
Gupta, Ankit, Sintorn, Ida-Maria, Karlinsky, Leonid, editor, Michaeli, Tomer, editor, and Nishino, Ko, editor
- Published
- 2023
- Full Text
- View/download PDF
26. Fine-grained vehicle model recognition based on a hybrid class-balanced loss.
- Author
-
李熙莹, 全峰玮, and 叶芝桧
- Published
- 2023
- Full Text
- View/download PDF
27. Aircraft Detection and Fine-Grained Recognition Based on High-Resolution Remote Sensing Images.
- Author
-
Guan, Qinghe, Liu, Ying, Chen, Lei, Zhao, Shuang, and Li, Guandian
- Subjects
OBJECT recognition (Computer vision), COLOR space, MICRO air vehicles, IMAGE recognition (Computer vision), KALMAN filtering, DATA augmentation, REMOTE sensing
- Abstract
In order to realize the detection and recognition of specific types of aircraft in remote sensing images, this paper proposes an algorithm called Fine-grained S2ANet (FS2ANet) based on the improved Single-shot Alignment Network (S2ANet) for remote sensing aircraft object detection and fine-grained recognition. Firstly, to address the imbalanced number of instances of various aircraft in the dataset, we perform data augmentation on some remote sensing images using flip and color space transformation methods. Secondly, this paper selects ResNet101 as the backbone, combines space-to-depth (SPD) to improve the FPN structure, constructs the FPN-SPD module, and builds the aircraft fine feature focusing module (AF3M) in the detection head of the network, which reduces the loss of fine-grained information in the process of feature extraction, enhances the extraction capability of the network for fine aircraft features, and improves the detection accuracy of remote sensing micro aircraft objects. Finally, we use the SkewIoU based on Kalman filtering (KFIoU) as the algorithm's regression loss function, improving the algorithm's convergence speed and the object boxes' regression accuracy. The experimental results of the detection and fine-grained recognition of 11 types of remote sensing aircraft objects such as Boeing 737, A321, and C919 using the FS2ANet algorithm show that the mAP0.5 of FS2ANet is 46.82%, which is 3.87% higher than S2ANet, and it can apply to the field of remote sensing aircraft object detection and fine-grained recognition. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
28. An Integrated Transformer with Collaborative Tokens Mining for Fine-Grained Recognition.
- Author
-
Yang, Weiwei and Yin, Jian
- Subjects
REINFORCEMENT learning, VERNACULAR architecture, INTRACLASS correlation, SPINE, WARBLERS, TEMPORAL lobe
- Abstract
Fine-grained recognition mainly classifies subclass images into hundreds of subcategorical labels by locating the discriminative regions (e.g., for the Cape May warbler vs. the Magnolia warbler). Due to the high complexity and non-differentiability of region location through traditional backbone architectures, most existing approaches utilize multi-level reinforcement learning to distinguish the similar appearance among sub-categories. These methods explore incomplete information through only the intra-class informative regions in one image or the inter-class and intra-class relationship in pairwise images, leading to a tendency for overlapped region locations. Since the inter-class correlations and a new backbone with complete contextual semantic information play important roles in distinguishing fine-grained classes, we propose a novel transformer with a collaborative token mining (TCTM) scheme by fully exploiting the relationships between inter-class and intra-class regions. The proposed TCTM scheme with a new transformer backbone consists of two modules that collaboratively explore the spatially aware tokens: the Pyramid Tokens Multiplication (PTM) module, which exploits the integrated multi-stage inter-class and intra-class correlations from the new transformer architecture, and the Tokens Proposals Generation (TPG) module, which captures two groups of top-four discriminative tokens. The two PTMs extract contrastive tokens for each image and learn to rank these tokens, assuming that those from the same class and the same module should have smaller distances. The TPGs further sort and update the candidate tokens from the extracted attention tokens by ranking their probabilities with ground truth subcategorical labels. Through the collaboration between the PTM and TPG, our TCTM scheme can take the integrated correlations into account and mine the discriminative tokens for final fine-grained classification. Extensive experiments on four popular benchmarks show that our proposed TCTM outperforms the state-of-the-art methods by a large margin. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
29. Zero-Shot Attribute Attacks on Fine-Grained Recognition Models
- Author
-
Shafiee, Nasim, Elhamifar, Ehsan, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
- Published
- 2022
- Full Text
- View/download PDF
30. Marine Ship Identification Algorithm Based on Object Detection and Fine-Grained Recognition
- Author
-
Du, Xingyue, Wang, Jianjun, Li, Yiqing, Tang, Bingling, Nakamatsu, Kazumi, editor, Kountchev, Roumen, editor, Patnaik, Srikanta, editor, Abe, Jair M., editor, and Tyugashev, Andrey, editor
- Published
- 2022
- Full Text
- View/download PDF
31. Fine-Grained Activity Recognition Based on Features of Action Subsegments and Incremental Broad Learning
- Author
-
Chen, Shi, Wu, Sheng, Zhu, Licai, Yang, Hao, Lai, Yongxuan, editor, Wang, Tian, editor, Jiang, Min, editor, Xu, Guangquan, editor, Liang, Wei, editor, and Castiglione, Aniello, editor
- Published
- 2022
- Full Text
- View/download PDF
32. Multi-Scale Feature Fusion of Covariance Pooling Networks for Fine-Grained Visual Recognition.
- Author
-
Qian, Lulu, Yu, Tan, and Yang, Jianyu
- Subjects
IMAGE recognition (Computer vision), CLASSIFICATION algorithms, RECOGNITION (Psychology), NAIVE Bayes classification, COMPUTER vision
- Abstract
Multi-scale feature fusion techniques and covariance pooling have been shown to have positive implications for completing computer vision tasks, including fine-grained image classification. However, existing algorithms that use multi-scale feature fusion techniques for fine-grained classification tend to consider only the first-order information of the features, failing to capture more discriminative features. Likewise, existing fine-grained classification algorithms using covariance pooling tend to focus only on the correlation between feature channels without considering how to better capture the global and local features of the image. Therefore, this paper proposes a multi-scale covariance pooling network (MSCPN) that can capture and better fuse features at different scales to generate more representative features. Experimental results on the CUB200 and MIT indoor67 datasets achieve state-of-the-art performance (CUB200: 94.31% and MIT indoor67: 92.11%). [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
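Covariance pooling, which MSCPN (result 32) fuses across scales, replaces first-order global pooling with second-order statistics of the feature map. A basic, un-fused version is sketched below; the paper's multi-scale fusion and any matrix normalization variant are not reproduced.

```python
import torch

def covariance_pool(x: torch.Tensor) -> torch.Tensor:
    """Second-order pooling: per-image channel covariance of a CNN feature map.

    x: (B, C, H, W) -> (B, C*(C+1)//2), the vectorized upper triangle.
    """
    b, c, h, w = x.shape
    feats = x.reshape(b, c, h * w)
    feats = feats - feats.mean(dim=2, keepdim=True)        # center each channel
    cov = feats @ feats.transpose(1, 2) / (h * w - 1)      # (B, C, C)
    iu = torch.triu_indices(c, c)
    z = cov[:, iu[0], iu[1]]                               # keep unique entries
    return torch.sign(z) * torch.sqrt(z.abs() + 1e-10)     # element-wise power norm

print(covariance_pool(torch.randn(2, 32, 7, 7)).shape)  # torch.Size([2, 528])
```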
33. Fine-Grained Object Recognition Using a Combination Model of Navigator–Teacher–Scrutinizer and Spinal Networks.
- Author
-
Nurhasanah, Yulianto, and Kusuma, Gede Putra
- Abstract
Fine-grained object recognition aims to recognize objects with large intra-class variation and low variation between classes. A simple model may find it hard to locate the more discriminative parts for this problem. Thus, we propose a combination model of navigator–teacher–scrutinizer and spinal networks to improve accuracy. Employing two feature extractors, residual networks with 50 and 101 layers, and replacing the basic fully connected layer with a spinal network outperforms the baseline results on the Stanford Cars, Fine-Grained Visual Classification of Aircraft, and 275 Bird Species datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
34. Fine-grained recognition via submodular optimization regulated progressive training.
- Author
-
Kang, Bin, Du, Songlin, Liang, Dong, Wu, Fan, and Li, Xin
- Subjects
DESIGN
- Abstract
Progressive training has unfolded its superiority on a wide range of downstream tasks. However, it may fail in fine-grained recognition (FGR) due to special challenges with high intra-class and low inter-class variances. In this paper, we propose an active self-pace learning method to exploit the full potential of the progressive training strategy in FGR. The key innovation of our design is to integrate submodular optimization and self-pace learning into a maximum–minimum optimization framework. The submodular optimization is regarded as a dynamic regularization to select active sample groups in each training round, restricting the search space of self-pace optimization. This overcomes the limitation of traditional self-pace learning, which is easily trapped in local minima when facing challenging samples. Extensive experiments on three public FGR datasets show that the proposed method achieves at least a 1.5% performance gain across various network backbones, including swin-transformer.
• We are the first to exploit submodularity for active sample selection. By our problem formulation, the optimal category subsets can be progressively selected for obtaining steady cumulative gain.
• We combine submodular optimization with self-paced learning to generate a collaborated maximum–minimum optimization framework. The constructed framework can achieve smooth and stable progressive learning by using active samples to restrict the search space of self-paced optimization.
• The proposed collaborated optimization framework can be deployed on various types of FGR networks. Extensive experiments on three fine-grained recognition datasets verify the superiority of progressive training, where the averaged recognition gain surpasses 1.5%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
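Result 34 does not state which submodular objective selects the active sample groups. A common concrete stand-in for diverse subset selection is the facility-location function maximized greedily, sketched below; the objective choice and the nonnegative-similarity construction are assumptions, not the paper's formulation.

```python
import numpy as np

def greedy_facility_location(sim: np.ndarray, k: int) -> list:
    """Greedy maximization of F(S) = sum_i max_{j in S} sim[i, j].

    Facility location is monotone submodular for nonnegative sim, so greedy
    selection enjoys the usual (1 - 1/e) approximation guarantee.
    """
    n = sim.shape[0]
    selected = []
    best_cover = np.zeros(n)                      # current max similarity per item
    for _ in range(k):
        # Marginal gain of adding each candidate column j.
        gains = np.maximum(sim, best_cover[:, None]).sum(axis=0) - best_cover.sum()
        gains[selected] = -np.inf                 # forbid re-selection
        j = int(np.argmax(gains))
        selected.append(j)
        best_cover = np.maximum(best_cover, sim[:, j])
    return selected

emb = np.random.randn(100, 16)
sim = np.clip(emb @ emb.T, 0.0, None)             # nonnegative similarities
print(greedy_facility_location(sim, k=5))
```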
35. EFCMF: A Multimodal Robustness Enhancement Framework for Fine-Grained Recognition.
- Author
-
Zou, Rongping, Zhu, Bin, Chen, Yi, Xie, Bo, and Shao, Bin
- Subjects
RECOGNITION (Psychology), SOURCE code, PROBLEM solving
- Abstract
Fine-grained recognition has applications in many fields and aims to identify targets from subcategories. This is a highly challenging task due to the minor differences between subcategories. Both missing modalities and adversarial sample attacks are easily encountered in fine-grained recognition tasks based on multimodal data, and both can easily cause the model to fail. An Enhanced Framework for the Complementarity of Multimodal Features (EFCMF) is proposed in this study to solve this problem. The model's learning of multimodal data complementarity is enhanced by randomly deactivating modal features in the constructed multimodal fine-grained recognition model. The results show that the model gains the ability to handle missing modalities without additional training and achieves 91.14% and 99.31% accuracy on the Birds and Flowers datasets. The average accuracy of EFCMF on the two datasets is 52.85% when facing four adversarial example attacks, namely FGSM, BIM, PGD and C&W, which is 27.13% higher than that of Bi-modal PMA. When facing missing modalities, the average accuracy of EFCMF on the two datasets is 76.33%, which is 32.63% higher than that of Bi-modal PMA. Compared with existing methods, EFCMF is robust in the face of missing modalities and adversarial example attacks in multimodal fine-grained recognition tasks. The source code is available at https://github.com/RPZ97/EFCMF (accessed on 8 January 2023). [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
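EFCMF's central mechanism in result 35, randomly deactivating modal features during training, is essentially modality dropout. A minimal sketch of that mechanism on two pre-extracted feature vectors; the surrounding backbones and training loop are omitted, and the drop probability is invented.

```python
import torch
import torch.nn as nn

class ModalityDropoutFusion(nn.Module):
    """Randomly zero one modality's features during training so the classifier
    learns to work from either modality alone (cf. EFCMF's deactivation)."""
    def __init__(self, dim: int, num_classes: int, p_drop: float = 0.3):
        super().__init__()
        self.p_drop = p_drop
        self.head = nn.Linear(2 * dim, num_classes)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        if self.training and torch.rand(()) < self.p_drop:
            # Drop exactly one randomly chosen modality for the whole batch.
            if torch.rand(()) < 0.5:
                feat_a = torch.zeros_like(feat_a)
            else:
                feat_b = torch.zeros_like(feat_b)
        return self.head(torch.cat([feat_a, feat_b], dim=-1))

fusion = ModalityDropoutFusion(dim=256, num_classes=200)
logits = fusion(torch.randn(4, 256), torch.randn(4, 256))
```

At test time the module passes both modalities through unchanged, which is what lets the same weights handle a genuinely missing modality by substituting zeros.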
36. Deep Kernelized Network for Fine-Grained Recognition
- Author
-
Mahmoudi, M. Amine, Chetouani, Aladine, Boufera, Fatma, Tabia, Hedi, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Mantoro, Teddy, editor, Lee, Minho, editor, Ayu, Media Anugerah, editor, Wong, Kok Wai, editor, and Hidayanto, Achmad Nizar, editor
- Published
- 2021
- Full Text
- View/download PDF
37. Disentangled Feature Network for Fine-Grained Recognition
- Author
-
Miao, Shuyu, Li, Shuaicheng, Zheng, Lin, Yu, Wei, Liu, Jingjing, Gong, Mingming, Feng, Rui, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Mantoro, Teddy, editor, Lee, Minho, editor, Ayu, Media Anugerah, editor, Wong, Kok Wai, editor, and Hidayanto, Achmad Nizar, editor
- Published
- 2021
- Full Text
- View/download PDF
38. Towards Robust Fine-Grained Recognition by Maximal Separation of Discriminative Features
- Author
-
Nakka, Krishna Kanth, Salzmann, Mathieu, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Ishikawa, Hiroshi, editor, Liu, Cheng-Lin, editor, Pajdla, Tomas, editor, and Shi, Jianbo, editor
- Published
- 2021
- Full Text
- View/download PDF
39. Few-Shot Fine-Grained Forest Fire Smoke Recognition Based on Metric Learning.
- Author
-
Sun, Bingjian, Cheng, Pengle, and Huang, Ying
- Subjects
FOREST fires, FOREST monitoring, FOREST fire prevention & control, SMOKE, FIRE detectors, FALSE alarms, TUNNEL ventilation
- Abstract
To date, most existing forest fire smoke detection methods rely on coarse-grained identification, which only distinguishes between smoke and non-smoke. Thus, non-fire smoke and fire smoke are treated the same in these methods, resulting in false alarms within the smoke classes. The fine-grained identification of smoke which can identify differences between non-fire and fire smoke is of great significance for accurate forest fire monitoring; however, it requires a large database. In this paper, for the first time, we combine fine-grained smoke recognition with the few-shot technique using metric learning to identify fire smoke with the limited available database. The experimental comparison and analysis show that the new method developed has good performance in the structure of the feature extraction network and the training method, with an accuracy of 93.75% for fire smoke identification. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
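Metric-learning-based few-shot recognition of the kind result 39 describes typically classifies a query by its distance to class prototypes averaged from the few labeled shots, as in prototypical networks. The paper's exact metric and backbone are not given in the abstract; the sketch below assumes Euclidean distance on precomputed embeddings.

```python
import torch

def prototype_classify(support: torch.Tensor, support_labels: torch.Tensor,
                       query: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Assign each query embedding to the nearest class prototype (mean of shots)."""
    protos = torch.stack([support[support_labels == c].mean(dim=0)
                          for c in range(num_classes)])      # (C, D)
    dists = torch.cdist(query, protos)                        # (Q, C) distances
    return dists.argmin(dim=1)                                # nearest prototype

support = torch.randn(10, 64)                    # 5 classes x 2 shots
labels = torch.arange(5).repeat_interleave(2)    # [0,0,1,1,2,2,3,3,4,4]
preds = prototype_classify(support, labels, torch.randn(8, 64), num_classes=5)
```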
40. CEKD:Cross ensemble knowledge distillation for augmented fine-grained data.
- Author
-
Zhang, Ke, Fan, Jin, Huang, Shaoli, Qiao, Yongliang, Yu, Xiaofeng, and Qin, Feiwei
- Subjects
MACHINE learning, DATA augmentation, NETWORK performance
- Abstract
Data augmentation has been proven effective in training deep models. Existing data augmentation methods tackle the fine-grained problem by blending image pairs and fusing corresponding labels according to the statistics of mixed pixels, which produces additional noise harmful to the performance of networks. Motivated by this, we present a simple yet effective cross ensemble knowledge distillation (CEKD) model for fine-grained feature learning. We innovatively propose a cross distillation module to provide additional supervision to alleviate the noise problem, and propose a collaborative ensemble module to overcome the target conflict problem. The proposed model can be trained in an end-to-end manner, and only requires image-level label supervision. Extensive experiments on widely used fine-grained benchmarks demonstrate the effectiveness of our proposed model. Specifically, with the backbone of ResNet-101, CEKD obtains accuracies of 89.59%, 95.96% and 94.56% on three datasets respectively, outperforming the state-of-the-art API-Net by 0.99%, 1.06% and 1.16%. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
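CEKD's cross distillation module (result 40) is described only at a block level; the knowledge-distillation loss that such modules build on is standard, pairing hard-label cross-entropy with a temperature-softened KL term toward the teacher (here, a peer network). A minimal sketch with illustrative hyperparameters:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor,
                      labels: torch.Tensor, T: float = 4.0, alpha: float = 0.5) -> torch.Tensor:
    """Cross-entropy on hard labels plus temperature-scaled KL to the teacher."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)   # rescale gradient magnitude
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

loss = distillation_loss(torch.randn(8, 200), torch.randn(8, 200),
                         torch.randint(0, 200, (8,)))
```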
41. Fine-Grained Butterfly Recognition via Peer Learning Network with Distribution-Aware Penalty Mechanism.
- Author
-
Xu, Chudong, Cai, Runji, Xie, Yuhao, Cai, Huiyi, Wang, Min, Gao, Yuefang, and Ma, Xiaoming
- Subjects
INTERACTIVE learning, BUTTERFLIES, PRODUCTION management (Manufacturing), SPECIES distribution
- Abstract
Simple Summary: Automatic species recognition, such as of butterflies or other insects, plays a crucial role in intelligent agricultural production management and the study of species diversity. However, the quite diverse and subtle interspecific differences and the long-tailed distribution of sample data in fine-grained species recognition make it difficult to learn robust feature representations and to alleviate the bias and variance problems of the long-tailed classifier in insect recognition. The objective of this study is to develop a peer learning network with a distribution-aware penalty mechanism to learn discriminative feature representations and mitigate the bias and variance problems of the long-tailed distribution. The results of various contrast experiments on the collected Butterfly-914 dataset show that the proposed PLN-DPM has a higher Rank-1 accuracy rate (86.2% on the butterfly dataset and 73.51% on the IP102 dataset). Additionally, we deployed the PLN-DPM model in a smartphone app for butterfly recognition in real-life environments. Automatic species recognition plays a key role in intelligent agricultural production management and the study of species diversity. However, fine-grained species recognition is a challenging task due to the quite diverse and subtle interclass differences among species and the long-tailed distribution of sample data. In this work, a peer learning network with a distribution-aware penalty mechanism is proposed to address these challenges. Specifically, the proposed method employs the two-stream ResNeSt-50 as the backbone to obtain the initial predicted results. Then, the samples, which are selected from the instances with the same predicted labels by a knowledge exchange strategy, are utilized to update the model parameters via the distribution-aware penalty mechanism to mitigate the bias and variance problems in the long-tailed distribution. By performing such adaptive interactive learning, the proposed method can effectively achieve improved recognition accuracy for head classes in long-tailed data and alleviate the adverse effect of many head samples relative to a few samples of the tail classes. To evaluate the proposed method, we construct a large-scale butterfly dataset (named Butterfly-914) that contains approximately 72,152 images belonging to 914 species and at least 20 images for each category. Exhaustive experiments are conducted to validate the efficiency of the proposed method from several perspectives. Moreover, the superior Top-1 accuracy rate (86.2%) achieved on the butterfly dataset demonstrates that the proposed method can be widely used for agricultural species identification and insect monitoring. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
42. PlantNet: transfer learning-based fine-grained network for high-throughput plants recognition.
- Author
-
Yang, Ziying, He, Wenyan, Fan, Xijian, and Tjahjadi, Tardi
- Subjects
- *
CONVOLUTIONAL neural networks, *DEEP learning, *PLANT breeding - Abstract
In high-throughput phenotyping, recognizing individual plant categories is a vital supporting process for plant breeding. However, different plant categories exhibit fine-grained characteristics, i.e., intra-class variation and inter-class similarity, making the task challenging. Existing deep learning-based recognition methods fail to address this task effectively under such requirements, suffering from low accuracy and a lack of generalization robustness. To address this, this paper proposes PlantNet, a fine-grained network for plant recognition based on transfer learning and a bilinear convolutional neural network, which achieves high recognition accuracy in high-throughput phenotyping scenarios. The network operates as follows. First, two deep feature extractors are constructed using transfer learning. The outer product of the two features is then computed at corresponding spatial locations, and bilinear pooling is performed over all locations. Finally, the fused bilinear vectors are normalized via maximum expectation to generate the network output. Experiments on a publicly available Arabidopsis dataset show that the proposed bilinear model performs better than related state-of-the-art methods. The inter-class recognition accuracies for the four Arabidopsis species Sf-2, Cvi, Landsberg and Columbia are 98.48%, 96.53%, 96.79% and 97.33%, respectively, with an average accuracy of 97.25%. Thus, the network has good generalization ability and robust performance, satisfying the needs of fine-grained plant recognition in agricultural production. [ABSTRACT FROM AUTHOR] (A bilinear pooling sketch follows this entry.)
- Published
- 2022
- Full Text
- View/download PDF
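For readers unfamiliar with bilinear CNNs, the core pooling step described above (outer product of two feature maps, aggregated over spatial locations) can be sketched as follows. This is the standard bilinear pooling formulation with signed-square-root and L2 normalisation, not necessarily the "maximum expectation" normalisation the authors mention.
```python
import torch
import torch.nn.functional as F

def bilinear_pool(fa, fb):
    """Bilinear pooling of two feature maps of shape (B, C, H, W): take
    the outer product of the two channel vectors at every spatial
    location, average over locations, then apply the signed square root
    and L2 normalisation that are standard for bilinear CNNs."""
    B, Ca, H, W = fa.shape
    Cb = fb.shape[1]
    fa = fa.reshape(B, Ca, H * W)
    fb = fb.reshape(B, Cb, H * W)
    x = torch.bmm(fa, fb.transpose(1, 2)) / (H * W)        # (B, Ca, Cb)
    x = x.reshape(B, Ca * Cb)
    x = torch.sign(x) * torch.sqrt(torch.abs(x) + 1e-12)   # signed sqrt
    return F.normalize(x, dim=1)                           # L2-normalised

# Usage: pool the outputs of two (hypothetical) transfer-learned extractors.
fa = torch.randn(2, 256, 14, 14)
fb = torch.randn(2, 256, 14, 14)
desc = bilinear_pool(fa, fb)  # shape (2, 65536)
```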
43. Enhancing Mixture-of-Experts by Leveraging Attention for Fine-Grained Recognition.
- Author
-
Zhang, Lianbo, Huang, Shaoli, and Liu, Wei
- Abstract
Differentiating subcategories of a common visual category is challenging because of the similar appearance shared among classes in fine-grained recognition. Existing mixture-of-experts based methods divide the fine-grained space into specific regions and solve the overall problem by conquering the subspaces. However, learning diverse experts directly through a data partition strategy is not feasible because of the limited data available for fine-grained recognition. To address this issue, we leverage visual attention to learn an enhanced mixture of experts. Specifically, we introduce a gradually-enhanced learning strategy driven by model attention. The strategy promotes diversity among experts by feeding each expert full-size data distinct in granularity. We further promote each expert's learning by providing it with a larger data space, achieved by swapping attentive regions within positive pairs. Our method learns new experts sequentially on the dataset with prior knowledge from former experts, and enforces the experts to learn more diverse yet discriminative representations. These enhanced experts are finally combined to make stronger predictions. We conduct extensive experiments on fine-grained benchmarks. The results show that our method consistently outperforms state-of-the-art methods in both weakly supervised localization and fine-grained image classification. Our code is publicly available at https://github.com/lbzhang/Enhanced-Expert-FGVC-Pytorch.git. [ABSTRACT FROM AUTHOR] (An attention-guided region-swap sketch follows this entry.)
- Published
- 2022
- Full Text
- View/download PDF
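The expert-enhancement step above swaps attentive regions within positive pairs. A minimal, hedged sketch of one such attention-guided exchange follows; it cuts a square patch around each image's attention peak and pastes in the corresponding patch from its same-class partner. The fixed patch size and the assumption that attention maps are already at image resolution are illustrative simplifications, not the authors' exact procedure.
```python
import torch

def swap_attentive_patches(img_a, img_b, attn_a, attn_b, size=64):
    """Swap a square patch centred on each image's attention peak with the
    content of its positive-pair partner (a CutMix-style, attention-guided
    exchange). Images are (C, H, W); attn_* are (H, W) maps at image
    resolution; size must not exceed H or W."""
    C, H, W = img_a.shape
    out_a, out_b = img_a.clone(), img_b.clone()
    for src, dst, attn in ((img_b, out_a, attn_a), (img_a, out_b, attn_b)):
        idx = attn.flatten().argmax().item()      # most-attended pixel
        cy, cx = idx // W, idx % W
        y0 = max(0, min(cy - size // 2, H - size))
        x0 = max(0, min(cx - size // 2, W - size))
        dst[:, y0:y0 + size, x0:x0 + size] = src[:, y0:y0 + size, x0:x0 + size]
    return out_a, out_b
```
Because both images share a class label, the swapped pair keeps its label while presenting each expert with novel part combinations, which is the "larger data space" intuition in the abstract.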
44. CAT: a coarse-to-fine attention tree for semantic change detection
- Author
-
Wei, Xiu-Shen, Xu, Yu-Yan, Zhang, Chen-Lin, Xia, Gui-Song, and Peng, Yu-Xin
- Published
- 2023
- Full Text
- View/download PDF
45. Vision-based Autonomous Vehicle Recognition: A New Challenge for Deep Learning-based Systems.
- Author
-
BOUKERCHE, AZZEDINE and XIREN MA
- Subjects
- *
DEEP learning, *INTELLIGENT transportation systems, *FEATURE extraction, *VEHICLE models, *AUTONOMOUS vehicles - Abstract
Vision-based Automated Vehicle Recognition (VAVR) has attracted considerable attention recently. In particular, driven by emerging deep learning methods with powerful feature extraction and pattern learning abilities, vehicle recognition has made significant progress. VAVR is an essential part of Intelligent Transportation Systems. A VAVR system can quickly and accurately locate a target vehicle, which significantly helps improve regional security. A comprehensive VAVR system contains three components: Vehicle Detection (VD), Vehicle Make and Model Recognition (VMMR), and Vehicle Re-identification (VRe-ID). These components perform coarse-to-fine recognition tasks in three steps. In this article, we conduct a thorough review and comparison of the state-of-the-art deep learning-based models proposed for VAVR. We present a detailed introduction to the different vehicle recognition datasets used for a comprehensive evaluation of the proposed models. We also critically discuss the major challenges and future research trends for each task. Finally, we summarize the characteristics of the methods for each task. Our comprehensive model analysis will help researchers who are interested in VD, VMMR, and VRe-ID, and provide them with possible directions for solving current challenges and further improving the performance and robustness of models. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
46. Dynamic Perception Framework for Fine-Grained Recognition.
- Author
-
Ding, Yao, Han, Zhenjun, Zhou, Yanzhao, Zhu, Yi, Chen, Jie, Ye, Qixiang, and Jiao, Jianbin
- Subjects
- *
VISUAL cortex, *IMAGE recognition (Computer vision), *FEATURE extraction, *RADIO frequency - Abstract
Fine-grained recognition poses the challenge of discriminating categories with only small, subtle visual differences, which can easily be overwhelmed by the diverse appearance within categories. Conventional approaches generally locate discriminative parts and then recognize part-based features. However, we find that tuning the effective receptive field (ERF) of the network to the task plays the key role, enabling significant regions to contribute more to the output. Inspired by the receptive field stimulation mechanism of the visual cortex, we propose a Dynamic Perception framework as a solution. Our framework adapts the ERF by considering the image space and the kernel space simultaneously. In the image space, a Spatial Selective Sampling module is adopted to locally enlarge informative regions. In the kernel space, Spatial Selective Kernel convolution is introduced to adapt kernel sizes separately for regions of interest and backgrounds by embedding spatial attention in the multi-path convolution. Extensive experiments on challenging benchmarks, including CUB-200-2011, FGVC-Aircraft, and Stanford Cars, demonstrate that our method yields a performance boost over state-of-the-art methods. [ABSTRACT FROM AUTHOR] (A spatially selective kernel sketch follows this entry.)
- Published
- 2022
- Full Text
- View/download PDF
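The Spatial Selective Kernel convolution described above adapts the receptive field per region via spatial attention. Below is a minimal sketch in that spirit, assuming two branches (3x3 and 5x5) fused by a per-pixel softmax; the real module may differ in branch count, attention design, and normalisation.
```python
import torch
import torch.nn as nn

class SpatialSelectiveKernel(nn.Module):
    """Two parallel convolutions with different kernel sizes, fused by a
    per-pixel softmax over the branches, so regions of interest and
    background can take different effective receptive fields."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv5 = nn.Conv2d(channels, channels, 5, padding=2)
        hidden = max(channels // reduction, 8)
        self.attn = nn.Sequential(
            nn.Conv2d(channels, hidden, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, 2, 1),  # one logit per branch at each pixel
        )

    def forward(self, x):
        u3, u5 = self.conv3(x), self.conv5(x)
        w = self.attn(u3 + u5).softmax(dim=1)  # (B, 2, H, W) spatial weights
        return w[:, 0:1] * u3 + w[:, 1:2] * u5

# Usage sketch:
m = SpatialSelectiveKernel(64)
y = m(torch.randn(1, 64, 32, 32))  # (1, 64, 32, 32)
```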
47. Local R-Symmetry Co-Occurrence: Characterising Leaf Image Patterns for Identifying Cultivars.
- Author
-
Wang, Bin, Gao, Yongsheng, Yuan, Xiaohui, and Xiong, Shengwu
- Abstract
Leaf image recognition techniques have been actively researched for plant species identification. However, it remains unclear whether analysing leaf patterns can provide sufficient information for the finer task of differentiating cultivars. This paper reports our attempt at cultivar recognition from leaves as a general very fine-grained pattern recognition problem, which is not only a challenging research problem but also important for cultivar evaluation, selection and production in agriculture. We propose a novel local R-symmetry co-occurrence method for characterising discriminative local symmetry patterns to distinguish subtle differences among cultivars. Through scalable and moving R-relation radius pairs, we generate a set of radius symmetry co-occurrence matrices (RsCoM) and their measures for describing the local symmetry properties of interior regions. By varying the size of the radius pair, the RsCoM measures local R-symmetry co-occurrence from global/coarse to fine scales. A new two-phase strategy for analysing the distribution of local RsCoM measures is designed to match the multi-scale appearance symmetry pattern distributions of similar cultivar leaf images. We constructed three leaf image databases, SoyCultivar, CottCultivar, and PeanCultivar, for an extensive experimental evaluation of recognition across soybean, cotton and peanut cultivars. Encouraging experimental results of the proposed method in comparison with state-of-the-art leaf species recognition methods demonstrate its effectiveness for cultivar identification, which may advance research in leaf recognition from species to cultivar. [ABSTRACT FROM AUTHOR] (A loose co-occurrence sketch follows this entry.)
- Published
- 2022
- Full Text
- View/download PDF
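The RsCoM descriptor above builds co-occurrence matrices of local symmetry responses at paired radii. The following is a loose illustrative sketch of that idea in NumPy, not the authors' method: it scores per-pixel reflection symmetry by comparing diametrically opposite points on a circle (with crude wrap-around at borders), then accumulates a co-occurrence matrix of the quantised responses at two radii.
```python
import numpy as np

def radial_symmetry_map(gray, r, n_angles=16):
    """Per-pixel reflection-symmetry score: compare intensities at
    diametrically opposite points on a circle of radius r around each
    pixel (np.roll gives crude wrap-around at image borders)."""
    gray = np.asarray(gray, dtype=float)
    score = np.zeros_like(gray)
    for a in np.linspace(0.0, np.pi, n_angles, endpoint=False):
        dx, dy = int(round(r * np.cos(a))), int(round(r * np.sin(a)))
        p = np.roll(np.roll(gray, dy, axis=0), dx, axis=1)
        q = np.roll(np.roll(gray, -dy, axis=0), -dx, axis=1)
        score -= np.abs(p - q)  # closer to 0 means more symmetric
    return score / n_angles

def rscom(gray, r1, r2, n_bins=8):
    """Co-occurrence matrix of quantised symmetry responses at two radii,
    loosely in the spirit of an RsCoM descriptor for one radius pair."""
    s1, s2 = radial_symmetry_map(gray, r1), radial_symmetry_map(gray, r2)
    q1 = np.digitize(s1, np.histogram_bin_edges(s1, n_bins)[1:-1])
    q2 = np.digitize(s2, np.histogram_bin_edges(s2, n_bins)[1:-1])
    m = np.zeros((n_bins, n_bins))
    np.add.at(m, (q1.ravel(), q2.ravel()), 1)
    return m / m.sum()  # normalised descriptor
```
Varying (r1, r2) from small to large radius pairs yields the coarse-to-fine family of matrices the abstract describes; the paper's actual symmetry measure and matching strategy are more elaborate.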
48. CarVideos: A Novel Dataset for Fine-Grained Car Classification in Videos
- Author
-
Alsahafi, Yousef, Lemmond, Daniel, Ventura, Jonathan, Boult, Terrance, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, and Latifi, Shahram, editor
- Published
- 2019
- Full Text
- View/download PDF
49. Region Selection Model with Saliency Constraint for Fine-Grained Recognition
- Author
-
Zhou, Shaoxiong, Gong, Shengrong, Zhong, Shan, Pan, Wei, Ying, Wenhao, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Gedeon, Tom, editor, Wong, Kok Wai, editor, and Lee, Minho, editor
- Published
- 2019
- Full Text
- View/download PDF
50. Playing to distraction: towards a robust training of CNN classifiers through visual explanation techniques.
- Author
-
Morales, David, Talavera, Estefania, and Remeseiro, Beatriz
- Subjects
- *
DEEP learning, *CONVOLUTIONAL neural networks, *ARTIFICIAL neural networks, *DISTRACTION - Abstract
The field of deep learning is evolving in different directions, and more efficient training strategies are still needed. In this work, we present a novel and robust training scheme that integrates visual explanation techniques into the learning process. Unlike attention mechanisms that focus on the relevant parts of images, we aim to improve the robustness of the model by making it pay attention to other regions as well. Broadly speaking, the idea is to distract the classifier during learning by forcing it to focus not only on relevant regions but also on those that, a priori, are not so informative for discriminating the class. We tested the proposed approach by embedding it into the learning process of a convolutional neural network for the analysis and classification of two well-known datasets, namely Stanford Cars and FGVC-Aircraft. Furthermore, we evaluated our model in a real-world scenario, the classification of egocentric images, allowing us to obtain relevant information about people's lifestyles. In particular, we work on the challenging EgoFoodPlaces dataset, achieving state-of-the-art results with a lower level of complexity. The results indicate the suitability of the proposed training scheme for image classification, improving the robustness of the final model. [ABSTRACT FROM AUTHOR] (A distraction-style training sketch follows this entry.)
- Published
- 2021
- Full Text
- View/download PDF
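The "distraction" idea above can be approximated with a simple saliency-masking step: find the pixels the model currently relies on, blank them, and train on the occluded image so that secondary regions also become informative. The sketch below uses raw input-gradient saliency as a stand-in for the paper's visual explanation techniques; model and mask_frac are illustrative.
```python
import torch
import torch.nn.functional as F

def distraction_step(model, images, labels, mask_frac=0.1):
    """One 'distraction' training step sketch: locate the most salient
    pixels via input gradients, blank them out, and train the model to
    still classify the occluded image, forcing it to use other regions."""
    images = images.clone().requires_grad_(True)
    scores = model(images).gather(1, labels[:, None]).sum()
    grads, = torch.autograd.grad(scores, images)
    sal = grads.abs().amax(dim=1, keepdim=True)            # (B,1,H,W)
    k = int(mask_frac * sal[0].numel())                    # pixels to hide
    thresh = sal.flatten(1).topk(k, dim=1).values[:, -1]   # per-image cut
    mask = (sal < thresh[:, None, None, None]).float()
    masked = images.detach() * mask                        # occlude salient
    return F.cross_entropy(model(masked), labels)
```
In practice this loss would be combined with the ordinary classification loss on the unmasked batch, so the model learns both the primary and the secondary evidence.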