462 results on "Weakly supervised"
Search Results
2. Deep learning-based IDH1 gene mutation prediction using histopathological imaging and clinical data
- Author
-
Nakagaki, Riku, Debsarkar, Shyam Sundar, Kawanaka, Hiroharu, Aronow, Bruce J., and Prasath, V.B. Surya
- Published
- 2024
- Full Text
- View/download PDF
3. OBBInst: Remote sensing instance segmentation with oriented bounding box supervision
- Author
-
Cao, Xu, Zou, Huanxin, Li, Jun, Ying, Xinyi, and He, Shitian
- Published
- 2024
- Full Text
- View/download PDF
4. Weakly supervised object detection for automatic tooth-marked tongue recognition
- Author
-
Zhang, Yongcun, Xu, Jiajun, He, Yina, Li, Shaozi, Luo, Zhiming, and Lei, Huangwei
- Published
- 2025
- Full Text
- View/download PDF
5. Weakly Supervised Video Anomaly Detection Method Based on Multi-scale Feature Fusion and Contrastive Loss
- Author
-
Yang, Kun, Luo, Zhiming, Li, Shaozi, Li, Gang, Series Editor, Filipe, Joaquim, Series Editor, Xu, Zhiwei, Series Editor, Sun, Hailong, editor, Fan, Hongfei, editor, Gao, Yongqiang, editor, Wang, Xiaokang, editor, Liu, Dongning, editor, Du, Bowen, editor, and Lu, Tun, editor
- Published
- 2025
- Full Text
- View/download PDF
6. Cross-Domain Calibration and Boundary Denoising Network for Weakly Supervised Semantic Segmentation
- Author
-
Liu, Zhoufeng, Li, Bingrui, Ding, Shumin, Xi, Jiangtao, Li, Chunlei, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Antonacopoulos, Apostolos, editor, Chaudhuri, Subhasis, editor, Chellappa, Rama, editor, Liu, Cheng-Lin, editor, Bhattacharya, Saumik, editor, and Pal, Umapada, editor
- Published
- 2025
- Full Text
- View/download PDF
7. MDNet: Morphology-Driven Weakly Supervised Polyp Detection
- Author
-
Chen, Jiajia, Zhang, Xuejun, Gui, Jie, Du, Xiuquan, Sha, Wen, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Lin, Zhouchen, editor, Cheng, Ming-Ming, editor, He, Ran, editor, Ubul, Kurban, editor, Silamu, Wushouer, editor, Zha, Hongbin, editor, Zhou, Jie, editor, and Liu, Cheng-Lin, editor
- Published
- 2025
- Full Text
- View/download PDF
8. A Simple and Effective Weakly Supervised Chinese Text Classification Algorithm.
- Author
-
陈中涛 and 周亚同
- Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd.
- Published
- 2025
- Full Text
- View/download PDF
9. A Novel 3D Magnetic Resonance Imaging Registration Framework Based on the Swin-Transformer UNet+ Model with 3D Dynamic Snake Convolution Scheme.
- Author
-
Han, Yaolong, Wang, Lei, Huang, Zizhen, Zhang, Yukun, and Zheng, Xiao
- Subjects
MAGNETIC resonance imaging, TRANSFORMER models, IMAGE registration, SNAKES, DYNAMIC models
- Abstract
Transformer-based image registration methods have achieved notable success, but they still face challenges, such as difficulties in representing both global and local features, the inability of standard convolution operations to focus on key regions, and inefficiencies in restoring global context using the decoder. To address these issues, we extended the Swin-UNet architecture and incorporated dynamic snake convolution (DSConv) into the model, expanding it into three dimensions. This improvement enables the model to better capture spatial information at different scales, enhancing its adaptability to complex anatomical structures and their intricate components. Additionally, multi-scale dense skip connections were introduced to mitigate the spatial information loss caused by downsampling, enhancing the model's ability to capture both global and local features. We also introduced a novel optimization-based weakly supervised strategy, which iteratively refines the deformation field generated during registration, enabling the model to produce more accurate registered images. Building on these innovations, we proposed OSS DSC-STUNet+ (Swin-UNet+ with 3D dynamic snake convolution). Experimental results on the IXI, OASIS, and LPBA40 brain MRI datasets demonstrated up to a 16.3% improvement in Dice coefficient compared to five classical methods. The model exhibits outstanding performance in terms of registration accuracy, efficiency, and feature preservation. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
10. A Weakly Supervised Data Labeling Framework for Machine Lexical Normalization in Vietnamese Social Media.
- Author
-
Nguyen, Dung Ha, Nguyen, Anh Thi Hoang, and Van Nguyen, Kiet
- Abstract
This study introduces an innovative automatic labeling framework to address the challenges of lexical normalization in social media texts for low-resource languages like Vietnamese. Social media data is rich and diverse, but the evolving and varied language used in these contexts makes manual labeling labor-intensive and expensive. To tackle these issues, we propose a framework that integrates semi-supervised learning with weak supervision techniques. This approach enhances the quality of the training dataset and expands its size while minimizing manual labeling efforts. Our framework automatically labels raw data, converting non-standard vocabulary into standardized forms, thereby improving the accuracy and consistency of the training data. Experimental results demonstrate the effectiveness of our weak supervision framework in normalizing Vietnamese text, especially when utilizing pre-trained language models. The proposed framework achieves an impressive F1-score of 82.72% and maintains vocabulary integrity with an accuracy of up to 99.22%. Additionally, it effectively handles undiacritized text under various conditions. This framework significantly enhances natural language normalization quality and improves the accuracy of various NLP tasks, leading to an average accuracy increase of 1–3%. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
11. Dual Semantic Reconstruction Network for Weakly Supervised Temporal Sentence Grounding.
- Author
-
Tang, Kefan, He, Lihuo, Wang, Nannan, and Gao, Xinbo
- Published
- 2025
- Full Text
- View/download PDF
12. Weakly-supervised Semantic Segmentation with Image-level Labels: From Traditional Models to Foundation Models.
- Author
-
Chen, Zhaozheng and Sun, Qianru
- Subjects
*ARTIFICIAL neural networks, *GRAPH neural networks, *SUPERVISED learning, *TRANSFORMER models, *IMAGE recognition (Computer vision), *DEEP learning, *TEXT recognition
- Published
- 2025
- Full Text
- View/download PDF
13. Weakly supervised video anomaly detection based on hyperbolic space
- Author
-
Meilin Qi and Yuanyuan Wu
- Subjects
Hyperbolic space, Video anomaly detection, Weakly supervised, Medicine, Science
- Abstract
In recent years, there has been a proliferation of weakly supervised methods in the field of video anomaly detection. Despite significant progress in existing research, these efforts have primarily focused on addressing this issue within Euclidean space. Conducting weakly supervised video anomaly detection in Euclidean space imposes a fundamental limitation by constraining the ability to model complex patterns due to the dimensionality constraints of the embedding space and lacking the capacity to model long-term contextual information. This inadequacy can lead to misjudgments of anomalous events due to insufficient video representation. However, hyperbolic space has shown significant potential for modeling complex data, offering new insights. In this paper, we rethink weakly supervised video anomaly detection with a novel perspective: transforming video features from Euclidean space into hyperbolic space may enable the network to learn implicit relationships in normal and anomalous videos, thereby enhancing its ability to effectively distinguish between them. Finally, to validate our approach, we conducted extensive experiments on the UCF-Crime and XD-Violence datasets. Experimental results show that our method not only has the lowest number of parameters but also achieves state-of-the-art performance on the XD-Violence dataset using only RGB information.
- Published
- 2024
- Full Text
- View/download PDF
14. A Weakly Supervised Crowd Counting Method via Combining CNN and Transformer.
- Author
-
Cai, Yuhang and Zhang, De
- Subjects
CONVOLUTIONAL neural networks, TRANSFORMER models, FEATURE extraction, IMAGE representation, COUNTING
- Abstract
During the past five years, there has been an increasing trend of weakly supervised crowd counting methods being developed since such methods just rely on count-level annotations and avoid a laborious labeling process. But, the existing weakly supervised methods usually fail to achieve comparable counting performance to the fully supervised methods. To improve the accuracy of crowd counting tasks, we propose to combine the convolutional neural network (CNN) and Transformer frameworks. Since CNN focuses on capturing local detail information and Transformer can effectively extract global context information, we believe that the combination of CNN and Transformer could learn more efficient feature representations for crowd images. Our proposed framework is named CrowdCCT (Crowd Counting via CNN and Transformer), and it is composed of a CNN feature extraction part, a Transformer feature extraction part, and a counting regression part. In the CNN part, we utilize DenseNet121 to learn rich semantic features with its inherent dense connection structure. In the Transformer part, we introduce two attention modules, Multi-Scale Dilated Attention (MSDA) and Location-Enhanced Attention (LEA), working together to extract more expressive features. The output features are then fed into the regression part to generate the predicted counting results. Experiments on four crowd counting benchmark datasets demonstrate that our proposed CrowdCCT can achieve superior performance. Also, the experimental results validate the feasibility and effectiveness of combining CNN and Transformer for weakly supervised counting tasks. Our work could be expected to promote further combination research on CNN and Transformer. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. Weakly supervised point cloud semantic segmentation based on scene consistency.
- Author
-
Niu, Yingchun, Yin, Jianqin, Qi, Chao, and Geng, Liang
- Subjects
POINT cloud, COST effectiveness, SUPERVISION, SIMPLICITY, COST
- Abstract
Weakly supervised point cloud segmentation has garnered considerable interest recently, primarily due to its ability to diminish labor-intensive manual labeling costs. The effectiveness of such methods hinges on their ability to implicitly augment the supervision signals available for training. However, we found that most approaches tend to be implemented through complex modeling, which is not conducive to deployment and implementation in resource-poor scenarios. Our study introduces a novel scene consistency modeling approach that significantly enhances weakly supervised point cloud segmentation in this context. By synergistically modeling both complete and incomplete scenes, our method can improve the quality of the supervision signal, save resources, and ease deployment in practical applications. To achieve this, we first generate the corresponding incomplete scene for the whole scene using windowing techniques. Next, we input the complete and incomplete scenes into a network encoder and obtain prediction results for each scene through two decoders. We enforce semantic consistency between the labeled and unlabeled data in the two scenes by employing cross-entropy and KL loss. This consistent modeling method enables the network to focus more on the same areas in both scenes, capturing local details and effectively increasing the supervision signals. One of the advantages of the proposed method is its simplicity and cost-effectiveness, because we rely solely on variance and KL loss to model scene consistency, resulting in straightforward computations. Our experimental evaluations on S3DIS, ScanNet, and Semantic3D datasets provide further evidence that our method can effectively leverage sparsely labeled data and abundant unlabeled data to enhance supervision signals and improve the overall model performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Weakly Supervised Rotated Object Detection in Remote Sensing Images with a View-Consistency Network.
- Author
-
方, 婷婷, 刘, 斌, 陈, 春晖, and 厉, 香蕴
- Subjects
OBJECT recognition (Computer vision), WAREHOUSES, DETECTORS, DEEP learning, ANNOTATIONS
- Abstract
Copyright of Journal of Remote Sensing is the property of Editorial Office of Journal of Remote Sensing & Science Publishing Co.
- Published
- 2024
- Full Text
- View/download PDF
17. Weakly supervised video anomaly detection based on hyperbolic space.
- Author
-
Qi, Meilin and Wu, Yuanyuan
- Subjects
HYPERBOLIC spaces, ANOMALY detection (Computer security), IMPLICIT learning, DATA modeling, VIDEOS
- Abstract
In recent years, there has been a proliferation of weakly supervised methods in the field of video anomaly detection. Despite significant progress in existing research, these efforts have primarily focused on addressing this issue within Euclidean space. Conducting weakly supervised video anomaly detection in Euclidean space imposes a fundamental limitation by constraining the ability to model complex patterns due to the dimensionality constraints of the embedding space and lacking the capacity to model long-term contextual information. This inadequacy can lead to misjudgments of anomalous events due to insufficient video representation. However, hyperbolic space has shown significant potential for modeling complex data, offering new insights. In this paper, we rethink weakly supervised video anomaly detection with a novel perspective: transforming video features from Euclidean space into hyperbolic space may enable the network to learn implicit relationships in normal and anomalous videos, thereby enhancing its ability to effectively distinguish between them. Finally, to validate our approach, we conducted extensive experiments on the UCF-Crime and XD-Violence datasets. Experimental results show that our method not only has the lowest number of parameters but also achieves state-of-the-art performance on the XD-Violence dataset using only RGB information. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Annotate less but perform better: weakly supervised shadow detection via label augmentation.
- Author
-
Chen, Hongyu, Chen, Xiao-Diao, Wu, Wen, Yang, Wenya, and Mao, Xiaoyang
- Subjects
*IMAGE segmentation, *IMAGE reconstruction, *DETECTORS, *ANNOTATIONS, *PIXELS
- Abstract
Shadow detection is essential for scene understanding and image restoration. Existing paradigms for producing shadow detection training data usually rely on densely labeling each image pixel, which will lead to a bottleneck when scaling up the number of images. To tackle this problem, by labeling shadow images with only a few strokes, this paper designs a learning framework for Weakly supervised Shadow Detection, namely WSD. Firstly, it creates two shadow detection datasets with scribble annotations, namely Scr-SBU and Scr-ISTD. Secondly, it proposes an uncertainty-guided label augmentation scheme based on graph convolutional networks, which can propagate the sparse scribble annotations to more reliable regions, and then avoid the model converging to an undesired local minimum as intra-class discontinuity. Finally, it introduces a multi-task learning framework to jointly learn shadow detection and edge detection, which encourages generated shadow maps to be comprehensive and well aligned with shadow boundaries. Experimental results on benchmark datasets demonstrate that our framework even outperforms existing semi-supervised and fully supervised shadow detectors while requiring only 2% of pixels to be labeled. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. Exploiting Instance-level Relationships in Weakly Supervised Text-to-Video Retrieval.
- Author
-
Yin, Shukang, Zhao, Sirui, Wang, Hao, Xu, Tong, and Chen, Enhong
- Subjects
GREEDY algorithms, VIDEOS, HEURISTIC, ANNOTATIONS, HYPOTHESIS
- Abstract
Text-to-Video Retrieval is a typical cross-modal retrieval task that has been studied extensively under a conventional supervised setting. Recently, some works have sought to extend the problem to a weakly supervised formulation, which can be more consistent with real-life scenarios and more efficient in annotation cost. In this context, a new task called Partially Relevant Video Retrieval (PRVR) is proposed, which aims to retrieve videos that are partially relevant to a given textual query, i.e., the videos containing at least one semantically relevant moment. Formulating the task as a Multiple Instance Learning (MIL) ranking problem, prior work relies on heuristic algorithms such as a simple greedy search strategy and deals with each query independently. Although these early explorations have achieved decent performance, they may not fully utilize the bag-level label and only consider the local optimum, which could result in suboptimal solutions and inferior final retrieval performance. To address this problem, in this paper, we propose to exploit the relationships between instances to boost retrieval performance. Based on this idea, we creatively put forward: (1) a new matching scheme for pairing queries and their related moments in the video; and (2) a new loss function to facilitate cross-modal alignment between two views of an instance. Extensive validations on three publicly available datasets have demonstrated the effectiveness of our solution and verified our hypothesis that modeling instance-level relationships is beneficial in the MIL ranking setting. Our code will be publicly available at https://github.com/xjtupanda/BGM-Net. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Epipolar constraint-guided differentiable keypoint detection and description
- Author
-
Li, Xi, Feng, Yulong, Yu, Xianguo, Cong, Yirui, and Chen, Lili
- Published
- 2025
- Full Text
- View/download PDF
21. Semantic segmentation of point clouds of ancient buildings based on weak supervision
- Author
-
Jianghong Zhao, Haiquan Yu, Xinnan Hua, Xin Wang, Jia Yang, Jifu Zhao, and Ailin Xu
- Subjects
3D point cloud, Weakly supervised, Deep learning, Fine Arts, Analytical chemistry, QD71-142
- Abstract
Semantic segmentation of point clouds of ancient buildings plays an important role in Historical Building Information Modelling (HBIM). Annotating point clouds of ancient architecture requires strong professional expertise and a large workload, which greatly restricts the application of point cloud semantic segmentation technology in the field of ancient architecture; this paper therefore investigates a weakly supervised semantic segmentation method for point clouds of ancient architecture. Aiming at the problem of small differences between classes of ancient architectural components, this paper introduces a self-attention mechanism, which can effectively distinguish similar components in the neighbourhood. Moreover, this paper explores the insufficiency of positional encoding in the baseline and constructs a high-precision point cloud semantic segmentation network model for ancient buildings—Semantic Query Network based on Dual Local Attention (SQN-DLA). Using only 0.1% of the annotations in our homemade dataset and the Architectural Cultural Heritage (ArCH) dataset, the mean Intersection over Union (mIoU) reaches 66.02% and 58.03%, respectively, which is an improvement of 3.51% and 3.91%, respectively, compared to the baseline.
- Published
- 2024
- Full Text
- View/download PDF
22. Semantic segmentation of point clouds of ancient buildings based on weak supervision.
- Author
-
Zhao, Jianghong, Yu, Haiquan, Hua, Xinnan, Wang, Xin, Yang, Jia, Zhao, Jifu, and Xu, Ailin
- Subjects
POINT cloud, ANCIENT architecture, HISTORIC buildings, BUILDING information modeling, PROFESSIONALISM
- Abstract
Semantic segmentation of point clouds of ancient buildings plays an important role in Historical Building Information Modelling (HBIM). Annotating point clouds of ancient architecture requires strong professional expertise and a large workload, which greatly restricts the application of point cloud semantic segmentation technology in the field of ancient architecture; this paper therefore investigates a weakly supervised semantic segmentation method for point clouds of ancient architecture. Aiming at the problem of small differences between classes of ancient architectural components, this paper introduces a self-attention mechanism, which can effectively distinguish similar components in the neighbourhood. Moreover, this paper explores the insufficiency of positional encoding in the baseline and constructs a high-precision point cloud semantic segmentation network model for ancient buildings—Semantic Query Network based on Dual Local Attention (SQN-DLA). Using only 0.1% of the annotations in our homemade dataset and the Architectural Cultural Heritage (ArCH) dataset, the mean Intersection over Union (mIoU) reaches 66.02% and 58.03%, respectively, which is an improvement of 3.51% and 3.91%, respectively, compared to the baseline. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Review and Prospects of Unsupervised and Weakly Supervised Video Anomaly Detection Methods.
- Author
-
张琳, 陈兆波, 马晓轩, and 张凡博
- Abstract
With the continuous development of monitoring technology, surveillance cameras have been widely deployed in various scenarios. Manual detection of video abnormality has become impossible. Therefore, video anomaly detection technology, as the core of intelligent surveillance systems, is receiving extensive attention and research. With the development of deep learning, the field of video anomaly detection has made significant achievements, and many new anomaly detection methods have emerged. Unsupervised and weakly supervised video anomaly detection learning methods applied to various data types were sorted out, the contributions of existing methods were analyzed, and the performance of different models was compared. In addition, some commonly used and newly released datasets have also been compiled, and the challenges and development trends that future work will face have been summarized. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. A weakly supervised time series anomaly detection method with dual-association discrepancy.
- Author
-
Liu, Fanxing, Zhang, Lu, Li, Hao, Zhou, Siyu, and Zhou, Yingjie
- Abstract
Time series anomaly detection is a task of significant importance and has been widely employed in realistic scenarios. Most existing methods conduct time series anomaly detection in an unsupervised manner, ignoring the limited number of labeled anomalies that are commonly available in practical situations. However, how to take advantage of these limited but valuable labeled anomalies to benefit time series anomaly detection requires full exploration. To this end, we propose a weakly supervised time series anomaly detection method with dual-association discrepancy to effectively identify anomalies. Specifically, the proposed method utilizes the limited number of anomalies to enlarge the distance between the association discrepancy of the anomaly and that of the normal one, enforcing significant differences between abnormal and normal samples in discriminant places. We also design a masking strategy to enrich the representations of anomalies in latent feature space. Experiments with publicly available datasets have demonstrated the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. PathMamba: Weakly Supervised State Space Model for Multi-class Segmentation of Pathology Images
- Author
-
Fan, Jiansong, Lv, Tianxu, Di, Yicheng, Li, Lihua, Pan, Xiang, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Linguraru, Marius George, editor, Dou, Qi, editor, Feragen, Aasa, editor, Giannarou, Stamatia, editor, Glocker, Ben, editor, Lekadir, Karim, editor, and Schnabel, Julia A., editor
- Published
- 2024
- Full Text
- View/download PDF
26. Bounding Box Is All You Need: Learning to Segment Cells in 2D Microscopic Images via Box Annotations
- Author
-
Khalid, Nabeel, Caroprese, Maria, Lovell, Gillian, Porto, Daniel A., Trygg, Johan, Dengel, Andreas, Ahmed, Sheraz, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Yap, Moi Hoon, editor, Kendrick, Connah, editor, Behera, Ardhendu, editor, Cootes, Timothy, editor, and Zwiggelaar, Reyer, editor
- Published
- 2024
- Full Text
- View/download PDF
27. Multitask Deep Convolutional Neural Network with Attention for Pulmonary Tuberculosis Detection and Weak Localization of Pathological Manifestations in Chest X-Ray
- Author
-
Wolde Feyisa, Degaga, Megersa Ayano, Yehualashet, Girma Debelee, Taye, Sisay Hailu, Samuel, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Debelee, Taye Girma, editor, Ibenthal, Achim, editor, Schwenker, Friedhelm, editor, and Megersa Ayano, Yehualashet, editor
- Published
- 2024
- Full Text
- View/download PDF
28. Part-Aware Prompt Tuning for Weakly Supervised Referring Expression Grounding
- Author
-
Zhao, Chenlin, Ye, Jiabo, Song, Yaguang, Yan, Ming, Yang, Xiaoshan, Xu, Changsheng, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Rudinac, Stevan, editor, Hanjalic, Alan, editor, Liem, Cynthia, editor, Worring, Marcel, editor, Jónsson, Björn Þór, editor, Liu, Bei, editor, and Yamakata, Yoko, editor
- Published
- 2024
- Full Text
- View/download PDF
29. STN-BA: Weakly-Supervised Few-Shot Temporal Action Localization
- Author
-
Ye, Na, Zhang, Zhijie, Zhang, Xiang, Li, Baoshan, Wang, Xiaoshu, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Liu, Fenrong, editor, Sadanandan, Arun Anand, editor, Pham, Duc Nghia, editor, Mursanto, Petrus, editor, and Lukose, Dickson, editor
- Published
- 2024
- Full Text
- View/download PDF
30. PIS-Net: Efficient weakly supervised instance segmentation network based on annotated points for rice field weed identification
- Author
-
Hao Chen, Youliu Zhang, Caijie He, Chuting Chen, Yaoxuan Zhang, Zhiying Chen, Yu Jiang, Caixia Lin, Ruijun Ma, and Long Qi
- Subjects
Weakly supervised, Instance segmentation, Deep learning, Weed recognition, Agriculture (General), S1-972, Agricultural industries, HD9000-9495
- Abstract
Weed damage in rice fields is one of the main causes of reduced rice yields and quality. Accurate and efficient weed identification is the prerequisite for realizing intelligent and precise weeding in paddies. Recently, Vision Transformers (ViTs) have emerged with superior performance on computer vision tasks compared to the convolutional neural network (CNN)-based models. However, the lack of fully labeled weed datasets hinders the potential application of deep learning models in weed identification. To address the above issues, this study customizes a novel point-supervised instance segmentation network (PIS-Net) for weakly supervised instance segmentation of weeds in rice fields. More specifically, we first propose a novel instance segmentation point labeling scheme that utilizes randomly generated annotation points within each instance, aiming to decrease both labeling time and difficulty. Additionally, to make optimal use of point labels, this study puts forth a mask generation strategy based on the adaptive selection of pyramid levels. In this sense, the network model can flexibly choose the pyramid level expected to generate the most suitable instance mask based on the network's reliability. Finally, we establish the pseudo label refinement network (PLR-Net) to refine rough instance masks. The proposed PIS-Net utilizes 13 randomly generated annotation points for each instance, yet achieves an AP of 38.5 and an AP50 of 68.3, which is superior to the baseline mask-R-CNN with an AP of 8.2 and AP50 of 6.9, achieving 90% of fully supervised performance. This method effectively utilizes point labels, annotated with high efficiency, as a robust source of weak supervision to address challenges in weed data annotation and the low accuracy of existing weakly supervised models. Experiments show that the point annotation scheme of the PIS-Net is faster than full-object mask annotation, and the AP is also higher than the current semi-supervised weed segmentation model, showing high potential in practical paddy fields.
- Published
- 2024
- Full Text
- View/download PDF
31. A Novel 3D Magnetic Resonance Imaging Registration Framework Based on the Swin-Transformer UNet+ Model with 3D Dynamic Snake Convolution Scheme
- Author
-
Yaolong Han, Lei Wang, Zizhen Huang, Yukun Zhang, and Xiao Zheng
- Subjects
magnetic resonance imaging, image registration, swin transformer, UNet, 3D dynamic snake convolution, weakly supervised, Photography, TR1-1050, Computer applications to medicine. Medical informatics, R858-859.7, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Transformer-based image registration methods have achieved notable success, but they still face challenges, such as difficulties in representing both global and local features, the inability of standard convolution operations to focus on key regions, and inefficiencies in restoring global context using the decoder. To address these issues, we extended the Swin-UNet architecture and incorporated dynamic snake convolution (DSConv) into the model, expanding it into three dimensions. This improvement enables the model to better capture spatial information at different scales, enhancing its adaptability to complex anatomical structures and their intricate components. Additionally, multi-scale dense skip connections were introduced to mitigate the spatial information loss caused by downsampling, enhancing the model’s ability to capture both global and local features. We also introduced a novel optimization-based weakly supervised strategy, which iteratively refines the deformation field generated during registration, enabling the model to produce more accurate registered images. Building on these innovations, we proposed OSS DSC-STUNet+ (Swin-UNet+ with 3D dynamic snake convolution). Experimental results on the IXI, OASIS, and LPBA40 brain MRI datasets demonstrated up to a 16.3% improvement in Dice coefficient compared to five classical methods. The model exhibits outstanding performance in terms of registration accuracy, efficiency, and feature preservation.
- Published
- 2025
- Full Text
- View/download PDF
32. Weakly supervised pathological whole slide image classification based on contrastive learning.
- Author
-
Xie, Yining, Long, Jun, Hou, Jianxin, Chen, Deyun, and Guan, Guohui
- Subjects
IMAGE recognition (Computer vision), RECEIVER operating characteristic curves, SUPERVISED learning, LYMPH node cancer, FEATURE extraction
- Abstract
In the context of dealing with limited annotated data, this paper introduces a weakly supervised whole slide image (WSI) classification approach based on contrastive learning. The proposed method aims to detect whether cancer cells have metastasized in anterior lymph nodes of breast cancer in whole slide images. Initially, small patches are extracted from whole-slide pathology images, and an unsupervised pretraining is performed on the feature extraction model using the MoCo v2 framework. Subsequently, the feature extraction model is used to extract features from the small patches. Finally, CLAM is employed to aggregate the extracted features to obtain the overall whole slide image (WSI) classification results. Experimental results demonstrate that using MoCo v2 for unsupervised pretraining of the feature extraction model achieves an accuracy of 0.8808 in the small patch classification task. Moreover, under coarse-grained WSI-level labels, the proposed approach achieves area under the receiver operating characteristic curve (AUC) values of 0.957 ± 0.0276 and 0.9442 on different datasets, outperforming typical weakly supervised and partially supervised methods in terms of classification performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. A noise-robust water segmentation method based on synthetic aperture radar images combined with automatic sample collection.
- Author
-
Hou, Zhuoyan, Meng, Mengmeng, Zhou, Guichao, Zhang, Xuedong, Cao, Mingjun, Qian, Junhao, Li, Ning, Huang, Yabo, Wu, Lin, and Xie, Linglin
- Subjects
*SYNTHETIC aperture radar, *BODIES of water, *SYNTHETIC apertures, *SPECKLE interference, *AUTOMATIC identification, *WATER sampling, *WATER use
- Abstract
Synthetic Aperture Radar (SAR) images have been widely used for surface water identification due to their all-weather capabilities. However, the presence of inherent speckle noise in SAR data poses a challenge for accurate water identification. Additionally, annotating high-quality water body samples requires significant human labour, which can be costly and time-consuming. Aiming at the above problems, a noise-robust automatic water identification architecture without artificial labels is proposed. First, a two-stage automatic sample collection method that utilizes k-means++ clustering and morphological concepts is designed. Then, a weakly supervised noise-resistant SAR water body segmentation method NRM-ACUNet has been developed based on U-Net combined with LNR-Dice loss function and Conditionally Parameterized Convolutions (CondConv) to minimize the impact of sample noises. Experimental results show that the morphological processing can improve water body sample quality compared to k-means++, and compared with U-Net, NRM-ACUNet performs superior with noise-containing pseudo-samples, achieving 96.8% F1 accuracy and 52.06% accuracy improvement. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Active self-training for weakly supervised 3D scene semantic segmentation.
- Author
-
Liu, Gengxin, van Kaick, Oliver, Huang, Hui, and Hu, Ruizhen
- Subjects
ACTIVE learning, POINT cloud
- Abstract
Since the preparation of labeled data for training semantic segmentation networks of point clouds is a time-consuming process, weakly supervised approaches have been introduced to learn from only a small fraction of data. These methods are typically based on learning with contrastive losses while automatically deriving per-point pseudo-labels from a sparse set of user-annotated labels. In this paper, our key observation is that the selection of which samples to annotate is as important as how these samples are used for training. Thus, we introduce a method for weakly supervised segmentation of 3D scenes that combines self-training with active learning. Active learning selects points for annotation that are likely to result in improvements to the trained model, while self-training makes efficient use of the user-provided labels for learning the model. We demonstrate that our approach leads to an effective method that provides improvements in scene segmentation over previous work and baselines, while requiring only a few user annotations. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Enhancing action discrimination via category-specific frame clustering for weakly-supervised temporal action localization.
- Author
-
Xia, Huifen, Zhan, Yongzhao, Liu, Honglin, and Ren, Xiaopeng
- Abstract
Copyright of Frontiers of Information Technology & Electronic Engineering is the property of Springer Nature and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
36. Discriminative Action Snippet Propagation Network for Weakly Supervised Temporal Action Localization.
- Author
-
Dang, Yuanjie, Huang, Chunxia, Chen, Peng, Zhao, Dongdong, Gao, Nan, Liang, Ronghua, and Huan, Ruohong
- Subjects
ACTIVE learning, COMPLETENESS theorem
- Abstract
Weakly supervised temporal action localization (WTAL) aims to classify and localize actions in untrimmed videos with only video-level labels. Recent studies have attempted to obtain more accurate temporal boundaries by exploiting latent action instances in ambiguous snippets or propagating representative action features. However, empirically handcrafted ambiguous snippet extraction and the imprecise alignment of representative snippet propagation lead to challenges in modeling the completeness of actions for these methods. In this article, we propose a Discriminative Action Snippet Propagation Network (DASP-Net) to accurately discover ambiguous snippets in videos and propagate discriminative instance-level features throughout the video for improving action completeness. Specifically, we introduce a novel discriminative feature propagation module for capturing the global contextual attention and propagating the action concept across the whole video by perceiving the discriminative action snippets with instance information from the same video. Simultaneously, we incorporate denoised pseudo-labels as supervision, where we correct the controversial prediction based on the feature space distribution during training, thereby alleviating false detection caused by noise background features. Furthermore, we design an ambiguous feature mining module, which maximizes the feature affinity information of action and background in ambiguous snippets to generate more accurate latent action and background snippets and learns more precise action instance boundaries through contrastive learning of action and background snippets. Extensive experiments show that DASP-Net achieves state-of-the-art results on THUMOS14 and ActivityNet1.2 datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Segmentation and grade evaluation of corrosion on hydraulic steel gates based on image-level labels.
- Author
-
Zhang, Wenheng, Zhang, Yuqi, Gu, Qifeng, and Zhao, Huadong
- Abstract
Machine vision offers distinct advantages, such as enhanced efficiency and precision, in the segmentation and assessment of corrosion on hydraulic steel gates. This study addresses the challenge of demanding a substantial amount of pixel-level annotated data in machine vision-based corrosion segmentation and assessment approaches. To tackle this issue, a novel weakly supervised method for corrosion segmentation and assessment in hydraulic steel gates is proposed, leveraging class labeling. The technique employs a class activation map to pinpoint regions containing corrosion seeds and to train a network to capture semantic affinity relations. Subsequently, the concept of region growing is adopted to propagate semantic information across the entire image. The average feature vector of the seed region serves as the corrosion feature, enabling precise segmentation of corroded areas and circumventing the laborious pixel-level annotation process. Additionally, a fine-grained corrosion classification network is established and trained using salt spray corrosion test data to accurately evaluate the corrosion severity. To validate the proposed method's accuracy, a dataset of steel gate corrosion images is curated based on real-world operational scenes. Experimental results demonstrate that, in practical scenarios, the segmentation method presented in this paper achieves a segmentation intersection ratio of 62.37% in corrosion, without pixel-level annotation. This performance closely approaches the performance of mainstream fully supervised methods. Additionally, the corrosion grade evaluation method proposed in this study achieves an accuracy of 95.77%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. Category-Aware Saliency Enhance Learning Based on CLIP for Weakly Supervised Salient Object Detection.
- Author
-
Zhang, Yunde, Zhang, Zhili, Liu, Tianshan, and Kong, Jun
- Subjects
PIXELS, CLASSIFICATION
- Abstract
Weakly supervised salient object detection (SOD) using image-level category labels has been proposed to reduce the annotation cost of pixel-level labels. However, existing methods mostly train a classification network to generate a class activation map, which suffers from coarse localization and difficult pseudo-label updating. To address these issues, we propose a novel Category-aware Saliency Enhance Learning (CSEL) method based on contrastive vision-language pre-training (CLIP), which can perform image-text classification and pseudo-label updating simultaneously. Our proposed method transforms image-text classification into pixel-text matching and generates a category-aware saliency map, which is evaluated by the classification accuracy. Moreover, CSEL assesses the quality of the category-aware saliency map and the pseudo saliency map, and uses the quality confidence scores as weights to update the pseudo labels. The two maps mutually enhance each other to guide the pseudo saliency map in the correct direction. Our SOD network can be trained jointly under the supervision of the updated pseudo saliency maps. We test our model on various well-known RGB-D and RGB SOD datasets. Our model achieves an S-measure of 87.6 % on the RGB-D NLPR dataset and 84.3 % on the RGB ECSSD dataset. Additionally, we obtain satisfactory performance on the weakly supervised E-measure, F-measure, and mean absolute error metrics for other datasets. These results demonstrate the effectiveness of our model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. Weakly supervised salient object detection via bounding-box annotation and SAM model
- Author
-
Xiangquan Liu and Xiaoming Huang
- Subjects
salient object detection, weakly supervised, segment anything, bounding-box annotation, deep learning, Mathematics, QA1-939, Applied mathematics. Quantitative methods, T57-57.97
- Abstract
Salient object detection (SOD) aims to detect the most attractive region in an image. Fully supervised SOD based on deep learning usually needs a large amount of data with human annotation. Researchers have gradually focused on the SOD task using weakly supervised annotation such as category, scribble, and bounding-box, while these existing weakly supervised methods achieve limited performance and demonstrate a huge performance gap with fully supervised methods. In this work, we proposed one novel two-stage weakly supervised method based on bounding-box annotation and the recent large visual model Segment Anything (SAM). In the first stage, we regarded the bounding-box annotation as the box prompt of SAM to generate initial labels and proposed object completeness check and object inversion check to exclude low quality labels, then we selected reliable pseudo labels for the training initial SOD model. In the second stage, we used the initial SOD model to predict the saliency map of excluded images and adopted SAM with the everything mode to generate segmentation candidates, then we fused the saliency map and segmentation candidates to predict pseudo labels. Finally we used all reliable pseudo labels generated in the two stages to train one refined SOD model. We also designed a simple but effective SOD model, which can capture rich global context information. Performance evaluation on four public datasets showed that the proposed method significantly outperforms other weakly supervised methods and also achieves comparable performance with fully supervised methods.
- Published
- 2024
- Full Text
- View/download PDF
40. Beyond Supervised Learning in Remote Sensing: A Systematic Review of Deep Learning Approaches
- Author
-
Benyamin Hosseiny, Masoud Mahdianpari, Mohammadali Hemati, Ali Radman, Fariba Mohammadimanesh, and Jocelyn Chanussot
- Subjects
Self-supervised, semisupervised, training data, transfer learning (TL), unsupervised, weakly supervised, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
The increasing availability of remote sensing data in the era of geo big-data makes producing well-represented, reliable training data more challenging and requires an excessive amount of human labor. In addition, the rapid increase in graphics processing unit processing power has enabled the development of advanced deep learning algorithms, which achieve impressive results in the field of satellite image processing. However, they require a huge and comprehensive training dataset to avoid overfitting problems and to represent a generalizable model. Thus, moving toward the development of nonsupervised deep learning (NSDL) models in different remote sensing applications is an inevitable need. To provide an initial response to that need, this article performs a comprehensive review and systematic meta-analysis of recently published research articles focusing on the applications of NSDL for remote sensing data processing. In order to identify future research directions and formulate recommendations, we extract trends and highlight interesting approaches from this large body of literature. Consequently, current challenges, prospects, and recommendations are also discussed to uncover the trend. According to the results, there has been a sharply increasing trend in the applicability of NSDL methods in recent years, particularly with the advent of new deep architectures, such as adversarial, graph, and transformer models. As a result, this review article discusses different remote sensing data processing applications and challenges that can be addressed using NSDL approaches.
- Published
- 2024
- Full Text
- View/download PDF
41. Weakly supervised salient object detection via image category annotation
- Author
-
Ruoqi Zhang, Xiaoming Huang, and Qiang Zhu
- Subjects
weakly supervised, salient object detection, saliency detection, image category annotation, deep learning, Biotechnology, TP248.13-248.65, Mathematics, QA1-939
- Abstract
The rapid development of deep learning has brought great progress in the salient object detection task. Fully supervised methods need a large number of pixel-level annotations. To avoid laborious and time-consuming annotation, weakly supervised methods consider low-cost annotations such as category, bounding-box, and scribble. Due to simple annotation and existing large-scale classification datasets, the category annotation based methods have received more attention while still suffering from inaccurate detection. In this work, we proposed one weakly supervised method with category annotation. First, we proposed one coarse object location network (COLN) to roughly locate the object of an image with category annotation. Second, we refined the coarse object location to generate pixel-level pseudo-labels and proposed one quality check strategy to select high quality pseudo labels. To this end, we studied COLN twice followed by refinement to obtain a pseudo-label pair and calculated the consistency of pseudo-label pairs to select high quality labels. Third, we proposed one multi-decoder neural network (MDN) for saliency detection supervised by pseudo-label pairs. Both the loss of each decoder and the loss between decoders are considered. Last but not least, we proposed one pseudo-labels update strategy to iteratively optimize pseudo-labels and saliency detection models. Performance evaluation on four public datasets shows that our method outperforms other image category annotation based work.
- Published
- 2023
- Full Text
- View/download PDF
42. Iterative learning for maxillary sinus segmentation based on bounding box annotations.
- Author
-
Xu, Xinli, Wang, Kaidong, Wang, Chengze, Chen, Ruihao, Zhu, Fudong, Long, Haixia, and Guan, Qiu
- Abstract
An accurate segmentation of the maxillary sinus (MS) is helpful for preoperative planning of dental implantation, diagnosis and evaluation of sinusitis, and validation of radiotherapy for sinus cancer. Many medical image segmentation models based on convolutional neural networks have achieved excellent performance, however, relied heavily on manual accurate labeling of training data. We propose an iterative learning method for MS segmentation with only bounding box supervision. First, a cone-beam computed tomography (CBCT) image is over-segmented into a set of superpixels and a feature extraction network is optimized to better extract multi-scale features of each small-size superpixel. Second, an improved graph convolutional network (IGCN) is developed to merge superpixel regions and improve the feature transformation ability of each node on a superpixel-wise graph. Finally, the iterative learning combined with the superpixel-conditional random field and IGCN makes pseudo labels gradually refine and close to fully supervised information. On a practical MS dataset, the proposed method achieves 90.5% in Dice similarity coefficient. Extending to the public dataset Promise12 for prostate MR image segmentation, it also performs well. The results show that our proposed method has good comprehensive weakly supervised segmentation performance and can narrow a gap between the bounding box and full supervision. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. Background Activation Suppression for Weakly Supervised Object Localization and Semantic Segmentation.
- Author
-
Zhai, Wei, Wu, Pingyu, Zhu, Kai, Cao, Yang, Wu, Feng, and Zha, Zheng-Jun
- Subjects
*CROSS-entropy method, *LOCALIZATION (Mathematics)
- Abstract
Weakly supervised object localization and semantic segmentation aim to localize objects using only image-level labels. Recently, a new paradigm has emerged by generating a foreground prediction map (FPM) to achieve pixel-level localization. While existing FPM-based methods use cross-entropy to evaluate the foreground prediction map and to guide the learning of the generator, this paper presents two astonishing experimental observations on the object localization learning process: For a trained network, as the foreground mask expands, (1) the cross-entropy converges to zero when the foreground mask covers only part of the object region. (2) The activation value continuously increases until the foreground mask expands to the object boundary. Therefore, to achieve a more effective localization performance, we argue for the usage of activation value to learn more object regions. In this paper, we propose a background activation suppression (BAS) method. Specifically, an activation map constraint module is designed to facilitate the learning of generator by suppressing the background activation value. Meanwhile, by using foreground region guidance and area constraint, BAS can learn the whole region of the object. In the inference phase, we consider the prediction maps of different categories together to obtain the final localization results. Extensive experiments show that BAS achieves significant and consistent improvement over the baseline methods on the CUB-200-2011 and ILSVRC datasets. In addition, our method also achieves state-of-the-art weakly supervised semantic segmentation performance on the PASCAL VOC 2012 and MS COCO 2014 datasets. Code and models are available at https://github.com/wpy1999/BAS-Extension. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. Advances and Challenges in Deep Learning-Based Change Detection for Remote Sensing Images: A Review through Various Learning Paradigms.
- Author
-
Wang, Lukang, Zhang, Min, Gao, Xu, and Shi, Wenzhong
- Subjects
*REMOTE sensing, *PATTERN recognition systems, *SURFACE of the earth, *OPTICAL remote sensing, *DEEP learning, *EMERGENCY management
- Abstract
Change detection (CD) in remote sensing (RS) imagery is a pivotal method for detecting changes in the Earth's surface, finding wide applications in urban planning, disaster management, and national security. Recently, deep learning (DL) has experienced explosive growth and, with its superior capabilities in feature learning and pattern recognition, it has introduced innovative approaches to CD. This review explores the latest techniques, applications, and challenges in DL-based CD, examining them through the lens of various learning paradigms, including fully supervised, semi-supervised, weakly supervised, and unsupervised. Initially, the review introduces the basic network architectures for CD methods using DL. Then, it provides a comprehensive analysis of CD methods under different learning paradigms, summarizing commonly used frameworks. Additionally, an overview of publicly available datasets for CD is offered. Finally, the review addresses the opportunities and challenges in the field, including: (a) incomplete supervised CD, encompassing semi-supervised and weakly supervised methods, which is still in its infancy and requires further in-depth investigation; (b) the potential of self-supervised learning, offering significant opportunities for Few-shot and One-shot Learning of CD; (c) the development of Foundation Models, with their multi-task adaptability, providing new perspectives and tools for CD; and (d) the expansion of data sources, presenting both opportunities and challenges for multimodal CD. These areas suggest promising directions for future research in CD. In conclusion, this review aims to assist researchers in gaining a comprehensive understanding of the CD field. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
45. Weakly supervised salient object detection via bounding-box annotation and SAM model.
- Author
-
Liu, Xiangquan and Huang, Xiaoming
- Subjects
*DEEP learning, *CONVOLUTIONAL neural networks, *DIGITAL technology, *ARTIFICIAL intelligence, *EQUATIONS
- Abstract
Salient object detection (SOD) aims to detect the most attractive region in an image. Fully supervised SOD based on deep learning usually needs a large amount of data with human annotation. Researchers have gradually focused on the SOD task using weakly supervised annotation such as category, scribble, and bounding-box, while these existing weakly supervised methods achieve limited performance and demonstrate a huge performance gap with fully supervised methods. In this work, we proposed one novel two-stage weakly supervised method based on bounding-box annotation and the recent large visual model Segment Anything (SAM). In the first stage, we regarded the bounding-box annotation as the box prompt of SAM to generate initial labels and proposed object completeness check and object inversion check to exclude low quality labels, then we selected reliable pseudo labels for the training initial SOD model. In the second stage, we used the initial SOD model to predict the saliency map of excluded images and adopted SAM with the everything mode to generate segmentation candidates, then we fused the saliency map and segmentation candidates to predict pseudo labels. Finally we used all reliable pseudo labels generated in the two stages to train one refined SOD model. We also designed a simple but effective SOD model, which can capture rich global context information. Performance evaluation on four public datasets showed that the proposed method significantly outperforms other weakly supervised methods and also achieves comparable performance with fully supervised methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
46. EnNuSegNet: Enhancing Weakly Supervised Nucleus Segmentation through Feature Preservation and Edge Refinement.
- Author
-
Chen, Xiaohui, Ruan, Qisheng, Chen, Lingjun, Sheng, Guanqun, and Chen, Peng
- Subjects
CELL nuclei, SUPERVISED learning, IMAGE segmentation, IMAGE analysis, CELL size, CELL imaging
- Abstract
Nucleus segmentation plays a crucial role in tissue pathology image analysis. Despite significant progress in cell nucleus image segmentation algorithms based on fully supervised learning, the large number and small size of cell nuclei pose a considerable challenge in terms of the substantial workload required for label annotation. This difficulty in acquiring datasets has become exceptionally challenging. This paper proposes a novel weakly supervised nucleus segmentation method that only requires point annotations of the nuclei. The technique is an encoder–decoder network which enhances the weakly supervised nucleus segmentation performance (EnNuSegNet). Firstly, we introduce the Feature Preservation Module (FPM) in both encoder and decoder, which preserves more low-level features from the shallow layers of the network during the early stages of training while enhancing the network's expressive capability. Secondly, we incorporate a Scale-Aware Module (SAM) in the bottleneck part of the network to improve the model's perception of cell nuclei at different scales. Lastly, we propose a training strategy for nucleus edge regression (NER), which guides the model to optimize the segmented edges during training, effectively compensating for the loss of nucleus edge information and achieving higher-quality nucleus segmentation. Experimental results on two publicly available datasets demonstrate that our proposed method outperforms state-of-the-art approaches, with improvements of 2.02%, 1.41%, and 1.59% in terms of F1 score, Dice coefficient, and Average Jaccard Index (AJI), respectively. This indicates the effectiveness of our method in improving segmentation performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. Weakly supervised semantic segmentation with segments and neighborhood classifiers.
- Author
-
Xie, Xinlin, Zhao, Wenjing, Luo, Chenyan, and Cui, Lei
- Abstract
Semantic segmentation can provide basic semantic information for scene understanding, which has important theoretical research value and broad application prospects. Limited by the labeling cost and the scale of training data, weakly supervised semantic segmentation based on image-level labels has become a potential research issue. However, how to infer the location of image-level labels is a tough problem. Therefore, we propose a weakly-supervised semantic segmentation method with segments and neighborhood classifiers. First, we propose a scheme of segment generation based on the multiple of the number of image-level labels, which can provide high-precision boundary information with fewer regions. Second, to improve the precision of label location inference, we propose an inference method based on the most similar neighborhood granule. It can appropriately determine the number of segments contained in the inferred category label. Finally, we construct a decision table with features as conditional attribute and semantic label as decision attribute, and extract the discriminative features from attribute class reduction for neighborhood classifiers learning. Experiments evidence that our proposed algorithm can produce comparable and competitive results on widely-used MRSC and PASCAL VOC 2012 datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
48. Knowledge evolution learning: A cost-free weakly supervised semantic segmentation framework for high-resolution land cover classification.
- Author
-
Cui, Hao, Zhang, Guo, Chen, Yujia, Li, Xue, Hou, Shasha, Li, Haifeng, Ma, Xiaolong, Guan, Na, and Tang, Xuemin
- Subjects
*LAND cover, *CONVOLUTIONAL neural networks, *DEEP learning
- Abstract
Despite the success of deep learning in land cover classification, high-resolution (HR) land cover mapping remains challenging due to the time-consuming and labor-intensive process of collecting training samples. Many global land cover products (LCP) can reflect the low-level commonality (LLC) knowledge of land covers, such as basic shape and underlying semantic information. Therefore, we expect to use LCP as weakly supervised information to guide the semantic segmentation of HR images. We regard high-level specialty (HLS) knowledge as HR information unavailable in the LCP. We believe LLC knowledge can gradually evolve into HLS knowledge through self-active learning. Hence, we design a knowledge evolution learning strategy from LLC to HLS knowledge and correspondingly devise a knowledge evolution weakly supervised learning framework (KE-WESUP) based on LCP. KE-WESUP mainly includes three tasks: (1) Abstraction of LLC knowledge. KE-WESUP first adopts a training method based on superpixel to alleviate the inconsistency between LCP and HR images and directly learns the LLC knowledge from LCP according to the feature-fitting capacity of convolutional neural networks. (2) Automatic exploration of HLS knowledge. We propose a dynamic label optimization strategy to obtain a small number of point labels with high confidence and encourage the model to automatically mine HLS knowledge through the knowledge exploration mechanism, which prompts the model to adapt to complex HR scenes. (3) Dynamic interaction of LLC and HLS knowledge. We adopt the consistency regularization method to achieve further optimization and verification of LLC and HLS knowledge. To verify the effectiveness of KE-WESUP, we conduct experiments on USDA National Agriculture Imagery Program (1 m) and GaoFen-2 (1 m) data using WorldCover (10 m) as labels. The results show that KE-WESUP achieves outstanding results in both experiments, which has significant advantages over existing weakly supervised methods. Therefore, the proposed method has great potential in utilizing the prior information of LCP and is expected to become a new paradigm for large-scale HR land cover classification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
49. ISLE: A Framework for Image Level Semantic Segmentation Ensemble
- Author
-
Ostrowski, Erik, Shafique, Muhammad, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Bebis, George, editor, Ghiasi, Golnaz, editor, Fang, Yi, editor, Sharf, Andrei, editor, Dong, Yue, editor, Weaver, Chris, editor, Leo, Zhicheng, editor, LaViola Jr., Joseph J., editor, and Kohli, Luv, editor
- Published
- 2023
- Full Text
- View/download PDF
50. Cross-Modal Attention Mechanism for Weakly Supervised Video Anomaly Detection
- Author
-
Sun, Wenwen, Cao, Lin, Guo, Yanan, Du, Kangning, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Jia, Wei, editor, Kang, Wenxiong, editor, Pan, Zaiyu, editor, Ben, Xianye, editor, Bian, Zhengfu, editor, Yu, Shiqi, editor, He, Zhaofeng, editor, and Wang, Jun, editor
- Published
- 2023
- Full Text
- View/download PDF