135 results for "multiscale fusion"
Search Results
2. Progressive CNN-transformer alternating reconstruction network for hyperspectral image reconstruction—A case study in red tide detection
- Author
- Shen, Ying, Zhong, Ping, Zhan, Xiuxing, Chen, Xu, and Huang, Feng
- Published
- 2024
- Full Text
- View/download PDF
3. MCSSAFNet: A multi-scale state-space attention fusion network for RGBT tracking
- Author
- Zhao, Chunbo, Mo, Bo, Li, Dawei, Wang, Xinchun, Zhao, Jie, and Xu, Junwei
- Published
- 2025
- Full Text
- View/download PDF
4. DMANet: Dual-branch multiscale attention network for real-time semantic segmentation
- Author
- Dong, Yongsheng, Mao, Chongchong, Zheng, Lintao, and Wu, Qingtao
- Published
- 2025
- Full Text
- View/download PDF
5. Multiscale unsupervised network for deformable image registration.
- Author
- Wang, Yun, Chang, Wanru, Huang, Chongfei, and Kong, Dexing
- Subjects
- IMAGE segmentation, PIXELS, RECORDING & registration, ANNOTATIONS
- Abstract
BACKGROUND: Deformable image registration (DIR) plays an important part in many clinical tasks, and deep learning has made significant progress in DIR over the past few years. OBJECTIVE: To propose a fast multiscale unsupervised deformable image registration method (FMIRNet) for monomodal image registration. METHODS: We designed a multiscale fusion module that estimates large displacement fields by combining and refining the deformation fields of three scales. A spatial attention mechanism in the fusion module weights the displacement field pixel by pixel. In addition to mean squared error (MSE), we added a structural similarity (SSIM) measure during the training phase to enhance the structural consistency between the deformed images and the fixed images. RESULTS: Our registration method was evaluated on EchoNet, CHAOS and SLIVER, and showed clear performance improvements in terms of SSIM, NCC and NMI scores. Furthermore, we integrated FMIRNet into segmentation networks (FCN, UNet) to boost segmentation on a dataset with few manual annotations in our joint learning frameworks. The experimental results indicated that the joint segmentation methods improved Dice, HD and ASSD scores. CONCLUSIONS: Our proposed FMIRNet is effective for large deformation estimation, and its registration capability is generalizable and robust in joint registration and segmentation frameworks, generating reliable labels for training segmentation tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
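The training objective in result 5 combines MSE with an SSIM term. As a rough illustration (not the paper's implementation), a minimal NumPy sketch using a simplified global SSIM, rather than the windowed SSIM usually used in practice, might look like:

```python
import numpy as np

def combined_loss(moved, fixed, lam=0.5):
    """MSE plus a (1 - SSIM) penalty, sketching the FMIRNet-style objective.
    Uses a simplified global SSIM (no sliding window) for brevity;
    `lam` is an illustrative weighting, not a value from the paper."""
    mse = np.mean((moved - fixed) ** 2)
    mu_x, mu_y = moved.mean(), fixed.mean()
    var_x, var_y = moved.var(), fixed.var()
    cov = ((moved - mu_x) * (fixed - mu_y)).mean()
    c1, c2 = 0.01 ** 2, 0.03 ** 2  # standard SSIM stabilizing constants
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return mse + lam * (1.0 - ssim)
```

For identical images the loss is zero (MSE = 0, SSIM = 1), and it grows with both intensity error and structural mismatch.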
6. Background-Aware Cross-Attention Multiscale Fusion for Multispectral Object Detection.
- Author
- Guo, Runze, Guo, Xiaojun, Sun, Xiaoyong, Zhou, Peida, Sun, Bei, and Su, Shaojing
- Subjects
- INFRARED imaging, IMAGE sensors, DETECTORS, A priori
- Abstract
Limited by the imaging capabilities of individual sensors, single-modality approaches struggle to cope with faults and dynamic perturbations in detection. Effective multispectral object detection, which achieves better accuracy by fusing visual information from different modalities, has therefore attracted widespread attention. However, most existing methods adopt simple fusion mechanisms that fail to exploit the complementary information between modalities and lack the guidance of a priori knowledge. To address these issues, we propose a novel background-aware cross-attention multiscale fusion network (BA-CAMF Net) to achieve adaptive fusion of visible and infrared images. First, a background-aware module estimates light and contrast to guide the fusion. Then, a cross-attention multiscale fusion module enhances inter-modality complementary features and intra-modality intrinsic features. Finally, multiscale feature maps from the two modalities are fused according to background-aware weights. Experimental results on LLVIP, FLIR, and VEDAI indicate that the proposed BA-CAMF Net achieves higher detection accuracy than current state-of-the-art multispectral detectors. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. A small underwater object detection model with enhanced feature extraction and fusion
- Author
- Tao Li, Yijin Gang, Sumin Li, and Yizi Shang
- Subjects
- Underwater, Deep learning, Small object detection, Multiscale fusion, Medicine, Science
- Abstract
In the underwater domain, small object detection plays a crucial role in the protection, management, and monitoring of the environment and marine life. Advancements in deep learning have produced many efficient detection techniques. However, the complexity of the underwater environment, the limited information available from small objects, and constrained computational resources make small object detection challenging. To tackle these challenges, this paper presents an efficient deep convolutional network model. First, a CSP for small object and lightweight (CSPSL) module is introduced to enhance feature retention and preserve essential details. Next, a variable kernel convolution (VKConv) is proposed to dynamically adjust the convolution kernel size, enabling better multi-scale feature extraction. Finally, a spatial pyramid pooling for multi-scale (SPPFMS) method is presented to preserve the features of small objects more effectively. Ablation experiments on the UDD dataset demonstrate the effectiveness of the proposed methods. Comparative experiments on the UDD and DUO datasets show that the proposed model delivers the best balance of computational cost and detection accuracy, outperforming state-of-the-art methods in real-time underwater small object detection tasks.
- Published
- 2025
- Full Text
- View/download PDF
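The SPPFMS module in result 7 is paper-specific, but the underlying SPPF idea it extends (chaining same-size max pools and concatenating the intermediate results so one pass aggregates several receptive-field sizes) can be sketched in plain NumPy. `max_filter` here is a deliberately naive stand-in for an optimized pooling op:

```python
import numpy as np

def max_filter(x, k):
    """Naive stride-1 k x k max pool with edge padding (same output size)."""
    h, w = x.shape
    p = k // 2
    xp = np.pad(x, p, mode="edge")
    return np.array([[xp[i:i + k, j:j + k].max() for j in range(w)]
                     for i in range(h)])

def sppf(x, k=5):
    """SPPF-style pyramid: three chained k x k max pools, concatenated
    with the input along a new channel axis."""
    y1 = max_filter(x, k)    # 5x5 receptive field
    y2 = max_filter(y1, k)   # effectively 9x9
    y3 = max_filter(y2, k)   # effectively 13x13
    return np.stack([x, y1, y2, y3], axis=-1)
```

Chaining small pools is cheaper than pooling once with large kernels while covering the same scales, which is why SPPF-family blocks suit resource-constrained real-time detectors.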
8. A multimodal approach with firefly based CLAHE and multiscale fusion for enhancing underwater images
- Author
- Venkata Lalitha Narla, Gulivindala Suresh, Chanamallu Srinivasa Rao, Mohammed Al Awadh, and Nasim Hasan
- Subjects
- White balance, CLAHE, DCP, Multiscale fusion, Underwater imaging, Colour correction, Medicine, Science
- Abstract
With advances in technology, humans are exploring the underwater world more constructively than before. The appearance of an underwater object varies with depth, biological composition, temperature, ocean currents, and other factors, resulting in colour-distorted and hazy images with low contrast. To address these problems, the proposed approach first applies a white-balance algorithm to pre-process the original underwater image. A contrast-enhanced image is then obtained with the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm. In CLAHE, tile size and clip limit are the major parameters controlling enhanced image quality; hence, the Firefly algorithm is adopted to optimize CLAHE for image contrast. The Dark Channel Prior (DCP) algorithm is modified with guided-filter correction to obtain a sharpened version of the underwater image. A multiscale fusion strategy then fuses the CLAHE-enhanced and dehazed images. Finally, the restored image is treated with optimal CLAHE to improve the visibility of the enhanced underwater image. Experiments on the U45 and RUIE underwater image datasets yielded UIQM = 5.1384, UCIQE = 0.6895 and UIQM = 5.4875, UCIQE = 0.6953, respectively, demonstrating the superiority of the proposed approach.
- Published
- 2024
- Full Text
- View/download PDF
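The clip limit tuned in result 8 bounds how much any histogram bin may contribute before the excess is redistributed, which is what keeps CLAHE from over-amplifying noise. A minimal global (non-tiled) sketch of contrast-limited equalization, a simplification of true CLAHE, which works per tile with bilinear interpolation between tiles, is:

```python
import numpy as np

def clipped_hist_eq(img, clip_limit=0.01):
    """Global contrast-limited histogram equalization sketch.
    img: float array in [0, 1]. Clip the normalized histogram at
    `clip_limit`, redistribute the excess uniformly, then map each
    pixel through the resulting CDF."""
    hist, _ = np.histogram(img, bins=256, range=(0.0, 1.0))
    hist = hist.astype(float) / hist.sum()
    excess = np.maximum(hist - clip_limit, 0.0).sum()
    hist = np.minimum(hist, clip_limit) + excess / 256  # redistribute
    cdf = np.cumsum(hist)
    idx = np.clip((img * 255).astype(int), 0, 255)
    return cdf[idx]
```

A metaheuristic such as the Firefly algorithm can then search over the clip limit (and, in real CLAHE, the tile grid size) to maximize a no-reference quality score like UIQM.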
9. PSMFNet: Lightweight Partial Separation and Multiscale Fusion Network for Image Super-Resolution.
- Author
- Shuai Cao, Jianan Liang, Yongjun Cao, Jinglun Huang, and Zhishu Yang
- Subjects
- IMAGE reconstruction, FEATURE extraction, CONVOLUTIONAL neural networks, HIGH resolution imaging, IMAGE fusion
- Abstract
The employment of deep convolutional neural networks has recently contributed to significant progress in single image super-resolution (SISR) research. However, the high computational demands of most SR techniques hinder their applicability to edge devices, despite their satisfactory reconstruction performance. These methods commonly use standard convolutions, which increase the convolutional operation cost of the model. In this paper, a lightweight Partial Separation and Multiscale Fusion Network (PSMFNet) is proposed to alleviate this problem. Specifically, this paper introduces partial convolution (PConv), which reduces the redundant convolution operations throughout the model by separating some of the features of an image while retaining features useful for image reconstruction. Additionally, it is worth noting that the existing methods have not fully utilized the rich feature information, leading to information loss, which reduces the ability to learn feature representations. Inspired by self-attention, this paper develops a multiscale feature fusion block (MFFB), which can better utilize the non-local features of an image. MFFB can learn long-range dependencies from the spatial dimension and extract features from the channel dimension, thereby obtaining more comprehensive and rich feature information. As the role of the MFFB is to capture rich global features, this paper further introduces an efficient inverted residual block (EIRB) to supplement the local feature extraction ability of PSMFNet. A comprehensive analysis of the experimental results shows that PSMFNet maintains a better performance with fewer parameters than the state-of-the-art models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Multi-channel Capsule Network for Micro-expression Recognition with Multiscale Fusion.
- Author
- Xie, Zhihua, Fan, Jiawei, and Cheng, Shijia
- Subjects
- CONVOLUTIONAL neural networks, CAPSULE neural networks, OPTICAL flow, FACIAL expression, LIGHT filters
- Abstract
Facial micro-expression (ME), consisting of uncontrollable muscle movements in the face, is an important clue to people's real feelings. Due to its short duration and low intensity, salient feature representation learning is the main challenge for robust facial ME recognition. To acquire diverse and spatial-relation representations, this paper proposes a simple yet distinctive micro-expression recognition model based on multiscale convolutional fusion and a multi-channel capsule network (MCFMCN). First, the apex frame in a ME clip, located by computing the pixel difference between frames, is filtered by an optical flow transformation. Second, a multiscale fusion module is introduced to capture diverse ME-related details. Then, to further explore the subtle spatial relations between parts of ME faces, the multi-channel capsule network is designed to improve the feature representation of the traditional single-channel capsule network. Finally, the entire ME recognition model is trained and verified on three benchmarks (CASMEII, SAMM, and SMIC) using the associated standard evaluation protocols: unweighted average recall (UAR) and unweighted F1 score (UF1). ME recognition experiments indicate that our MCFMCN-based method improves UAR (from 75.79% to 83.58%) and UF1 (from 79.37% to 87.06%) compared with the traditional capsule network. Extensive experimental results show that the proposed ME recognition outperforms works based on previous single-channel capsule networks and other state-of-the-art CNN models, validating that combining multi-scale analysis with a multi-channel capsule network is feasible and effective for improving ME recognition performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. DAEiS-Net: Deep Aggregation Network with Edge Information Supplement for Tunnel Water Stain Segmentation.
- Author
- Wang, Yuliang, Huang, Kai, Zheng, Kai, and Liu, Shuliang
- Subjects
- WATER tunnels, URBAN transportation, DATA mining, FEATURE extraction, INFORMATION networks
- Abstract
Tunnel disease detection and maintenance are critical tasks in urban engineering, and are essential for the safety and stability of urban transportation systems. Water stain detection presents unique challenges due to its variable morphology and scale, which leads to insufficient multiscale contextual information extraction and boundary information loss in complex environments. To address these challenges, this paper proposes a method called Deep Aggregation Network with Edge Information Supplement (DAEiS-Net) for detecting tunnel water stains. The proposed method employs a classic encoder–decoder architecture. Specifically, in the encoder part, a Deep Aggregation Module (DAM) is introduced to enhance feature representation capabilities. Additionally, a Multiscale Cross-Attention Module (MCAM) is proposed to suppress noise in the shallow features and enhance the texture information of the high-level features. Moreover, an Edge Information Supplement Module (EISM) is designed to mitigate semantic gaps across different stages of feature extraction, improving the extraction of water stain edge information. Furthermore, a Sub-Pixel Module (SPM) is proposed to fuse features at various scales, enhancing edge feature representation. Finally, we introduce the Tunnel Water Stain Dataset (TWS), specifically designed for tunnel water stain segmentation. Experimental results on the TWS dataset demonstrate that DAEiS-Net achieves state-of-the-art performance in tunnel water stain segmentation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. Lightweight weed detection using re-parameterized partial convolution and collection-distribution feature fusion
- Author
- Yan, Kunyu, Zheng, Wenbin, and Yang, Yujie
- Published
- 2024
- Full Text
- View/download PDF
13. An improved multiscale fusion method for electric vehicle charging port recognition.
- Author
- 赵晓东, 刘瑞庆, 王向, and 溫士涛
- Subjects
- FEATURE extraction, IMAGE processing, ALGORITHMS, PYRAMIDS, SPEED, ELECTRIC vehicles
- Abstract
Copyright of Journal of Chongqing University of Technology (Natural Science) is the property of Chongqing University of Technology. This abstract may be abridged; users should refer to the original published version for the full abstract.
- Published
- 2024
- Full Text
- View/download PDF
14. Road Surface Defect Detection Algorithm Based on YOLOv8.
- Author
- Sun, Zhen, Zhu, Lingxi, Qin, Su, Yu, Yongbo, Ju, Ruiwen, and Li, Qingdang
- Subjects
- PAVEMENTS, SURFACE defects, ROAD maintenance, PAVEMENT maintenance & repair, ALGORITHMS
- Abstract
In maintaining roads and ensuring safety, promptly detecting and repairing pavement defects is crucial. However, conventional detection methods demand substantial manpower, incur high costs, and suffer from low efficiency. To enhance road maintenance efficiency and reduce costs, we propose an improved algorithm based on YOLOv8 with several key enhancements. First, we replace conventional convolutions in the network backbone with a module composed of space-to-depth layers and non-strided convolution layers (SPD-Conv), enhancing the recognition of small defects. Second, we replace the neck of YOLOv8 with the neck of the ASF-YOLO network to fully integrate spatial and scale features, improving multiscale feature extraction. Additionally, we introduce the FasterNet block from the FasterNet network into C2f to minimize redundant computation. Furthermore, we use Wise-IoU (WIoU) to optimize the model's loss function, which accounts for object quality factors more effectively, enabling adaptive learning adjustments for samples of varying quality. Evaluated on the RDD2022 road damage dataset, our model shows significant improvements over the baseline: a 2.8% gain in mAP and a detection speed of 43 FPS, proving highly effective for real-time road damage detection tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
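The SPD-Conv replacement described in result 14 swaps a strided (information-discarding) convolution for a lossless space-to-depth rearrangement followed by a non-strided convolution. The rearrangement itself, shown here as a sketch in NumPy on an (H, W, C) array, simply moves each 2x2 spatial neighbourhood into the channel dimension:

```python
import numpy as np

def space_to_depth(x, block=2):
    """Rearrange an (H, W, C) feature map into (H/block, W/block,
    block*block*C). Every pixel survives, so unlike a strided
    convolution no fine detail is discarded; a non-strided conv
    would then follow to mix the stacked channels."""
    h, w, c = x.shape
    assert h % block == 0 and w % block == 0
    return np.concatenate(
        [x[i::block, j::block, :] for i in range(block) for j in range(block)],
        axis=-1)
```

Because no pixels are dropped, small defects that a stride-2 convolution would thin out remain available to later layers.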
15. A video colorization method combining an attention mechanism and multi-scale feature fusion.
- Author
- 周柯明, 孔广黔, and 邓周灰
- Abstract
To address the difficulty existing video colorization methods have in guaranteeing both coloring quality and temporal consistency, this paper proposes AMVC-GAN, a video colorization method combining an attention mechanism with multi-scale feature fusion. First, it proposes a GAN-based video colorization network model, designing a multi-scale feature fusion module in the generator, with a recurrent temporal network as the main body, to obtain information at different temporal frequencies. Second, to effectively model the relationship between adjacent frames, it uses the features extracted at different temporal frequencies to strengthen inter-frame connections and thereby enhance the temporal consistency of colorization. Finally, to obtain more helpful information, it introduces an attention module in the upsampling part and optimizes training with PatchGAN to enhance the final colorization effect. Compared with state-of-the-art automatic video colorization methods on the DAVIS and VIDEVO datasets, AMVC-GAN ranks first on multiple indicators, with better temporal consistency and colorization. It effectively reduces temporal flicker while ensuring a more real and natural colorization effect. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Underwater image enhancement based on multiscale fusion generative adversarial network.
- Author
- Dai, Yating, Wang, Jianyu, Wang, Hao, and He, Xin
- Abstract
The underwater optical imaging environment presents unique challenges due to its complexity. This paper addresses the limitations of existing algorithms in handling underwater images captured under artificial light. We propose an underwater artificial-light optimization algorithm to preprocess images with uneven lighting, mitigating the effects of light distortion. Furthermore, we propose a novel underwater image enhancement algorithm based on a Multiscale Fusion Generative Adversarial Network, named UMSGAN, to address low contrast and color distortion. UMSGAN uses a generative adversarial network as the underlying framework: it first extracts information from the degraded image through three parallel branches, adding residual dense blocks in each branch to learn deeper features. The features extracted from the three branches are then fused, and the detailed information of the image is recovered by a reconstruction module (RM). Finally, multiple loss functions are linearly combined and the adversarial network is trained iteratively to obtain the enhanced underwater images. The algorithm is designed to accommodate various underwater scenes, providing both color correction and detail enhancement. We conducted a comprehensive qualitative and quantitative evaluation on a diverse underwater image dataset. The proposed algorithm exhibits superior performance in enhancing underwater image quality, achieving significant improvements in contrast, color accuracy, and detail preservation, with potential applications in domains such as underwater photography, marine exploration, and underwater surveillance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. Image inpainting via progressive decoder and gradient guidance.
- Author
- Hou, Shuang, Dong, Xiucheng, Yang, Chencheng, Wang, Chao, Guo, Hongda, and Zhang, Fan
- Subjects
- DEEP learning, INPAINTING, GENERATIVE adversarial networks
- Abstract
Very recently, with widespread research on deep learning, its achievements have become increasingly evident in image inpainting tasks. However, many existing multi-stage methods fail to effectively inpaint larger missing areas; their common drawback is that the result of each stage is easily misguided by wrong content generated in the previous stage. To solve this issue, this paper proposes a novel one-stage generative adversarial network based on a progressive decoding architecture and gradient guidance. First, gradient priors are extracted at the encoder stage and passed to the decoding branch, and a multiscale attention fusion group helps the network understand the image features. Second, multiple parallel decoding branches fill and refine the missing regions by passing the reconstructed priors top-down. This progressively guided repair avoids the detrimental effects of inappropriate priors, and the joint guidance of features and gradient priors helps the restoration results contain correct structure and rich details. The progressive guidance is achieved by our fusion strategy, combining reimage convolution and a designed channel coordinate attention to fuse and reweight the features of different branches. Finally, we use multiscale fusion to merge the feature maps at different scales reconstructed by the last decoding branch and map them to image space, further improving the semantic plausibility of the restoration results. Experiments on multiple datasets show that the qualitative and quantitative results of our computationally efficient model are competitive with those of state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Dual-Attention-Guided Multiscale Feature Aggregation Network for Remote Sensing Image Change Detection
- Author
- Hongjin Ren, Min Xia, Liguo Weng, Kai Hu, and Haifeng Lin
- Subjects
- Change detection, deep learning, multiscale fusion, remote sensing image, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
Remote sensing image change detection plays an important role in urban planning and environmental monitoring. However, the existing change detection algorithms have limited ability in feature extraction, feature relationship understanding, and capture of small target features and edge detail features, which leads to the loss of some edge detail information and small target features. To this end, a new dual-attention-guided multiscale feature aggregation network is proposed. In the encoding stage, the fully convolutional dual-branch structure is used to extract the semantic features of different scales, and then, the multiscale adjacent semantic information aggregation module is used to aggregate the adjacent semantic features at different scales, which can better capture and fuse the features of different scales, thereby improving the accuracy and robustness of change detection. In the decoding stage, the dual-attention fusion module is proposed to guide and fuse the features extracted from different scales along the spatial and channel directions and reduce the background noise interference. In addition, this article also proposes a three-branch feature fusion module and a global semantic information enhancement module to make the network better integrate global semantics and differential semantics and further integrate high-level semantic features. We also introduce an auxiliary classifier in the decoding stage to provide additional supervision signals and fuse the output of the three auxiliary classifiers with the output of the main decoder to further achieve multiscale feature fusion. The comparative experiments on three remote sensing datasets show that the proposed method is superior to the existing change detection methods.
- Published
- 2024
- Full Text
- View/download PDF
19. Multielement-Feature-Based Hierarchical Context Integration Network for Remote Sensing Image Segmentation
- Author
- Yunsong Yang, Genji Yuan, and Jinjiang Li
- Subjects
- Edge fusion, multiscale fusion, remote sensing, semantic segmentation, transformer, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
In the current remote sensing segmentation tasks, we identify issues of insufficient accuracy in segmenting objects and types with similar colors, along with a lack of adequate smoothness and coherence in edge segmentation. To address these challenges, we propose a network framework called the multielement-feature-based hierarchical context integration network (MHCINet). This framework achieves deep integration of global information, local information, multiscale information, and edge information. First, we introduce an Edge and Levels Grouped Aggregator to fuse shallow features, deep features, and edge information, enhancing foreground saliency. Finally, to better identify instances with similar colors during the feature reconstruction stage, we design a constant multivariate feature integrator to fully exploit multiscale information and global context, thereby improving the segmentation model's performance. Comprehensive experimental results on the Vaihingen and Potsdam datasets demonstrate that MHCINet outperforms existing state-of-the-art methods, achieving mean intersection over union of 84.8% and 87.6% on the Vaihingen and Potsdam datasets, respectively.
- Published
- 2024
- Full Text
- View/download PDF
20. M-FSDistill: A Feature Map Knowledge Distillation Algorithm for SAR Ship Detection
- Author
- Guohui Wang, Rui Qin, and Ying Xia
- Subjects
- Feature imitation, knowledge distillation, multiscale fusion, ship detection, synthetic aperture radar (SAR) image, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
Limited by the capacity and computing ability of platform payloads, researchers often face the challenge of balancing model lightness and performance in synthetic aperture radar (SAR) ship detection, especially for deep learning models. Nonetheless, traditional lightweight methods, such as reducing convolutional layers and pruning, can easily lead to missed detections. Researchers have introduced knowledge distillation algorithms to address the poor performance of lightweight models, but their improvement is limited by shortcomings such as background noise interference and improper distillation strategies, especially for small-ship detection in complex backgrounds. Aiming at the limited performance improvement of distillation algorithms and missed detections of small ships in distilled models, we propose a multiscale feature enhancement and foreground-scene feature distillation algorithm for SAR ship detection. Specifically, to improve distillation efficiency, a feature learning distillation module improves the quality of distilled knowledge by separating foreground and scene distillation. A ship feature representation enhancement module then uses feature map decoupling and an attention-based multiscale fusion algorithm to strengthen the student model's learning of small-ship features and reduce missed detections. To validate the proposed method, we conducted experiments on the SAR ship detection dataset (SSDD) and the high-resolution SAR images dataset (HRSID) and compared with several advanced methods. The results indicate that models using our algorithm achieve significant improvements in average precision (AP). For instance, on the SSDD dataset, RetinaNet, Cascade R-CNN, and RepPoints based on ResNet18 achieved AP scores of 95.5%, 95.4%, and 95.9%, respectively, surpassing the baseline by 3.9%, 3.1%, and 1.9%.
- Published
- 2024
- Full Text
- View/download PDF
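Result 20's foreground-scene separation weighs feature-imitation losses differently inside and outside ship regions so the sparse foreground is not drowned out by background pixels. A minimal sketch of that idea follows; the weighting constants, function name, and mask handling are illustrative assumptions, not the paper's:

```python
import numpy as np

def fg_bg_distill_loss(student, teacher, fg_mask, w_fg=1.0, w_bg=0.1):
    """Feature-imitation distillation with the squared error split into
    foreground (ship) and background (scene) regions. Each region is
    normalized by its own area, so a few foreground pixels carry as much
    weight as the large background before the w_fg/w_bg scaling."""
    diff = (student - teacher) ** 2
    fg_area = max(fg_mask.sum(), 1.0)
    bg_area = max((1.0 - fg_mask).sum(), 1.0)
    fg_loss = (diff * fg_mask).sum() / fg_area
    bg_loss = (diff * (1.0 - fg_mask)).sum() / bg_area
    return w_fg * fg_loss + w_bg * bg_loss
```

In practice `fg_mask` would be rasterized from ground-truth boxes at the feature-map resolution, and this term would be added to the student's ordinary detection loss.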
21. Fine-Grained Butterfly Classification Based on Multi-Feature Enhancement
- Author
- Hong Jin, Ke Sha, and Xiaolan Xie
- Subjects
- Butterfly classification, fine grained visual classification, multiscale fusion, multi feature enhancement, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Due to the challenges caused by the complexity and repetitiveness of discriminative features in fine-grained butterfly images, this paper proposes a fine-grained butterfly classification model based on multi-feature enhancement and multi-scale fusion to improve the accuracy of fine-grained butterfly classification. We use ResNet50 as the base network to extract features at different scales of the image. Next, the discriminative features are enhanced using spatial attention. The enhanced discriminative features are fed into the next stage of the base network, and the above operation is repeated to obtain the enhanced discriminative features at different scales. We then use the channel covariance attention module to eliminate the extra noise introduced during multi-feature enhancement. Finally, the final feature map is obtained by fusing the complementary information of different scale feature maps using the multi-scale fusion module. Ultimately, the method used in this paper achieved 96.896% accuracy on a dataset of 10 classes of fine-grained butterfly classification, and the comparison experiments proved that the method used in this paper outperforms the mainstream fine-grained visual classification methods.
- Published
- 2024
- Full Text
- View/download PDF
22. GVANet: A Grouped Multiview Aggregation Network for Remote Sensing Image Segmentation
- Author
- Yunsong Yang, Jinjiang Li, Zheng Chen, and Lu Ren
- Subjects
- Attention mechanism, multiscale fusion, remote sensing, semantic segmentation, transformer, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
In remote sensing image segmentation tasks, various challenges arise, including difficulties in recognizing objects due to differences in perspective, difficulty in distinguishing objects with similar colors, and challenges in segmentation caused by occlusions. To address these issues, we propose a method called the grouped multiview aggregation network (GVANet), which leverages multiview information for image analysis. This approach enables global multiview expansion and fine-grained cross-layer information interaction within the network. Within this network framework, to better utilize a wider range of multiview information to tackle challenges in remote sensing segmentation, we introduce the multiview feature aggregation block for extracting multiview information. Furthermore, to overcome the limitations of same-level shortcuts when dealing with multiview problems, we propose the channel group fusion block for cross-layer feature information interaction through a grouped fusion approach. Finally, to enhance the utilization of global features during the feature reconstruction phase, we introduce the aggregation-inhibition-activation block for feature selection and focus, which captures the key features for segmentation. Comprehensive experimental results on the Vaihingen and Potsdam datasets demonstrate that GVANet outperforms current state-of-the-art methods, achieving mIoU scores of 84.5% and 87.6%, respectively.
- Published
- 2024
- Full Text
- View/download PDF
23. Multiscale Hyperspectral Pansharpening Network Based on Dual Pyramid and Transformer
- Author
- Hengyou Wang, Jie Zhang, and Lian-Zhi Huo
- Subjects
- Dual Gaussian-Laplacian pyramid (DGLP), hyperspectral pansharpening, multiscale fusion, Transformer, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
Hyperspectral pansharpening fuses a high-spatial-resolution panchromatic image (PAN) with a low-spatial-resolution hyperspectral image (LR-HSI) to generate a high-resolution hyperspectral image (HR-HSI). However, most existing deep-learning-based pansharpening methods suffer from spectral distortion and insufficient spatial texture enhancement. In this work, we propose a novel multiscale pansharpening network based on the Dual Gaussian-Laplacian Pyramid (DGLP) and Transformer, named MDTP-Net. Specifically, the DGLP module obtains feature maps at multiple scales, effectively learning global spectral information and spatial detail texture. We then design a corresponding Transformer module for each scale and use the multihead attention mechanism to guide the extraction of spatial information from the LR-HSI and PAN images, enhancing the stability of pansharpening and improving the fusion of spectral and spatial information across feature spaces. In addition, feature extractors connect the DGLP and Transformer modules, making the spatial feature map smoother and richer in channel and texture features, and feature fusion and multiscale feature connection blocks tie the multiscale information together to generate HR-HSI images with more comprehensive spatial and spectral features. Finally, extensive experiments on three classic hyperspectral datasets demonstrate that our proposed MDTP-Net outperforms conventional methods and existing deep-learning-based methods.
- Published
- 2024
- Full Text
- View/download PDF
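The Dual Gaussian-Laplacian Pyramid underlying MDTP-Net decomposes an image into multilevel scales plus detail residuals that can be recombined losslessly. A minimal NumPy sketch of the idea (not the authors' implementation; 2×2 average pooling stands in for Gaussian blurring, and function names are illustrative):

```python
import numpy as np

def downsample(img):
    """Halve resolution by 2x2 average pooling (stand-in for blur + decimate)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def upsample(img, shape):
    """Nearest-neighbour upsample back to `shape`."""
    up = img.repeat(2, axis=0).repeat(2, axis=1)
    return up[:shape[0], :shape[1]]

def gaussian_laplacian_pyramid(img, levels=3):
    """Return (gaussian_levels, laplacian_levels); each Laplacian level holds
    the detail lost between one Gaussian level and the next."""
    gaussians = [img.astype(float)]
    for _ in range(levels - 1):
        gaussians.append(downsample(gaussians[-1]))
    laplacians = [g - upsample(g_next, g.shape)
                  for g, g_next in zip(gaussians[:-1], gaussians[1:])]
    laplacians.append(gaussians[-1])  # coarsest level kept as-is
    return gaussians, laplacians

def reconstruct(laplacians):
    """Invert the pyramid: start coarse and add detail back level by level."""
    img = laplacians[-1]
    for lap in reversed(laplacians[:-1]):
        img = lap + upsample(img, lap.shape)
    return img
```

With this construction the reconstruction is exact, which is what lets a network process the scales separately without losing information.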
24. Surface defect detection method for aluminum profiles based on improved YOLOv5.
- Author
-
席凌飞, 伊力哈木·亚尔买买提, and 刘雅洁
- Abstract
Copyright of Journal of Guangxi Normal University - Natural Science Edition is the property of Gai Kan Bian Wei Hui and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
25. Rapid screening for autoimmune diseases using Fourier transform infrared spectroscopy and deep learning algorithms.
- Author
-
Xue Wu, Wei Shuai, Chen Chen, Xiaomei Chen, Cainan Luo, Yi Chen, Yamei Shi, Zhengfang Li, Xiaoyi Lv, Cheng Chen, Xinyan Meng, Xin Lei, and Lijun Wu
- Subjects
MACHINE learning ,FOURIER transform infrared spectroscopy ,DEEP learning ,MEDICAL screening ,AUTOIMMUNE diseases ,RHEUMATOID arthritis - Abstract
Introduction: Ankylosing spondylitis (AS), rheumatoid arthritis (RA), and osteoarthritis (OA) are three rheumatic immune diseases with many common characteristics. If left untreated, they can lead to joint destruction and functional limitation, and in severe cases lifelong disability and even death. Studies have shown that early diagnosis and treatment are key to improving patient outcomes. Therefore, establishing a rapid and accurate diagnostic method is of great clinical significance for realizing early diagnosis and improving patient prognosis. Methods: This study used Fourier transform infrared (FTIR) spectroscopy combined with deep learning models to achieve non-invasive, rapid, and accurate differentiation of AS, RA, OA, and a healthy control group. In the experiment, 320 serum samples were collected, 80 in each group, and AlexNet, ResNet, MSCNN, and MSResNet diagnostic models were established using machine learning algorithms. Results: The measured spectral wavenumber range of the four groups of FTIR spectra was 700-4000 cm-1. Serum spectral characteristic peaks were mainly at 1641 cm-1 (amide I), 1542 cm-1 (amide II), 3280 cm-1 (amide A), 1420 cm-1 (proline and tryptophan), 1245 cm-1 (amide III), 1078 cm-1 (carbohydrate region), and 2940 cm-1 (mainly fatty acids and cholesterol). The multiscale MSResNet classification model combines convolution modules of different scales to extract features at different scales and uses residual blocks to mitigate network degradation, reduce the interference of spectral measurement noise, and enhance the generalization ability of the network model. Comparing the experimental results of the other three models, AlexNet, ResNet, and MSCNN, the MSResNet model had the best diagnostic performance, with an accuracy of 0.87. Conclusion: The results prove the feasibility of serum FTIR spectroscopy combined with a deep learning algorithm to distinguish AS, RA, OA, and healthy controls, and the approach can serve as an effective auxiliary diagnostic method for these rheumatic immune diseases. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
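The multiscale convolution-plus-residual design of MSResNet can be illustrated on a 1-D spectrum. A hedged NumPy sketch, not the paper's model: box filters of several widths stand in for learned convolution kernels, averaging stands in for learned fusion, and the names are illustrative:

```python
import numpy as np

def conv1d_same(x, kernel):
    """'same'-size 1-D convolution of a spectrum with a filter kernel."""
    return np.convolve(x, kernel, mode="same")

def multiscale_res_block(x, widths=(3, 7, 15)):
    """Filter the spectrum with kernels of several widths (different receptive
    fields), average the branches, and add the input back via the residual
    shortcut that counteracts network degradation."""
    branches = [conv1d_same(x, np.ones(w) / w) for w in widths]
    fused = np.mean(branches, axis=0)
    return x + fused
```

The wide kernels see broad absorption bands while the narrow ones keep sharp peaks, and the shortcut preserves the raw spectrum alongside the filtered view.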
26. Image inpainting via progressive decoder and gradient guidance
- Author
-
Shuang Hou, Xiucheng Dong, Chencheng Yang, Chao Wang, Hongda Guo, and Fan Zhang
- Subjects
Multiscale attention fusion group ,Channel coordinate attention ,Multiscale fusion ,Image inpainting ,Electronic computers. Computer science ,QA75.5-76.95 ,Information technology ,T58.5-58.64 - Abstract
Abstract Very recently, with the widespread research of deep learning, its achievements have become increasingly evident in image inpainting tasks. However, many existing multi-stage methods fail to effectively inpaint larger missing areas; their common drawback is that the result of each stage is easily misguided by wrong content generated in the previous stage. To solve this issue, this paper proposes a novel one-stage generative adversarial network based on a progressive decoding architecture and gradient guidance. Firstly, gradient priors are extracted at the encoder stage and passed to the decoding branch, and a multiscale attention fusion group is used to help the network understand the image features. Secondly, multiple parallel decoding branches fill and refine the missing regions by passing the reconstructed priors top-down. This progressively guided repair avoids the detrimental effects of inappropriate priors, and the joint guidance of features and gradient priors helps the restoration results contain the correct structure and rich details. The progressive guidance is achieved by our fusion strategy, which combines reimage convolution and channel coordinate attention to fuse and reweight the features of different branches. Finally, we use multiscale fusion to merge the feature maps at different scales reconstructed by the last decoding branch and map them to the image space, which further improves the semantic plausibility of the restoration results. Experiments on multiple datasets show that the qualitative and quantitative results of our computationally efficient model are competitive with those of state-of-the-art methods.
- Published
- 2023
- Full Text
- View/download PDF
27. Crack Detection of Concrete Pavement Based on Attention Mechanism and Lightweight DilatedConvolution
- Author
-
QU Zhong, WANG Caiyun
- Subjects
crack detection ,attention mechanism ,dilated convolution ,multiscale fusion ,fully convolutional network ,deep supervision network ,Computer software ,QA76.75-76.765 ,Technology (General) ,T1-995 - Abstract
Cracks in concrete pavement affect the safety, applicability, and durability of the structure, and crack detection is a challenging research hotspot. This paper proposes a crack detection model composed of an improved fully convolutional network and a deep supervision network, which uses an improved VGG-16 as the backbone network. Firstly, the low-level convolutional feature aggregation is fused back into the backbone network through the spatial attention mechanism. Secondly, the middle- and high-level convolutional features are fused through the lightweight dilated convolution fusion module for multi-feature fusion to obtain clear edges and high-resolution feature maps, and all side feature maps are added to produce the final prediction map. Finally, the deep supervision network provides direct supervision for the detection results of each stage. In this paper, the focal loss function is selected as the evaluation function, and the trained network model can efficiently identify crack locations from the input original image under various conditions such as uneven illumination and complex backgrounds. To verify the effectiveness and robustness of the proposed method, it is compared with six methods on three datasets, DeepCrack, CFD, and Crack500, and the results show that it has excellent performance, with an F-score of 87.12%.
- Published
- 2023
- Full Text
- View/download PDF
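The dilated convolutions in such a fusion module enlarge the receptive field without adding parameters: a kernel of size k with dilation d covers d·(k−1)+1 samples. A small NumPy illustration of this (1-D, 'valid' mode, not the paper's implementation):

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """Valid dilated 1-D convolution: kernel taps are spaced `dilation` apart,
    so the receptive field grows to dilation*(k-1)+1 with the same k weights."""
    k = len(kernel)
    span = dilation * (k - 1) + 1          # receptive field of this layer
    out = np.array([
        sum(kernel[j] * x[i + j * dilation] for j in range(k))
        for i in range(len(x) - span + 1)
    ])
    return out, span
```

Stacking layers with increasing dilation is what lets a lightweight module see wide context around thin crack structures.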
28. MISNet: Multiscale Cross-Layer Interactive and Similarity Refinement Network for Scene Parsing of Aerial Images
- Author
-
Wujie Zhou, Xiaomin Fan, Lu Yu, and Jingsheng Lei
- Subjects
Cross-layer interaction ,feature similarity ,high-resolution aerial images (HRAIs) ,multiscale fusion ,scene parsing ,Ocean engineering ,TC1501-1800 ,Geophysics. Cosmic physics ,QC801-809 - Abstract
Although progress has been made in multisource data scene parsing of natural scene images, extracting complex backgrounds from aerial images of various types and presenting the image at different scales remain challenging. Various factors in high-resolution aerial images (HRAIs), such as imaging blur, background clutter, object shadow, and high resolution, substantially reduce the integrity and accuracy of object segmentation. By applying multisource data fusion, as in scene parsing of natural scene images, we can solve the aforementioned problems through the integration of auxiliary data into HRAIs. To this end, we propose a multiscale cross-layer interactive and similarity refinement network (MISNet) for scene parsing of HRAIs. First, in a feature fusion optimization module, we extract, filter, and optimize multisource features and further guide and optimize the features using a feature guidance module. Second, a multiscale context aggregation module increases the receptive field, captures semantic information, and extracts rich multiscale background features. Third, a dense decoding module fuses the global guidance information and high-level fused features. We also propose a joint learning method based on feature similarity and a joint learning module to obtain deep multilevel information, enhance feature generation, and fuse multiscale and global features to enhance network representation for accurate scene parsing of HRAIs. Comprehensive experiments on two benchmark HRAIs datasets indicate that our proposed MISNet is qualitatively and quantitatively superior to similar state-of-the-art models.
- Published
- 2023
- Full Text
- View/download PDF
29. Multiscale Feature Fusion for Hyperspectral Marine Oil Spill Image Segmentation.
- Author
-
Chen, Guorong, Huang, Jiaming, Wen, Tingting, Du, Chongling, Lin, Yuting, and Xiao, Yanbing
- Subjects
OIL spills ,LIGHT absorption ,MULTISCALE modeling ,OPTICAL images ,SURFACE area ,IMAGE segmentation - Abstract
Oil spills have always been a threat to the marine ecological environment; thus, it is important to identify and divide oil spill areas on the ocean surface into segments after an oil spill accident occurs to protect the marine ecological environment. However, oil spill area segmentation using ordinary optical images is greatly interfered with by the absorption of light by the deep sea and the distribution of algal organisms on the ocean surface, and it is difficult to improve segmentation accuracy. To address the above problems, a hyperspectral ocean oil spill image segmentation model with multiscale feature fusion (MFFHOSS-Net) is proposed. Specifically, the oil spill segmentation dataset was created using hyperspectral image data from NASA for the Gulf of Mexico oil spill, small-size images after the waveband filtering of the hyperspectral images were generated and the oil spill images were annotated. The model makes full use of having different layers with different characteristics by fusing feature maps of different scales. In addition, an attention mechanism was used to effectively fuse these features to improve the oil spill region segmentation accuracy. A case study, ablation experiments and model evaluation were also carried out in this work. Compared with other models, our proposed method achieved good results according to various evaluation metrics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
30. BiRPN-YOLOvX: A weighted bidirectional recursive feature pyramid algorithm for lung nodule detection.
- Author
-
Han, Liying, Li, Fugai, Yu, Hengyong, Xia, Kewen, Xin, Qiyuan, and Zou, Xiaoyu
- Subjects
- *
PULMONARY nodules , *LUNGS , *LUNG cancer , *PYRAMIDS , *EARLY detection of cancer , *FEATURE extraction - Abstract
BACKGROUND: Lung cancer has the second highest cancer mortality rate in the world today. Although lung cancer screening using CT images is a common way for early lung cancer detection, accurately detecting lung nodules remains a challenging issue in clinical practice. OBJECTIVE: This study aims to develop a new weighted bidirectional recursive pyramid algorithm to address the problems of the small size of lung nodules, the large proportion of background region, and complex lung structures in lung nodule detection on CT images. METHODS: First, the weighted bidirectional recursive feature pyramid network (BiRPN) is proposed, which can increase the ability of the network model to extract feature information and achieve multi-scale information fusion. Second, a CBAM_CSPDarknet53 structure is developed to incorporate an attention mechanism as a feature extraction module, which can aggregate both spatial information and channel information of the feature map. Third, the weighted BiRPN and CBAM_CSPDarknet53 are applied to the YOLOvX model for lung nodule detection experiments, named BiRPN-YOLOvX, where YOLOvX represents different versions of YOLO. To verify the effectiveness of the weighted BiRPN and CBAM_CSPDarknet53 modules, they are fused with different models of YOLOv3, YOLOv4 and YOLOv5, and extensive experiments are carried out using the publicly available lung nodule datasets LUNA16 and LIDC-IDRI. The training set of LUNA16 contains 949 images, and the validation and testing sets each contain 118 images. There are 1987, 248 and 248 images in LIDC-IDRI's training, validation and testing sets, respectively. RESULTS: The sensitivity of lung nodule detection using BiRPN-YOLOv5 reaches 98.7% on LUNA16 and 96.2% on LIDC-IDRI, respectively. CONCLUSION: This study demonstrates that the proposed new method has the potential to help improve the sensitivity of lung nodule detection in future clinical practice. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
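Weighted bidirectional feature pyramids typically follow the BiFPN idea of "fast normalized fusion": each input feature map gets a non-negative learnable weight that is normalized before summing. A minimal NumPy sketch under that assumption (the weights would be learned in practice):

```python
import numpy as np

def weighted_feature_fusion(features, weights, eps=1e-4):
    """Fuse same-shape feature maps with non-negative normalized weights,
    in the style of BiFPN's 'fast normalized fusion'."""
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # ReLU keeps weights >= 0
    w = w / (w.sum() + eps)                                # normalize without softmax
    return sum(wi * f for wi, f in zip(w, features))
```

Because the weights are normalized, the fusion stays stable regardless of how many pyramid levels feed into a node, which matters for small-object (nodule) features.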
31. A Hybrid Fusion-Based Algorithm for Underwater Image Enhancement Using Fog Aware Density Evaluator and Mean Saturation
- Author
-
Paulson, Rosalind Margaret, Gopalakrishnan, Sruthi, Mahendiran, Sruthi, Srambical, Varghese Paul, Gopan, Neethu Radha, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Khanna, Ashish, editor, Gupta, Deepak, editor, Bhattacharyya, Siddhartha, editor, Hassanien, Aboul Ella, editor, Anand, Sameer, editor, and Jaiswal, Ajay, editor
- Published
- 2022
- Full Text
- View/download PDF
32. FSRSI: New Deep Learning-Based Approach for Super-Resolution of Multispectral Satellite Images.
- Author
-
Soufi, Omar and Belouadha, Fatima Zahra
- Subjects
DEEP learning ,REMOTE-sensing images ,CONVOLUTIONAL neural networks - Abstract
Open access in space remote sensing has made satellite imagery easy to obtain; however, high-resolution imagery is available only to those who master space technology. Thus, this paper presents a new approach for improving the quality of Sentinel-2 satellite images by super-resolution, exploiting deep learning techniques. In this context, this work proposes a generic solution that improves the spatial resolution from 10 m to 2.5 m (scaling factor 4), taking into account the constraints of data volume and inter-band dependence imposed by the specificities of satellite images. This study proposes the FSRSI model, which exploits the potential of deep convolutional neural networks (CNNs) and integrates new state-of-the-art concepts, including Network in Network, end-to-end learning, multi-scale fusion, neural network optimization, acceleration, and filter transfer. The model is further improved by an efficient mosaicking technique for the super-resolution of satellite images, in addition to the consideration of inter-spectral dependence combined with an efficient choice of training data. This approach shows better performance than previously reported in the field of spatial imagery. The experimental results showed that the adopted algorithm restores the details of satellite images quickly and efficiently, outperforming several state-of-the-art methods. These performances were observed following a benchmark against several neural networks and application experiments on a carefully constructed dataset. The proposed solution showed promising results in terms of visual and perceptual quality with better inference speed. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
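Scaling-factor-4 super-resolution networks commonly finish with a sub-pixel (depth-to-space) layer that rearranges channels into spatial detail. A NumPy sketch of that rearrangement, shown for a single image in (C·r², H, W) layout; treating it as FSRSI's upsampling stage is an assumption:

```python
import numpy as np

def pixel_shuffle(x, r):
    """Depth-to-space: (C*r*r, H, W) -> (C, H*r, W*r), the sub-pixel layer
    commonly used to realize an r-times super-resolution upscale."""
    c_rr, h, w = x.shape
    c = c_rr // (r * r)
    x = x.reshape(c, r, r, h, w)       # split the channel axis into an r x r grid
    x = x.transpose(0, 3, 1, 4, 2)     # interleave: (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)
```

The network thus predicts r² low-resolution channel planes and the shuffle deterministically interleaves them, which is cheaper than learned transposed convolution and avoids checkerboard artifacts.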
33. Improving heat transfer prediction across catalytic SiC interface through a multiscale framework leveraging machine learning and data fusion techniques.
- Author
-
Ye, Zhifan, Zhao, Jin, Xing, Haoyun, Yao, Guice, Xu, Dichu, and Wen, Dongsheng
- Subjects
- *
CHEMICAL kinetics , *HYPERSONIC flow , *COMPUTATIONAL fluid dynamics , *RADIAL basis functions , *EXOTHERMIC reactions , *MAXIMUM entropy method - Abstract
• A multiscale fusion approach for surface catalytic heat prediction is proposed. • The heat flux error using the multiscale fusion method is 0.30% compared with the experimental result. • A database of surface catalytic recombination for silicon carbide material is designed. Accurate aerothermal prediction under hypersonic flow conditions is crucial for thermal protection materials design and engineering. The surface catalytic effect, where dissociated atoms recombine into their molecular forms at the air-solid interface, is an exothermic reaction and plays an important role in high-enthalpy aerodynamic environment prediction. Taking SiC, a widely recognized high-temperature thermal protection material, as an example, a multiscale fusion approach for aerothermal prediction is proposed. The approach utilizes the reactive molecular dynamics method to construct an interface catalysis model, and Bayesian maximum entropy to integrate experimental and simulation data into an optimized database. Subsequently, using the radial basis function neural network algorithm, a machine learning-based reaction kinetics model with precise analysis of surface catalysis is trained. The obtained catalytic recombination efficiency is used as a boundary input for computational fluid dynamics simulation, enabling rapid and accurate prediction of the hypersonic thermal environment. The result shows that the surface heat flux predicted by this novel approach is consistent with benchmark studies, but comes with significantly reduced computation time. The stagnation heat flux predictions based on the assumptions of a fully catalytic wall, a finite-rate catalytic wall (from the proposed multiscale fusion method), and a non-catalytic wall are 8.65×10⁶ W/m², 7.51×10⁶ W/m², and 4.02×10⁶ W/m², respectively. Comparing these with the wind tunnel benchmark value of 7.48×10⁶ W/m², the error from the proposed multiscale fusion method is 0.30%, indicating a significant enhancement in prediction accuracy through the proposed upscaling method. The new approach not only reveals complex exothermic reaction processes at the interface, but also enhances prediction accuracy at much higher computational efficiency, providing an alternative for multiscale modeling of complex flow and heat transfer at the interface. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
34. Mask-wearing detection method for public places based on YOLOv3.
- Author
-
魏明军, 周太宇, 纪占林, and 张鑫楠
- Subjects
PUBLIC spaces ,PYRAMIDS ,ALGORITHMS ,SPEED ,SURGICAL equipment - Abstract
Copyright of Journal of Guangxi Normal University - Natural Science Edition is the property of Gai Kan Bian Wei Hui and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2023
- Full Text
- View/download PDF
35. DEF-Net: A Dual-Encoder Fusion Network for Fundus Retinal Vessel Segmentation.
- Author
-
Li, Jianyong, Gao, Ge, Yang, Lei, Liu, Yanhong, and Yu, Hongnian
- Subjects
RETINAL blood vessels ,EYE diseases ,VIDEO coding - Abstract
The deterioration of numerous eye diseases is highly related to the fundus retinal structures, so automatic retinal vessel segmentation serves as an essential stage for efficient detection of eye-related lesions in clinical practice. Segmentation methods based on encode-decode structures exhibit great potential in retinal vessel segmentation tasks, but have limited feature representation ability. In addition, they do not effectively consider information at multiple scales when performing feature fusion, resulting in low fusion efficiency. In this paper, a new model, named DEF-Net, is designed to segment retinal vessels automatically; it consists of a dual-encoder unit and a decoder unit. Fusing a recurrent network and a convolutional network, the proposed dual-encoder unit builds a convolutional branch to extract detailed features and a recurrent branch to accumulate contextual features, obtaining richer features than a single convolutional network structure. Furthermore, to exploit useful information at multiple scales, a multi-scale fusion block is designed to improve feature fusion efficiency. Extensive experiments have been undertaken to demonstrate the segmentation performance of our proposed DEF-Net. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
36. RMP-Net: A structural reparameterization and subpixel super-resolution-based marine scene segmentation network
- Author
-
Jiongjiang Chen, Jialin Tang, Shounan Lin, Wanxin Liang, Binghua Su, Jinghui Yan, Dujuan Zhou, Lili Wang, Yunting Lai, and Benxi Yang
- Subjects
submarine exploration ,underwater scene ,RMP-Net ,structural re-parameterization ,multiscale fusion ,Science ,General. Including nature conservation, geographical distribution ,QH1-199.5 - Abstract
Ocean exploration has always been an important strategic direction for the joint efforts of all mankind. Many countries today are developing their own autonomous underwater explorers to better explore the seabed. Vision, as the core technology of autonomous underwater explorers, has a great impact on the efficiency of exploration. Unlike traditional tasks, the lack of ambient light on the seabed places higher demands on the visual system. In addition, the complex terrain on the seabed and various creatures with different shapes and colors also make exploration tasks more difficult. To effectively solve the above problems, we modified the structure of traditional models and proposed an algorithm that applies super-resolution fusion of enhanced extracted features to perform semantic segmentation of seabed scenes. By using a structurally reparameterized backbone network to better extract target features in complex environments, and using subpixel super-resolution to combine multiscale semantic feature information, we achieve superior ocean scene segmentation performance. In this study, multiclass segmentation and two-class segmentation tests were performed on the public datasets SUIM and DeepFish, respectively. The test results show that the mIoU and mPA of our proposed method on SUIM reach 84.52% and 92.33%, respectively, and the mIoU and mPA on DeepFish reach 95.26% and 97.38%, respectively; the proposed model achieves state-of-the-art performance compared with existing methods. The proposed model and code are available on GitHub.
- Published
- 2022
- Full Text
- View/download PDF
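Structural reparameterization, as used in RMP-Net's backbone, trains parallel branches but folds them into a single kernel for inference. A minimal NumPy sketch of the classic 3×3 + 1×1 merge (single-channel, 'valid' convolution; helper names are illustrative, not the paper's code):

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Plain single-channel 'valid' 2-D convolution (correlation form)."""
    kh, kw = kernel.shape
    h, w = img.shape
    return np.array([[np.sum(img[i:i + kh, j:j + kw] * kernel)
                      for j in range(w - kw + 1)]
                     for i in range(h - kh + 1)])

def reparameterize(k3, k1):
    """Fold a parallel 1x1 branch into the 3x3 branch: pad the 1x1 kernel to
    3x3 (centre tap only) and add, so one conv replaces two at inference."""
    merged = k3.copy()
    merged[1, 1] += k1[0, 0]
    return merged
```

The merged kernel produces exactly the same output as summing the two branches, so the multi-branch capacity is free at inference time.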
37. Multiscale Superpixel Guided Discriminative Forest for Hyperspectral Anomaly Detection.
- Author
-
Cheng, Xi, Zhang, Min, Lin, Sheng, Zhou, Kexue, Wang, Liang, and Wang, Hai
- Subjects
- *
ANOMALY detection (Computer security) , *INTRUSION detection systems (Computer security) , *FALSE alarms , *PIXELS - Abstract
Recently, isolation forest (IF) methods have received increasing attention for their promising performance in hyperspectral anomaly detection (HAD). However, limited in their ability to exploit spatial-spectral information, existing IF-based methods suffer from many false alarms and disappointing performance in detecting local anomalies. To overcome these two problems, a multiscale superpixel guided discriminative forest method is proposed for HAD. First, multiscale superpixel segmentation is employed to generate homogeneous regions, which effectively extracts spatial information to guide anomaly detection for the discriminative forest in local areas. Then, a novel discriminative forest (DF) model with a gain split criterion is designed, which enhances the sensitivity of the DF to local anomalies by utilizing multi-dimensional spectral bands for node division; meanwhile, an acceptable range of hyperplane attribute values is introduced to capture any unseen anomaly pixels that are out of range in the evaluation stage. Finally, to address the high false alarm rate of existing IF-based algorithms, multiscale fusion with guided filtering is put forward to refine the initial detection results from the DF. In addition, extensive experimental results on four real hyperspectral datasets demonstrate the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
38. MDA-Net: Multiscale dual attention-based network for breast lesion segmentation using ultrasound images.
- Author
-
Iqbal, Ahmed and Sharif, Muhammad
- Subjects
BREAST ,ULTRASONIC imaging ,SPECKLE interference ,IMAGE segmentation ,BREAST imaging ,MAGNETIC resonance mammography ,BREAST cancer ,CANCER treatment - Abstract
Accurate breast lesion segmentation is a great help in the initial stage of breast cancer treatment planning. Ultrasound is considered the safest and cheapest method for the breast screening process. However, ultrasound images inherently contain speckle noise, unclear boundaries, and complex shapes, making them more challenging for automatic segmentation methods. This work proposes a multiscale dual attention-based network (MDA-Net) for segmentation of breast lesion images. A multiscale fusion (MF) block is introduced that addresses the classical fixed-receptive-field issue, helps to extract more semantic features, and aims to achieve greater feature diversity. A dual attention (dA) block is also proposed, a hybrid of channel-based attention (cA) and lesion attention (lA) blocks, which improves the feature representation capability and adaptively learns a discriminative representation of high-level features. As a result, the combination of the two attention blocks helps the proposed network concentrate on a more relevant field of view of the targets. MDA-Net is extensively tested on both self-collected private datasets and the two public UDIAT and BUSIS datasets. Furthermore, our method is also evaluated on MRI datasets to observe its broad applicability in a different imaging modality. MDA-Net achieved DSC of 87.68%, 91.85%, 90.41%, and 83.47% on the UDIAT, BUSIS, Private, and RIDER breast MRI datasets (p-value < 0.05, paired t-test). Our MDA-Net implementation code and pretrained models are released at GitHub: https://github.com/ahmedeqbal/MDA-Net. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
39. Contour detection based on binocular parallax perception mechanism.
- Author
-
Wei, Chujie, Fang, Tao, Fan, Yingle, Wu, Wei, Meng, Ming, and She, Qingshan
- Abstract
We propose a new method of image contour detection, considering the close relationship between binocular parallax in the biological vision system and the hierarchical transmission of visual channel information flow. Firstly, we present a dynamic adjustment mechanism for the connection weights of different opponent cells in a color channel to obtain the initial contour response map. Subsequently, we introduce the binocular parallax energy model to separate the image feature information and obtain the responses of position and phase differences; we then construct end-stopped cells with different phases to extract the primary contour of the image. At the same time, we propose a multi-scale receptive field fusion strategy to suppress local textures of varying intensity. Finally, we use a feedforward mechanism across the hierarchy to refine the textured background and improve the primary contour contrast to obtain the final contour response. Our image processing method based on binocular parallax compensation can provide a new idea for subsequent studies on the higher visual cortex's image understanding and visual cognition. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
40. A Multiscale Fusion Lightweight Image-Splicing Tamper-Detection Model.
- Author
-
Zhao, Dan and Tian, Xuedong
- Subjects
DEEP learning ,COMPUTER crimes ,PHOTOGRAPHIC editing ,PYRAMIDS ,FORGERY - Abstract
The easy availability and usability of photo-editing tools have increased the number of forgery attacks, primarily splicing attacks, thereby increasing cybercrimes. Because existing deep learning-based image-splicing tamper-detection algorithms have high model complexity and weak robustness, a multiscale fusion lightweight model for image-splicing tamper detection is proposed. To address these problems, MobileNetV2 is improved: the structural block of the classification part of the original network is removed, the stride of the sixth large structural block is changed to 1 with dilated convolution used instead of downsampling, the features extracted from the second and third large structural blocks are downsampled with max pooling, and the constraint on the backbone network is then increased by skip connections. Combined with the pyramid pooling module, the acquired feature layers are divided into regions of different sizes for average pooling, and then all feature layers are fused. The experimental results show that the model has few parameters and a small computational cost, achieving 91.0% and 96.4% precision on CASIA and COLUMB, respectively, and 83.2% and 88.1% F-measure on CASIA and COLUMB, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
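The pyramid pooling module divides a feature map into grids of several sizes, average-pools each region, and fuses the results with the input. A minimal single-channel NumPy sketch (nearest-neighbour upsampling and stacking stand in for the learned 1×1 convolutions; names are illustrative):

```python
import numpy as np

def adaptive_avg_pool(x, bins):
    """Average-pool an (H, W) map into a bins x bins grid."""
    h, w = x.shape
    hs = np.linspace(0, h, bins + 1).astype(int)
    ws = np.linspace(0, w, bins + 1).astype(int)
    return np.array([[x[hs[i]:hs[i + 1], ws[j]:ws[j + 1]].mean()
                      for j in range(bins)] for i in range(bins)])

def pyramid_pooling(x, bin_sizes=(1, 2, 3, 6)):
    """Pool at several grid sizes, upsample each pooled map back to the input
    size (nearest neighbour), and stack everything along a new channel axis."""
    h, w = x.shape
    maps = [x]
    for b in bin_sizes:
        pooled = adaptive_avg_pool(x, b)
        up = pooled.repeat(int(np.ceil(h / b)), axis=0)[:h] \
                   .repeat(int(np.ceil(w / b)), axis=1)[:w]
        maps.append(up)
    return np.stack(maps)
```

The 1×1 bin contributes a global context summary while the finer grids keep coarse spatial layout, which is what makes the fused representation robust to region size.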
41. MMIF-INet: Multimodal medical image fusion by invertible network.
- Author
-
He, Dan, Li, Weisheng, Wang, Guofen, Huang, Yuping, and Liu, Shiqiang
- Subjects
- *
IMAGE fusion , *DISCRETE wavelet transforms , *FEATURE extraction , *WAVELET transforms , *DIAGNOSTIC imaging , *MULTICHANNEL communication - Abstract
Multimodal medical image fusion (MMIF) technology aims to generate fused images that comprehensively reflect the information of tissues, organs, and metabolism, thereby assisting medical diagnosis and enhancing the reliability of clinical diagnosis. However, most approaches suffer from information loss during feature extraction and fusion, and rarely explore how to directly process multichannel data. To address the above problems, this paper proposes a novel invertible fusion network (MMIF-INet) that accepts three-channel color images as inputs to the model and generates multichannel data distributions in a process-reversible manner. Specifically, the discrete wavelet transform (DWT) is utilized for downsampling, aiming to decompose the source image pair into high- and low-frequency components. Concurrently, an invertible block (IB) facilitates preliminary feature fusion, enabling the integration of cross-domain complementary information and multisource aggregation in an information-lossless manner. The combination of IB and DWT ensures the initial fusion's reversibility and the extraction of semantic features across various scales. To accommodate fusion tasks, a multiscale fusion module is employed, integrating diverse components from different modalities and multiscale features. Finally, a hybrid loss is designed to constrain model training from the perspectives of structure, gradient, intensity, and chromaticity, thus enabling effective retention of the luminance, color, and detailed information of the source images. Experiments on multiple medical datasets demonstrate that MMIF-INet outperforms existing methods in visual quality, quantitative metrics, and fusion efficiency, particularly in color fidelity. Extended to infrared–visible image fusion, seven optimal evaluation criteria further substantiate MMIF-INet's superior fusion performance. The code of MMIF-INet is available at https://github.com/HeDan-11/MMIF-INet. 
• A novel invertible fusion network (MMIF-INet) is proposed for medical image fusion. • Initial fusion of features using invertible blocks with no loss of information. • A hybrid loss is designed based on structure, gradient, pixel intensity, and chromaticity. • Excellent fusion performance with only 0.016 s test time. [ABSTRACT FROM AUTHOR]
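The invertible, DWT-based decomposition described above relies on transforms with exact inverses. A minimal numpy sketch of a single-level Haar DWT and its perfect reconstruction (illustrative only; MMIF-INet pairs the DWT with learned invertible blocks, and all names below are this sketch's own):

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D Haar DWT: split an image into low- and
    high-frequency sub-bands (LL, LH, HL, HH), each half-resolution."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # vertical average
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # vertical difference
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse transform: reconstructs the input exactly, which is the
    information-lossless property an invertible design relies on."""
    a = np.empty((ll.shape[0], ll.shape[1] * 2))
    d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    img = np.empty((a.shape[0] * 2, a.shape[1]))
    img[0::2, :], img[1::2, :] = a + d, a - d
    return img

img = np.arange(16.0).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(img)
assert np.allclose(haar_idwt2(ll, lh, hl, hh), img)  # lossless round trip
```

The exact round trip is what lets such a decomposition feed a fusion network without the feature-extraction information loss the abstract criticizes.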
- Published
- 2025
- Full Text
- View/download PDF
42. Multiscale Feature Fusion for Hyperspectral Marine Oil Spill Image Segmentation
- Author
-
Guorong Chen, Jiaming Huang, Tingting Wen, Chongling Du, Yuting Lin, and Yanbing Xiao
- Subjects
oil spill segmentation ,hyperspectral images ,multiscale fusion ,attention mechanism ,Naval architecture. Shipbuilding. Marine engineering ,VM1-989 ,Oceanography ,GC1-1581 - Abstract
Oil spills have always been a threat to the marine ecological environment; it is therefore important to identify and segment oil spill areas on the ocean surface after a spill occurs in order to protect the marine ecosystem. However, oil spill segmentation in ordinary optical images suffers strong interference from light absorption in deep water and from the distribution of algae on the ocean surface, making it difficult to improve segmentation accuracy. To address these problems, a hyperspectral ocean oil spill image segmentation model with multiscale feature fusion (MFFHOSS-Net) is proposed. Specifically, the oil spill segmentation dataset was created from NASA hyperspectral image data of the Gulf of Mexico oil spill: small-size images were generated after band filtering of the hyperspectral images, and the oil spill images were annotated. The model exploits the distinct characteristics of different network layers by fusing feature maps of different scales, and an attention mechanism fuses these features effectively to improve oil spill region segmentation accuracy. A case study, ablation experiments and a model evaluation were also carried out in this work. Compared with other models, the proposed method achieved good results on various evaluation metrics.
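The scale-fusion step the abstract describes, weighting and combining feature maps drawn from layers of different resolution, can be sketched in numpy; the scalar softmax weighting below is a simplified stand-in for the paper's attention mechanism, and all names are illustrative:

```python
import numpy as np

def upsample_nn(feat, size):
    """Nearest-neighbour upsampling of a (C, H, W) feature map to a
    target (H, W), assuming integer scale ratios."""
    c, h, w = feat.shape
    ry, rx = size[0] // h, size[1] // w
    return feat.repeat(ry, axis=1).repeat(rx, axis=2)

def fuse_multiscale(feats):
    """Attention-weighted fusion of feature maps from different scales:
    bring every map to the finest resolution, score each map by its
    global average response, normalise the scores with a softmax, and
    take the weighted sum."""
    target = feats[0].shape[1:]                 # finest scale first
    ups = [upsample_nn(f, target) for f in feats]
    scores = np.array([f.mean() for f in ups])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return sum(wi * f for wi, f in zip(w, ups))

feats = [np.random.rand(8, 32, 32), np.random.rand(8, 16, 16),
         np.random.rand(8, 8, 8)]
fused = fuse_multiscale(feats)
assert fused.shape == (8, 32, 32)
```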
- Published
- 2023
- Full Text
- View/download PDF
43. Image Fusion: Challenges, Performance Metrics and Future Directions
- Author
-
Tilak Babu, S. B. G., Chintesh, I., Satyanarayana, V., Nandan, Durgesh, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zhang, Junjie James, Series Editor, Mallick, Pradeep Kumar, editor, Meher, Preetisudha, editor, Majumder, Alak, editor, and Das, Santos Kumar, editor
- Published
- 2020
- Full Text
- View/download PDF
44. A YOLOv3 model based on improved feature extraction and fusion modules.
- Author
-
赵轩, 周凡, and 余汉成
- Subjects
- *
FEATURE extraction , *OBJECT recognition (Computer vision) , *PROBLEM solving , *DEEP learning , *A priori - Abstract
The feature extraction branch and multiscale detection branch of the YOLOv3 model leave room for optimization. To exploit this, this study proposes two structural improvements that raise the model's detection accuracy on an object detection dataset. First, for the three detection scales of YOLOv3 (13×13, 26×26, 52×52), prior anchor boxes of different widths and heights are used while the label boxes are shared across the three scales, and a feature fusion method between the scales is designed to improve accuracy. Second, to address the problem that convolutional layers share a fixed spatial receptive field, the original convolutional layers are replaced with deformable convolutions. Tests on an industrial tool dataset show that the improved model raises test-set accuracy by 3.6 mAP compared with the original YOLOv3. [ABSTRACT FROM AUTHOR]
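For context on the prior anchor boxes mentioned above: YOLO-style detectors commonly derive anchors of different widths and heights by clustering ground-truth box sizes. A hypothetical numpy sketch of that practice (the abstract does not specify the paper's anchor-selection procedure):

```python
import numpy as np

def kmeans_anchors(wh, k, iters=50, seed=0):
    """Cluster ground-truth (width, height) pairs into k prior anchors
    with plain Euclidean k-means -- the usual way YOLO-style detectors
    obtain anchors of different shapes for their detection scales."""
    rng = np.random.default_rng(seed)
    centers = wh[rng.choice(len(wh), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(wh[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = wh[labels == j].mean(axis=0)
    return centers[np.argsort(centers.prod(axis=1))]  # small -> large area

# Toy label boxes: small, medium, and large objects.
wh = np.array([[10, 12], [12, 10], [30, 28], [28, 32],
               [80, 90], [90, 85]], dtype=float)
anchors = kmeans_anchors(wh, k=3)
assert anchors.shape == (3, 2)
```

Sorting anchors by area mirrors how small anchors are assigned to the fine 52×52 grid and large ones to the coarse 13×13 grid.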
- Published
- 2022
- Full Text
- View/download PDF
45. Coal petrography extraction approach based on multiscale mixed-attention-based residual U-net.
- Author
-
Jin, Houxin, Cao, Le, Kan, Xiu, Sun, Weizhou, Yao, Wei, and Wang, Xialin
- Subjects
PETROLOGY ,COAL ,COKING coal ,IMAGE segmentation ,COKE (Coal product) ,FEATURE extraction ,COAL gasification - Abstract
Coal petrography extraction is crucial for the accurate analysis of coal reaction characteristics in coal gasification, coal coking, and metal smelting. Nevertheless, automatic extraction remains a challenging task because of the grayscale overlap between exinite and background regions in coal photomicrographs. Inspired by the excellent performance of neural networks in the image segmentation field, this study proposes a reliable coal petrography extraction method that achieves precise segmentation of coal petrography from the background regions. This method uses a novel semantic segmentation model based on Unet, referred to as M2AR-Unet. To improve the efficiency of network learning, the proposed M2AR-Unet framework takes Unet as a baseline and further optimizes the network structure in four ways, namely, an improved residual block composed of four units, a mixed attention module containing multiple attention mechanisms, an edge feature enhancement strategy, and a multiscale feature extraction module composed of a feature pyramid and atrous spatial pyramid pooling module. Compared to current state-of-the-art segmentation network models, the proposed M2AR-Unet offers improved coal petrography extraction integrity and edge extraction. [ABSTRACT FROM AUTHOR]
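One of the four additions above is a multiscale feature extraction module built from a feature pyramid and atrous spatial pyramid pooling (ASPP). A toy numpy sketch of the ASPP idea, with a fixed averaging kernel standing in for learned filters (all names are this sketch's own):

```python
import numpy as np

def dilated_conv3x3(x, kernel, rate):
    """3x3 atrous (dilated) convolution on a 2-D map with zero-padded
    'same' output. Larger rates enlarge the receptive field without
    adding parameters."""
    pad = rate
    xp = np.pad(x, pad)
    out = np.zeros_like(x, dtype=float)
    for i in range(3):
        for j in range(3):
            dy, dx = (i - 1) * rate, (j - 1) * rate
            out += kernel[i, j] * xp[pad + dy: pad + dy + x.shape[0],
                                     pad + dx: pad + dx + x.shape[1]]
    return out

def aspp(x, rates=(1, 2, 4)):
    """Toy atrous spatial pyramid pooling: run the same map through
    parallel dilated convolutions at several rates and stack the
    responses, so each output channel sees a different context size."""
    k = np.full((3, 3), 1.0 / 9.0)   # averaging kernel, for illustration
    return np.stack([dilated_conv3x3(x, k, r) for r in rates])

x = np.random.rand(16, 16)
feats = aspp(x)
assert feats.shape == (3, 16, 16)
```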
- Published
- 2022
- Full Text
- View/download PDF
46. Surface Defect Detection of Solar Cells Based on Multiscale Region Proposal Fusion Network
- Author
-
Xiong Zhang, Ting Hou, Yawen Hao, Hong Shangguan, Anhong Wang, and Sichun Peng
- Subjects
Deep learning ,defects detection ,faster R-CNN ,multiscale fusion ,RPN ,solar cell ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Manufacturing processes and human operational errors may cause small defects, such as cracks, over-welding, and black edges, on solar cell surfaces. These surface defects are subtle and therefore difficult to observe and detect. Accurate detection and replacement of defective cell modules is necessary to ensure the energy conversion efficiency of solar cells. To improve adaptability to the scale changes of the various types of solar cell surface defects, this study proposes a multi-feature region proposal fusion network (MF-RPN) structure to detect surface defects. In this network, region proposals are extracted from different feature layers of the convolutional neural network. In addition, because multiple aspect-ratio and scale settings and the use of multiple RPNs cause candidate regions to overlap and introduce redundancy, we designed a multiscale region proposal selection strategy (MRPSS) to reduce the number of region proposals and improve network accuracy. Owing to the full learning of shallow detail texture information and deep semantic information, our multiscale RPN fusion structure effectively improves the extraction of multiscale features for the various scales and types of solar cell surface defects. Experimental results demonstrate that our method outperforms other methods, achieving higher detection accuracy.
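The MRPSS strategy above prunes overlapping proposals pooled from multiple RPNs. Greedy non-maximum suppression is the standard mechanism for this kind of redundancy reduction; a minimal numpy sketch (not the paper's exact selection strategy):

```python
import numpy as np

def iou(a, b):
    """IoU of one box a against an array of boxes b, (x1, y1, x2, y2)."""
    x1 = np.maximum(a[0], b[:, 0]); y1 = np.maximum(a[1], b[:, 1])
    x2 = np.minimum(a[2], b[:, 2]); y2 = np.minimum(a[3], b[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a + area_b - inter)

def select_proposals(boxes, scores, iou_thr=0.5, top_k=100):
    """Greedy NMS over proposals pooled from several feature layers:
    keep the best-scoring box, drop boxes that overlap it heavily,
    repeat -- removing the duplicate candidates that multiple RPNs
    inevitably produce."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size and len(keep) < top_k:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) < iou_thr]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]],
                 dtype=float)
scores = np.array([0.9, 0.8, 0.7])
assert select_proposals(boxes, scores) == [0, 2]  # near-duplicate dropped
```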
- Published
- 2021
- Full Text
- View/download PDF
47. An image hashing algorithm based on QBFM moments and three-dimensional structure.
- Author
-
马林生 and 赵琰
- Subjects
- *
IMAGE fusion , *THREE-dimensional imaging , *IMAGE segmentation , *THREE-dimensional modeling , *ALGORITHMS , *PIXELS - Abstract
To enhance image classification performance and improve the accuracy and efficiency of copy detection, this paper proposes an image hashing algorithm based on QBFM moments and three-dimensional structure. First, the color image is normalized, Gaussian and Laplacian fusion images are obtained through multiscale fusion, and the QBFM features of the two fusion images are extracted. Meanwhile, gradient information of the Gaussian fusion image is extracted directly in the RGB color space to construct a three-dimensional model, and the concave and convex points of the peak and valley curves of the gradient, viewed from different perspectives, yield three-dimensional local structure features. The three-dimensional gradient model is then sliced at equal intervals, and the pixel count and variance of each section are recorded as three-dimensional global structure features. Finally, the QBFM features and three-dimensional features are combined and scrambled to form the final hash sequence. Experimental results show that the algorithm strikes a good balance between robustness and discrimination. Compared with existing hashing algorithms, it has good image classification performance, and it achieves the best recall and precision in the copy detection experiment. [ABSTRACT FROM AUTHOR]
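The Gaussian fusion image above is built by multiscale fusion of smoothed copies of the input. A simplified numpy sketch of that construction, with a 3×3 mean filter standing in for Gaussian smoothing and averaging standing in for the paper's fusion rule (the paper also forms a Laplacian counterpart):

```python
import numpy as np

def blur3(img):
    """One pass of 3x3 mean filtering with edge-replicate padding --
    a cheap stand-in for Gaussian smoothing."""
    p = np.pad(img, 1, mode='edge')
    return sum(p[i:i + img.shape[0], j:j + img.shape[1]]
               for i in range(3) for j in range(3)) / 9.0

def multiscale_fusion(img, levels=3):
    """Fuse progressively smoothed copies of the image into a single
    multiscale fusion image by averaging across levels."""
    copies, cur = [img.astype(float)], img.astype(float)
    for _ in range(levels - 1):
        cur = blur3(cur)
        copies.append(cur)
    return np.mean(copies, axis=0)

img = np.random.rand(8, 8)
fused = multiscale_fusion(img)
assert fused.shape == (8, 8)
```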
- Published
- 2022
- Full Text
- View/download PDF
48. Attention-based for Multiscale Fusion Underwater Image Enhancement.
- Author
-
Zhixiong Huang, Jinjiang Li, and Zhen Hua
- Subjects
IMAGE enhancement (Imaging systems) ,IMAGE intensifiers ,IMAGE fusion ,IMAGE reconstruction ,LIGHT propagation - Abstract
Underwater images often suffer from color distortion, blurring and low contrast because light propagating underwater is affected by two processes: absorption and scattering. To cope with the poor quality of underwater images, this paper proposes a multiscale fusion underwater image enhancement method based on a channel attention mechanism and the local binary pattern (LBP). The network consists of three modules: feature aggregation, image reconstruction and LBP enhancement. The feature aggregation module aggregates feature information at different scales of the image, and the image reconstruction module restores the output features to a high-quality underwater image. The network also introduces a channel attention mechanism so that it pays more attention to the channels containing important information, and detail information is preserved by superimposing it on the feature information in real time. Experimental results demonstrate that the method produces results with correct colors and complete details, and outperforms existing methods on quantitative metrics. [ABSTRACT FROM AUTHOR]
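The LBP enhancement branch above encodes local texture. A minimal numpy implementation of the basic 8-neighbour local binary pattern (illustrative; the abstract does not specify which LBP variant the paper uses):

```python
import numpy as np

def lbp(img):
    """Basic 8-neighbour local binary pattern: each pixel is encoded by
    one bit per neighbour, set when the neighbour is at least as bright
    as the centre -- a compact, illumination-robust texture code."""
    p = np.pad(img, 1, mode='edge')
    h, w = img.shape
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]   # clockwise ring
    code = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        nb = p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        code |= (nb >= img).astype(np.uint8) << bit
    return code

img = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)
codes = lbp(img)
assert codes.shape == (3, 3) and codes.dtype == np.uint8
```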
- Published
- 2022
- Full Text
- View/download PDF
49. Multiscale Fusion Signal Extraction for Spaceborne Photon-Counting Laser Altimeter in Complex and Low Signal-to-Noise Ratio Scenarios.
- Author
-
Nan, Yaming, Feng, Zhihui, Li, Bincheng, and Liu, Enhai
- Abstract
Extracting signal photons from noisy raw data is one of the critical processes for the new generation of spaceborne photon-counting laser altimeters. Affected by vast numbers of noise photon events, the extraction of weak signal events still faces challenges in complex scenarios with a low signal-to-noise ratio (SNR). To improve the extraction of signal photon events in these scenarios, a multiscale fusion signal extraction method is proposed, characterized by combining a global spatial correlation constraint with an optimized local spatial correlation constraint. The local constraint is implemented with a density-based spatial clustering of applications with noise (DBSCAN) clustering method with adaptive parameter estimation, which extracts possible signal photons. A subsequent global constraint based on the spatial correlation of the terrain profiles removes the pseudo-signal photons clustered in the local-constraint step; it is implemented with a cost function that quantifies different candidate paths. The method was verified on actual Ice, Cloud, and land Elevation Satellite-2 (ICESat-2) data containing vegetation, mountains, and residential areas. The experimental results show that, compared with the ICESat-2 extraction method, our method significantly improves the precision and recall of signal photon events extracted from low-SNR photon-counting data. [ABSTRACT FROM AUTHOR]
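The local spatial-correlation constraint above rests on DBSCAN's density criterion: a photon event counts as candidate signal only when enough other events fall within a radius of it. A simplified numpy sketch of that core-point test (the paper's adaptive parameter estimation and global terrain constraint are omitted, and the numbers are illustrative):

```python
import numpy as np

def density_filter(points, eps, min_pts):
    """DBSCAN-style core-point screening for photon events given as
    (along-track distance, elevation) pairs: keep an event when at
    least min_pts other events lie within radius eps of it."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    neighbours = (d <= eps).sum(axis=1) - 1   # exclude the point itself
    return neighbours >= min_pts

# Four tightly clustered "signal" photons plus two isolated noise events.
signal = np.array([[0.0, 0.0], [0.1, 0.1], [0.2, 0.0], [0.1, -0.1]])
noise = np.array([[5.0, 9.0], [-7.0, 4.0]])
mask = density_filter(np.vstack([signal, noise]), eps=0.5, min_pts=2)
assert mask.tolist() == [True, True, True, True, False, False]
```

The pairwise-distance matrix is O(n²) in memory; real altimeter granules would need a windowed or tree-based neighbour search.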
- Published
- 2022
- Full Text
- View/download PDF
50. Progressive Back-Traced Dehazing Network Based on Multi-Resolution Recurrent Reconstruction
- Author
-
Qiaosi Yi, Aiwen Jiang, Juncheng Li, Jianyi Wan, and Mingwen Wang
- Subjects
Image dehaze ,image enhancement ,multiscale fusion ,haze removal ,image restoration ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
In order to alleviate the adverse impact of haze on high-level vision tasks, image dehazing has attracted great attention from the computer vision research field in recent years. Most existing methods fall into physical-prior-based and non-physical, data-driven categories. However, image dehazing is a challenging, ill-conditioned and inherently ambiguous problem. Because of the random distribution and concentration of haze, color distortion and excessive brightness often occur in physical-prior-based methods, while non-physical, data-driven methods recover high-frequency details poorly. To overcome these obstacles, this paper proposes an effective progressive back-traced dehazing network based on multi-resolution recurrent reconstruction strategies. An irregular multiscale convolution module is proposed to extract fine-grained local structures, and a multi-resolution residual fusion module is proposed to progressively reconstruct intermediate haze-free images. We compared our method with several popular state-of-the-art methods on the public RESIDE and 2018 NTIRE Dehazing datasets. The experimental results demonstrate that our method restores satisfactory high-frequency textures and high-fidelity colors. Related source code and parameters will be distributed on Github for further study.
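The multi-resolution recurrent reconstruction above proceeds coarse to fine: upsample the running estimate and add each level's correction. A minimal numpy sketch of that control flow, with plain arrays in place of the learned residual fusion modules (all names are this sketch's own):

```python
import numpy as np

def upsample2(x):
    """Nearest-neighbour 2x upsampling of a 2-D map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def coarse_to_fine(residuals):
    """Multi-resolution reconstruction in miniature: start from the
    coarsest estimate, then at every finer level upsample the running
    result and add that level's residual correction."""
    out = residuals[0]
    for r in residuals[1:]:
        out = upsample2(out) + r
    return out

# Three levels: 4x4 base estimate plus 8x8 and 16x16 corrections.
res = [np.zeros((4, 4)), np.ones((8, 8)), np.full((16, 16), 0.5)]
img = coarse_to_fine(res)
assert img.shape == (16, 16) and np.allclose(img, 1.5)
```

Accumulating small per-level residuals instead of predicting each scale from scratch is what lets such schemes recover high-frequency detail progressively.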
- Published
- 2020
- Full Text
- View/download PDF