126 results for "residual module"
Search Results
2. An unpaired SAR-to-optical image translation method based on Schrödinger bridge network and multi-scale feature fusion
- Author
-
Jinyu Wang, Haitao Yang, Yu He, Fengjie Zheng, Zhengjun Liu, and Hang Chen
- Subjects
SAR-to-optical translation ,Schrödinger's bridge ,Residual module ,Axial attention ,Medicine ,Science - Abstract
Abstract SAR-to-optical (S2O) translation converts SAR images into optical images, helping interpreters extract information efficiently. In the absence of strictly matched datasets, existing methods struggle to complete training on unpaired data with only a small amount of data. Building on the recent Schrödinger bridge-based transformation framework, this paper proposes a multiscale axial residual module (MARM) based on the concept of multi-scale feature fusion. The generator and discriminator of the model are designed to enable efficient translation of SAR to optical images. Extensive experiments were conducted on the SEN1-2 dataset, and the results show the superiority of the proposed method in terms of generation quality. Compared with the classical CycleGAN, the proposed method improves the FID metric by 42.05%. (An illustrative sketch of a multi-scale residual block follows this entry.)
- Published
- 2024
- Full Text
- View/download PDF
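The MARM described in the abstract above is not reproduced as code in this listing; purely as a rough illustration of the multi-scale feature fusion idea, the following PyTorch sketch fuses parallel convolution branches of different kernel sizes inside a residual block. All class and parameter names here are hypothetical, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MultiScaleResidualBlock(nn.Module):
    """Illustrative multi-scale residual block: parallel 1x1/3x3/5x5 branches
    are concatenated, fused by a 1x1 convolution, and added to the input."""
    def __init__(self, channels: int):
        super().__init__()
        self.branch1 = nn.Conv2d(channels, channels, kernel_size=1)
        self.branch3 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(channels, channels, kernel_size=5, padding=2)
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        multi_scale = torch.cat([self.branch1(x), self.branch3(x), self.branch5(x)], dim=1)
        return self.act(x + self.fuse(multi_scale))

if __name__ == "__main__":
    y = MultiScaleResidualBlock(64)(torch.randn(1, 64, 32, 32))
    print(y.shape)  # torch.Size([1, 64, 32, 32])
```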
3. Dynamic Gesture Recognition Based on 3D Central Difference Separable Residual LSTM Coordinate Attention Networks
- Author
-
Chen, Jie, Tie, Yun, Qi, Lin, Liang, Chengwu, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Liu, Qingshan, editor, Wang, Hanzi, editor, Ma, Zhanyu, editor, Zheng, Weishi, editor, Zha, Hongbin, editor, Chen, Xilin, editor, Wang, Liang, editor, and Ji, Rongrong, editor
- Published
- 2024
- Full Text
- View/download PDF
4. Super-resolution reconstruction algorithm for dim and blurred traffic sign images in complex environments
- Author
-
Yan Ma and Defeng Kong
- Subjects
traffic sign images ,super-resolution reconstruction ,residual module ,multi-layer feature fusion ,sub-pixel convolution ,Mathematics ,QA1-939 - Abstract
In poor lighting and in rainy or foggy weather, road traffic signs appear blurred and are difficult to recognize. A super-resolution reconstruction algorithm for traffic sign images under complex lighting and bad weather was therefore proposed. First, a novel attention residual module was designed that incorporates an aggregated feature attention mechanism on the skip-connection side of the base residual module so that the deep network can obtain richer detail information; second, a cross-layer skip-connection feature fusion mechanism was adopted to enhance the flow of information across layers, prevent gradient vanishing in the deep network, and enhance the reconstruction of edge detail information; and lastly, a positive-inverse dual-channel sub-pixel convolutional up-sampling method was designed to reconstruct super-resolution images with better pixel and spatial information expression. The model was trained on a Chinese traffic sign dataset captured in natural scenes. When the scaling factor is 4, the average PSNR and SSIM are improved by 0.031 dB and 0.083 respectively compared with MICU (Multi-level Information Compensation and U-net), the latest deep learning-based super-resolution reconstruction algorithm for single-frame images, and the actual test averages reach 20.946 dB and 0.656. The experimental results show that the images reconstructed by the proposed algorithm surpass the mainstream comparison algorithms in both objective indexes and subjective perception. The super-resolution reconstructed images have a higher peak signal-to-noise ratio and perceptual similarity, and can provide technical support for research on safe-driving assistive devices in natural scenes under multi-temporal varying illumination conditions and bad weather. (A sketch of attention on the residual skip path and sub-pixel up-sampling follows this entry.)
- Published
- 2024
- Full Text
- View/download PDF
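As a hedged illustration of two ideas mentioned in the abstract above, an attention gate on the residual skip path and sub-pixel convolutional up-sampling, the following PyTorch sketch pairs a channel-attention-weighted skip connection with a PixelShuffle upsampler. It is not the authors' implementation; the layer sizes and names are assumptions.

```python
import torch
import torch.nn as nn

class AttentionResidualBlock(nn.Module):
    """Residual block whose skip path is reweighted by a simple channel attention gate."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Squeeze-and-excitation-style gate applied to the skip connection.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.body(x) + x * self.gate(x)

class SubPixelUpsampler(nn.Module):
    """x2 super-resolution head using sub-pixel (PixelShuffle) convolution."""
    def __init__(self, channels: int, scale: int = 2):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels * scale ** 2, 3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):
        return self.shuffle(self.conv(x))

if __name__ == "__main__":
    feat = AttentionResidualBlock(64)(torch.randn(1, 64, 32, 32))
    print(SubPixelUpsampler(64)(feat).shape)  # torch.Size([1, 64, 64, 64])
```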
5. IAE-KM3D a 3D Object Detection Method Based on an Improved KM3D Network.
- Author
-
Sun, Yang, Li, Song, Wang, Haiyang, Tian, Bin, and Li, Yi
- Subjects
OBJECT recognition (Computer vision) ,DEEP learning ,PROBLEM solving - Abstract
Deep learning-based 3D target detection methods need to solve the problem of insufficient 3D target detection accuracy. In this paper, the KM3D network is selected as the benchmark network after the experimental comparison of current mainstream algorithms, and the IAE-KM3D network algorithm based on the KM3D network is proposed. First, the Resnet V2 network is introduced, and the residual module is redesigned to improve the training capability of the new residual module with higher generalization. IBN NET is then introduced to carefully integrate instance normalization and batch normalization as building blocks to improve the model's detection accuracy in hue- and brightness-changing scenarios without increasing time loss. Then, a parameter-free attention mechanism, Simam, is introduced to improve the detection accuracy of the model. After that, the elliptical Gaussian kernel is introduced to improve the algorithm's ability to detect 3D targets. Finally, a new key point loss function is proposed to improve the algorithm's ability to train. Experiments using the KITTI dataset conclude that the IAE-KM3D network model significantly improves detection accuracy and outperforms the KM3D algorithm regarding detection performance compared to the original KM3D network. The improvements for AP2D, AP3D, and APBEV are 5%, 12.5%, and 8.3%, respectively, and only a tiny amount of time loss and network parameters are added. Compared with other mainstream target detection algorithms, Monn3D, 3DOP, GS3D, and FQNet, the improved IAE-KM3D network in this paper significantly improves AP3D and APBEV, with fewer network parameters and shorter time consumption. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
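The SimAM attention mentioned in the IAE-KM3D abstract above is parameter-free and compact enough to write down directly. The PyTorch sketch below follows the commonly cited SimAM energy formulation; it is offered only as an illustration, not as the authors' code, and the regularization constant is an assumption.

```python
import torch

def simam(x: torch.Tensor, e_lambda: float = 1e-4) -> torch.Tensor:
    """Parameter-free SimAM-style attention over an (N, C, H, W) feature map.

    Each activation is weighted by a sigmoid of its inverse energy, so neurons
    that deviate from their channel mean receive higher weights.
    """
    n = x.shape[2] * x.shape[3] - 1
    d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)   # squared deviation from channel mean
    v = d.sum(dim=(2, 3), keepdim=True) / n             # channel variance estimate
    e_inv = d / (4 * (v + e_lambda)) + 0.5              # inverse energy per activation
    return x * torch.sigmoid(e_inv)

if __name__ == "__main__":
    out = simam(torch.randn(2, 64, 16, 16))
    print(out.shape)  # torch.Size([2, 64, 16, 16])
```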
6. Underwater Coherent Source Direction-of-Arrival Estimation Method Based on PGR-SubspaceNet.
- Author
-
Guo, Tuo, Xu, Yunyan, Bi, Yang, Ding, Shaochun, and Huang, Yong
- Subjects
DIRECTION of arrival estimation ,UNDERWATER acoustics ,SIGNAL-to-noise ratio ,STANDARD deviations ,COMPLEX variables - Abstract
In the field of underwater acoustics, the signal-to-noise ratio (SNR) is generally low, and the underwater environment is complex and variable, making target azimuth estimation highly challenging. Traditional model-based subspace methods exhibit significant performance degradation when dealing with coherent sources, low SNR, and small snapshot data. To overcome these limitations, an improved model based on SubspaceNet, called PConv-GAM Residual SubspaceNet (PGR-SubspaceNet), is proposed. This model embeds the global attention mechanism (GAM) into residual blocks that fuse PConv convolution, making it possible to capture richer cross-channel and positional information. This enhancement helps the model learn signal features in complex underwater conditions. Simulation results demonstrate that the underwater target azimuth estimation method based on PGR-SubspaceNet exhibits lower root mean square periodic error (RMSPE) values when handling different numbers of narrowband coherent sources. Under low SNR and limited snapshot conditions, its RMSPE values are significantly better than those of traditional methods and SubspaceNet-based enhanced subspace methods. PGR-SubspaceNet extracts more features, further improving the accuracy of direction-of-arrival estimation. Preliminary experiments in a pool validate the effectiveness and feasibility of the underwater target azimuth estimation method based on PGR-SubspaceNet. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. EDM: a enhanced diffusion models for image restoration in complex scenes
- Author
-
Wen, JiaYan, Zhuang, YuanSheng, and Deng, JunYi
- Published
- 2024
- Full Text
- View/download PDF
8. Effective Deep Learning‐Based Infrared Spectral Gas Identification Method.
- Author
-
Wang, Zhikang and Zhao, Guodong
- Subjects
- *
PARAMETRIC modeling , *GASES , *COMPARATIVE method , *INFRARED spectroscopy - Abstract
In order to detect infrared spectral gas components quickly and correctly, an improved dilated residual module is proposed in this study by substituting dilated convolution for the classic convolution module to obtain a broad receptive field. Based on the residual network, an efficient and effective dilated residual network called DA-ResNet12 is developed for infrared spectral gas identification by reducing the size of the convolution kernel and the number of dilated convolution modules. Classification accuracy, training duration, and model parameter size are employed as assessment indices. The experimental results reveal that the proposed DA-ResNet12 network outperforms other comparative methods in terms of model parameter count, accuracy, and time efficiency, demonstrating the efficacy and efficiency of the proposed DA-ResNet12 network model. [ABSTRACT FROM AUTHOR] (A sketch of a dilated residual block follows this entry.)
- Published
- 2024
- Full Text
- View/download PDF
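A dilated residual block of the general kind described above can be sketched in a few lines of PyTorch. This is a generic illustration under assumed channel counts and dilation rates, not the DA-ResNet12 architecture itself.

```python
import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    """Residual block whose convolutions use dilation to enlarge the receptive
    field without adding parameters or reducing spatial resolution."""
    def __init__(self, channels: int, dilation: int = 2):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation)
        self.bn2 = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.act(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.act(out + x)

if __name__ == "__main__":
    print(DilatedResidualBlock(32)(torch.randn(1, 32, 28, 28)).shape)
```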
9. Optic disc and cup segmentation for glaucoma detection using Attention U-Net incorporating residual mechanism.
- Author
-
Yuanyuan Chen, Yongpeng Bai, and Yifan Zhang
- Subjects
OPTIC disc ,ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,GLAUCOMA ,ARTIFICIAL intelligence ,EYE diseases - Abstract
Glaucoma is a common eye disease that can cause blindness. Accurate detection of the optic disc and cup disc is crucial for glaucoma diagnosis. Algorithm models based on artificial intelligence can assist doctors in improving detection performance. In this article, U-Net is used as the backbone network, and the attention and residual modules are integrated to construct an end-to-end convolutional neural network model for optic disc and cup disc segmentation. The U-Net backbone is used to infer the basic position information of optic disc and cup disc, the attention module enhances the model's ability to represent and extract features of optic disc and cup disc, and the residual module alleviates gradient disappearance or explosion that may occur during feature representation of the neural network. The proposed model is trained and tested on the DRISHTI-GS1 dataset. Results show that compared with the original U-Net method, our model can more effectively separate optic disc and cup disc in terms of overlap error, sensitivity, and specificity. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Shell‐Net: A robust deep neural network for the joint segmentation of retinal fragments.
- Author
-
Pappu, Geetha Pavani, Uppudi, Prashanth Gowri Shankar, Biswal, Birendra, Kandula, Srinivasa Rao, Dhavala, Meher Savedasa, Potturi, Giri Madhav, Sharat, Paila Sai, Polapragada, Sridevi, and Datti, Nagadhara Harini
- Subjects
- *
RETINAL blood vessels , *DEEP learning , *OPTIC disc , *RETINAL imaging , *ARCHITECTURAL style , *IMAGE databases , *DIABETIC retinopathy - Abstract
Segmentation of retinal fragments like blood vessels, optic disc (OD), and optic cup (OC) enables the early detection of different retinal pathologies like diabetic retinopathy (DR), glaucoma, etc. This article proposed a novel deep learning architecture termed as Shell‐Net for the accurate segmentation of the retinal fragments. The main novelty of the architecture relies on the intellectual fusion of two different styled networks for attaining better segmentation results. The lower part of the Shell‐Net (feature condenser) follows the down‐sampling and up‐sampling style, whereas the upper part of the network (feature amplifier) follows the up‐sampling and down‐sampling style of architecture. In addition to this, an additional residual module (feature stabilizer) is integrated with the network to achieve more spatial information from lower levels. The lower part of the network reduces the data through summarization, enabling much scope for precise extraction of heavy details such as thick vessels, OD, and OC. On the contrary, the upper part of the network augments the data using duplication, assisting in the enlargement of minuscule details such as the tiny vessels and boundaries. Experiments were performed on publicly available datasets like Digital Retinal Images for Vessel Extraction (DRIVE), Child Heart and Health Study in England (CHASE_DB1), Structured Analysis of the Retina (STARE), Online Retinal Fundus Image Dataset for Glaucoma Analysis and Research (ORIGA), DRISHTI‐GS1, and Retinal Image database for Optic Nerve Evaluation (RIMONE r1). The network accomplished an average accuracy (ACC) and specificity (SPE) of 0.96 and 0.98 when tested on DRIVE, CHASE_DB1, and STARE datasets respectively for vessel segmentation. Furthermore, it outperformed previously existing models in OD and OC segmentation by achieving an average accuracy of 0.98 with a specificity of 0.99 on the DRISHTI_GS1 dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. An occluded cherry tomato recognition model based on improved YOLOv7.
- Author
-
Guangyu Hou, Haihua Chen, Yike Ma, Mingkun Jiang, Chen Hua, Chunmao Jiang, and Runxin Niu
- Subjects
TOMATOES ,CONVOLUTIONAL neural networks ,MATHEMATICAL convolutions ,CHERRIES ,FEATURE extraction ,OBJECT recognition (Computer vision) - Abstract
The typical occlusion of cherry tomatoes in the natural environment is one of the most critical factors affecting accurate picking by cherry tomato picking robots. To recognize occluded cherry tomatoes accurately and efficiently using deep convolutional neural networks, a new occluded cherry tomato recognition model, DSP-YOLOv7-CA, is proposed. Firstly, images of cherry tomatoes with different degrees of occlusion are acquired, four occlusion areas and four occlusion methods are defined, and a cherry tomato dataset (TOSL) is constructed. Then, based on YOLOv7, the convolution modules on the original residual edges are replaced with null residual edges, depth-separable convolutional layers are added, and skip connections are added to reuse feature information. A depth-separable convolutional layer with fewer parameters is then added to the SPPF module to replace the original SPPCSPC module and solve the problem of small-target information loss in the different pooled residual layers. Finally, a coordinate attention (CA) layer is introduced at the critical position of the enhanced feature extraction network to strengthen attention to occluded cherry tomatoes. The experimental results show that the DSP-YOLOv7-CA model outperforms other target detection models, with an average detection accuracy (mAP) of 98.86%, and the number of model parameters is reduced from 37.62 MB to 33.71 MB; it performs better on actual detection of cherry tomatoes with less than 95% occlusion. Results are only average for cherry tomatoes with an occlusion level higher than 95%, but such cherry tomatoes are not targeted for picking. The DSP-YOLOv7-CA model can accurately recognize occluded cherry tomatoes in the natural environment, providing an effective solution for accurate picking by cherry tomato picking robots. [ABSTRACT FROM AUTHOR] (A sketch of a depthwise-separable convolution block follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
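Depthwise-separable convolution, one of the building blocks mentioned in the abstract above, factorizes a standard convolution into a per-channel (depthwise) convolution followed by a 1x1 (pointwise) convolution. The PyTorch sketch below is a generic illustration, not the DSP-YOLOv7-CA code; the activation choice is an assumption.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Standard conv factorized as depthwise (groups=in_channels) + pointwise 1x1."""
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   stride=stride, padding=1, groups=in_channels)
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.SiLU(inplace=True)  # YOLO-style activation; an assumption here

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

if __name__ == "__main__":
    print(DepthwiseSeparableConv(64, 128)(torch.randn(1, 64, 40, 40)).shape)
```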
12. A Symmetrical Approach to Brain Tumor Segmentation in MRI Using Deep Learning and Threefold Attention Mechanism.
- Author
-
Rahman, Ziaur, Zhang, Ruihong, and Bhutto, Jameel Ahmed
- Subjects
- *
DEEP learning , *BRAIN tumors , *COMPUTER-aided diagnosis , *MAGNETIC resonance imaging , *IMAGE segmentation , *BRAIN imaging - Abstract
The symmetrical segmentation of brain tumor images is crucial for both clinical diagnosis and computer-aided prognosis. Traditional manual methods are not only asymmetrical in terms of efficiency but also prone to errors and lengthy processing. A significant barrier to the process is the complex interplay between the deep learning network for MRI brain tumor imaging and the harmonious compound of both local and global feature information, which can throw off the balance in segmentation accuracy. Addressing this asymmetry becomes essential for precise diagnosis. In answer to this challenge, we introduce a balanced, end-to-end solution for brain tumor segmentation, incorporating modifications that mirror the U-Net architecture, ensuring a harmonious flow of information. Beginning with symmetric enhancement of the visual quality of MRI brain images, we then apply a symmetrical residual structure. By replacing the convolutional modules in both the encoder and decoder sections with deep residual modules, we establish a balance that counters the vanishing gradient problem commonly faced when the network depth increases. Following this, a symmetrical threefold attention block is integrated. This addition ensures a balanced fusion of local and global image features, fine-tuning the network to symmetrically discern and learn essential image characteristics. This harmonious integration remarkably amplifies the network's precision in segmenting MRI brain tumors. We further validate the equilibrium achieved by our proposed model using three brain tumor segmentation datasets and four metrics and by juxtaposing our model against 21 traditional and learning-based counterparts. The results confirm that our balanced approach significantly elevates performance in the segmentation of MRI brain tumor images without an asymmetrical increase in computational time. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
13. Online fast localization method of UAVs based on heterologous image matching.
- Author
-
SUI Haigang, LI Jiajie, and GOU Guohua
- Subjects
IMAGE registration ,DRONE aircraft ,ARTIFICIAL satellites in navigation ,SATELLITE positioning - Abstract
Satellite navigation and positioning is one of the critical modules ensuring unmanned aerial vehicle (UAV) flight safety. When the satellite signal is weak or interfered with, positioning failure will affect or even endanger the normal flight of UAVs. Vision-based methods can locate UAVs through image matching. However, current image matching methods cannot extract robust features because heterogeneous images acquired at different times differ greatly, which results in inadequate accuracy and efficiency. This paper proposes a fast feature matching method that uses a residual network to extract multiscale robust features and accelerates coarse matching of low-resolution feature maps using the minimum Euclidean distance. The same module is applied to high-resolution feature maps to achieve secondary fine matching. A homography matrix is used to correct matching pairs and localize the UAV. The proposed method improves the robustness and efficiency of UAV localization. Experimental results on an actual Wuhan suburb dataset show that the average accuracy of the proposed method is 2.86 m, which is 2.1% higher than the current typical matching algorithm. The method has clear advantages in localization robustness and computing speed, completing all image matching and localization on a Jetson Xavier NX at an optimal localization frequency of 1 Hz. [ABSTRACT FROM AUTHOR] (A sketch of coarse matching by minimum Euclidean distance follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
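Coarse matching by minimum Euclidean distance, as described above, amounts to a nearest-neighbor search between descriptors drawn from the low-resolution feature maps. The following PyTorch sketch (with an optional ratio test) is a generic illustration under assumed shapes, not the paper's implementation.

```python
import torch

def coarse_match(desc_a: torch.Tensor, desc_b: torch.Tensor, ratio: float = 0.8):
    """Match each descriptor in desc_a (Na, D) to its nearest neighbor in desc_b (Nb, D)
    by Euclidean distance, keeping matches that pass Lowe's ratio test."""
    dist = torch.cdist(desc_a, desc_b)                 # (Na, Nb) pairwise distances
    best2 = torch.topk(dist, k=2, dim=1, largest=False)
    nearest, second = best2.values[:, 0], best2.values[:, 1]
    keep = nearest < ratio * second                    # ratio test filters ambiguous matches
    idx_a = torch.nonzero(keep, as_tuple=False).squeeze(1)
    idx_b = best2.indices[keep, 0]
    return torch.stack([idx_a, idx_b], dim=1)          # (M, 2) index pairs

if __name__ == "__main__":
    a, b = torch.randn(100, 256), torch.randn(120, 256)
    print(coarse_match(a, b).shape)
```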
14. Recognition and detection of wheat kernels with different bulk densities based on improved U-Net (基于改进 U-Net 的不同容重小麦籽粒识别检测)
- Author
-
吕宗旺, 王玉琦, and 孙福艳
- Subjects
WHEAT
- Published
- 2023
- Full Text
- View/download PDF
15. An Anchor-Aware Graph Autoencoder Fused with Gini Index Model for Link Prediction
- Author
-
Kumar, Shambhu, Bisht, Dinesh, and Jain, Arti
- Published
- 2024
- Full Text
- View/download PDF
16. An automatic power line inspection method based on an improved SegNet network
- Author
-
YANG Jian, LI Jian, and XU Shuo
- Subjects
uav inspection ,deep learning ,improved segnet ,residual module ,asymmetric convolution ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
UAVs (unmanned aerial vehicles) are now widely used in intelligent transmission line inspection. Given the complex image backgrounds captured by UAVs and the poor accuracy and low speed of existing line detection, this paper proposes a power line inspection algorithm based on an improved SegNet model. Firstly, residual modules and asymmetric convolutions are introduced into the encoder to reduce the computational burden of the network. Secondly, the number of layers in the decoder is reduced, and encoder and decoder features are fused to improve inspection accuracy. Finally, the improved SegNet algorithm is trained on the power line dataset. The accuracy and mean intersection over union reach 89.4% and 86.62% respectively, and the single-image detection time is 46 ms. The experimental results show that the algorithm based on the improved SegNet model can achieve high-precision, real-time power line detection. (A sketch of an asymmetric-convolution residual block follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
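Asymmetric convolution, as mentioned above, replaces a k x k kernel with a pair of 1 x k and k x 1 kernels to cut computation. The PyTorch sketch below combines that idea with a residual connection; it is a generic illustration, not the improved SegNet encoder.

```python
import torch
import torch.nn as nn

class AsymmetricResidualBlock(nn.Module):
    """Residual block using a 3x1 + 1x3 factorized convolution instead of a full 3x3."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv_v = nn.Conv2d(channels, channels, kernel_size=(3, 1), padding=(1, 0))
        self.conv_h = nn.Conv2d(channels, channels, kernel_size=(1, 3), padding=(0, 1))
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.bn(self.conv_h(self.act(self.conv_v(x))))
        return self.act(out + x)

if __name__ == "__main__":
    print(AsymmetricResidualBlock(48)(torch.randn(1, 48, 64, 64)).shape)
```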
17. Gesture recognition of continuous wavelet transform and deep convolution attention network
- Author
-
Xiaoguang Liu, Mingjin Zhang, Jiawei Wang, Xiaodong Wang, Tie Liang, Jun Li, Peng Xiong, and Xiuling Liu
- Subjects
semg ,gesture recognition ,continuous wavelet transform ,dcnn ,sam ,residual module ,Biotechnology ,TP248.13-248.65 ,Mathematics ,QA1-939 - Abstract
To address the problem of feature loss when using a deep convolutional neural network (DCNN), this paper proposes an improved gesture recognition method. The method first extracts the time-frequency spectrogram of surface electromyography (sEMG) signals using the continuous wavelet transform. Then, a Spatial Attention Module (SAM) is introduced to construct the DCNN-SAM model, and a residual module is embedded to improve the feature representation of relevant regions and reduce feature loss. Finally, experiments with 10 different gestures are conducted for verification. The results show that the recognition accuracy of the improved method is 96.1%, about 6 percentage points higher than that of the DCNN. (A sketch of a spatial attention module follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
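A spatial attention module of the general kind referenced above can be written as a small block that pools channel-wise statistics and learns a per-pixel weighting map. The PyTorch sketch below follows the CBAM-style formulation; it is an illustration under that assumption, not the DCNN-SAM code.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """CBAM-style spatial attention: pool mean/max over channels, learn a
    per-pixel gate with a 7x7 convolution, and multiply it back onto the input."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg_map = x.mean(dim=1, keepdim=True)            # (N, 1, H, W)
        max_map = x.max(dim=1, keepdim=True).values      # (N, 1, H, W)
        gate = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * gate

if __name__ == "__main__":
    print(SpatialAttention()(torch.randn(2, 32, 20, 20)).shape)
```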
18. IAE-KM3D a 3D Object Detection Method Based on an Improved KM3D Network
- Author
-
Yang Sun, Song Li, Haiyang Wang, Bin Tian, and Yi Li
- Subjects
residual module ,instance normalization ,Simam attention ,Gaussian kernel ,key point loss function ,Technology ,Engineering (General). Civil engineering (General) ,TA1-2040 ,Biology (General) ,QH301-705.5 ,Physics ,QC1-999 ,Chemistry ,QD1-999 - Abstract
Deep learning-based 3D target detection methods need to solve the problem of insufficient 3D target detection accuracy. In this paper, the KM3D network is selected as the benchmark network after the experimental comparison of current mainstream algorithms, and the IAE-KM3D network algorithm based on the KM3D network is proposed. First, the Resnet V2 network is introduced, and the residual module is redesigned to improve the training capability of the new residual module with higher generalization. IBN NET is then introduced to carefully integrate instance normalization and batch normalization as building blocks to improve the model’s detection accuracy in hue- and brightness-changing scenarios without increasing time loss. Then, a parameter-free attention mechanism, Simam, is introduced to improve the detection accuracy of the model. After that, the elliptical Gaussian kernel is introduced to improve the algorithm’s ability to detect 3D targets. Finally, a new key point loss function is proposed to improve the algorithm’s ability to train. Experiments using the KITTI dataset conclude that the IAE-KM3D network model significantly improves detection accuracy and outperforms the KM3D algorithm regarding detection performance compared to the original KM3D network. The improvements for AP2D, AP3D, and APBEV are 5%, 12.5%, and 8.3%, respectively, and only a tiny amount of time loss and network parameters are added. Compared with other mainstream target detection algorithms, Monn3D, 3DOP, GS3D, and FQNet, the improved IAE-KM3D network in this paper significantly improves AP3D and APBEV, with fewer network parameters and shorter time consumption.
- Published
- 2024
- Full Text
- View/download PDF
19. Underwater image enhancement based on SK attention residual network (基于 SK 注意力残差网络的水下图像增强)
- Author
-
陈海秀 and 刘磊
- Abstract
In order to solve the problems of color distortion, blurred key information and detail loss that plague underwater images, an underwater image enhancement method based on an SK attention residual network is proposed. The generator structure in the generative adversarial network is improved, and a residual module is introduced to reduce feature loss between the encoder and decoder, thereby enhancing image detail and color. To let the network adapt to feature maps at different scales and extract key image information, an SK attention mechanism is added after the residual module. Meanwhile, a parametric rectified linear unit is used to improve the fitting ability of the network. The method is verified on real and synthetic underwater image datasets, and traditional and deep learning methods are used for subjective and objective evaluation. In the subjective analysis, the color, key information and detail features are greatly improved in the enhanced images. In the objective evaluation, the indicator values of the proposed method are higher than those of existing underwater image enhancement algorithms, which verifies the effectiveness of the method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
20. AFU-NET: A NOVEL U-NET NETWORK FOR RICE LEAF DISEASE SEGMENTATION.
- Author
-
Le Yang, Huanhuan Zhang, Zhengkang Zuo, Jun Peng, Xiaoyun Yu, Huibin Long, and Yuanjun Liao
- Abstract
Rice diseases adversely affect rice growth and yield. Precise spot segmentation helps to assess the severity of the disease so that appropriate control measures can be taken. In this article, we propose a segmentation method called AFU-Net for rice leaf diseases, and its performance is verified through experiments. Based on the traditional UNet, this method incorporates an attention mechanism, a residual module and a feature fusion module (FFM). The attention mechanism is embedded in skip connections, which enhances the learning of particular semantic features in the encoder layer. In addition, the residual module is integrated into the decoder layer, which deepens the network and enables it to extract richer semantic information. The proposed FFM structure effectively enhances the learning of boundary information and local detail features. The experimental results show that the mean intersection over union (mIoU), mean pixel accuracy (mPA) and Precision of the proposed model on the self-built rice leaf disease segmentation dataset are 87.25%, 92.23%, and 99.67%, respectively. All three evaluation indexes were improved over the control group, while the proposed model had the lowest number of parameters and displayed a good segmentation effect for smaller disease points and disease parts with less obvious characteristics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
21. DCTR U-Net: automatic segmentation algorithm for medical images of nasopharyngeal cancer in the context of deep learning.
- Author
-
Yan Zeng, PengHui Zeng, ShaoDong Shen, Wei Liang, Jun Li, Zhe Zhao, Kun Zhang, and Chong Shen
- Subjects
DEEP learning ,NASOPHARYNX cancer ,COMPUTER-assisted image analysis (Medicine) ,DIAGNOSTIC imaging ,SIGNAL convolution ,ALGORITHMS - Abstract
Nasopharyngeal carcinoma (NPC) is a malignant tumor that occurs in the wall of the nasopharyngeal cavity and is prevalent in Southern China, Southeast Asia, North Africa, and the Middle East. According to studies, NPC is one of the most common malignant tumors in Hainan, China, and it has the highest incidence rate among otorhinolaryngological malignancies. We proposed a new deep learning network model to improve the segmentation accuracy of the target region of nasopharyngeal cancer. Our model is based on the U-Net-based network, to which we add Dilated Convolution Module, Transformer Module, and Residual Module. The new deep learning network model can effectively solve the problem of restricted convolutional fields of perception and achieve global and local multi-scale feature fusion. In our experiments, the proposed network was trained and validated using 10-fold cross-validation based on the records of 300 clinical patients. The results of our network were evaluated using the dice similarity coefficient (DSC) and the average symmetric surface distance (ASSD). The DSC and ASSD values are 0.852 and 0.544 mm, respectively. With the effective combination of the Dilated Convolution Module, Transformer Module, and Residual Module, we significantly improved the segmentation performance of the target region of the NPC. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
22. IRDC-Net: An Inception Network with a Residual Module and Dilated Convolution for Sign Language Recognition Based on Surface Electromyography.
- Author
-
Wang, Xiangrui, Tang, Lu, Zheng, Qibin, Yang, Xilin, and Lu, Zhiyuan
- Subjects
- *
SIGN language , *CONVOLUTIONAL neural networks , *MATHEMATICAL convolutions , *FOURIER transforms , *CHINESE language , *DEAF people - Abstract
Deaf and hearing-impaired people always face communication barriers. Non-invasive surface electromyography (sEMG) sensor-based sign language recognition (SLR) technology can help them to better integrate into social life. Since the traditional tandem convolutional neural network (CNN) structure used in most CNN-based studies inadequately captures the features of the input data, we propose a novel inception architecture with a residual module and dilated convolution (IRDC-net) to enlarge the receptive fields and enrich the feature maps, applying it to SLR tasks for the first time. This work first transformed the time domain signal into a time–frequency domain using discrete Fourier transformation. Second, an IRDC-net was constructed to recognize ten Chinese sign language signs. Third, the tandem CNN networks VGG-net and ResNet-18 were compared with our proposed parallel structure network, IRDC-net. Finally, the public dataset Ninapro DB1 was utilized to verify the generalization performance of the IRDC-net. The results showed that after transforming the time domain sEMG signal into the time–frequency domain, the classification accuracy (acc) increased from 84.29% to 91.70% when using the IRDC-net on our sign language dataset. Furthermore, for the time–frequency information of the public dataset Ninapro DB1, the classification accuracy reached 89.82%; this value is higher than that achieved in other recent studies. As such, our findings contribute to research into SLR tasks and to improving deaf and hearing-impaired people's daily lives. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
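The IRDC-net abstract above combines three familiar ingredients: parallel inception-style branches, dilated convolution, and a residual connection. The PyTorch sketch below shows one way such a block could look; branch widths and dilation rates are assumptions, and this is not the published IRDC-net.

```python
import torch
import torch.nn as nn

class InceptionResidualDilatedBlock(nn.Module):
    """Parallel 1x1 / 3x3 / dilated-3x3 branches, concatenated, projected back to
    the input width, and combined with a residual connection."""
    def __init__(self, channels: int):
        super().__init__()
        self.b1 = nn.Conv2d(channels, channels // 2, 1)
        self.b2 = nn.Conv2d(channels, channels // 2, 3, padding=1)
        self.b3 = nn.Conv2d(channels, channels // 2, 3, padding=2, dilation=2)
        self.project = nn.Conv2d(3 * (channels // 2), channels, 1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        branches = torch.cat([self.b1(x), self.b2(x), self.b3(x)], dim=1)
        return self.act(x + self.project(branches))

if __name__ == "__main__":
    print(InceptionResidualDilatedBlock(64)(torch.randn(1, 64, 24, 24)).shape)
```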
23. Cross-View Gait Recognition Model Combining Multi-Scale Feature Residual Structure and Self-Attention Mechanism
- Author
-
Jingxue Wang, Jun Guo, and Zhenghui Xu
- Subjects
Cross-view ,gait recognition ,residual module ,self-attention mechanism ,two-channel networks ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Under cross-view conditions, the gait recognition rate drops substantially because gait silhouette maps differ greatly between views. To improve the accuracy of gait recognition under cross-view conditions, this paper proposes a cross-view gait recognition network model combining a multi-scale feature residual module (MFRM) and a self-attention (SA) mechanism, based on a Generative Adversarial Network (GAN). First, the local and global feature information in the input gait energy image is fully extracted using the MFRM. Then, the SA mechanism module is used to adjust channel-dimension information and capture associations between features, and is introduced into both the generator and discriminator. Next, the model is trained using a two-channel network training strategy to avoid mode collapse during training. Finally, the generator and discriminator are optimized to improve the quality of the generated gait images. This paper conducts experiments on the CASIA-B and OU-MVLP public datasets. The experiments demonstrate that the MFRM better captures the local and global feature information of the images, and that the SA mechanism module effectively establishes global dependencies between features, so that the generated gait images have clearer and richer detail. The average Rank-1 recognition accuracies reach 91.1% and 97.8% on the two datasets respectively, both better than the commonly used algorithms, indicating that the proposed network model can effectively improve cross-view gait recognition accuracy.
- Published
- 2023
- Full Text
- View/download PDF
24. Animal Pose Estimation Algorithm Based on the Lightweight Stacked Hourglass Network
- Author
-
Wenwen Zhang, Yang Xu, Rui Bai, and Li Li
- Subjects
Animal pose estimation ,stacked hourglass networks ,lightweight ,residual module ,feature fusion ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Pose estimation has been a hot topic in machine vision in recent years. Animals are ubiquitous in nature, and the analysis of their shape and movement is important in many fields and industries. To improve detection accuracy in pose estimation tasks, existing models often consume large amounts of computing and memory resources, so building a lightweight model that reduces computational overhead while preserving accuracy is a key problem for pose estimation methods. In this paper, we focus on the structure of the convolutional neural network in animal pose estimation, construct a lightweight and efficient stacked hourglass network model designed to balance model computation and accuracy, and implement an application algorithm based on it. To address the large number of parameters in deep convolutional neural networks, a lightweight residual module is proposed, namely a conditional channel-weighted method improved with lightweight efficient channel attention (ICCW-Bottle), which reduces the weight of the network and obtains feature information at different scales. To address the loss of feature information after the network's pooling operations, a lightweight dual-branch fusion module is proposed that fully integrates high-level semantic information and low-level detailed features with a small number of parameters. Finally, as in the CC-SSL method, the model is trained jointly on synthetic and real animal datasets; however, the CC-SSL method does not take the model's computational cost into account and consumes a lot of time and memory to run. Experiments show that, compared with the CC-SSL method, the PCK@0.05 of this method is increased by 5.5% on the TigDog dataset. The model in this paper reduces the number of parameters and computations of the network while ensuring low information loss and model accuracy. The ablation experiments verify the advancement and effectiveness of the overall network. (A sketch of efficient channel attention follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
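Efficient channel attention (ECA), the lightweight attention mechanism referenced above, replaces the fully connected layers of SE attention with a 1-D convolution across channels. The PyTorch sketch below follows the commonly used ECA formulation with a fixed kernel size (the original adapts the kernel size to the channel count); it is an illustration, not the ICCW-Bottle module itself.

```python
import torch
import torch.nn as nn

class EfficientChannelAttention(nn.Module):
    """ECA-style channel attention: global average pool, 1-D conv across the
    channel dimension, sigmoid gate, then channel-wise rescaling."""
    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        n, c, _, _ = x.shape
        y = self.pool(x).view(n, 1, c)                    # treat channels as a sequence
        y = torch.sigmoid(self.conv(y)).view(n, c, 1, 1)  # per-channel gate
        return x * y

if __name__ == "__main__":
    print(EfficientChannelAttention()(torch.randn(2, 64, 14, 14)).shape)
```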
25. Sign Language Recognition Based on Residual Network
- Author
-
Li, Xuebin, Zhao, Qinjun, Song, Shuaibo, Shen, Tao, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Liu, Qi, editor, Liu, Xiaodong, editor, Cheng, Jieren, editor, Shen, Tao, editor, and Tian, Yuan, editor
- Published
- 2022
- Full Text
- View/download PDF
26. Alzheimer’s Disease Classification Based on Improved 3D Convolutional Neural Network
- Author
-
Hu, Zhongyi, Wu, Qi, Jin, Shan, Lu, Xingjin, Chen, Changzu, Xiao, Lei, Gao, Libin, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Jia, Yingmin, editor, Zhang, Weicun, editor, Fu, Yongling, editor, Yu, Zhiyuan, editor, and Zheng, Song, editor
- Published
- 2022
- Full Text
- View/download PDF
27. Automated Bladder Lesion Segmentation Based on Res-Unet
- Author
-
Liang, Yinglu, Zhang, Qi, Liu, Yichen, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Jia, Yingmin, editor, Zhang, Weicun, editor, Fu, Yongling, editor, Yu, Zhiyuan, editor, and Zheng, Song, editor
- Published
- 2022
- Full Text
- View/download PDF
28. Automatic power line detection method based on improved SegNet (基于改进SegNet的电力线自动检测方法)
- Author
-
杨 坚, 李 剑, and 徐 硕
- Subjects
ELECTRIC lines ,DEEP learning ,ALGORITHMS ,DRONE aircraft ,VIDEO coding
- Published
- 2023
- Full Text
- View/download PDF
29. Residual Depth Feature-Extraction Network for Infrared Small-Target Detection.
- Author
-
Wang, Lizhe, Zhang, Yanmei, Xu, Yanbing, Yuan, Ruixin, and Li, Shengyun
- Subjects
FEATURE extraction ,INFRARED imaging ,SIGNAL-to-noise ratio - Abstract
Deep-learning methods have exhibited exceptional performance in numerous target-detection domains, and their application is steadily expanding to include infrared small-target detection as well. However, the effect of existing deep-learning methods is weakened due to the lack of texture information and the low signal-to-noise ratio of infrared small-target images. To detect small targets in infrared images with limited information, a depth feature-extraction network based on a residual module is proposed in this paper. First, a global attention guidance enhancement module (GAGEM) is used to enhance the original infrared small target image in a single frame, which considers the global and local features. Second, this paper proposes a depth feature-extraction module (DFEM) for depth feature extraction. Our IRST-Involution adds the attention mechanism to the classic Involution module and combines it with the residual module for the feature extraction of the backbone network. Finally, the feature pyramid with self-learning weight parameters is used for feature fusion. The comparative experiments on three public datasets demonstrate that our proposed infrared small-target detection algorithm exhibits higher detection accuracy and better robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
30. Facial expression recognition based on strong attention mechanism and residual network.
- Author
-
Qian, Zhizhe, Mu, Jing, Tian, Feng, Gao, Zhiyu, and Zhang, Jie
- Subjects
FACIAL expression ,ATTENTION ,FEATURE extraction - Abstract
Most facial expression recognition (FER) algorithms are based on shallow features, and deep networks tend to lose some key expression features, such as the eyes, nose and mouth. To address these limitations, we present in this paper a novel approach named CBAM-Global-Efficient Channel Attention-ResNet (C-G-ECA-R), which combines a strong attention mechanism and a residual network. The strong attention enhances the extraction of important expression features by embedding channel and spatial attention mechanisms before and after the residual module. Adding Global-Efficient Channel Attention (G-ECA) into the residual module strengthens the extraction of key features and reduces the loss of facial information. Extensive experiments have been conducted on two publicly available datasets, Extended Cohn-Kanade and Japanese Female Facial Expression. The results demonstrate that our proposed C-G-ECA-R, especially under ResNet34, achieves 98.98% and 97.65% accuracy on the two datasets respectively, which is higher than the state of the art. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
31. Multi-level semantic information guided image generation for few-shot steel surface defect classification
- Author
-
Liang Hao, Pei Shen, Zhiwei Pan, and Yong Xu
- Subjects
few-shot steel surface defect classification ,adversarial learning ,residual module ,multi-level semantic feature extractor ,Wasserstein divergence ,Physics ,QC1-999 - Abstract
Surface defect classification is one of key points in the field of steel manufacturing. It remains challenging primarily due to the rare occurrence of defect samples and the similarity between different defects. In this paper, a multi-level semantic method based on residual adversarial learning with Wasserstein divergence is proposed to realize sample augmentation and automatic classification of various defects simultaneously. Firstly, the residual module is introduced into model structure of adversarial learning to optimize the network structure and effectively improve the quality of samples generated by model. By substituting original classification layer with multiple convolution layers in the network framework, the feature extraction capability of model is further strengthened, enhancing the classification performance of model. Secondly, in order to better capture different semantic information, we design a multi-level semantic extractor to extract rich and diverse semantic features from real-world images to efficiently guide sample generation. In addition, the Wasserstein divergence is introduced into the loss function to effectively solve the problem of unstable network training. Finally, high-quality defect samples can be generated through adversarial learning, effectively expanding the limited training samples for defect classification. The experimental results substantiate that our proposed method can not only generate high-quality defect samples, but also accurately achieve the classification of defect detection samples.
- Published
- 2023
- Full Text
- View/download PDF
32. A Symmetrical Approach to Brain Tumor Segmentation in MRI Using Deep Learning and Threefold Attention Mechanism
- Author
-
Ziaur Rahman, Ruihong Zhang, and Jameel Ahmed Bhutto
- Subjects
symmetrical brain tumor segmentation ,threefold attention block ,biomedical images ,deep learning ,residual module ,Mathematics ,QA1-939 - Abstract
The symmetrical segmentation of brain tumor images is crucial for both clinical diagnosis and computer-aided prognosis. Traditional manual methods are not only asymmetrical in terms of efficiency but also prone to errors and lengthy processing. A significant barrier to the process is the complex interplay between the deep learning network for MRI brain tumor imaging and the harmonious compound of both local and global feature information, which can throw off the balance in segmentation accuracy. Addressing this asymmetry becomes essential for precise diagnosis. In answer to this challenge, we introduce a balanced, end-to-end solution for brain tumor segmentation, incorporating modifications that mirror the U-Net architecture, ensuring a harmonious flow of information. Beginning with symmetric enhancement of the visual quality of MRI brain images, we then apply a symmetrical residual structure. By replacing the convolutional modules in both the encoder and decoder sections with deep residual modules, we establish a balance that counters the vanishing gradient problem commonly faced when the network depth increases. Following this, a symmetrical threefold attention block is integrated. This addition ensures a balanced fusion of local and global image features, fine-tuning the network to symmetrically discern and learn essential image characteristics. This harmonious integration remarkably amplifies the network’s precision in segmenting MRI brain tumors. We further validate the equilibrium achieved by our proposed model using three brain tumor segmentation datasets and four metrics and by juxtaposing our model against 21 traditional and learning-based counterparts. The results confirm that our balanced approach significantly elevates performance in the segmentation of MRI brain tumor images without an asymmetrical increase in computational time.
- Published
- 2023
- Full Text
- View/download PDF
33. SGEResU-Net for brain tumor segmentation
- Author
-
Dongwei Liu, Ning Sheng, Tao He, Wei Wang, Jianxia Zhang, and Jianxin Zhang
- Subjects
brain tumor segmentation ,u-net ,spatial group-wise enhance ,residual module ,Biotechnology ,TP248.13-248.65 ,Mathematics ,QA1-939 - Abstract
The precise segmentation of tumor regions plays a pivotal role in the diagnosis and treatment of brain tumors. However, due to the variable location, size, and shape of brain tumors, the automatic segmentation of brain tumors is a relatively challenging application. Recently, U-Net related methods, which largely improve the segmentation accuracy of brain tumors, have become the mainstream of this task. Following merits of the 3D U-Net architecture, this work constructs a novel 3D U-Net model called SGEResU-Net to segment brain tumors. SGEResU-Net simultaneously embeds residual blocks and spatial group-wise enhance (SGE) attention blocks into a single 3D U-Net architecture, in which SGE attention blocks are employed to enhance the feature learning of semantic regions and reduce possible noise and interference with almost no extra parameters. Besides, the self-ensemble module is also utilized to improve the segmentation accuracy of brain tumors. Evaluation experiments on the Brain Tumor Segmentation (BraTS) Challenge 2020 and 2021 benchmarks demonstrate the effectiveness of the proposed SGEResU-Net for this medical application. Moreover, it achieves DSC values of 83.31, 91.64 and 86.85%, as well as Hausdorff distances (95%) of 19.278, 5.945 and 7.567 for the enhancing tumor, whole tumor, and tumor core on BraTS 2021 dataset, respectively.
- Published
- 2022
- Full Text
- View/download PDF
34. SERNet: Squeeze and Excitation Residual Network for Semantic Segmentation of High-Resolution Remote Sensing Images.
- Author
-
Zhang, Xiaoyan, Li, Linhui, Di, Donglin, Wang, Jian, Chen, Guangsheng, Jing, Weipeng, and Emam, Mahmoud
- Subjects
- *
REMOTE sensing , *DIGITAL elevation models , *IMAGE segmentation , *IMAGE processing , *SQUEEZED light - Abstract
The semantic segmentation of high-resolution remote sensing images (HRRSIs) is a basic task for remote sensing image processing and has a wide range of applications. However, the abundant texture information and wide imaging range of HRRSIs lead to the complex distribution of ground objects and unclear boundaries, which bring huge challenges to the segmentation of HRRSIs. To solve this problem, in this paper we propose an improved squeeze and excitation residual network (SERNet), which integrates several squeeze and excitation residual modules (SERMs) and a refine attention module (RAM). The SERM can recalibrate feature responses adaptively by modeling the long-range dependencies in the channel and spatial dimensions, which enables effective information to be transmitted between the shallow and deep layers. The RAM pays attention to global features that are beneficial to segmentation results. Furthermore, the ISPRS datasets were processed to focus on the segmentation of vegetation categories and introduce Digital Surface Model (DSM) images to learn and integrate features to improve the segmentation accuracy of surface vegetation, which has certain prospects in the field of forestry applications. We conduct a set of comparative experiments on ISPRS Vaihingen and Potsdam datasets. The results verify the superior performance of the proposed SERNet. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
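A squeeze-and-excitation residual module like the SERM described above pairs a standard residual block with an SE channel gate. The PyTorch sketch below shows the generic pattern (the reduction ratio and layout are assumptions); it is not the SERNet code.

```python
import torch
import torch.nn as nn

class SEResidualBlock(nn.Module):
    """Residual block followed by a squeeze-and-excitation channel recalibration."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels),
        )
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # squeeze: global context per channel
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),  # excitation gate
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.body(x)
        out = out * self.se(out)       # recalibrate channel responses
        return self.act(out + x)

if __name__ == "__main__":
    print(SEResidualBlock(64)(torch.randn(1, 64, 56, 56)).shape)
```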
35. PGU-net+: Progressive Growing of U-net+ for Automated Cervical Nuclei Segmentation
- Author
-
Zhao, Jie, Dai, Lei, Zhang, Mo, Yu, Fei, Li, Meng, Li, Hongfeng, Wang, Wenjia, Zhang, Li, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Li, Quanzheng, editor, Leahy, Richard, editor, Dong, Bin, editor, and Li, Xiang, editor
- Published
- 2020
- Full Text
- View/download PDF
36. Strawberry Growth Period Recognition Method Under Greenhouse Environment Based on Improved YOLOv4
- Author
-
LONG Jiehua, GUO Wenzhong, LIN Sen, WEN Chaowu, ZHANG Yu, and ZHAO Chunjiang
- Subjects
object detection ,strawberry ,growth period recognition ,yolov4 ,residual module ,attention mechanism ,loss function ,Agriculture (General) ,S1-972 ,Technology (General) ,T1-995 - Abstract
Aiming at the real-time detection and classification of crop growth periods in current digital cultivation and regulation technology for facility agriculture, an improved YOLOv4 method for identifying strawberry growth periods in a greenhouse environment was proposed. An attention mechanism was introduced into the Cross Stage Partial Residual (CSPRes) module of the YOLOv4 backbone network to integrate target feature information from different strawberry growth periods while reducing interference from complex backgrounds, improving detection accuracy while ensuring real-time detection efficiency. Taking smart-facility strawberries in Yunnan province as the test object, the results showed that the detection accuracy (AP) of the YOLOv4-CBAM model for the flowering, fruit expansion, green and mature periods was 92.38%, 82.45%, 68.01% and 92.31%, respectively; the mean average precision (mAP) was 83.78%, the mean intersection over union (mIoU) was 77.88%, and the detection time for a single image was 26.13 ms. Compared with the YOLOv4-SC model, mAP and mIoU increased by 1.62% and 2.73%, respectively; compared with the YOLOv4-SE model, by 4.81% and 3.46%; and compared with the YOLOv4 model, by 8.69% and 5.53%. Adding the attention mechanism to the improved YOLOv4 model increased the number of parameters, but the detection time of the improved models only slightly increased. At the same time, YOLOv4 recognized fewer fruit-expansion-period targets than YOLOv4-CBAM, YOLOv4-SC and YOLOv4-SE, because the color characteristics of the fruit expansion period are similar to the leaf background, making YOLOv4 susceptible to background interference, which the added attention mechanism reduces. YOLOv4-CBAM achieved higher confidence and more identifications of strawberry growth stages than the YOLOv4-SC, YOLOv4-SE and YOLOv4 models, indicating that the YOLOv4-CBAM model can extract more comprehensive and richer features and focus more on the targets, thereby improving detection accuracy. The YOLOv4-CBAM model can meet the demand for real-time detection of strawberry growth period status.
- Published
- 2021
- Full Text
- View/download PDF
37. Seismic Impedance Inversion Based on Residual Attention Network.
- Author
-
Wu, Bangyu, Xie, Qiao, and Wu, Baohai
- Subjects
- *
CONVOLUTIONAL neural networks , *DEEP learning - Abstract
Deep learning (DL) has achieved promising results for impedance inversion via seismic data. Generally, these networks, composed of convolution layers and residual blocks, tend to deliver good results with deep architectures. Nevertheless, deep networks accompany a large number of parameters and longer training time. The volume of seismic data, especially in 3-D scenarios, is very large. Therefore, it is particularly important to improve accuracy while ensuring model efficiency for practical implementation. With flourishing new modules and techniques, DL has set the state of the art in many applications across a wide range of scientific and engineering disciplines. In this article, we present a residual attention network (ResANet), a convolutional neural network (CNN) incorporating residual modules and two attention mechanisms, channelwise attention and feature-map attention, for seismic impedance inversion. The proposed network can fuse multiscale channel information and recalibrate channelwise feature responses as well as receptive fields adaptively. At the same time, ResANet adopts grouped convolution, dilated convolution, and dropout techniques to improve computational efficiency and stability. The Marmousi2 synthetic model and field data test results show that the proposed network outperforms several comparable neural networks in accuracy and generalization ability while ensuring efficiency for seismic data impedance inversion. For the field data test, transfer learning is also employed to further improve performance. ResANet tends to predict impedance with high resolution and strong lateral continuity compared with three closely related networks. The accuracy of ResANet is improved by 1–2 orders of magnitude on the six well logs provided in field dataset tests compared with commercial software (the InverTrace Plus module in Jason) using the constrained sparse spike inversion (CSSI) method. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
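The record above describes ResANet's channelwise attention inside residual blocks, together with grouped convolution, dilated convolution and dropout. Below is a minimal, hedged PyTorch sketch of one such block for 1-D trace data; the 1-D formulation, channel count, group count, dilation rate and squeeze-and-excitation-style gating are illustrative assumptions, not the published ResANet design.

```python
# Hedged sketch only: a 1-D residual block with squeeze-and-excitation style
# channelwise recalibration, grouped/dilated convolutions and dropout.
# All sizes below are assumptions for illustration.
import torch
import torch.nn as nn

class ChannelAttnResBlock1d(nn.Module):
    def __init__(self, channels=64, groups=4, dilation=2, p_drop=0.1, reduction=8):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, 3, padding=dilation, dilation=dilation, groups=groups),
            nn.BatchNorm1d(channels),
            nn.ReLU(inplace=True),
            nn.Dropout(p_drop),
            nn.Conv1d(channels, channels, 3, padding=1, groups=groups),
            nn.BatchNorm1d(channels),
        )
        # Channelwise attention: squeeze (global average pool) then excite (gate).
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool1d(1),
            nn.Conv1d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv1d(channels // reduction, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        y = self.body(x)
        return torch.relu(x + y * self.se(y))  # recalibrate channels, then residual add

trace_feats = torch.randn(2, 64, 512)  # hypothetical per-trace feature maps
print(ChannelAttnResBlock1d()(trace_feats).shape)  # torch.Size([2, 64, 512])
```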
38. IRDNU-Net: Inception residual dense nested u-net for brain tumor segmentation.
- Author
-
AboElenein, Nagwa M., Songhao, Piao, and Afifi, Ahmed
- Subjects
BRAIN tumors ,VIDEO coding ,CONVOLUTIONAL neural networks ,IMAGE segmentation ,COMPUTATIONAL complexity - Abstract
Accurate segmentation of brain tumors is an essential stage in treatment planning. Fully convolutional neural networks, specifically encoder-decoder architectures such as U-Net, have proven successful in medical image segmentation. However, segmenting brain tumors with complex structure requires a deeper and wider model, which increases computational complexity and may also cause the vanishing gradient problem. Therefore, in this work we propose a novel encoder-decoder architecture called Inception Residual Dense Nested U-Net (IRDNU-Net). In this model, carefully designed Residual and Inception modules are used in place of standard U-Net convolutional layers to increase the width of the model without increasing its computational complexity. Additionally, the encoder and decoder are connected via a sequence of Inception-Residual densely nested paths to extract more information and increase the depth of the network while reducing the number of network parameters. The proposed segmentation architecture was evaluated on two large brain tumor segmentation benchmarks, BraTS'2019 and BraTS'2020. It achieved a mean Dice similarity coefficient of 0.888 for the whole tumor region, 0.876 for the core region, and 0.819 for the enhancing region. Experimental results show that IRDNU-Net outperforms U-Net by 1.8%, 11.4%, and 11.7% on the whole tumor, core tumor, and enhancing tumor, respectively. Moreover, IRDNU-Net achieves a large accuracy improvement over comparable approaches with fewer parameters, and copes well with challenging cases such as small tumor regions. [ABSTRACT FROM AUTHOR] A short sketch of an inception-style residual module follows this record.
- Published
- 2022
- Full Text
- View/download PDF
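The IRDNU-Net record above replaces plain U-Net convolutions with Residual and Inception modules to widen the network cheaply. The sketch below shows one generic inception-style block wrapped in a residual connection; the branch widths, kernel sizes and the 1×1 fusion layer are assumptions rather than the authors' exact module.

```python
# Hedged sketch: parallel 1x1 / 3x3 / stacked-3x3 / pooling branches, concatenated
# and fused, with an identity shortcut around the whole block.
import torch
import torch.nn as nn

def conv_bn(cin, cout, k):
    return nn.Sequential(
        nn.Conv2d(cin, cout, k, padding=k // 2, bias=False),
        nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

class InceptionResidual(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        b = channels // 4
        self.branch1 = conv_bn(channels, b, 1)
        self.branch3 = nn.Sequential(conv_bn(channels, b, 1), conv_bn(b, b, 3))
        self.branch5 = nn.Sequential(conv_bn(channels, b, 1), conv_bn(b, b, 3), conv_bn(b, b, 3))
        self.branchp = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1), conv_bn(channels, b, 1))
        self.fuse = nn.Conv2d(4 * b, channels, 1, bias=False)

    def forward(self, x):
        out = torch.cat([self.branch1(x), self.branch3(x),
                         self.branch5(x), self.branchp(x)], dim=1)
        return torch.relu(x + self.fuse(out))  # residual shortcut around the inception block

x = torch.randn(1, 64, 128, 128)
print(InceptionResidual(64)(x).shape)  # torch.Size([1, 64, 128, 128])
```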
39. A YOLOv3 Model Based on Improved Feature Extraction and Fusion Modules (基于改进特征提取及融合模块的YOLOv3模型).
- Author
-
赵轩, 周凡, and 余汉成
- Subjects
- *
FEATURE extraction , *OBJECT recognition (Computer vision) , *PROBLEM solving , *DEEP learning , *A priori - Abstract
The feature extraction branch and the multi-scale detection branch of the YOLOv3 model leave room for optimization. To address this, this study proposes two structural improvements that raise the detection accuracy of the model on a target detection dataset. For the three detection scales of YOLOv3 (13×13, 26×26, 52×52), prior anchor boxes of different widths and heights are used while the label boxes are shared across the three scales, and a feature fusion method between scales is designed to improve accuracy. To address the fact that a convolutional layer shares the same spatial receptive field everywhere, the original convolutional layers can be replaced with deformable convolutions to further improve accuracy. Tests on an industrial tool dataset show that the improved model raises test-set accuracy by 3.6 mAP compared with the original YOLOv3. [ABSTRACT FROM AUTHOR] A short sketch of a deformable-convolution block follows this record.
- Published
- 2022
- Full Text
- View/download PDF
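The record above proposes replacing standard convolutions with deformable convolutions in the improved YOLOv3. The sketch below shows the common pattern of pairing torchvision's DeformConv2d with a small convolution that predicts the sampling offsets; the offset-prediction layer and the channel sizes are assumptions for illustration.

```python
# Hedged sketch: a 3x3 deformable convolution whose per-location sampling offsets
# are predicted by an auxiliary 3x3 convolution (an assumed design).
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableConvBlock(nn.Module):
    def __init__(self, cin, cout, k=3):
        super().__init__()
        # 2 offsets (dx, dy) per kernel tap -> 2 * k * k offset channels
        self.offset = nn.Conv2d(cin, 2 * k * k, k, padding=k // 2)
        self.deform = DeformConv2d(cin, cout, k, padding=k // 2)

    def forward(self, x):
        return self.deform(x, self.offset(x))

x = torch.randn(1, 256, 13, 13)  # a 13x13 detection-scale feature map (illustrative)
print(DeformableConvBlock(256, 256)(x).shape)  # torch.Size([1, 256, 13, 13])
```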
40. Real-Time Target Detection Method Based on Lightweight Convolutional Neural Network
- Author
-
Juntong Yun, Du Jiang, Ying Liu, Ying Sun, Bo Tao, Jianyi Kong, Jinrong Tian, Xiliang Tong, Manman Xu, and Zifan Fang
- Subjects
Deep learning ,target detection ,MobileNets-SSD ,depthwise separable convolution ,residual module ,Biotechnology ,TP248.13-248.65 - Abstract
The continuous development of deep learning improves target detection technology day by day. Current research focuses on improving detection accuracy, which makes target detection models too large; however, the number of parameters and the detection speed of a model are crucial for the practical application of target detection in embedded systems. This article proposes a real-time target detection method based on a lightweight convolutional neural network that reduces the number of model parameters and improves detection speed. A depthwise separable residual module is constructed by combining depthwise separable convolution with a non-bottleneck residual module, and this module, together with depthwise separable convolution structures, replaces the VGG backbone in the SSD network for feature extraction, reducing the parameter count and improving detection speed. At the same time, 1 × 3 and 3 × 1 convolution kernels are added to replace the standard 3 × 3 convolution, yielding the multiple detection feature maps corresponding to SSD, and the real-time target detection model based on a lightweight convolutional neural network is established by fusing the information of these detection feature maps. Comparative experiments on a self-built target detection dataset of complex scenes verify the effectiveness and superiority of the proposed method. The model is tested on video to verify its real-time performance and deployed on the Android platform to verify its scalability. A short sketch of a depthwise separable residual block and the 1 × 3 / 3 × 1 factorization follows this record.
- Published
- 2022
- Full Text
- View/download PDF
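The record above builds its lightweight backbone from depthwise separable residual modules and replaces 3 × 3 convolutions with 1 × 3 and 3 × 1 kernels when producing the extra SSD detection maps. A hedged sketch of both ingredients follows; the channel widths, activation choice and block depth are assumptions.

```python
# Hedged sketch: a depthwise separable residual block plus a 1x3 / 3x1 factorised
# stand-in for a standard 3x3 convolution. Sizes are illustrative assumptions.
import torch
import torch.nn as nn

def dw_separable(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cin, 3, padding=1, groups=cin, bias=False),  # depthwise 3x3
        nn.BatchNorm2d(cin), nn.ReLU6(inplace=True),
        nn.Conv2d(cin, cout, 1, bias=False),                        # pointwise 1x1
        nn.BatchNorm2d(cout), nn.ReLU6(inplace=True))

class DWSeparableResidual(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.block = dw_separable(channels, channels)

    def forward(self, x):
        return x + self.block(x)  # identity shortcut keeps the block lightweight

# 1x3 followed by 3x1, approximating a 3x3 convolution with fewer parameters
factorised = nn.Sequential(
    nn.Conv2d(256, 256, (1, 3), padding=(0, 1)),
    nn.Conv2d(256, 256, (3, 1), padding=(1, 0)))

x = torch.randn(1, 256, 19, 19)
print(DWSeparableResidual(256)(x).shape, factorised(x).shape)
```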
41. Second-order ResU-Net for automatic MRI brain tumor segmentation
- Author
-
Ning Sheng, Dongwei Liu, Jianxia Zhang, Chao Che, and Jianxin Zhang
- Subjects
brain tumor segmentation ,second-order statistics ,u-net ,residual module ,Biotechnology ,TP248.13-248.65 ,Mathematics ,QA1-939 - Abstract
Tumor segmentation using magnetic resonance imaging (MRI) plays a significant role in assisting brain tumor diagnosis and treatment. Recently, the U-Net architecture and its variants have become prevalent in the field of brain tumor segmentation. However, existing U-Net models mainly exploit coarse first-order features for tumor segmentation and seldom consider the more powerful second-order statistics of deep features. In this work, we therefore explore the effectiveness of second-order statistical features for the brain tumor segmentation application and propose a novel second-order residual brain tumor segmentation network, SoResU-Net. SoResU-Net uses a number of second-order modules to replace the original skip connection operations, augmenting the series of transformation operations and increasing the non-linearity of the segmentation network. Extensive experimental results on the BraTS 2018 and BraTS 2019 datasets demonstrate that SoResU-Net outperforms its baseline, especially on core tumor and enhancing tumor segmentation, illustrating the effectiveness of second-order statistical features for brain tumor segmentation. A short sketch of a second-order skip transform follows this record.
- Published
- 2021
- Full Text
- View/download PDF
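The SoResU-Net record above replaces plain skip connections with modules built on second-order statistics of the features. One plausible, hedged reading is a covariance-pooling-style reweighting of the skip features, sketched below; this generic formulation (channel covariance, sigmoid summary, 1 × 1 projection) is an assumption and not the authors' exact module.

```python
# Hedged sketch: reweight skip-connection features using channel covariance
# (second-order statistics), then project and keep a residual path.
import torch
import torch.nn as nn

class SecondOrderSkip(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        f = x.flatten(2)                             # (b, c, h*w)
        f = f - f.mean(dim=2, keepdim=True)          # centre each channel
        cov = f @ f.transpose(1, 2) / (h * w - 1)    # (b, c, c) channel covariance
        weights = torch.sigmoid(cov.mean(dim=2))     # per-channel second-order summary
        return self.proj(x * weights.view(b, c, 1, 1)) + x

skip = torch.randn(2, 64, 56, 56)
print(SecondOrderSkip(64)(skip).shape)  # torch.Size([2, 64, 56, 56])
```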
42. IRDC-Net: An Inception Network with a Residual Module and Dilated Convolution for Sign Language Recognition Based on Surface Electromyography
- Author
-
Xiangrui Wang, Lu Tang, Qibin Zheng, Xilin Yang, and Zhiyuan Lu
- Subjects
sign language recognition ,surface electromyogram ,inception network ,residual module ,dilated convolution ,Chemical technology ,TP1-1185 - Abstract
Deaf and hearing-impaired people often face communication barriers. Non-invasive surface electromyography (sEMG) sensor-based sign language recognition (SLR) technology can help them integrate better into social life. Since the traditional tandem convolutional neural network (CNN) structure used in most CNN-based studies inadequately captures the features of the input data, we propose a novel inception architecture with a residual module and dilated convolution (IRDC-net) to enlarge the receptive fields and enrich the feature maps, applying it to SLR tasks for the first time. This work first transformed the time-domain signal into the time-frequency domain using the discrete Fourier transform. Second, an IRDC-net was constructed to recognize ten Chinese sign language signs. Third, the tandem CNNs VGG-net and ResNet-18 were compared with our proposed parallel-structure network, IRDC-net. Finally, the public dataset Ninapro DB1 was used to verify the generalization performance of IRDC-net. The results showed that, after transforming the time-domain sEMG signal into the time-frequency domain, the classification accuracy increased from 84.29% to 91.70% when using IRDC-net on our sign language dataset. Furthermore, on the time-frequency representation of the public Ninapro DB1 dataset, the classification accuracy reached 89.82%, higher than that achieved in other recent studies. As such, our findings contribute to research on SLR tasks and to improving the daily lives of deaf and hearing-impaired people. A short sketch of the time-frequency transform and a dilated inception block follows this record.
- Published
- 2023
- Full Text
- View/download PDF
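The IRDC-net record above first maps the sEMG signal to a time-frequency representation and then applies an inception structure with residual connections and dilated convolutions. The sketch below pairs a short-time Fourier transform with parallel dilated branches and a residual add; the window length, dilation rates and channel counts are assumptions.

```python
# Hedged sketch: sEMG window -> magnitude spectrogram -> parallel dilated
# convolutions with a width-matched residual path. Settings are illustrative.
import torch
import torch.nn as nn

def to_time_frequency(emg, n_fft=64):
    # emg: (batch, samples) -> (batch, 1, freq_bins, frames)
    spec = torch.stft(emg, n_fft=n_fft, hop_length=n_fft // 2,
                      window=torch.hann_window(n_fft), return_complex=True)
    return spec.abs().unsqueeze(1)

class DilatedInceptionBlock(nn.Module):
    def __init__(self, cin=1, cout=16):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(cin, cout, 3, padding=d, dilation=d) for d in (1, 2, 4)])
        self.skip = nn.Conv2d(cin, 3 * cout, 1)  # match the concatenated width

    def forward(self, x):
        y = torch.cat([branch(x) for branch in self.branches], dim=1)
        return torch.relu(y + self.skip(x))      # residual connection

emg = torch.randn(8, 1024)                       # 8 windows of one sEMG channel (assumed)
tf = to_time_frequency(emg)
print(tf.shape, DilatedInceptionBlock()(tf).shape)
```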
43. Lightweight Dual-Stream Residual Network for Single Image Super-Resolution
- Author
-
Yichun Jiang, Yunqing Liu, Weida Zhan, and Depeng Zhu
- Subjects
Deep convolutional neural networks ,single image super-resolution ,lightweight model ,model compression ,residual module ,up-sampling module ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Deep convolutional neural networks have achieved great success in the single image super-resolution task, and among well-known super-resolution methods, deep learning-based algorithms show the most advanced performance. However, the most advanced algorithms currently use complex networks with large numbers of parameters, which makes it difficult to apply deep learning algorithms on mobile devices. To solve this problem, we propose a lightweight dual-stream residual network (LDRN) for single image super-resolution, which offers better reconstruction quality than most current advanced lightweight algorithms. Owing to its fewer parameters and lower computational cost, real-time and mobile applications of our network can be realized easily. On the basis of the residual module, we propose a new residual unit that uses two depthwise separable (DW) convolutions to obtain a better balance between feature extraction capacity and lightweight performance. We further design a dual-stream residual block that contains a multiplication branch and an addition branch; this block improves reconstruction performance more effectively than expanding the network width. In addition, we design a new up-sampling module that simplifies previous up-sampling methods. Extensive experimental results show that our network achieves better reconstruction and lightweight performance than most existing state-of-the-art algorithms. Our code is available at https://github.com/Jiangyichun-cust/pytorch-LDRN. A short sketch of a dual-stream residual block follows this record.
- Published
- 2021
- Full Text
- View/download PDF
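The LDRN record above describes a dual-stream residual block with an addition branch and a multiplication branch built from depthwise separable convolutions. A hedged sketch of that idea follows; the layer counts, activation and gating choices are assumptions rather than the released LDRN code.

```python
# Hedged sketch: residual block whose output adds an "addition branch" modulated
# by a sigmoid-gated "multiplication branch", both using depthwise separable convs.
import torch
import torch.nn as nn

def dw_conv(c):
    return nn.Sequential(
        nn.Conv2d(c, c, 3, padding=1, groups=c, bias=False),  # depthwise
        nn.Conv2d(c, c, 1, bias=False))                        # pointwise

class DualStreamResidual(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        self.add_branch = nn.Sequential(dw_conv(channels), nn.PReLU(), dw_conv(channels))
        self.mul_branch = nn.Sequential(dw_conv(channels), nn.Sigmoid())

    def forward(self, x):
        return x + self.add_branch(x) * self.mul_branch(x)

x = torch.randn(1, 32, 64, 64)
print(DualStreamResidual()(x).shape)  # torch.Size([1, 32, 64, 64])
```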
44. A Rectal CT Tumor Segmentation Method Based on Improved U-Net.
- Author
-
Dong, Haowei, Zhang, Haifei, Wu, Fang, Qiu, Jianlin, Zhang, Jian, and Wang, Haoyu
- Subjects
- *
RECTAL cancer , *ENDORECTAL ultrasonography , *COMPUTED tomography , *CANCER diagnosis , *IMAGE segmentation ,RECTUM tumors - Abstract
Automatic and accurate segmentation of the tumor area from rectal CT images plays a key role in the diagnosis and treatment of rectal cancer. This paper proposes the MR-U-Net network model. The improvements are that a pair of encoder and decoder is added longitudinally to the U-shaped structure, forming a fifth layer, and a residual module is added horizontally to the encoder and decoder of each layer. This model is used for targeted research on the automatic segmentation of rectal cancer. [H. Gao et al., Rectal tumor segmentation method based on U-Net improved model, J. Comput. Appl. 40(8) (2020) 2392–2397] also improved U-Net and used the same dataset as this paper, but reported a Dice coefficient of only 83.15% for all targets and 87.17% for small targets. This paper evaluates the improved MR-U-Net model using precision, recall and the Dice coefficient and finds that, compared with that work, precision reaches 95.13% (2.29% higher), recall reaches 94.28% (0.34% higher), the Dice coefficient for all targets reaches 88.45% (an increase of 5.3%), and the Dice coefficient for small targets increases by 1.28%, the best optimization result of this paper. Experiments show that, for datasets with extremely skewed positive and negative samples, the MR-U-Net structure with improved optimizer hyperparameters segments rectal CT tumor lesion areas more accurately. [ABSTRACT FROM AUTHOR] A short sketch of the Dice coefficient computation follows this record.
- Published
- 2022
- Full Text
- View/download PDF
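The record above reports precision, recall and Dice coefficients for the segmented masks. For reference, a minimal sketch of the Dice coefficient as it is commonly computed for binary masks follows; the smoothing constant and the random placeholder masks are assumptions, not details taken from the paper.

```python
# Hedged sketch: Dice similarity coefficient for a pair of binary masks.
import torch

def dice_coefficient(pred, target, eps=1e-6):
    """pred, target: binary tensors of identical shape."""
    pred, target = pred.float().flatten(), target.float().flatten()
    intersection = (pred * target).sum()
    return (2 * intersection + eps) / (pred.sum() + target.sum() + eps)

pred = (torch.rand(1, 256, 256) > 0.5).long()   # placeholder prediction mask
gt = (torch.rand(1, 256, 256) > 0.5).long()     # placeholder ground-truth mask
print(float(dice_coefficient(pred, gt)))
```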
45. Ancient mural restoration based on a modified generative adversarial network
- Author
-
Jianfang Cao, Zibang Zhang, Aidi Zhao, Hongyan Cui, and Qi Zhang
- Subjects
Generative adversarial network ,Fully convolutional network ,Residual module ,Mural restoration ,Fine Arts ,Analytical chemistry ,QD71-142 - Abstract
Abstract How to effectively protect ancient murals has become an urgent and important problem. Developments in digital image processing have made it possible to repair damaged murals to a certain extent. This study proposes a consistency-enhanced generative adversarial network (GAN) model to repair missing mural areas. First, convolutional layers from a fully convolutional network (FCN) extract deep image features; then, through deconvolution, the features are mapped back to the size of the original image and the repaired image is output, completing the generative network. Next, global and local discriminator networks determine whether the repaired mural image is "authentic" in terms of both the modified and unmodified areas. Through adversarial learning, the generative and discriminator network models are optimized to better complete the mural repair. The network introduces dilated convolution, which increases the receptive field of the convolution kernel; each convolutional layer applies batch normalization (BN) to accelerate network convergence and allow a larger number of network layers; and a residual module is adopted to avoid the vanishing gradient problem and further optimize the network. Compared with existing mural restoration algorithms, the proposed algorithm increases the peak signal-to-noise ratio (PSNR) by an average of 6–8 dB and the structural similarity (SSIM) index by 0.08–0.12. From a visual perspective, the algorithm successfully completes mural images with complex textures and large missing areas; thus, it may contribute to digital restoration of ancient murals. A short sketch of a dilated, batch-normalized residual block follows this record.
- Published
- 2020
- Full Text
- View/download PDF
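The mural-restoration record above lists dilated convolution, batch normalization and residual modules as the generator's key ingredients. A hedged sketch of one such building block follows; the width and dilation rate are assumptions, not the paper's configuration.

```python
# Hedged sketch: residual unit with dilated, batch-normalised 3x3 convolutions.
import torch
import torch.nn as nn

class DilatedResBlock(nn.Module):
    def __init__(self, channels=64, dilation=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation, bias=False),
            nn.BatchNorm2d(channels))

    def forward(self, x):
        return torch.relu(x + self.body(x))  # shortcut mitigates vanishing gradients

x = torch.randn(1, 64, 128, 128)
print(DilatedResBlock()(x).shape)  # torch.Size([1, 64, 128, 128])
```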
46. Cooperative Coupled Generative Networks for Generalized Zero-Shot Learning
- Author
-
Liang Sun, Junjie Song, Ye Wang, and Baoyu Li
- Subjects
Zero-shot learning ,generalized zero-shot learning ,generative adversarial network ,neural network ,residual module ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Compared with zero-shot learning (ZSL), generalized zero-shot learning (GZSL) is more challenging since its test samples are drawn from both seen and unseen classes. Most previous mapping-based methods perform well on ZSL, while their performance degrades on GZSL. To solve this problem, and inspired by ensemble learning, this paper proposes a model with cooperative coupled generative networks (CCGN). First, to alleviate the hubness problem, the reverse visual feature space is taken as the embedding space, with the mapping achieved by a visual feature center generation network. To learn a proper visual representation of each class, we propose a pair of coupled generative networks that cooperate to synthesize a visual feature center template for the class. Second, to improve the generative ability of the coupled networks, we employ a deeper generator network; meanwhile, to alleviate the loss of semantic information caused by the additional network layers, a residual module is employed. Third, to mitigate overfitting and increase scalability, an adversarial network is introduced to discriminate the generated visual feature centers. Finally, a reconstruction network, which reverses the generation process, is employed to constrain the structural correlation between the generated visual feature center and the original semantic representation of each class. Extensive experiments on five benchmark datasets (AWA1, AWA2, CUB, SUN, APY) demonstrate that the proposed algorithm yields satisfactory results compared with state-of-the-art methods. A short sketch of a residual generator network follows this record.
- Published
- 2020
- Full Text
- View/download PDF
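The CCGN record above generates a visual feature center for each class from its semantic representation, using a deeper generator stabilized by a residual module. The sketch below shows a generic attribute-to-feature generator with residual MLP blocks; the dimensionalities (85-d attributes, 2048-d visual features) follow common ZSL setups and, like the layer counts, are assumptions.

```python
# Hedged sketch: semantic attributes -> visual feature centre, with residual MLP
# blocks to limit semantic information loss in a deeper generator.
import torch
import torch.nn as nn

class ResidualMLPBlock(nn.Module):
    def __init__(self, dim=1024):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(dim, dim), nn.LeakyReLU(0.2), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.fc(x)  # residual path carries the semantic signal forward

class FeatureCentreGenerator(nn.Module):
    def __init__(self, attr_dim=85, hidden=1024, feat_dim=2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(attr_dim, hidden), nn.LeakyReLU(0.2),
            ResidualMLPBlock(hidden), ResidualMLPBlock(hidden),
            nn.Linear(hidden, feat_dim))

    def forward(self, class_attributes):
        return self.net(class_attributes)

attrs = torch.randn(10, 85)  # 10 hypothetical class attribute vectors
print(FeatureCentreGenerator()(attrs).shape)  # torch.Size([10, 2048])
```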
47. Attention Gate ResU-Net for Automatic MRI Brain Tumor Segmentation
- Author
-
Jianxin Zhang, Zongkang Jiang, Jing Dong, Yaqing Hou, and Bin Liu
- Subjects
MRI ,brain tumor segmentation ,U-Net ,attention gate ,residual module ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Brain tumor segmentation technology plays a pivotal role in the diagnosis and treatment of MRI brain tumors. It helps doctors locate and measure tumors and develop treatment and rehabilitation strategies. Recently, MRI brain tumor segmentation methods based on the U-Net architecture have become popular, as they largely improve segmentation accuracy by applying skip connections to combine high-level and low-level feature information. Meanwhile, researchers have demonstrated that introducing attention mechanisms into U-Net can enhance local feature expression and improve the performance of medical image segmentation. In this work, we explore the effectiveness of a recent attention module called the attention gate for the brain tumor segmentation task, and we further present a novel Attention Gate Residual U-Net model, AGResU-Net. AGResU-Net integrates residual modules and attention gates into a single, original U-Net architecture, in which a series of attention gate units are added to the skip connections to highlight salient feature information while suppressing irrelevant and noisy feature responses. AGResU-Net not only extracts abundant semantic information to enhance feature learning but also attends to the information of small-scale brain tumors. We extensively evaluate attention gate units on three authoritative MRI brain tumor benchmarks, BraTS 2017, BraTS 2018 and BraTS 2019. Experimental results show that models with attention gate units, i.e., Attention Gate U-Net (AGU-Net) and AGResU-Net, outperform their baselines, U-Net and ResU-Net, respectively. In addition, AGResU-Net achieves competitive performance compared with representative brain tumor segmentation methods. A short sketch of an attention gate follows this record.
- Published
- 2020
- Full Text
- View/download PDF
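The AGResU-Net record above adds attention gate units to the U-Net skip connections. A hedged sketch of a generic attention gate follows; the intermediate width, upsampling mode and exact placement are assumptions rather than the authors' implementation.

```python
# Hedged sketch: an attention gate that reweights encoder skip features using a
# coarser decoder (gating) signal.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGate(nn.Module):
    def __init__(self, skip_ch, gate_ch, inter_ch):
        super().__init__()
        self.w_skip = nn.Conv2d(skip_ch, inter_ch, 1)
        self.w_gate = nn.Conv2d(gate_ch, inter_ch, 1)
        self.psi = nn.Sequential(nn.Conv2d(inter_ch, 1, 1), nn.Sigmoid())

    def forward(self, skip, gate):
        gate = F.interpolate(self.w_gate(gate), size=skip.shape[2:],
                             mode='bilinear', align_corners=False)
        attn = self.psi(torch.relu(self.w_skip(skip) + gate))
        return skip * attn  # suppress irrelevant responses, keep salient ones

skip = torch.randn(1, 64, 80, 80)    # encoder feature map (assumed shape)
gate = torch.randn(1, 128, 40, 40)   # coarser decoder feature map (assumed shape)
print(AttentionGate(64, 128, 32)(skip, gate).shape)  # torch.Size([1, 64, 80, 80])
```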
48. Steganalysis of Variable Size Image Based on Efficient Feature Fusion.
- Author
-
XIAO Ruixue, FENG Yingwei, and QU Jianping
- Abstract
In order to improve the efficiency and accuracy of steganalysis and to adapt to input images of different sizes, a variable-size steganalysis model based on efficient feature fusion is proposed. In the preprocessing layer, multi-dimensional convolution kernels initialized with the multi-order high-pass filters of the spatial rich model (SRM) are added to network learning to improve the convergence efficiency and detection performance of the model. In the feature extraction layer, based on the idea of feature fusion, two subnetworks composed of Ghost bottleneck layers, residual modules and dense connection modules are designed; the output abstract steganographic features and non-linear high-dimensional steganographic features are then fused to obtain the dependency information among steganographic features, which enhances the feature expression ability of the model. An improved spatial pyramid pooling is used to adapt to variable-size image samples and enrich the diversity of steganographic features. Simulation results show that the model correctly captures the key steganographic signals and converges efficiently. The detection accuracy for the WOW steganographic algorithm at embedding rates of 0.2 and 0.4 is 82.6% and 96.5%, respectively, and for the S-UNIWARD steganographic algorithm at embedding rates of 0.2 and 0.4 it is 81.4% and 95.2%, both significantly higher than those of the SRM and Yedroudj-Net steganalysis models. [ABSTRACT FROM AUTHOR] A short sketch of spatial pyramid pooling for variable-size inputs follows this record.
- Published
- 2021
- Full Text
- View/download PDF
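The steganalysis record above relies on an improved spatial pyramid pooling layer so that one network can accept images of different sizes. The sketch below shows plain spatial pyramid pooling, which already yields a fixed-length feature vector regardless of input resolution; the pooling levels and the average-pooling choice are assumptions, not the paper's improved variant.

```python
# Hedged sketch: spatial pyramid pooling maps feature maps of any spatial size to a
# fixed-length vector by pooling onto fixed grids.
import torch
import torch.nn as nn

class SpatialPyramidPooling(nn.Module):
    def __init__(self, levels=(1, 2, 4)):
        super().__init__()
        self.pools = nn.ModuleList([nn.AdaptiveAvgPool2d(l) for l in levels])

    def forward(self, x):
        b = x.size(0)
        # each level produces a fixed-size grid, independent of the input resolution
        return torch.cat([p(x).reshape(b, -1) for p in self.pools], dim=1)

spp = SpatialPyramidPooling()
for size in (32, 64):                         # two different input resolutions
    feats = torch.randn(2, 32, size, size)
    print(spp(feats).shape)                   # always torch.Size([2, 672]) = 32*(1+4+16)
```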
49. An Improved YOLOv4 Method for Identifying Strawberry Growth Stages in a Greenhouse Environment (改进YOLOv4 的温室环境下草莓生育期识别方法).
- Author
-
龙洁花, 郭文忠, 林森, 文朝武, 张宇, and 赵春江
- Abstract
Aiming at the real-time detection and classification of crop growth stages required by current digital cultivation and regulation technology in facility agriculture, an improved YOLOv4 method for identifying strawberry growth stages in a greenhouse environment was proposed. An attention mechanism was introduced into the Cross Stage Partial Residual (CSPRes) module of the YOLOv4 backbone network to integrate the target feature information of different strawberry growth stages while reducing interference from complex backgrounds, improving detection accuracy while maintaining real-time detection efficiency. Taking smart-facility strawberries in Yunnan Province as the test object, the results showed that the detection accuracy (AP) of the YOLOv4-CBAM model for the flowering, fruit expansion, green and mature stages was 92.38%, 82.45%, 68.01% and 92.31%, respectively; the mean average precision (mAP) was 83.78%, the mean intersection over union (mIoU) was 77.88%, and the detection time for a single image was 26.13 ms. Compared with the YOLOv4-SC model, mAP and mIoU increased by 1.62% and 2.73%, respectively; compared with the YOLOv4-SE model, they increased by 4.81% and 3.46%; and compared with the YOLOv4 model, they increased by 8.69% and 5.53%. Although adding the attention mechanism increased the number of parameters of the improved YOLOv4 models, their detection time increased only slightly. Meanwhile, YOLOv4 recognized fewer fruits in the expansion stage than YOLOv4-CBAM, YOLOv4-SC and YOLOv4-SE, because the color characteristics of fruits in the expansion stage are similar to those of the leaf background, which makes YOLOv4 susceptible to leaf-background interference; the added attention mechanism reduces this interference. YOLOv4-CBAM achieved higher confidence scores and more identifications when recognizing strawberry growth stages than the YOLOv4-SC, YOLOv4-SE and YOLOv4 models, indicating that the YOLOv4-CBAM model extracts more comprehensive and richer features and focuses more closely on the targets, thereby improving detection accuracy. The YOLOv4-CBAM model can meet the demand for real-time detection of strawberry growth-stage status. [ABSTRACT FROM AUTHOR] A short sketch of a CBAM-style attention block follows this record.
- Published
- 2021
- Full Text
- View/download PDF
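The record above inserts a CBAM attention mechanism into the CSPRes module of the YOLOv4 backbone. A hedged sketch of a CBAM-style block with channel and spatial attention follows; the reduction ratio, 7 × 7 spatial kernel and the example feature-map size are assumptions, not the authors' exact configuration.

```python
# Hedged sketch: CBAM-style channel attention followed by spatial attention.
# Reduction ratio, kernel size and tensor shapes are illustrative assumptions.
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        # Channel attention: shared MLP over average- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels))
        # Spatial attention: convolution over channel-wise mean and max maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)           # channel attention
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))                   # spatial attention

feats = torch.randn(1, 128, 52, 52)  # hypothetical backbone-stage feature map
print(CBAM(128)(feats).shape)        # torch.Size([1, 128, 52, 52])
```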
50. Research on a Tunnel Crack Segmentation Algorithm Based on an Improved U-Net Network (基于改进U-Net 网络的隧道裂缝分割算法研究).
- Author
-
常惠, 饶志强, 赵玉林, and 李益晨
- Abstract
- Published
- 2021
- Full Text
- View/download PDF