374 results on '"convolutional block attention module"'
Search Results
2. PSNet: A non-uniform illumination correction method for underwater images based pseudo-siamese network
- Author
-
Zhao, Wenfeng, Rong, Shenghui, Feng, Chen, and He, Bo
- Published
- 2025
3. BWO-CAformer: An improved Informer model for AQI prediction in Beijing and Wuhan
- Author
-
Dong, Xu, Li, Deyi, Wang, Wenbo, and Shen, Yang
- Published
- 2025
4. Using Linear Channel Attention to Enhance Real-Time Colonoscopy Object Detection
- Author
-
Le, Qiwen, Dong, Lanfang, Tang, Yingchao, Kong, Derun, and Wu, Aijiu (volume editors: Ge, Shuzhi Sam, Luo, Zhuojing, Wang, Yanen, Samani, Hooman, Ji, Ruihang, and He, Hongsheng; series editor: Goos, Gerhard; founding editor: Hartmanis, Juris; editorial board members: Bertino, Elisa, Gao, Wen, Steffen, Bernhard, and Yung, Moti)
- Published
- 2025
5. Remaining Useful Life Prediction of Rolling Bearings Based on CBAM-CNN-LSTM.
- Author
-
Sun, Bo, Hu, Wenting, Wang, Hao, Wang, Lei, and Deng, Chengyang
- Subjects
- *CONVOLUTIONAL neural networks, *REMAINING useful life, *LONG short-term memory, *RELIABILITY in engineering, *DEEP learning
- Abstract
Predicting Remaining Useful Life (RUL) is vital for ensuring the reliability and safety of equipment and components. This study introduces a novel RUL prediction method that uses the Convolutional Block Attention Module (CBAM) to address the problem that Convolutional Neural Networks (CNNs) do not effectively leverage channel features and spatial features in residual life prediction. First, a Fast Fourier Transform (FFT) converts the data into the frequency domain, and the resulting frequency-domain data is fed to the CNN for feature extraction. Then, CBAM assigns channel and spatial weights to the extracted features, and the weighted features are input into a Long Short-Term Memory (LSTM) network to learn temporal features. Finally, the effectiveness of the proposed model is verified on the PHM2012 bearing dataset. Compared with several existing RUL prediction methods, the proposed method reduces the mean squared error, mean absolute error, and root mean squared error by 53%, 16.87%, and 31.68%, respectively. The experimental results also show that the method achieves good RUL prediction accuracy across various failure modes. [ABSTRACT FROM AUTHOR]
- Published
- 2025
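Editor's illustration for record 5: a minimal pure-Python sketch of the channel-then-spatial weighting that CBAM is commonly described as applying to a feature map. It is not the authors' code. The weights are random, the 3 x 3 spatial kernel and reduction ratio of 2 are toy assumptions chosen to keep the example small, and the names `channel_attention`, `spatial_attention`, and `cbam` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat, w1, w2):
    """Per-channel weights from avg- and max-pooled descriptors through a shared MLP.

    feat: C x H x W nested lists; w1: (C/r) x C; w2: C x (C/r).
    """
    def shared_mlp(v):
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w1]  # ReLU
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feat]
    mx = [max(map(max, ch)) for ch in feat]
    return [sigmoid(a + m) for a, m in zip(shared_mlp(avg), shared_mlp(mx))]


def spatial_attention(feat, kernel):
    """Per-pixel weights from a k x k convolution over [channel-mean, channel-max] maps.

    kernel: 2 x k x k nested lists; zero padding, no bias.
    """
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    k = len(kernel[0])
    pad = k // 2
    maps = [
        [[sum(feat[ch][i][j] for ch in range(c)) / c for j in range(w)] for i in range(h)],
        [[max(feat[ch][i][j] for ch in range(c)) for j in range(w)] for i in range(h)],
    ]
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for m in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < h and 0 <= x < w:
                            acc += kernel[m][di][dj] * maps[m][y][x]
            out[i][j] = sigmoid(acc)
    return out


def cbam(feat, w1, w2, kernel):
    """Apply channel attention, then spatial attention, to a C x H x W feature map."""
    ca = channel_attention(feat, w1, w2)
    refined = [[[v * ca[ch] for v in row] for row in feat[ch]] for ch in range(len(feat))]
    sa = spatial_attention(refined, kernel)
    return [[[v * sa[i][j] for j, v in enumerate(row)] for i, row in enumerate(chan)]
            for chan in refined]


# Toy usage: 4 channels, 5 x 5 map, reduction ratio 2, 3 x 3 spatial kernel (all assumed).
rng = random.Random(0)
C, H, W, R, K = 4, 5, 5, 2, 3
feat = [[[rng.uniform(-1, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(C)] for _ in range(C // R)]
w2 = [[rng.uniform(-0.5, 0.5) for _ in range(C // R)] for _ in range(C)]
kernel = [[[rng.uniform(-0.5, 0.5) for _ in range(K)] for _ in range(K)] for _ in range(2)]
out = cbam(feat, w1, w2, kernel)
print(len(out), len(out[0]), len(out[0][0]))
```

Because both attention maps pass through a sigmoid, every output value is the input value scaled by a factor between 0 and 1, and the output keeps the input's shape. In a real model the weights would be learned and the module would sit between the CNN feature extractor and the LSTM.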
6. Malicious Document Detection Based on GGE Visualization.
- Author
-
Wang, Youhe, Sun, Yi, Li, Yujie, and Zhou, Chuanqi
- Subjects
- GRAYSCALE model, FEATURE extraction, ENTROPY (Information theory), ENTROPY, ENGINEERING
- Abstract
With the development of anti-virus technology, malicious documents have gradually become the main pathway for Advanced Persistent Threat (APT) attacks, so developing effective malicious document classifiers has become particularly urgent. Current detection methods based on document structure and behavioral features face challenges in feature engineering: they have limited accuracy, consume substantial resources, and can usually detect only documents in specific formats, so they lack versatility and adaptability. To address these problems, this paper proposes a novel malicious document detection method that visualizes documents as GGE images (Grayscale, Grayscale matrix, Entropy). The GGE method visualizes the original byte sequence of the document as a grayscale image and the information entropy sequence of the document as an entropy image. It also converts the gray-level co-occurrence matrix, with the texture and spatial information stored in it, into a grayscale matrix image, and then fuses the three images into a GGE color image. The Convolutional Block Attention Module-EfficientNet-B0 (CBAM-EfficientNet-B0) model is then used for classification, with transfer learning applying a model pre-trained on the ImageNet dataset to feature extraction from GGE images. The experimental results show that the GGE method outperforms other methods and is suitable for detecting malicious documents in different formats, achieving accuracies of 99.44% and 97.39% on Portable Document Format (PDF) and Office datasets, respectively. It also takes less time during detection, so it can be applied effectively to real-time malicious document detection. [ABSTRACT FROM AUTHOR]
- Published
- 2025
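Editor's illustration for record 6: a small sketch of two of the ingredients the abstract names, laying a byte sequence out as a grayscale matrix and computing an information-entropy sequence over it. The row width, the window size, and the function names are assumptions for illustration; the abstract does not state the authors' actual parameters, and the co-occurrence-matrix channel is not shown.

```python
import math
from collections import Counter


def bytes_to_gray(data, width):
    """Lay raw bytes out row by row as a grayscale matrix (0-255), zero-padding the last row."""
    padded = data + bytes(-len(data) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def shannon_entropy(block):
    """Shannon entropy of a byte block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    n = len(block)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(block).values())


def entropy_sequence(data, window):
    """Entropy of consecutive non-overlapping windows of the byte stream."""
    return [shannon_entropy(data[i:i + window]) for i in range(0, len(data), window)]


# Toy usage on synthetic bytes: a high-entropy run followed by a constant run.
sample = bytes(range(256)) + bytes(256)
gray = bytes_to_gray(sample, width=32)
entropies = entropy_sequence(sample, window=256)
print(len(gray), len(gray[0]), entropies)
```

In the synthetic sample the first window contains every byte value once and the second contains only zeros, so the two entropy values sit at the extremes of the 0 to 8 bit range. Packed or encrypted payloads tend toward the high end, which is why an entropy image can carry a useful signal.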
7. Utilizing active learning and attention-CNN to classify vegetation based on UAV multispectral data.
- Author
-
Miao, Sheng, Wang, Chuanlong, Kong, Guangze, Yuan, Xiuhe, Shen, Xiang, and Liu, Chao
- Subjects
- *CONVOLUTIONAL neural networks, *ARTIFICIAL intelligence, *DRONE aircraft, *VEGETATION classification, *LEARNING strategies, *DEEP learning
- Abstract
This paper presents a deep learning model based on an active learning strategy. The model accurately identifies vegetation types in the study area using preprocessed multispectral data acquired by unmanned aerial vehicle (UAV) remote sensing equipment. This approach offers advantages such as high data accuracy, mobility, and easy data collection. In active learning, the minimum confidence scoring method and a sampling technique based on a data pool are employed to reduce labeling costs. The deep learning model incorporates a semantic segmentation gated full fusion module that integrates a dual attention mechanism. This module enhances the capture of detailed texture information, optimally allocates spectral weights, and improves the model's ability to distinguish between similar categories. At a labeling cost of 20%, the average accuracy of the model is 93.2%. Compared with other models, the proposed model achieved the highest classification accuracy with limited training samples. At full annotation cost, the average accuracy is 95.32%, so the 20% setting gives up only about 2 percentage points while saving 80% of the annotation cost. Active learning strategies can therefore filter out high-value samples that are beneficial for model training, greatly reducing the annotation cost. Finally, the recognition results of surface vegetation cover types in the study area are presented, and the model's accuracy is verified through field investigation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
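Editor's illustration for record 7: a sketch of pool-based sampling with a least-confidence score, which is one common reading of the "minimum confidence scoring method" the abstract mentions. The probabilities, the budget, and the function names are made up for the example and are not taken from the paper.

```python
def least_confidence_scores(probabilities):
    """Uncertainty score per sample: 1 minus the highest predicted class probability."""
    return [1.0 - max(p) for p in probabilities]


def select_for_labeling(probabilities, budget):
    """Indices of the `budget` pool samples the model is least confident about."""
    scores = least_confidence_scores(probabilities)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Hypothetical class probabilities for four unlabeled pool samples.
pool_probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.55, 0.40, 0.05],
    [0.34, 0.33, 0.33],
]
picked = select_for_labeling(pool_probs, budget=2)
print(picked)
```

The selected samples would be sent for annotation and added to the training set, and the loop repeats until the labeling budget (20% in the abstract's main result) is spent.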
8. Human Violence Detection in Videos Using Key Frame Identification and 3D CNN with Convolutional Block Attention Module.
- Author
-
Akula, Venkatesh and Kavati, Ilaiah
- Subjects
- *CONVOLUTIONAL neural networks, *CAMCORDERS, *VIDEO surveillance, *SOURCE code, *LEARNING modules
- Abstract
In recent years, there has been an increase in demand for intelligent automatic surveillance systems to detect abnormal activities at various places, such as schools, hospitals, prisons, psychiatric centers, and public gatherings. The availability of video surveillance cameras in such places enables techniques for automatically identifying violent actions and alerting the authorities to minimize loss. Deep learning-based models, such as Convolutional Neural Networks (CNNs), have shown better performance in detecting violent activities by utilizing the spatiotemporal features of video frames. In this work, we propose a violence detection model based on 3D CNN, which employs a DenseNet architecture for enhanced spatiotemporal feature capture. First, the video's redundant frames are discarded by identifying the key frames in the video. We exploit the Multi-Scale Structural Similarity Index Measure (MS-SSIM) technique to identify the key frames of the video, which contain significant information about the video. Key frame identification helps to reduce the complexity of the model. Next, the identified video key frames with the lowest MS-SSIM are forwarded to 3D CNN to extract spatiotemporal features. Furthermore, we exploit the Convolutional Block Attention Module (CBAM) to increase the representational capabilities of the 3D CNN. The results on different benchmark datasets show that the proposed violence detection method performs better than most of the existing methods. The source code for the proposed method is publicly available at https://github.com/venkateshakula19/violence-detection-using-keyframe-extraction-and-CNN-with-attention-CBAM [ABSTRACT FROM AUTHOR]
- Published
- 2024
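Editor's illustration for record 8: a sketch of similarity-based key frame selection. It deliberately swaps in a single-scale SSIM computed from whole-frame statistics as a simplified stand-in for MS-SSIM, and it uses a fixed threshold against the last kept frame; both choices are assumptions, and the authors' repository linked in the abstract is the reference for their actual procedure. Frames are flattened lists of pixel intensities.

```python
def global_ssim(a, b, max_val=255.0):
    """Single-scale SSIM computed from whole-frame statistics (no sliding window)."""
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2  # usual stabilizing constants
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    )


def select_key_frames(frames, threshold=0.9):
    """Keep a frame only when it is dissimilar enough from the last kept frame."""
    if not frames:
        return []
    kept = [0]
    for idx in range(1, len(frames)):
        if global_ssim(frames[kept[-1]], frames[idx]) < threshold:
            kept.append(idx)
    return kept


# Toy "video": two identical frames followed by two identical, very different frames.
frames = [
    [10, 20, 30, 40],
    [10, 20, 30, 40],
    [200, 10, 180, 5],
    [200, 10, 180, 5],
]
print(select_key_frames(frames))
```

Dropping near-duplicate frames before the 3D CNN is what reduces the model's input size; a production version would use a windowed, multi-scale similarity measure on real image arrays.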
9. ViT-DualAtt: An efficient pornographic image classification method based on Vision Transformer with dual attention.
- Author
-
Cai, Zengyu, Xu, Liusen, Zhang, Jianwei, Feng, Yuan, Zhu, Liang, and Liu, Fangmei
- Subjects
- *CONVOLUTIONAL neural networks, *IMAGE recognition (Computer vision), *TRANSFORMER models, *YOUNG adults, *VIRTUAL communities
- Abstract
Pornographic images not only pollute the internet environment, but also potentially harm societal values and the mental health of young people. Therefore, accurately classifying and filtering pornographic images is crucial to maintaining the safety of the online community. In this paper, we propose a novel pornographic image classification model named ViT-DualAtt. The model adopts a CNN-Transformer hierarchical structure, combining the strengths of Convolutional Neural Networks (CNNs) and Transformers to effectively capture and integrate both local and global features, thereby enhancing feature representation accuracy and diversity. Moreover, the model integrates multi-head attention and convolutional block attention mechanisms to further improve classification accuracy. Experiments were conducted using the nsfw_data_scrapper dataset publicly available on GitHub by data scientist Alexander Kim. Our results demonstrated that ViT-DualAtt achieved a classification accuracy of 97.2% ± 0.1% in pornographic image classification tasks, outperforming the current state-of-the-art model (RepVGG-SimAM) by 2.7%. Furthermore, the model achieves a pornographic image miss rate of only 1.6%, significantly reducing the risk of pornographic image dissemination on internet platforms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
10. Domain-Adaptive Underwater Target Detection Method Based on GPA + CBAM
- Author
-
Qidong LIU, Xin SHEN, Hailu LIU, Lu CONG, and Xianping FU
- Subjects
- underwater target detection, graph-induced prototype alignment, domain adaptation, convolutional block attention module, Naval architecture. Shipbuilding. Marine engineering, VM1-989
- Abstract
Underwater target detection is often more susceptible to domain shift and reduced detection accuracy. In response to this phenomenon, this article proposed a domain-adaptive underwater target detection method based on graph-induced prototype alignment (GPA). GPA obtained instance-level features in the image through graph-based information propagation between region proposals and then derived prototype representations for category-level domain alignment. The above operations could effectively aggregate different modal information of underwater targets, thereby achieving alignment between the source and target domains and reducing the impact of domain shift. In addition, in order to make the neural network focus on instance-level features under different water domain distributions, a convolutional block attention module (CBAM) was added. The experimental results have shown that the proposed method can effectively improve detection accuracy during domain shift.
- Published
- 2024
11. Cow Hoof Slippage Detecting Method Based on Enhanced DeepLabCut Model
- Author
-
NIAN Yue, ZHAO Kaixuan, and JI Jiangtao
- Subjects
- deep learning, cow hoof slippage, resnet50, decision tree, convolutional block attention module, Agriculture (General), S1-972, Technology (General), T1-995
- Abstract
[Objective]The phenomenon of hoof slipping occurs during the walking process of cows, which indicates the deterioration of the farming environment and a decline in the cows' locomotor function. Slippery grounds can lead to injuries in cows, resulting in unnecessary economic losses for farmers. To achieve automatically recognizing and detecting slippery hoof postures during walking, the study focuses on the localization and analysis of key body points of cows based on deep learning methods. Motion curves of the key body points were analyzed, and features were extracted. The effectiveness of the extracted features was verified using a decision tree classification algorithm, with the aim of achieving automatic detection of slippery hoof postures in cows.[Method]An improved localization method for the key body points of cows, specifically the head and four hooves, was proposed based on the DeepLabCut model. Ten networks, including ResNet series, MobileNet-V2 series, and EfficientNet series, were selected to respectively replace the backbone network structure of DeepLabCut for model training. The root mean square error(RMSE), model size, FPS, and other indicators were chosen, and after comprehensive consideration, the optimal backbone network structure was selected as the pre-improved network. A network structure that fused the convolutional block attention module (CBAM) attention mechanism with ResNet-50 was proposed. A lightweight attention module, CBAM, was introduced to improve the ResNet-50 network structure. To enhance the model's generalization ability and robustness, the CBAM attention mechanism was embedded into the first convolution layer and the last convolution layer of the ResNet-50 network structure. Videos of cows with slippery hooves walking in profile were predicted for key body points using the improved DeepLabCut model, and the obtained key point coordinates were used to plot the motion curves of the cows' key body points. 
Based on the motion curves of the cows' key body points, the feature parameter Feature1 for detecting slippery hooves was extracted, which represented the local peak values of the derivative of the motion curves of the cows' four hooves. The feature parameter Feature2 for predicting slippery hoof distances was extracted, specifically the minimum local peak points of the derivative curve of the hooves, along with the local minimum points to the left and right of these peaks. The effectiveness of the extracted Feature1 feature parameters was verified using a decision tree classification model. Slippery hoof feature parameters Feature1 for each hoof were extracted, and the standard deviation of Feature1 was calculated for each hoof. Ultimately, a set of four standard deviations for each cow was extracted as input parameters for the classification model. The classification performance was evaluated using four common objective metrics, including accuracy, precision, recall, and F1-Score. The prediction accuracy for slippery hoof distances was assessed using RMSE as the evaluation metric.[Results and Discussion]After all ten models reached convergence, the loss values ranked from smallest to largest were found in the EfficientNet series, ResNet series, and MobileNet-V2 series, respectively. Among them, ResNet-50 exhibited the best localization accuracy in both the training set and validation set, with RMSE values of only 2.69 pixels and 3.31 pixels, respectively. The MobileNet series had the fastest inference speed, reaching 48 f/s, while the inference speeds of the ResNet series and MobileNet series were comparable, with ResNet series performing slightly better than MobileNet series. Considering the above factors, ResNet-50 was ultimately selected as the backbone network for further improvements on DeepLabCut. 
Compared to the original ResNet-50 network, the ResNet-50 network improved by integrating the CBAM module showed a significant enhancement in localization accuracy. The accuracy of the improved network increased by 3.7% in the training set and by 9.7% in the validation set. The RMSE between the predicted body key points and manually labeled points was only 2.99 pixels, with localization results for the right hind hoof, right front hoof, left hind hoof, left front hoof, and head improved by 12.1%, 44.9%, 0.04%, 48.2%, and 39.7%, respectively. To validate the advancement of the improved model, a comparison was made with the mainstream key point localization model, YOLOv8s-pose, which showed that the RMSE was reduced by 1.06 pixels compared to YOLOv8s-pose. This indicated that the ResNet-50 network integrated with the CBAM attention mechanism possessed superior localization accuracy. In the verification of the cow slippery hoof detection classification model, a 10-fold cross-validation was conducted to evaluate the performance of the cow slippery hoof classification model, resulting in average values of accuracy, precision, recall, and F1-Score at 90.42%, 0.943, 0.949, and 0.941, respectively. The error in the calculated slippery hoof distance of the cows, using the slippery hoof distance feature parameter Feature2, compared to the manually calibrated slippery hoof distance was found to be 1.363 pixels.[Conclusion]The ResNet-50 network model improved by integrating the CBAM module showed a high accuracy in the localization of key body points of cows. The cow slippery hoof judgment model and the cow slippery hoof distance prediction model, based on the extracted feature parameters for slippery hoof judgment and slippery hoof distance detection, both exhibited small errors when compared to manual detection results. 
This indicated that the proposed enhanced DeepLabCut model achieved good accuracy and could provide technical support for the automatic detection of slippery hooves in cows.
- Published
- 2024
12. Research on Deep Learning Detection Model for Pedestrian Objects in Complex Scenes Based on Improved YOLOv7.
- Author
-
Hu, Jun, Zhou, Yongqi, Wang, Hao, Qiao, Peng, and Wan, Wenwei
- Subjects
- *OBJECT recognition (Computer vision), *AUTONOMOUS robots, *PEDESTRIANS, *AUTONOMOUS vehicles, *DETECTORS, *FEATURE extraction
- Abstract
Objective: Pedestrian detection is very important for the environment perception and safety action of intelligent robots and autonomous driving, and is the key to ensuring the safe action of intelligent robots and auto assisted driving. Methods: In response to the characteristics of pedestrian objects occupying a small image area, diverse poses, complex scenes and severe occlusion, this paper proposes an improved pedestrian object detection method based on the YOLOv7 model, which adopts the Convolutional Block Attention Module (CBAM) attention mechanism and Deformable ConvNets v2 (DCNv2) in the two Efficient Layer Aggregation Network (ELAN) modules of the backbone feature extraction network. In addition, the detection head is replaced with a Dynamic Head (DyHead) detector head with an attention mechanism; unnecessary background information around the pedestrian object is also effectively excluded, making the model learn more concentrated feature representations. Results: Compared with the original model, the log-average miss rate of the improved YOLOv7 model is significantly reduced in both the Citypersons dataset and the INRIA dataset. Conclusions: The improved YOLOv7 model proposed in this paper achieved good performance improvement in different pedestrian detection problems. The research in this paper has important reference significance for pedestrian detection in complex scenes such as small, occluded and overlapping objects. [ABSTRACT FROM AUTHOR]
- Published
- 2024
13. Microexpression Recognition Method Based on ADP-DSTN Feature Fusion and Convolutional Block Attention Module.
- Author
-
Song, Junfang, Lei, Shanzhong, and Wu, Wenzhe
- Subjects
- CONVOLUTIONAL neural networks, FACIAL expression, OPTICAL flow, STRUCTURAL optimization, DEEP learning
- Abstract
Microexpressions are subtle facial movements that occur within an extremely brief time frame, often revealing suppressed emotions. These expressions hold significant importance across various fields, including security monitoring and human–computer interaction. However, the accuracy of microexpression recognition is severely constrained by the inherent characteristics of these expressions. To address the issue of low detection accuracy regarding the subtle features present in microexpressions' facial action units, this paper proposes a microexpression action unit detection algorithm, Attention-embedded Dual Path and Shallow Three-stream Networks (ADP-DSTN), that incorporates an attention-embedded dual path and a shallow three-stream network. First, an attention mechanism was embedded after each Bottleneck layer in the foundational Dual Path Networks to extract static features representing subtle texture variations that have significant weights in the action units. Subsequently, a shallow three-stream 3D convolutional neural network was employed to extract optical flow features that were particularly sensitive to temporal and discriminative characteristics specific to microexpression action units. Finally, the acquired static facial feature vectors and optical flow feature vectors were concatenated to form a fused feature vector that encompassed more effective information for recognition. Each facial action unit was then trained individually to address the issue of weak correlations among the facial action units, thereby facilitating the classification of microexpression emotions. The experimental results demonstrated that the proposed method achieved great performance across several microexpression datasets. The unweighted average recall (UAR) values were 80.71%, 89.55%, 44.64%, 80.59%, and 88.32% for the SAMM, CASME II, CAS(ME)³, SMIC, and MEGC2019 datasets, respectively. The unweighted F1 scores (UF1) were 79.32%, 88.30%, 43.03%, 81.12%, and 88.95%, respectively. Furthermore, when compared to the benchmark model, our proposed model achieved better performance with lower computational complexity, characterized by a Floating Point Operations (FLOPs) value of 1087.350 M and a total of 6.356 × 10⁶ model parameters. [ABSTRACT FROM AUTHOR]
- Published
- 2024
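Editor's illustration for record 13: the abstract reports unweighted average recall (UAR) and unweighted F1 (UF1). The sketch below computes both as plain means of per-class recall and per-class F1 over the classes that occur in the ground truth, so every class counts equally regardless of how many samples it has. The labels are invented and the function names are hypothetical.

```python
def per_class_counts(y_true, y_pred):
    """True positives, false positives and false negatives for each class label."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = {c: {"tp": 0, "fp": 0, "fn": 0} for c in labels}
    for t, p in zip(y_true, y_pred):
        if t == p:
            counts[t]["tp"] += 1
        else:
            counts[p]["fp"] += 1
            counts[t]["fn"] += 1
    return counts


def _supported(counts):
    """Only classes that actually occur in the ground truth are averaged."""
    return [c for c in counts.values() if c["tp"] + c["fn"] > 0]


def uar(y_true, y_pred):
    """Unweighted average recall: mean of per-class recall."""
    classes = _supported(per_class_counts(y_true, y_pred))
    return sum(c["tp"] / (c["tp"] + c["fn"]) for c in classes) / len(classes)


def uf1(y_true, y_pred):
    """Unweighted F1: mean of per-class F1 = 2TP / (2TP + FP + FN)."""
    classes = _supported(per_class_counts(y_true, y_pred))
    return sum(2 * c["tp"] / (2 * c["tp"] + c["fp"] + c["fn"]) for c in classes) / len(classes)


# Toy imbalanced example: three samples of class 0, one of class 1.
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]
print(round(uar(y_true, y_pred), 4), round(uf1(y_true, y_pred), 4))
```

Plain accuracy on this toy example is 0.75, while UAR and UF1 weight the single class-1 sample as heavily as the three class-0 samples, which is why these metrics are preferred on imbalanced microexpression datasets.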
14. Motor Fault Diagnosis Based on Convolutional Block Attention Module-Xception Lightweight Neural Network.
- Author
-
Xie, Fengyun, Fan, Qiuyang, Li, Gang, Wang, Yang, Sun, Enguang, and Zhou, Shengtong
- Subjects
- *CONVOLUTIONAL neural networks, *FAULT diagnosis, *DEEP learning, *MOTOR learning, *GRAYSCALE model
- Abstract
Electric motors play a crucial role in self-driving vehicles. Therefore, fault diagnosis in motors is important for ensuring the safety and reliability of vehicles. In order to improve fault detection performance, this paper proposes a motor fault diagnosis method based on vibration signals. Firstly, the vibration signals of each operating state of the motor at different frequencies are measured with vibration sensors. Secondly, the characteristic of Gram image coding is used to realize the coding of time domain information, and the one-dimensional vibration signals are transformed into grayscale diagrams to highlight their features. Finally, the lightweight neural network Xception is chosen as the main tool, and the attention mechanism Convolutional Block Attention Module (CBAM) is introduced into the model to enforce the importance of the characteristic information of the motor faults and realize their accurate identification. Xception is a type of convolutional neural network; its lightweight design maintains excellent performance while significantly reducing the model's order of magnitude. Without affecting the computational complexity and accuracy of the network, the CBAM attention mechanism is added, and Gram's corner field is combined with the improved lightweight neural network. The experimental results show that this model achieves a better recognition effect and faster iteration speed compared with the traditional Convolutional Neural Network (CNN), ResNet, and Xception networks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
15. A Novel Perceptual Constrained cycleGAN With Attention Mechanisms for Unsupervised MR‐to‐CT Synthesis.
- Author
-
Zhu, Ruiming, Liu, Xinliang, Li, Mingrui, Qian, Wei, and Teng, Yueyang
- Subjects
- *ARTIFICIAL neural networks, *RADIOTHERAPY treatment planning, *COMPUTED tomography, *FEATURE extraction, *MAGNETIC resonance, *RADIATION exposure
- Abstract
Radiotherapy treatment planning (RTP) requires both magnetic resonance (MR) and computed tomography (CT) modalities. However, conducting separate MR and CT scans for patients leads to misalignment, increased radiation exposure, and higher costs. To address these challenges and mitigate the limitations of supervised synthesis methods, we propose a novel unsupervised perceptual attention image synthesis model based on cycleGAN (PA‐cycleGAN). The innovation of PA‐cycleGAN lies in its model structure, which incorporates dynamic feature encoding and deep feature extraction to improve the understanding of image structure and contextual information. To ensure the visual authenticity of the synthetic images, we design a hybrid loss function that incorporates perceptual constraints using high‐level features extracted by deep neural networks. Our PA‐cycleGAN achieves notable results, with an average peak signal‐to‐noise ratio (PSNR) of 28.06, structural similarity (SSIM) of 0.95, and mean absolute error (MAE) of 46.90 on a pelvic dataset. Additionally, we validate the generalization of our method by conducting experiments on an additional head dataset. These experiments demonstrate that PA‐cycleGAN consistently outperforms other state‐of‐the‐art methods in both quantitative metrics and image synthesis quality. [ABSTRACT FROM AUTHOR]
- Published
- 2024
16. Improved MobileNet V3-Based Identification Method for Road Adhesion Coefficient.
- Author
-
Li, Binglin, Xu, Jianqiang, Lian, Yufeng, Sun, Fengyu, Zhou, Jincheng, and Luo, Jun
- Subjects
- *TRAFFIC safety, *PAVEMENTS, *AUTOMOBILE safety, *SURFACE states, *FEATURE extraction
- Abstract
To enable the timely adjustment of the control strategy of automobile active safety systems, enhance their capacity to adapt to complex working conditions, and improve driving safety, this paper introduces a new method for predicting road surface state information and recognizing road adhesion coefficients using an enhanced version of the MobileNet V3 model. On one hand, the Squeeze-and-Excitation (SE) is replaced by the Convolutional Block Attention Module (CBAM). It can enhance the extraction of features effectively by considering both spatial and channel dimensions. On the other hand, the cross-entropy loss function is replaced by the Bias Loss function. It can reduce the random prediction problem occurring in the optimization process to improve identification accuracy. Finally, the proposed method is evaluated in an experiment with a four-wheel-drive ROS robot platform. Results indicate that a classification precision of 95.53% is achieved, which is higher than existing road adhesion coefficient identification methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
17. Power Supply Risk Identification Method of Active Distribution Network Based on Transfer Learning and CBAM-CNN.
- Author
-
Liu, Hengyu, Sun, Jiazheng, Pan, Yongchao, Hu, Dawei, Song, Lei, Xu, Zishang, Yu, Hailong, and Liu, Yang
- Subjects
- *CONVOLUTIONAL neural networks, *POWER distribution networks, *POWER resources, *DEEP learning, *FEATURE extraction
- Abstract
With the development of the power system, power users begin to use their own power supply in order to improve the power economy, but this also leads to the occurrence of the risk of self-provided power supply. The actual distribution network has few samples of power supply risk and it is difficult to identify the power supply risk by using conventional deep learning methods. In order to achieve high accuracy of self-provided power supply risk identification with small samples, this paper proposes a combination of transfer learning, convolutional block attention module (CBAM), and convolutional neural network (CNN) to identify the risk of self-provided power supply in an active distribution network. Firstly, in order to be able to further identify whether or not a risk will be caused based on completing the identification of the faulty line, we propose that it is necessary to identify whether or not the captive power supply on the faulty line is in operation. Second, in order to achieve high-precision identification and high-efficiency feature extraction, we propose to embed the CBAM into a CNN to form a CBAM-CNN model, so as to achieve high-efficiency feature extraction and high-precision risk identification. Finally, the use of transfer learning is proposed to solve the problem of low risk identification accuracy due to the small number of actual fault samples. Simulation experiments show that compared with other methods, the proposed method has the highest recognition accuracy and the best effect, and the risk recognition accuracy of active distribution network backup power is high in the case of fewer samples. [ABSTRACT FROM AUTHOR]
- Published
- 2024
18. Non-intrusive residential load identification based on load feature matrix and CBAM-BiLSTM algorithm.
- Author
-
Shunfu Lin, Bing Zhao, Yinfeng Zhan, Junsu Yu, Xiaoyan Bian, Dongdong Li, and Yongxin Xiong
- Subjects
- ARTIFICIAL neural networks, IDENTIFICATION, OPTIMIZATION algorithms, ALGORITHMS, MACHINE learning, MATRICES (Mathematics)
- Abstract
With the increasing demand for refined management of residential loads, non-intrusive load monitoring (NILM) technologies have attracted much attention in recent years. This paper proposes a novel method of residential load identification based on a load feature matrix and improved neural networks. First, it constructs a grayscale image in a unified-scale bitmap format from multiple load feature matrices: the V-I characteristic curve, harmonic currents of orders 1-16, the 1-cycle steady-state current waveform, maximum and minimum current values, and active and reactive power. Second, it adopts a convolutional layer to extract image features and performs further feature extraction through a convolutional block attention module (CBAM). Third, the feature matrix is converted and input to a bidirectional long short-term memory (BiLSTM) network for training and identification. Furthermore, the identification results are optimized with dynamic time warping (DTW). The effectiveness of the proposed method is verified on the commonly used PLAID database. [ABSTRACT FROM AUTHOR]
- Published
- 2024
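Editor's illustration for record 18: the abstract says identification results are refined with dynamic time warping (DTW) but does not say how. The sketch below shows only the textbook DTW distance with an absolute-difference cost, plus a hypothetical nearest-template lookup; the appliance names and waveforms are invented for the example.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def nearest_template(observed, templates):
    """Label of the template with the smallest DTW distance to the observed sequence."""
    return min(templates, key=lambda name: dtw_distance(observed, templates[name]))


# Hypothetical current-waveform templates and a time-stretched observation.
templates = {"kettle": [0, 5, 5, 0], "fridge": [1, 1, 2, 1]}
observed = [0, 0, 5, 5, 5, 0]
print(nearest_template(observed, templates))
```

DTW tolerates differences in duration and timing between two waveforms, so a stretched version of a template still matches it with zero cost, which a point-by-point comparison would not allow.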
19. Cow Hoof Slippage Detection Method Based on an Improved DeepLabCut Model.
- Author
-
年 悦, 赵凯旋, and 姬江涛
- Abstract
Copyright of Smart Agriculture is the property of Smart Agriculture Editorial Office and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
20. Lightweight deep hybrid CNN with attention mechanism for enhanced underwater image restoration
- Author
-
Karthikeyan, V., Praveen, S., and Nandan, S. Sudeep
- Published
- 2025
21. Human monkeypox disease prediction using novel modified restricted Boltzmann machine-based equilibrium optimizer
- Author
-
D. Devarajan, P. Dhana lakshmi, S. Krishnaveni, and S. Senthilkumar
- Subjects
Human Monkeypox disease prediction ,Modified restricted boltzmann machine ,Equilibrium optimizer ,Convolutional block attention module ,Medicine ,Science - Abstract
While the globe continues to struggle to recover from the devastation brought on by the COVID-19 virus's extensive distribution, the recent worrying rise in human monkeypox outbreaks in several nations raises the possibility of a novel worldwide pandemic. The symptoms of human monkeypox resemble those of chickenpox and traditional measles, with a few subtle variations like the various kinds of skin blisters. A range of deep learning techniques have demonstrated encouraging results in image-oriented tumor cell, Covid-19 diagnosis, and skin disease prediction tasks. Hence, it becomes necessary to perform the prediction of the new monkeypox disease using deep learning techniques. In this paper, an image-oriented human monkeypox disease prediction is performed with the help of novel deep learning methodology. Initially, the data is gathered from the standard benchmark dataset called Monkeypox Skin Lesion Dataset. From the collected data, the pre-processing is accomplished using image resizing and image normalization as well as data augmentation techniques. These pre-processed images undergo the feature extraction that is performed by the Convolutional Block Attention Module (CBAM) approach. The extracted features undergo the final prediction phase using the Modified Restricted Boltzmann Machine (MRBM), where the parameter tuning in RBM is accomplished by the nature inspired optimization algorithm referred to as Equilibrium Optimizer (EO), with the consideration of error minimization as the major objective function. Simulation findings demonstrate that the proposed model performed better than the remaining models at monkeypox prediction. The proposed MRBM-EO for the suggested human monkeypox disease prediction model in terms of RMSE is 75.68%, 70%, 60.87%, and 43.75% better than PSO-SVM, Xception-CBAM-Dense, ShuffleNet, and RBM respectively.
Similarly, the proposed MRBM-EO for the suggested human monkeypox disease prediction model with respect to accuracy is 9.22%, 7.75%, 3.77%, and 10.90% better than PSO-SVM, Xception-CBAM-Dense, ShuffleNet, and RBM respectively.
- Published
- 2024
22. Seizure Detection Based on Lightweight Inverted Residual Attention Network.
- Author
-
Lv, Hongbin, Zhang, Yongfeng, Xiao, Tiantian, Wang, Ziwei, Wang, Shuai, Feng, Hailing, Zhao, Xianxun, and Zhao, Yanna
- Subjects
- *
SEIZURES (Medicine) , *PILOCARPINE , *DIAGNOSIS of epilepsy , *PEOPLE with epilepsy , *ELECTROENCEPHALOGRAPHY - Abstract
Timely and accurate seizure detection is of great importance for the diagnosis and treatment of epilepsy patients. Existing seizure detection models are often complex and time-consuming, highlighting the urgent need for lightweight seizure detection. Additionally, existing methods often neglect the key characteristic channels and spatial regions of electroencephalography (EEG) signals. To solve these issues, we propose a lightweight EEG-based seizure detection model named lightweight inverted residual attention network (LRAN). Specifically, we employ a four-stage inverted residual mobile block (iRMB) to effectively extract the hierarchical features from EEG. The convolutional block attention module (CBAM) is introduced to make the model focus on important feature channels and spatial information, thereby enhancing the discrimination of the learned features. Finally, convolution operations are used to capture local information and spatial relationships between features. We conduct intra-subject and inter-subject experiments on a publicly available dataset. Intra-subject experiments obtain 99.25% accuracy in segment-based detection and a 0.36/h false detection rate (FDR) in event-based detection. Inter-subject experiments obtain 84.32% accuracy. Both sets of experiments maintain high classification accuracy with a low number of parameters, where the multiply accumulate operations (MACs) are 25.86 M and the number of parameters is 0.57 M. [ABSTRACT FROM AUTHOR]
- Published
- 2024
23. SPCN: An Innovative Soybean Pod Counting Network Based on HDC Strategy and Attention Mechanism.
- Author
-
Li, Ximing, Zhuang, Yitao, Li, Jingye, Zhang, Yue, Wang, Zhe, Zhao, Jiangsan, Li, Dazhi, and Gao, Yuefang
- Subjects
PLANT breeding ,FEATURE extraction ,COMPUTER vision ,GENERALIZATION ,COUNTING - Abstract
Soybean pod count is a crucial aspect of soybean plant phenotyping, offering valuable reference information for breeding and planting management. Traditional manual counting methods are not only costly but also prone to errors. Existing detection-based soybean pod counting methods face challenges due to the crowded and uneven distribution of soybean pods on the plants. To tackle this issue, we propose a Soybean Pod Counting Network (SPCN) for accurate soybean pod counting. SPCN is a density map-based architecture that uses a Hybrid Dilated Convolution (HDC) strategy and an attention mechanism for feature extraction, with the Unbalanced Optimal Transport (UOT) loss function supervising density map generation. Additionally, we introduce a new diverse dataset, BeanCount-1500, comprising 24,684 images of 316 soybean varieties with various backgrounds and lighting conditions. Extensive experiments on BeanCount-1500 demonstrate the advantages of SPCN in soybean pod counting, with a Mean Absolute Error (MAE) and a Mean Squared Error (MSE) of 4.37 and 6.45, respectively, outperforming the current competing method by a substantial margin. Its excellent performance on the Renshou2021 dataset further confirms its outstanding generalization potential. Overall, the proposed method can provide technical support for intelligent breeding and planting management of soybean, promoting the digital and precise management of agriculture in general. [ABSTRACT FROM AUTHOR]
- Published
- 2024
24. GA-UNet: A Lightweight Ghost and Attention U-Net for Medical Image Segmentation.
- Author
-
Pang, Bo, Chen, Lianghong, Tao, Qingchuan, Wang, Enhui, and Yu, Yanmei
- Subjects
DIAGNOSTIC imaging equipment ,BRAIN tumor diagnosis ,DIAGNOSTIC imaging ,GLIOMAS ,THREE-dimensional imaging ,DESCRIPTIVE statistics ,ARTIFICIAL neural networks ,DEEP learning ,COMPUTER-aided diagnosis ,DIGITAL image processing ,BRAIN tumors - Abstract
U-Net has demonstrated strong performance in the field of medical image segmentation and has been adapted into various variants to cater to a wide range of applications. However, these variants primarily focus on enhancing the model's feature extraction capabilities, often resulting in increased parameters and floating point operations (Flops). In this paper, we propose GA-UNet (Ghost and Attention U-Net), a lightweight U-Net for medical image segmentation. GA-UNet consists mainly of lightweight GhostV2 bottlenecks that reduce redundant information and Convolutional Block Attention Modules that capture key features. We evaluate our model on four datasets, including CVC-ClinicDB, 2018 Data Science Bowl, ISIC-2018, and BraTS 2018 low-grade gliomas (LGG). Experimental results show that GA-UNet outperforms other state-of-the-art (SOTA) models, achieving an F1-score of 0.934 and a mean Intersection over Union (mIoU) of 0.882 on CVC-ClinicDB, an F1-score of 0.922 and a mIoU of 0.860 on the 2018 Data Science Bowl, an F1-score of 0.896 and a mIoU of 0.825 on ISIC-2018, and an F1-score of 0.896 and a mIoU of 0.853 on BraTS 2018 LGG. Additionally, GA-UNet has fewer parameters (2.18M) and lower Flops (4.45G) than other SOTA models, which further demonstrates the superiority of our model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
25. Human monkeypox disease prediction using novel modified restricted Boltzmann machine-based equilibrium optimizer.
- Author
-
Devarajan, D., Dhana lakshmi, P., Krishnaveni, S., and Senthilkumar, S.
- Subjects
MONKEYPOX ,OPTIMIZATION algorithms ,BOLTZMANN machine ,DEEP learning ,DATA augmentation - Abstract
While the globe continues to struggle to recover from the devastation brought on by the COVID-19 virus's extensive distribution, the recent worrying rise in human monkeypox outbreaks in several nations raises the possibility of a novel worldwide pandemic. The symptoms of human monkeypox resemble those of chickenpox and traditional measles, with a few subtle variations like the various kinds of skin blisters. A range of deep learning techniques have demonstrated encouraging results in image-oriented tumor cell, Covid-19 diagnosis, and skin disease prediction tasks. Hence, it becomes necessary to perform the prediction of the new monkeypox disease using deep learning techniques. In this paper, an image-oriented human monkeypox disease prediction is performed with the help of novel deep learning methodology. Initially, the data is gathered from the standard benchmark dataset called Monkeypox Skin Lesion Dataset. From the collected data, the pre-processing is accomplished using image resizing and image normalization as well as data augmentation techniques. These pre-processed images undergo the feature extraction that is performed by the Convolutional Block Attention Module (CBAM) approach. The extracted features undergo the final prediction phase using the Modified Restricted Boltzmann Machine (MRBM), where the parameter tuning in RBM is accomplished by the nature inspired optimization algorithm referred to as Equilibrium Optimizer (EO), with the consideration of error minimization as the major objective function. Simulation findings demonstrate that the proposed model performed better than the remaining models at monkeypox prediction. The proposed MRBM-EO for the suggested human monkeypox disease prediction model in terms of RMSE is 75.68%, 70%, 60.87%, and 43.75% better than PSO-SVM, Xception-CBAM-Dense, ShuffleNet, and RBM respectively. 
Similarly, the proposed MRBM-EO for the suggested human monkeypox disease prediction model with respect to accuracy is 9.22%, 7.75%, 3.77%, and 10.90% better than PSO-SVM, Xception-CBAM-Dense, ShuffleNet, and RBM respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
26. Gearbox Fault Diagnosis Based on MSCNN-LSTM-CBAM-SE.
- Author
-
He, Chao, Yasenjiang, Jarula, Lv, Luhui, Xu, Lihua, and Lan, Zhigang
- Subjects
- *
FAULT diagnosis , *GEARBOXES , *DIAGNOSIS methods - Abstract
Ensuring the safety of mechanical equipment, gearbox fault diagnosis is crucial for the stable operation of the whole system. However, existing diagnostic methods still have limitations, such as the analysis of single-scale features and insufficient recognition of global temporal dependencies. To address these issues, this article proposes a new method for gearbox fault diagnosis based on MSCNN-LSTM-CBAM-SE. The output of the CBAM-SE module is deeply integrated with the multi-scale features from MSCNN and the temporal features from LSTM, constructing a comprehensive feature representation that provides richer and more precise information for fault diagnosis. The effectiveness of this method has been validated with two sets of gearbox datasets and through ablation studies on this model. Experimental results show that the proposed model achieves excellent performance in terms of accuracy and F1 score, among other metrics. Finally, a comparison with other relevant fault diagnosis methods further verifies the advantages of the proposed model. This research offers a new solution for accurate fault diagnosis of gearboxes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
27. Research on semantic segmentation algorithm of high latitude urban river ice based on deep transfer learning.
- Author
-
Zhao, Wangyuan, Xue, Yanzhuo, Han, Fenglei, Peng, Xiao, Zhao, Yiming, Zhang, Jiawei, Yang, Jianfeng, Lin, Qi, and Wu, Yuliang
- Subjects
- *
ICE on rivers, lakes, etc. , *DEEP learning , *AERIAL photography , *LATITUDE , *FEATURE extraction , *TECHNOLOGY transfer , *PYRAMIDS - Abstract
Automated observation methods for monitoring river ice in high-latitude urban areas are crucial for resource utilization, risk assessment, and navigation. However, current research lacks actual-scale river ice classification, such as low-altitude surveys. This study established a dataset of river ice in the Songhua River near Harbin, Northeast China, using UAV aerial photography and applied the RININet semantic segmentation algorithm for precise classification of different ice types in low-altitude aerial remote sensing images. To address environmental challenges, a feature extraction method integrating channel and spatial attention mechanisms was adopted, along with an improved pyramid pool structure to enhance feature recognition. Additionally, a two-stage transfer learning method established an ice recognition database, overcoming issues like small data volume and high annotation costs. Comparative evaluation metrics demonstrated the high accuracy of the semantic segmentation framework. Furthermore, a method for estimating ice blockage risk was proposed, applicable to various urban river ice management tasks, with practical significance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
28. Underwater Side-Scan Sonar Target Detection: YOLOv7 Model Combined with Attention Mechanism and Scaling Factor.
- Author
-
Wen, Xin, Wang, Jian, Cheng, Chensheng, Zhang, Feihu, and Pan, Guang
- Subjects
- *
SONAR , *ARTIFICIAL neural networks , *SONAR imaging , *OBJECT recognition (Computer vision) , *UNDERWATER exploration - Abstract
Side-scan sonar plays a crucial role in underwater exploration, and the autonomous detection of side-scan sonar images is vital for detecting unknown underwater environments. However, due to the complexity of the underwater environment, the presence of a few highlighted areas on the targets, blurred feature details, and difficulty in collecting data from side-scan sonar, achieving high-precision autonomous target recognition in side-scan sonar images is challenging. This article addresses this problem by improving the You Only Look Once v7 (YOLOv7) model to achieve high-precision object detection in side-scan sonar images. Firstly, given that side-scan sonar images contain large areas of irrelevant information, this paper introduces the Swin-Transformer for dynamic attention and global modeling, which enhances the model's focus on the target regions. Secondly, the Convolutional Block Attention Module (CBAM) is utilized to further improve feature representation and enhance the neural network model's accuracy. Lastly, to address the uncertainty of geometric features in side-scan sonar target features, this paper innovatively incorporates a feature scaling factor into the YOLOv7 model. The experiment initially verified the necessity of attention mechanisms on the public dataset. Subsequently, experiments on our side-scan sonar (SSS) image dataset show that the improved YOLOv7 model reaches 87.9% and 49.23% in average accuracy (mAP@0.5 and mAP@0.5:0.95, respectively). These results are 9.28% and 8.41% higher than those of the YOLOv7 model. The improved YOLOv7 algorithm proposed in this paper has great potential for object detection and the recognition of side-scan sonar images. [ABSTRACT FROM AUTHOR]
- Published
- 2024
29. Hybrid Deep Learning Network with Convolutional Attention for Detecting Epileptic Seizures from EEG Signals
- Author
-
Mekruksavanich, Sakorn, Jitpattanakul, Anuchit, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, and Arai, Kohei, editor
- Published
- 2024
30. Improved Multi-modal Image Fusion with Attention and Dense Networks: Visual and Quantitative Evaluation
- Author
-
Banerjee, Ankan, Patra, Dipti, Roy, Pradipta, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Kaur, Harkeerat, editor, Jakhetiya, Vinit, editor, Goyal, Puneet, editor, Khanna, Pritee, editor, Raman, Balasubramanian, editor, and Kumar, Sanjeev, editor
- Published
- 2024
31. Efficient Video Deblurring Guided by Motion Magnitude and Convolutional Block Attention Module
- Author
-
Zhang, Yiying, Zhang, Menghui, Zang, Yuxing, Zhu, Shuang, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Tan, Kay Chen, Series Editor, Wang, Wei, editor, Mu, Jiasong, editor, Liu, Xin, editor, and Na, Zhenyu Na, editor
- Published
- 2024
32. Covid-19 Detection Based on Chest X-ray Images Using Attention Mechanism Modules and Weight Uncertainty in Bayesian Neural Networks
- Author
-
Chen, Huan, Hsieh, Jia‐You, Hsu, Hsin-Yao, Chang, Yi-Feng, Xhafa, Fatos, Series Editor, Souri, Alireza, editor, and Bendak, Salaheddine, editor
- Published
- 2024
33. Improved YOLOv7 Small Object Detection Algorithm for Seaside Aerial Images
- Author
-
Yu, Miao, Jia, YinShan, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Lu, Huimin, editor, and Cai, Jintong, editor
- Published
- 2024
34. Spectrum resource allocation for high-throughput satellite communications based on behavior cloning
- Author
-
QIN Hao, LI Shuangyi, ZHAO Di, MENG Haowei, and SONG Bin
- Subjects
high-throughput satellite ,behavior cloning ,deep reinforcement learning ,proximal policy optimization ,convolutional block attention module ,Telecommunication ,TK5101-6720 - Abstract
In high-throughput multi-beam satellite systems, the dimensionality of the spectrum resource allocation problem increased drastically with the number of satellite beams and service users, which caused an exponential rise in the complexity of the solution. To address the challenge, a two-stage algorithm that combined behavior cloning (BC) with deep reinforcement learning (DRL) was proposed. In the first stage, the strategy network was pretrained using existing decision data from satellite operation through behavior cloning, which mimicked expert behavior to reduce blind exploration and accelerate algorithm convergence. In the second stage, the strategy network was further optimized using the proximal policy optimization (PPO), and a convolutional block attention module (CBAM) was employed to better extract the user traffic features, thereby enhancing overall algorithm performance. Simulation results demonstrate that the proposed algorithm outperforms the benchmark algorithms in terms of convergence speed and algorithm stability, and also delivers superior performance in system delay, average system satisfaction, and spectrum efficiency.
- Published
- 2024
35. A novel dense-net deep neural network with enhanced feature selection method for classification of different stages of tuberculosis using chest X-ray images
- Author
-
Patankar, Mamta, Chaurasia, Vijayshri, and Shandilya, Madhu
- Published
- 2024
36. Septic Arthritis Modeling Using Sonographic Fusion with Attention and Selective Transformation: a Preliminary Study
- Author
-
Lo, Chung-Ming and Lai, Kuo-Lung
- Published
- 2024
37. Triple Attention Mechanism with YOLOv5s for Fish Detection.
- Author
-
Long, Wei, Wang, Yawen, Hu, Lingxi, Zhang, Jintao, Zhang, Chen, Jiang, Linhua, and Xu, Lihong
- Subjects
- *
DEEP learning , *FISH farming , *FEATURE extraction , *TRADITIONAL farming , *MOVING average process , *POLLUTION - Abstract
Traditional fish farming methods suffer from backward production, low efficiency, low yield, and environmental pollution. As a result of thorough research using deep learning technology, the industrial aquaculture model has gradually matured. A variety of complex factors make it difficult to extract effective features, which results in suboptimal model performance. This paper proposes a fish detection method that combines a triple attention mechanism with a You Only Look Once (TAM-YOLO) model. In order to enhance the speed of model training, the process of data encapsulation incorporates positive sample matching. An exponential moving average (EMA) is incorporated into the training process to make the model more robust, and coordinate attention (CA) and a convolutional block attention module are integrated into the YOLOv5s backbone to enhance the feature extraction of channels and spatial locations. The extracted feature maps are input to the PANet path aggregation network, and the underlying information is stacked with the feature maps. The method improves the detection accuracy of underwater blurred and distorted fish images. Experimental results show that the proposed TAM-YOLO model outperforms YOLOv3, YOLOv4, YOLOv5s, YOLOv5m, and SSD, with a mAP value of 95.88%, thus providing a new strategy for fish detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
38. Clothing classification method based on attention mechanism and transfer learning.
- Author
-
陈金广, 黄晓菊, and 马丽丽
- Abstract
To address the low efficiency and low accuracy of clothing image classification, a clothing image classification method based on an attention mechanism and transfer learning was proposed. The pre-trained ResNet50 network model was used for transfer learning on the clothing dataset to reduce the dependence on the dataset and the network training time. The image dataset was augmented with geometric transforms and color jitter to improve the generalization ability of the model. A convolutional block attention module (CBAM) was added to the ResNet50-based network, and attention to different regions of clothing was improved along the channel and spatial dimensions in turn, which enhanced the feature expression capability. Validation was performed on two datasets, CD and IDFashion, with different background interference. Experimental results show that the proposed model can extract more clothing feature information, and the average classification accuracy on the IDFashion dataset is 95.60%, which is higher than that of the ResNet50, ResNet50+STN, and ResNet50+ECA models by 6.65%, 6.69%, and 6.62%, respectively, improving the accuracy and efficiency of clothing image classification to some extent. [ABSTRACT FROM AUTHOR]
- Published
- 2024
39. Anterior mediastinal nodular lesion segmentation from chest computed tomography imaging using UNet based neural network with attention mechanisms.
- Author
-
Wang, Yi, Jeong, Won Gi, Zhang, Hao, Choi, Younhee, Jin, Gong Yong, and Ko, Seok-Bum
- Subjects
COMPUTER-aided diagnosis ,COMPUTED tomography ,ATTENTION ,SENSITIVITY & specificity (Statistics) ,THYMOMA ,IMAGE segmentation - Abstract
Automated detection of anterior mediastinal nodular lesions (AMLs) has significance for clinical usage as it is challenging for radiologists to accurately identify AMLs from chest computed tomography (CT) imaging due to various factors, including poor resolution, variations in intensity and the similarity of the AMLs to other tissues. To assist radiologists in AML detection from chest CT imaging, a UNet-based computer-aided detection (CADe) system is proposed to segment AMLs from slice images of the chest CT scans. The proposed network adopts a modified UNet architecture. To guide the proposed network to selectively focus on AMLs and potentially disregard others in the image, different attention mechanisms are utilized in the proposed network, including the self-attention mechanism and the convolutional block attention module (CBAM). The proposed network was trained and evaluated on 180 chest CT scans which consist of 180 AMLs. 90 AMLs were identified as thymic cysts, and 90 AMLs were diagnosed as thymoma. The proposed network achieved an average dice similarity coefficient (DSC) of 93.23 with 5-fold cross-validation, for which the mean Intersection over Union (IoU), sensitivity and specificity were 90.29, 93.98 and 95.68 respectively. Our method demonstrated an improved segmentation performance over state-of-the-art segmentation networks, including UNet, ResUNet, TransUNet and UNet++. The proposed network employing attention mechanisms exhibited a promising result for segmenting AMLs from chest CT imaging and could be used to automate the AML detection process for achieving improved diagnostic reliability. [ABSTRACT FROM AUTHOR]
- Published
- 2024
40. PL-DINO: An Improved Transformer-Based Method for Plant Leaf Disease Detection.
- Author
-
Li, Wei, Zhu, Lizhou, and Liu, Jun
- Subjects
PLANT diseases ,OBJECT recognition (Computer vision) ,ORGANIC farming ,FOLIAGE plants ,CROPS - Abstract
Agriculture is important for ecology. The early detection and treatment of agricultural crop diseases are meaningful and challenging tasks in agriculture. Currently, the identification of plant diseases relies on manual detection, which has the disadvantages of long operation time and low efficiency, ultimately impacting the crop yield and quality. To overcome these disadvantages, we propose a new object detection method named "Plant Leaf Detection transformer with Improved deNoising anchOr boxes (PL-DINO)". This method incorporates a Convolutional Block Attention Module (CBAM) into the ResNet50 backbone network. With the assistance of the CBAM block, the representative features can be effectively extracted from leaf images. Next, an EQualization Loss (EQL) is employed to address the problem of class imbalance in the relevant datasets. The proposed PL-DINO is evaluated using the publicly available PlantDoc dataset. Experimental results demonstrate the superiority of PL-DINO over the related advanced approaches. Specifically, PL-DINO achieves a mean average precision of 70.3%, surpassing conventional object detection algorithms such as Faster R-CNN and YOLOv7 for leaf disease detection in natural environments. In brief, PL-DINO offers a practical technology for smart agriculture and ecological monitoring. [ABSTRACT FROM AUTHOR]
- Published
- 2024
41. YOLOvT: CSPNet-based attention for a lightweight textile defect detection model.
- Author
-
Hu, Xiaohan, Dai, Ning, Hu, Xudong, and Yuan, Yanhong
- Subjects
TRANSFORMER models ,CONVOLUTIONAL neural networks ,CLASSIFICATION algorithms - Abstract
Fabric inspection is a crucial process in the textile industry's quality control. Due to the varying structures, textures, geometric features, and spatial distributions of fabric defects, manual fabric inspection is costly and inefficient. Existing fabric defect detection algorithms struggle to strike a balance among efficiency, accuracy, applicability, and deployment costs. In this model, an efficient lightweight fabric defect detection and classification algorithm based on deep convolutional neural networks is proposed. First, the algorithm performs cluster analysis on the fabric defect dataset to ensure that prior boxes better recall objects with fabric defect geometries and spatial characteristics. Next is fusing the convolutional block attention module attention mechanism and Swin Transformer module with the CSPNet structure. This fusion enhances the model's focus on local features and its ability to capture global contextual information without sacrificing the model's inference speed. Moreover, WIoU or Wise-IoU is used as the bounding box loss function of the model, which improves the convergence speed of the bounding box loss and enhances the positioning ability of the model. Finally, the performance of the improved model was validated on a public dataset, showing varying degrees of improvement compared to the baseline model and other state-of-the-art algorithms, meeting the requirements of modern textile processes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
42. Unsupervised image-to-image translation with multiscale attention generative adversarial network.
- Author
-
Wang, Fasheng, Zhang, Qing, Zhao, Qianyi, Wang, Mengyin, and Sun, Fuming
- Subjects
GENERATIVE adversarial networks ,SPINE ,IMAGE registration - Abstract
Unsupervised image-to-image translation refers to translating images from the source domain to the target domain, assuring that the translated images have the style of the target domain while retaining the content of the source domain. Although existing image-to-image translation methods can map an image from the source domain to the target domain, the translation results are prone to visual artifacts, and the texture and shape of the input image cannot match the target domain well. The reason for this phenomenon is that the generator ignores the most differential information between the source and target domains, preventing the extraction of the rich image feature information. In this paper, we propose a multiscale attention-generative adversarial network (MSA-GAN) for unsupervised image-to-image translation. In MSA-GAN, we design a multiscale attention network (MSANet) as the backbone of the generator, which consists of the Res2Net block and convolutional block attention module (CBAM). MSANet can extract global and local features and effectively alleviate the detail missing and blurry problems in image translation. It also focuses on the important image features and improves the ability of the network to extract features from the most distinguishing regions between the source and target domains, which allows it to better translate the texture details and object shape. In addition, to generate high-quality images, we introduce the perceptual loss to constrain high-level feature information. Extensive experimental results show that the proposed MSA-GAN achieves competitive performance in image-to-image translation. Our model outperforms several advanced models on several public benchmark datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
43. GLAD: Advanced Attention Mechanism-Based Model for Grape Leaf Disease Detection.
- Author
-
Thatha, Venkata Nagaraju, Kumari, Polukonda Mary Kamala, Sirisha, Uddagiri, Manoj, Valisetty Venkata Ram, and Praveen, Surapaneni Phani
- Subjects
CONVOLUTIONAL neural networks ,DATA augmentation ,AGRICULTURAL industries - Abstract
Diseases affecting grape leaves can have a wide variety of symptoms and a complicated history in vineyards, making detection and diagnosis a formidable issue. The complexity of these problems is frequently too much for existing detection algorithms to handle. Hence, a new method called GLAD (Grape Leaf Disease Detection) was developed. GLAD makes use of the PLANT-VILLAGE dataset, which has been hand-picked to detect grape diseases. We added the self-attention mechanism to make it more effective, and it now can collect data on grape leaf illnesses all over the world better. Adaptively spatial feature fusion (ASFF) technology and BiFPN feature fusion network provide more robust models and improve grape leaf disease fusion by reducing complex background interference. The Shuffle Attention approach is also used to make identifying diseases in grape leaves easier. The dataset is enriched using data augmentation methodologies and transfer learning to identify diseases affecting grape leaves. As part of this process, the model's parameters are adjusted using data from other plant disease datasets. Despite several obstacles, the experimental findings show that the suggested model is intelligent enough to identify grape leaf disorders. Its real-time target detection capabilities are on full display when it outperforms state-of-the-art methods. A powerful and effective tool for the agricultural sector, GLAD is a major step forward in solving the problems associated with grape leaf disease identification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. Fall detection on embedded platform using infrared array sensor for healthcare applications.
- Author
-
Jiang, Yan, Gong, Tianyi, He, Lingfeng, Yan, Shicheng, Wu, Xiaoping, and Liu, Jianyang
- Subjects
- *
SENSOR arrays , *DEEP learning , *INFRARED imaging , *INFRARED cameras , *INFRARED equipment , *VISIBLE spectra , *IMAGE sensors - Abstract
Previous vision-based research has predominantly used common visible light cameras as sensors for detecting falls in home environments. While some studies have explored the use of infrared cameras for this purpose, personal privacy protection and computational capability on an embedded platform remain crucial concerns. To address these challenges and achieve accurate human fall detection on an embedded platform, we propose a new lightweight human fall detection method based on a deep learning network. In the first stage, we designed an image acquisition device based on an infrared array sensor to collect an infrared human fall dataset (https://github.com/Flier-01/Deeplearning-based-Fall-Detection-Using-Infrared-Array-Dataset). This dataset consists of 10240 images, including 5216 pictures of falls, 4024 pictures of non-fall walking, and 1000 pictures of other poses. Furthermore, we have included an additional set of 10 videos specifically for testing purposes. These images were captured within living environments with varying ambient temperatures. To address challenges associated with infrared images, such as excessive noise and low definition, we adopted the RetinexNet algorithm to preprocess the collected images. This pre-processing step significantly improves the quality of the infrared images, enabling more accurate analysis and detection. Subsequently, we developed a modified YOLOv5 network that incorporates a comprehensive enhancement strategy by integrating the CBAM and TPH modules. These modules enhance the network's ability to capture and extract features relevant to fall detection. Furthermore, to optimize the network's performance, we employed the GhostNet architecture and deployed the resulting model on the Huawei Atlas embedded platform. Through video testing, our fall detection system achieved a real-time detection frame rate of 38.61 FPS, surpassing the performance of the original YOLOv5-based fall detector, which attained a frame rate of 34.78 FPS.
Notably, our proposed method demonstrated remarkable performance in terms of fall detection accuracy. The average accuracy of our fall detector reached an impressive 96.52%, outperforming the original YOLOv5 fall detector, which achieved an average accuracy of 88.46%. These experimental results affirm the superiority of our approach, exhibiting improved fall detection accuracy and real-time performance compared to the original YOLOv5 algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
45. A New Method to Detect Buffalo Mastitis Using Udder Ultrasonography Based on Deep Learning Network.
- Author
-
Zhang, Xinxin, Li, Yuan, Zhang, Yiping, Yao, Zhiqiu, Zou, Wenna, Nie, Pei, and Yang, Liguo
- Subjects
- *
DEEP learning , *ARTIFICIAL neural networks , *MASTITIS , *RECEIVER operating characteristic curves , *LIE detectors & detection , *EARLY diagnosis - Abstract
Simple Summary: In this study, deep learning combined with udder ultrasonography of buffalo was used for the detection of mastitis for the first time, with the aim of establishing an accurate, rapid, and inexpensive method to detect buffalo mastitis instead of routine laboratory examination. This method provides a basis for mastitis detection in buffaloes mostly raised by small farmers and has the opportunity to be used in a variety of dairy animals in the future. Mastitis is one of the most predominant diseases with a negative impact on ranch products worldwide. It reduces milk production, damages milk quality, increases treatment costs, and even leads to the premature elimination of animals. In addition, failure to take effective measures in time will lead to widespread disease. The key to reducing the losses caused by mastitis lies in the early detection of the disease. The application of deep learning with powerful feature extraction capability in the medical field is receiving increasing attention. The main purpose of this study was to establish a deep learning network for buffalo quarter-level mastitis detection based on 3054 ultrasound images of udders from 271 buffaloes. Two data sets were generated with thresholds of somatic cell count (SCC) set as 2 × 10^5 cells/mL and 4 × 10^5 cells/mL, respectively. The udders with SCCs less than the threshold value were defined as healthy udders, and otherwise as mastitis-stricken udders. A total of 3054 udder ultrasound images were randomly divided into a training set (70%), a validation set (15%), and a test set (15%). We used the EfficientNet_b3 model with powerful learning capabilities in combination with the convolutional block attention module (CBAM) to train the mastitis detection model. To solve the problem of sample category imbalance, the PolyLoss module was used as the loss function.
The training set and validation set were used to develop the mastitis detection model, and the test set was used to evaluate the network's performance. The results showed that, when the SCC threshold was 2 × 10^5 cells/mL, our established network exhibited an accuracy of 70.02%, a specificity of 77.93%, a sensitivity of 63.11%, and an area under the receiver operating characteristics curve (AUC) of 0.77 on the test set. The classification effect of the model was better when the SCC threshold was 4 × 10^5 cells/mL than when the SCC threshold was 2 × 10^5 cells/mL. Therefore, when SCC ≥ 4 × 10^5 cells/mL was defined as mastitis, our established deep neural network was determined as the most suitable model for farm on-site mastitis detection, and this network model exhibited an accuracy of 75.93%, a specificity of 80.23%, a sensitivity of 70.35%, and AUC 0.83 on the test set. This study established a quarter-level mastitis detection model which provides a theoretical basis for mastitis detection in buffaloes mostly raised by small farmers lacking mastitis diagnostic conditions in developing countries. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
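Note on the metrics in result 45: the abstract labels a quarter as mastitis when its SCC is at or above the threshold and reports accuracy, sensitivity, and specificity. The sketch below shows how those quantities are conventionally computed from binary labels. The function names, SCC values, and predictions are hypothetical illustrations, not the authors' code or data.

```python
def label_by_scc(scc_values, threshold=4e5):
    """Return 1 (mastitis) when SCC >= threshold in cells/mL, else 0 (healthy)."""
    return [1 if scc >= threshold else 0 for scc in scc_values]


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on positives), specificity (recall on negatives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Hypothetical SCC measurements (cells/mL) and hypothetical model predictions.
scc = [1.0e5, 3.0e5, 4.0e5, 9.0e5, 2.5e5, 6.0e5]
y_true = label_by_scc(scc, threshold=4e5)
y_pred = [0, 1, 1, 1, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Lowering `threshold` to `2e5` relabels borderline quarters as mastitis, which is why the two thresholds in the abstract yield two different data sets.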
46. Twin-stage Unet-like network for single image deraining.
- Author
-
Zhou, Weina and Wang, Xiu
- Abstract
The performance of visual processing is commonly constrained in extreme outside weather such as heavy rain. Rain streaks may substantially damage image optical quality and impact image processing in many scenarios. Thus, it has practical application value in researching the problem of single image rain removal. However, removing rain streaks from a single image is a challenging task. Although end-to-end learning approaches based on convolutional neural networks have lately made significant progress on this task, most existing methods still cannot perform deraining well. They fail to process the details of the background layer, resulting in the loss of certain information. To address this issue, we propose a single image deraining network named twin-stage Unet-like network (TUNet). Specifically, a reconstitution residual block (RRB) is presented as the basic structure of encoder–decoder to obtain more spatial contextual information for extracting rain components. Then, a residual sampling module (RSM) is introduced to perform downsampling and upsampling operations to preserve residual properties in the structure while obtaining deeper image features. Finally, the convolutional block attention module (CBAM) is adopted to fuse shallow and deep features of the same size in the model. Extensive experiments on five publicly synthetic datasets and a real-world dataset demonstrate that our proposed TUNet model outperforms the state-of-the-art deraining approaches. The average PSNR value of TUNet is 0.41 dB higher than the state-of-the-art method (OSAM-Net) on synthetic datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
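Note on the PSNR figure in result 46: the abstract compares methods by average PSNR in dB. A minimal sketch of the standard definition, PSNR = 10·log10(MAX² / MSE), follows. It assumes 8-bit grayscale images stored as nested lists; the function name and the tiny sample images are hypothetical, and the paper's exact evaluation protocol (color space, cropping) is not stated in the abstract.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    ref = [v for row in reference for v in row]
    out = [v for row in restored for v in row]
    if len(ref) != len(out):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(ref, out)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 clean image and a derained estimate that is off by one gray level.
clean = [[10, 20], [30, 40]]
derained = [[11, 21], [31, 41]]
print(round(psnr(clean, derained), 2))
```

Because PSNR is logarithmic, a gain of 0.41 dB corresponds to roughly a 9% reduction in mean squared error at the same peak value.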
47. CAEM-GBDT: a cancer subtype identifying method using multi-omics data and convolutional autoencoder network
- Author
-
Jiquan Shen, Xuanhui Guo, Hanwen Bai, and Junwei Luo
- Subjects
cancer subtype ,cancer subtype identification ,convolutional autoencoder ,convolutional block attention module ,multi-omics ,Computer applications to medicine. Medical informatics ,R858-859.7 - Abstract
The identification of cancer subtypes plays a very important role in the field of medicine. Accurate identification of cancer subtypes is helpful for both cancer treatment and prognosis. Currently, most methods for cancer subtype identification are based on single-omics data, such as gene expression data. However, multi-omics data can show various characteristics about cancer, which also can improve the accuracy of cancer subtype identification. Therefore, how to extract features from multi-omics data for cancer subtype identification is the main challenge currently faced by researchers. In this paper, we propose a cancer subtype identification method named CAEM-GBDT, which takes gene expression data, miRNA expression data, and DNA methylation data as input, and adopts convolutional autoencoder network to identify cancer subtypes. Through a convolutional encoder layer, the method performs feature extraction on the input data. Within the convolutional encoder layer, a convolutional self-attention module is embedded to recognize higher-level representations of the multi-omics data. The extracted high-level representations from the convolutional encoder are then concatenated with the input to the decoder. The GBDT (Gradient Boosting Decision Tree) is utilized for cancer subtype identification. In the experiments, we compare CAEM-GBDT with existing cancer subtype identifying methods. Experimental results demonstrate that the proposed CAEM-GBDT outperforms other methods. The source code is available from GitHub at https://github.com/gxh-1/CAEM-GBDT.git.
- Published
- 2024
- Full Text
- View/download PDF
48. Plant leaf disease recognition based on improved SinGAN and improved ResNet34
- Author
-
Jiaojiao Chen, Haiyang Hu, and Jianping Yang
- Subjects
plant leaf disease identification ,SinGAN ,autoencoder ,convolutional block attention module ,ResNet34 ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
The identification of plant leaf diseases is crucial in precision agriculture, playing a pivotal role in advancing the modernization of agriculture. Timely detection and diagnosis of leaf diseases for preventive measures significantly contribute to enhancing both the quantity and quality of agricultural products, thereby fostering the in-depth development of precision agriculture. However, despite the rapid development of research on plant leaf disease identification, it still faces challenges such as insufficient agricultural datasets and the problem of deep learning-based disease identification models having numerous training parameters and insufficient accuracy. This paper proposes a plant leaf disease identification method based on improved SinGAN and improved ResNet34 to address the aforementioned issues. Firstly, an improved SinGAN called Reconstruction-Based Single Image Generation Network (ReSinGN) is proposed for image enhancement. This network accelerates model training speed by using an autoencoder to replace the GAN in the SinGAN and incorporates a Convolutional Block Attention Module (CBAM) into the autoencoder to more accurately capture important features and structural information in the images. Random pixel shuffling is introduced in ReSinGN to enable the model to learn richer data representations, further enhancing the quality of generated images. Secondly, an improved ResNet34 is proposed for plant leaf disease identification. This involves adding CBAM modules to the ResNet34 to alleviate the limitations of parameter sharing, replacing the ReLU activation function with the LeakyReLU activation function to address the problem of neuron death, and utilizing transfer learning-based training methods to accelerate network training speed. This paper takes tomato leaf diseases as the experimental subject, and the experimental results demonstrate that: (1) ReSinGN generates high-quality images at least 44.6 times faster in training speed compared to SinGAN.
(2) The Tenengrad score of images generated by the ReSinGN model is 67.3, which is improved by 30.2 compared to the SinGAN, resulting in clearer images. (3) ReSinGN model with random pixel Shuffling outperforms SinGAN in both image clarity and distortion, achieving the optimal balance between image clarity and distortion. (4) The improved ResNet34 achieved an average recognition accuracy, recognition precision, recognition accuracy (redundant as it’s similar to precision), recall, and F1 score of 98.57, 96.57, 98.68, 97.7, and 98.17%, respectively, for tomato leaf disease identification. Compared to the original ResNet34, this represents enhancements of 3.65, 4.66, 0.88, 4.1, and 2.47%, respectively.
- Published
- 2024
- Full Text
- View/download PDF
49. Research on distributed network intrusion detection system for IoT based on honeyfarm
- Author
-
Hao WU, Jiajia HAO, and Yunlong LU
- Subjects
NIDS ,federated learning ,honeyfarm ,convolutional block attention module ,IoT ,Telecommunication ,TK5101-6720 - Abstract
To solve the problems that the network intrusion detection system in the Internet of things cannot identify new attacks and has limited flexibility, a network intrusion detection system based on honeyfarm was proposed, which could effectively identify abnormal traffic and have continuous learning ability. Firstly, considering the characteristics of the convolutional block attention module, an abnormal traffic detection model was developed, focusing on both channel and spatial dimensions, to enhance the model’s recognition abilities. Secondly, a model training scheme utilizing federated learning was employed to enhance the model’s generalization capabilities. Finally, the abnormal traffic detection model at the edge nodes was continuously updated and iterated based on the honeyfarm, so as to improve the system’s accuracy in recognizing new attack traffic. The experimental results demonstrate that the proposed system not only effectively detects abnormal behavior in network traffic, but also continually enhances performance in detecting abnormal traffic.
- Published
- 2024
- Full Text
- View/download PDF
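Note on the search term: several abstracts in this listing describe CBAM as weighting features along the channel and spatial dimensions. The sketch below illustrates that two-step idea in plain Python, following the commonly published formulation (channel attention from average- and max-pooled descriptors passed through a shared MLP, then spatial attention from a convolution over channel-wise average and max maps). It is a simplified illustration with no biases and no training, and all names and weights are hypothetical. It is not the implementation used in any of the listed papers.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat, w1, w2):
    """feat: C channels, each H x W. w1 (hidden x C) and w2 (C x hidden) form a shared MLP."""
    def mlp(v):
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w1]  # ReLU
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feat]  # global average pool
    mx = [max(map(max, ch)) for ch in feat]                            # global max pool
    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]


def spatial_attention(feat, kernel):
    """kernel: 2 x k x k weights for the (average, max) maps, k odd, zero padding."""
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])
    avg = [[sum(feat[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    mx = [[max(feat[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    k = len(kernel[0])
    r = k // 2
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            s = 0.0
            for m, plane in enumerate((avg, mx)):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - r, j + dj - r
                        if 0 <= y < H and 0 <= x < W:
                            s += kernel[m][di][dj] * plane[y][x]
            out[i][j] = sigmoid(s)
    return out


def cbam(feat, w1, w2, kernel):
    """Apply channel attention first, then spatial attention; output shape equals input shape."""
    ca = channel_attention(feat, w1, w2)
    refined = [[[v * ca[c] for v in row] for row in ch] for c, ch in enumerate(feat)]
    sa = spatial_attention(refined, kernel)
    return [[[v * sa[i][j] for j, v in enumerate(row)] for i, row in enumerate(ch)]
            for ch in refined]


# Hypothetical 2-channel 2x2 feature map; all-zero weights give attention 0.5 everywhere.
feat = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
w1 = [[0.0, 0.0]]          # hidden size 1
w2 = [[0.0], [0.0]]
kernel = [[[0.0] * 3 for _ in range(3)] for _ in range(2)]
out = cbam(feat, w1, w2, kernel)
print(out)
```

In practice the weights are learned and the module is written with a tensor library; the loops here only make the order of operations explicit.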
50. Automatic Blood Cell Detection Based on Advanced YOLOv5s Network
- Author
-
Yinggang He
- Subjects
Blood cell detection ,YOLOv5s ,BiFPN ,convolutional block attention module ,transformer ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
There is a great demand for automatic detection and classification of blood cells (BCs) in clinical medical diagnoses. Traditional methods, such as hematology analyzers and manual counting, are laborious, time intensive, and limited by analysts’ professional experience and knowledge. In this paper, the one-stage network based upon improved YOLOv5s is provided to detect BCs. First, the Transformer and bidirectional feature pyramid network (BiFPN) are introduced into the backbone network and neck network for refining the adaptive features, respectively. Second, Convolutional Block Attention Module (CBAM) is added to neck network outputs to strengthen the key features in space and channel. In addition, an Efficient Intersection over Union (EIoU) was introduced to improve model accuracy regarding localization and performance. The improvements are embedded into the YOLOv5s model and termed YOLOv5s-TRBC. The experiments on the blood cell dataset (BCCD) show that in the three types of BCs detections, the mean average precision (mAP) of the method proposed reached 93.5%. Furthermore, comparative experiments demonstrate that the model could perform favorably against the counterparts with respect to mAP rate, and the model’s Giga Floating-point Operations Per Second (GFLOPs) is reduced to 1/6 of YOLOv5, which provides a potential solution for future computer-aided diagnostic systems.
- Published
- 2024
- Full Text
- View/download PDF