6,041 results for "U-Net"
Search Results
2. Retinex decomposition based low‐light image enhancement by integrating Swin transformer and U‐Net‐like architecture.
- Author
- Wang, Zexin, Qingge, Letu, Pan, Qingyi, and Yang, Pei
- Subjects
- TRANSFORMER models, IMAGE intensifiers, VISUAL perception, REFLECTANCE, TEST methods
- Abstract
Low‐light images are captured in environments with minimal lighting, such as nighttime or underwater conditions. These images often suffer from issues like low brightness, poor contrast, lack of detail, and overall darkness, significantly impairing human visual perception and subsequent high‐level visual tasks. Enhancing low‐light images therefore holds great practical significance. Among the various existing methods for Low‐Light Image Enhancement (LLIE), those based on the Retinex theory have gained significant attention. However, despite considerable efforts in prior research, the challenge of Retinex decomposition remains unresolved. In this study, an LLIE network based on the Retinex theory is proposed, which addresses these challenges by integrating attention mechanisms and a U‐Net‐like architecture. The proposed model comprises three modules: the Decomposition module (DECM), the Reflectance Recovery module (REFM), and the Illumination Enhancement module (ILEM). Its objective is to decompose low‐light images based on the Retinex theory and enhance the decomposed reflectance and illumination maps using attention mechanisms and a U‐Net‐like architecture. We conducted extensive experiments on several widely used public datasets. The qualitative results demonstrate that the approach produces enhanced images with superior visual quality compared to the existing methods on all test datasets, especially for some extremely dark images. Furthermore, the quantitative evaluation results based on the metrics PSNR, SSIM, LPIPS, BRISQUE, and MUSIQ show that the proposed model achieves superior performance, with PSNR and BRISQUE significantly outperforming the baseline approaches: the (PSNR, mean BRISQUE) values of the proposed method and the second‐best results are (17.14, 17.72) and (16.44, 19.65), respectively. Additionally, further experimental results, such as ablation studies, indicate the effectiveness of the proposed model. [ABSTRACT FROM AUTHOR]
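The Retinex model underlying this abstract can be sketched in a few lines. This is a minimal illustration of the decomposition I = R × L, not the authors' DECM/REFM/ILEM network; estimating the illumination map with a Gaussian blur is a classic single-scale Retinex heuristic, and the blur and its sigma are our assumptions.

```python
import numpy as np

# Sketch of the Retinex image model (not the paper's network): an image I
# is the element-wise product of a reflectance map R and an illumination
# map L, i.e. I = R * L. L is approximated here by a Gaussian blur.

def gaussian_blur(img, sigma):
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    # separable blur: filter columns, then rows
    out = np.apply_along_axis(np.convolve, 0, img, kernel, mode="same")
    out = np.apply_along_axis(np.convolve, 1, out, kernel, mode="same")
    return out

def retinex_decompose(img, sigma=3, eps=1e-6):
    illumination = gaussian_blur(img, sigma)
    reflectance = img / (illumination + eps)  # R = I / L
    return reflectance, illumination
```

Reconstructing the input as R × L is exact by construction, which is why enhancement methods operate on the two maps separately before recombining them.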
- Published
- 2024
- Full Text
- View/download PDF
3. LV-YOLO: logistic vehicle speed detection and counting using deep learning based YOLO network.
- Author
- Rani, N. Gopika, Priya, N. Hema, Ahilan, A., and Muthukumaran, N.
- Abstract
In the era of smart cities and advancing transportation technologies, predicting logistic vehicles and their speed is pivotal to enhancing traffic management, safety, and overall transportation efficiency. Accurate prediction serves the interests of both road users and traffic authorities, yet accurately predicting the speed of a logistics vehicle within a single trip is a difficult task, and unpredicted accidents can increase fatalities. To overcome these issues, a novel Logistic Vehicle speed detection method using YOLO (LV-YOLO) has been introduced to detect logistics vehicles and their speed. The proposed framework is divided into three layers: an image acquisition layer, a segmentation layer, and a detection layer. In the image acquisition layer, a CCTV camera captures highway traffic video, and the collected video is converted into frames. In the segmentation layer, each video frame is segmented using U-Net, which isolates the vehicles in the frame. The detection layer performs truck detection and speed detection using LV-YOLO on the segmented frames based on the Boxy Vehicle dataset. The simulated results show that the LV-YOLO technique maintains an excellent mAP of 99.42%. LV-YOLO improves the overall mAP by 1.72%, 5.42%, and 0.82% over the Simple Vehicle Counting System, Real-Time Detection, and Advance YOLOv3 Model for vehicle detection, and by 4.81% and 2.63% over the Deep Learning and CAN protocol and 1D-CNN speed estimation models for speed prediction, respectively. [ABSTRACT FROM AUTHOR]
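The speed estimate behind a pipeline like this reduces to simple arithmetic on detections in successive frames. The sketch below is our illustration of that arithmetic only; the calibration factor and frame rate are hypothetical, not values from the paper.

```python
# Speed from two detections of the same vehicle in consecutive frames:
# centroid displacement in pixels, scaled to meters by a (hypothetical)
# camera calibration factor, divided by the frame interval.

def speed_kmh(centroid_prev, centroid_curr, meters_per_pixel, fps):
    dx = centroid_curr[0] - centroid_prev[0]
    dy = centroid_curr[1] - centroid_prev[1]
    pixels_moved = (dx ** 2 + dy ** 2) ** 0.5
    meters_per_second = pixels_moved * meters_per_pixel * fps
    return meters_per_second * 3.6  # m/s -> km/h
```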
- Published
- 2024
- Full Text
- View/download PDF
4. DUFormer: dual-channel image splicing detection based on anchor-shaped U-Net and stepwise transformer for power systems.
- Author
- Tian, Xiuxia, Zhao, Jianren, and Wen, Longfang
- Abstract
The safe operation of intelligent power systems relies on the authenticity and integrity of image data. However, splicing-based image tampering, a common form of image forgery, poses severe challenges to the security monitoring of power systems. Addressing the limitations of traditional image splicing detection techniques in power system applications, this paper introduces DUFormer, a dual-channel image splicing detection model that combines an anchor-shaped U-Net and a stepwise Transformer. The model explores image features through the stepwise Transformer and precisely locates small-sized tampered areas using the anchor-shaped U-Net, enhancing the recognition capability for tampering of various scales. Tests on the substation splicing forgery dataset (SSFD), which contains 1192 tampered images of power systems, show that DUFormer achieved a 32.76% improvement in intersection over union, a 29.77% improvement in F1 score, and a reduction in mean absolute error of 0.05 relative to the second-best performing model. Additionally, evaluations on multiple public datasets confirm that DUFormer surpasses existing detection technologies on various performance metrics, especially exhibiting outstanding performance at the level of detail. This paper also examines the model's robustness against JPEG compression to ensure its effectiveness in real-world applications. This research not only improves the pixel-level detection accuracy of power image splicing but also lays a solid foundation for the development of future security monitoring technologies for intelligent power systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. Tsnet: a two-stage network for image dehazing with multi-scale fusion and adaptive learning.
- Author
- Gong, Xiaolin, Zheng, Zehan, and Du, Heyuan
- Abstract
Image dehazing has been a popular topic of research for a long time. Previous deep learning-based image dehazing methods have failed to achieve satisfactory dehazing effects on both synthetic and real-world datasets, exhibiting poor generalization. Moreover, single-stage networks often produce output images with many artifact-laden regions and color distortion. To address these issues, this paper proposes a two-stage image dehazing network called TSNet, mainly consisting of a multi-scale fusion module (MSFM) and an adaptive learning module (ALM). Specifically, the MSFM and ALM enhance the generalization of TSNet. The MSFM obtains large receptive fields at multiple scales and integrates features at different frequencies to reduce the differences between inputs and learning objectives. The ALM actively learns regions of interest in images and restores texture details more effectively. Additionally, TSNet is designed as a two-stage network, where the first-stage network performs image dehazing and the second-stage network corrects issues such as artifacts and color distortion present in the first-stage results. We also change the learning objective from ground truth images to opposite fog maps, which improves the learning efficiency of TSNet. Extensive experiments demonstrate that TSNet exhibits superior dehazing performance on both synthetic and real-world datasets compared to previous state-of-the-art methods. The related code is released at https://github.com/zzhlovexuexi/TSNet. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. U-NET: A Supervised Approach for Monaural Source Separation.
- Author
- Basir, Samiul, Hossain, Md. Nahid, Hosen, Md. Shakhawat, Ali, Md. Sadek, Riaz, Zainab, and Islam, Md. Shohidul
- Subjects
- CONVOLUTIONAL neural networks, FOURIER transforms, DEEP learning
- Abstract
Separating speech is a challenging area of research, especially when trying to separate the desired source from a mixture. Deep learning has arisen as a promising solution, surpassing traditional methods. While prior research has mainly focused on the magnitude, log-magnitude, or a combination of the magnitude and phase portions, a new approach is proposed that uses the Short-time Fourier Transform (STFT) and a deep Convolutional Neural Network named U-NET. This method, unlike others, considers both the real and imaginary components for decomposition. During the training stage, the mixed time-domain signal is transformed into a frequency-domain signal using the STFT, producing a mixed complex spectrogram. The spectrogram's real and imaginary parts are then separated and combined into a single matrix, which is fed through U-NET to extract the source components. The same process is repeated at testing: the concatenated matrix for the mixed test signal is passed through the saved model to generate two enhanced concatenated matrices, one for each source. These matrices are then transformed back into time-domain signals using the inverse STFT after extracting the magnitude and phase. The proposed approach has been evaluated on the GRID audiovisual corpus, with results showing improved quality and intelligibility compared to existing methods, as demonstrated by objective measurement metrics. [ABSTRACT FROM AUTHOR]
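The data layout this abstract describes, a complex spectrogram split into real and imaginary parts and stacked into one real-valued matrix, can be sketched as below. This is our illustration of the representation only, not the authors' code; the frame length and hop size are arbitrary.

```python
import numpy as np

# Sketch of the real/imaginary input representation: the STFT of the
# mixture is split into real and imaginary parts, stacked along the
# frequency axis into one matrix a U-Net-style model would consume,
# and the inverse path rebuilds the complex spectrogram.

def stft(x, n_fft=64, hop=32):
    window = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * window
              for i in range(0, len(x) - n_fft + 1, hop)]
    return np.fft.rfft(np.array(frames), axis=1).T  # (freq, time)

def to_real_imag_matrix(spec):
    # stack real part on top of imaginary part along the frequency axis
    return np.concatenate([spec.real, spec.imag], axis=0)

def from_real_imag_matrix(mat):
    half = mat.shape[0] // 2
    return mat[:half] + 1j * mat[half:]
```

Round-tripping through the stacked matrix is lossless, which is what lets the network's output matrices be converted back to complex spectrograms for inverse-STFT resynthesis.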
- Published
- 2024
- Full Text
- View/download PDF
7. SRU-Net: a novel spatiotemporal attention network for sclera segmentation and recognition.
- Author
- Mashayekhbakhsh, Tara, Meshgini, Saeed, Rezaii, Tohid Yousefi, and Makouei, Somayeh
- Abstract
Segmenting sclera images for effective recognition under non-cooperative conditions poses a significant challenge due to the prevalent noise. While U-Net-based methods have shown success, their limitations in accurately segmenting objects with varying shapes necessitate innovative approaches. This paper introduces the spatiotemporal residual encoding and decoding network (SRU-Net), featuring multi-spatiotemporal feature integration (Ms-FI) modules and attention-pool mechanisms to enhance segmentation accuracy and robustness. Ms-FI modules within SRU-Net’s encoders and decoders identify salient feature regions and prune responses, while attention-pool modules improve segmentation robustness. To assess the proposed SRU-Net, we conducted experiments using six datasets, employing precision, recall, and F1-score metrics. The experimental results demonstrate the superiority of SRU-Net over state-of-the-art methods. Specifically, SRU-Net achieves F1-score values of 94.58%, 98.31%, 98.49%, 97.52%, 95.3%, 97.47%, and 93.11% for MSD, MASD, SVBPI, MASD+MSD, UBIRIS.v1, UBIRIS.v2, and MICHE, respectively. Recognition performance was further evaluated on these datasets using metrics such as AUC, EER, VER@0.1%FAR, and VER@1%FAR. The proposed pipeline, comprising SRU-Net and autoencoders (AE), outperforms previous research on all datasets. Particularly noteworthy is the comparison of EER, where SRU-Net + AE exhibits the best recognition results, achieving an EER of 9.42%, 3.81%, and 5.73% for the MSD, MASD, and MICHE datasets, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. MSU-Net: the multi-scale supervised U-Net for image splicing forgery localization.
- Author
- Yu, Hao, Su, Lichao, Dai, Chenwei, and Wang, Jinli
- Abstract
Image splicing forgery, that is, copying parts of one image into another, is one of the most frequently used tampering methods in image forgery. As a research hotspot in recent years, deep learning has been used in image forgery detection. However, current deep learning methods have two drawbacks: first, their feature fusion is too simple; second, they rely on a single cross-entropy loss as the loss function, leaving models prone to overfitting. To address these issues, an image splicing forgery localization method based on a multi-scale supervised U-shaped network, named MSU-Net, is proposed in this paper. First, a triple-stream feature extraction module is designed, which combines the noise view and edge information of the input image to extract semantic-related and semantic-agnostic features. Second, a feature hierarchical fusion mechanism is proposed that introduces a channel attention mechanism layer by layer to perceive multi-level manipulation trajectories, avoiding the loss of information in semantic-related and semantic-agnostic shallow features during the convolution process. Finally, a multi-scale supervision strategy is developed: a boundary artifact localization module is designed to compute the edge loss, and a contrastive learning module is introduced to compute the contrastive loss. Through extensive experiments on several public datasets, MSU-Net demonstrates high accuracy in localizing tampered regions and outperforms state-of-the-art methods. Additional attack experiments show that MSU-Net exhibits good robustness against Gaussian blur, Gaussian noise, and JPEG compression attacks. Besides, MSU-Net is superior in terms of model complexity and localization speed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Titanium Alloy Weld Time-of-Flight Diffraction Image Denoising Based on a Wavelet Feature Fusion Deep-Learning Model.
- Author
- Zhi, Zelin, Jiang, Hongquan, Yang, Deyan, Yue, Kun, Gao, Jianmin, Cheng, Zhixiang, Xu, Yongjun, Geng, Qiang, and Zhou, Wei
- Subjects
- IMAGE denoising, WELDED joints, WELDING, NONDESTRUCTIVE testing, IMAGE fusion, TITANIUM alloys
- Abstract
Images of titanium alloy welds detected by time-of-flight diffraction (TOFD) suffer from large noise signals and many interference streaks around defects, all of which seriously limit the accuracy and effectiveness of defect recognition. Existing image denoising methods lack knowledge of the noise characteristics of titanium alloy weld TOFD images and the preprocessing experience of technicians in the field. In addition, the parameters of the preprocessing methods are difficult to select and are easily influenced by the skill level of technical personnel, resulting in low efficiency and poor consistency in preprocessing. To address these problems, we proposed a denoising method for TOFD images of titanium alloy welds based on the combination of wavelet band features and deep-learning theory. First, based on the wavelet preprocessing method and the experience of nondestructive testing (NDT) technicians, we constructed an image-pair dataset consisting of the original TOFD images of titanium alloy welds and the desired target images to accumulate engineers' preprocessing knowledge. Second, we constructed a multiband wavelet feature fusion U-net image denoising model (WU-net) and designed a loss function under three constraints: image consistency, image texture information consistency, and structural similarity. This model learns end-to-end adaptive denoising for TOFD images of titanium alloy welds. Third, we illustrated and validated the effectiveness of TOFD image preprocessing for titanium alloy welds. The results showed that the proposed method effectively eliminated TOFD image noise and improved the accuracy of defect recognition. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. CST-UNet: Cross Swin Transformer Enhanced U-Net with Masked Bottleneck for Single-Channel Speech Enhancement.
- Author
- Zhang, Zipeng, Chen, Wei, Guo, Weiwei, Liu, Yiming, Yang, Jianhua, and Liu, Houguang
- Subjects
- SPEECH enhancement, TRANSFORMER models, COMPUTATIONAL complexity, CORPORA, DEEP learning
- Abstract
Speech enhancement performance has improved significantly with the introduction of deep learning models, especially methods based on the Long–Short-Term Memory architecture. However, these methods face challenges such as high computational complexity and redundancy of input features. To address these issues, we propose a U-Net-based approach that utilizes an encoder/decoder to extract more concise features, thereby improving single-channel speech enhancement performance and reducing computational complexity. The proposed method includes a Cross-Swin-Transformer block and a masked bottleneck module, which down-samples features while preserving detailed representations through skip connections and carefully designed blocks. The bottleneck module extracts coarse representations of hidden features as masks. We evaluated our method against other U-Net-based approaches on the VCTK and DNS corpora using the CBAK, eSTOI, PESQ, STOI, and SI-SDR metrics. The results demonstrate that the proposed method achieves promising performance while significantly reducing computational complexity. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Real-time anomaly detection for 'Remote' bus stop surveillance using unsupervised conditional generative adversarial networks.
- Author
- Xi, Beihao and Chen, Qingkui
- Subjects
- GENERATIVE adversarial networks, PUBLIC safety, BUS stops, IMAGE segmentation, ROAD safety measures, INTRUSION detection systems (Computer security), VIDEO surveillance
- Abstract
In response to the imbalance between normal and abnormal samples in existing anomaly detection datasets, as well as the complexity in defining anomalies, we introduce a new dataset named Remote Stop to provide data support for existing algorithms. Concurrently, we propose an unsupervised video anomaly detection method based on conditional generative adversarial networks. Our approach trains the model to learn the distribution of normal video data, enabling it to identify anomalous events. The incorporation of a spatial attention mechanism enhances the model's performance in detecting abnormal behaviors in video frames while maintaining high processing efficiency. Moreover, unlike other methods that assess the entire image, our approach uses overlapping image blocks to determine anomalies, enhancing the accuracy and robustness of the model in image segmentation. These innovations not only address the issues of scarce samples and high-cost labeling but also provide new perspectives and tools for video anomaly detection in the field of public safety. The effectiveness of the model was validated on the Avenue and Ped2 datasets and applied to our newly created dataset (Remote Stop), achieving an AUC of 84.3% and processing 61 video frames per second. This enables efficient sequential processing of large-scale video data, offering positive contributions to enhancing public road safety by providing early warnings and enabling timely preventive measures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. Automated shoreline extraction process for unmanned vehicles via U-net with heuristic algorithm.
- Author
- Prokop, Katarzyna, Połap, Dawid, Włodarczyk-Sielicka, Marta, Połap, Karolina, Jaszcz, Antoni, and Stateczny, Andrzej
- Subjects
- HEURISTIC algorithms, DATABASES, GEOGRAPHIC boundaries, IMAGE processing, REAL estate development
- Abstract
Detecting the shoreline is an important task for its potential uses. The shoreline allows cropping an image into two separate areas that represent the water and the shore. It is particularly interesting because the images can be used to analyze pollution, land development, or even waterfront erosion. Unfortunately, automatic shoreline detection is a complex problem due to numerous physical and atmospheric issues. In this paper, we present a solution based on a U-net convolutional network that is trained for shoreline detection on a dedicated database. The database is automatically generated by applying image processing techniques and a heuristic algorithm. Using heuristics, optimal values of the mask-generation parameters are determined. Consequently, the solution automates the generation of a set of masks by analyzing the boundary line and the efficiency of the segmentation network. The proposed solution allows for analysis of the coastline, where potential obstacles and even occurring waves can be quickly detected. To evaluate the proposed solution, tests were carried out in real conditions, which showed the effectiveness of the model. In addition, tests were carried out on a publicly available database, on which the model achieved higher results than existing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. Cell nuclei image segmentation using U-Net and DeepLabV3+ with transfer learning and regularization.
- Author
- Koishiyeva, Dina, Sydybayeva, Madina, Belginova, Saule, Yeskendirova, Damelya, Azamatova, Zhanerke, Kalpebayev, Azamat, and Beketova, Gulzhanat
- Subjects
- MACHINE learning, COMPUTER vision, CELL nuclei, FEATURE extraction, IMAGE segmentation
- Abstract
Semantic nuclei segmentation is a challenging area of computer vision. Accurate, automatic nuclei segmentation can help medics diagnose many diseases, such as cancer, by providing automatic tissue analysis. Deep learning algorithms allow automatic feature extraction from medical images; however, hematoxylin and eosin (H&E) stained images are challenging due to variability in staining and textures. Using pre-trained models in deep learning speeds up development and improves performance. This paper compares the DeepLabV3+ and U-Net deep learning methods with the pre-trained models ResNet-50 and EfficientNetB4 embedded in their architectures. In addition, different regularization and dropout parameters are applied to prevent overfitting. The experiment was conducted on the PanNuke dataset, consisting of nearly 8,000 histological images with annotated nuclei. As a result, the ResNet-50-based DeepLabV3+ model with L2 regularization of 0.02 and dropout of 0.7 performed best, with a Dice coefficient (DSC) of 0.8356, intersection over union (IOU) of 0.7280, and loss of 0.3212 on the test set. [ABSTRACT FROM AUTHOR]
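The Dice coefficient and intersection over union used to score these models are standard overlap metrics on binary masks. A minimal sketch (epsilon smoothing is our convention to avoid division by zero on empty masks):

```python
import numpy as np

# Dice coefficient (DSC) and intersection over union (IoU) for binary
# segmentation masks: DSC = 2|A∩B| / (|A| + |B|), IoU = |A∩B| / |A∪B|.

def dice_coefficient(pred, target, eps=1e-7):
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def iou(pred, target, eps=1e-7):
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (inter + eps) / (union + eps)
```

The two metrics are monotonically related (DSC = 2·IoU / (1 + IoU)), which is why papers often report both from the same predictions.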
- Published
- 2024
- Full Text
- View/download PDF
14. Towards improved U-Net for efficient skin lesion segmentation.
- Author
- Nampalle, Kishore Babu, Pundhir, Anshul, Jupudi, Pushpamanjari Ramesh, and Raman, Balasubramanian
- Subjects
- SKIN imaging, DEEP learning, SKIN cancer, DIAGNOSTIC imaging, MEDICAL personnel
- Abstract
Skin cancer is a highly lethal disease, and detecting it at an early stage is critical. Skin lesion segmentation is a complex process involving identifying the infected area in an image with low contrast, variable size, and position. This task is essential in medical analysis, as it helps clinicians focus on a specific area of the image before further analysis. Our paper introduces a new method for improving the segmentation of medical images by providing efficient neural connections to design an efficient U-Net architecture. We have utilized skip paths to the encoder to minimize the semantic gap between concatenated feature maps, which leads to more precise segmentation outcomes. We have used PH2 and ISIC-2018 as benchmark datasets to validate the effectiveness of the proposed approach, which surpasses the available benchmark performance. We have obtained approximately 96.18% accuracy on the PH2 dataset and 96.09% accuracy on the ISIC-2018 dataset. The outcomes of our architecture are quite impressive, exhibiting superior performance over both the baseline model and other state-of-the-art techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. Enhancing liver tumor segmentation with UNet-ResNet: Leveraging ResNet’s power.
- Author
- Sheela, K. Selva, Justus, Vivek, Asaad, Renas Rajab, and Kumar, R. Lakshmana
- Subjects
- ARTIFICIAL neural networks, LIVER tumors, LIVER cancer, DEEP learning, COMPUTED tomography
- Abstract
Liver cancer poses a significant health challenge due to its high incidence rates and complexities in detection and treatment. Accurate segmentation of liver tumors using medical imaging plays a crucial role in early diagnosis and treatment planning. This study proposes a novel approach combining U-Net and ResNet architectures with the Adam optimizer and sigmoid activation function. The method leverages ResNet’s deep residual learning to address training issues in deep neural networks. At the same time, U-Net’s structure facilitates capturing local and global contextual information essential for precise tumor characterization. The model aims to enhance segmentation accuracy by effectively capturing intricate tumor features and contextual details by integrating these architectures. The Adam optimizer expedites model convergence by dynamically adjusting the learning rate based on gradient statistics during training. To validate the effectiveness of the proposed approach, segmentation experiments are conducted on a diverse dataset comprising 130 CT scans of liver cancers. Furthermore, a state-of-the-art fusion strategy is introduced, combining the robust feature learning capabilities of the UNet-ResNet classifier with Snake-based Level Set Segmentation. Experimental results demonstrate impressive performance metrics, including an accuracy of 0.98 and a minimal loss of 0.10, underscoring the efficacy of the proposed methodology in liver cancer segmentation. This fusion approach effectively delineates complex and diffuse tumor shapes, significantly reducing errors. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. AN INTELLIGENT MODEL FOR BENIGN AND MALIGNANT PULMONARY NODULE ANALYSIS USING U-NET NETWORKS AND MULTILEVEL ATTENTION MECHANISMS.
- Author
- YANG, QING and CHEN, JUN
- Subjects
- CONVOLUTIONAL neural networks, IMAGE processing, COMPUTER vision, COMPUTER-assisted image analysis (Medicine), IMAGE analysis, IMAGE segmentation, LUNGS
- Abstract
A key study area throughout the medical sector for image processing and analysis is medical image segmentation. Accurate and effective medical image segmentation gives doctors a solid foundation for their diagnosis and treatment strategies. Conventional approaches in this field rely on manual feature extraction, which makes segmentation complex, costs doctors’ time and energy, and involves subjective evaluation that is readily susceptible to diagnostic errors. Researchers have applied convolutional neural network-based deep learning techniques to the segmentation of medical images as a result of their impressive advancements and successes in the field of computer vision. The research described here uses the U-Net network’s outstanding feature learning capabilities and end-to-end processing mode for lung CT image segmentation via fully convolutional network (FCN) research. However, focusing on the valuable, crucial information in the U-Net network is challenging. Inspired by attention mechanisms, this study therefore employed multilevel attention mechanisms on the basis of the U-Net network to enhance the model’s accuracy in lung CT image segmentation. To improve segmentation accuracy and optimize the segmentation effect, the new model embeds a self-attention module in front of each upsampling layer in the U-Net model. This module provides more detailed information by stitching in the self-attention output of the original image, and it suppresses irrelevant and redundant information through the feature extraction effect of the upsampling layer. Several comparative experiments were conducted on the 2019nCoVR dataset. The outcomes demonstrate the efficacy of the optimized model described in this paper and its improved segmentation effects on lung CT images. Additionally, the new model has distinct advantages over existing approaches typical of medical image segmentation, reflecting its higher level of lung CT image segmentation performance. [ABSTRACT FROM AUTHOR]
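The self-attention module placed before each upsampling layer can be sketched as standard scaled dot-product attention over a flattened feature map. This is a generic single-head illustration, not the paper's module; the projection weights here are random stand-ins for trained parameters.

```python
import numpy as np

# Single-head scaled dot-product self-attention over flattened feature
# tokens: attention(Q, K, V) = softmax(QK^T / sqrt(d)) V.

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    # x: (tokens, dim); wq/wk/wv: (dim, dim) projection matrices
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = (q @ k.T) / np.sqrt(k.shape[-1])  # scaled dot products
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ v
```

Because each output token is a weighted mixture of all value tokens, the module can emphasize salient regions and suppress redundant ones, which matches the role the abstract assigns it.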
- Published
- 2024
- Full Text
- View/download PDF
17. Diabetic retinopathy disease detection using shapley additive ensembled densenet-121 resnet-50 model.
- Author
- Mary, A. Rosline and Kavitha, P.
- Subjects
- DIABETIC retinopathy, ARTIFICIAL intelligence, RETINAL blood vessels, DEEP learning, FUNDUS oculi, IMAGE recognition (Computer vision), EYE abnormalities, EYE diseases
- Abstract
Diabetic retinopathy (DR) is a common eye disease that results in vision loss by damaging the blood vessels. Diabetic patients are at high risk of developing DR owing to damage to retinal lesions, which causes clots, injuries, and bleeding. The disease leads to abnormal changes in the structure of the retina. Therefore, the timely detection and early treatment of eye diseases are essential for preventing vision loss. Ophthalmologists distinguish DR based on features such as exudates, microaneurysms, blood vessel area, hemorrhages, etc. An artificial intelligence (AI) method is proposed by ensembling a deep learning (DL) model with the explainable-AI-based Shapley additive (SHAP) method for DR image segmentation and classification. The proposed model uses fundus images to detect abnormalities in the eye. First, the DR images are collected from the Asia Pacific Tele-Ophthalmology Society 2019 blindness detection (APTOS 2019) dataset. Data augmentation is performed to artificially increase the size of the training dataset by generating new data samples from existing ones via rescaling, flipping, rotating, zooming, etc. Pre-processing is then performed to improve image quality and enhance specific features. The pre-processed images are segmented using the improved U-Net model, where the severity of the disease is predicted, and the fundus images are segmented using the trained model to attain precise extraction of retinal blood vessels. Finally, the proposed study uses the Shapley Additive Ensembled DenseNet-121 ResNet-50 (SAE-DR) model to detect DR disease based on these features. To improve the interpretability of the deep learning model, the explainable-AI-based Shapley additive method is applied. The results of the proposed model are then compared with existing state-of-the-art methods. 
The simulation results show that the proposed model achieves superior detection performance, with an accuracy of 98.69%, sensitivity of 86.23%, specificity of 97.54%, F-score of 90.26%, precision of 94.26%, and a processing time of 0.153 s. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Rs-net: Residual Sharp U-Net architecture for pavement crack segmentation and severity assessment.
- Author
- Ali, Luqman, AlJassmi, Hamad, Swavaf, Mohammed, Khan, Wasif, and Alnajjar, Fady
- Subjects
- CRACKING of pavements, IMAGE segmentation, DEEP learning, PAVEMENTS, PLAINS
- Abstract
U-Net, a fully convolutional network-based image segmentation method, has demonstrated widespread adaptability in the crack segmentation task. However, the combination of the semantically dissimilar features of the encoder (shallow layers) and the decoder (deep layers) in the skip connections leads to blurry feature maps and undesirable over- or under-segmentation of target regions. Additionally, the shallow architecture of the U-Net model prevents the extraction of more discriminatory information from input images. This paper proposes a Residual Sharp U-Net (RS-Net) architecture for crack segmentation and severity assessment in pavement surfaces to address these limitations. The proposed architecture uses residual blocks in the U-Net model to extract a more insightful representation of features. In addition, a sharpening kernel filter is used instead of plain skip connections to generate a fine-tuned encoder feature map before combining it with the decoder feature maps, reducing the dissimilarity between them and smoothing artifacts in the network layers during early training. The proposed architecture is also integrated with various morphological operations to assess the severity of cracks and categorize them into hairline, medium, and severe labels. Experimental results demonstrated that the RS-Net model has promising segmentation performance, outperforming earlier U-Net variations on the test data for crack segmentation and severity assessment, with a promising accuracy (>0.97). [ABSTRACT FROM AUTHOR]
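The "sharpening kernel filter instead of plain skip connections" can be illustrated with the common 3×3 sharpening kernel. The specific kernel values are a textbook choice, our assumption rather than the paper's; RS-Net may tune its own.

```python
import numpy as np

# A common 3x3 sharpening kernel (center-heavy Laplacian-style filter);
# applied to an encoder feature map before it is concatenated with the
# decoder features, as the skip-connection replacement described above.
SHARPEN_KERNEL = np.array([[ 0, -1,  0],
                           [-1,  5, -1],
                           [ 0, -1,  0]], dtype=float)

def sharpen(feature_map):
    # 'same'-size 2-D filtering with zero padding; the kernel is
    # symmetric, so correlation and convolution coincide here
    h, w = feature_map.shape
    padded = np.pad(feature_map, 1)
    out = np.zeros_like(feature_map, dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * SHARPEN_KERNEL)
    return out
```

Since the kernel sums to 1, flat regions pass through unchanged while edges are amplified, which is the sense in which the encoder map is "fine-tuned" before fusion.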
- Published
- 2024
- Full Text
- View/download PDF
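The sharpening-kernel skip connection in RS-Net above can be illustrated with a standard 3×3 sharpening filter (identity plus a Laplacian high-pass) applied to the encoder feature map before it is merged with the decoder map. This is a minimal single-channel NumPy sketch, not the authors' implementation; the kernel values and the stacking step are assumptions:

```python
import numpy as np

# Common 3x3 sharpening kernel (identity + Laplacian); the paper's exact
# kernel is not reproduced here.
SHARPEN = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]], dtype=float)

def conv2d_same(img, kernel):
    """Naive 'same'-size 2D convolution with zero padding (illustration only)."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def sharp_skip(encoder_map, decoder_map):
    """Sharpen the encoder feature map, then stack it with the decoder map
    along a channel axis, mimicking a skip-connection concatenation."""
    sharpened = conv2d_same(encoder_map, SHARPEN)
    return np.stack([sharpened, decoder_map], axis=0)
```

On a flat (constant) region the sharpening kernel leaves interior values unchanged; only edges and fine structure are amplified, which is the intended effect on encoder features.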
19. BFNet: a full-encoder skip connect way for medical image segmentation.
- Author
-
Siyu Zhan, Quan Yuan, Xin Lei, Rui Huang, Lu Guo, Ke Liu, and Rong Chen
- Subjects
ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,COMPUTER-assisted image analysis (Medicine) ,DEEP learning ,IMAGE segmentation - Abstract
In recent years, semantic segmentation with deep learning has been widely applied to medical image segmentation, leading to the development of numerous models. Convolutional Neural Networks (CNNs) have achieved milestone results in medical image analysis. In particular, deep neural networks based on U-shaped architectures and skip connections have been extensively employed in various medical image tasks. U-Net, characterized by its encoder-decoder architecture, pioneering skip connections, and multi-scale features, has served as a fundamental network architecture for many modifications. However, U-Net cannot fully utilize all the information from the encoder layers in the decoder layers. U-Net++ connects intermediate feature maps of different dimensions through nested and dense skip connections; however, it only alleviates the underuse of encoder information and greatly increases the number of model parameters. In this paper, a novel BFNet is proposed that utilizes all feature maps from the encoder at every layer of the decoder and reconnects them with the current layer of the encoder. This allows the decoder to better learn the positional information of segmentation targets and improves the learning of boundary information and abstract semantics in the current encoder layer. Our proposed method yields a significant accuracy improvement of 1.4 percentage points. Besides enhancing accuracy, the proposed BFNet also reduces network parameters. All of these advantages are demonstrated on our dataset. We also discuss how different loss functions influence this model and some possible improvements. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Development and performance evaluation of fully automated deep learning-based models for myocardial segmentation on T1 mapping MRI data.
- Author
-
Manzke, Mathias, Iseke, Simon, Böttcher, Benjamin, Klemenz, Ann-Christin, Weber, Marc-André, and Meinel, Felix G.
- Subjects
- *
MYOCARDIAL perfusion imaging , *DEEP learning , *DATA mapping , *CARDIAC magnetic resonance imaging - Abstract
To develop a deep learning-based model capable of segmenting the left ventricular (LV) myocardium on native T1 maps from cardiac MRI in both long-axis and short-axis orientations. Models were trained on native myocardial T1 maps from 50 healthy volunteers and 75 patients using manual segmentation as the reference standard. Based on a U-Net architecture, we systematically optimized the model design using two different training metrics (Sørensen-Dice coefficient = DSC and Intersection-over-Union = IOU), two different activation functions (ReLU and LeakyReLU) and various numbers of training epochs. Training with the DSC metric and a ReLU activation function over 35 epochs achieved the highest overall performance (mean error in T1 10.6 ± 17.9 ms, mean DSC 0.88 ± 0.07). Limits of agreement between model results and ground truth were from -35.5 to +36.1 ms. This was superior to the agreement between two human raters (-34.7 to +59.1 ms). Segmentation was as accurate for long-axis views (mean error T1: 6.77 ± 8.3 ms, mean DSC: 0.89 ± 0.03) as for short-axis images (mean error ΔT1: 11.6 ± 19.7 ms, mean DSC: 0.88 ± 0.08). Fully automated segmentation and quantitative analysis of native myocardial T1 maps is possible in both long-axis and short-axis orientations with very high accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
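The two training metrics compared in the study above, the Sørensen-Dice coefficient (DSC) and Intersection-over-Union (IOU), have standard definitions on binary masks; a minimal sketch on flat 0/1 lists:

```python
def dice(pred, truth):
    """Sørensen-Dice coefficient for flat binary masks (lists of 0/1)."""
    inter = sum(p * t for p, t in zip(pred, truth))
    return 2 * inter / (sum(pred) + sum(truth))

def iou(pred, truth):
    """Intersection-over-Union (Jaccard index) for flat binary masks."""
    inter = sum(p * t for p, t in zip(pred, truth))
    union = sum(pred) + sum(truth) - inter
    return inter / union
```

The two metrics are monotonically related (DSC = 2·IoU / (1 + IoU)), so they rank segmentations identically; as training objectives, however, they weight errors differently, which is why the study compares them.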
21. Gray-Scale Extraction of Bone Features from Chest Radiographs Based on Deep Learning Technique for Personal Identification and Classification in Forensic Medicine.
- Author
-
Kim, Yeji, Yoon, Yongsu, Matsunobu, Yusuke, Usumoto, Yosuke, Eto, Nozomi, and Morishita, Junji
- Subjects
- *
POSTMORTEM imaging , *RECEIVER operating characteristic curves , *FORENSIC pathology , *CHEST X rays , *DEEP learning - Abstract
Post-mortem (PM) imaging has potential for identifying individuals by comparing ante-mortem (AM) and PM images. Radiographic images of bones contain significant information for personal identification. However, PM images are affected by soft tissue decomposition; therefore, it is desirable to extract only images of bones that change little over time. This study evaluated the effectiveness of U-Net for bone image extraction from two-dimensional (2D) X-ray images. Two types of pseudo 2D X-ray images were created from the PM computed tomography (CT) volumetric data using ray-summation processing for training U-Net. One was a projection of all body tissues, and the other was a projection of only bones. The performance of the U-Net for bone extraction was evaluated using Intersection over Union, Dice coefficient, and the area under the receiver operating characteristic curve. Additionally, AM chest radiographs were used to evaluate its performance with real 2D images. Our results indicated that bones could be extracted visually and accurately from both AM and PM images using U-Net. The extracted bone images could provide useful information for personal identification in forensic pathology. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
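The ray-summation processing described in the entry above collapses a CT volume into a pseudo 2D radiograph by summing voxel values along the ray direction. A minimal NumPy sketch; the optional thresholding used here to keep only bright, bone-like voxels is a crude stand-in for the paper's bone-only projection and is an assumption, not their method:

```python
import numpy as np

def ray_summation(ct_volume, axis=1, bone_threshold=None):
    """Collapse a CT volume into a pseudo 2D X-ray by summing voxel
    values along one axis (a simple ray-sum projection). If a threshold
    is given, voxels below it are zeroed first so that only bright,
    bone-like voxels contribute (illustrative shortcut only)."""
    vol = np.asarray(ct_volume, dtype=float)
    if bone_threshold is not None:
        vol = np.where(vol >= bone_threshold, vol, 0.0)
    return vol.sum(axis=axis)
```

Running the projection twice, once on all tissues and once on the bone-restricted volume, yields the paired training images the study describes.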
22. Innovative Deep Learning Approaches for High-Precision Segmentation and Characterization of Sandstone Pore Structures in Reservoirs.
- Author
-
Suo, Limin, Wang, Zhaowei, Liu, Hailong, Cui, Likai, Sun, Xianda, and Qin, Xudong
- Subjects
MACHINE learning ,NATURAL gas prospecting ,POROSITY ,PETROLEUM prospecting ,DEEP learning ,SUPERVISED learning - Abstract
The detailed characterization of the pore structure in sandstone is pivotal for the assessment of reservoir properties and the efficiency of oil and gas exploration. Traditional fully supervised learning algorithms are limited in performance enhancement and require a substantial amount of accurately annotated data, which can be challenging to obtain. To address this, we introduce a semi-supervised framework with a U-Net backbone network. Our dataset was curated from 295 two-dimensional CT grayscale images, selected at intervals from nine 4 mm sandstone core samples. To augment the dataset, we employed StyleGAN2-ADA to generate a large number of images with a style akin to real sandstone images. This approach allowed us to generate pseudo-labels through semi-supervised learning, with only a small subset of the data being annotated. The accuracy of these pseudo-labels was validated using ensemble learning methods. The experimental results demonstrated a pixel accuracy of 0.9993, with a pore volume discrepancy of just 0.0035 compared to the actual annotated data. Furthermore, by reconstructing the three-dimensional pore structure of the sandstone, we have shown that the synthetic three-dimensional pores can effectively approximate the throat length distribution of the real sandstone pores and exhibit high precision in simulating throat shapes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. RegMamba: An Improved Mamba for Medical Image Registration.
- Author
-
Hu, Xin, Chen, Jiaqi, and Chen, Yilin
- Subjects
DIAGNOSTIC imaging ,TRANSFORMER models ,SAMPLING (Process) ,IMAGE registration ,POPULARITY ,RECORDING & registration - Abstract
Deformable medical image registration aims to minimize the differences between fixed and moving images to provide comprehensive physiological or structural information for further medical analysis. Traditional learning-based convolutional network approaches usually suffer from perceptual limitations, and in recent years the Transformer architecture has gained popularity for its superior long-range relational modeling capabilities, but it still faces severe computational challenges in handling high-resolution medical images. Recently, selective state-space models have shown great potential in the vision domain due to their fast inference and efficient modeling. Inspired by this, in this paper we propose RegMamba, a novel medical image registration architecture that combines convolutional and state-space models (SSMs), designed to efficiently capture complex correspondences in registration while maintaining computational efficiency. First, our model introduces Mamba to efficiently model long-range dependencies in the data and capture large deformations. At the same time, we use a scaled convolutional layer in Mamba to alleviate the spatial information loss caused by flattening 3D data in Mamba. Then, a deformable convolutional residual module (DCRM) is proposed to adaptively adjust sampling positions and process deformations, capturing more flexible spatial features while learning fine-grained features of different anatomical structures to construct local correspondences and improve model perception. We demonstrate the advanced registration performance of our method on the LPBA40 and IXI public datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. Detection of Scratch Defects on Metal Surfaces Based on MSDD-UNet.
- Author
-
Liu, Yan, Qin, Yunbai, Lin, Zhonglan, Xia, Haiying, and Wang, Cong
- Subjects
METALLIC surfaces ,METAL defects ,SURFACE defects ,METAL detectors - Abstract
In this work, we enhanced the U-shaped network and proposed a method for detecting scratches on metal surfaces based on the Metal Surface Defect Detection U-Net (MSDD-UNet). Initially, we integrated a downsampling approach using a Space-To-Depth module and a lightweight channel attention module to address the loss of contextual information in feature maps that results from multiple convolution and pooling operations. Building on this, we developed an improved attention module that utilizes image frequency decomposition and cross-channel self-attention mechanisms, as well as the strengths of convolutional encoders and self-attention blocks. Additionally, this attention module was integrated into the skip connections between the encoder and decoder. The purpose was to capture dense contextual information, highlight small and fine target areas, and assist in localizing micro and fine scratch defects. In response to the severe foreground–background class imbalance in scratch images, a hybrid loss function combining focal loss and Dice loss was put forward to train the model for precise scratch segmentation. Finally, experiments were conducted on two surface defect datasets. The results reveal that our proposed method is more advantageous than other state-of-the-art scratch segmentation methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
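The hybrid loss in MSDD-UNet combines focal loss (which down-weights easy, well-classified pixels to counter foreground-background imbalance) with Dice loss (which scores region overlap directly). A sketch of one common formulation on flat per-pixel probabilities; the equal weighting (alpha = 0.5) and gamma = 2 are assumptions, not the paper's settings:

```python
import math

def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean binary focal loss; (1 - pt)^gamma down-weights easy pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        pt = p if t == 1 else 1.0 - p        # probability of the true class
        total += -((1.0 - pt) ** gamma) * math.log(max(pt, eps))
    return total / len(probs)

def soft_dice_loss(probs, targets, eps=1e-7):
    """1 minus the soft Dice coefficient over per-pixel probabilities."""
    inter = sum(p * t for p, t in zip(probs, targets))
    return 1.0 - (2.0 * inter + eps) / (sum(probs) + sum(targets) + eps)

def hybrid_loss(probs, targets, alpha=0.5):
    """Weighted combination of focal and Dice losses (weights assumed)."""
    return alpha * focal_loss(probs, targets) + (1 - alpha) * soft_dice_loss(probs, targets)
```

A perfect prediction drives both terms to zero, while confident mistakes are penalized by the focal term and poor overlap by the Dice term.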
25. LYNSU: automated 3D neuropil segmentation of fluorescent images for Drosophila brains.
- Author
-
Kai-Yi Hsu, Chi-Tin Shih, Nan-Yow Chen, and Chung-Chuan Lo
- Subjects
FRUIT flies ,IMAGE segmentation ,DATABASES ,THREE-dimensional imaging ,DROSOPHILA - Abstract
The brain atlas, which provides information about the distribution of genes, proteins, neurons, or anatomical regions, plays a crucial role in contemporary neuroscience research. To analyze the spatial distribution of those substances based on images from different brain samples, we often need to warp and register individual brain images to a standard brain template. However, the process of warping and registration may lead to spatial errors, thereby severely reducing the accuracy of the analysis. To address this issue, we develop an automated method for segmenting neuropils in the Drosophila brain for fluorescence images from the FlyCircuit database. This technique allows future brain atlas studies to be conducted accurately at the individual level without warping and aligning to a standard brain template. Our method, LYNSU (Locating by YOLO and Segmenting by U-Net), consists of two stages. In the first stage, we use the YOLOv7 model to quickly locate neuropils and rapidly extract small-scale 3D images as input for the second stage model. This stage achieves a 99.4% accuracy rate in neuropil localization. In the second stage, we employ the 3D U-Net model to segment neuropils. LYNSU can achieve high accuracy in segmentation using a small training set consisting of images from merely 16 brains. We demonstrate LYNSU on six distinct neuropils or structures, achieving a high segmentation accuracy comparable to professional manual annotations with a 3D Intersection-over-Union (IoU) reaching up to 0.869. Our method takes only about 7 s to segment a neuropil while achieving a similar level of performance as the human annotators. To demonstrate a use case of LYNSU, we applied it to all female Drosophila brains from the FlyCircuit database to investigate the asymmetry of the mushroom bodies (MBs), the learning center of fruit flies. We used LYNSU to segment bilateral MBs and compare the volumes between left and right for each individual. 
Notably, of 8,703 valid brain samples, 10.14% showed bilateral volume differences that exceeded 10%. The study demonstrated the potential of the proposed method in high-throughput anatomical analysis and connectomics construction of the Drosophila brain. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
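The mushroom-body asymmetry analysis above compares left and right volumes per individual. One plausible definition of a "bilateral volume difference exceeding 10%" is the absolute difference normalized by the mean of the two sides; the paper's exact normalization is not stated here, so this definition is an assumption:

```python
def bilateral_asymmetry(left_voxels, right_voxels):
    """Relative left/right volume difference as a fraction of the mean
    (assumed definition; the paper may normalize differently)."""
    mean = (left_voxels + right_voxels) / 2.0
    return abs(left_voxels - right_voxels) / mean
```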
26. A novel generative adversarial network‐based super‐resolution approach for face recognition.
- Author
-
Chougule, Amit, Kolte, Shreyas, Chamola, Vinay, and Hussain, Amir
- Subjects
- *
DEEP learning , *FACE perception , *GENERATIVE adversarial networks , *COMPUTER vision , *ARCHITECTURAL style , *HIGH resolution imaging - Abstract
Face recognition is an essential feature required for a range of computer vision applications such as security, attendance systems, emotion detection, airport check‐in, and many others. The super‐resolution of subject images is an important and challenging element in numerous scenarios. At times the images are low resolution and need to be processed through super‐resolution techniques to gain more accurate results. For the problem of image super‐resolution, deep learning‐based face recognition systems have been explored in recent years; however, low‐resolution face recognition remains an arduous task. Generative adversarial network (GAN) based models are a promising approach to address this challenge. However, conventional GAN‐based models may generate images that differ significantly from an original high‐resolution image in the test set to the point that the identity of the target face may be changed. To address this shortcoming, we propose a novel U‐Net style generator architecture, where skip‐connections between the encoder and decoder layer can help in preserving the facial characteristics of the input image in the generated image, thus curbing the generator's ability to generate an entirely new image and training it to generate an image more similar in characteristics to the original image. In addition to statistical metrics like structural similarity index measure and Fréchet inception distance, we compute the pixel‐wise distance between the original and model‐generated images to ascertain that our model generates as close to the original images as possible. While we train the model for 4× super‐resolution (64 × 64 images to 256 × 256), our architecture can also be trained for an arbitrary resizing scale. Finally, the number of faces detected over high‐resolution images generated by our model is shown to be higher than state‐of‐the‐art high‐resolution image creation models for face recognition tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
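The pixel-wise distance used in the entry above to verify identity preservation can be as simple as a mean per-pixel difference between the original and model-generated images. Whether the paper uses an L1 or L2 distance is not specified here, so the L1 choice in this sketch is an assumption:

```python
import numpy as np

def pixelwise_distance(original, generated):
    """Mean absolute per-pixel difference between two equally sized images,
    a simple check that a generated face stays close to the original."""
    a = np.asarray(original, dtype=float)
    b = np.asarray(generated, dtype=float)
    assert a.shape == b.shape, "images must have the same resolution"
    return float(np.abs(a - b).mean())
```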
27. Efficient Extraction of Coronary Artery Vessels from Computed Tomography Angiography Images Using ResUnet and Vesselness.
- Author
-
Alirr, Omar Ibrahim, Al-Absi, Hamada R. H., Ashtaiwi, Abduladhim, and Khalifa, Tarek
- Subjects
- *
COMPUTED tomography , *CORONARY angiography , *ANGIOGRAPHY , *CORONARY arteries , *DEEP learning , *CARDIOVASCULAR diseases , *HEART , *THERAPEUTICS - Abstract
Accurate and efficient segmentation of coronary arteries from CTA images is crucial for diagnosing and treating cardiovascular diseases. This study proposes a structured approach that combines vesselness enhancement, heart region of interest (ROI) extraction, and the ResUNet deep learning method to accurately and efficiently extract coronary artery vessels. Vesselness enhancement and heart ROI extraction significantly improve the accuracy and efficiency of the segmentation process, while ResUNet enables the model to capture both local and global features. The proposed method outperformed other state-of-the-art methods, achieving a Dice similarity coefficient (DSC) of 0.867, a Recall of 0.881, and a Precision of 0.892. The exceptional results for segmenting coronary arteries from CTA images demonstrate the potential of this method to significantly contribute to accurate diagnosis and effective treatment of cardiovascular diseases. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Deep Learning–Based Prediction of Tunnel Face Stability in Layered Soils Using Images of Random Fields.
- Author
-
Zhang, Zheming, Wang, Ze Zhou, Goh, Siang Huat, and Ji, Jian
- Subjects
- *
RANDOM fields , *CONVOLUTIONAL neural networks , *SOILS , *FAILURE mode & effects analysis , *RANDOM numbers - Abstract
The stability analysis of tunnel faces in multilayered soils presents challenges due to the inherent variability in natural soils. Although the random field finite-element methods offer a reliable approach to address such variability, their heavy computational demands have been a significant drawback. To overcome this limitation, this study presents a novel deep learning–based method for efficient tunnel face stability analysis in layered soils with spatial variability. By combining the merits of convolutional neural networks (CNNs) and U-Net, the proposed method trains surrogate models using a small but sufficient number of random field images to effectively learn high-level features that encompass spatial variabilities, which significantly enhances computational efficiency. In particular, U-Net generates precise displacement field images based on random field images, enabling the discrimination of tunnel face collapse failure modes. To validate the effectiveness of this proposal, a comprehensive case study involving layered soils with spatial variabilities is conducted. The remarkable agreement between the outputs of CNNs and U-Net and the predictions of finite-element simulations underscores the promising potential of using deep-learning models as a surrogate for analyzing the stability of tunnel faces in spatially variable layered soils. Last but not least, the key innovation of this work lies in the pioneering application of U-Net for geotechnical reliability analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. Advancing Parsimonious Deep Learning Weather Prediction Using the HEALPix Mesh.
- Author
-
Karlbauer, Matthias, Cresswell‐Clay, Nathaniel, Durran, Dale R., Moreno, Raul A., Kurth, Thorsten, Bonev, Boris, Brenowitz, Noah, and Butz, Martin V.
- Subjects
- *
MACHINE learning , *NUMERICAL weather forecasting , *DEEP learning , *EVOLUTION equations , *PREDICTION models , *WEATHER forecasting - Abstract
We present a parsimonious deep learning weather prediction model to forecast seven atmospheric variables with 3‐hr time resolution for up to 1‐year lead times on a 110‐km global mesh using the Hierarchical Equal Area isoLatitude Pixelization (HEALPix). In comparison to state‐of‐the‐art (SOTA) machine learning (ML) weather forecast models, such as Pangu‐Weather and GraphCast, our DLWP‐HPX model uses coarser resolution and far fewer prognostic variables. Yet, at 1‐week lead times, its skill is only about 1 day behind both SOTA ML forecast models and the SOTA numerical weather prediction model from the European Center for Medium‐Range Weather Forecasts. We report several improvements in model design, including switching from the cubed sphere to the HEALPix mesh, inverting the channel depth of the U‐Net, and introducing gated recurrent units (GRU) on each level of the U‐Net hierarchy. The consistent east‐west orientation of all cells on the HEALPix mesh facilitates the development of location‐invariant convolution kernels that successfully propagate weather patterns across the globe without requiring separate kernels for the polar and equatorial faces of the cube sphere. Without any loss of spectral power after the first 2 days, the model can be unrolled autoregressively for hundreds of steps into the future to generate realistic states of the atmosphere that respect seasonal trends, as showcased in 1‐year simulations. Plain Language Summary: Weather forecasting traditionally relies on numerical weather prediction models that solve physical equations to simulate the evolution of the atmosphere. Such numerical models are compute intensive, and their performance is increasingly challenged by less compute demanding but still highly sophisticated machine learning (ML) approaches. Yet, a downside for many of these new ML models is that they tend to drift away from climatology while producing excessively smoothed fields if they are iteratively stepped forward for several months. 
Here, a parsimonious machine learning model is developed to forecast just seven atmospheric variables that can be stepped forward to give realistic weather patterns over a full year. Despite using at least a factor of 10 fewer variables than the 67–227 in the best ML models, our model generates 8‐day forecasts with errors that are only a day behind those from state‐of‐the‐art ML forecasts. Our model provides a path toward sub‐seasonal and seasonal forecasting that could potentially improve planning for agriculture, water resources, disaster preparedness, and energy production. Key Points: The model forecasts seven atmospheric variables, an order of magnitude fewer than used in state‐of‐the‐art ML weather forecast models. Forecasts are generated on the HEALPix mesh, facilitating the development of location‐invariant convolution kernels. Without converging to climatology, the model produces realistic atmospheric states in 365‐day iterative rollouts. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Intelligent skin lesion segmentation using deformable attention Transformer U‐Net with bidirectional attention mechanism in skin cancer images.
- Author
-
Cai, Lili, Hou, Keke, and Zhou, Su
- Subjects
- *
TRANSFORMER models , *SKIN imaging , *MELANOMA , *SKIN cancer , *GLOBAL method of teaching - Abstract
Background: In recent years, the increasing prevalence of skin cancers, particularly malignant melanoma, has become a major concern for public health. The development of accurate automated segmentation techniques for skin lesions holds immense potential in alleviating the burden on medical professionals. It is of substantial clinical importance for the early identification and intervention of skin cancer. Nevertheless, the irregular shape, uneven color, and noise interference of the skin lesions have presented significant challenges to the precise segmentation. Therefore, it is crucial to develop a high‐precision and intelligent skin lesion segmentation framework for clinical treatment. Methods: A precision‐driven segmentation model for skin cancer images is proposed based on the Transformer U‐Net, called BiADATU‐Net, which integrates the deformable attention Transformer and bidirectional attention blocks into the U‐Net. The encoder part utilizes deformable attention Transformer with dual attention block, allowing adaptive learning of global and local features. The decoder part incorporates specifically tailored scSE attention modules within skip connection layers to capture image‐specific context information for strong feature fusion. Additionally, deformable convolution is aggregated into two different attention blocks to learn irregular lesion features for high‐precision prediction. Results: A series of experiments are conducted on four skin cancer image datasets (i.e., ISIC2016, ISIC2017, ISIC2018, and PH2). The findings show that our model exhibits satisfactory segmentation performance, all achieving an accuracy rate of over 96%. Conclusion: Our experimental results validate that the proposed BiADATU‐Net achieves competitive, and in some cases superior, performance compared to state‐of‐the‐art methods. It is a promising and valuable approach in the field of skin lesion segmentation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. PSC diffusion: patch-based simplified conditional diffusion model for low-light image enhancement.
- Author
-
Wan, Fei, Xu, Bingxin, Pan, Weiguo, and Liu, Hongzhe
- Abstract
Low-light image enhancement is pivotal for augmenting the utility and recognition of visuals captured under inadequate lighting conditions. Previous methods based on Generative Adversarial Networks (GANs) are affected by mode collapse and lack attention to the inherent characteristics of low-light images. This paper proposes the Patch-based Simplified Conditional Diffusion Model (PSC Diffusion) for low-light image enhancement, motivated by the outstanding performance of diffusion models in image generation. Specifically, recognizing the potential issue of gradient vanishing in extremely low-light images due to smaller pixel values, we design a simplified U-Net architecture with SimpleGate and Parameter-free attention (SimPF) block to predict noise. This architecture utilizes a parameter-free attention mechanism and fewer convolutional layers to reduce multiplication operations across feature maps, resulting in a 12–51% reduction in parameters compared to the U-Nets used in several prominent diffusion models, which also accelerates the sampling speed. In addition, preserving intricate details in images during the diffusion process is achieved through a patch-based diffusion strategy, integrated with global structure-aware regularization, which effectively enhances the overall quality of the enhanced images. Experiments show that the method proposed in this paper achieves richer image details and better perceptual quality, while the sampling speed is over 35% faster than similar diffusion model-based methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
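The SimpleGate block mentioned above (introduced in NAFNet-style restoration networks) replaces a nonlinear activation with a parameter-free operation: split the channel dimension in half and multiply the two halves elementwise. A minimal channel-first NumPy sketch:

```python
import numpy as np

def simple_gate(x):
    """SimpleGate: split channels (first axis) in half and multiply the
    halves elementwise -- a parameter-free substitute for an activation."""
    c = x.shape[0]
    assert c % 2 == 0, "channel count must be even"
    return x[: c // 2] * x[c // 2:]
```

The output has half as many channels as the input, which is why blocks using it typically double the channel count beforehand.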
32. Automatic seismic fault identification based on an improved U-Net network.
- Author
-
Wu, Jizhong, Shi, Ying, Wang, Kexin, Yang, Chenyu, and Yang, Qianqian
- Subjects
- *
CONVOLUTIONAL neural networks , *FAULT zones , *FEATURE extraction , *ARTIFICIAL intelligence - Abstract
Fault identification is a key step in structural interpretation. Traditional fault identification methods are easily affected by the seismic quality and interpreter experience, and the identification methods for complex fault zones and multiscale faults require further improvement. The U-Net, frequently used at present, has achieved satisfactory results in fault identification. However, this network has a limited ability to extract and recover feature information and lacks an attention mechanism, limiting its ability to identify complex faults. To solve these problems, this study proposes an improved U-Net network model based on a conventional U-Net network. A multiscale residual module was used to extract the features instead of the U-Net's two-layer convolution. The residual jump connection replaced the U-Net skip connection to avoid the semantic loss caused by the fusion of high- and low-level semantic information. An attention mechanism was introduced to integrate the global, local, spatial, and channel features to ensure that the model could extract image features from various dimensions to the maximum extent. The improved U-Net network model was trained on synthetic model data and tested on actual field data. Our results show that the proposed network model is better at fault identification, avoids human interference to a certain extent, and is advantageous compared to the conventional U-Net method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. An improved fuzzy c-means method based on multivariate skew-normal distribution for brain MR image segmentation.
- Author
-
Guiyuan Zhu, Shengyang Liao, Tianming Zhan, and Yunjie Chen
- Abstract
Accurate segmentation of magnetic resonance (MR) images is crucial for providing doctors with effective quantitative information for diagnosis. However, the presence of weak boundaries, intensity inhomogeneity, and noise in the images poses challenges for segmentation models to achieve optimal results. While deep learning models can offer relatively accurate results, the scarcity of labeled medical imaging data increases the risk of overfitting. To tackle this issue, this paper proposes a novel fuzzy c-means (FCM) model that integrates a deep learning approach. To address the limited accuracy of traditional FCM models, which employ Euclidean distance as a distance measure, we introduce a measurement function based on the skew-normal distribution. This function enables us to capture more precise information about the distribution of the image. Additionally, we construct a regularization term based on the Kullback-Leibler (KL) divergence of high-confidence deep learning results. This regularization term helps enhance the final segmentation accuracy of the model. Moreover, we incorporate orthogonal basis functions to estimate the bias field and integrate it into the improved FCM method. This integration allows our method to simultaneously segment the image and estimate the bias field. The experimental results on both simulated and real brain MR images demonstrate the robustness of our method, highlighting its superiority over other advanced segmentation algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
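For reference, the baseline fuzzy c-means procedure that the paper above modifies alternates between membership and center updates using Euclidean distance. This sketch shows only that baseline; the paper's skew-normal distance measure, KL regularizer, and bias-field estimation are not reproduced here:

```python
import numpy as np

def fcm(data, n_clusters=2, m=2.0, n_iter=50, seed=0):
    """Standard fuzzy c-means with Euclidean distance and fuzzifier m.
    Returns cluster centers and the (n_clusters x n_points) memberships."""
    rng = np.random.default_rng(seed)
    x = np.asarray(data, dtype=float).reshape(len(data), -1)
    u = rng.random((n_clusters, len(x)))
    u /= u.sum(axis=0)                      # memberships sum to 1 per point
    for _ in range(n_iter):
        um = u ** m
        centers = um @ x / um.sum(axis=1, keepdims=True)
        d = np.linalg.norm(x[None, :, :] - centers[:, None, :], axis=2) + 1e-12
        u = 1.0 / (d ** (2 / (m - 1)))      # inverse-distance memberships
        u /= u.sum(axis=0)                  # renormalize per point
    return centers, u
```

For image segmentation, `data` would be the flattened pixel intensities; the paper's contribution is to replace the Euclidean distance `d` with a skew-normal-based measure and add the KL term to this objective.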
34. A DAU-Net-ConvLSTM Model for Daytime Sea Fog Segmentation.
- Author
-
Xiaokang Hu, Taorong Qiu, Yiqi Liao, Jing Wang, and Liangwei Lin
- Subjects
IMAGE segmentation ,WARNINGS - Abstract
Sea fog poses risks to coastal activities, necessitating effective monitoring and early warning systems. This study introduces a deep learning approach tailored for sea fog segmentation, considering its nonlinear multiscale variability, textural patterns, and temporal aspects. The U-Net model serves as the foundational network, enhanced by asymmetric multi-scale convolution modules to create the DAU-Net. This improved model effectively identifies sea fog features in images. Integrating the DAU-Net with ConvLSTM results in the DAU-Net-ConvLSTM model, which uses bidirectional ConvLSTM for processing temporal sequence data and refining segmentation outcomes. Comparative testing against seven segmentation models on augmented sea fog datasets revealed our model's superiority, achieving a 90.4% Kappa score and 86.4% MIOU. It outperforms existing CNN models like U-Net, U-Net++, Deeplab v3, and temporally-focused models like RNN, STGRU, 3D CNN-LSTM. This highlights its robust segmentation capabilities and potential for real-world applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
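The Kappa and MIOU scores reported for the DAU-Net-ConvLSTM model above follow the standard definitions of Cohen's kappa and mean Intersection-over-Union computed from a class confusion matrix; a minimal sketch:

```python
def kappa_and_miou(cm):
    """Cohen's kappa and mean IoU from a square confusion matrix
    (rows = ground truth, columns = prediction)."""
    k = len(cm)
    n = sum(map(sum, cm))
    row = [sum(cm[i]) for i in range(k)]
    col = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    po = sum(cm[i][i] for i in range(k)) / n           # observed agreement
    pe = sum(row[i] * col[i] for i in range(k)) / (n * n)  # chance agreement
    kappa = (po - pe) / (1 - pe)
    ious = [cm[i][i] / (row[i] + col[i] - cm[i][i]) for i in range(k)]
    return kappa, sum(ious) / k
```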
35. From CNN to Transformer: A Review of Medical Image Segmentation Models.
- Author
-
Yao, Wenjian, Bai, Jiajun, Liao, Wei, Chen, Yuheng, Liu, Mengjuan, and Xie, Yao
- Subjects
TUBERCULOSIS diagnosis ,LIVER radiography ,DIAGNOSTIC imaging ,OVARIAN tumors ,CHEST X rays ,NATURAL language processing ,DEEP learning ,MATHEMATICAL models ,ARTIFICIAL neural networks ,DIGITAL image processing ,THEORY - Abstract
Medical image segmentation is an important step in medical image analysis, especially as a crucial prerequisite for efficient disease diagnosis and treatment. The use of deep learning for image segmentation has become a prevalent trend. The widely adopted approach currently is U-Net and its variants. Moreover, with the remarkable success of pre-trained models in natural language processing tasks, transformer-based models like TransUNet have achieved desirable performance on multiple medical image segmentation datasets. Recently, the Segment Anything Model (SAM) and its variants have also been attempted for medical image segmentation. In this paper, we conduct a survey of the seven most representative medical image segmentation models of recent years. We theoretically analyze the characteristics of these models and quantitatively evaluate their performance on Tuberculosis Chest X-rays, Ovarian Tumors, and Liver Segmentation datasets. Finally, we discuss the main challenges and future trends in medical image segmentation. Our work can help researchers in related fields quickly establish medical segmentation models tailored to specific regions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
36. A Review of Advancements and Challenges in Liver Segmentation.
- Author
-
Wei, Di, Jiang, Yundan, Zhou, Xuhui, Wu, Di, and Feng, Xiaorong
- Subjects
CONVOLUTIONAL neural networks ,DEEP learning ,COMPUTER-assisted image analysis (Medicine) ,DIAGNOSTIC imaging ,IMAGE segmentation - Abstract
Liver segmentation technologies play vital roles in clinical diagnosis, disease monitoring, and surgical planning due to the complex anatomical structure and physiological functions of the liver. This paper provides a comprehensive review of the developments, challenges, and future directions in liver segmentation technology. We systematically analyzed high-quality research published between 2014 and 2024, focusing on liver segmentation methods, public datasets, and evaluation metrics. This review highlights the transition from manual to semi-automatic and fully automatic segmentation methods, describes the capabilities and limitations of available technologies, and provides future outlooks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
37. Optimizing TEM Image Segmentation: Advancements in DRU-Net Architecture with Dense Residual Connections and Attention Mechanisms.
- Author
-
Naik, M. Nagaraju, Dimmita, Nagajyothi, Chintamaneni, Vijayalakshmi, Rao, P. Srinivasa, Rajeswaran, Nagalingam, Jaffar, Amar Y., Aldosari, Fahd M., Eid, Wesam N., and Alharbi, Ayman A.
- Subjects
TRANSMISSION electron microscopy ,IMAGE transmission ,CELL imaging ,SCANNING electron microscopy - Abstract
This study introduces an innovative enhancement to the U-Net architecture, termed Modified DRU-Net, aiming to improve the segmentation of cell images in Transmission Electron Microscopy (TEM). Traditional U-Net models, while effective, often struggle to capture fine-grained details and preserve contextual information critical for accurate biomedical image segmentation. To overcome these challenges, Modified DRU-Net integrates dense residual connections and attention mechanisms into the U-Net framework. Dense connections enhance gradient flow and feature reuse, while residual connections mitigate the vanishing gradient problem, facilitating better model training. Attention blocks in the up-sampling path selectively focus on relevant features, boosting segmentation accuracy. Additionally, a combined loss function, merging focal loss and dice loss, addresses class imbalance and improves segmentation performance. Experimental results demonstrate that Modified DRU-Net significantly enhances performance metrics, underscoring its effectiveness in achieving detailed and accurate cell image segmentation in TEM images. [ABSTRACT FROM AUTHOR]
- Published
- 2024
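The combined loss mentioned above, focal loss plus dice loss, can be sketched for a binary mask as follows; the weighting `w` and the `gamma`, `alpha` constants are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """1 - Dice overlap between a soft prediction and a binary mask."""
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def focal_loss(pred, target, gamma=2.0, alpha=0.25, eps=1e-6):
    """Focal loss: down-weights easy pixels to counter class imbalance."""
    p = np.clip(pred, eps, 1 - eps)
    pt = np.where(target == 1, p, 1 - p)        # prob. of the true class
    a = np.where(target == 1, alpha, 1 - alpha)
    return float(np.mean(-a * (1 - pt) ** gamma * np.log(pt)))

def combined_loss(pred, target, w=0.5):
    """Weighted sum of focal and dice terms (w is an assumed weight)."""
    return w * focal_loss(pred, target) + (1 - w) * dice_loss(pred, target)
```

The focal term handles pixel-level class imbalance while the dice term directly optimizes region overlap, which is why the two are often combined.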
38. Brain tumor image segmentation using model average ensembling of deep networks.
- Author
-
Mishra, Ajey Shakti, Acharya, Upendra Kumar, Srivastava, Akanksha, Modi, Aashi Rohit, and Kumar, Sandeep
- Abstract
In the biomedical field, identification of brain tumors along with their location, regions of spreading, and speed of extension is of utmost importance in deciding the treatment for brain tumors. Automated segmentation plays a major role in detection because manual extraction of brain tumor sub-regions from MRI volumes is monotonous, error-prone, and intricate. Deep learning has contributed significantly to overcoming these issues. Therefore, a technique for the automated segmentation of MRI brain images has been developed using model average ensembling of deep networks such as 3D CNN and U-Net architectures. Both 3D CNN and U-Net architectures have made remarkable progress on the task of brain tumor segmentation, and owing to their reliability they are ensembled here to obtain a model with greater reliability. The novelty of this paper lies in building a robust segmentation technique by model average ensembling of 3D CNN and U-Net models for abnormality identification, with image quality improved by preprocessing methods. The model takes the BraTS-19 test set as its input dataset. After extensive experiments, the dice scores obtained by the proposed model for TC (Tumor Core), WT (Whole Tumor), and ET (Enhancing Tumor) are 0.9603, 0.9201, and 0.9237, respectively, better than those of existing techniques. The demonstrated results show the superiority of the proposed method, with an overall accuracy greater than 96%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
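Model average ensembling as described above amounts to averaging the per-voxel class probabilities of the member networks before taking the argmax; a toy sketch with hypothetical model outputs:

```python
import numpy as np

def ensemble_average(prob_maps):
    """Average per-voxel class probabilities from several models,
    then take the argmax as the ensemble segmentation."""
    mean_prob = np.mean(np.stack(prob_maps), axis=0)  # (C, ...) maps
    return mean_prob.argmax(axis=0)

# Hypothetical outputs of two models for 3 voxels and 2 classes
# (rows are class probabilities, columns are voxels):
p_cnn  = np.array([[0.9, 0.4, 0.2],
                   [0.1, 0.6, 0.8]])
p_unet = np.array([[0.7, 0.7, 0.1],
                   [0.3, 0.3, 0.9]])
```

Here the middle voxel flips to class 0 only after averaging, which is the smoothing effect that makes the ensemble more reliable than either member alone.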
39. A Multi-Temporal Remote Sensing Image Cloud Removal Algorithm Combining U-Net and STGAN.
- Author
-
王, 卓, 马, 骏, 郭, 毅, 周, 川杰, 柏, 彬, and 李, 峰
- Subjects
MACHINE learning ,IMAGE reconstruction ,CLOUDINESS ,REMOTE sensing ,DEEP learning - Abstract
Copyright of Journal of Remote Sensing is the property of Editorial Office of Journal of Remote Sensing & Science Publishing Co. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
40. Lung tumor cell classification with lightweight mobileNetV2 and attention-based SCAM enhanced faster R-CNN.
- Author
-
Jenipher, V. Nisha and Radhika, S.
- Abstract
Early and precise detection of lung tumor cells is paramount for providing adequate medication and increasing patient survival. To achieve this, an Enhanced Faster R-CNN framework with MobileNetV2 and SCAM is proposed to improve the diagnostic accuracy of lung tumor cell classification. A U-Net architecture optimized with Stochastic Gradient Descent (SGD) is employed to carry out clinical image segmentation. The approach leverages the lightweight MobileNetV2 backbone to derive valuable features from the input clinical images while reducing the complexity of the network architecture, and incorporates the attention mechanism called the Spatial and Channel Attention Module (SCAM) to create spatially and channel-wise informative features, enhancing the representation and localization of lung tumor cell features so the network concentrates on important locations. To assess the efficacy of the method, several high-performance lung tumor cell classification techniques (ECNN, Lung-Retina Net, CNN-SVM, CCDC-HNN, and MTL-MGAN) and datasets including the Lung-PET-CT-Dx dataset, LIDC-IDRI dataset, and Chest CT-Scan images dataset are used for experimental evaluation. In a comprehensive comparative analysis across different metrics and methods, the proposed method achieves an impressive performance with accuracy of 98.6%, specificity of 96.8%, sensitivity of 97.5%, and precision of 98.2%. Furthermore, the experimental outcomes also reveal that the proposed method reduces the complexity of the network and obtains improved diagnostic outcomes with the available annotated data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
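A SCAM-style attention block combines a channel gate with a spatial gate; the sketch below is a generic squeeze-and-gate approximation of that idea, not the authors' module (the weight matrix `w` stands in for learned parameters):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w):
    """x: (C, H, W). Global-average-pool each channel, pass through a
    learned projection w (C, C), and gate the channels with a sigmoid."""
    pooled = x.mean(axis=(1, 2))          # (C,)
    gate = sigmoid(w @ pooled)            # (C,) values in (0, 1)
    return x * gate[:, None, None]

def spatial_attention(x):
    """Gate each spatial location by a sigmoid of its cross-channel mean."""
    gate = sigmoid(x.mean(axis=0))        # (H, W)
    return x * gate[None, :, :]
```

Because both gates lie in (0, 1), the block can only re-weight features, never amplify them, which is what lets it emphasize informative channels and locations.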
41. Three‐dimensional morphological characterization of blood droplets during the dynamic coagulation process.
- Author
-
Li, Yao, Li, Wangbiao, Zhang, Xiaoman, Lin, Hui, Li, Dezi, and Li, Zhifang
- Abstract
In this study, we employed a method integrating optical coherence tomography (OCT) with the U-Net and Visual Geometry Group (VGG)-Net frameworks within a convolutional neural network for quantitative three-dimensional characterization of whole blood during the dynamic coagulation process. The VGG-Net architecture identifies blood droplets across three distinct coagulation stages, including drop, gelation, and coagulation, with an accuracy of up to 99%. In addition, the U-Net architecture demonstrated proficiency in effectively segmenting the uncoagulated and coagulated portions of whole blood, as well as the background. Notably, parameters such as the volumes of the uncoagulated and coagulated segments of the whole blood were successfully employed for precise quantification of the coagulation process, which bodes well for future clinical diagnostics and analyses. [ABSTRACT FROM AUTHOR]
- Published
- 2024
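Once the network has labeled each voxel, the volume quantification described above reduces to counting voxels per class and multiplying by the physical voxel volume; a sketch under an assumed label convention:

```python
import numpy as np

# Assumed label convention for illustration:
# 0 = background, 1 = uncoagulated blood, 2 = coagulated blood.
def class_volumes(seg, voxel_volume_mm3):
    """Volume of each class = voxel count x physical voxel volume."""
    labels, counts = np.unique(seg, return_counts=True)
    return {int(l): float(c * voxel_volume_mm3)
            for l, c in zip(labels, counts)}
```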
42. Improved Brain Tumor Segmentation in MR Images with a Modified U-Net.
- Author
-
Alquran, Hiam, Alslatie, Mohammed, Rababah, Ali, and Mustafa, Wan Azani
- Subjects
BRAIN tumors ,MAGNETIC resonance imaging ,MEDICAL care ,IMAGE segmentation ,TREATMENT effectiveness ,DEEP learning - Abstract
Detecting brain tumors is crucial in medical diagnostics due to the serious health risks these abnormalities present to patients. Deep learning approaches can significantly improve localization in various medical issues, particularly brain tumors. This paper emphasizes the use of deep learning models to segment brain tumors using a large dataset. The study involves comparing modifications to U-Net structures, including kernel size, number of channels, dropout ratio, and changing the activation function from ReLU to Leaky ReLU. Optimizing these parameters has notably enhanced brain tumor segmentation in MR images, achieving a Global Accuracy of 99.4% and a dice similarity coefficient of 90.2%. The model was trained, validated, and tested on many magnetic resonance images, with a training time not exceeding 19 min on a powerful GPU. This approach can be extended in medical care and hospitals to assist radiologists in identifying tumor locations and suspicious regions, thereby improving diagnosis and treatment effectiveness. The software could also be integrated into MR equipment protocols. [ABSTRACT FROM AUTHOR]
- Published
- 2024
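One of the modifications compared above is replacing ReLU with Leaky ReLU; the difference is simply a small non-zero slope for negative inputs, which keeps gradients flowing through units that plain ReLU would zero out:

```python
import numpy as np

def relu(x):
    """Standard ReLU: negative inputs are clamped to zero."""
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.01):
    """Leaky ReLU: negative inputs keep a small slope (0.01 here is a
    common default, not necessarily the paper's value)."""
    return np.where(x > 0, x, slope * x)
```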
43. Mapping Irrigation Methods in the Northwestern US Using Deep Learning Classification.
- Author
-
Nouwakpo, S. K., Bjorneberg, D., McGwire, K., and Hoque, O.
- Subjects
SPRINKLERS ,SPRINKLER irrigation ,IRRIGATION water ,REMOTE-sensing images ,DEEP learning - Abstract
Many agricultural areas of the western United States and other parts of the world practice irrigation using a variety of irrigation methods. Maps of irrigation methods are needed, but existing technologies are often unable to distinguish between different irrigation methods when they co-exist on the same landscape. In this study, we develop a deep learning irrigation methods mapping tool for broad scale application. The technique uses a U-Net model trained on Landsat 5- and 8-derived input images. Training data consisted of irrigation methods classified as Flood, Sprinkler or Other on agricultural fields from the Utah Water Related Land Use data set and additional labeling in selected areas of southern Idaho. An ensemble of 10 trained models had an overall accuracy of 0.78. Precision for Flood, Sprinkler and Other was 0.73, 0.82, and 0.80, while recall values were 0.75, 0.74, and 0.84, respectively. Model performance was generally stable throughout the training years but varied by area. The best performance was obtained in regions with a uniform irrigation method across large patches, while small fields whose irrigation method contrasted with their surroundings were inadequately predicted. Model predictions in an irrigated watershed of southern Idaho for 2006, 2011, 2013, and 2016 were consistent with previously published survey data. This methodology provides a tool for water resource managers to estimate irrigation methods in agricultural watersheds where natural precipitation is low during the growing season and irrigation methods include center pivots, wheel lines and flood irrigation. Plain Language Summary: Many agricultural areas of the western United States practice irrigation using a variety of irrigation methods. Irrigation methods can be classified into 3 main groups: surface (or flood), sprinkler systems and micro-irrigation systems. Flood and sprinkler irrigation account for 90% of irrigated areas in the United States but impact water resources differently.
Flood irrigation has been associated with many adverse effects on water quality whereas sprinkler systems are promoted as improved irrigation alternatives to preserve water quantity and quality. Maps of irrigation methods are needed to improve assessment of irrigation methods on water quantity and quality. In this study, we develop an irrigation methods mapping tool by training a deep learning model on publicly available satellite imagery. The model was trained on the Utah Water Related Land Use data set and additional data from southern Idaho. The trained model correctly predicted irrigation method over 78% of the test area. This methodology provides a tool for water resource managers to estimate irrigation methods in agricultural watersheds where natural precipitation is low during the growing season and irrigation methods include center pivots, wheel lines and flood irrigation. Key Points: A deep learning model was developed to predict the type of irrigation (Flood, Sprinkler or Other) used in areas of the northwestern USA. Overall, the model predicted the right type of irrigation with an accuracy of 78%. This tool has application in other irrigated agricultural areas of the semi-arid northwest where irrigation methods are varied. [ABSTRACT FROM AUTHOR]
- Published
- 2024
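The per-class precision and recall figures above follow directly from a confusion matrix over the three irrigation classes; a generic sketch (not the authors' evaluation code):

```python
import numpy as np

def precision_recall(m):
    """m: confusion matrix, rows = ground truth, cols = prediction.
    Returns per-class (precision, recall) arrays."""
    tp = np.diag(m).astype(float)
    precision = tp / m.sum(axis=0)  # share of predictions for class c that are right
    recall = tp / m.sum(axis=1)     # share of true class-c pixels recovered
    return precision, recall
```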
44. AHC-Net: a road crack segmentation network based on dual attention mechanism and multi-feature fusion.
- Author
-
Shi, Lin, Zhang, Ruijun, Wu, Yafeng, Cui, Dongyan, Yuan, Na, Liu, Jinyun, and Ji, Zhanlin
- Abstract
To solve the problem of incomplete and inaccurate pavement crack detection, an improved U-Net model based on dual attention mechanism and multi-feature fusion is proposed. Firstly, a new encoding module ACI is designed, which has the feature of multi-scale feature extraction, significantly improves the sensing ability of the damaged area, reduces the background interference, and realizes more accurate segmentation. Secondly, a new decoding module HAD is designed, which avoids the network degradation problem caused by gradient vanishing and the growth of network layers and can retain the most subtle feature information during the decoding process. Finally, convolutional block attention module (CBAM) is introduced in the encoding part to effectively extract global and local detail information, and the criss-cross attention mechanism is also introduced in the decoding part to prevent the loss of marginalized information. The model proposed in this article was tested on the public datasets DeepCrack, CrackSeg478, and AsphaltCrack300, and compared with other advanced methods. The experimental results indicate that this method can detect road cracks more accurately and possesses considerable robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2024
45. Remote Sensing Image Recognition of Dust Cover Net Construction Waste: A Method Combining Convolutional Block Attention Module and U-Net.
- Author
-
Shangwei Lv, Xiaoyu Liu, and Yifei Cao
- Subjects
CONSTRUCTION & demolition debris ,REMOTE sensing ,IMAGE recognition (Computer vision) ,DUST ,DATA mining ,URBAN growth - Abstract
With the acceleration of urban development, the annual production of urban construction waste has been increasing year by year, which brings considerable challenges for urban supervision and management; quickly and accurately identifying construction waste is therefore of great practical significance. In this paper, we propose a remote sensing image dust cover net construction waste recognition algorithm based on an improved U-Net model to realize construction waste target recognition. The algorithm first prepares a dust cover net construction waste identification dataset using Google high-resolution remote sensing imagery as the database. Second, VGG16 is adopted as the backbone network of the U-Net model to improve the feature expression ability of the model. Finally, the Convolutional Block Attention Module (CBAM) is embedded into the U-Net network to construct the CBAM-U-Net model, enhancing the information extraction accuracy of high-resolution remote sensing images. Taking the remote sensing image encompassing Daxing District in Beijing as an example, the results show that the proposed algorithm can automatically and efficiently recognize dust cover net construction waste with 95.51% recognition accuracy and 95.08% MIoU, which puts forward a new idea for the supervision of construction waste. [ABSTRACT FROM AUTHOR]
- Published
- 2024
46. Development of a deep-learning phenotyping tool for analyzing image-based strawberry phenotypes.
- Author
-
Jean Nepo Ndikumana, Unseok Lee, Ji Hye Yoo, Yeboah, Samuel, Soo Hyun Park, Taek Sung Lee, Young Rog Yeoung, and Hyoung Seok Kim
- Subjects
AGRICULTURE ,DEEP learning ,RESEARCH personnel ,REGRESSION analysis ,STATISTICAL correlation - Abstract
Introduction: In strawberry farming, phenotypic traits (such as crown diameter, petiole length, plant height, flower, leaf, and fruit size) measurement is essential as it serves as a decision-making tool for plant monitoring and management. To date, strawberry plant phenotyping has relied on traditional approaches. In this study, an image-based Strawberry Phenotyping Tool (SPT) was developed using two deep-learning (DL) architectures, namely "YOLOv4" and "U-net" integrated into a single system. We aimed to create the most suitable DL-based tool with enhanced robustness to facilitate digital strawberry plant phenotyping directly at the natural scene or indirectly using captured and stored images. Methods: Our SPT was developed primarily through two steps (subsequently called versions) using image data with different backgrounds captured with simple smartphone cameras. The two versions (V1 and V2) were developed using the same DL networks but differed by the amount of image data and annotation method used during their development. For V1, 7,116 images were annotated using the single-target non-labeling method, whereas for V2, 7,850 images were annotated using the multitarget labeling method. Results: The results of the held-out dataset revealed that the developed SPT facilitates strawberry phenotype measurements. By increasing the dataset size combined with multitarget labeling annotation, the detection accuracy of our system changed from 60.24% in V1 to 82.28% in V2. During the validation process, the system was evaluated using 70 images per phenotype and their corresponding actual values. The correlation coefficients and detection frequencies were higher for V2 than for V1, confirming the superiority of V2. Furthermore, an image-based regression model was developed to predict the fresh weight of strawberries based on the fruit size (R2 = 0.92). 
Discussion: The results demonstrate the efficiency of our system in recognizing the aforementioned six strawberry phenotypic traits regardless of the complexity of the strawberry plant's environment. This tool could help farmers and researchers make accurate and efficient decisions related to strawberry plant management, potentially increasing productivity and yield. [ABSTRACT FROM AUTHOR]
- Published
- 2024
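The fresh-weight regression reported above (R² = 0.92) is an ordinary least-squares fit of weight against fruit size; a sketch on made-up data:

```python
import numpy as np

def fit_line(x, y):
    """Ordinary least squares y ~ a*x + b, plus R^2 goodness of fit."""
    a, b = np.polyfit(x, y, 1)
    pred = a * x + b
    ss_res = np.sum((y - pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)      # total sum of squares
    return a, b, 1.0 - ss_res / ss_tot
```

The variable names and the linear form are illustrative; the paper reports only the R² of its image-based model, not the exact functional form.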
47. STC-UNet: renal tumor segmentation based on enhanced feature extraction at different network levels.
- Author
-
Hu, Wei, Yang, Shouyi, Guo, Weifeng, Xiao, Na, Yang, Xiaopeng, and Ren, Xiangyang
- Subjects
KIDNEY tumors ,FEATURE extraction ,TRANSFORMER models ,DEEP learning ,IMAGE segmentation - Abstract
Renal tumors are one of the common diseases in urology, and precise segmentation of these tumors plays a crucial role in helping physicians improve diagnostic accuracy and treatment effectiveness. Nevertheless, owing to inherent challenges associated with renal tumors, such as indistinct boundaries, morphological variations, and uncertainty in size and location, segmenting renal tumors accurately remains a significant challenge in the field of medical image segmentation. With the development of deep learning, substantial achievements have been made in this domain. However, existing models lack specificity in extracting features of renal tumors across different network hierarchies, which results in insufficient extraction of renal tumor features and subsequently affects the accuracy of renal tumor segmentation. To address this issue, we propose the Selective Kernel, Vision Transformer, and Coordinate Attention Enhanced U-Net (STC-UNet). This model aims to enhance feature extraction, adapting to the distinctive characteristics of renal tumors across various network levels. Specifically, Selective Kernel modules are introduced in the shallow layers of the U-Net, where detailed features are more abundant. By selectively employing convolutional kernels of different scales, the model enhances its capability to extract detailed features of renal tumors across multiple scales. Subsequently, in the deeper layers of the network, where feature maps are smaller yet contain rich semantic information, Vision Transformer modules are integrated in a non-patch manner. These assist the model in capturing long-range contextual information globally, and their non-patch implementation facilitates the capture of fine-grained features, thereby achieving collaborative enhancement of global-local information and ultimately strengthening the model's extraction of semantic features of renal tumors.
Finally, in the decoder segment, the Coordinate Attention modules embedding positional information are proposed aiming to enhance the model's feature recovery and tumor region localization capabilities. Our model is validated on the KiTS19 dataset, and experimental results indicate that compared to the baseline model, STC-UNet shows improvements of 1.60%, 2.02%, 2.27%, 1.18%, 1.52%, and 1.35% in IoU, Dice, Accuracy, Precision, Recall, and F1-score, respectively. Furthermore, the experimental results demonstrate that the proposed STC-UNet method surpasses other advanced algorithms in both visual effectiveness and objective evaluation metrics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
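The Selective Kernel idea referenced above fuses branches computed with convolutional kernels of different sizes using learned selection weights; a simplified numpy approximation (the selection logits stand in for what the real module learns from the data):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def selective_fuse(branches, logits):
    """branches: feature maps produced with different kernel sizes;
    logits: per-branch selection scores (learned in the real module).
    Output is the softmax-weighted sum of the branches."""
    w = softmax(np.asarray(logits, dtype=float))
    return sum(wi * b for wi, b in zip(w, branches))
```

With strongly unequal logits the fusion approaches picking a single scale; with equal logits it averages the scales, which is how the module adapts its receptive field.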
48. LMGU-NET: methodological intervention for prediction of bone health for clinical recommendations.
- Author
-
Amiya, Gautam, Murugan, Pallikonda Rajasekaran, Ramaraj, Kottaimalai, Govindaraj, Vishnuvarthanan, Vasudevan, Muneeswaran, Thirumurugan, M., Abdullah, S. Sheik, and Thiyagarajan, Arunprasath
- Subjects
- *
BONE health , *DUAL-energy X-ray absorptiometry , *BONE density , *X-rays , *TEXTURE analysis (Image processing) , *FEATURE extraction , *X-ray imaging - Abstract
Osteoporosis (OP) is a bone-related ailment that worsens owing to a decline in bone mineral density (BMD) or deviations in the structure or quality of bone, which may lead to fractures. Low BMD can be recognized from computed tomography (CT), X-ray, or dual-energy X-ray absorptiometry (DXA/DEXA) images. Texture is among the most significant and distinguishing image features. A texture feature extraction system with enhanced discrimination power is developed for volumetric images by combining two complementary types of information, local binary patterns (LBP) and normalized grey-level co-occurrence matrix-based (nGLCM) techniques, to extract features, with U-Net used for classification. The developed algorithm was validated on a Kaggle dataset comprising X-ray images acquired from patients suffering from osteoporosis. The modified U-Net (ModU-Net) semantic segmentation classifier is used for segmenting the low-bone-mass sections in the processed image. The developed LMGU-Net algorithm outperforms conventional feature extraction approaches and neural networks with a Dice score of 88.82%, Tanimoto coefficient index of 71.74%, MSE of 0.0321, and PSNR of 65.74 dB. This method assists physicians in making early diagnoses and protects patients from bone fragility and eventual fractures by ensuring that they follow the medications or surgical options prescribed by their doctors. [ABSTRACT FROM AUTHOR]
- Published
- 2024
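The LBP half of the feature extractor above encodes each pixel by thresholding its eight neighbours against the centre value; a minimal sketch for one interior pixel (neighbour ordering conventions vary between implementations):

```python
import numpy as np

def lbp_pixel(img, r, c):
    """8-neighbour local binary pattern code for interior pixel (r, c):
    each neighbour >= the centre contributes one bit."""
    center = img[r, c]
    # clockwise from the top-left neighbour
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr, c + dc] >= center:
            code |= 1 << bit
    return code
```

A histogram of these codes over an image patch is the classic LBP texture descriptor that the paper combines with nGLCM statistics.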
49. Monitoring upwelling regions in major coastal zones using deep learning and sea surface temperature images.
- Author
-
Belmajdoub, Hanae, Minaoui, Khalid, El Aouni, Anass, and El Abidi, Zineb
- Subjects
- *
OCEAN temperature , *DEEP learning , *COASTS , *CONVOLUTIONAL neural networks , *UPWELLING (Oceanography) , *ARCHITECTURAL design - Abstract
The coastal region of northwest Africa experiences a persistent and variable upwelling phenomenon throughout most of the year, contributing to the presence of one of the world's largest and most productive fishing ports. In this study, we introduce Deep Coast_up-Net, a convolutional neural network architecture designed for identifying and extracting upwelling regions from satellite sea surface temperature (SST) images. Our model is trained on a dataset of SST images captured along the Moroccan Atlantic coast, utilizing essential input parameters including SST, latitudinal position (LAT_pos), and distance from the coastline (D_coast). By incorporating these physical parameters as input features and training the model with corresponding masks for the Atlantic coast of Morocco, the Deep Coast_up-Net network learns to accurately segment upwelling regions in major coastal zones and demonstrates its generalization capability. We validate our methodology using a quantitative index; the results of this validation showcase the effectiveness of our approach. Subsequently, we explore seasonal and interannual upwelling trends within a 21-year time series spanning from 2000 to 2020. [ABSTRACT FROM AUTHOR]
- Published
- 2024
50. A Deep Learning Approach to Segment Coastal Marsh Tidal Creek Networks from High-Resolution Aerial Imagery.
- Author
-
Dutt, Richa, Ortals, Collin, He, Wenchong, Curran, Zachary Charles, Angelini, Christine, Canestrelli, Alberto, and Jiang, Zhe
- Subjects
- *
BIOTIC communities , *CONVOLUTIONAL neural networks , *SALT marshes , *COASTAL wetlands , *IMAGE segmentation , *DEEP learning - Abstract
Tidal creeks play a vital role in influencing geospatial evolution and marsh ecological communities in coastal landscapes. However, evaluating the geospatial characteristics of numerous creeks across a site and understanding their ecological relationships pose significant challenges due to the labor-intensive nature of manual delineation from imagery. Traditional methods rely on manual annotation in GIS interfaces, which is slow and tedious. This study explores the application of Attention-based Dense U-Net (ADU-Net), a deep learning image segmentation model, for automatically classifying creek pixels in high-resolution (0.5 m) orthorectified aerial imagery in coastal Georgia, USA. We observed that ADU-Net achieved an outstanding F1 score of 0.98 in identifying creek pixels, demonstrating its ability in tidal creek mapping. The study highlights the potential of deep learning models for automated tidal creek mapping, opening avenues for future investigations into the role of creeks in marshes' response to environmental changes. [ABSTRACT FROM AUTHOR]
- Published
- 2024