6,445 results for "U-Net"
Search Results
2. Pancreas Segmentation Using SRGAN Combined with U-Net Neural Network
- Author
-
Tualombo, Mayra Elizabeth, Reyes, Iván, Vizcaino-Imacaña, Paulina, Morocho-Cayamcela, Manuel Eugenio, Ghosh, Ashish, Editorial Board Member, Berrezueta-Guzman, Santiago, editor, Torres, Rommel, editor, Zambrano-Martinez, Jorge Luis, editor, and Herrera-Tapia, Jorge, editor
- Published
- 2025
- Full Text
- View/download PDF
3. A Comprehensive Exploration of Network-Based Approaches for Singing Voice Separation
- Author
-
Sakthidevi, S. P., Divya, C., Kowsalya, V., Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Shrivastava, Vivek, editor, Bansal, Jagdish Chand, editor, and Panigrahi, B. K., editor
- Published
- 2025
- Full Text
- View/download PDF
4. Oil Spill Detection in SAR Images: A U-Net Semantic Segmentation Framework with Multiple Backbones
- Author
-
Das, Koushik, Janardhan, Prashanth, Singh, Manas Ranjan, di Prisco, Marco, Series Editor, Chen, Sheng-Hong, Series Editor, Vayas, Ioannis, Series Editor, Kumar Shukla, Sanjay, Series Editor, Sharma, Anuj, Series Editor, Kumar, Nagesh, Series Editor, Wang, Chien Ming, Series Editor, Cui, Zhen-Dong, Series Editor, Lu, Xinzheng, Series Editor, Janardhan, Prashanth, editor, Choudhury, Parthasarathi, editor, and Kumar, D. Nagesh, editor
- Published
- 2025
- Full Text
- View/download PDF
5. Dual-View Dual-Boundary Dual U-Nets for Multiscale Segmentation of Oral CBCT Images
- Author
-
Liang, Jiarui, Wang, Rui, Rao, Songhui, Xu, Feng, Xiang, Jie, Wang, Bin, Yan, Tianyi, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Lin, Zhouchen, editor, Cheng, Ming-Ming, editor, He, Ran, editor, Ubul, Kurban, editor, Silamu, Wushouer, editor, Zha, Hongbin, editor, Zhou, Jie, editor, and Liu, Cheng-Lin, editor
- Published
- 2025
- Full Text
- View/download PDF
6. Bridge the Gap of Semantic Context: A Boundary-Guided Context Fusion UNet for Medical Image Segmentation
- Author
-
Chen, Yu, Wu, Jiahua, Wang, Da-Han, Zhang, Xinxin, Zhu, Shunzhi, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Lin, Zhouchen, editor, Cheng, Ming-Ming, editor, He, Ran, editor, Ubul, Kurban, editor, Silamu, Wushouer, editor, Zha, Hongbin, editor, Zhou, Jie, editor, and Liu, Cheng-Lin, editor
- Published
- 2025
- Full Text
- View/download PDF
7. Artificial Intelligence-Based Quantification and Prognostic Assessment of CD3, CD8, CD146, and PDGF-Rβ Biomarkers in Sporadic Colorectal Cancer
- Author
-
Lohmann, Florencia Adriana, Specterman Zabala, Martín Isac, Soarez, Julieta Natalia, Dádamo, Maximiliano, Loresi, Mónica Alejandra, de las Nieves Diaz, María, Pavicic, Walter Hernán, Bolontrade, Marcela Fabiana, Risk, Marcelo Raúl, Santino, Juan Pablo, Vaccaro, Carlos Alberto, Piñero, Tamara Alejandra, Ghosh, Ashish, Editorial Board Member, Florez, Hector, editor, and Astudillo, Hernán, editor
- Published
- 2025
- Full Text
- View/download PDF
8. Sustainable Development Through Deep Learning-Based Waveform Segmentation: A Review
- Author
-
Saini, Aryan, Sharma, Dushyant, Tomar, Aditya Singh, Sharma, Pavika, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Whig, Pawan, editor, Silva, Nuno, editor, Elngar, Ahmad A., editor, Aneja, Nagender, editor, and Sharma, Pavika, editor
- Published
- 2025
- Full Text
- View/download PDF
9. Sleep arousal detection for monitoring of sleep disorders using one-dimensional convolutional neural network-based U-Net and bio-signals
- Author
-
Mishra, Priya and Swetapadma, Aleena
- Published
- 2024
- Full Text
- View/download PDF
10. Conditional image-to-image translation generative adversarial network (cGAN) for fabric defect data augmentation.
- Author
-
Mohammed, Swash Sami and Clarke, Hülya Gökalp
- Subjects
- *
GENERATIVE adversarial networks, *DATA augmentation, *ARTIFICIAL intelligence, *LUNG tumors, *BRAIN tumors, *LUNGS - Abstract
The availability of comprehensive datasets is a crucial challenge for developing artificial intelligence (AI) models in various applications and fields. The lack of large and diverse public fabric defect datasets forms a major obstacle to properly and accurately developing and training AI models for detecting and classifying fabric defects in real-life applications. Models trained on limited datasets struggle to identify underrepresented defects, reducing their practicality. To address these issues, this study suggests using a conditional generative adversarial network (cGAN) for fabric defect data augmentation. The proposed image-to-image translator GAN features a conditional U-Net generator and a 6-layered PatchGAN discriminator. The conditional U-Network (U-Net) generator can produce highly realistic synthetic defective samples and offers the ability to control various characteristics of the generated samples by taking two input images: a segmented defect mask and a clean fabric image. The segmented defect mask provides information about various aspects of the defects to be added to the clean fabric sample, including their type, shape, size, and location. By augmenting the training dataset with diverse and realistic synthetic samples, the AI models can learn to identify a broader range of defects more accurately. This technique helps overcome the limitations of small or unvaried datasets, leading to improved defect detection accuracy and generalizability. Moreover, this proposed augmentation method can find applications in other challenging fields, such as generating synthetic samples for medical imaging datasets related to brain and lung tumors. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
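The conditional setup described in the abstract above, a U-Net generator conditioned on a segmented defect mask plus a clean fabric image, typically concatenates the two inputs along the channel axis. A minimal NumPy sketch of that input assembly (function name, shapes, and the demo mask are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def make_generator_input(clean_fabric: np.ndarray, defect_mask: np.ndarray) -> np.ndarray:
    """Stack a clean fabric image (H, W, 3) and a segmented defect mask (H, W)
    into the (H, W, 4) conditional input a U-Net generator would consume."""
    if defect_mask.ndim == 2:
        defect_mask = defect_mask[..., None]  # add a channel axis
    assert clean_fabric.shape[:2] == defect_mask.shape[:2], "spatial sizes must match"
    return np.concatenate([clean_fabric, defect_mask], axis=-1)

# demo: a 256x256 clean sample plus a binary mask marking where a defect should appear
clean = np.zeros((256, 256, 3), dtype=np.float32)
mask = np.zeros((256, 256), dtype=np.float32)
mask[100:120, 40:200] = 1.0  # a horizontal tear-like defect region
x = make_generator_input(clean, mask)
print(x.shape)  # (256, 256, 4)
```

The mask channel lets the generator control defect type, shape, size, and location, as the abstract describes.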
11. Research on insulator image segmentation and defect recognition technology based on U-Net and YOLOv7.
- Author
-
Chen, Jiawen, Cai, Chao, Yan, Fangbin, and Zhou, Bowen
- Subjects
ARTIFICIAL neural networks, IMAGE segmentation, DATA augmentation, ELECTRIC lines, SAMPLE size (Statistics) - Abstract
This study focuses on aerial images from power line inspection, working from a small sample size and concentrating on accurately segmenting insulators in images and identifying potential "self-explode" defects through deep learning methods. The research process consists of four key steps: image segmentation of insulators, identification of small connected regions, data augmentation of the original samples, and detection of insulator defects using the YOLOv7 model. Because of the small sample size, sample expansion is considered first: a sliding window approach is adopted to crop images, increasing the number of training samples. Subsequently, the U-Net semantic segmentation model is trained on insulator images to generate preliminary insulator mask images. Connected-region area filtering then removes smaller connected regions, eliminating small speckles in the predicted mask images and yielding more accurate insulator masks. The Dice coefficient, the evaluation metric for segmentation, is 93.67%. To target the identification of insulator defects, 35 images with insulator defects from the original samples are augmented. These images are input into the YOLOv7 network for further training, ultimately achieving effective detection of insulator "self-explode" defects. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
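The connected-region filtering and Dice evaluation described in the insulator abstract above can be sketched in plain NumPy. This is an illustrative re-implementation of the general technique (flood-fill labeling with an area threshold), not the authors' code; the demo array and threshold are made up:

```python
import numpy as np

def remove_small_regions(mask: np.ndarray, min_area: int) -> np.ndarray:
    """Drop 4-connected foreground components smaller than min_area (speckle removal)."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    out = np.zeros_like(mask)
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                # flood-fill one component
                stack, comp = [(i, j)], []
                seen[i, j] = True
                while stack:
                    y, x = stack.pop()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            stack.append((ny, nx))
                if len(comp) >= min_area:  # keep only large-enough components
                    for y, x in comp:
                        out[y, x] = 1
    return out

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice coefficient between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    return float(2.0 * inter / (pred.sum() + gt.sum() + 1e-8))

# demo: a 1-pixel speckle plus a 3x3 "insulator" blob
speckled = np.zeros((6, 6))
speckled[0, 0] = 1
speckled[2:5, 2:5] = 1
cleaned = remove_small_regions(speckled, min_area=4)
```

In production one would use `skimage.morphology.remove_small_objects` or `scipy.ndimage.label` instead of the hand-rolled flood fill.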
12. Deep learning in image segmentation for cancer.
- Author
-
Rai, Robba
- Published
- 2024
- Full Text
- View/download PDF
13. Segmentation and visualization of the Shampula dragonfly eye glass bead CT images using a deep learning method.
- Author
-
Liao, Lingyu, Cheng, Qian, Zhang, Xueyan, Qu, Liang, Liu, Siran, Ma, Shining, Chen, Kunlong, Liu, Yue, Wang, Yongtian, and Song, Weitao
- Abstract
Micro-computed tomography (micro-CT) of ancient Chinese glass dragonfly eye beads has enabled detailed exploration of their internal structures, contributing to our understanding of their manufacture. Segmentation of these CT images is essential but challenging due to variation in grayscale values and the presence of bubbles. This study introduces a U-Net-based model called EBV-SegNet, which enables efficient and accurate segmentation and visualization of these beads. We developed, trained, and tested the model using a dataset comprising four typical Shampula dragonfly eye beads, and the results demonstrated high-precision segmentation and precise delineation of the beads' complex structures. These segmented data were further analyzed using the Visualization Toolkit for advanced volume rendering and reconstruction. Our application of EBV-SegNet to the Shampula beads suggests the likelihood of two distinct manufacturing techniques, underscoring the potential of the model for enhancing the analysis of cultural artifacts using three-dimensional visualization and deep learning. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. Multi-stage semi-supervised learning enhances white matter hyperintensity segmentation.
- Author
-
Duarte, Kauê T. N., Sidhu, Abhijot S., Barros, Murilo C., Gobbi, David G., McCreary, Cheryl R., Saad, Feryal, Camicioli, Richard, Smith, Eric E., Bento, Mariana P., and Frayne, Richard
- Abstract
Introduction: White matter hyperintensities (WMHs) are frequently observed on magnetic resonance (MR) images in older adults, commonly appearing as areas of high signal intensity on fluid-attenuated inversion recovery (FLAIR) MR scans. Elevated WMH volumes are associated with a greater risk of dementia and stroke, even after accounting for vascular risk factors. Manual segmentation, while considered the ground truth, is both labor-intensive and time-consuming, limiting the generation of annotated WMH datasets. Un-annotated data are relatively available; however, the requirement for annotated data poses a challenge for developing supervised machine learning models. Methods: To address this challenge, we implemented a multi-stage semi-supervised learning (M3SL) approach that first uses un-annotated data segmented by traditional processing methods ("bronze"- and "silver"-quality data) and then uses a smaller number of "gold"-standard annotations for model refinement. The M3SL approach enabled fine-tuning of the model weights with the gold-standard annotations. This approach was integrated into the training of a U-Net model for WMH segmentation. We used data from three scanner vendors (more than five scanners) and from both cognitively normal (CN) adults and patient cohorts [with mild cognitive impairment (MCI) and Alzheimer's disease (AD)]. Results: We analyzed WMH segmentation performance across both scanner and clinical stage (CN, MCI, AD) factors. We compared our results to both conventional and transfer-learning deep learning methods and observed better generalization with M3SL across different datasets. We evaluated several metrics (F-measure, IoU, and Hausdorff distance) and found significant improvements with our method compared to both conventional (p < 0.001) and transfer-learning (p < 0.001) approaches.
Discussion: These findings suggest that automated, non-machine-learning tools have a role in a multi-stage learning framework and can reduce the impact of limited annotated data, thus enhancing model performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. Dual-attention U-Net and multi-convolution network for single-image rain removal.
- Author
-
Zheng, Ziyang, Chen, Zhixiang, Wang, Shuqi, and Wang, Wenpeng
- Abstract
Images taken on rainy days contain rain streaks of varying intensity, which seriously affect the visibility of the background scene. To address this problem, we propose a rain streak removal algorithm that combines a dual-attention-mechanism U-Net with multi-convolution. First, we add a dual-attention mechanism to the encoder of U-Net. It assigns different weights to the rain streak features to be extracted across different channels and spatial locations, so that sufficient rain streak features can be obtained; with different dilation factors, we can obtain rain streak characteristics at different depths. Second, the multi-convolutional channel integrates the characteristics of rain streaks and prepares sufficient rain streak information for the removal task. By introducing a cyclic rain streak detection and removal mechanism into the network architecture, the method removes rain streaks gradually, and even in the case of heavy rain our algorithm obtains good results. Finally, we tested on both synthetic and real datasets to obtain subjective results and objective evaluations. Experimental results show that for the de-raining task with rain streaks of different intensities, our algorithm is more robust, and its ability to remove rain streaks is better than that of five different classical algorithms. The de-rained images produced by our algorithm are visually sharper, and its visibility enhancements are effective for computer vision applications (Google Vision API). [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Accurate identification of salt domes using deep learning techniques: Transformers, generative artificial intelligence and liquid state machines.
- Author
-
Souadih, Kamal, Mohammedi, Anis, and Chergui, Sofia
- Subjects
- *
GENERATIVE artificial intelligence, *SALT domes, *NATURAL gas reserves, *TRANSFORMER models, *PETROLEUM prospecting, *DEEP learning - Abstract
Across various global regions abundant in oil and natural gas reserves, the presence of substantial sub-surface salt deposits holds significant relevance. Accurate identification of salt domes becomes crucial for enterprises engaged in oil and gas exploration. Our research introduces a precise method for the automatic detection of salt domes, leveraging advanced deep learning architectures such as U-Net, transformers, generative artificial intelligence models and liquid state machines. In comparison with state-of-the-art techniques, our model demonstrates superior performance, achieving a stable and validated 96% intersection over union metric, indicating high accuracy and robustness. Furthermore, the Dice similarity coefficient attaining 90% underscores the model's proficiency in closely aligning with ground truth across diverse scenarios. This evaluation, conducted on 1000 seismic images, reveals that our proposed architecture is not only comparable but often surpasses existing segmentation models in effectiveness and reliability. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
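The intersection-over-union metric reported in the salt-dome entry above is straightforward to compute from binary masks. A minimal sketch (the empty-mask convention and the demo masks are assumptions for illustration):

```python
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over union between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(pred, gt).sum() / union)

# demo: two 4x4 masks overlapping on a single row
pred = np.zeros((4, 4)); pred[:2] = 1  # rows 0-1 predicted salt
gt = np.zeros((4, 4)); gt[1:3] = 1     # rows 1-2 are true salt
print(iou(pred, gt))  # 4 / 12 ≈ 0.333
```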
17. Removing cloud shadows from ground-based solar imagery.
- Author
-
Chaoui, Amal, Morgan, Jay Paul, Paiement, Adeline, and Aboudarham, Jean
- Abstract
The study and prediction of space weather entails the analysis of solar images showing structures of the Sun’s atmosphere. When imaged from the Earth’s ground, images may be polluted by terrestrial clouds which hinder the detection of solar structures. We propose a new method to remove cloud shadows, based on a U-Net architecture, and compare classical supervision with conditional GAN. We evaluate our method on two different imaging modalities, using both real images and a new dataset of synthetic clouds. Quantitative assessments are obtained through image quality indices (RMSE, PSNR, SSIM, and FID). We demonstrate improved results with regards to the traditional cloud removal technique and a sparse coding baseline, on different cloud types and textures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Imbalanced segmentation for abnormal cotton fiber based on GAN and multiscale residual U-Net.
- Author
-
Yang, Shuo, Li, Jingbin, Li, Yang, Nie, Jing, Ercisli, Sezai, and Khan, Muhammad Attique
- Subjects
COTTON fibers, FIBERS, PIXELS, SCARCITY - Abstract
The scale of white foreign fibers in bobbin yarn is small, resulting in multiple types of data imbalance in the dataset: a severe imbalance of foreign fiber pixels relative to background pixels, and an imbalance in target scales. Consequently, conventional semantic segmentation networks struggle to segment these fibers effectively. First, to tackle the scarcity of white foreign fiber instances within bobbin yarn samples, this research uses original foreign fiber images to train a DCGAN and generate adequate training samples. Second, a multiscale residual U-Net is constructed to extract foreign fiber features at different scales; the network is encouraged to learn semantic features at each scale and each layer of the decoding stage. This overcomes the scale imbalance in the foreign fiber dataset and enhances the model's capability to extract weak semantic information from small targets. Third, a weighted binary cross-entropy loss function is integrated into the network's training phase to rectify the class imbalance and refine segmentation performance. This function adjusts the weighting of foreign fiber pixels, thereby addressing the disproportionate distribution between foreign fiber and background pixels within the dataset. Finally, the proposed method is experimentally validated on a dataset of white foreign fibers. The experimental results show that the proposed method achieves better results on the critical evaluation metrics, as evidenced by an accuracy of 97.52%, an mIoU of 95.26%, a Dice coefficient of 81.29%, and an F1 score of 84.92%. These statistics demonstrate the method's efficacy in achieving high-precision segmentation of white foreign fibers. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
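The weighted binary cross-entropy loss mentioned in the cotton-fiber entry above re-weights the rare foreground class. A NumPy sketch of the standard `pos_weight` formulation (a common scheme, not necessarily the paper's exact weighting; the demo values are made up):

```python
import numpy as np

def weighted_bce(pred: np.ndarray, target: np.ndarray, pos_weight: float = 1.0) -> float:
    """Binary cross-entropy with the positive (foreground) class up-weighted.

    pred: predicted foreground probabilities in [0, 1]
    target: binary ground-truth labels
    pos_weight: >1 penalizes missed foreground pixels more heavily
    """
    eps = 1e-7
    pred = np.clip(pred, eps, 1 - eps)  # avoid log(0)
    loss = -(pos_weight * target * np.log(pred) + (1 - target) * np.log(1 - pred))
    return float(loss.mean())

# demo: one well-predicted foreground pixel, one well-predicted background pixel
pred = np.array([0.9, 0.1])
target = np.array([1.0, 0.0])
plain = weighted_bce(pred, target, pos_weight=1.0)
heavy = weighted_bce(pred, target, pos_weight=5.0)
```

Raising `pos_weight` scales up the contribution of foreground terms, which is how the imbalance between fiber and background pixels is compensated during training.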
19. Segmentation and classification of white blood SMEAR images using modified CNN architecture.
- Author
-
Kumar, Indrajeet and Rawat, Jyoti
- Abstract
The classification and recognition of leukocytes, or WBCs, in blood smear images plays a key role in the diagnosis of specific diseases, such as leukemia, tumors and hematological disorders. A computerized framework for automated segmentation and classification of the WBC nucleus contributes an important role in the recognition of WBC-related disorders. Therefore, this work emphasizes WBC nucleus segmentation using a modified U-Net architecture, and the segmented WBC nuclei are further classified into their subcategories, i.e., basophil, eosinophil, neutrophil, monocyte and lymphocyte. The classification and nucleus characterization task has been performed using the VGGNet and MobileNet V2 architectures. Initially, collected instances are passed to the preprocessing phase for image rescaling and normalization. The rescaled and normalized instances are passed to the U-Net model for nucleus segmentation. Extracted nuclei are forwarded to the classification phase for class identification. Furthermore, the performance of the intended design is compared with other modern methods. By the end of this study, a successful model classifying the various nucleus morphologies (basophil, eosinophil, lymphocyte, monocyte and neutrophil) was obtained, with an overall test accuracy of 97.0% for the VGGNet classifier and 94.0% for the MobileNet V2 classifier. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. RM-UNet: UNet-like Mamba with rotational SSM module for medical image segmentation.
- Author
-
Tang, Hao, Huang, Guoheng, Cheng, Lianglun, Yuan, Xiaochen, Tao, Qi, Chen, Xuhang, Zhong, Guo, and Yang, Xiaohui
- Abstract
Accurate segmentation of tissues and lesions is crucial for disease diagnosis, treatment planning, and surgical navigation. Yet, the complexity of medical images presents significant challenges for traditional Convolutional Neural Networks and Transformer models due to their limited receptive fields or high computational complexity. State Space Models (SSMs) have recently shown notable vision performance, particularly Mamba and its variants. However, their feature extraction methods may not be sufficiently effective and retain some redundant structures, leaving room for parameter reduction. In response to these challenges, we introduce a methodology called Rotational Mamba-UNet, characterized by Residual Visual State Space (ResVSS) block and Rotational SSM Module. The ResVSS block is devised to mitigate network degradation caused by the diminishing efficacy of information transfer from shallower to deeper layers. Meanwhile, the Rotational SSM Module is devised to tackle the challenges associated with channel feature extraction within State Space Models. Finally, we propose a weighted multi-level loss function, which fully leverages the outputs of the decoder's three stages for supervision. We conducted experiments on ISIC17, ISIC18, CVC-300, Kvasir-SEG, CVC-ColonDB, Kvasir-Instrument datasets, and Low-grade Squamous Intraepithelial Lesion datasets provided by The Third Affiliated Hospital of Sun Yat-sen University, demonstrating the superior segmentation performance of our proposed RM-UNet. Additionally, compared to the previous VM-UNet, our model achieves a one-third reduction in parameters. Our code is available at https://github.com/Halo2Tang/RM-UNet. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. Image Denoising Using Deblur Generative Adversarial Network Denoising U-Net.
- Author
-
Usha Rani, B., Aruna, R., Velrajkumar, P., Amuthan, N., and Sivakumar, N.
- Subjects
- *
CONVOLUTIONAL neural networks, *GENERATIVE adversarial networks, *IMAGE denoising, *SIGNAL-to-noise ratio, *RANDOM noise theory - Abstract
Convolutional neural networks (CNNs) are becoming increasingly popular for image denoising. U-Nets, a type of CNN architecture, have been shown to be effective for this task. However, the impact of shallow layers on deeper layers decreases as the depth of the network increases. To address this issue, the authors propose a new image denoising method called DGANDU-Net, which combines the DeblurGAN design with a U-Net architecture. This combination allows DGANDU-Net to effectively remove noise from images while preserving fine details. The authors also propose the use of two loss functions, mean square error (MSE) and perceptual loss, to improve the performance of DGANDU-Net: MSE is used to learn and improve the extracted features, while perceptual loss is used to produce the final denoised image. The authors evaluate the performance of DGANDU-Net on a variety of noise levels and find that it outperforms other state-of-the-art denoising algorithms in terms of both visual quality and two evaluation indices, peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM). Specifically, for extremely noisy environments with a noise standard deviation of 75, DGANDU-Net achieves an average PSNR of 37.39 dB on the test dataset. The authors conclude that DGANDU-Net is a promising new method for image denoising that has the potential to significantly improve the quality of medical images used for diagnosis and treatment. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Liver tumor segmentation using G-Unet and the impact of preprocessing and postprocessing methods.
- Author
-
D J, Deepak and B S, Sunil Kumar
- Subjects
CONVOLUTIONAL neural networks, LIVER tumors, COMPUTED tomography, THERAPEUTICS, LIVER - Abstract
Accurate liver and lesion segmentation plays a crucial role in the clinical assessment and therapeutic planning of hepatic diseases. Automated segmentation of the liver and lesions is a crucial undertaking that holds the potential to facilitate the early detection of malignancies and the effective management of patients' treatment requirements by medical professionals. This research presents the Generalized U-Net (G-Unet), a hybrid model designed for segmentation tasks. The G-Unet model is capable of incorporating other models, such as convolutional neural networks (CNNs), residual networks (ResNets), and densely connected convolutional neural networks (DenseNets), into the general U-Net framework. The G-Unet model, which consists of three distinct configurations, was assessed using the LiTS dataset. The results indicate that G-Unet demonstrated a high level of segmentation accuracy. Specifically, the G-Unet model configured with the DenseNet architecture produced a global Dice score of 72.9% for liver tumor segmentation, performance comparable to existing state-of-the-art methodologies. The study also showcases the influence of different preprocessing and postprocessing techniques on segmentation accuracy: the utilization of Hounsfield Unit (HU) windowing and histogram equalization as preprocessing approaches, together with conditional random fields as a postprocessing technique, resulted in a notable enhancement of 3.35% in tumor segmentation accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Enhanced WGAN Model for Diagnosing Laryngeal Carcinoma.
- Author
-
Kim, Sungjin, Chang, Yongjun, An, Sungjun, Kim, Deokseok, Cho, Jaegu, Oh, Kyungho, Baek, Seungkuk, and Choi, Bo K.
- Subjects
- *
GENERATIVE artificial intelligence, *PREDICTIVE tests, *PREDICTION models, *COMPUTER-assisted image analysis (Medicine), *RESEARCH funding, *EARLY detection of cancer, *DIAGNOSTIC errors, *LARYNGOSCOPY, *ARTIFICIAL neural networks, *COMPUTER-aided diagnosis, *MACHINE learning, LARYNGEAL tumors - Abstract
Simple Summary: This study aimed to enhance the accuracy of detecting laryngeal carcinoma using a modified AI model based on U-Net. The model was designed to automatically identify lesions in endoscopic images. Researchers addressed issues such as mode collapse and gradient explosion to ensure stable performance, achieving 99% accuracy in detecting malignancies. The study found that malignant tumors were detected more reliably than benign ones. This technology could help reduce human error in diagnoses, allowing for earlier detection and treatment. Furthermore, it has the potential to be applied in other medical fields, benefiting overall healthcare. This study modifies the U-Net architecture for pixel-based segmentation to automatically classify lesions in laryngeal endoscopic images. The advanced U-Net incorporates five-level encoders and decoders, with an autoencoder layer to derive latent vectors representing the image characteristics. To enhance performance, a WGAN was implemented to address common issues such as mode collapse and gradient explosion found in traditional GANs. The dataset consisted of 8171 images labeled with polygons in seven colors. Evaluation metrics, including the F1 score and intersection over union, revealed that benign tumors were detected with lower accuracy compared to other lesions, while cancers achieved notably high accuracy. The model demonstrated an overall accuracy rate of 99%. This enhanced U-Net model shows strong potential in improving cancer detection, reducing diagnostic errors, and enhancing early diagnosis in medical applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. U-Net Semantic Segmentation-Based Calorific Value Estimation of Straw Multifuels for Combined Heat and Power Generation Processes.
- Author
-
Li, Lianming, Wang, Zhiwei, and He, Defeng
- Subjects
- *
TRANSFORMER models, *INDUSTRIALISM, *IMAGE segmentation, *STRAW, *UNITS of time - Abstract
This paper proposes a system for real-time estimation of the calorific value of mixed straw fuels based on an improved U-Net semantic segmentation model. The system aims to address the uncertainty in heat and power generation per unit time in combined heat and power generation (CHPG) systems caused by fluctuations in the calorific value of straw fuels. The system integrates an industrial camera, a moisture detector, and quality sensors to capture images of the multi-fuel straw, and applies the improved U-Net segmentation network for semantic segmentation of the images, accurately calculating the proportion of each type of straw. The improved U-Net network introduces a self-attention mechanism in the skip connections of the final layer of the encoder, replaces traditional convolutions with depthwise separable convolutions, and replaces the traditional convolutional bottleneck layers with a Transformer encoder. These changes ensure that the model achieves high segmentation accuracy and strong generalization capability while maintaining good real-time performance. The semantic segmentation results of the straw images are used to calculate the proportions of the different types of straw and, combined with moisture content and quality data, the calorific value of the mixed fuel is estimated in real time based on the elemental composition of each straw type. Validation using images captured from an actual thermal power plant shows that, under the same conditions, the proposed model loses only 0.2% accuracy compared to the traditional U-Net segmentation network, while the number of parameters is reduced by 74% and inference speed is improved by 23%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
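The final estimation step in the straw-fuel entry above, combining per-type segmentation proportions with moisture data, can be illustrated with a generic proportion-weighted calorific value formula. This is a simplified lower-heating-value sketch with made-up calorific values and a crude moisture correction, not the paper's elemental-composition model:

```python
def estimate_calorific_value(pixel_counts: dict, dry_cv_mj_per_kg: dict,
                             moisture_fraction: float) -> float:
    """Proportion-weighted calorific value (MJ/kg) of a straw mix.

    pixel_counts: semantic-segmentation pixel totals per straw type
    dry_cv_mj_per_kg: dry-basis calorific value per type (illustrative numbers)
    moisture_fraction: measured moisture content of the mix, in [0, 1]
    """
    total = sum(pixel_counts.values())
    proportions = {k: v / total for k, v in pixel_counts.items()}
    dry_cv = sum(proportions[k] * dry_cv_mj_per_kg[k] for k in proportions)
    latent_heat_water = 2.44  # MJ/kg to evaporate the water content (approx.)
    return (1 - moisture_fraction) * dry_cv - moisture_fraction * latent_heat_water

# demo with hypothetical values: 75% rice straw, 25% wheat straw, 10% moisture
cv = estimate_calorific_value({"rice": 3000, "wheat": 1000},
                              {"rice": 14.0, "wheat": 17.0},
                              moisture_fraction=0.1)
```

For the demo values this yields about 13.03 MJ/kg; the real system would derive per-type calorific values from elemental composition rather than a fixed lookup table.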
25. Attention-Enhanced Urban Fugitive Dust Source Segmentation in High-Resolution Remote Sensing Images.
- Author
-
He, Xiaoqing, Wang, Zhibao, Bai, Lu, Fan, Meng, Chen, Yuanlin, and Chen, Liangfu
- Subjects
- *
PARTICULATE matter, *DUST control, *REMOTE sensing, *FEATURE extraction, *IMAGE segmentation, *DUST, *FUGITIVE emissions - Abstract
Fugitive dust is an important source of total suspended particulate matter in urban ambient air. Existing segmentation methods for dust sources face challenges in distinguishing key from secondary features, and they exhibit poor segmentation at image edges. To address these issues, this paper proposes the Dust Source U-Net (DSU-Net), which enhances the U-Net model by incorporating VGG16 for feature extraction and integrating a shuffle attention module into the skip-connection branch to enhance feature acquisition. Furthermore, we combine Dice Loss, Focal Loss, and Active Boundary Loss to improve boundary extraction accuracy and reduce loss oscillation. To evaluate the effectiveness of our model, we selected Jingmen City, Jingzhou City, and Yichang City in Hubei Province as the experimental area and established two dust source datasets from 0.5 m high-resolution remote sensing imagery acquired by the Jilin-1 satellite. The datasets comprise HDSD-A for dust source segmentation and HDSD-B for distinguishing dust control measures. Comparative analyses of our proposed model against other typical segmentation models demonstrated that DSU-Net has the best detection performance, achieving an mIoU of 93% on dataset HDSD-A and 92% on dataset HDSD-B. In addition, we verified that it can be successfully applied to detect dust sources in urban areas. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. Multi-scale input layers and dense decoder aggregation network for COVID-19 lesion segmentation from CT scans.
- Author
-
Lan, Xiaoke and Jin, Wenbing
- Subjects
- *
COMPUTED tomography , *DEEP learning , *COVID-19 , *DIAGNOSTIC imaging , *STATISTICAL correlation - Abstract
Accurate segmentation of COVID-19 lesions from medical images is essential for achieving precise diagnosis and developing effective treatment strategies. Unfortunately, this task presents significant challenges, owing to the complex and diverse characteristics of opaque areas, subtle differences between infected and healthy tissue, and the presence of noise in CT images. To address these difficulties, this paper designs a new deep-learning architecture (named MD-Net) based on multi-scale input layers and a dense decoder aggregation network for COVID-19 lesion segmentation. In our framework, the U-shaped structure serves as the cornerstone to facilitate the complex hierarchical representations essential for accurate segmentation. Then, by introducing multi-scale input layers (MIL), the network can effectively analyze both fine-grained details and contextual information in the original image. Furthermore, we introduce an SE-Conv module in the encoder network, which can enhance the ability to identify relevant information while simultaneously suppressing the transmission of extraneous or non-lesion information. Additionally, we design a dense decoder aggregation (DDA) module to integrate feature distributions and important COVID-19 lesion information from adjacent encoder layers. Finally, we conducted a comprehensive quantitative analysis and comparison on two publicly available datasets, namely Vid-QU-EX and QaTa-COV19-v2, to assess the robustness and versatility of MD-Net in segmenting COVID-19 lesions. The experimental results show that the proposed MD-Net has superior performance compared to its competitors, exhibiting higher scores on the Dice value, Matthews correlation coefficient (Mcc), and Jaccard index. In addition, we also conducted ablation studies on the Vid-QU-EX dataset to evaluate the contributions of each key component within the proposed architecture. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Retinal blood vessel segmentation using a deep learning method based on modified U-NET model.
- Author
-
Sanjeewani, Yadav, Arun Kumar, Akbar, Mohd, Kumar, Mohit, and Yadav, Divakar
- Subjects
RETINAL blood vessels ,FUNDUS oculi ,DIABETIC retinopathy ,ALGORITHMS ,PHYSICIANS - Abstract
Retinal blood vessel segmentation is important for the detection of several highly prevalent, vision-threatening diseases such as diabetic retinopathy. Automatic retinal blood vessel segmentation is crucial to overcome the limitations of manual diagnosis by doctors. In recent times, deep learning-based methods have achieved great success in automatically segmenting retinal blood vessels from images. In this paper, a U-Net-based architecture is proposed to segment the retinal blood vessels from fundus images of the eye. Three pre-processing algorithms are proposed to further enhance the performance of the proposed method. In experimental evaluation on the publicly available DRIVE dataset, the proposed method achieves an average accuracy (Acc) of 0.9577, sensitivity (Se) of 0.7436, specificity (Sp) of 0.9838, and F1-score of 0.7931. The proposed method outperforms recent state-of-the-art approaches in the literature. [ABSTRACT FROM AUTHOR]
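The four figures reported above all derive from pixel-level confusion-matrix counts. A minimal sketch of those formulas (the counts below are hypothetical, chosen only to exercise the computation):

```python
def segmentation_metrics(tp, fp, tn, fn):
    """Pixel-level metrics from confusion-matrix counts."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    se  = tp / (tp + fn)          # sensitivity (recall)
    sp  = tn / (tn + fp)          # specificity
    pr  = tp / (tp + fp)          # precision
    f1  = 2 * pr * se / (pr + se)
    return acc, se, sp, f1

# Hypothetical counts for one fundus image
acc, se, sp, f1 = segmentation_metrics(tp=740, fp=160, tn=9000, fn=250)
print(f"Acc={acc:.4f}  Se={se:.4f}  Sp={sp:.4f}  F1={f1:.4f}")
```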
- Published
- 2024
- Full Text
- View/download PDF
28. Ground subsidence prediction with high precision: a novel spatiotemporal prediction model with Interferometric Synthetic Aperture Radar technology.
- Author
-
Tao, Qiuxiang, Xiao, Yixin, Hu, Leyin, Liu, Ruixiang, and Li, Xuepeng
- Subjects
- *
SYNTHETIC aperture radar , *MINE subsidences , *STANDARD deviations , *RECURRENT neural networks , *MINES & mineral resources - Abstract
As the extraction of mineral resources intensifies, ground subsidence in mining areas has escalated, posing substantial challenges to sustainable development and operational safety. This subsidence, resulting directly from mining activities, significantly compromises the safety of nearby residents by damaging residential structures and infrastructure. Thus, developing precise and dependable methods for predicting ground subsidence is crucial. This study introduces an innovative Cabs-Unet model, which enhances the U-Net architecture by integrating a Convolutional Block Attention Module (CBAM) and Depthwise Separable Convolutions (DSC). This model aims to predict the spatiotemporal dynamics of the Interferometric Synthetic Aperture Radar (InSAR) time series. Employing Small Baseline Subset Interferometric Synthetic Aperture Radar (SBAS InSAR) technology, we gathered and validated data on ground subsidence at the Pengzhuang coal mine from May 2017 to November 2021, covering 130 scenes, with its accuracy corroborated by levelling survey results. An empirical evaluation of the Cabs-Unet model in two distinct subsidence zones demonstrated superior performance over conventional methods like Convolutional Long Short-Term Memory (ConvLSTM) and Predictive Recurrent Neural Network (PredRNN), with Root Mean Square Error (RMSE) values of 1.44 and 1.70, respectively. These findings highlight the model’s efficacy in accurately predicting spatiotemporal InSAR ground subsidence. Further predictive analysis using InSAR data indicated an expected increase in subsidence, projecting cumulative declines of −457 mm in Area A and −1278 mm in Area B by 17 July 2022. Our model proves effective in assessing subsidence, promptly detecting potential risks and facilitating the rapid implementation of risk mitigation strategies. [ABSTRACT FROM AUTHOR]
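The RMSE figures quoted above compare predicted against observed subsidence at validation points. A minimal sketch of that computation (the subsidence values are hypothetical, not levelling data from the study):

```python
import math

def rmse(predicted, observed):
    """Root Mean Square Error between predicted and observed subsidence (mm)."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

# Hypothetical check points (mm of cumulative subsidence)
pred = [-120.0, -135.5, -150.2, -171.0]
obs  = [-118.4, -137.0, -149.0, -173.2]
print(f"RMSE = {rmse(pred, obs):.2f} mm")
```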
- Published
- 2024
- Full Text
- View/download PDF
29. Automated High-Precision Recognition of Solar Filaments Based on an Improved U2-Net.
- Author
-
Jiang, Wendong and Li, Zhengyang
- Subjects
- *
SOLAR magnetic fields , *SOLAR active regions , *SOLAR activity , *SOLAR flares , *DEEP learning , *CORONAL mass ejections ,SOLAR filaments - Abstract
Solar filaments are a significant solar activity phenomenon, typically observed in full-disk solar observations in the H-alpha band. They are closely associated with the magnetic fields of solar active regions, solar flare eruptions, and coronal mass ejections. With the increasing volume of observational data, automated high-precision recognition of solar filaments using deep learning is crucial. In this study, we processed full-disk H-alpha solar images captured by the Chinese H-alpha Solar Explorer in 2023 to generate labels for solar filaments. The preprocessing steps included limb-darkening removal, grayscale transformation, K-means clustering, particle erosion, multiple closing operations, and hole filling. A dataset containing solar filament labels was constructed for deep learning. We developed the Attention U2-Net neural network for the solar dataset by introducing an attention mechanism into U2-Net. Attention U2-Net achieved an average Accuracy of 0.9987, an average Precision of 0.8221, an average Recall of 0.8469, an average IoU of 0.7139, and an average F1-score of 0.8323 on the solar filament test set, showing significant improvements compared to other U-Net variants. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Precision Segmentation of Subretinal Fluids in OCT Using Multiscale Attention-Based U-Net Architecture.
- Author
-
Karn, Prakash Kumar and Abdulla, Waleed H.
- Subjects
- *
MACULAR degeneration , *OPTICAL coherence tomography , *MACULAR edema , *RETINAL diseases , *COMPUTER-assisted image analysis (Medicine) - Abstract
This paper presents a deep-learning architecture for segmenting retinal fluids in patients with Diabetic Macular Oedema (DME) and Age-related Macular Degeneration (AMD). Accurate segmentation of multiple fluid types is critical for diagnosis and treatment planning, but existing techniques often struggle with precision. We propose an encoder–decoder network inspired by U-Net, processing enhanced OCT images and their edge maps. The encoder incorporates Residual and Inception modules with an autoencoder-based multiscale attention mechanism to extract detailed features. Our method shows superior performance across several datasets. On the RETOUCH dataset, the network achieved F1 Scores of 0.82 for intraretinal fluid (IRF), 0.93 for subretinal fluid (SRF), and 0.94 for pigment epithelial detachment (PED). The model also performed well on the OPTIMA and DUKE datasets, demonstrating high precision, recall, and F1 Scores. This architecture significantly enhances segmentation accuracy and edge precision, offering a valuable tool for diagnosing and managing retinal diseases. Its integration of dual-input processing, multiscale attention, and advanced encoder modules highlights its potential to improve clinical outcomes and advance retinal disease treatment. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Automated estimation of thoracic rotation in chest X-ray radiographs: a deep learning approach for enhanced technical assessment.
- Author
-
Sun, Jiuai, Hou, Pengfei, Li, Kai, Wei, Ling, Zhao, Ruifeng, and Wu, Zhonghang
- Subjects
- *
CHEST X rays , *DEEP learning , *X-ray imaging , *CHEST examination , *QUALITY control - Abstract
Objectives: This study aims to develop an automated approach for estimating the vertical rotation of the thorax, which can be used to assess the technical adequacy of chest X-ray radiographs (CXRs). Methods: A total of 800 chest radiographs were used to train and establish segmentation networks for outlining the lung and spine regions in chest X-ray images. Thoracic vertical rotation was quantified by measuring the widths of the left and right lungs between the centerline of the segmented spine and the lateral borders of the segmented lungs. Additionally, a life-size, full-body anthropomorphic phantom was employed to collect chest radiographic images under various specified rotation angles to assess the accuracy of the proposed approach. Results: The deep learning networks effectively segmented the anatomical structures of the lungs and spine. The proposed approach demonstrated a mean estimation error of less than 2° for thoracic rotation, surpassing existing techniques. Conclusions: The proposed approach offers a robust assessment of thoracic rotation and presents new possibilities for automated image quality control in chest X-ray examinations. Advances in knowledge: This study presents a novel deep-learning-based approach for the automated estimation of vertical thoracic rotation in chest X-ray radiographs. The proposed method enables a quantitative assessment of the technical adequacy of CXR examinations and opens up new possibilities for automated screening and quality control of radiographs. [ABSTRACT FROM AUTHOR]
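From the segmented spine centerline and lung borders, a rotation measure can be derived from left/right width asymmetry. The normalized index below is one plausible formulation, an assumption for illustration rather than the paper's exact definition:

```python
def rotation_index(width_left, width_right):
    """Normalized left/right lung-width asymmetry about the spine midline.
    Returns 0 for a perfectly symmetric (non-rotated) thorax; the sign
    indicates the rotation direction. Mapping the index to degrees would
    be calibrated against phantom images at known angles, as in the study.
    NOTE: this formula is a hypothetical illustration, not the paper's."""
    return (width_left - width_right) / (width_left + width_right)

# Hypothetical widths in pixels, measured from the segmented spine centerline
print(rotation_index(412, 398))   # slight rotation, positive sign
print(rotation_index(405, 405))   # 0.0 -> symmetric
```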
- Published
- 2024
- Full Text
- View/download PDF
32. Improving 3D dose prediction for breast radiotherapy using novel glowing masks and gradient‐weighted loss functions.
- Author
-
Moore, Lance C., Nematollahi, Fatemeh, Li, Lingyi, Meyers, Sandra M., and Kisling, Kelly
- Subjects
- *
DEEP learning , *PREDICTION models , *BREAST cancer , *PERFORMANCE standards , *AUTOMATED planning & scheduling , *BREAST , *LUNGS - Abstract
Background: The quality of treatment plans for breast cancer can vary greatly. This variation could be reduced by using dose prediction to automate treatment planning. Our work investigates novel methods for training deep‐learning models that are capable of producing high‐quality dose predictions for breast cancer treatment planning. Purpose: The goal of this work was to compare the performance impact of two novel techniques for deep learning dose prediction models for tangent field treatments for breast cancer. The first technique, a "glowing" mask algorithm, encodes the distance from a contour into each voxel in a mask. The second, a gradient‐weighted mean squared error (MSE) loss function, emphasizes the error in high‐dose gradient regions in the predicted image. Methods: Four 3D U‐Net deep learning models were trained using the planning CT and contours of the heart, lung, and tumor bed as inputs. The dataset consisted of 305 treatment plans split into 213/46/46 training/validation/test sets using a 70/15/15% split. We compared the impact of novel "glowing" anatomical mask inputs and a novel gradient‐weighted MSE loss function to their standard counterparts, binary anatomical masks, and MSE loss, using an ablation study methodology. To assess performance, we examined the mean error and mean absolute error (ME/MAE) in dose across all within‐body voxels, the error in mean dose to heart, ipsilateral lung, and tumor bed, dice similarity coefficient (DSC) across isodose volumes defined by 0%–100% prescribed dose thresholds, and gamma analysis (3%/3 mm). Results: The combination of novel glowing masks and gradient weighted loss function yielded the best‐performing model in this study. This model resulted in a mean ME of 0.40%, MAE of 2.70%, an error in mean dose to heart and lung of −0.10 and 0.01 Gy, and an error in mean dose to the tumor bed of −0.01%. The median DSC at 50/95/100% isodose levels were 0.91/0.87/0.82. The mean 3D gamma pass rate (3%/3 mm) was 93%. 
Conclusions: This study found the combination of novel anatomical mask inputs and loss function for dose prediction resulted in superior performance to their standard counterparts. These results have important implications for the field of radiotherapy dose prediction, as the methods used here can be easily incorporated into many other dose prediction models for other treatment sites. Additionally, this dose prediction model for breast radiotherapy has sufficient performance to be used in an automated planning pipeline for tangent field radiotherapy and has the major benefit of not requiring a PTV for accurate dose prediction. [ABSTRACT FROM AUTHOR]
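The gradient-weighted MSE idea can be sketched in one dimension: weight each voxel's squared error by the local dose gradient of the target, so that errors at field edges dominate the loss. The weighting w = 1 + α·|∇target| below is an assumed form for illustration, not necessarily the authors' exact function:

```python
def gradient_weighted_mse(pred, target, alpha=1.0):
    """MSE with per-voxel weights 1 + alpha * |local dose gradient| of the
    target, so errors in steep-dose regions are penalized more."""
    n = len(target)
    # forward-difference gradient magnitude of the target profile
    grad = [abs(target[min(i + 1, n - 1)] - target[i]) for i in range(n)]
    weights = [1.0 + alpha * g for g in grad]
    num = sum(w * (p - t) ** 2 for w, p, t in zip(weights, pred, target))
    return num / sum(weights)

# Hypothetical 1D dose profile crossing a field edge (fraction of prescription)
target = [1.00, 1.00, 0.95, 0.50, 0.05, 0.00]
pred   = [0.98, 1.01, 0.90, 0.60, 0.10, 0.01]
print(gradient_weighted_mse(pred, target))
```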
- Published
- 2024
- Full Text
- View/download PDF
33. Comparative Approach to De-Noising TEMPEST Video Frames.
- Author
-
Vizitiu, Alexandru Mădălin, Sandu, Marius Alexandru, Dobrescu, Lidia, Focșa, Adrian, and Molder, Cristian Constantin
- Subjects
- *
CONVOLUTIONAL neural networks , *OPTICAL character recognition , *SCIENTIFIC community , *COMPARATIVE method , *ADAPTIVE filters - Abstract
Analysis of unintended compromising emissions from Video Display Units (VDUs) is an important topic in research communities. This paper examines the feasibility of recovering the information displayed on the monitor from reconstructed video frames. The study holds particular significance for our understanding of security vulnerabilities associated with the electromagnetic radiation of digital displays. Given the amount of noise in reconstructed TEMPEST video frames, the work in this paper focuses on two different approaches to de-noising images for efficient optical character recognition: first, an Adaptive Wiener Filter (AWF) with adaptive window size implemented in the spatial domain, and then a Convolutional Neural Network (CNN) with an encoder–decoder structure that follows both the classical auto-encoder architecture and the U-Net architecture (an auto-encoder with skip connections). These two techniques yielded more than a twofold improvement in the Structural Similarity Index Metric (SSIM) for the AWF and up to a fourfold improvement for the Deep Learning (DL) approach. In addition, to validate the results, the possibility of text recovery from the processed noisy frames was studied using a state-of-the-art Tesseract Optical Character Recognition (OCR) engine. The present work aims to bring attention to the security importance of this topic and the non-negligible character of VDU information leakages. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. An Automated Clubbed Fingers Detection System Based on YOLOv8 and U-Net: A Tool for Early Prediction of Lung and Cardiovascular Diseases.
- Author
-
Hsu, Wen-Shin, Liu, Guan-Tsen, Chen, Su-Juan, Wei, Si-Yu, and Wang, Wei-Hsun
- Subjects
- *
PROCESS capability , *LUNG diseases , *IMAGE segmentation , *DEEP learning , *CLOUD computing , *CARDIOVASCULAR diseases - Abstract
Background/Objectives: Lung and cardiovascular diseases are leading causes of mortality worldwide, yet early detection remains challenging due to the subtle symptoms. Digital clubbing, characterized by the bulbous enlargement of the fingertips, serves as an early indicator of these diseases. This study aims to develop an automated system for detecting digital clubbing using deep-learning models for real-time monitoring and early intervention. Methods: The proposed system utilizes the YOLOv8 model for object detection and U-Net for image segmentation, integrated with the ESP32-CAM development board to capture and analyze finger images. The severity of digital clubbing is determined using a custom algorithm based on the Lovibond angle theory, categorizing the condition into normal, mild, moderate, and severe. The system was evaluated using 1768 images and achieved cloud-based and real-time processing capabilities. Results: The system demonstrated high accuracy (98.34%) in real-time detection with precision (98.22%), sensitivity (99.48%), and specificity (98.22%). Cloud-based processing achieved slightly lower but robust results, with an accuracy of 96.38%. The average processing time was 0.15 s per image, showcasing its real-time potential. Conclusions: This automated system provides a scalable and cost-effective solution for the early detection of digital clubbing, enabling timely intervention for lung and cardiovascular diseases. Its high accuracy and real-time capabilities make it suitable for both clinical and home-based health monitoring. [ABSTRACT FROM AUTHOR]
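Severity grading from the Lovibond (nail-fold profile) angle can be expressed as simple thresholding. An angle up to roughly 180° is conventionally considered normal; the cutoffs for the higher grades below are illustrative placeholders, not the paper's calibrated values:

```python
def clubbing_severity(lovibond_angle_deg):
    """Grade digital clubbing from the Lovibond (nail-fold) angle in degrees.
    An angle <= ~180 deg is conventionally normal; the mild/moderate/severe
    thresholds here are hypothetical placeholders for illustration."""
    if lovibond_angle_deg <= 180:
        return "normal"
    if lovibond_angle_deg <= 190:
        return "mild"
    if lovibond_angle_deg <= 200:
        return "moderate"
    return "severe"

print(clubbing_severity(176))  # normal
print(clubbing_severity(195))  # moderate
```

In the described system, the angle itself would be measured from the U-Net finger-contour segmentation before this classification step.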
- Published
- 2024
- Full Text
- View/download PDF
35. A "Region-Specific Model Adaptation (RSMA)"-Based Training Data Method for Large-Scale Land Cover Mapping.
- Author
-
Li, Congcong, Xian, George, and Jin, Suming
- Subjects
- *
MACHINE learning , *LAND cover , *HABITAT conservation , *BIOGEOCHEMICAL cycles , *DATABASES , *DEEP learning - Abstract
An accurate and historical land cover monitoring dataset for Alaska could provide fundamental information for a range of studies, such as habitat conservation, biogeochemical cycles, and climate systems, in this distinctive region. This research addresses challenges associated with the extraction of training data for timely and accurate land cover classifications in Alaska over longer time periods (e.g., greater than 10 years). Specifically, we designed the "Region-Specific Model Adaptation (RSMA)" method for training data. The method integrates land cover information from the National Land Cover Database (NLCD), LANDFIRE's Existing Vegetation Type (EVT), and the National Wetlands Inventory (NWI) with machine learning techniques to generate robust training samples based on the Anderson Level II classification legend. The assumption of the method is that spectral signatures vary across regions because of diverse land surface compositions; however, despite these variations, there are consistent, collective land cover characteristics that span the entire region. Building upon this assumption, this research utilized the classification power of deep learning algorithms and the generalization ability of RSMA to construct a model for the RSMA method. Additionally, we interpreted existing vegetation plot information for land cover labels as validation data to reduce inconsistency in human interpretation. Our validation results indicate that the RSMA method improved the quality of the training data derived solely from the NLCD by approximately 30% in overall accuracy. The validation assessment also demonstrates that the RSMA method can generate reliable training data at large scales in regions that lack sufficient reliable data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. A Transformer-Unet Generative Adversarial Network for the Super-Resolution Reconstruction of DEMs.
- Author
-
Zheng, Xin, Xu, Zhaoqi, Yin, Qian, Bao, Zelun, Chen, Zhirui, and Wang, Sizhu
- Subjects
- *
GENERATIVE adversarial networks , *DIGITAL elevation models , *ENVIRONMENTAL sciences , *GEOLOGY , *AGRICULTURE - Abstract
A new model called the Transformer-Unet Generative Adversarial Network (TUGAN) is proposed for super-resolution reconstruction of digital elevation models (DEMs). Digital elevation models are used in many fields, including environmental science, geology, and agriculture. The proposed model uses a self-similarity Transformer (SSTrans) as the generator and U-Net as the discriminator. SSTrans, a model that we previously proposed, can yield good reconstruction results in structurally complex areas but has little advantage when the surface is simple and smooth, because too many additional details are added to the data. To resolve this issue, we propose the novel TUGAN model, in which U-Net's multilayer skip connections enable the discriminator to consider both global and local information when making judgments. The experiments show that TUGAN achieves state-of-the-art results for all types of terrain details. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Lightweight decoder U-net crack segmentation network based on depthwise separable convolution.
- Author
-
Yu, Yongbo, Zhang, Yage, Yu, Junyang, and Yue, Jianwei
- Abstract
Cracks are a common type of damage found on the surfaces of concrete buildings and roads. Accurately identifying the width and direction of these cracks is critical for maintaining and evaluating such structures. However, challenges such as irregular crack shapes and complex background interference persist in the crack identification task. To address these challenges, we propose a semantic segmentation network for cracks (DSU-Net) based on U-Net. A lightweight decoder is built through depthwise separable convolution to reduce model complexity and better retain the high-level features extracted by the encoder. Three modules are designed to improve the performance of the model. First, a feature enhancement module (DCM) that combines CBAM and channel squeeze-and-excitation (cSE) is constructed to further enhance and optimize the intermediate features extracted by the encoder. Secondly, a neighboring-layer information fusion module (NIF) is constructed to enrich the semantic information of extracted features. Finally, a feature refinement module (FRM) is constructed using multi-layer convolutional skip connections to apply a final refinement to the features extracted by the model. Experiments were conducted on three datasets (DeepCrack, Crack500, and CCSS), with nine models used for comparison. Across the three datasets, the results showed average improvements of 1.29% in mIoU and 1.89% in F1 over the next-best models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. Reasoning cartographic knowledge in deep learning-based map generalization with explainable AI.
- Author
-
Fu, Cheng, Zhou, Zhiyong, Xin, Yanan, and Weibel, Robert
- Subjects
- *
ARTIFICIAL neural networks , *VISUAL analytics , *ARTIFICIAL intelligence , *GENERALIZATION , *VISUALIZATION - Abstract
Cartographic map generalization involves complex rules, and full automation has still not been achieved, despite many efforts over the past few decades. Pioneering studies show that some map generalization tasks can be partially automated by deep neural networks (DNNs). However, DNNs were still used as black-box models in previous studies. We argue that integrating explainable AI (XAI) into a DL-based map generalization process can give more insight for developing and refining DNNs by revealing what cartographic knowledge is actually learned. Following an XAI framework for an empirical case study, visual analytics and quantitative experiments were applied to explain the importance of input features for the predictions of a pre-trained ResU-Net model. This case study finds that the XAI-based visualization results can easily be interpreted by human experts. With the proposed XAI workflow, we further find that the DNN pays more attention to the building boundaries than to the interior parts of the buildings. We thus suggest that boundary intersection over union is a better evaluation metric than the commonly used intersection over union for evaluating raster-based map generalization results. Overall, this study shows the necessity and feasibility of integrating XAI into future DL-based map generalization development frameworks. [ABSTRACT FROM AUTHOR]
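The distinction between the two metrics is easy to demonstrate on toy masks: shifting a footprint by one pixel leaves the interiors largely overlapping but the boundaries mostly disjoint. A pure-Python sketch (a 4-connected boundary definition is assumed; masks are sets of pixel coordinates):

```python
def iou(a, b):
    """Plain intersection over union of two pixel sets."""
    return len(a & b) / len(a | b)

def boundary(mask):
    """Pixels of `mask` having at least one 4-neighbour outside the mask."""
    return {
        (r, c) for (r, c) in mask
        if any((r + dr, c + dc) not in mask
               for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)))
    }

def boundary_iou(a, b):
    return iou(boundary(a), boundary(b))

# Two hypothetical building footprints: a 4x4 square and the same square
# shifted by one column -- interiors overlap heavily, boundaries less so.
sq = {(r, c) for r in range(4) for c in range(4)}
shifted = {(r, c + 1) for (r, c) in sq}
print(f"IoU          = {iou(sq, shifted):.3f}")           # 0.600
print(f"boundary IoU = {boundary_iou(sq, shifted):.3f}")  # 0.333
```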
- Published
- 2024
- Full Text
- View/download PDF
39. Spreading anomaly semantic segmentation and 3D reconstruction of binder jet additive manufacturing powder bed images.
- Author
-
Gourley, Alexander, Kaufman, Jonathan, Aman, Bashu, Schwalbach, Edwin, Beuth, Jack, Rueschhoff, Lisa, and Reeja-Jayan, B.
- Subjects
- *
CONVOLUTIONAL neural networks , *CERAMIC powders , *MANUFACTURING processes , *IMAGE analysis , *RAPID tooling - Abstract
Variability in the inherently dynamic nature of additive manufacturing introduces imperfections that hinder the commercialization of new materials. Binder jetting produces ceramic and metallic parts, but low green densities and spreading anomalies reduce the predictability and processability of the resulting geometries. In situ feedback presents a method for robust evaluation of spreading anomalies, reducing the number of builds required to refine processing parameters in a multivariate space. In this study, we report layer-wise powder bed semantic segmentation for the first time with a visually light ceramic powder, alumina (Al2O3), leveraging image analysis software to rapidly segment optical images acquired during the additive manufacturing process. Using preexisting image analysis tools allowed for rapid analysis of 316 stainless steel and alumina powders with small datasets by providing an accessible framework for implementing neural networks. Models were trained on five build layers for each material to classify base powder, parts, streaking, short spreading, and bumps from recoater friction, achieving testing categorical accuracies greater than 90%. Lower model performance accompanied the more subtle spreading features present in the white alumina compared to the darker steel. Applying the models to new builds demonstrated their repeatability, and trends in classified pixels reflected corrections made to processing parameters. Through the development of robust analysis techniques and feedback for new materials, parameters can be corrected as builds progress. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. In-Vehicle Environment Noise Speech Enhancement Using Lightweight Wave-U-Net.
- Author
-
Kang, Byung Ha, Park, Hyun Jun, Lee, Sung Hee, Choi, Yeon Kyu, Lee, Myoung Ok, and Han, Sung Won
- Subjects
- *
CONVOLUTIONAL neural networks , *SPEECH enhancement , *SPEECH perception , *DEEP learning , *NETWORK performance - Abstract
With the rapid advancement of AI technology, speech recognition has also advanced quickly. In recent years, speech-related technologies have been widely implemented in the automotive industry. However, in-vehicle environment noise inhibits the recognition rate, resulting in poor speech recognition performance. Numerous speech enhancement methods have been proposed to mitigate this performance degradation. Filter-based methodologies have been used to remove existing vehicle environment noise; however, they remove only limited noise. In addition, there are limits to the size of models that can be mounted inside a vehicle. Therefore, making the model lighter while increasing speech quality in a vehicle environment is essential. This study proposes a Wave-U-Net with depthwise-separable convolution to overcome these limitations. We built various convolutional blocks using the Wave-U-Net model as a baseline to analyze the results, and we designed the network by adding a squeeze-and-excitation network to improve performance without significantly increasing the parameter count. The experimental results, including spectrogram visualizations of how much noise is removed, show that the proposed model outperforms conventional methods at eliminating noise. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Prediction of carcass rib eye area by ultrasound images in sheep using computer vision.
- Author
-
Júnior, Francisco Albir Lima, Filho, Luiz Antônio Silva Figueiredo, de Sousa Júnior, Antônio, Silva, Romuere Rodrigues Veloso e., Barbosa, Bruna Lima, de Brito Vieira, Rafaela, Rocha, Artur Oliveira, de Moura Oliveira, Tiago, and Sarmento, José Lindenberg Rocha
- Subjects
- *
ULTRASONIC imaging , *COMPUTER vision , *RANDOM forest algorithms , *SHEEP , *AREA measurement - Abstract
The present research created a tool to measure ultrasound images of the rib eye area in sheep. One hundred twenty-one ultrasound images of sheep were captured, with regions of interest segmented using the U-Net algorithm. The metrics adopted to evaluate the automatic segmentations were the Dice score and intersection over union. Finally, a regression analysis was performed using the AdaBoost Regressor and Random Forest Regressor algorithms, and the fit of the models was evaluated using the mean squared residuals, mean absolute error, and coefficient of determination. The Dice score obtained was 0.94 and the intersection over union 0.89 (both metrics range from 0 to 1), demonstrating a high similarity between the actual and predicted segmentations. The mean squared residuals, mean absolute error, and coefficient of determination indicated that the Random Forest Regressor provided the best fit. The U-Net algorithm efficiently segmented ultrasound images of the Longissimus dorsi muscle, with greater precision than the measurements performed by the specialist. This efficient segmentation allowed the standardization of rib eye area measurements and, consequently, the phenotyping of beef sheep on a large scale. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. MTU-Net: Multi-Task Convolutional Neural Network for Breast Calcification Segmentation from Mammograms.
- Author
-
Alghamdi, Manal
- Abstract
Computer-Aided Detection (CAD) is a technology that helps radiologists identify malignant microcalcifications (MCs) on mammograms. By minimizing observational oversight, CAD enhances the radiologist's detection accuracy. However, the high incidence of false positives limits the reliance on these technologies. Breast Arterial Calcifications (BAC) are a common source of false positives, and their effective identification and elimination are crucial for improving CAD performance in detecting malignant MCs. This paper presents a model that can eliminate BACs from positive findings, thereby enhancing the accuracy of CAD. Inspired by the successful outcomes of the U-Net model in various biomedical segmentation tasks, a multitask U-Net (MTU-Net) was developed to simultaneously segment different types of calcifications, including MCs and BACs, in mammograms. This was achieved by integrating multiple fully connected output nodes in the output layer and applying a different objective function to each calcification type, instead of training separate models or using one model with a shared objective function for all classes. The experimental results demonstrate that the proposed MTU-Net model can reduce training and inference times compared to separate multi-structure segmentation models. This also helps the model converge faster and deliver better segmentation results for specific samples. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
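The idea of one objective function per calcification type can be illustrated with a minimal sketch. The abstract does not specify the exact losses, so plain binary cross-entropy per output head and the toy probability maps below are assumptions for illustration:

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between a probability map and a binary mask."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def multitask_loss(mc_pred, mc_gt, bac_pred, bac_gt, w_mc=1.0, w_bac=1.0):
    """One objective per calcification type (MC and BAC), summed with task
    weights, instead of a single shared objective over all classes."""
    return w_mc * bce(mc_pred, mc_gt) + w_bac * bce(bac_pred, bac_gt)

# Toy 2x2 probability maps from the two output heads and their ground truths.
mc_pred  = np.array([[0.9, 0.1], [0.2, 0.8]])
mc_gt    = np.array([[1, 0], [0, 1]])
bac_pred = np.array([[0.1, 0.9], [0.7, 0.2]])
bac_gt   = np.array([[0, 1], [1, 0]])
print(multitask_loss(mc_pred, mc_gt, bac_pred, bac_gt))
```

Because each head gets its own loss term, gradients for the MC and BAC branches can be weighted independently, which is what lets one network replace several single-task models.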
43. ENSEMBLE LEARNING-BASED AUTOMATIC DETECTION OF LANDSLIDE AREAS FROM AERIAL PHOTOGRAPHS.
- Author
-
Opara, Jonpaul Nnamdi, Moriwaki, Ryo, and Pang-jo Chun
- Abstract
Landslides pose a significant threat to human life and property worldwide. Japan, with its vulnerability to these natural disasters, records a high incidence of landslides. The Geospatial Information Authority of Japan employs experts to visually examine aerial photographs before and after landslide events, a costly and time-consuming approach that can limit accuracy. This study aims to aid in mitigating the damage caused by landslides through accurate and efficient mapping and prediction. An Ensemble U-Net model integrating three U-Nets has been proposed to predict landslide areas from aerial photographs. Comparative analysis with a single U-Net model revealed that the Ensemble model significantly outperformed the single model in all accuracy measures, including precision, recall, and F1-score. The ensemble model's average intersection over union (IoU) value of 0.80 also indicated a stronger agreement between the predicted outcome and ground truth than the single U-Net model. Visual analysis of prediction results further demonstrated the superiority of the ensemble model in aligning closely with the ground truth, thereby reducing misidentification and missed detections. The proposed Ensemble U-Net model thus shows promise for enhancing the accuracy and efficiency of landslide mapping. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
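One common way to combine three U-Nets into an ensemble, as the abstract describes, is soft voting: averaging per-pixel probability maps and thresholding the mean. This is a sketch of that idea under assumed inputs, not the paper's exact fusion rule:

```python
import numpy as np

def ensemble_predict(prob_maps, threshold=0.5):
    """Average the per-pixel probability maps of several U-Nets, then
    threshold the mean to a binary landslide mask (soft voting)."""
    return (np.mean(prob_maps, axis=0) >= threshold).astype(np.uint8)

# Three hypothetical 2x2 probability maps from three U-Nets.
maps = np.array([
    [[0.9, 0.2], [0.4, 0.8]],
    [[0.7, 0.1], [0.6, 0.9]],
    [[0.8, 0.3], [0.2, 0.7]],
])
print(ensemble_predict(maps))  # [[1 0] [0 1]]
```

Averaging smooths out pixels where a single model is confidently wrong, which is one plausible reason the ensemble reduces misidentification and missed detections relative to a single U-Net.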
44. Study on Fractal Damage of Concrete Cracks Based on U-Net.
- Author
-
Xie, Ming, Wang, Zhangdong, Yin, Li'e, Xu, Fangbo, Wu, Xiangdong, and Xu, Mengqi
- Subjects
CONVOLUTIONAL neural networks ,REINFORCED concrete ,CRACKING of concrete ,FRACTAL dimensions ,IMAGE processing - Abstract
The damage degree of a reinforced concrete structure is closely related to the generation and expansion of cracks. However, traditional damage assessment methods for reinforced concrete structures have defects, including low efficiency of crack detection, low accuracy of crack extraction, and dependence on the experience of inspectors to evaluate the damage of structures. To address these problems, this paper proposes a damage assessment method for concrete members combining the U-Net convolutional neural network and crack fractal features. Firstly, the collected test crack images are input into U-Net, which segments and extracts the cracks. The damage to the concrete structure is then classified into four empirical levels according to the damage index (DI). Subsequently, a linear regression equation is constructed between the fractal dimension (D) of the cracks and the damage index (DI) of the reinforced concrete members. The damage assessment is then performed by predicting the damage index using linear regression. The method was subsequently employed to predict the damage level of a reinforced concrete shear wall–beam combination specimen, which was then compared with the actual damage level. The results demonstrate that the damage assessment method for concrete members proposed in this study is capable of effectively identifying the damage degree of the concrete members, indicating that the method is both robust and generalizable. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
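The two quantitative steps in the abstract, estimating the fractal dimension D of a segmented crack mask and linearly regressing the damage index DI on D, can be sketched as below. The box-counting estimator and all numeric values are illustrative assumptions, not data from the study:

```python
import numpy as np

def box_count_dimension(mask, sizes=(2, 4, 8, 16)):
    """Estimate the fractal dimension D of a binary crack mask by box
    counting: D is the slope of log N(s) versus log(1/s)."""
    h, w = mask.shape
    counts = []
    for s in sizes:
        # Count boxes of side s containing at least one crack pixel.
        n = 0
        for i in range(0, h, s):
            for j in range(0, w, s):
                if mask[i:i + s, j:j + s].any():
                    n += 1
        counts.append(n)
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

# Linear regression between fractal dimension D and damage index DI,
# fitted on hypothetical specimen measurements.
D  = np.array([1.1, 1.3, 1.5, 1.7])
DI = np.array([0.2, 0.4, 0.6, 0.8])
a, b = np.polyfit(D, DI, 1)
print(a * 1.4 + b)  # predicted DI for D = 1.4 (≈ 0.5 for this toy data)
```

A fully filled mask gives D ≈ 2 and a single row of pixels gives D ≈ 1, so crack patterns fall in between, which is why D can serve as a scalar damage feature.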
45. Segmentation of Glacier Area Using U-Net through Landsat Satellite Imagery for Quantification of Glacier Recession and Its Impact on Marine Systems.
- Author
-
Robbins, Edmund, Breininger, Robert D., Jiang, Maxwell, Madera, Michelle, White, Ryan T., and Kachouie, Nezamoddin N.
- Subjects
LANDSAT satellites ,REMOTE-sensing images ,LAND cover ,CLIMATE change ,SURFACE area - Abstract
Glaciers have experienced a global trend of recession within the past century. Quantification of glacier variations using satellite imagery has been of great interest due to the importance of glaciers as freshwater resources and as indicators of climate change. Spatiotemporal glacier dynamics must be monitored to quantify glacier variations. The potential methods to quantify spatiotemporal glacier dynamics with increasing complexity levels include detecting the terminus location, measuring the length of the glacier from the accumulation zone to the terminus, quantifying the glacier surface area, and measuring glacier volume. Although some deep learning methods designed purposefully for glacier boundary segmentation have achieved acceptable results, these models are often localized to the region where their training data were acquired and further rely on training sets that were often curated manually to highlight glacial regions. Due to the very large number of glaciers, it is practically impossible to perform a worldwide study of glacier dynamics using manual methods. As a result, an automated or semi-automated method is highly desirable. The current study builds upon our previous work on identifying the 2D glacier profile for glacier area segmentation. In this study, a deep learning method is proposed for segmentation of temporal Landsat images to quantify the glacial region within the Mount Cook/Aoraki massif located in the Southern Alps/Kā Tiritiri o te Moana of New Zealand/Aotearoa. Segmented glacial regions can be further utilized to determine the relationship of their variations to climate change. This model has demonstrated promising performance while trained on a relatively small dataset, segmenting the permanent ice and snow class with 92% accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
46. Enhancing Brain Tumor Diagnosis with L-Net: A Novel Deep Learning Approach for MRI Image Segmentation and Classification.
- Author
-
Dénes-Fazakas, Lehel, Kovács, Levente, Eigner, György, and Szilágyi, László
- Subjects
CONVOLUTIONAL neural networks ,CANCER diagnosis ,PITUITARY tumors ,IMAGE recognition (Computer vision) ,MAGNETIC resonance imaging ,BRAIN tumors - Abstract
Background: Brain tumors are highly complex, making their detection and classification a significant challenge in modern medical diagnostics. The accurate segmentation and classification of brain tumors from MRI images are crucial for effective treatment planning. This study aims to develop an advanced neural network architecture that addresses these challenges. Methods: We propose L-net, a novel architecture combining U-net for tumor boundary segmentation and a convolutional neural network (CNN) for tumor classification. These two units are coupled in such a way that the CNN classifies the MRI images based on the features extracted by the U-net while segmenting the tumor, instead of relying on the original input images. The model is trained on a dataset of 3064 high-resolution MRI images, encompassing gliomas, meningiomas, and pituitary tumors, ensuring robust performance across different tumor types. Results: L-net achieved a classification accuracy of up to 99.6%, surpassing existing models in both segmentation and classification tasks. The model demonstrated effectiveness even with lower image resolutions, making it suitable for diverse clinical settings. Conclusions: The proposed L-net model provides an accurate and unified approach to brain tumor segmentation and classification. Its enhanced performance contributes to more reliable and precise diagnosis, supporting early detection and treatment in clinical applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. An Efficient Deep Learning Based AgriResUpNet Architecture for Semantic Segmentation of Crop and Weed Images.
- Author
-
Hussain, Ali Asgar and Nair, Pramod Sekharan
- Subjects
WEED control ,AGRICULTURE ,CROPS ,GROWING season ,SYSTEM identification ,DEEP learning - Abstract
Weeds challenge both the environment and agricultural output. The growing demand for sustainable weed control has spurred innovative alternative techniques that seek to reduce reliance on herbicides. Sufficiently reliable weed recognition, however, remains a hurdle to applying these methods for selective in-crop treatment. Deep learning has shown remarkable promise in a variety of vision tasks, leading to the development of several effective image-based weed and crop identification systems. This study examines the newest developments in deep learning techniques for pixel-wise semantic segmentation of crops and weeds. Semantic segmentation-based recognition, which assigns a class label to every pixel of an image, is the hardest of these problems and must be solved for smart farming to work well. Among the many useful applications of deep learning in smart farming, one of the most important is identifying the exact position of crops and weeds in fields. Existing systems for separating weeds and crops are often highly complex, with millions of parameters that require long training times. To overcome these problems, we propose AgriResUpNet, a deep learning architecture that carefully combines the U-Net and residual learning frameworks. The proposed model is evaluated and compared with other state-of-the-art networks on a publicly available crop and weed dataset from GitHub in terms of pixel accuracy, precision, F1-score, and IoU. AgriResUpNet achieved an IoU of 97.58% for weeds and 94.82% for crops, and the results show that it can reliably detect both crops and weeds, suggesting that this design is well suited to finding weeds early in the growing season. Experiments and comparisons showed that the proposed network outperforms existing designs in intersection over union (IoU) and F1-score. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
48. A Building Extraction Algorithm for Remote Sensing Images Fusing Partial Convolution and Residual Refinement.
- Author
-
侯佳兴, 齐向明, 郝明, and 张进
- Abstract
Copyright of Journal of Frontiers of Computer Science & Technology is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
49. DABT-U-Net: Dual Attentive BConvLSTM U-Net with Transformers and Collaborative Patch-based Approach for Accurate Retinal Vessel Segmentation.
- Author
-
Jalali, Y., Fateh, M., and Rezvani, M.
- Subjects
RETINAL blood vessels ,EYE diseases ,IMAGE segmentation ,ACCURACY ,EARLY diagnosis - Abstract
The segmentation of retinal vessels is vital for the timely diagnosis and treatment of various eye diseases. However, due to inherent characteristics of retinal vessels in fundus images, such as changes in thickness, direction, and complexity of vessels, as well as imbalanced contrast between background and vessels, segmenting retinal vessels continues to pose significant challenges. Also, despite advancements in CNN-based methods, challenges such as insufficient extraction of structural information, complexity, overfitting, preference for local information, and poor performance in noisy conditions persist. To address these drawbacks, in this paper we propose a novel modified U-Net named DABT-U-Net. Our method enhances discriminative capability by introducing Hierarchical Dilated Convolution (HDC), Dual Attentive BConvLSTM, and Multi-Head Self-Attention (MHSA) blocks. Additionally, we adopt a collaborative patch-based training approach to mitigate data scarcity and overfitting. Evaluation on the DRIVE and STARE datasets shows that DABT-U-Net achieves superior accuracy, sensitivity, and F1 score compared to existing methods, demonstrating its effectiveness in retinal vessel segmentation. Specifically, our proposed method demonstrates improvements in accuracy, sensitivity, and F1 score by 0.32%, 0.61%, and 0.14%, respectively, on the DRIVE dataset, and by 0.07%, 0.83%, and 0.14% on the STARE dataset compared to a less effective approach. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
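The patch-based training mentioned above relies on sampling many aligned image/label patches from each fundus image, multiplying the effective number of training samples. A minimal sketch of random patch extraction; the patch size, image size, and synthetic arrays are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_patches(image, mask, patch_size=48, n_patches=4):
    """Sample spatially aligned image/label patches for patch-based
    training of a segmentation network."""
    h, w = image.shape[:2]
    patches = []
    for _ in range(n_patches):
        # Top-left corner chosen so the patch stays inside the image.
        y = rng.integers(0, h - patch_size + 1)
        x = rng.integers(0, w - patch_size + 1)
        patches.append((image[y:y + patch_size, x:x + patch_size],
                        mask[y:y + patch_size, x:x + patch_size]))
    return patches

fundus = rng.random((256, 256))            # stand-in for a fundus image
vessels = (fundus > 0.5).astype(np.uint8)  # stand-in for a vessel mask
for img_p, msk_p in random_patches(fundus, vessels):
    print(img_p.shape, msk_p.shape)
```

Because each image yields many patches, small datasets like DRIVE and STARE can still provide enough training examples to curb overfitting.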
50. Wound Tissue Segmentation and Classification Using U-Net and Random Forest.
- Author
-
Arjun, V. S., Chandrasekhar, Leena, and Jaseena, K. U.
- Subjects
RANDOM forest algorithms ,CHRONIC wounds & injuries ,TISSUE wounds ,DIGITAL image processing ,NURSE practitioners ,WOUND healing - Abstract
Analysing wound tissue is a crucial research field for assessing the progression of wound healing. Wounds exhibit certain attributes concerning colour and texture, although these features can vary among different wound images. Research in this field serves multiple purposes, including confirming the presence of chronic wounds, identifying infected wounds, determining the origin of the wound and addressing other factors that classify and characterise various types of wounds. Wounds pose a substantial health concern. Currently, clinicians and nurses mainly evaluate the healing status of wounds based on visual examination. This paper presents an outline of digital image processing and traditional machine learning methods for the tissue analysis of chronic wound images. Here, we propose a novel wound tissue analysis system that consists of wound image pre-processing, wound area segmentation and wound analysis by tissue segmentation. The wound area is extracted using a simple U-Net segmentation model. Granulation, slough and necrotic tissues are the three primary forms of wound tissues. The k-means clustering technique is employed to assign labels to tissues. Within the wound boundary, the tissue classification is performed by applying the Random Forest classification algorithm. Both the segmentation (U-Net) and classification (Random Forest) models are trained; the segmentation model achieves 99% accuracy and the classification model 99.21% accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
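The k-means step that assigns provisional tissue labels inside the wound boundary can be sketched with a minimal NumPy implementation. The toy RGB values standing in for granulation, slough, and necrotic pixels, and the deterministic initialization, are assumptions for illustration; the Random Forest classification stage is not reproduced here:

```python
import numpy as np

def kmeans_pixels(pixels, k=3, iters=20):
    """Minimal Lloyd's k-means over RGB pixel values, used to assign
    provisional tissue labels (granulation / slough / necrotic)."""
    # Naive deterministic init for the sketch: every other pixel.
    centers = pixels[::max(1, len(pixels) // k)][:k].copy()
    labels = np.zeros(len(pixels), dtype=int)
    for _ in range(iters):
        # Assign each pixel to its nearest cluster center.
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each center to the mean of its assigned pixels.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = pixels[labels == c].mean(axis=0)
    return labels, centers

# Toy wound pixels: reddish (granulation), yellowish (slough), dark (necrotic).
pixels = np.array([[200, 40, 40], [210, 50, 45],
                   [220, 210, 80], [230, 220, 90],
                   [30, 25, 20], [40, 30, 25]], dtype=float)
labels, centers = kmeans_pixels(pixels, k=3)
print(labels)
```

In the full pipeline the cluster labels would seed the training targets for the Random Forest classifier applied within the U-Net wound mask.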